TEST. 06 Inside Subquadratic · The SubQ launch, with receipts
Real company.
Unverified claims.
A new Miami AI lab, twenty-nine million in seed funding, an SEC Form D on file, a $19.6 million GPU contract with a NASDAQ-listed counterparty, and a launch deck that claims a twelve-million-token context window at a thousandth of frontier compute. None of the eye-popping numbers have been independently reproduced. This page is the receipts.
On May 5, 2026, Subquadratic emerged from stealth with the first new "post-Transformer" architecture pitch since DeepSeek Sparse Attention. The company is real. The funding is real. The press cycle is real. The benchmark numbers are partially third-party verified, mostly self-reported, and entirely run-once. Below: the verifiable, the unverified, and why the AI community is pattern-matching this launch to Magic.dev.
§ I TEST. 06.1 · The company
What is on file. And what comes with an asterisk.
Before any benchmark argument, the threshold question is whether the company exists at all. Subquadratic clears that bar comfortably. It filed an SEC Form D in February 2026, incorporated as Subquadratic AI, headquartered in Miami. It has a CEO with a twenty-five-year operating record, a CTO with a verifiable engineering background, a live careers page on Rippling, and a research presence at AAAI and ICLR 2026. The company is not a shell.
The uncomfortable parts are not the existence questions. They are the composition questions. Most of the founding team is not ML-research-credentialed in the usual sense. The lead investors lean consumer and marketplace, not foundation-model. The eleven PhD researchers reportedly on staff are unnamed. Each detail in isolation is innocent. The accumulation is what registers.
The honest split is two columns. Real receipts on the left. Real asterisks on the right. Both columns are accurate. Neither cancels the other.
Receipts · verifiable on file
- SEC Form D filed · February 2026. Public record via StreetInsider. Confirms the offering exists.
- $29M seed round · Reported by The New Stack at a ~$500M valuation. Closed before stealth-exit.
- CEO Justin Dangel · Twenty-five-year operator. Founded Voter.com (1998), Goji (2008), co-founded Firefly Health and Ready Responders (GV / Founders Fund backed).
- CTO Alex Whedon · Ex-Meta software engineer. Former Head of Generative AI at TribeAI. NeurIPS / AAAI / ICLR 2026 attendee.
- $19.6M GPU contract · Twenty-four-month bare-metal Blackwell B300 rental from Digi Power X (NASDAQ: DGXX). Signed April 20, 2026. Effective May 15, 2026. 15% upfront.
- Live hiring pipeline · Rippling careers page open through May 2026. Roles include technical copywriter, sales, ML research.
- Research happy hours · Hosted at AAAI 2026 and ICLR 2026 in Rio under the Subquadratic banner before stealth exit.
Asterisks · what's missing
- No ML-research CEO · Justin Dangel's record is healthcare, insurance, consumer. No publication record, no prior AI work.
- 11 PhD researchers, unnamed · Self-reported headcount with backgrounds at Meta, Google, Oxford, Cambridge, ByteDance, Adobe, Microsoft, BYU. Individual identities not publicly enumerated.
- Investor base is consumer · Lead names: Justin Mateen (Tinder, JAM Fund), Javier Villamizar (ex-SoftBank), Grant Gittlin (Lasagna), Jaclyn Rice Nelson. No a16z, no Sequoia, no Founders Fund, no Index, no Khosla.
- Closed weights, no arXiv · Will not open-weights "in the near term." A "technical paper" is referenced in marketing but not findable on arXiv as of May 2026.
- Single third-party verifier · Production benchmarks confirmed by one unnamed third party. The bigger research numbers are self-reported.
- Stock-moving infra reveal · The Digi Power X contract moved DGXX +19% on announcement. The disclosure benefits both counterparties.
- "Run once" caveat · Subquadratic's own paper notes each comparator was evaluated a single time per setting "due to high inference cost."
- Incorporation · Subquadratic AI, Miami, FL
- SEC Form D filed · February 2026
- Stealth exit · May 5, 2026
- Seed round · $29M at a ~$500M valuation (The New Stack)
- Headcount (self-reported) · ~11 PhD researchers + execs + sales
- GPU contract · $19.6M, 24 months, Blackwell B300 (DGXX)
- Open weights · None planned in near term
- arXiv paper · Not found as of May 2026
§ II TEST. 06.2 · The numbers
The headline panel. Then the asterisk on each line.
- Architecture (homepage wording) · "Fully sub-quadratic sparse-attention", an O(n) claim, not just subquadratic
- Throughput · 150 tokens per second
- Cost framing · "1/5 of competing LLMs"
- Context size, in code · Python 3.13 standard library ≈ 5.1M tokens (estimation sketch after this panel)
- Context size, in PRs · ~1,050 React PRs (six months) ≈ 7.5M tokens
- SubQ Code claim · "~25% lower bill, 10× faster exploration" vs Claude Code / Codex / Cursor
- API surface · OpenAI-compatible endpoints, streaming, tool use
- Technical report · "coming soon" (still not on arXiv as of file date)
The cluster of numbers above is what triggered the "AI Theranos" reaction on X. Each one sounds extraordinary on first read. Each one carries a different status under inspection. The third-party verifier confirmed two production scores. Everything else, including the most quoted figures, sits in the self-reported, single-run column. That distinction matters more than any individual headline.
The cleanest read of the benchmark sheet is a side-by-side bar chart of MRCR v2, the multi-reference long-context retrieval benchmark that frontier models currently disagree on most. SubQ's production score on MRCR v2 (third-party verified) is 65.9. Their research-config score on MRCR v2 (self-reported) is 83.0. The delta between those two numbers (seventeen points on the same benchmark for the same family of models) is itself larger than the gap between most frontier models' end-to-end results.
The prior is not that the production number is wrong. The prior is that any seventeen-point self-vs-third-party gap on a reproducible long-context benchmark gets independently reproduced before it is treated as load-bearing. No reproduction yet exists.
Frontier models on MRCR v2 cluster in a wide band. GPT-5.5 leads the named comparators at 74. Claude Opus 4.7 sits at 32.2 and Gemini 3.1 Pro at 26.3, in part because MRCR v2 is unusually hostile to general-purpose long-context behavior. The headline result for SubQ's production model (65.9) is below the top of that band but well above the rest, which is plausible for a long-context-specialized architecture. The headline result for SubQ's research config (83) is nine points above the strongest frontier reference, on a benchmark where the differences between leading models are usually measured in single digits.
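The run-once caveat has a quantifiable side. Even a clean single pass over a benchmark carries sampling error, which bounds how much a score can be trusted before anyone asks about harness or configuration differences. A minimal sketch, assuming a hypothetical 500-item benchmark (MRCR v2's item count is not stated in SubQ's materials):

```python
# Sampling error on a single benchmark run. The item count n=500 is an
# assumption; SubQ's materials do not state MRCR v2's size.
import math

def wilson_ci(score_pct, n, z=1.96):
    """95% Wilson confidence interval for a pass rate measured on n items."""
    p = score_pct / 100.0
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return 100 * (center - half), 100 * (center + half)

for label, score in [("production (verified)", 65.9), ("research (self)", 83.0)]:
    lo, hi = wilson_ci(score, n=500)
    print(f"{label}: {score} -> 95% CI [{lo:.1f}, {hi:.1f}]")
```

Under that assumption the single-run noise band is roughly ±4 points, so the seventeen-point gap reflects a real configuration difference rather than noise. What reproduction would test is whether the 83.0 survives someone else's harness at all.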
The six other widely-cited numbers carry similar caveats:
The other numbers, with status
- RULER 128K = 95.0 · production model, third-party verified. Plausible. Frontier models cluster around 94 to 95 at this length; Claude Opus 4.6 reportedly sits at 94.8.
- RULER 128K = 97.1 · research config, self-reported. Above frontier, but on a benchmark that is increasingly saturated; not by itself extraordinary.
- SWE-Bench Verified = 81.8 to 82.4 · self-reported. Subquadratic concedes the result is "harness as much as model": the number reflects the agent scaffolding around the model as much as the model itself.
- NIAH 92.1% at 12M tokens · self-reported. Needle-in-a-haystack is the easiest long-context benchmark by construction; a sparse-attention design that handles position encoding correctly should clear it. Impressive, not extraordinary.
- 52.2x faster than FlashAttention at 1M · speedup claim plausible for sparse vs. dense at 1M. The relevant question is whether quality holds at scale, not the speedup itself.
- ~1000x compute reduction at 12M · self-reported. Mathematically possible if attention is truly linear (arithmetic sketched after this list). Magic.dev claimed almost the same number in 2024 and never delivered an externally validated model.
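The arithmetic behind that last item is worth making explicit. Dense attention scores every key for every query, so cost grows with n²; if a model attends to a fixed budget of k keys per query, cost grows with n·k, and the reduction is n/k. A minimal sketch, with k chosen purely to illustrate (SubQ has not published its selection budget):

```python
# Back-of-envelope on the ~1000x claim. k is a hypothetical per-query
# key budget; SubQ has not published its actual selection budget.
n = 12_000_000       # claimed context length, tokens
k = 12_000           # hypothetical kept keys per query
dense = n * n        # pairwise scores, dense attention
sparse = n * k       # scores under a fixed k-key budget
print(f"reduction: {dense / sparse:,.0f}x")  # reduction: 1,000x
```

The claim is therefore arithmetically self-consistent. What the arithmetic cannot show is whether a 12,000-key budget (or whatever SubQ actually uses) preserves quality at 12M tokens, which is exactly the part no outsider can currently test.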
- RULER 128K (production, third-party) · 95.0
- RULER 128K (research, self) · 97.1
- MRCR v2 (production, third-party) · 65.9
- MRCR v2 (research, self) · 83.0
- SWE-Bench Verified (self) · 81.8 to 82.4, "harness as much as model"
- NIAH at 12M tokens (self) · 92.1%
- Speedup vs FlashAttention at 1M · 52.2× (self)
- Run count per benchmark · 1
§ III TEST. 06.3 · The architecture
SSA, in one diagram. And the graveyard around it.
Sparse attention is not new. Longformer, BigBird, Reformer, and DeepSeek Sparse Attention all narrow the cost of attention by letting each query look at only a subset of keys. The hard part is not "be sparse." The hard part is choosing which keys to keep cheaply enough that the choosing step is itself subquadratic, and keeping quality competitive with dense at frontier scale, and not smuggling a quadratic layer back in to make the whole thing work. Three constraints. Most prior work has hit two.
SubQ claims all three. It calls its mechanism "Subquadratic Selective Attention," pitched as content-dependent and pure (no hybrid quadratic layers). If true, it would be genuinely novel and structurally different from the prior families. If not, it is a marketing reframing of well-studied techniques. Without an arXiv preprint or open weights, the public has the marketing but not the mechanism.
The reason this matters is that the field has a long ledger of claims that did not survive scrutiny. Mamba and RWKV are linear but underperform dense at frontier scale. Mamba-attention hybrids are quadratic in the limit because of their attention layers. Kimi Linear and DeepSeek Sparse Attention are quadratic in the implementations that actually ship, with only constant-factor speedups. The widely-cited LessWrong post on this topic is the prior that technically literate observers are starting from. SubQ's marketing is essentially a direct rebuttal of that prior. Rebuttals require evidence the public can examine.
The subquadratic-attention graveyard
- Mamba (2023) · selective state-space model. Linear in sequence length. Underperforms dense attention at frontier scale. Survives in research but not as a frontier replacement.
- RWKV · recurrent linear-attention variant. Same story. Not used in any frontier model except as a hybrid.
- Mamba-attention hybrids · quadratic in the limit, because the attention layers themselves remain quadratic. The "linear" branding describes the part of the model that is not the bottleneck.
- Kimi Linear · advertised linear, implemented with constant-factor speedups in practice.
- DeepSeek Sparse Attention · the "indexer trap." Sparsifying attention is straightforward; an indexer that picks which tokens to attend to without itself becoming O(N²) is not. DSA hit this directly.
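The "indexer trap" in that last entry is concrete enough to sketch. Below is a generic block-sparse toy in the Longformer / BigBird family, not SubQ's mechanism, which is unpublished: each query block attends to its top-k key blocks, selected by scoring mean-pooled block summaries. The pooled selection is a large constant-factor saving but still scores (n/B)² pairs, so it stays quadratic in the limit; making that selection step genuinely subquadratic without losing content-dependence is the part SubQ claims to have solved.

```python
# Generic block-sparse attention toy (numpy). Not SubQ's SSA; block size
# and keep-count are illustrative choices.
import numpy as np

def block_sparse_attention(Q, K, V, block=64, keep=4):
    """Each query block attends only to its top-`keep` key blocks.

    Selection scores (n/block)^2 mean-pooled pairs: a block^2-fold
    constant-factor saving over dense, but still O(n^2) in the limit.
    That residual quadratic selector is the "indexer trap".
    """
    n, d = Q.shape
    assert n % block == 0
    nb = n // block                                     # number of blocks
    # One mean-pooled summary vector per block.
    Qp = Q.reshape(nb, block, d).mean(axis=1)           # (nb, d)
    Kp = K.reshape(nb, block, d).mean(axis=1)           # (nb, d)
    # Selection step: which key blocks does each query block keep?
    sel = np.argsort(Qp @ Kp.T, axis=-1)[:, -keep:]     # (nb, keep)
    out = np.zeros_like(Q)
    for qb in range(nb):
        qs = slice(qb * block, (qb + 1) * block)
        ks = np.concatenate(
            [np.arange(b * block, (b + 1) * block) for b in sel[qb]]
        )
        # Dense attention restricted to kept keys: O(n * keep * block).
        scores = (Q[qs] @ K[ks].T) / np.sqrt(d)         # (block, keep*block)
        w = np.exp(scores - scores.max(axis=-1, keepdims=True))
        w /= w.sum(axis=-1, keepdims=True)
        out[qs] = w @ V[ks]
    return out

rng = np.random.default_rng(0)
Q, K, V = (rng.standard_normal((1024, 64)) for _ in range(3))
print(block_sparse_attention(Q, K, V).shape)            # (1024, 64)
```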
- Architecture name · Subquadratic Selective Attention (SSA)
- Type · Content-dependent sparse, non-hybrid
- Selection step · Claimed subquadratic (no "indexer trap")
- Hybrid quadratic layers · None (per marketing)
- Public mechanism · Marketing only, no arXiv preprint
- Open weights · Not planned
§ IV TEST. 06.4 · The Magic.dev parallel
The cautionary tale, side by side.
In August 2024, Magic.dev announced a one-hundred-million-token context model called LTM-2-mini, claimed a roughly thousand-times efficiency advantage, and raised more than five hundred million dollars, including from Eric Schmidt and Jane Street. As of early 2026, twenty-one months later, there is no public evidence the model is used outside Magic itself. No externally usable product. No arXiv preprint. No third-party benchmarks. No reproductions.
The structural similarity to SubQ's pitch (long context, ~1,000x claim, no open weights, partnership with a GPU provider as the main public proof point, headline numbers no third party can run) is what triggered the "AI Theranos" framing on launch day. The parallel is not a verdict. It is a prior. The two timelines, side by side, make the resemblance hard to miss.
Where the parallel is not exact
Subquadratic is on better footing than Magic.dev was, in three ways. First, its production benchmarks are at least third-party verified, if by a single unnamed party; Magic never offered that. Second, the CEO has a long verifiable operating record; Magic.dev's leadership was less established at announcement. Third, SubQ is at least planning an API with early-access sign-ups, which gives the public a path to independent reproduction in the medium term; Magic positioned itself as an internal-tool play from the start.
The parallel is on the shape of the claims and the public verifiability of those claims, not on the legitimacy of the companies.
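That reproduction path is mechanical if the API ships as described. A minimal sketch using the standard openai Python client; the base URL and model id are hypothetical, since no public endpoint exists as of the file date:

```python
# Hypothetical reproduction harness against an OpenAI-compatible endpoint.
# Base URL and model id are placeholders; no public endpoint exists yet.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.subq.ai/v1",  # hypothetical endpoint
    api_key="YOUR_KEY",
)

resp = client.chat.completions.create(
    model="subq-1m-preview",            # hypothetical model id
    messages=[{"role": "user", "content": "needle-in-a-haystack probe here"}],
)
print(resp.choices[0].message.content)
```

The day an endpoint like this goes live, third parties can rerun RULER, MRCR v2, and NIAH themselves, and the verification-status question collapses into ordinary benchmark practice.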
- Magic.dev announce · August 2024
- Magic.dev claim · 100M context, ~1,000x efficiency
- Magic.dev funding · $500M+
- Magic.dev externally validated model · None as of May 2026
- SubQ announce · May 5, 2026
- SubQ claim · 12M context, ~1,000x compute reduction
- SubQ funding · $29M seed at ~$500M valuation
- SubQ public reproductions · None as of May 6, 2026
§ V TEST. 06.5 · The smell test
Each claim, scored. Plausible, suspicious, or implausible.
Verdicts below are not allegations. They are priors, read off the launch materials. Each claim has a status (third-party verified, self-reported, or single-run only) and a smell-test rating against what is structurally plausible for the architecture being described. Nothing here would be load-bearing if the technical paper landed on arXiv tomorrow. As of the file date, it has not.
- RULER 128K = 95.0 (production) · plausible. Third-party verified. RULER is approaching saturation at 128K for top frontier models; a small lead for a long-context-specialized model is believable.
- MRCR v2 = 65.9 (production) · plausible. Third-party verified. Below GPT-5.5's 74. Internally consistent and structurally plausible for a long-context specialist beating general-purpose models on the benchmark that is hardest on them.
- 52.2× vs FlashAttention at 1M · plausible. The speedup is exactly what sparse attention is supposed to deliver against dense FlashAttention at 1M. The relevant question is whether quality holds, not whether the speedup is real.
- MRCR v2 = 83.0 (research) · suspicious. Self-reported. Nine points above GPT-5.5. The seventeen-point gap from the same team's production score (65.9) is itself wider than most differences between frontier models on the same benchmark. Wants reproduction.
- SWE-Bench Verified = 81.8 to 82.4 · caveat-bound. Self-reported, within two points of mid-tier frontier. The company concedes the result is "harness as much as model." Note: subq.ai itself lists Opus 4.7 at 87.6 in its comparison band, five points above SubQ on the same benchmark; the marketing line ("outperforms Opus 4.7") refers to a different benchmark, not SWE-Bench.
- "Outperforms the frontier" framing · suspicious. For each comparator, the subq.ai homepage lists at least one cited benchmark number that SubQ does not beat: SubQ's 81.8 SWE-Bench is below Opus 4.7's 87.6, and SubQ's 65.9 MRCR v2 is below GPT-5.5's 74.0. The "outperforms" framing is technically true on a cherry-picked-per-model basis, not on a same-benchmark basis.
- NIAH 92.1% at 12M tokens · caveat-bound. Self-reported. Needle-in-a-haystack is the easiest long-context benchmark and is widely considered close to solved at any context length if position encoding is correct. 92% at 12M is impressive, not extraordinary, for a sparse-attention design.
- ~1,000× compute reduction at 12M · suspicious. Self-reported, single-run, on a closed model with no public weights. Mathematically possible if attention is truly linear, but Magic.dev claimed almost the same number twenty-one months ago and never delivered an externally validated model.
- Cost claims ("1/5 of competing LLMs", "~25% lower bill") · suspicious. Marketing-style cost claims depend on internal pricing, not architectural facts: cost ratios depend on what you bill for and how you bill it. Useful for headlines, not for engineering decisions.
The pattern is consistent. Production-mode, third-party-verified numbers are believable and roughly in line with what you would expect from a long-context specialist. Research-mode, self-reported, single-run numbers are exactly the cluster the field has learned to treat as marketing until a public reproduction lands. Both reads can be true at the same time. They are.
- Plausible (third-party or low-risk) · 3 of 9 cited claims
- Suspicious or caveat-bound · 6 of 9 cited claims
- Outright implausible on current evidence · 0 of 9
- Independent reproductions · None as of May 6, 2026
- arXiv paper · Not found
§ VI Methodology, sources, caveats
Method
Five sections, each pinning a SubQ claim or claim cluster to a verification status: third-party verified, self-reported, or single-run only. The two-column receipts panel in § I, the benchmark chart in § II, the side-by-side timeline in § IV, and the scorecard in § V are designed to make verification status legible at a glance.
The diagrams are hand-coded SVG, no charting library. The page is static HTML with vanilla JavaScript only for the dateline and a one-time bar-chart reveal animation. No analytics. No tracking.
Sources
Primary launch coverage: SiliconANGLE (May 5, 2026) and The New Stack (Frederic Lardinois, May 5, 2026). SubQ's own page is at subq.ai/introducing-subq.
Funding and incorporation: SEC Form D for Subquadratic AI, Feb 2026, via StreetInsider. GPU contract: Digi Power X (NASDAQ: DGXX) press release, April 20, 2026, reproduced by Stock Titan, ProactiveInvestors, and Data Center Dynamics. Magic.dev parallel: their public Aug 2024 LTM-2-mini announcement.
Subquadratic-attention prior art: the widely-circulated LessWrong post "Debunking claims about subquadratic attention" is the position SubQ's pitch is implicitly rebutting.
Caveats
This is a snapshot. The picture could change quickly. An arXiv preprint, an open API, or a public reproduction could land within days and reorder the verdict. Treat the page as a read on the May 5 to 6, 2026 launch window, not a final judgment.
Two RULER 128K numbers (95.0 and 97.1) and two SWE-Bench numbers (81.8 and 82.4) appear in coverage; these likely reflect different model configurations (production vs. research), but the public materials do not always clearly disambiguate.
The "$500M valuation" comes from one outlet (The New Stack) and is not corroborated by the SEC Form D, which reports the size of the offering but not the valuation directly.
What this lab is not
Not a hit piece. Subquadratic is a real company with a credentialed operating team, real funding, a real GPU contract, and at least partial third-party verification of its production benchmark scores. The objection is to the unverified headline numbers, not to the company's existence or intent.
Not a defense either. The published numbers should be treated as marketing pending third-party verification on a public API endpoint or an arXiv paper. As of the file date, only one unnamed third party has confirmed the production scores; the bigger research-mode numbers have no independent check.
Real company. Real funding. Real GPUs. Real benchmark numbers under one third-party check. And on top of that: a research-mode configuration, single-run, with claims on context, compute, and retrieval that no outsider can examine.
The honest read is the boring one. Subquadratic sits in the "real company, overhyped claims, reserve judgment" bucket. Closer to Magic.dev than to OpenAI or Anthropic. It is not a Theranos-style fraud on current evidence. It is also not yet a frontier AI lab with reproducible breakthroughs. It is somewhere in between, and the next few weeks of independent reproduction (or the absence of it) will decide which way it leans.
The launch was loud. The receipts are quiet. Watch the receipts.
FAQ
Is Subquadratic a real company?
Yes. Subquadratic AI is a Miami-based startup that filed an SEC Form D in February 2026 and emerged from stealth on May 5, 2026 with a reported $29M seed at a roughly $500M valuation. It has a credentialed CEO and CTO, an active hiring pipeline, and a $19.6M, 24-month GPU rental contract with the NASDAQ-listed Digi Power X. It is not a Theranos-style shell or vaporware fraud. It is a brand-new, unproven, closed-weights lab.
What is SubQ?
SubQ is the brand name for Subquadratic's first model. The production preview, SubQ 1M-Preview, claims a one-million-token context window. A research configuration is claimed to extend to twelve million tokens, with fifty million targeted for Q4 2026. The architecture is called Subquadratic Selective Attention (SSA), pitched as content-dependent sparse attention with a selection step that is itself subquadratic.
Are the SubQ benchmark numbers real?
Some are third-party verified. The production model's RULER-128K score of 95.0 and MRCR v2 score of 65.9 were confirmed by a single unnamed third party. The bigger numbers (research-mode MRCR v2 of 83.0, the 1,000-times compute reduction at 12M tokens, the 92.1 percent needle-in-a-haystack at 12M) are self-reported, single-run, on a closed model with no public weights, no arXiv paper, and no independent reproduction as of May 2026.
Why is the AI community skeptical of SubQ?
Three reasons. First, prior subquadratic-attention claims (Mamba, RWKV, Kimi Linear, DeepSeek Sparse Attention) have generally been either truly subquadratic but underperforming at frontier scale, or quadratic in actual implementation. Second, Magic.dev made a structurally identical pitch in August 2024 (one-hundred-million-token context, roughly thousand-times efficiency, more than five hundred million in funding) and has not produced an externally validated model in the twenty-one months since. Third, SubQ's own technical paper acknowledges every benchmark was run only once.
Is this a hit piece on Subquadratic?
No. The conclusion of this lab is that Subquadratic is a real, well-funded, recently launched closed-weights AI startup whose benchmark claims are unusually aggressive and have not yet been independently reproduced. It is not a Theranos-style fraud on current evidence, but the published numbers should be treated as marketing pending third-party verification on a public API endpoint or an arXiv paper. The company sits in the "real company with overhyped claims, reserve judgment" bucket: closer to Magic.dev than to OpenAI or Anthropic.