Decoding Discontinuity

TurboQuant and the Memory Stock Sell-Off: Why the Panic Outpaced the Paper

Why this efficiency gain is ultimately bullish for the memory chokepoint and the entire inference economy.

Raphaëlle d'Ornano
Mar 31, 2026

A Google blog post about compressing AI memory went viral, and within 48 hours memory-semiconductor stocks shed more than $100 billion in market capitalization. TurboQuant addresses a core problem of the agentic era: making long-context LLM inference efficient. But the algorithm compresses only the inference-time key-value (KV) cache, not the model weights, training data, or storage.
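To see why the cache, not the weights, is the target, a back-of-envelope sizing helps. The sketch below uses assumed, illustrative dimensions for a hypothetical 70B-class model with grouped-query attention; it does not reflect TurboQuant's actual algorithm or parameters, only the standard KV-cache arithmetic:

```python
# Back-of-envelope KV-cache sizing. Model dimensions below are
# illustrative assumptions, not any specific vendor's configuration.

def kv_cache_bytes(n_layers, n_kv_heads, head_dim, seq_len, bytes_per_elem):
    # Factor of 2 covers the key tensor plus the value tensor, per layer.
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * bytes_per_elem

# Hypothetical 70B-class model at a 128k-token context.
fp16 = kv_cache_bytes(n_layers=80, n_kv_heads=8, head_dim=128,
                      seq_len=128_000, bytes_per_elem=2)    # 16-bit cache
int4 = kv_cache_bytes(n_layers=80, n_kv_heads=8, head_dim=128,
                      seq_len=128_000, bytes_per_elem=0.5)  # 4-bit cache

print(f"fp16 cache: {fp16 / 1e9:.1f} GB per sequence")  # ~42 GB
print(f"int4 cache: {int4 / 1e9:.1f} GB per sequence")  # 4x smaller
```

At long contexts the per-sequence cache rivals or exceeds the weights themselves, so quantizing the cache alone cuts inference memory sharply while the model on disk, and everything upstream of inference, is untouched.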

© 2026 Raphaëlle d'Ornano