Google TurboQuant breakthrough shakes memory chip stocks amid AI shift

By: Dakir Madiha

Google unveiled a new set of compression algorithms that sharply reduce the memory footprint of large language models, triggering an immediate selloff in memory and storage chip stocks.

The system includes TurboQuant, PolarQuant, and Quantized Johnson-Lindenstrauss (QJL). These tools target the key-value (KV) cache, the data structure that stores the attention keys and values of previously processed tokens during AI inference. Tests on open-source models such as Gemma and Mistral showed that TurboQuant reduced memory usage to as little as 3 bits per value without additional training. The method also delivered up to eight times faster attention computation on Nvidia H100 GPUs compared with non-quantized baselines.
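To see why the per-value bit width matters, consider a rough back-of-the-envelope calculation. This is not from Google's paper; the model dimensions below are hypothetical, chosen only to illustrate the scale of the savings when a 16-bit KV cache is compressed to about 3 bits per value:

```python
# Illustrative sketch (not Google's implementation): KV-cache memory
# arithmetic for a hypothetical 7B-class model at a long context length.

def kv_cache_bytes(layers, heads, head_dim, seq_len, bits):
    """Bytes needed to cache keys and values for one sequence."""
    elements = 2 * layers * heads * head_dim * seq_len  # keys + values
    return elements * bits / 8

# Assumed example dimensions (hypothetical, for illustration only).
cfg = dict(layers=32, heads=32, head_dim=128, seq_len=8192)

fp16 = kv_cache_bytes(**cfg, bits=16)
q3 = kv_cache_bytes(**cfg, bits=3)

print(f"fp16 KV cache: {fp16 / 2**30:.1f} GiB")    # 4.0 GiB
print(f"3-bit KV cache: {q3 / 2**30:.2f} GiB")     # 0.75 GiB
```

At these assumed dimensions the cache shrinks from 4 GiB to 0.75 GiB per sequence, which is the kind of reduction that lets a single GPU serve far more concurrent requests.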

The announcement raised concerns among investors about reduced demand for memory hardware as AI systems become more efficient.

TurboQuant operates through a two-step process. PolarQuant first converts standard data vectors into polar coordinates, replacing axis-based values with a radius and an angle. Because angular distributions remain stable across vectors, this avoids the heavy per-vector normalization cost seen in traditional methods. Quantized Johnson-Lindenstrauss then applies a one-bit error-correction layer: it projects residual quantization errors into a lower-dimensional space without adding meaningful memory overhead.
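The two steps above can be sketched numerically. The following is a simplified illustration of the general ideas, not Google's implementation: the polar step is reduced to re-expressing 2-D coordinate pairs as (radius, angle) and quantizing the bounded angle on a fixed grid, and the QJL step is reduced to keeping only the signs of a random Gaussian projection of the residual, i.e. one bit per projected coordinate. All dimensions and helper names here are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def to_polar(v):
    """Step 1 sketch (polar representation): view a vector as 2-D pairs
    and re-express each pair as (radius, angle)."""
    pairs = np.asarray(v, dtype=float).reshape(-1, 2)
    r = np.linalg.norm(pairs, axis=1)
    theta = np.arctan2(pairs[:, 1], pairs[:, 0])
    return r, theta

def quantize_angles(theta, bits=3):
    """Angles are bounded in [-pi, pi), so a fixed uniform grid with
    2**bits levels quantizes them with no per-vector rescaling."""
    levels = 2 ** bits
    step = 2 * np.pi / levels
    idx = np.clip(np.floor((theta + np.pi) / step).astype(int), 0, levels - 1)
    dequant = -np.pi + (idx + 0.5) * step  # midpoint of each angular bin
    return idx, dequant

def sign_sketch(residual, m=16):
    """Step 2 sketch (one-bit JL-style projection): project the
    quantization residual with a random Gaussian matrix and keep only
    the signs, costing one bit per projected coordinate."""
    S = rng.normal(size=(m, np.asarray(residual).size))
    return np.sign(S @ residual)
```

Because the angle lives in a fixed interval, the quantization grid never has to be rescaled per vector, which is the normalization saving the article describes; and a sign-only sketch adds just one bit per projected dimension, which is why the error-correction layer carries little memory overhead.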

Google Research described the approach as an optimal solution to memory overhead in vector quantization. The technique could extend beyond language models to large scale vector search systems.

Markets reacted quickly. Shares of major memory and storage firms fell despite gains in the Nasdaq 100. SanDisk posted the steepest drop, followed by declines in Micron, Western Digital, and Seagate. Equipment makers Lam Research and Applied Materials also recorded losses.

Morgan Stanley called the development a major shift in AI cost structures. The bank compared its potential impact to previous breakthroughs in the field and noted that it could benefit cloud providers and AI platforms. However, analysts said the long-term effect on hardware demand may remain neutral to slightly positive: the technology applies only during inference and does not reduce training requirements, and lower deployment costs could expand AI adoption and support overall demand.

Google plans to present TurboQuant at the International Conference on Learning Representations in Rio de Janeiro from April 23 to 27. PolarQuant will be introduced at AISTATS 2026. The research was led by Amir Zandieh and Vahab Mirrokni in collaboration with KAIST and New York University.