DeepSeek Launches V3.2-Exp: Revolutionizing Long Text Processing with DSA
Chinese AI startup DeepSeek has launched an experimental model, V3.2-Exp, featuring a novel sparse attention mechanism called DeepSeek Sparse Attention (DSA). The model, available on HuggingFace and supported by vLLM, promises significant efficiency gains in training and inference on long contexts.
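For readers who want to try the model, the rough shape of running it through vLLM's offline API looks like the sketch below. This is a minimal illustration, not an official recipe: the model identifier, tensor-parallel size, and context length are assumptions, so check the HuggingFace model card and vLLM release notes before use.

```python
# Minimal sketch of serving V3.2-Exp through vLLM's offline API.
# The model id, tensor_parallel_size, and max_model_len below are
# illustrative assumptions, not values confirmed by DeepSeek or vLLM docs.
from vllm import LLM, SamplingParams

llm = LLM(
    model="deepseek-ai/DeepSeek-V3.2-Exp",  # assumed HuggingFace repo id
    tensor_parallel_size=8,                 # adjust to the local GPU count
    max_model_len=65536,                    # illustrative long-context setting
    trust_remote_code=True,
)

params = SamplingParams(temperature=0.6, max_tokens=512)
outputs = llm.generate(["Summarize the following document: ..."], params)
print(outputs[0].outputs[0].text)
```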
DSA, introduced in late September 2025, works by attending only to the parts of a long context that are most relevant to each query rather than to every token, which cuts the compute required for long-context attention. Benchmark results are promising: on Codeforces, V3.2-Exp scores 2121 points, slightly ahead of the previous V3.1-Terminus model's 2046, while on MMLU-Pro both models score an identical 85.0, suggesting that DSA preserves output quality while improving efficiency.
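To make the select-then-attend idea concrete, here is a minimal PyTorch sketch of the general pattern behind this kind of sparsity: score all past tokens cheaply, keep only the top-k most relevant ones, and run full attention over that small subset. This is a generic illustration, not DeepSeek's actual DSA implementation; the dimensions and the value of k are arbitrary.

```python
# Illustrative top-k sparse attention for a single query position.
# A generic sketch of the select-then-attend idea, not DeepSeek's DSA kernels.
import torch
import torch.nn.functional as F

def sparse_attend(query, keys, values, k=64):
    """query: (d,), keys/values: (seq_len, d). Attend to the top-k keys only."""
    seq_len, d = keys.shape
    # 1) Cheap relevance score for every past token.
    scores = keys @ query                                # (seq_len,)
    # 2) Keep only the k most relevant positions.
    k = min(k, seq_len)
    top_scores, top_idx = scores.topk(k)                 # (k,), (k,)
    # 3) Full softmax attention restricted to the selected tokens.
    weights = F.softmax(top_scores / d ** 0.5, dim=-1)   # (k,)
    return weights @ values[top_idx]                     # (d,)

# Toy usage: one query over a 4096-token context, attending to 64 tokens.
ctx, dim = 4096, 128
q = torch.randn(dim)
K = torch.randn(ctx, dim)
V = torch.randn(ctx, dim)
out = sparse_attend(q, K, V, k=64)
print(out.shape)  # torch.Size([128])
```

Because only k tokens enter the attention step, the cost per query grows with k rather than with the full context length, which is where the efficiency gain on long texts comes from.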
DeepSeek has also released open-source kernels, including TileLang implementations for research and high-performance CUDA kernels in DeepGEMM and FlashMLA, to help developers make the most of sparse attention. Reference inference code for running the model locally is available, though it needs to be adapted to the target GPU configuration and expert-parallel settings.
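The exact knobs depend on the reference implementation, but the kind of adjustment involved typically looks like the snippet below. The parameter names here are purely hypothetical and are not taken from DeepSeek's repository; the real config keys live in the repository's own config files.

```python
# Hypothetical illustration of the settings that usually need adjusting
# when running a large mixture-of-experts model locally. These names are
# NOT from DeepSeek's code; consult the repository's configs for the real keys.
local_config = {
    "tensor_parallel_size": 8,    # number of GPUs sharing each layer
    "expert_parallel_size": 8,    # how routed experts are spread across GPUs
    "max_context_length": 65536,  # longest context the local hardware can hold
    "dtype": "bfloat16",          # precision supported by the local GPUs
}
```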
With DeepSeek Sparse Attention, V3.2-Exp handles long texts markedly more efficiently while matching the output quality of its predecessor and edging ahead of it on some benchmarks. The open-source kernels released alongside the model should make it easier for developers to adopt the technique.