DeepSeek’s aftermath: Lessons to learn as the dust settles
The Chinese AI company DeepSeek took the technology industry, and Wall Street, by storm with a language model reported to be roughly 10x more efficient than those of the AI industry leaders. You have seen the news and might be getting sick of the endless articles tagging onto it, but I would like to offer a different perspective. DeepSeek claimed it trained its model on a cluster of 2,048 Nvidia H800 GPUs (as stated in its technical report). If true, that is far less computing power than other leading AI players use. The news hurt Nvidia’s stock price badly, since the implication is that less compute will be needed going forward. But is that really the case? Is it as obvious as it seems? And why haven’t some other tech companies reacted the way the markets did?
As the dust settles on all the media coverage of DeepSeek, there are plenty of lessons to be learned. One is that DeepSeek built and trained its models on top of existing open-source models, making use of investments already made by others. Compute that has already been spent on openly released models can never be retrospectively restricted; it lives on in the pool of globally available trained models.
The fact that DeepSeek achieved performance similar to that of the leading AI players with fewer hardware resources has sparked a debate about compute needs. If this level of performance is reachable with less hardware and open-source models, do you even need more computing power? Not necessarily less overall effort: even with model distillation, DeepSeek’s limited compute resources forced it to heavily optimize its software (again, per its technical report).
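For readers unfamiliar with model distillation, the classic formulation (due to Hinton et al.) trains a small "student" model to match the temperature-softened output distribution of a large "teacher". The sketch below is generic background, not DeepSeek's actual implementation; the function names and the temperature value are illustrative choices.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; a higher temperature softens the distribution."""
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL divergence from the student's softened distribution to the teacher's.

    Scaled by T^2, as in the original distillation paper, so gradient
    magnitudes stay comparable across temperatures.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = np.sum(p_teacher * (np.log(p_teacher) - np.log(p_student)), axis=-1)
    return (temperature ** 2) * kl.mean()
```

The loss is zero when the student exactly reproduces the teacher's logits and positive otherwise, which is why a capable teacher lets a much smaller student reach comparable quality at a fraction of the training compute.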
According to DeepSeek, it used various techniques to tailor its software to the limited hardware it had access to, which is what made these performance gains possible with less computing power. Of course, nothing comes for free, and tailoring software to hardware typically reduces flexibility. As with everything else in life, you need to find the right balance.