DeepSeek’s aftermath: Lessons to learn as the dust settles
The Chinese AI company DeepSeek took the technology industry, and Wall Street, by storm with its language model, which reportedly achieves 10x higher efficiency than those of AI industry leaders. You have seen the news and might be getting sick of the endless articles piggybacking on it, but I would like to offer a different perspective. DeepSeek claimed it used a cluster of 2,048 Nvidia H800 GPUs (as stated in their technical report). If this is true, they are using far less computing power than other leading AI players. That news hurt Nvidia’s stock price badly, as the implication is that this reduces the need for compute… But is that really the case? Is this as obvious as it seems? And why haven’t some other tech companies reacted like the markets did?
As the dust settles on all the media coverage around DeepSeek, there are surely plenty of lessons to be learned. One is that DeepSeek built and trained their models on top of existing open-source models, making use of investments already made by others. The compute already spent on those models cannot be restricted retroactively: once trained models are openly available worldwide, they stay available.
The fact that DeepSeek was able to achieve performance similar to that of leading AI players with fewer hardware resources has given rise to a discussion comparing compute needs. If you can achieve this performance with fewer hardware resources and open-source models, do you even need more computing power? Well, even with model distillation, having access to limited compute resources required DeepSeek to heavily optimize their software (again, as described in their technical report).
According to DeepSeek, they used various techniques to tailor their software to the limited hardware they had access to, helping them achieve these performance gains with less computing power. Of course, nothing comes for free, and tailoring software to hardware typically reduces flexibility. As with everything else in life, you need to find the right balance.