Small Language Models: Efficient Arm Computing Enables a Custom AI Future
As AI pivots from the colossal to the compact, small language models (SLMs) offer tailored solutions with reduced costs and increased accessibility
Increasingly in the world of AI, small is big.
Large language models (LLMs) have driven early innovation in generative AI over the past 18 months, but there's a growing body of evidence that the momentum behind unfettered scaling of LLMs – now pushing toward trillions of training parameters – is not sustainable. Or, at the very least, the infrastructure costs of pushing this approach to AI further are putting it out of reach for all but a handful of companies. This class of LLM requires a vast amount of computational power and energy, which translates into high operational costs. Training GPT-4 cost at least $100 million, illustrating the financial and resource-heavy nature of these projects.
Not to mention, these LLMs are complex to develop and deploy. A study from the University of Cambridge points out that companies may spend over 90 days deploying a single machine learning model. This long cycle hampers rapid development and iterative experimentation, which are crucial in the fast-evolving field of AI.
These and other challenges are why the development focus is shifting towards small language models (SLMs, or sometimes small LLMs), which promise to address many of these challenges by being more efficient, requiring fewer resources, and being easier to customize and control. SLMs such as Llama, Mistral, Qwen, Gemma, and Phi-3 are much more efficient at simpler, focused tasks – conversation, translation, summarization, and categorization – than at sophisticated or nuanced content generation, and as such consume a fraction of the energy to train.