LLMs On The Edge
Nearly all the data input for AI so far has been text, but that's about to change. In the future, that input will likely include video, voice, and other types of data, causing a massive increase in both the amount of data that needs to be modeled and the compute resources necessary to make it all work. This is hard enough in hyperscale data centers, which are sprouting up everywhere to handle training and some inferencing, but it's even more of a challenge in bandwidth- and power-limited edge devices. Sharad Chole, chief scientist and co-founder of Expedera, talks with Semiconductor Engineering about the tradeoffs involved in making this work, how to reduce the size of LLMs, and what impact this will have on engineers working in this space.