LLMs On The Edge
Nearly all the data input for AI so far has been text, but that's about to change. In the future, that input likely will include video and voice, as well as other types of data, causing a massive increase in the amount of data that needs to be modeled and in the compute resources needed to make it all work. This is hard enough in hyperscale data centers, which are sprouting up everywhere to handle training and some inferencing, but it's even more of a challenge in bandwidth- and power-limited edge devices. Sharad Chole, chief scientist and co-founder of Expedera, talks with Semiconductor Engineering about the tradeoffs involved in making this work, how to reduce the size of LLMs, and what impact this will have on engineers working in this space.