Parsing the Mindboggling Cost of Ownership of Generative AI
By Lauro Rizzatti, VSORA
EETimes (November 2, 2023)
The latest algorithms, such as GPT-4, pose a challenge to the current state-of-the-art processing hardware, and GenAI accelerators aren’t keeping up. In fact, no hardware on the market today can run the full GPT-4.
The current focus of large language model (LLM) development on creating smaller but more specialized LLMs that can run on existing hardware is a diversion. The GenAI industry needs semiconductor innovations in computing methods and architectures capable of delivering multiple petaFLOPS of performance at efficiency greater than 50%, reducing latency to less than two seconds per query, constraining energy consumption and shrinking cost to 0.2 cents per query.
Once this is in place, and it is only a matter of time, the promise of transformers deployed on edge devices will be fully exploited.
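To put the targets above into rough numbers, here is a minimal Python sketch. It assumes that "efficiency" means the fraction of peak throughput actually sustained, which is a common reading but is not spelled out in the text. The 4 petaFLOPS peak figure and the function names are hypothetical and chosen only for illustration; the 50% efficiency and 0.2 cents per query come from the targets stated above.

```python
def effective_petaflops(peak_petaflops: float, efficiency: float) -> float:
    """Sustained throughput, assuming efficiency is the utilized fraction of peak."""
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return peak_petaflops * efficiency


def cost_in_dollars(queries: int, cents_per_query: float = 0.2) -> float:
    """Total serving cost in dollars at a fixed per-query cost given in cents."""
    return queries * cents_per_query / 100.0


# Hypothetical accelerator: the 4 petaFLOPS peak is an illustrative value only.
sustained = effective_petaflops(peak_petaflops=4.0, efficiency=0.5)

# Cost of one million queries at the article's 0.2 cents-per-query target.
million_query_cost = cost_in_dollars(1_000_000)

print(f"sustained throughput: {sustained} petaFLOPS")
print(f"cost of 1M queries: ${million_query_cost:,.2f}")
```

Under these assumptions, a device at the 50% efficiency threshold sustains half of its peak rate, and the per-query cost target works out to about $2,000 per million queries.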