Parsing the Mindboggling Cost of Ownership of Generative AI
By Lauro Rizzatti, VSORA
EETimes (November 2, 2023)
The latest algorithms, such as GPT-4, pose a challenge to the current state-of-the-art processing hardware, and GenAI accelerators aren’t keeping up. In fact, no hardware on the market today can run the full GPT-4.
The current focus of large language model (LLM) development on creating smaller but more specialized LLMs that can run on existing hardware is a diversion. The GenAI industry needs semiconductor innovations in computing methods and architectures capable of delivering performance of multiple petaFLOPS at efficiency greater than 50%, reducing latency to less than two seconds per query, constraining energy consumption and shrinking cost to 0.2 cents per query.
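The targets above can be related with a simple back-of-envelope model: sustained throughput is peak throughput times efficiency, latency is work per query divided by sustained throughput, and cost follows from amortized hardware-plus-energy cost per hour. The sketch below uses purely illustrative numbers (the FLOPs per query, peak throughput and $/hour figures are assumptions, not data from the article) to show how a >50%-efficiency, multi-petaFLOPS accelerator could land near the two-second, 0.2-cent targets.

```python
# Back-of-envelope cost-per-query model. All numeric inputs below are
# illustrative assumptions, not measured figures from the article.

def cost_per_query(flops_per_query, peak_flops, efficiency, hw_cost_per_hour):
    """Return (latency_seconds, dollars_per_query) for one inference query.

    flops_per_query  -- FLOPs of compute needed to answer one query
    peak_flops       -- accelerator peak throughput in FLOP/s
    efficiency       -- fraction of peak actually sustained (0..1)
    hw_cost_per_hour -- amortized hardware + energy cost in $/hour
    """
    sustained = peak_flops * efficiency        # effective FLOP/s delivered
    latency = flops_per_query / sustained      # seconds spent on the query
    cost = latency / 3600.0 * hw_cost_per_hour # amortized $ for that time
    return latency, cost

# Hypothetical example: 10 PFLOPs of work per GPT-class query, run on a
# 10 PFLOP/s accelerator sustaining the 50% efficiency target, amortized
# at an assumed $3/hour.
latency, cost = cost_per_query(
    flops_per_query=10e15,  # assumed work per query
    peak_flops=10e15,       # "multiple petaFLOPS" class accelerator
    efficiency=0.5,         # the >50% efficiency target
    hw_cost_per_hour=3.0,   # assumed amortized $/hour
)
print(f"{latency:.1f} s per query, ${cost:.4f} per query")
```

Under these assumed inputs the model gives 2.0 s and roughly $0.0017 (about 0.17 cents) per query, which illustrates why the efficiency term matters: halving efficiency doubles both latency and cost for the same hardware.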
Once this is in place, and it is only a matter of time, the promise of transformers deployed on edge devices will be fully realized.