Vision Transformers Change the AI Acceleration Rules
Transformers were first introduced in 2017 by a team at Google Brain in the paper “Attention Is All You Need”. Since then, transformers have inspired a flurry of investment and research that has produced some of the most impactful model architectures and AI products to date, including ChatGPT, whose name stands for Chat Generative Pre-trained Transformer.
Transformers are also being employed for vision applications, where they are known as Vision Transformers (ViTs). This new class of models was shown empirically to be a viable alternative to more traditional Convolutional Neural Networks (CNNs) in the paper “An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale”, published by the team at Google Brain in 2021.
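To make the "16x16 words" idea concrete, here is a minimal sketch of how an image can be cut into fixed-size patches that a transformer then treats as a sequence of tokens. The function names (`patch_grid`, `split_into_patches`, `vit_token_stats`) are illustrative assumptions, not code from the paper, and the sketch leaves out the learned linear projection, position embeddings, and any extra classification token.

```python
# Illustrative sketch only: pure-Python patch splitting, assuming the image
# dimensions are exact multiples of the patch size.

def patch_grid(height, width, patch=16):
    """Return (rows, cols) of the patch grid for an image."""
    if height % patch or width % patch:
        raise ValueError("image dimensions must be multiples of the patch size")
    return height // patch, width // patch


def split_into_patches(image, patch):
    """Split a single-channel image (list of rows) into flattened patches.

    Patches are emitted in row-major order; each patch becomes one 'word'.
    """
    rows, cols = patch_grid(len(image), len(image[0]), patch)
    patches = []
    for r in range(rows):
        for c in range(cols):
            flat = [
                image[r * patch + i][c * patch + j]
                for i in range(patch)
                for j in range(patch)
            ]
            patches.append(flat)
    return patches


def vit_token_stats(height, width, channels=3, patch=16):
    """Count tokens, values per token, and pairwise attention scores."""
    rows, cols = patch_grid(height, width, patch)
    num_tokens = rows * cols
    return {
        "num_tokens": num_tokens,
        "values_per_token": patch * patch * channels,
        # Self-attention compares every token with every other token.
        "attention_scores_per_head": num_tokens * num_tokens,
    }


if __name__ == "__main__":
    print(vit_token_stats(224, 224))
```

Under these assumptions, a 224x224 RGB image yields 196 tokens of 768 values each, and self-attention computes a 196x196 score matrix per head. That all-pairs comparison across the whole image is a different compute pattern from the small sliding window of a convolution.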
Vision transformers are of a size and scale that is approachable for SoC designers targeting the high-performance edge AI market. There’s just one problem: vision transformers are not CNNs, and many of the assumptions made by the designers of the first-generation Neural Processing Units (NPUs) and AI hardware accelerators found in today’s SoCs do not translate well to this new class of models.
What makes Vision Transformers so special?