Push-Button NoCs for SoCs
SoC designers need a “one-stop shop” tool to explore the solution space and address all their interconnect challenges.
By Andy Nightingale, Arteris IP
Unique Implementation Challenges
Today’s system-on-chip (SoC) devices may be composed of hundreds of functional blocks known as intellectual property (IP) blocks. Each of these IPs can contain hundreds of millions of transistors. Standard IPs like processors, memory, codecs and communication functions are usually acquired from third-party vendors. By procuring conventional IP, design teams can focus on the in-house development of special functions like inference engines for artificial intelligence (AI) and machine learning (ML) applications. These internally developed IPs are the “secret sauce” that will differentiate this SoC from competitive offerings.
SoC developers are faced with unique implementation challenges with respect to communication between all the IPs. The issues start with the fact that multiple interface protocols have been defined and adopted by the SoC industry (OCP, APB, AHB, AXI, STBus, DTL, etc.), and every IP may employ a different protocol. Also, each IP’s interface may have a different datapath width and frequency. Legacy bus-based and crossbar switch implementations cannot handle the complexity of modern SoC architectures. The solution is to use a network-on-chip (NoC). What many people fail to realize is that a NoC is an IP, albeit one that spans the entire SoC.
Image courtesy of Arteris, Inc.
Sockets, Switches, Buffers and Pipeline Registers
In the case of a NoC, functions called sockets are associated with each IP interface. Each socket acts as an intermediary between its IP and the NoC. Some IPs act as initiators (generators of data) while others act as targets (receivers of data). In many cases, the same IP can assume both roles at different times. At the initiator end, the socket will translate the IP’s protocol into a common packetized and serialized transport mechanism. When the packet arrives at its destination, the target’s socket will translate it back into the protocol favored by the target IP.
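The socket behavior described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the packet fields, the address map, the header layout and the function names are all hypothetical, and the dictionaries merely resemble AXI-style and APB-style signals rather than modeling either protocol or any vendor's packet format.

```python
import struct
from dataclasses import dataclass

# Hypothetical address map: (base address, size in bytes, target socket id).
ADDRESS_MAP = [
    (0x0000, 0x1000, 0),
    (0x1000, 0x1000, 1),
]

OPCODES = {"RD": 0, "WR": 1}


@dataclass
class Packet:
    """Protocol-neutral transport unit (illustrative format only)."""
    src: int
    dst: int
    opcode: str
    address: int
    payload: bytes = b""


def decode_target(address):
    """Return (target socket id, base address) for a global address."""
    for base, size, target_id in ADDRESS_MAP:
        if base <= address < base + size:
            return target_id, base
    raise ValueError(f"address {address:#x} is not mapped")


def initiator_socket(src_id, txn):
    """Translate an initiator-side transaction into a neutral packet."""
    target_id, _ = decode_target(txn["addr"])
    opcode = "WR" if txn["write"] else "RD"
    return Packet(src_id, target_id, opcode, txn["addr"], txn.get("data", b""))


def serialize(packet, flit_bytes):
    """Split header plus payload into chunks no wider than the link."""
    header = struct.pack(">BBBI", packet.src, packet.dst,
                         OPCODES[packet.opcode], packet.address)
    raw = header + packet.payload
    return [raw[i:i + flit_bytes] for i in range(0, len(raw), flit_bytes)]


def target_socket(packet):
    """Translate the packet into the form the target IP expects."""
    _, base = decode_target(packet.address)
    return {
        "paddr": packet.address - base,   # address local to the target
        "pwrite": packet.opcode == "WR",
        "pwdata": packet.payload,
    }
```

A write issued by initiator 7 to address 0x1004 would be routed to target 1, carried across the link as a short sequence of fixed-width chunks, and presented to the target as a local write at offset 4.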
Vast numbers of packets can be in flight at the same time. In addition to the sockets, which are placed near their associated IP blocks, the NoC also involves switches, buffers and pipeline registers, whose placement is a complex activity determined by many factors.
Exploring the Solution Space
Circa the 1960s and 1970s, many software developers—especially those creating the precursors to what we would now regard as embedded systems—captured their programs in low-level symbolic code called assembly language. Each processor had its own assembly language, and there was a strong correspondence between the mnemonic instructions in the language and the processor’s underlying instruction set architecture (ISA).
Even when the C programming language started to make its presence felt in the mid-1970s, many programmers working with microprocessor units (MPUs) and microcontroller units (MCUs) preferred to use assembly language because they believed that the resulting machine code would run faster, be smaller and use memory more efficiently.
What they failed to realize was that a C programmer could create the same application in a fraction of the time. The implications of this speed went far beyond engineer productivity. A developer writing in assembly language often had time to create only a single implementation. This meant that, after considering the task at hand, they had to pick the optimum solution the first time. By comparison, a programmer working in the C domain could quickly and easily explore the solution space, creating, comparing and contrasting a variety of implementations to determine the optimum solution for each application.
The same situation applies in the world of NoCs. Some development teams decide to handcraft custom NoCs on a project-by-project basis. In addition to being time-consuming and prone to error, this means they are obliged to debug and verify the NoC and the rest of the design simultaneously. Since they are implementing the NoC by hand, they are required to make certain architectural decisions up front. Should the foundational NoC topology be a star, ring, torus, tree, mesh or a combination thereof? Once a handcrafted NoC is underway, changing course is very hard, and issues become more challenging to rectify the later they surface in the development cycle.
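To give a feel for why the topology decision matters, the following Python sketch compares the average number of hops between nodes in a 16-node ring and a 4x4 mesh. It is a toy model, assuming one IP per node and shortest-path routing; the function names are illustrative, and hop count is only one of many factors (wiring, area, power, traffic pattern) that a real comparison would weigh.

```python
from collections import deque
from itertools import combinations


def ring(n):
    """Adjacency map for a bidirectional ring of n nodes."""
    return {i: {(i - 1) % n, (i + 1) % n} for i in range(n)}


def mesh(rows, cols):
    """Adjacency map for a rows x cols 2D mesh without wraparound."""
    adj = {}
    for r in range(rows):
        for c in range(cols):
            node = r * cols + c
            neighbors = set()
            if r > 0:
                neighbors.add(node - cols)
            if r < rows - 1:
                neighbors.add(node + cols)
            if c > 0:
                neighbors.add(node - 1)
            if c < cols - 1:
                neighbors.add(node + 1)
            adj[node] = neighbors
    return adj


def hops_from(adj, start):
    """Breadth-first search: hop count from start to every node."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def average_hops(adj):
    """Mean shortest-path length over all distinct node pairs."""
    nodes = sorted(adj)
    pairs = list(combinations(nodes, 2))
    total = sum(hops_from(adj, a)[b] for a, b in pairs)
    return total / len(pairs)


if __name__ == "__main__":
    print("ring(16):  ", round(average_hops(ring(16)), 3))
    print("mesh(4x4): ", round(average_hops(mesh(4, 4)), 3))
```

In this simplified model the mesh averages fewer hops than the ring for the same number of nodes, which is the kind of trade-off a team would want to evaluate before committing to one topology.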
Push-Button NoCs
The alternative is to employ a robust, proven, off-the-shelf third-party NoC IP solution, such as the de facto industry standard FlexNoC Interconnect IP from Arteris IP. This is classed as “push-button” because the user informs the FlexNoC GUI of the numbers and locations of the IP blocks, the needed interface protocols and the desired NoC topology. With this input, a single press of a button results in the generation of the corresponding NoC as register transfer level (RTL) code.
When it comes to exploring the solution space, FlexNoC is in a class of its own. Let’s assume that the user has instructed the FlexNoC GUI to implement a ring as the topology. The next step will be to generate a verification testbench. Once again, this is a push-button operation because the FlexNoC system already has access to all the required information. In this case, it will instantiate transactors and targets and use them to generate the desired test vectors.
The system can also generate a variety of traffic profiles, which is not the same as running code on a CPU or GPU IP. Each profile models the traffic characteristic of a particular type of IP and its application, such as short bursts of reads and writes when accessing caches, or long bursts of reads when loading texture maps into GPUs.
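As a rough illustration of what distinguishes one traffic profile from another, the Python sketch below generates two synthetic request streams: short, mixed read/write bursts of the kind a cache might issue, and long sequential read bursts of the kind a texture load might issue. The burst sizes, address range and function names are assumptions chosen for the example and are not taken from any tool.

```python
import random


def cache_profile(rng, count, line_bytes=64, region_bytes=1 << 20):
    """Short bursts of reads and writes at scattered, line-aligned addresses."""
    for _ in range(count):
        yield {
            "op": rng.choice(["RD", "WR"]),
            "addr": rng.randrange(0, region_bytes, line_bytes),
            "bytes": line_bytes,
        }


def texture_profile(count, burst_bytes=4096, start_addr=0):
    """Long read-only bursts that walk sequentially through memory."""
    addr = start_addr
    for _ in range(count):
        yield {"op": "RD", "addr": addr, "bytes": burst_bytes}
        addr += burst_bytes


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so a run can be repeated
    for request in cache_profile(rng, 3):
        print("cache  ", request)
    for request in texture_profile(3):
        print("texture", request)
```

Feeding streams like these into a model of the interconnect exercises it very differently: the first stresses latency on many small transfers, while the second stresses sustained bandwidth.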
FlexNoC also supports the automatic insertion of probes that allow all the packets of data moving around the NoC to be monitored in real-time. The results from the testbench can be displayed in the form of a graphical “heat map” indicating any points of congestion.
It may be that the originally selected ring topology is not ideally suited to this application. Would a tree structure work better? This is the clever part because the IP blocks don’t change, which means the associated verification testbench need not change. All that is required is to instruct the FlexNoC GUI to employ a tree topology and generate this new NoC realization at the push of a button.
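The idea of holding the traffic fixed while swapping the topology can be sketched as follows. This Python example routes the same list of transfers over a ring and over a binary tree and totals the bytes carried by each link, a crude stand-in for a congestion heat map. It assumes one IP per node and deterministic shortest-path routing, and every name in it is hypothetical; it says nothing about how FlexNoC itself routes or measures traffic.

```python
from collections import Counter, deque


def ring(n):
    """Adjacency map for a bidirectional ring of n nodes."""
    return {i: {(i - 1) % n, (i + 1) % n} for i in range(n)}


def binary_tree(n):
    """Adjacency map for a binary tree; node i's parent is (i - 1) // 2."""
    adj = {i: set() for i in range(n)}
    for child in range(1, n):
        parent = (child - 1) // 2
        adj[child].add(parent)
        adj[parent].add(child)
    return adj


def shortest_path(adj, src, dst):
    """Breadth-first search; neighbors are visited in sorted order so
    the chosen path is deterministic when several are equally short."""
    parent = {src: None}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        if u == dst:
            break
        for v in sorted(adj[u]):
            if v not in parent:
                parent[v] = u
                queue.append(v)
    path = []
    node = dst
    while node is not None:
        path.append(node)
        node = parent[node]
    return path[::-1]


def link_load(adj, traffic):
    """Total bytes per undirected link for (src, dst, nbytes) transfers."""
    load = Counter()
    for src, dst, nbytes in traffic:
        path = shortest_path(adj, src, dst)
        for a, b in zip(path, path[1:]):
            load[(min(a, b), max(a, b))] += nbytes
    return load


if __name__ == "__main__":
    # The same transfers are applied to both topologies.
    traffic = [(3, 4, 256), (3, 6, 1024), (5, 6, 256), (1, 2, 512)]
    for name, adj in (("ring", ring(7)), ("tree", binary_tree(7))):
        load = link_load(adj, traffic)
        link, nbytes = max(load.items(), key=lambda item: item[1])
        print(f"{name}: hottest link {link} carries {nbytes} bytes")
```

Because the traffic list is independent of the topology, only the adjacency map changes between runs, which mirrors the point that the testbench can stay the same while the NoC realization is regenerated.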
But Wait, There’s More!
Everything presented here has barely touched on the various possibilities. For example, it’s possible to have multiple NoCs in the same SoC, each with its own topology. For instance, a mesh may be used in a multi-tile inferencing engine and a hierarchical tree topology in the rest of the device. It’s also feasible to opt for source synchronous communications and virtual channel (VC) links and to broadcast multicast packets to accommodate extreme bandwidth applications.
High-bandwidth links may be short and wide to facilitate throughput, while lower-bandwidth links may be long and thin to minimize congestion. FlexNoC offers support for flexible power domain partitioning, which enables the powering down of unused portions of the NoC, and smart clock gating that clocks only the logic actively processing a packet, without introducing any extra latency.
Image courtesy of Arteris, Inc.
Developing today’s SoCs is challenging, but the complexities can be mitigated by employing a robust, proven NoC IP solution. SoC designers need a “one-stop shop” NoC tool to explore the solution space and address all their interconnect challenges. One such solution is the predominant industry standard FlexNoC interconnect IP from Arteris IP.
Andy Nightingale, VP of product marketing at Arteris IP, has over 35 years of experience in the high-tech industry, including 23 years spent in various engineering and product management positions at Arm.