Scalable Cache Coherency

Overview

A scalable cache coherency solution for many-core architectures:
* Up to 1024 fully coherent cores
* Optimized protocol with false-sharing prevention
* Configurable for heterogeneous multi-core systems with distributed L2 and adaptive L3 caches

Key Features

  • Cache coherent NoC
    • Efficient Multi-die implementation support
  • RWT (Relaxed-Write-Through) protocol
  • Private L1 & shared L2 Cache controllers
  • Downstream interface of L2 cache compatible with AMBA CHI/ACE
  • Adaptive L3 cache
  • Atomics support equivalent to recent Arm and RISC-V platforms
    • SWAP, ADD, AND, OR, XOR, MAX (signed), MAX (unsigned), MIN (signed), MIN (unsigned), LL (Load-Linked)/SC (Store-Conditional)
    • 32-bit and 64-bit integer support
  • 2D implementation on 28nm FDSOI
  • 3D implementation on the INTACT silicon prototype: a high-performance processor with 6 chiplets of 16 cores each, 3D-stacked on an active interposer
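
To make the atomics list above concrete, here is a minimal Python reference model of what each read-modify-write operation leaves in memory. It assumes the usual Arm/RISC-V style semantics (two's-complement wrap-around, signed and unsigned variants of MAX and MIN); the function names `to_signed` and `amo` and the `MAXU`/`MINU` mnemonics are illustrative and do not come from the IP's documentation. LL/SC is not modeled.

```python
def to_signed(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of `value` as a two's-complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def amo(op: str, old: int, operand: int, bits: int = 32) -> int:
    """Return the value left in memory after applying `op` to `old` and `operand`.

    Assumed semantics only: results wrap to `bits` bits (32 or 64).
    """
    mask = (1 << bits) - 1
    old &= mask
    operand &= mask
    if op == "SWAP":
        new = operand
    elif op == "ADD":
        new = old + operand
    elif op == "AND":
        new = old & operand
    elif op == "OR":
        new = old | operand
    elif op == "XOR":
        new = old ^ operand
    elif op == "MAX":   # signed comparison
        new = old if to_signed(old, bits) >= to_signed(operand, bits) else operand
    elif op == "MAXU":  # unsigned comparison
        new = max(old, operand)
    elif op == "MIN":   # signed comparison
        new = old if to_signed(old, bits) <= to_signed(operand, bits) else operand
    elif op == "MINU":  # unsigned comparison
        new = min(old, operand)
    else:
        raise ValueError(f"unknown atomic operation: {op}")
    return new & mask
```

The signed and unsigned variants differ only when the top bit is set: with a 32-bit word, `amo("MAX", 0xFFFFFFFF, 1)` treats the first value as -1 and keeps 1, whereas `amo("MAXU", 0xFFFFFFFF, 1)` keeps `0xFFFFFFFF`.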

Benefits

  • Quasi-linear speedup with respect to the number of cores.
  • Area-efficient: coherency area overhead below 2% with 34 MB of caches for 96 cores
  • Energy-efficient: coherency traffic consumes less than 1% of the power budget of the whole interconnect system
  • High scalability:
    • Up to 1024 fully coherent cores
    • Distributed directory with efficient sharing-set representation in Home Nodes
    • 4-channel NoC with only 2 channels carrying data
  • Efficient fine-grained parallelism
    • Low-hardware-cost coherency protocol
    • Low-latency exchanges
    • Prevention of false sharing
    • Adaptive data redundancy
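
The sharing-set point above is easier to appreciate with numbers: a full bit vector for 1024 cores needs 1024 directory bits per cache line, which is why compact encodings matter at this scale. The sketch below shows one well-known compact scheme, a coarse bit vector with one bit per group of cores. It is purely illustrative: the source does not state which representation the IP uses, and the class name `CoarseSharingSet` and the group size of 16 are assumptions.

```python
from dataclasses import dataclass


@dataclass
class CoarseSharingSet:
    """Hypothetical sharer tracking: one directory bit per group of cores."""
    num_cores: int = 1024
    cores_per_bit: int = 16   # assumed group size, not taken from the source
    bits: int = 0             # bit g set => some core in group g may hold the line

    def width(self) -> int:
        """Number of directory bits needed per cache line."""
        return -(-self.num_cores // self.cores_per_bit)  # ceiling division

    def add(self, core: int) -> None:
        """Record that `core` has obtained a copy of the line."""
        if not 0 <= core < self.num_cores:
            raise ValueError(f"core {core} out of range")
        self.bits |= 1 << (core // self.cores_per_bit)

    def invalidation_targets(self) -> list[int]:
        """Cores to invalidate on a write: a superset of the true sharers."""
        targets = []
        for group in range(self.width()):
            if (self.bits >> group) & 1:
                start = group * self.cores_per_bit
                end = min(start + self.cores_per_bit, self.num_cores)
                targets.extend(range(start, end))
        return targets
```

With these assumed parameters the directory entry shrinks from 1024 bits to 64, at the cost of over-invalidating: after cores 17 and 18 register as sharers, a write sends invalidations to all of cores 16 to 31.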

Block Diagram

Scalable Cache Coherency Block Diagram

Applications

  • Dedicated solution for many-core and accelerator architectures in:
    • Automotive and Transport
    • Embedded systems and set-top boxes
    • Health and well-being
    • High-end Computing

Technical Specifications
