Formalising CXL Cache Coherence
By Chengsong Tan, Alastair F. Donaldson, John Wickerson (Imperial College, London, UK)
Abstract.
We report our experience formally modelling and verifying CXL.cache, the inter-device cache coherence protocol of the Compute Express Link standard. We have used the Isabelle proof assistant to create a formal model for CXL.cache based on the prose English specification. This led to us identifying and proposing fixes to several problems we identified as unclear, ambiguous or inaccurate, some of which could lead to incoherence if left unfixed. Nearly all our issues and proposed fixes have been confirmed and tentatively accepted by the CXL consortium for adoption, save for one which is still under discussion. To validate the faithfulness of our model we performed scenario verification of essential restrictions such as “Snoop-pushes-GO”, and produced a fully mechanised proof of a coherence property of the model. The considerable size of this proof, comprising tens of thousands of lemmas, prompted us to develop new proof automation tools, which we have made available for other Isabelle users working with similarly cumbersome proofs.
1. Introduction
Compute Express Link (CXL) is an emerging standard that provides cache coherence across multiple devices connected along a PCIe bus. Inter-device cache coherence is a boon to computer architects because it allows multiple devices to communicate with each other while transferring a minimal amount of data between them. CXL has the potential to be faster than other memory expansion methods and save stranded memory in cloud computing clusters.
CXL is not the first standard for inter-device cache coherence, but it is the first to enjoy broad support across the computer industry, with backers including Alibaba, AMD, Arm, Broadcom, Cisco, Dell, Ericsson, Google, Hewlett Packard, Huawei, IBM, Intel, Meta, Microsoft, Nvidia, Oracle, Qualcomm, Samsung, Synopsys, Xilinx, and many others.
The CXL standard is large, complex, new, and is set to form a trusted pillar of datacenter computers for years to come. As such, now is the ideal time to study the standard intensively. Does it contain inconsistencies? Is the wording unambiguous throughout? And perhaps most importantly: does it actually provide its stated guarantee of inter-device cache coherence?
We report here on our efforts to answer those questions.
To read the full article, click here
Related Semiconductor IP
- Process/Voltage/Temperature Sensor with Self-calibration (Supply voltage 1.2V) - TSMC 3nm N3P
- USB 20Gbps Device Controller
- SM4 Cipher Engine
- Ultra-High-Speed Time-Interleaved 7-bit 64GSPS ADC on 3nm
- Fault Tolerant DDR2/DDR3/DDR4 Memory controller
Related White Papers
- Achieving cache coherence in a MIPS32 multicore design
- Using OCP and Coherence Extensions to Support System-Level Cache Coherence
- Learning Cache Coherence Traffic for NoC Routing Design
- Processor forum examines embedded cache, architectures
Latest White Papers
- Fault Injection in On-Chip Interconnects: A Comparative Study of Wishbone, AXI-Lite, and AXI
- eFPGA – Hidden Engine of Tomorrow’s High-Frequency Trading Systems
- aTENNuate: Optimized Real-time Speech Enhancement with Deep SSMs on RawAudio
- Combating the Memory Walls: Optimization Pathways for Long-Context Agentic LLM Inference
- Hardware Acceleration of Kolmogorov-Arnold Network (KAN) in Large-Scale Systems