Formalising CXL Cache Coherence

By Chengsong Tan, Alastair F. Donaldson, John Wickerson (Imperial College, London, UK)

Abstract.

We report our experience formally modelling and verifying CXL.cache, the inter-device cache coherence protocol of the Compute Express Link standard. We have used the Isabelle proof assistant to create a formal model for CXL.cache based on the prose English specification. This led to us identifying and proposing fixes to several problems we identified as unclear, ambiguous or inaccurate, some of which could lead to incoherence if left unfixed. Nearly all our issues and proposed fixes have been confirmed and tentatively accepted by the CXL consortium for adoption, save for one which is still under discussion. To validate the faithfulness of our model we performed scenario verification of essential restrictions such as “Snoop-pushes-GO”, and produced a fully mechanised proof of a coherence property of the model. The considerable size of this proof, comprising tens of thousands of lemmas, prompted us to develop new proof automation tools, which we have made available for other Isabelle users working with similarly cumbersome proofs.

1. Introduction

Compute Express Link (CXL) is an emerging standard that provides cache coherence across multiple devices connected along a PCIe bus. Inter-device cache coherence is a boon to computer architects because it allows multiple devices to communicate with each other while transferring a minimal amount of data between them. CXL has the potential to be faster than other memory expansion methods and save stranded memory in cloud computing clusters.

CXL is not the first standard for inter-device cache coherence, but it is the first to enjoy broad support across the computer industry, with backers including Alibaba, AMD, Arm, Broadcom, Cisco, Dell, Ericsson, Google, Hewlett Packard, Huawei, IBM, Intel, Meta, Microsoft, Nvidia, Oracle, Qualcomm, Samsung, Synopsys, Xilinx, and many others.

The CXL standard is large, complex, new, and is set to form a trusted pillar of datacenter computers for years to come. As such, now is the ideal time to study the standard intensively. Does it contain inconsistencies? Is the wording unambiguous throughout? And perhaps most importantly: does it actually provide its stated guarantee of inter-device cache coherence?

We report here on our efforts to answer those questions.

To read the full article, click here

×
Semiconductor IP