QiMeng: Fully Automated Hardware and Software Design for Processor Chip

By Rui Zhang 1, Yuanbo Wen 1, Shuyao Cheng 1, Di Huang 1, Shaohui Peng 2, Jiaming Guo 1, Pengwei Jin 1, Jiacheng Zhao 1, Tianrui Ma 1, Yaoyu Zhu 1, Yifan Hao 1, Yongwei Zhao 1, Shengwen Liang 1, Ying Wang 1, Xing Hu 1, Zidong Du 1, Huimin Cui 1, Ling Li 2,3, Qi Guo 1, Yunji Chen 1,3
1 State Key Lab of Processors, Institute of Computing Technology, CAS 
2 Intelligent Software Research Center, Institute of Software, CAS 
3 University of Chinese Academy of Sciences

Abstract 

Processor chip design technology serves as a key frontier driving breakthroughs in computer science and related fields. With the rapid advancement of information technology, conventional design paradigms face three major challenges: the physical constraints of fabrication technologies, the escalating demands for design resources, and the increasing diversity of ecosystems. Automated processor chip design has emerged as a transformative solution to address these challenges. While recent breakthroughs in Artificial Intelligence (AI), particularly Large Language Model (LLM) techniques, have opened new possibilities for fully automated processor chip design, substantial challenges remain in establishing domain-specific LLMs for processor chip design.

In this paper, we propose QiMeng, a novel system for fully automated hardware and software design of processor chips. QiMeng comprises three hierarchical layers. In the bottom-layer, we construct a domain-specific Large Processor Chip Model (LPCM) that introduces novel designs in architecture, training, and inference, to address key challenges such as knowledge representation gap, data scarcity, correctness assurance, and enormous solution space. In the middle-layer, leveraging the LPCM’s knowledge representation and inference capabilities, we develop the Hardware Design Agent and the Software Design Agent to automate the design of hardware and software for processor chips. Currently, several components of QiMeng have been completed and successfully applied in various top-layer applications, demonstrating significant advantages and providing a feasible solution for efficient, fully automated hardware/software design of processor chips. Future research will focus on integrating all components and performing iterative top-down and bottom-up design processes to establish a comprehensive QiMeng system.

I. INTRODUCTION

As the fundamental hardware platform for computing systems, processors and chips undertake critical functions including instruction execution, data processing, and resource management. These processors and chips power diverse devices ranging from personal computers, servers, and smartphones to Internet of Things (IoT) equipment, forming the technological foundation of modern digital economies. Processor chip design represents both a strategically important industry for national economic development and a cutting-edge research field that drives progress in computer science. As a highly complex and systematic task, processor chip design requires tight hardware-software co-design to meet functional requirements while optimizing performance, power, and area (PPA). These requirements make processor chip design one of the most challenging research topics across both industrial and academic domains.

The evolution of information technology has revealed three fundamental limitations in current processor chip design methodologies: constrained fabrication technology, limited resources, and diverse ecosystems. On the fabrication side, as semiconductor fabrication nears physical limits below 7nm nodes, phenomena such as quantum tunneling and short-channel effects become increasingly problematic, rendering conventional performance scaling through fabrication technology ineffective and necessitating innovations in design methodology. From a resource perspective, conventional design flows demand extensive expertise and labor-intensive design-verification iterations to ensure functional correctness while balancing competing design objectives such as PPA. This results in protracted development timelines and substantial costs. On the ecosystem side, emerging applications in Artificial Intelligence (AI), cloud, and edge computing require specialized architectures with customized foundational software support. Conventional chip design approaches cannot meet this ecosystem challenge efficiently because of their inherently long timelines and substantial costs. In sum, these challenges underscore the urgent need for novel design paradigms that can deliver enhanced performance, improved efficiency, and reduced costs while meeting diverse application requirements.

Automated processor chip design, which aims to automate the entire design and verification pipeline of processor chips, presents a promising solution to overcome the abovementioned limitations. By leveraging AI methodologies, automated processor chip design exhibits the potential to surpass manual design and achieve better performance under identical fabrication technology. Additionally, the automated processor design approach is capable of dramatically reducing manual intervention, significantly improving design efficiency while shortening development cycles and lowering costs. Furthermore, it enables rapid customization of chip architectures and software stacks tailored to specific application domains, addressing the growing demand for specialized computing solutions.

Recent breakthroughs in Large Language Models (LLMs) and Multi-Agent systems have created new opportunities for automated processor chip design. State-of-the-art LLMs such as DeepSeek-V3 [1], DeepSeek-R1 [2], Qwen3 [3], GPT-4o [4], and Gemini 2.5 Pro [5] have demonstrated remarkable capabilities in question answering, planning, and reasoning, exhibiting the potential of artificial general intelligence (AGI). Post-training on domain-specific data yields domain-specialized LLMs, which have shown impressive results across scientific disciplines such as computational biology [6], materials science, and chemistry [7]. More advanced LLM-based agents integrate cognitive abilities with the tool-use skills of LLMs to autonomously plan and execute complex workflows [8]. These developments in LLMs and agents suggest new pathways toward fully automated processor chip design.

Nevertheless, due to the distinctive nature of processor chip design, applying LLMs and agents to automated processor chip design faces four principal challenges: knowledge representation gap, data scarcity, correctness guarantee, and enormous solution space. First, the knowledge representation gap: critical processor chip design data employs graph structures, such as abstract syntax trees (ASTs), data flow graphs (DFGs), and control flow graphs (CFGs). Graph data exhibits an inherent semantic gap with the sequential text that LLMs typically process, constraining the capacity for domain knowledge representation and limiting the processor chip design capabilities of LLMs. Second, the data scarcity: unlike the vast petabyte-scale text corpora available on the Internet for training general-purpose LLMs, processor chip design data are orders of magnitude smaller, amounting to merely terabyte scale in open-source communities like GitHub, which severely constrains the development of domain-specialized LLMs for processor chip design. Third, the correctness guarantee: processor design demands rigorous verification standards, which fundamentally conflict with the probabilistic nature of LLMs. For example, Intel’s Pentium 4 processor required 99.99999999999% accuracy in functional verification [9]. Finally, the enormous solution space: processor design spans multiple abstraction stages from foundational software to physical layouts, so modeling the design space directly at the raw bitstream level suffers from a dimensionality explosion. For example, the solution space for a 32-bit CPU reaches $10^{10^{540}}$. This enormous solution space poses extreme challenges for deriving processor designs that are both functionally correct and performance-optimized.

To realize this transformative paradigm, we propose QiMeng, a novel system for fully automated hardware and software design for processor chips. Consisting of three layers, QiMeng constructs a Large Processor Chip Model (LPCM) as a domain-specialized LLM for processor chip design in the bottom-layer and further creates both a Hardware Design Agent and a Software Design Agent based on LPCM in the middle-layer, enabling automated hardware and software design, respectively. Finally, the two agents support various processor chip design applications in the top-layer, as shown in Figure 1.
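To give a feel for how quickly a design space modeled at the raw bit level can grow, the following minimal sketch counts the Boolean functions with a given number of input and output bits. It relies only on the standard fact that there are 2^(m * 2^n) functions from n input bits to m output bits. The function name and the input widths in the loop are illustrative assumptions, not figures taken from the paper, and the sketch does not claim to reproduce how the paper derives its own estimate.

```python
import math


def log10_log10_function_count(n_inputs: int, n_outputs: int = 1) -> float:
    """Return log10(log10(N)), where N counts functions {0,1}^n -> {0,1}^m.

    N = 2 ** (m * 2 ** n), so
    log10(N)        = m * 2**n * log10(2)
    log10(log10(N)) = log10(m) + n * log10(2) + log10(log10(2)).
    Working with the double logarithm avoids overflowing floats.
    """
    return (
        math.log10(n_outputs)
        + n_inputs * math.log10(2)
        + math.log10(math.log10(2))
    )


if __name__ == "__main__":
    # Hypothetical input widths, chosen only to show the growth rate.
    for n in (8, 32, 64, 256):
        exponent = log10_log10_function_count(n)
        print(f"{n:4d} input bits -> about 10^(10^{exponent:.1f}) functions")
```

Each additional input bit doubles the exponent of the count, which is why exhaustive search over raw bit-level designs is hopeless and the space has to be decomposed and pruned.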

Fig. 1.  QiMeng consists of three layers, a domain-specialized Large Processor Chip Model (LPCM) in the bottom-layer, Hardware Design Agent and Software Design Agent enabling automated hardware and software design based on LPCM in the middle-layer, and various processor chip design applications in the top-layer.

In QiMeng, to overcome the above-mentioned four challenges, LPCM is meticulously designed to incorporate domain-specialized knowledge and fundamental competencies of processor chip design. LPCM distinguishes itself from general-purpose LLMs through unique innovations in its architecture, training, and inference. Regarding architecture, LPCM employs a multi-modal structure that enables it to comprehend and represent the graph data inherent to the processor chip domain, which addresses the critical challenge of the knowledge representation gap. For training, it is critical to automatically generate extensive domain-specific data of processor chip design. For each abstraction stage of processor chip design, domain-specific data is systematically collected, and single-stage automated design models are independently trained. These models are subsequently cascaded to autonomously generate extensive cross-stage aligned data for processor chip design. Leveraging this aligned data, LPCM can be trained to learn domain knowledge from the hierarchical design process, effectively mitigating the data scarcity challenge. During inference, two feedback-driven mechanisms are implemented. By constructing correctness feedback from automated functional verification, LPCM is able to autonomously repair erroneous results and ensure the validity of generated outputs, addressing the challenge of ensuring correctness in processor design. Concurrently, leveraging performance feedback from automated performance evaluation, LPCM is capable of decomposing the solution space and pruning the low-performance subspaces. Thus, LPCM can effectively reduce the dimensionality of the solution space and enable efficient exploration of high-performance design solutions, overcoming the challenge of the enormous solution space.
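The correctness-feedback mechanism described above can be pictured as a generate, verify, and repair loop. The sketch below is a toy illustration under strong assumptions: the "design" is a truth table for a small Boolean function, verification is exhaustive simulation against a reference specification, and the repair step simply patches the failing input. The names `verify`, `repair`, `generate_with_repair`, and `majority` are hypothetical stand-ins and are not the paper's actual model or API.

```python
from itertools import product
from typing import Callable, Dict, Optional, Tuple

Bits = Tuple[int, ...]


def verify(candidate: Dict[Bits, int], spec: Callable[[Bits], int], n: int) -> Optional[Bits]:
    """Exhaustively compare candidate with spec; return a counterexample or None."""
    for x in product((0, 1), repeat=n):
        if candidate.get(x, 0) != spec(x):  # unlisted inputs default to 0
            return x
    return None


def repair(candidate: Dict[Bits, int], cex: Bits, spec: Callable[[Bits], int]) -> Dict[Bits, int]:
    """Toy stand-in for model-driven repair: fix the output on the failing input."""
    fixed = dict(candidate)
    fixed[cex] = spec(cex)
    return fixed


def generate_with_repair(spec: Callable[[Bits], int], n: int, max_repairs: int):
    """Loop until verification passes or the repair budget is exhausted."""
    candidate: Dict[Bits, int] = {}  # stand-in for a first, possibly wrong, draft
    repairs = 0
    while True:
        cex = verify(candidate, spec, n)
        if cex is None:
            return candidate, repairs
        if repairs == max_repairs:
            raise RuntimeError("no verified design within the repair budget")
        candidate = repair(candidate, cex, spec)
        repairs += 1


def majority(x: Bits) -> int:
    """Reference specification: 1 when more than half of the bits are set."""
    return int(sum(x) * 2 > len(x))


if __name__ == "__main__":
    design, used = generate_with_repair(majority, n=3, max_repairs=8)
    print(f"verified after {used} repairs")
```

The point of the structure is that only verified candidates ever leave the loop, so the probabilistic generator is wrapped by a deterministic check rather than trusted directly.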

Based on LPCM, QiMeng develops two specialized agents, a Hardware Design Agent and a Software Design Agent, dedicated to the automated design of hardware and software for processors and chips. The Hardware Design Agent adopts a dual-loop mechanism, consisting of an outer module decomposition feedback loop based on performance optimization and an inner module generation feedback loop empowered by automated verification and repair. This dual-loop mechanism facilitates end-to-end automated design from functional specifications to physical layouts, unifying conventionally disjointed stages such as logic design, circuit design, and physical design. Thus, the Hardware Design Agent enables a fully integrated, cross-stage collaborative design paradigm that is expected to surpass conventional human design, potentially achieving superior performance under identical fabrication technology. Meanwhile, the Software Design Agent also employs a dual-loop mechanism, consisting of an outer performance optimization feedback loop guided by the LLM and an inner function adaptation feedback loop based on automated verification and repair. The Software Design Agent autonomously achieves seamless functional adaptation and performance optimization of foundational software for target processor chips, addressing the dynamic and escalating demands of modern applications.
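As a rough structural sketch of the dual-loop idea, the code below nests an inner generate, verify, and repair loop for each module inside an outer loop that tries alternative module decompositions and prunes those whose accumulated cost already matches or exceeds the best found so far. Everything here is hypothetical: the function names, the additive non-negative cost model, and the toy "modules" (plain bit widths) are assumptions made for illustration and do not describe the agents' real interfaces.

```python
import math
from typing import Callable, Iterable, List, Optional, Tuple


def inner_loop(generate, verify, repair, spec, max_repairs: int = 3):
    """Inner loop: generate a module, then verify and repair it up to max_repairs times."""
    design = generate(spec)
    for _ in range(max_repairs):
        error = verify(design, spec)
        if error is None:
            return design
        design = repair(design, error)
    return design if verify(design, spec) is None else None


def outer_loop(
    decompositions: Iterable[List[int]],
    build_module: Callable[[int], Optional[dict]],
    estimate_cost: Callable[[dict], float],
) -> Tuple[Optional[List[dict]], float]:
    """Outer loop: pick the cheapest decomposition, pruning hopeless ones early.

    Pruning is sound only because costs are assumed additive and non-negative.
    """
    best_cost, best = math.inf, None
    for modules in decompositions:
        designs, cost = [], 0.0
        for spec in modules:
            design = build_module(spec)
            if design is None:
                break  # a module could not be verified; drop this decomposition
            cost += estimate_cost(design)
            if cost >= best_cost:
                break  # performance feedback: cannot beat the best so far
            designs.append(design)
        else:
            best_cost, best = cost, designs
    return best, best_cost


# Toy stand-ins for the model, the verifier, and the performance estimator.
def toy_generate(width: int) -> dict:
    return {"width": width - 1}  # deliberately wrong first draft


def toy_verify(design: dict, width: int) -> Optional[str]:
    return None if design["width"] == width else "width mismatch"


def toy_repair(design: dict, error: str) -> dict:
    return {"width": design["width"] + 1}


def toy_cost(design: dict) -> float:
    return float(design["width"] ** 2)  # hypothetical cost model


if __name__ == "__main__":
    build = lambda w: inner_loop(toy_generate, toy_verify, toy_repair, w)
    best, cost = outer_loop([[8], [4, 4], [2, 2, 2, 2]], build, toy_cost)
    print([d["width"] for d in best], cost)
```

The Software Design Agent's dual loop has the same shape in this reading, with the outer loop choosing optimization strategies and the inner loop adapting and verifying functionality.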

Leveraging the Hardware Design Agent and Software Design Agent, various applications can be developed to address diverse real-world use cases of processor chip design. For automated hardware design, significant milestones have been accomplished, including automated front-end design and automated HDL generation. In automated software design, achievements include automated OS configuration optimization, automated compiler tool-chain design, an automated tensor program transcompiler, and automated high-performance library generation. These applications have driven the implementation of key components within QiMeng, establishing a solid foundation for its full realization. Moving forward, we will construct QiMeng through a three-phase approach, transitioning from top-down to bottom-up, ultimately achieving a self-evolving framework. Initially, in the top-down phase, the implementation of diverse automated design applications in the top-layer will provide the two agents in the middle-layer with design expertise and generate extensive domain-specific data to enhance the capabilities of the underlying LPCM. Subsequently, in the bottom-up phase, the improved LPCM and the hardware and software design agents will be applied across a broader spectrum of processor chip design applications. Ultimately, in the iteration phase, an iterative cycle integrating top-down and bottom-up approaches will be established to enable the self-evolution of QiMeng, progressively advancing its fully automated processor chip design capabilities while extending its applicability to support increasingly diverse and complex scenarios.

Aiming to present a comprehensive framework for fully automated hardware and software design for processor chips, this work introduces QiMeng, along with its roadmap, design methodology, and applications. This paper is structured as follows: Section II provides the motivation of QiMeng and its roadmap; Section III elaborates the design of LPCM, encompassing architecture, training and inference; Section IV details the Hardware Design Agent and Software Design Agent; Section V showcases diverse applications enabled by key components of QiMeng; Section VI surveys related research in automated processor chip design; Section VII concludes with insights into future research trajectories.
