Success or failure in designing microchips depends heavily on steps known as floorplanning and placement. These steps determine where memory and logic elements are located on a chip. The locations, in turn, strongly affect whether the completed chip design can satisfy operational requirements such as processing speed and power efficiency. So far, the floorplanning task, in particular, has defied all attempts at automation. It is therefore performed iteratively and painstakingly, over weeks or months, by expert human engineers. But in a paper in Nature, researchers from Google (Mirhoseini et al. [1]) report a machine-learning approach that achieves superior chip floorplanning in hours.
Modern chips are a miracle of technology and economics, with billions of transistors laid out and interconnected on a piece of silicon the size of a fingernail. Each chip can contain tens of millions of logic gates, called standard cells, along with thousands of memory blocks, known as macro blocks, or macros. The cells and macro blocks are interconnected by tens of kilometres of wiring to achieve the designed functionality. Given this staggering complexity, the chip-design process itself is another miracle — in which the efforts of engineers, aided by specialized software tools, keep the complexity in check.
The locations of cells and macro blocks in the chip are crucial to the design outcome. Their placement determines the distances that wires must span, and thus affects whether the wiring can be successfully routed between components and how quickly signals can be transmitted between logic gates. Optimization of chip placement has been extensively studied for at least six decades [2,3]. Seminal innovations in the mathematical field of applied optimization, such as a method known as simulated annealing [4], have been motivated by the challenge of chip placement.
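To make the link between placement and wire length concrete, here is a minimal sketch of simulated annealing on a toy placement problem. It assumes half-perimeter wirelength (HPWL) as the cost, a proxy that is common in the placement literature but is not named in this article; the 3×3 grid, the component names, the nets and the cooling schedule are all hypothetical.

```python
import math
import random

def hpwl(positions, nets):
    """Half-perimeter wirelength: for each net, bounding-box width plus height."""
    total = 0.0
    for net in nets:
        xs = [positions[c][0] for c in net]
        ys = [positions[c][1] for c in net]
        total += (max(xs) - min(xs)) + (max(ys) - min(ys))
    return total

def anneal(positions, nets, steps=2000, t_start=5.0, t_end=0.01, seed=0):
    """Swap two components at a time; accept worse moves with a probability
    that shrinks as the temperature falls. Returns the best placement seen."""
    rng = random.Random(seed)
    current = dict(positions)
    cost = hpwl(current, nets)
    best, best_cost = dict(current), cost
    names = list(current)
    for i in range(steps):
        # Geometric cooling from t_start down to t_end (an assumed schedule).
        t = t_start * (t_end / t_start) ** (i / (steps - 1))
        a, b = rng.sample(names, 2)
        current[a], current[b] = current[b], current[a]
        new_cost = hpwl(current, nets)
        delta = new_cost - cost
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            cost = new_cost
            if cost < best_cost:
                best, best_cost = dict(current), cost
        else:
            current[a], current[b] = current[b], current[a]  # undo the swap
    return best, best_cost

# Hypothetical toy instance: nine components on a 3x3 grid, wired in a chain.
slots = [(x, y) for y in range(3) for x in range(3)]
names = [f"c{i}" for i in range(9)]
shuffled = slots[:]
random.Random(1).shuffle(shuffled)
initial = dict(zip(names, shuffled))
nets = [(f"c{i}", f"c{i + 1}") for i in range(8)] + [("c0", "c4", "c8")]

initial_cost = hpwl(initial, nets)
best, final_cost = anneal(initial, nets)
print(f"HPWL before: {initial_cost}, after: {final_cost}")
```

Because the routine returns the best placement it has seen, the reported cost can never be worse than that of the starting placement. Real placers handle millions of cells of differing sizes, which this toy ignores.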
Because macro blocks can be thousands or even millions of times larger than standard cells, placing cells and blocks simultaneously is extremely challenging. Modern chip-design methods therefore place the macro blocks first, in a step called floorplanning. Standard cells are then placed in the remaining layout area. Just placing the macro blocks is incredibly complicated: Mirhoseini et al. estimate that the number of possible configurations (the state space) of macro blocks in the floorplanning problems solved in their study is about 10^2500. By comparison, the state space of the black and white stones used in the board game Go is just 10^360.
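The size of the gap between the two state spaces is easiest to see in logarithms. The short sketch below works only with the two exponents quoted above. As a purely illustrative assumption, it also shows that the number of ways of assigning 1,000 distinct items to 1,000 slots (1,000 factorial) already exceeds 10^2500; that example is not taken from this article.

```python
import math

LOG10_FLOORPLAN_STATES = 2500  # exponent quoted in the text
LOG10_GO_STATES = 360          # exponent quoted in the text

def log10_factorial(n):
    """log10(n!) via the log-gamma function, avoiding enormous integers."""
    return math.lgamma(n + 1) / math.log(10)

# The floorplanning state space is larger by this many orders of magnitude.
gap = LOG10_FLOORPLAN_STATES - LOG10_GO_STATES

# Illustrative assumption: orderings of 1,000 items in 1,000 slots.
log10_orderings = log10_factorial(1000)

print(f"Floorplanning exceeds Go by a factor of 10^{gap}")
print(f"log10(1000!) is about {log10_orderings:.1f}")
```

Working with exponents rather than the numbers themselves matters here, because 10^2500 is far too large to store as a floating-point value.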
Viable floorplanning solutions must leave empty regions on the chip to achieve all of the subsequent steps — placement of the standard cells, routing of the wiring and maximizing of the chip’s processing speed. However, the optimizations of logic circuitry inherent in these steps can increase the total area taken up by standard cells by 15% or more. Human engineers must therefore iteratively adjust their macro-block placements as the logic-circuit design evolves. Each of these iterations is carried out manually, and takes days or weeks.
The computer industry has famously been driven by Moore’s law: the number of components per chip has roughly doubled every two years. This rate of advancement corresponds to an increase in the number of components on a chip of about one per cent per week. The failure to automate floorplanning is therefore problematic, not only because of the associated time costs, but also because it limits the number of solutions that can be explored within chip-development schedules.
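The weekly figure follows from compounding. As a rough check, assuming exactly 52 weeks per year and a clean doubling every two years, the implied growth is a little under 0.7% per week, which is the same order as the one per cent quoted above.

```python
WEEKS_PER_YEAR = 52                       # assumption: ignore the extra day or two
weeks_per_doubling = 2 * WEEKS_PER_YEAR   # doubling every two years

# Solve (1 + r) ** weeks_per_doubling == 2 for the weekly growth rate r.
weekly_growth = 2 ** (1 / weeks_per_doubling) - 1

# Sanity check: compounding the weekly rate recovers the doubling.
recovered = (1 + weekly_growth) ** weeks_per_doubling

print(f"Weekly growth: {weekly_growth:.2%}")
print(f"Growth factor after two years: {recovered:.3f}")
```

At that rate, a floorplan iteration that takes several weeks is carried out against a target that has itself grown by a few per cent in the meantime.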
But everything changed on 22 April 2020. On that day, Mirhoseini et al. posted a preprint [5] of the current paper to the online arXiv repository. It stated that “in under 6 hours, our method can generate placements that are superhuman or comparable”; that is, the method can outperform humans in a startlingly short period of time. Within days, numerous semiconductor-design companies, design-tool vendors and academic-research groups had launched efforts to understand and replicate the results.
Mirhoseini and colleagues trained a machine-learning ‘agent’ that can successfully place macro blocks, one by one, into a chip layout. This agent has a brain-inspired architecture known as a deep neural network, and is trained using a paradigm called reinforcement learning. At any given step of floorplanning, the trained agent assesses the ‘state’ of the chip being developed, including the partial floorplan that it has constructed so far, and then applies its learnt strategy to identify the best ‘action’ — that is, where to place the next macro block.
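The state-and-action loop described here can be sketched in a few lines. This is not the authors' agent: the learnt neural-network policy is replaced by a hypothetical hand-written scoring rule (`stand_in_policy`) that prefers cells close to already-placed, connected blocks. Every block is assumed to occupy a single grid cell, and the grid size, block names and connections are made up for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class FloorplanState:
    """The 'state': grid dimensions plus the partial floorplan built so far."""
    grid_w: int
    grid_h: int
    placed: dict = field(default_factory=dict)  # block name -> (x, y) cell

    def free_cells(self):
        taken = set(self.placed.values())
        return [(x, y) for y in range(self.grid_h) for x in range(self.grid_w)
                if (x, y) not in taken]

def stand_in_policy(state, block, connections):
    """Hypothetical placeholder for the learnt policy: pick the free cell with
    the smallest total Manhattan distance to connected, already-placed blocks."""
    neighbours = [state.placed[other] for other in connections.get(block, ())
                  if other in state.placed]

    def score(cell):
        return sum(abs(cell[0] - nx) + abs(cell[1] - ny) for nx, ny in neighbours)

    return min(state.free_cells(), key=score)

def place_all(blocks, connections, grid_w, grid_h):
    """Place blocks one by one: observe the state, choose an action, apply it."""
    state = FloorplanState(grid_w, grid_h)
    for block in blocks:
        action = stand_in_policy(state, block, connections)  # where to put it
        state.placed[block] = action
    return state

# Hypothetical three-block example on a 3x3 grid.
connections = {"B": ["A"], "C": ["A", "B"]}
result = place_all(["A", "B", "C"], connections, grid_w=3, grid_h=3)
print(result.placed)
```

In a reinforcement-learning setting, the scoring rule would be replaced by a network whose parameters are adjusted according to the quality of the completed floorplans; the surrounding loop keeps the same shape.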
The technical details of this approach, such as how to represent the chip-design and partial-floorplanning solutions, were developed with the overarching goal of finding a general, transferable solution to the macro-placement problem. In other words, the trained agent should succeed even when confronted with chip designs that it has not previously encountered, drawn from a wide range of applications and markets. The authors report that, when their agent is pre-trained on a set of 10,000 chip floorplans, it is already quite successful when used in a ‘one shot’ mode on a new design: with no more than six extra hours of fine-tuning steps, the agent can produce floorplans that are superior to those developed by human experts for existing chips. Moreover, the agent’s solutions are very different from those of trained human experts (Fig. 1).
Arthur C. Clarke famously noted [6] that “any sufficiently advanced technology is indistinguishable from magic”. To long-time practitioners in the fields of chip design and design automation, Mirhoseini and colleagues’ results can indeed seem magical. In the past year, experts worldwide have contemplated questions such as, ‘How is it that the agent can initially place each macro block in turn so effectively that the chosen placement is used in the final, manufactured chip design?’
The authors report that the agent places macro blocks sequentially, in decreasing order of size, which means that a block can be placed next even if it has no connections (physical or functional) to previously placed blocks. When blocks have the same size, the agent’s choice of the next block echoes the choices made by ‘cluster-growth’ methods [7], which were previously developed in efforts to automate floorplan design, but were abandoned several decades ago. It will be fascinating to see whether the authors’ use of massive computation and deep learning reveals that chip designers took a wrong turn in giving up on sequential and cluster-growth methods.
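One way to picture this ordering is sketched below. The largest remaining block always goes next. The tie-break is an assumption made for illustration, in the spirit of cluster growth: among equal-sized blocks, prefer the one with the most connections to blocks already chosen. The article does not spell out the authors' exact rule, and the block names, areas and connections here are hypothetical.

```python
def placement_order(areas, connections):
    """Order blocks by decreasing area; break ties by the number of
    connections to blocks already in the order (an assumed rule)."""
    remaining = set(areas)
    order = []
    while remaining:
        largest = max(areas[b] for b in remaining)
        # Sort the tied names so the result is deterministic.
        tied = sorted(b for b in remaining if areas[b] == largest)
        chosen = set(order)
        nxt = max(tied, key=lambda b: len(chosen & set(connections.get(b, ()))))
        order.append(nxt)
        remaining.remove(nxt)
    return order

# Hypothetical blocks: one large memory and three equal-sized smaller ones.
areas = {"ram0": 400, "ram1": 100, "ram2": 100, "rom": 100}
connections = {
    "ram0": ["rom"], "rom": ["ram0"],
    "ram1": ["ram2"], "ram2": ["ram1"],
}
print(placement_order(areas, connections))
```

Here `rom` is picked ahead of the other equal-sized blocks because it connects to `ram0`, whereas `ram1` follows despite having no connection to anything chosen so far, mirroring the point that size, not connectivity, drives the order.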
Another much-debated question has been, ‘How does the agent’s choice of macro-block placements survive subsequent steps in the chip-design process?’ As mentioned earlier, human engineers must iteratively adjust their floorplans as the logic-circuit design evolves. The trained agent’s macro-block placements somehow evade such landmines in the design process, achieving superhuman outcomes for timing (ensuring that signals produced in the chip arrive at their destinations on time) and for the feasibility and efficiency with which wiring can be routed between components. Moreover, Mirhoseini and colleagues’ use of simple metrics as proxies for key parameters of the chip design works surprisingly well — it will be interesting to understand why these proxies are so successful. The authors’ intention to make their code available is invaluable in this light.
The development of methods for automated chip design that are better, faster and cheaper than current approaches will help to keep alive the ‘Moore’s law’ trajectory of chip technology. Indeed, for technical leaders and decision-makers in the chip industry, the most important revelation in Mirhoseini and colleagues’ paper might be that the authors’ floorplan solutions have been incorporated into the chip designs for Google’s next-generation artificial-intelligence processors. This means that the solutions are good enough for millions of copies to be printed on expensive, cutting-edge silicon wafers. We can therefore expect the semiconductor industry to redouble its interest in replicating the authors’ work, and to pursue a host of similar applications throughout the chip-design process.