From the 1960s until the 2010s, engineering innovations that shrink transistors doubled the number of transistors on a single computer chip roughly every two years, a phenomenon known as Moore’s Law. Computer chips became millions of times faster and more efficient during this period.
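As a rough back-of-the-envelope illustration of that compounding, the sketch below assumes the two-year doubling period stated above and nothing else; it simply shows that five decades of doublings yields growth in the tens of millions, consistent with "millions of times" in the text.

```python
# Back-of-the-envelope illustration of Moore's Law compounding.
# Assumes one doubling of transistor count every 2 years, per the text above.

DOUBLING_PERIOD_YEARS = 2

def transistor_growth_factor(years: float) -> float:
    """Multiplicative growth in transistor count after `years` of doubling."""
    return 2 ** (years / DOUBLING_PERIOD_YEARS)

if __name__ == "__main__":
    for years in (10, 20, 50):
        print(f"{years} years -> roughly {transistor_growth_factor(years):,.0f}x more transistors")
    # 50 years -> roughly 33,554,432x, i.e. tens of millions of times.
```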
The transistors used in today’s state-of-the-art chips are only a few atoms wide. But creating even smaller transistors makes engineering problems increasingly difficult or even impossible to solve, causing the semiconductor industry’s capital expenditures and talent costs to grow at an unsustainable rate. As a result, Moore’s Law is slowing—that is, the time it takes to double transistor density is growing longer. The costs of continuing Moore’s Law are justified only because it enables continuing chip improvements, such as transistor efficiency, transistor speed, and the ability to include more specialized circuits in the same chip.
The economies of scale that historically favored general-purpose chips like central processing units (CPUs) have been upset by rising demand for specialized applications like AI and the slowing of Moore’s Law-driven CPU improvements. Accordingly, specialized AI chips are taking market share from CPUs.
AI chips include graphics processing units (GPUs), field-programmable gate arrays (FPGAs), and application-specific integrated circuits (ASICs) that are specialized for AI. General-purpose chips like CPUs can also be used for some simpler AI tasks, but they are becoming less and less useful as AI advances.
Like general-purpose CPUs, AI chips gain speed and efficiency (that is, they are able to complete more computations per unit of energy consumed) by incorporating huge numbers of smaller and smaller transistors, which run faster and consume less energy than larger transistors. But unlike CPUs, AI chips also have other, AI-optimized design features. These features dramatically accelerate the identical, predictable, independent calculations required by AI algorithms.
They include executing a large number of calculations in parallel rather than sequentially, as in CPUs; calculating numbers with low precision in a way that successfully implements AI algorithms but reduces the number of transistors needed for the same calculation; speeding up memory access by, for example, storing an entire AI algorithm in a single AI chip; and using programming languages built specifically to efficiently translate AI computer code for execution on an AI chip.
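Two of these features, parallel batched computation and reduced numerical precision, can be sketched in ordinary NumPy. The array shapes and dtypes below are illustrative rather than drawn from any particular chip, and real AI chips implement these ideas in hardware rather than in a software library; the point is only to show the kind of identical, independent arithmetic being accelerated.

```python
import numpy as np

# Illustrative only: a batch of identical, independent multiply-accumulate
# operations (the core of neural-network layers), expressed as one vectorized
# call so that parallel hardware can execute them simultaneously.
rng = np.random.default_rng(0)
batch = rng.standard_normal((1024, 512)).astype(np.float32)   # 1,024 independent inputs
weights = rng.standard_normal((512, 256)).astype(np.float32)  # shared layer weights

outputs_fp32 = batch @ weights  # one parallelizable batched matrix multiply

# Reduced precision: the same computation in 16-bit floats halves the memory
# (and, on AI hardware, the circuitry) needed per number, typically with
# little practical loss of accuracy for AI workloads.
outputs_fp16 = batch.astype(np.float16) @ weights.astype(np.float16)

print("fp32 inputs:", batch.nbytes // 1024, "KiB")
print("fp16 inputs:", batch.astype(np.float16).nbytes // 1024, "KiB")
print("max difference between results:", float(np.max(np.abs(outputs_fp32 - outputs_fp16))))
```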
Different types of AI chips are useful for different tasks. GPUs are most often used for initially developing and refining AI algorithms; this process is known as “training.” FPGAs are mostly used to apply trained AI algorithms to real-world data inputs; this is often called “inference.” ASICs can be designed for either training or inference.
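The training/inference distinction can be made concrete with a minimal sketch. The toy linear model, the made-up data, and the gradient-descent loop below are illustrative only; real workloads are vastly larger, but the division of labor is the same: a compute-heavy fitting phase, then cheap repeated application of the fitted parameters.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy data: 1,000 examples with 8 features and a noisy linear target.
X = rng.standard_normal((1000, 8))
true_w = rng.standard_normal(8)
y = X @ true_w + 0.1 * rng.standard_normal(1000)

# --- Training: repeatedly adjust model parameters to fit the data.
# This is the compute-heavy phase typically run on GPUs or training ASICs.
w = np.zeros(8)
learning_rate = 0.01
for _ in range(500):
    grad = 2 * X.T @ (X @ w - y) / len(y)  # gradient of mean squared error
    w -= learning_rate * grad

# --- Inference: apply the already-trained parameters to new inputs.
# This phase is often served on FPGAs or inference ASICs.
x_new = rng.standard_normal((1, 8))
prediction = x_new @ w
print("prediction for new input:", float(prediction))
```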
Because of their unique features, AI chips are tens or even thousands of times faster and more efficient than CPUs for training and inference of AI algorithms. State-of-the-art AI chips are also dramatically more cost-effective than state-of-the-art CPUs as a result of their greater efficiency for AI algorithms. An AI chip a thousand times as efficient as a CPU provides an improvement equivalent to 26 years of Moore’s Law-driven CPU improvements.
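The "26 years" figure follows from simple arithmetic: a 1,000-fold efficiency gain corresponds to log2(1000), roughly 10, doublings. The implied CPU efficiency-doubling period of about 2.6 years in the sketch below is inferred here from the numbers quoted, not a figure stated in the text.

```python
import math

# Arithmetic behind the "26 years" comparison in the text.
efficiency_gain = 1000   # AI chip vs. CPU, as stated above
equivalent_years = 26    # as stated above

doublings = math.log2(efficiency_gain)             # ~9.97 doublings
implied_doubling_period = equivalent_years / doublings

print(f"doublings needed: {doublings:.2f}")
print(f"implied CPU efficiency doubling period: {implied_doubling_period:.2f} years")
# -> roughly 2.6 years per doubling (an inference from the quoted figures,
#    not a number given in the text).
```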
Cutting-edge AI systems require not only AI-specific chips, but state-of-the-art AI chips. Older AI chips, with their larger, slower, and more power-hungry transistors, incur huge energy costs that quickly balloon to unaffordable levels. As a result, using older AI chips today imposes overall costs and slowdowns at least an order of magnitude greater than those of state-of-the-art AI chips.
These cost and speed dynamics make it virtually impossible to develop and deploy cutting-edge AI algorithms without state-of-the-art AI chips. Even with state-of-the-art AI chips, training an AI algorithm can cost tens of millions of U.S. dollars and take weeks to complete. In fact, at top AI labs, a large portion of total spending goes to AI-related computing. With general-purpose chips like CPUs, or even with older AI chips, the same training would take substantially longer and cost orders of magnitude more, effectively ruling out work at the research and deployment frontier. Similarly, performing inference with less advanced or less specialized chips would involve similar cost overruns and take orders of magnitude longer.
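A hedged back-of-the-envelope sketch of why the gap compounds: the dollar figure, the two-week baseline, and the 10x/1,000x efficiency penalties below are purely hypothetical placeholders, used only to show how a fixed efficiency disadvantage multiplies both time and cost. In practice, operators can trade time for cost by adding more chips; here both are scaled by the same factor for simplicity.

```python
# Hypothetical illustration of how a chip-efficiency penalty scales training
# cost and time. All numbers below are placeholders, not real benchmarks.

BASE_COST_USD = 10_000_000   # hypothetical training run on state-of-the-art AI chips
BASE_TIME_WEEKS = 2          # hypothetical duration of the same run

scenarios = {
    "state-of-the-art AI chips": 1,
    "older-generation AI chips (assumed 10x less efficient)": 10,
    "general-purpose CPUs (assumed 1,000x less efficient)": 1000,
}

for name, penalty in scenarios.items():
    cost = BASE_COST_USD * penalty
    time = BASE_TIME_WEEKS * penalty
    print(f"{name}: ~${cost:,.0f}, ~{time:,} weeks")
```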