TSMC to Build Supercomputing AI Chips, Ramps Wafer-Scale Computing
Over the past year, companies like Cerebras have made headlines for their use of wafer-scale processing. TSMC wants to grow this area of its business and plans to build out its InFO_SoW (Integrated Fan-Out Silicon on Wafer) technology in order to build supercomputer-class AI processors in the future.
TSMC has already contracted with Cerebras to build its wafer-scale processors, but the company has an eye on the broader market as well and believes wafer-scale processing will prove appealing to other customers beyond Cerebras. The company has stated it will build these chips on 16nm technology.
Understanding what TSMC is building here requires parsing more than a few acronyms. Integrated Fan-Out is a packaging technology TSMC has offered for several years. Typically, a wafer is cut into dies before being bonded to a package, with the package being larger than the physical die.
For companies that need an absolutely minimal package footprint, this arrangement isn’t ideal. There’s an alternate technique, known as wafer-level packaging (WLP), which eliminates the size discrepancy by packaging the die while it’s still part of the wafer. This results in substantial space savings, but it limits the number of electrical connections available to the chip.
InFO works around this limitation by combining a more traditional die-cutting process with additional steps that preserve most of WLP’s size advantage. Dies are cut in the conventional way, then remounted on a carrier wafer, with additional space left between each die for connectivity. 3DInCites has written a deep dive into InFO_SoW based on presentations TSMC gave at ECTC 2020. The point of InFO_SoW is to take the advantages InFO provides and extend them to wafer-sized processing blocks.
One of the theoretical advantages of wafer-scale processing is tremendous connectivity at minimal power consumption. The slide below illustrates some of the differences, including a dramatic reduction in PDN (Power Distribution Network) impedance. These claims echo findings from research teams that investigated wafer-scale processing last year as a plausible means of scaling CPU performance in the modern era.
InFO_SoW is only one advanced packaging technique TSMC is offering. The slide below shows how its various packaging options compare against each other in terms of power efficiency and vertical interconnect pitch.
According to TSMC, it can deliver a 2x bandwidth density improvement and a 97 percent reduction in impedance, while lowering interconnect power consumption by 15 percent.
Not bad, provided it works. Also, the TDP is amazing. Even though I know that number is for an entire 12-inch wafer, a 7,000W TDP is an eye-opener. The “chip” TSMC is building for Cerebras contains 400,000 cores and 1.2 trillion transistors.
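For a sense of scale, here is a back-of-the-envelope sketch using only the numbers quoted above (7,000W TDP, 400,000 cores, 1.2 trillion transistors); the per-core figures it derives are my own arithmetic, not numbers from TSMC or Cerebras.

```python
# Rough per-core math for the Cerebras wafer-scale chip,
# based on the figures quoted in this article.
TDP_WATTS = 7_000        # quoted TDP for the full 12-inch wafer
CORES = 400_000          # quoted core count
TRANSISTORS = 1.2e12     # quoted transistor count

watts_per_core = TDP_WATTS / CORES           # 0.0175 W
transistors_per_core = TRANSISTORS / CORES   # 3,000,000

print(f"{watts_per_core * 1000:.1f} mW per core")
print(f"{transistors_per_core / 1e6:.1f}M transistors per core")
# → 17.5 mW per core
# → 3.0M transistors per core
```

In other words, despite the headline-grabbing wafer-level TDP, each individual core draws well under a fifth of a watt.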
Packaging Is the New Hotness
If you think about it, most of the advances AMD and Intel have championed in recent years have been interconnect and packaging improvements. Chiplets and their associated packaging requirements, plus the evolution of HyperTransport into Infinity Fabric, have been major topics of conversation for AMD. Intel, meanwhile, has talked up its efforts with EMIB, Foveros, and Omni-Path.
The reason everyone is focused on packaging these days is that it’s become increasingly difficult to wring better performance out of transistors via die shrinks and process node improvements. Improving packaging technology is one of the ways companies are trying to boost performance without running afoul of the laws of physics.
These wafer-scale processors aren’t ever going to be something you install in a home; the estimated cost of a Cerebras wafer is two million dollars. What interests me about wafer-scale processing is the idea that the cloud could finally establish a genuine advantage over any single desktop installation, no matter how powerful. In theory, wafer-scale processing + cloud computing could be a game-changer for computing, provided we can work out the latency issues.
Feature image by Cerebras.