ARM just showed 2021’s smartphone CPUs, led by the powerful Cortex-X1
ARM has announced its next generation of smartphone processor designs, which it says will deliver performance gains of 20 percent or more over the prior generation. The lineup includes something special: the new Cortex-X1, an optimized design built for “ultimate performance.”
In the smartphone industry, ARM designs its Cortex CPUs, Mali GPUs, and Ethos machine-learning processors, then licenses them to companies like Qualcomm. (This differs from the PC chip business, where AMD and Intel keep their designs proprietary—and Intel manufactures its own CPUs as well.) Those chip designers, in turn, are permitted to customize and enhance them, depending upon their license terms.
The new cores—the Cortex-A78, the Mali-G78, and the Ethos-N78—will debut in smartphones shipping in 2021, ARM executives said. The company is promising that the Cortex-A78 will deliver 20 percent greater sustained performance over the prior generation; the Mali-G78, 25 percent better overall performance; and the Ethos-N78, 25 percent more performance efficiency.
Then there’s the ARM Cortex-X1, which ARM is promising will deliver 30 percent more peak performance than the prior Cortex-A generation. This, according to ARM, represents a new category of “off roadmap” performance, requiring specific engineering collaboration with partners. It sounds like we’ll be hearing more about the first fruits of the Cortex-X1 partnership within the coming weeks.
“It answers the question of how much performance can be pushed for this generation when you’re not so constrained by the usual power area constraints,” Paul Williamson, vice president and general manager of ARM’s client line of business, said of the Cortex-X1. “It’s really targeting flagship smartphones and larger-screen devices. And given the silicon area and dissipated power, it’s not really something we expect to see in every device.”
Smartphone makers can use the new ARM cores either to maximize performance or to deliver better battery life at the same performance as the prior generation. The latter is actually the angle Williamson took when explaining the new cores, as the three deliver “more out of the same power budget as last year,” he said.
More on the Cortex-A78
Williamson said the A78 was specifically designed for the demands of 5G, with use cases that included how fast applications launch, and how responsive webpages are when scrolling. “Sustained performance in a device with limited power will avoid power throttling in really high-performance applications,” Williamson said. “So you’ll get less lag and less framerate drops.”
Like the prior Cortex-A77, the Cortex-A78 is designed for what ARM calls its big.LITTLE architecture, here an octacore layout with four high-performance A78 cores and four A55 cores optimized for long battery life. ARM said that a Cortex-A78 core running at 3GHz would deliver 20 percent more sustained single-core performance than a Cortex-A77 core running at 2.6GHz, assuming 1 watt per core. The figures are based on simulated estimates.
Alternatively, a phone maker could clock the A78 to match the A77’s performance while consuming half the power, Williamson said. ARM believes that the octacore Cortex-A78 layout will require 15 percent less die space than the Cortex-A77, which could help make phones smaller.
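To put ARM’s clock-speed figures in context, here is a rough back-of-the-envelope sketch in Python. It splits the claimed 20 percent sustained gain into the part that comes from the higher clock and the part left over for per-clock improvements. The function name and the assumption that performance scales linearly with clock frequency are ours, not ARM’s, and real workloads rarely scale that cleanly.

```python
# Figures ARM quoted for its simulated comparison (1 watt per core).
A77_CLOCK_GHZ = 2.6
A78_CLOCK_GHZ = 3.0
SUSTAINED_GAIN = 0.20  # A78 over A77, sustained single-core performance


def per_clock_gain(old_ghz: float, new_ghz: float, total_gain: float) -> float:
    """Return the share of a performance gain not explained by clock speed.

    Illustrative helper (not from ARM). Assumes performance scales linearly
    with clock frequency, which ignores memory and thermal limits.
    """
    clock_ratio = new_ghz / old_ghz
    return (1.0 + total_gain) / clock_ratio - 1.0


clock_uplift = A78_CLOCK_GHZ / A77_CLOCK_GHZ - 1.0
ipc_uplift = per_clock_gain(A77_CLOCK_GHZ, A78_CLOCK_GHZ, SUSTAINED_GAIN)

print(f"Clock uplift:     {clock_uplift:.1%}")
print(f"Per-clock uplift: {ipc_uplift:.1%}")
```

Under that simplifying assumption, most of the claimed gain would come from the higher clock at the same 1-watt budget, with only a few percent left for per-clock improvements. That fits ARM’s framing of the A78 as an efficiency play.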
Williamson said ARM is also moving ahead with its “Built on Cortex” technology program, which it established with some of its partners in 2016. “We’ve collaborated closely with a small number of key partners to deliver a new performance level that’s going beyond our traditional roadmap,” he said.
The fruits of that partnership are what it calls the Cortex-X Custom Program, and with it the Cortex-X1. “With this program, they [ARM partners and phone makers] can create devices that don’t compromise on the power and efficiency to deploy cores that deliver an all-out performance point,” Williamson said.
ARM says the Cortex-X1 offers 30 percent more peak single-core performance than the previous Cortex-A generation—a bit more than the 20-percent improvement offered by the Cortex-A78 in general. It’s designed for “ultimate performance,” Williamson said. He said he expects partners using the Cortex-X1 to announce their phones later this year.
ARM’s licensing terms require those companies to use the Cortex-X1 brand, which shouldn’t be an issue. An ARM representative noted that while a licensee like Qualcomm builds its Snapdragon smartphone processors on ARM—even branding its own CPU cores as “Kryo”—the company typically discloses exactly which ARM cores they’re built upon. “We view this as a win/win on both sides,” she said.
More on Mali-G78 and -G68
ARM’s Mali-G78 graphics processor includes several specific improvements: an increase to 24 cores, a 30-percent reduction in power for a key math unit, and performance optimizations aimed at complex gaming scenes involving smoke, grass, and trees. It’s the most powerful GPU ARM has made on its Valhall architecture, ARM said. Overall, there’s a 25-percent performance boost over the prior generation, the Mali-G77.
“Games are getting more complex, people are expecting console-like performance with Fortnite and [PlayerUnknown’s Battlegrounds] being played more often,” Williamson said. “Mobile-enabled gamers want to take the immersive experience on the go. And for that you need a high-performance GPU.”
ARM also said that it will offer a version of the Mali-G78 known as the Mali-G68, aimed at less expensive phones. Although the G68 will keep all of the G78’s features, it will top out at 6 cores rather than 24.
In addition to gaming performance, ARM says machine-learning capabilities have improved by about 15 percent inside the G78. That’s useful in AI-driven applications like face-unlocking and various camera modes, including AI-driven “portrait” modes that highlight the subject and blur the background.
More on Ethos-N78
ARM also has a dedicated machine-learning core, the Ethos-N78, which it has optimized for more efficient data movement.
ARM’s Ethos-N78 improves performance efficiency by 25 percent per square millimeter, and the company has increased its multiply-accumulate (MAC) capacity to a peak of 10 trillion operations per second (TOPS). “That means we’re doing more work in the same area, or the same work in less area with respect to previous-generation devices,” Williamson said.
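Williamson’s “same work in less area” point can be worked out from the 25 percent figure. The short Python sketch below is our own illustration with a hypothetical helper name. It assumes the efficiency gain applies uniformly, so the area needed for a fixed workload shrinks in inverse proportion.

```python
# ARM's claimed gain in performance per square millimeter for the Ethos-N78.
EFFICIENCY_GAIN = 0.25


def relative_area_for_same_work(gain: float) -> float:
    """Return the fraction of the old silicon area needed for the same work.

    Illustrative helper (not from ARM). Assumes throughput per unit area
    improves uniformly by `gain`.
    """
    return 1.0 / (1.0 + gain)


area = relative_area_for_same_work(EFFICIENCY_GAIN)
print(f"Same work in {area:.0%} of the previous area")
```

In other words, a 25 percent gain in performance per square millimeter does not mean 25 percent less silicon. On this assumption the same workload would need about four-fifths of the previous area.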