ARM is currently the most widely used CPU architecture in the world.

It may not power your computer, but it almost certainly powers your smartphone or tablet. Furthermore, Apple now uses ARM architecture on their in-house CPU designs.

The reason for that is immediately clear.

ARM CPUs can deliver strong performance while keeping power consumption low. So low, in fact, that they can fit inside thin smartphones with passive cooling. A big chunk of that power efficiency comes from the big.LITTLE core setup, or a similar mix of big and small cores, which is used by Apple, Qualcomm, MediaTek, and other CPU makers.

But what exactly is big.LITTLE, and why is it so important?

What Is ARM's big.LITTLE CPU Architecture?

Most modern CPUs have multiple cores, and the system divides tasks between them. Normally, these multi-core CPUs feature identical cores that can run the same instructions and reach the same clock speeds. All tasks, big or small, are distributed across these cores. Not so with ARM big.LITTLE CPUs.

While ARM CPUs with "conventional" core setups exist, big.LITTLE-based CPU designs feature two core "clusters" with differently designed cores for different tasks. In these kinds of CPUs, we'll often see "high-performance" cores designed to take on demanding tasks, and "power-efficient" cores, which handle more conventional tasks. The high-performance cores are usually high-specced and power-hungry and reach markedly higher clock speeds, whereas the power-efficient cores are weaker, lower-clocked, and consume far less power.

In a smartphone, these "conventional tasks" include texting, e-mail, calls, audio, and more, which comprise the majority of the commonplace tasks a smartphone must do. These are meant to be offloaded to the power-efficient cores, while the bigger high-performance cores are left for other, more demanding tasks, like mobile gaming and web browsing. The system uses global task scheduling, or heterogeneous multi-processing (HMP), to distribute workloads between all the different CPU cores.

The advantage is twofold. Since the everyday tasks are handled by the smaller cores, which are less power-hungry by nature, these CPUs typically consume considerably less power. They also perform better, since more demanding tasks have a cluster of CPU cores all to themselves. You're getting a CPU that both performs better and is more power-efficient.

A Revolutionary Idea With a Rocky History

Before the arrival of big.LITTLE, multi-core ARM CPUs generally featured an arrangement of identical cores, just like x86 CPUs. big.LITTLE was first introduced in October of 2011, alongside the new Cortex-A7 core design, which was built to pair with the Cortex-A15 that ARM had announced earlier. In that proposed system, the Cortex-A15 would act as the big core, while the Cortex-A7 would act as the small core. From there, upcoming core designs from ARM Holdings would all be compatible with big.LITTLE, for silicon manufacturers to use as they saw fit.

One of the first CPUs to launch with this core design was Samsung's Exynos 5 Octa 5410, which powered the Samsung Galaxy S4 in 2013. It featured four Cortex-A7 cores clocked at 1.2 GHz and four Cortex-A15 cores clocked at 1.6 GHz for a total of 8 cores.

The scheduler for these earlier big.LITTLE CPUs, however, was pretty clumsy. It used "clustered switching," which addressed whole clusters at once. If the load on the whole processor was low, it used the low-power cores, but if the load increased, it moved the entire workload over to the big cores. It's definitely one way to do it, but in retrospect, it was a fairly inefficient one.
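As a rough illustration of why clustered switching is wasteful, here is a minimal toy model in Python. It is only a sketch, not how any real kernel implements it: the `THRESHOLD` value, the load numbers, and the function names are all hypothetical and chosen purely for illustration.

```python
# Toy model of clustered switching: only one cluster is active at a time,
# so every task lands on the same kind of core.
# THRESHOLD and the load values are hypothetical illustration numbers.

THRESHOLD = 0.5  # assumed cutoff on average load (0.0 to 1.0)


def pick_cluster(task_loads):
    """Choose the single active cluster from the average load of all tasks."""
    if not task_loads:
        return "LITTLE"
    average = sum(task_loads) / len(task_loads)
    return "big" if average > THRESHOLD else "LITTLE"


def cluster_switch_assign(task_loads):
    """Place every task on the chosen cluster, regardless of its own load."""
    cluster = pick_cluster(task_loads)
    return [cluster for _ in task_loads]


light = cluster_switch_assign([0.1, 0.2, 0.1, 0.3])
mixed = cluster_switch_assign([0.1, 0.2, 0.9, 0.95])
print(light)
print(mixed)
```

In the mixed case, two heavy tasks drag the two light tasks onto the big cores as well, which is exactly the inefficiency described above.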

Then, we saw the in-kernel switcher. Here, each big core is paired with a small core, and the scheduler addresses each pair as a single "virtual core." Depending on whether a virtual core is given a low or high load, it switches between using the small core and the big core.
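The pairing idea can be sketched in a few lines of Python. This is a simplified toy under stated assumptions, not the actual kernel code: `THRESHOLD`, the load values, and `iks_assign` are hypothetical names and numbers used only to show the per-pair decision.

```python
# Toy model of the in-kernel switcher: each virtual core is a (big, LITTLE)
# pair, and only one physical core of the pair runs at any moment.
# THRESHOLD and the load values are hypothetical illustration numbers.

THRESHOLD = 0.5  # assumed per-virtual-core load cutoff (0.0 to 1.0)


def iks_assign(virtual_core_loads):
    """Pick the active physical core for each virtual core from its own load."""
    return ["big" if load > THRESHOLD else "LITTLE" for load in virtual_core_loads]


active = iks_assign([0.1, 0.2, 0.9, 0.95])
print(active)
```

The decision is now made per pair rather than per cluster, but half of the physical cores still sit idle at any given time.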

From there, we ended up with heterogeneous multi-processing. Here, each core can be addressed individually, so all of them can be in use at the same time. The scheduler knows which cores are big and which are small, and it sends lower loads to the power-efficient cores and bigger loads to the high-performance ones.
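A minimal Python sketch of that per-core placement follows. It is a toy model under loud assumptions: real schedulers track load over time and migrate tasks, whereas this one places each task once. The `THRESHOLD` value, the core names, and `hmp_assign` are hypothetical.

```python
# Toy model of heterogeneous multi-processing: every core is individually
# addressable, and each task is placed according to its own load.
# THRESHOLD, the core names, and the load values are hypothetical.
from collections import deque

THRESHOLD = 0.5  # assumed per-task load cutoff (0.0 to 1.0)


def hmp_assign(task_loads, big_cores=4, little_cores=4):
    """Map each task index to a core name, preferring the matching core type."""
    free_big = deque(f"big{i}" for i in range(big_cores))
    free_little = deque(f"little{i}" for i in range(little_cores))
    placement = {}
    for task, load in enumerate(task_loads):
        if load > THRESHOLD:
            preferred, fallback = free_big, free_little
        else:
            preferred, fallback = free_little, free_big
        pool = preferred or fallback  # an empty deque is falsy
        if not pool:
            break  # more tasks than cores; a real scheduler would queue them
        placement[task] = pool.popleft()
    return placement


placement = hmp_assign([0.1, 0.9, 0.2, 0.95, 0.8])
print(placement)
```

Unlike the two earlier approaches, light and heavy tasks run side by side here, each on the kind of core that suits it.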

How Has big.LITTLE Changed the CPU Landscape?

ARM processors already had a good reputation for providing a decent balance between performance and power efficiency. On these CPUs, a low power draw is essential. After all, these processors are used in smartphones, and smartphones are small, have slim bodies, and don't feature any kind of active cooling, so thermal headroom is very limited, and CPUs need to sip power to stay within it.

big.LITTLE, though, was huge because it was able to improve both performance and power efficiency simultaneously. Nowadays, most smartphone ARM CPUs are built around a big.LITTLE-style design, including the chips in Apple's phones. Now, Intel is set to take a page from this architecture for its x86 processors going forward: Alder Lake processors will introduce the concept of heterogeneous computing to the PC landscape.

The advantages are simply too huge to deny.

What Is DynamIQ?

DynamIQ is a core architecture announced by ARM in March 2017, and it serves as a successor of sorts to big.LITTLE. DynamIQ is meant to take what big.LITTLE does with heterogeneous computing one step further, allowing more flexibility and better scaling.

Whereas big.LITTLE was limited to only two clusters, DynamIQ increases the maximum number of cores per cluster to 8, allows for multiple core designs in a single cluster, and allows for up to 32 clusters per CPU. In addition, DynamIQ provides more precise per-core voltage regulation and better L2 cache speeds. In short, it takes the basic concept of big.LITTLE forward while giving chip designers more freedom in how they combine clusters and core designs.

An example of a DynamIQ processor is the Snapdragon 888, Qualcomm's flagship chip for the year 2021. In big.LITTLE processors, it is common to see clusters of big and small cores.

However, in the Snapdragon 888, there is a "primary core," one Cortex-X1 core clocked at 2.84 GHz, then a more typical high-performance cluster (now a secondary tier), comprising three Cortex-A78 cores clocked at 2.42 GHz. Finally, the power-efficient cores are four Cortex-A55 cores clocked at 1.8 GHz. It's an octa-core setup, but it uses three different core designs meant to tackle different tasks.
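To make the three-tier layout concrete, the core counts and clock speeds quoted above can be written down as a small data structure. This is just an illustrative sketch in Python: the `CoreGroup` class and its field names are hypothetical, and only the core designs, counts, and clock speeds come from the description above.

```python
# The Snapdragon 888 core layout described above, as plain data.
# CoreGroup and its field names are hypothetical; the figures are the
# ones quoted in the text.
from dataclasses import dataclass


@dataclass(frozen=True)
class CoreGroup:
    role: str       # tier within the CPU
    design: str     # ARM core design
    count: int      # number of cores of this design
    max_ghz: float  # quoted clock speed in GHz


SNAPDRAGON_888 = [
    CoreGroup("primary", "Cortex-X1", 1, 2.84),
    CoreGroup("high-performance", "Cortex-A78", 3, 2.42),
    CoreGroup("power-efficient", "Cortex-A55", 4, 1.8),
]

total_cores = sum(group.count for group in SNAPDRAGON_888)
distinct_designs = {group.design for group in SNAPDRAGON_888}
fastest = max(SNAPDRAGON_888, key=lambda group: group.max_ghz)

print(total_cores, len(distinct_designs), fastest.design)
```

Summing the groups confirms the octa-core total, with three distinct core designs and the single Cortex-X1 as the highest-clocked core.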

A Complete Industry Shake-Up

It's safe to say that the introduction of big.LITTLE and the concept of heterogeneous computing has completely changed the CPU game. ARM CPUs nowadays trade blows with the biggest processors on the x86 side of the pond while keeping power consumption to a minimum and battery life long, and much of that is thanks to big.LITTLE and its successor, DynamIQ.

We're really excited about the future of ARM processors.