How Does CPU Cache Work and What Are L1, L2, and L3?
Computer processors have advanced quite a bit over the last few years. Transistors keep getting smaller every year, and progress has hit a point where Moore’s Law is becoming harder to sustain.
When it comes to processors, it’s not just the transistors and frequencies that count, but the cache as well.
You might have heard about cache memory when CPUs (Central Processing Units) are being discussed. However, most of us don’t pay much attention to cache sizes, nor are they the main highlight of CPU advertisements.
So exactly how important is CPU cache, and how does it work?
What Is CPU Cache?
To put it simply, a cache is just a really fast type of memory. As you might know, a computer has multiple types of memory inside it. There is the storage drive, like a hard disk or an SSD, which stores the bulk of the data: the operating system and all the programs.
Next up, we have the Random Access Memory, commonly known as RAM. This is much faster than the storage drive.
Lastly, the CPU has even faster memory units within itself, which we know as the cache.
The memory in a computer has a hierarchy, based upon the speed, and cache stands at the top of this hierarchy, being the fastest. It is also the closest to where the central processing occurs, being a part of the CPU itself.
Cache is built from Static RAM (SRAM), whereas the system RAM is Dynamic RAM (DRAM). SRAM can hold data without needing to be constantly refreshed, unlike DRAM, and it is faster to access, which makes it ideal for cache.
How Does CPU Cache Work?
As you might already be aware, a program is a set of instructions to be run by the CPU. When you run a program, these instructions have to make their way from the storage drive to the CPU. This is where the memory hierarchy comes into play.
The data first gets loaded up into the RAM and is then sent to the CPU. CPUs these days are capable of carrying out a gigantic number of instructions per second. To make full use of its power, the CPU needs access to superfast memory. This is where the cache comes in.
The memory controller takes the data from RAM and sends it to the cache. Depending on which CPU is in your system, this controller sits either in the northbridge chip on the motherboard (in older systems) or inside the CPU itself (in most modern ones).
The cache then moves data back and forth within the CPU. There is a memory hierarchy within the cache as well.
(If you’re interested in knowing how the CPU itself works, check out our article explaining the basics of CPU.)
The Levels of Cache: L1, L2, and L3
CPU cache is divided into three main levels: L1, L2, and L3. The hierarchy here is again based on speed, and speed trades off against size: the faster the level, the smaller it is.
L1 (Level 1) cache is the fastest memory that is present in a computer system. In terms of priority of access, L1 cache has the data the CPU is most likely to need while completing a certain task.
As far as size goes, L1 is the smallest level. Each core commonly gets only a few dozen kilobytes of it, so the total across all cores typically goes up to 256KB. However, some really powerful CPUs are now taking that total close to 1MB. Some server chips (like Intel’s top-end Xeon CPUs) now have somewhere between 1-2MB of L1 cache in total.
L1 cache is also usually split two ways, into the instruction cache and the data cache. The instruction cache deals with the information about the operation that the CPU has to perform, while the data cache holds the data on which the operation is to be performed.
L2 (Level 2) cache is slower than L1 cache, but bigger in size. Its size typically varies from 256KB to 8MB, although the newer, powerful CPUs tend to go past that. L2 cache holds data that is likely to be accessed by the CPU next. In most modern CPUs, the L1 and L2 caches are present on the CPU cores themselves, with each core getting its own cache.
L3 (Level 3) cache is the largest cache memory unit, and also the slowest one. It can range from 4MB to upwards of 50MB. Modern CPUs have dedicated space on the CPU die for the L3 cache, which is typically shared between the cores, and it takes up a large chunk of that space.
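If you are curious about the cache sizes on your own machine, here is a minimal Python sketch that lists them. It assumes a Linux system, where the kernel exposes cache details as small text files under /sys/devices/system/cpu/cpu0/cache. The function name read_cache_levels is made up for this example, and on other operating systems it simply returns an empty list.

```python
from pathlib import Path

# Assumption: a Linux system that exposes cache details at this sysfs path.
SYSFS_CACHE_DIR = Path("/sys/devices/system/cpu/cpu0/cache")


def read_cache_levels(cache_dir=SYSFS_CACHE_DIR):
    """Return a sorted list of (level, type, size) tuples for one CPU core.

    Returns an empty list when the directory does not exist,
    for example on Windows or macOS.
    """
    cache_dir = Path(cache_dir)
    if not cache_dir.is_dir():
        return []
    levels = []
    for index_dir in cache_dir.glob("index*"):
        try:
            level = int((index_dir / "level").read_text().strip())
            kind = (index_dir / "type").read_text().strip()
            size = (index_dir / "size").read_text().strip()
        except (OSError, ValueError):
            continue  # skip entries that are missing or unreadable
        levels.append((level, kind, size))
    return sorted(levels)


if __name__ == "__main__":
    for level, kind, size in read_cache_levels():
        print(f"L{level} {kind}: {size}")
```

On a typical machine you would expect to see two L1 entries (one for data, one for instructions) followed by a single L2 and a single L3 entry, but the exact list depends on your CPU.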
Cache Hit or Miss and Latency
The data flows from the RAM to the L3 cache, then the L2, and finally L1. When the processor is looking for data to carry out an operation, it first tries to find it in the L1 cache. If the CPU is able to find it there, the condition is called a cache hit. If not, that is a cache miss, and the CPU proceeds to look for the data in L2, and then L3.
If it doesn’t find the data in any of the cache levels, it has to access it from the main memory.
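To make the lookup order concrete, here is a toy Python model of it. This is a simplified sketch, not how real hardware is built: the names CacheLevel and read are invented for the example, capacities are counted in entries rather than bytes, every miss fills all levels, and the oldest unused entry is evicted first. Real CPUs vary on all of these points.

```python
from collections import OrderedDict


class CacheLevel:
    """One cache level that holds a fixed number of entries."""

    def __init__(self, name, capacity):
        self.name = name
        self.capacity = capacity
        self.entries = OrderedDict()

    def lookup(self, address):
        """Return True on a hit and mark the entry as recently used."""
        if address in self.entries:
            self.entries.move_to_end(address)
            return True
        return False

    def fill(self, address, value):
        """Store a value, evicting the least recently used entry if full."""
        self.entries[address] = value
        self.entries.move_to_end(address)
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)


def read(address, levels, main_memory):
    """Return (value, source), where source names where the data was found."""
    for i, level in enumerate(levels):
        if level.lookup(address):
            value = level.entries[address]
            for faster in levels[:i]:  # copy the data into the faster levels
                faster.fill(address, value)
            return value, level.name
    value = main_memory[address]  # missed everywhere: go to main memory
    for level in levels:
        level.fill(address, value)
    return value, "RAM"


if __name__ == "__main__":
    levels = [CacheLevel("L1", 2), CacheLevel("L2", 4), CacheLevel("L3", 8)]
    main_memory = {address: address * 10 for address in range(16)}
    for address in [0, 0, 1, 2, 0]:
        _, source = read(address, levels, main_memory)
        print(f"address {address}: served from {source}")
    # Sources in order: RAM, L1, RAM, RAM, L2
```

The last read shows the hierarchy at work: address 0 has been pushed out of the tiny L1 by then, but the larger L2 still holds it, so the trip to main memory is avoided.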
Now, as we know, the cache is designed to speed up the back and forth of information between the main memory and the CPU. The time needed to access data from memory is called latency. L1 has the lowest latency, being the fastest and closest to the core, and L3 has the highest. The latency increases sharply when the data is not in any cache level, because the CPU then has to get it from the main memory.
As computers get faster and better, we are seeing a decrease in latency. We now have low-latency DDR4 RAM, and super-fast SSDs with low access times as storage drives, both of which significantly cut down on the overall latency. If you want to know more about how RAM works, here is our quick and dirty guide to RAM.
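A rough way to see how much hits and misses matter is to estimate the average time per memory access. The Python sketch below does that arithmetic. The latencies (4, 12, 40, and 200 cycles) and the hit rates are made-up illustrative figures, not measurements of any real CPU, and the function name average_access_time is just for this example.

```python
def average_access_time(levels, memory_latency):
    """Estimate the average number of cycles per memory access.

    levels: list of (latency_cycles, hit_rate) pairs ordered from L1 outward.
            hit_rate is the fraction of accesses reaching that level
            which are served by it.
    memory_latency: cycles needed to fetch from main memory.
    """
    total = 0.0
    reach_probability = 1.0  # fraction of accesses that get this far
    for latency, hit_rate in levels:
        total += reach_probability * latency  # everyone who gets here pays the lookup
        reach_probability *= 1.0 - hit_rate   # only the misses continue
    total += reach_probability * memory_latency
    return total


if __name__ == "__main__":
    # Illustrative assumptions only: (latency in cycles, hit rate) per level.
    assumed_levels = [(4, 0.90), (12, 0.80), (40, 0.50)]
    assumed_memory_latency = 200

    with_cache = average_access_time(assumed_levels, assumed_memory_latency)
    without_cache = average_access_time([], assumed_memory_latency)
    print(f"with cache:    {with_cache:.1f} cycles")     # 8.0 cycles
    print(f"without cache: {without_cache:.1f} cycles")  # 200.0 cycles
```

With these assumed numbers, only 1 in 100 accesses ends up going all the way to main memory, which is why the average stays close to the L1 latency.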
Earlier, cache designs used to have the L2 and L3 caches outside the CPU, which had a negative effect on the latency.
However, advancements in chip fabrication have made it possible to fit billions of transistors into a smaller space than before. As a result, more room is left for cache on the die, which lets the cache sit as close to the core as possible, significantly cutting down latency.
The Future of Cache
Cache design is always evolving, especially as memory gets cheaper, faster, and denser. Intel and AMD have had their fair share of experimentation with cache designs, with Intel even experimenting with an L4 cache. The CPU market is moving forward faster than ever now.
With that, we are bound to see cache design keep up with the ever-growing power of CPUs.
Additionally, a lot is being done to cut down the bottlenecks that modern computers have. Reducing memory latency is perhaps the single biggest part of it. The industry is working on solutions, and the future looks really promising.
You might have even heard about Intel Optane, which can be used as a sort of hybrid external cache. If you haven’t, check out our article exploring the potential applications of Intel Optane.