Technology Explained

How Does CPU Cache Work and What Are L1, L2, and L3?

Palash Volvoikar 13-12-2019

Computer processors have advanced considerably in recent years, with transistors shrinking year after year and progress reaching a point where Moore’s Law is becoming obsolete.


When it comes to processors, it’s not just the transistors and frequencies that count, but the cache as well.

You might have heard cache memory mentioned when CPUs (Central Processing Units) are discussed. However, we don’t usually pay much attention to these numbers, and they’re rarely the headline feature in CPU advertisements.

So exactly how important is CPU cache, and how does it work?

What Is CPU Cache?

To put it simply, a cache is just a really fast type of memory. As you might know, a computer has multiple types of memory inside it. There is secondary storage, such as a hard disk or an SSD, which holds the bulk of the data: the operating system and all the programs.

Next up, we have Random Access Memory, commonly known as RAM. This is much faster than the storage drive.


Lastly, the CPU has even faster memory units within itself, which we know as the cache.

Memory in a computer is arranged in a hierarchy based on speed, and cache sits at the top of that hierarchy as the fastest tier. It is also the closest to where the processing happens, being part of the CPU itself.

Cache is static RAM (SRAM), as opposed to the system RAM, which is dynamic RAM (DRAM). SRAM can hold data without being constantly refreshed, unlike DRAM, which makes it faster and a natural fit for cache.

How Does CPU Cache Work?

As you might already be aware, a program is a set of instructions to be run by the CPU. When you run a program, those instructions have to make their way from storage to the CPU. This is where the memory hierarchy comes into play.


The data first gets loaded up into the RAM and is then sent to the CPU. CPUs these days are capable of carrying out a gigantic number of instructions per second. To make full use of its power, the CPU needs access to superfast memory. This is where the cache comes in.

The memory controller takes data from RAM and sends it to the cache. Depending on the CPU in your system, this controller sits either on the northbridge chipset on the motherboard (on older systems) or inside the CPU itself (on virtually all modern ones).

The cache then carries out the back and forth of data within the CPU. The hierarchy of memory exists within the cache, as well.
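This shuttling is why memory access patterns matter for performance: code that reads memory sequentially keeps the cache full of useful data, while scattered reads keep missing it. Here is a rough Python sketch of the idea; the timings vary by machine, and Python’s interpreter overhead blunts the gap you would see in a compiled language, but the pattern is the same:

```python
import array
import random
import time

N = 2_000_000
data = array.array("i", range(N))  # a compact, contiguous block of integers

def sum_in_order(a):
    # Sequential access: neighbouring elements share cache lines,
    # so most reads are cache hits and the prefetcher can help.
    return sum(a)

def sum_shuffled(a, order):
    # The same elements visited in a random order: each read is likely
    # to land on a cold cache line, forcing a fetch from further down
    # the memory hierarchy.
    total = 0
    for i in order:
        total += a[i]
    return total

order = list(range(N))
random.shuffle(order)

t0 = time.perf_counter()
s1 = sum_in_order(data)
t1 = time.perf_counter()
s2 = sum_shuffled(data, order)
t2 = time.perf_counter()

# Both loops do the same amount of arithmetic; any time difference
# comes from how the memory is accessed.
print(f"sequential: {t1 - t0:.3f}s  shuffled: {t2 - t1:.3f}s")
```

Both functions compute the same sum, so the only variable is the order in which memory is touched.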

(If you’re interested in how the CPU itself works, check out our article What Is a CPU and What Does It Do?)


The Levels of Cache: L1, L2, and L3

CPU cache is divided into three main levels: L1, L2, and L3. The hierarchy again runs by speed, and inversely by size: the faster the level, the smaller it is.

L1 (Level 1) cache is the fastest memory in a computer system. In terms of access priority, the L1 cache holds the data the CPU is most likely to need while completing its current task.

As far as size goes, L1 cache typically goes up to 256KB, though some really powerful CPUs now take it close to 1MB. Some server chips (such as Intel’s top-end Xeon CPUs) now carry between 1MB and 2MB of L1 cache.

L1 cache is also usually split two ways, into the instruction cache and the data cache. The instruction cache deals with the information about the operation that the CPU has to perform, while the data cache holds the data on which the operation is to be performed.


Intel Skylake cache design
Image Credit: Intel

L2 (Level 2) cache is slower than L1, but bigger. Its size typically ranges from 256KB to 8MB, although newer, powerful CPUs tend to go past that. L2 cache holds the data that is likely to be accessed by the CPU next. In most modern CPUs, the L1 and L2 caches sit on the CPU cores themselves, with each core getting its own.

L3 (Level 3) cache is the largest cache unit, and also the slowest, ranging from 4MB to upwards of 50MB. Modern CPUs dedicate a sizable chunk of the die to the L3 cache, which is typically shared among all the cores.
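If you’re curious what your own machine has, the Linux kernel exposes these levels through sysfs. A small Python sketch (Linux-only; on other systems it simply finds nothing and returns an empty list):

```python
from pathlib import Path

def list_cpu_caches(cpu=0):
    """List the cache levels the kernel reports for one CPU core.

    Reads /sys/devices/system/cpu/cpuN/cache/index*/ on Linux;
    returns an empty list on systems without that sysfs tree.
    """
    base = Path(f"/sys/devices/system/cpu/cpu{cpu}/cache")
    caches = []
    for index in sorted(base.glob("index*")):
        try:
            level = (index / "level").read_text().strip()
            ctype = (index / "type").read_text().strip()  # Data / Instruction / Unified
            size = (index / "size").read_text().strip()
        except OSError:
            continue  # skip entries the kernel doesn't populate
        caches.append((f"L{level}", ctype, size))
    return caches

for level, ctype, size in list_cpu_caches():
    print(level, ctype, size)
```

On a typical desktop this prints one line per level, including the split L1 (one Data and one Instruction entry) described above.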

Cache Hit or Miss and Latency

The data flows from RAM to the L3 cache, then L2, and finally L1. When the processor needs data for an operation, it first looks for it in the L1 cache. If the CPU finds it there, that’s a cache hit. If not, it searches L2, and then L3.

If the data isn’t in any cache level, the CPU tries to access it from the main memory. That’s a cache miss.

Now, as we know, the cache is designed to speed up the back and forth of information between the main memory and the CPU. The time needed to access data from memory is called latency. L1 has the lowest latency, being the fastest and closest to the core, and L3 has the highest. On a cache miss, latency shoots up, because the CPU has to fetch the data all the way from main memory.
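The lookup order described above can be sketched as a tiny simulation. The latency figures here (in CPU cycles) are illustrative placeholders, not measurements of any real chip:

```python
# A toy model of the L1 -> L2 -> L3 -> RAM lookup order.
LEVELS = [
    ("L1", 4),    # hypothetical latencies, in cycles
    ("L2", 12),
    ("L3", 40),
]
RAM_LATENCY = 200  # the extra cost of missing in all three levels

def lookup(address, caches):
    """Return (where the data was found, total cycles spent looking)."""
    cycles = 0
    for name, latency in LEVELS:
        cycles += latency
        if address in caches[name]:
            return name, cycles          # cache hit at this level
    return "RAM", cycles + RAM_LATENCY   # cache miss everywhere

# Each level holds a (made-up) set of addresses; lower levels are
# smaller, so they hold fewer of them.
caches = {"L1": {0x10}, "L2": {0x10, 0x20}, "L3": {0x10, 0x20, 0x30}}

print(lookup(0x10, caches))  # hit in L1: cheapest
print(lookup(0x30, caches))  # found only in L3: slower
print(lookup(0x99, caches))  # miss everywhere: pay the RAM penalty
```

Note how a full miss costs far more than the sum of the cache checks, which is exactly why hit rates matter so much.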

As computers get faster and better, we are seeing latency decrease. We now have low-latency DDR4 RAM, and superfast SSDs with low access times as primary drives, both of which significantly cut down overall latency. If you want to know more about how RAM works, here is our article A Quick and Dirty Guide to RAM: What You Need to Know.

Earlier cache designs placed the L2 and L3 caches outside the CPU, which hurt latency.

However, the advancements in fabrication processes related to CPU transistors have made it possible to fit billions of transistors in a smaller space than before. As a result, more room is left for cache, which lets the cache be as close to the core as possible, significantly cutting down latency.

The Future of Cache

Cache design keeps evolving, especially as memory gets cheaper, faster, and denser. Intel and AMD have both experimented plenty with cache designs, with Intel even trying out an L4 cache. The CPU market is moving forward faster than ever.

With that, we are bound to see cache design keep up with the ever-growing power of CPUs.

Additionally, a lot of work is going into cutting down the bottlenecks modern computers face, and reducing memory latency is perhaps the biggest part of it. The industry is working on solutions, and the future looks promising.

You might have even heard about Intel Optane, which can be used as a sort of hybrid external cache. If you haven’t, check out our article exploring its potential applications: Is Intel Optane Memory Cheap DDR3 RAM?

Related topics: Computer Memory, Computer Parts, CPU.

