A CPU cache is a hardware cache used by the central processing unit of a computer to reduce the average time to access memory. The cache is a smaller, faster memory that stores copies of the data from the most frequently used main memory locations. As long as most memory accesses are to cached memory locations, the average latency of memory accesses will be closer to the cache latency than to the latency of main memory.
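
The relationship between hit rate and average latency can be illustrated with a small calculation. The following is a minimal sketch under a simplified two-level model in which every hit costs the cache latency and every miss costs the full main memory latency. The function name and the latency figures are illustrative assumptions, not measurements of any particular processor.

```python
def average_access_time(hit_rate, cache_latency_ns, memory_latency_ns):
    """Weighted average latency under a simple two-level model (assumed)."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    # Hits pay the cache latency; misses pay the main memory latency.
    return hit_rate * cache_latency_ns + (1.0 - hit_rate) * memory_latency_ns


if __name__ == "__main__":
    # Assumed figures: 1 ns cache, 100 ns main memory.
    for rate in (0.50, 0.90, 0.99):
        print(rate, average_access_time(rate, 1.0, 100.0))
```

Under these assumed figures, the average moves toward the cache latency as the hit rate rises, which is the behavior the paragraph above describes.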
When the processor needs to read or write a location in main memory, it first checks whether that memory location is in the cache. This is accomplished by comparing the address of the memory location to all tags in the cache that might hold that address. If the processor finds that the memory location is in the cache, a cache hit has occurred; otherwise, a cache miss has occurred. In the case of a cache hit, the processor immediately reads or writes the information in the cache line. The proportion of accesses that result in a cache hit is known as the hit rate, and is a measure of the effectiveness of the cache.
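
The tag comparison described above can be sketched in software. The example below assumes a set-associative organization in which an address is split into a tag, a set index, and a byte offset, and only the tags in the selected set are compared. The constants `LINE_SIZE` and `NUM_SETS` and the function names are hypothetical, chosen for illustration; real hardware performs these comparisons in parallel rather than in a loop.

```python
LINE_SIZE = 64  # bytes per cache line (assumed for illustration)
NUM_SETS = 4    # number of sets in the cache (assumed for illustration)


def split_address(address):
    """Split an address into (tag, set_index, offset)."""
    offset = address % LINE_SIZE        # byte position within the line
    line_number = address // LINE_SIZE  # which memory line the byte is in
    set_index = line_number % NUM_SETS  # the only set that may hold this line
    tag = line_number // NUM_SETS       # identifies the line within that set
    return tag, set_index, offset


def lookup(cache_sets, address):
    """Return True on a cache hit, False on a cache miss.

    cache_sets is a list of NUM_SETS lists, each holding the tags
    currently stored in that set.
    """
    tag, set_index, _ = split_address(address)
    # Compare against every tag that may contain this address.
    return any(stored_tag == tag for stored_tag in cache_sets[set_index])
```

For instance, with a cache whose set 0 holds tag 18, a lookup of address 4660 (tag 18, set 0, offset 52) is a hit, while address 4724 maps to set 1 and misses if that set is empty.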
In the case of a cache miss, most caches allocate a new entry, which comprises the tag just missed and a copy of the data from memory. The access can then be applied to the new entry just as in the case of a hit. Misses are relatively slow because they require the data to be transferred from main memory. This transfer incurs a delay since main memory is much slower than cache memory, and it also incurs the overhead of recording the new data in the cache before it is delivered to the processor.
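
Miss handling can be illustrated with a small simulation. The sketch below assumes a direct-mapped, read-only cache, one of several possible organizations, in which each memory line can occupy exactly one cache entry. The class name `DirectMappedCache` and its parameters are hypothetical. On a miss, the entry is overwritten with the new tag and a copy of the line from the backing store, after which the read proceeds as it would on a hit.

```python
class DirectMappedCache:
    """Toy read-only, direct-mapped cache over a bytes-like backing store."""

    def __init__(self, num_lines, line_size, memory):
        self.num_lines = num_lines
        self.line_size = line_size
        self.memory = memory              # stands in for main memory
        self.tags = [None] * num_lines    # None marks an empty entry
        self.data = [None] * num_lines
        self.hits = 0
        self.misses = 0

    def read(self, address):
        line_number = address // self.line_size
        index = line_number % self.num_lines
        tag = line_number // self.num_lines
        if self.tags[index] == tag:
            self.hits += 1
        else:
            # Miss: allocate the entry with the missed tag and a copy
            # of the whole line fetched from the backing store.
            self.misses += 1
            start = line_number * self.line_size
            self.tags[index] = tag
            self.data[index] = self.memory[start:start + self.line_size]
        # The access is now served from the cache entry, as on a hit.
        return self.data[index][address % self.line_size]

    def hit_rate(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0
```

The simulation counts hits and misses but does not model timing. In this organization, two addresses that map to the same entry evict each other, so an access pattern that alternates between them misses every time.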