Fast CPU vs Cache

I’m trying to understand how a contemporary CPU works and sharing my work here.

Let’s start with something simple: compare computing a value vs. fetching it from RAM. Then, let’s try to understand if this result is “permanent” or context-dependent.

This is my CPU, and I’m running popOS:

Accessing a value

#include <stdlib.h>
#include <stdio.h>
#include <time.h>

// Function to get current time in nanoseconds using clock_gettime
long long get_time_ns() {
    struct timespec tp;
    clock_gettime(CLOCK_MONOTONIC, &tp);
    return (long long)tp.tv_sec * 1000000000LL + tp.tv_nsec;
}

int main() {
    // Now declare an integer and a pointer to it
    int myInt = 123;
    int* myIntPtr = &myInt;

    // Start timing
    long long start_time = get_time_ns();

    // Access the integer here, to measure the time it might take (potentially including a cache miss penalty)
    printf("Integer value: %d\n", *myIntPtr);

    // Stop timing
    long long end_time = get_time_ns();

    // Calculate and print the elapsed time in nanoseconds
    printf("Time taken to access the integer: %lld ns\n", end_time - start_time);


    return 0;
}

Running it:

Integer value: 123
Time taken to access the integer: 39355 ns

Computing a value

We only add an addition (122 + 1); the rest of the code is roughly the same.

#include <stdlib.h>
#include <stdio.h>
#include <time.h>


long get_time_ns() {
    struct timespec tp;
    clock_gettime(CLOCK_MONOTONIC, &tp);
    return (long long)tp.tv_sec * 1000000000LL + tp.tv_nsec;
}


int main() {

    // Now declare an integer and a pointer to it
    int myInt = 0;
   
    int* myIntPtr = &myInt;

    // Start timing
    long long start_time = get_time_ns();
    myInt = 1 + 122;

    // Access the integer here, to measure the time it might take (potentially including a cache miss penalty)
    printf("Integer value: %d\n", *myIntPtr);

    // Stop timing
    long long end_time = get_time_ns();

    // Calculate and print the elapsed time in nanoseconds
    printf("Time taken to access the integer: %lld ns\n", end_time - start_time);

    return 0;
}

I compiled and checked the assembly code to ensure the addition stayed in the code.

Integer value: 123
Time taken to access the integer: 39695 ns

NB. I executed these results over 100 times to confirm the results.

Conclusion: it’s slightly longer to compute a value than accessing it.

VOIDING THE PIPELINE

CPU has a pipeline. The pipeline predicts which instructions will be needed and prefetch it and its data. Is the performance assumption still valid in this case?

We can’t easily reset the instruction cache, but we can reset the data cache fairly easily by writing random data in the CPU. For instance:

    // Allocate a large array to fill the CPU cache
    char* cacheFiller = (char*)malloc(CACHE_SIZE);
    if (!cacheFiller) {
        perror("Failed to allocate memory");
        return 1;
    }

    // Access the array to ensure it's loaded into the cache, potentially evicting other data
    for (int i = 0; i < CACHE_SIZE; i++) {
        cacheFiller[i] = (char)(i & 0xFF);
    }

If I reset the cache before fetching the value:

Integer value: 123
Time taken to access the integer: 28574 ns

It’s faster than the version with no cache! And if we add:

Integer value: 123
Time taken to access the integer: 26350 ns

This time, computing is faster than fetching the value.

Conclusions

We’ve unearthed a few interesting questions.

Why is the same computation significantly faster once the CPU cache is reset?
Why computing the value is faster than fetching it from RAM
Is it specific to my machine or is it the same on all machines?
What happens when the computations are not trivial? Do these results hold?

Deviant/Abstraction

Fast CPU vs Cache

Accessing a value

Computing a value

VOIDING THE PIPELINE

Conclusions

Like this:

Discover more from Deviant/Abstraction

Leave a ReplyCancel reply

Fast CPU vs Cache

Accessing a value

Computing a value

VOIDING THE PIPELINE

Conclusions

Share this:

Like this:

Discover more from Deviant/Abstraction

Leave a ReplyCancel reply

Discover more from Deviant/Abstraction