The concept of "block size" in a cache

I have started learning the concepts of direct-mapped and set-associative caches.
I have a few very basic questions. Here goes.

Say addresses are 32 bits long, and I have a 32 KB cache with a 64-byte block size and 512 frames. How much data is actually stored inside a “block”? If I have an instruction that loads a value from a memory location, and that value is a 16-bit integer, will one of the 64-byte blocks store only that 16-bit (2-byte) integer value? What are the other 62 bytes in the block? And if I have another load instruction that also loads a 16-bit integer, this value goes into a different block in a different frame depending on the load address (and if the address maps to the same frame as the previous instruction, that block again stores only 2 of its 64 bytes). Is that correct?

Please forgive me if this seems like a very silly doubt; I just want to get my concepts right.

Best answer

I typed up this email to explain caches to someone else, but I think you may find it useful too.

You have 32-bit addresses that can refer to bytes in RAM. You
want to be able to cache the data that you access, so you can
use it later.

Let’s say you want a 1-MiB (2^20 bytes) cache.

What do you do?

You have 2 restrictions you need to meet:

  1. Caching should be as uniform as possible across all
    addresses. i.e. you don’t want to bias toward any particular kind
    of address.

    • How do you do this? Use remainder! With mod, you can evenly
      distribute any integer over whatever range you want.
  2. You want to help minimize bookkeeping costs. That
    means, e.g., if you’re caching in blocks of 1 byte, you don’t
    want to store another 4 bytes just to keep track of where that
    1 byte belongs.

    • How do you do that? You store blocks that are bigger than just
      1 byte.
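
Here’s a tiny Python sketch of point 1 (purely illustrative; the
slot count is arbitrary and not the cache geometry used below):
mod spreads any run of addresses evenly over a fixed number of slots.

    from collections import Counter

    NUM_SLOTS = 8  # small illustrative value

    # Any contiguous run of addresses lands evenly across the slots,
    # no matter where the run starts.
    addresses = range(1000, 1064)
    occupancy = Counter(addr % NUM_SLOTS for addr in addresses)

    print(occupancy)  # every one of the 8 slots holds exactly 8 addresses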

Let’s say you choose 16-byte (2^4-byte) blocks. That means you
can cache 2^20 / 2^4 = 2^16 = 65,536 blocks of data.
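
The same arithmetic as a quick Python check (the constants mirror
the 1-MiB / 16-byte example):

    CACHE_SIZE = 2**20  # 1 MiB
    BLOCK_SIZE = 2**4   # 16 bytes

    NUM_BLOCKS = CACHE_SIZE // BLOCK_SIZE
    print(NUM_BLOCKS)   # 65536, i.e. 2**16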

You now have a few options:

  • You can design the cache so that data from any memory block
    could be stored in any of the cache blocks. This would be
    called a fully-associative cache.
    • The benefit is that it’s the “fairest” kind of cache: all
      blocks are treated completely equally.
    • The tradeoff is speed: to find where to put the memory block,
      you have to search every cache block for a free space.
      This is really slow.
  • You can design the cache so that data from any memory block
    could only be stored in a single cache block.
    This would be called a direct-mapped cache.
    • The benefit is that it’s the fastest kind of cache: you do only
      1 check to see if the item is in the cache or not.
    • The tradeoff is that, now, if you happen to have a bad memory
      access pattern, you can have 2 blocks kicking each other out
      successively, with unused blocks still remaining in the cache.
  • You can do a mixture of both: map a single memory block into
    multiple cache blocks. This is what real processors do: they have
    N-way set-associative caches.
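
Here’s a rough Python sketch (deliberately simplified: each cache
slot just remembers a block number, with no tags or data yet) of
why the two extremes differ in lookup cost:

    NUM_BLOCKS = 65536  # cache slots, matching the 1-MiB / 16-byte example

    # Fully associative: a block may live in any slot, so a lookup has
    # to examine every slot (hardware does this in parallel, at a cost).
    def fully_associative_find(slots, block_number):
        for i, stored in enumerate(slots):
            if stored == block_number:
                return i
        return None

    # Direct-mapped: the slot is a fixed function of the block number,
    # so a lookup is a single comparison.
    def direct_mapped_find(slots, block_number):
        i = block_number % NUM_BLOCKS
        return i if slots[i] == block_number else None

    slots = [None] * NUM_BLOCKS
    slots[12345 % NUM_BLOCKS] = 12345
    print(fully_associative_find(slots, 12345))  # 12345, after scanning
    print(direct_mapped_find(slots, 12345))      # 12345, in one check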

Direct-mapped cache:

Now you have 65,536 blocks of data, each block being 16 bytes.
You store them as 65,536 “rows” inside your cache, with each “row”
consisting of the data itself, along with the metadata (regarding
where the block belongs, whether it’s valid, whether it’s been
written to, etc.).
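
In code, one way to picture such a “row” (just a sketch; the field
names are mine, and the sizes follow the 16-byte-block example):

    from dataclasses import dataclass, field

    @dataclass
    class CacheRow:
        valid: bool = False   # does this row hold real data yet?
        dirty: bool = False   # has the block been written to?
        tag: int = 0          # which part of memory the block came from
        data: bytearray = field(default_factory=lambda: bytearray(16))  # the block itself

    # A direct-mapped 1-MiB cache is then just 65,536 such rows.
    cache = [CacheRow() for _ in range(65536)]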

Question: How does each block in memory get mapped to each
block in the cache?

Answer: Well, you’re using a direct-mapped cache, using mod.
That means addresses 0 to 15 will be mapped to block 0 in the
cache; 16-31 get mapped to block 1, etc… and it wraps around as
you reach the 1-MiB mark.

So, given memory address M, how do you find the row number N?
Easy: N = (M % 2^20) / 2^4.
But that only tells you where to store the data, not how
to retrieve it. Once you’ve stored it, and try to access
it again, you have to know which 1-MiB region of memory it
came from, right?
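
In Python, that formula is simply the following (the two example
addresses are mine, chosen to be exactly 1 MiB apart so that they
collide in the same row):

    CACHE_SIZE = 2**20
    BLOCK_SIZE = 2**4

    def row_number(address):
        # N = (M % 2^20) / 2^4: pick out the block within the 1-MiB
        # window, ignoring the byte offset inside the block.
        return (address % CACHE_SIZE) // BLOCK_SIZE

    a = 0x12340
    b = a + CACHE_SIZE  # exactly 1 MiB further along in memory

    print(row_number(a), row_number(b))  # same row, so you need the tag to tell them apart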

So that’s one piece of metadata: the tag bits. If it’s in row N,
all you need to know is what the quotient was, during the
mod operation. Which, for a 32-bit address, is 12 bits big (since
the remainder is 20 bits).

So your tag becomes 12 bits long — specifically, the
topmost 12 bits of any memory address.
And you already knew that the lowermost 4 bits are used for the
offset within a block (since memory is byte-addressed, and
a block is 16 bytes).
That leaves 16 bits for the “index” bits of a memory address, which
can be used to find which row the address belongs to.
(It’s just a division + remainder operation, but in binary.)
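
The same split, written with shifts and masks (12-bit tag, 16-bit
index, 4-bit offset, exactly as derived above; the example address
is arbitrary):

    OFFSET_BITS = 4                           # 16-byte blocks
    INDEX_BITS = 16                           # 65,536 rows
    TAG_BITS = 32 - INDEX_BITS - OFFSET_BITS  # the remaining 12 bits

    def split_address(address):
        offset = address & ((1 << OFFSET_BITS) - 1)
        index = (address >> OFFSET_BITS) & ((1 << INDEX_BITS) - 1)
        tag = address >> (OFFSET_BITS + INDEX_BITS)
        return tag, index, offset

    tag, index, offset = split_address(0xDEADBEEF)
    print(hex(tag), hex(index), hex(offset))  # 0xdea 0xdbee 0xf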

You also need other bits: e.g. you need to know whether a block
is in fact valid or not, because when the CPU is first turned on,
the cache contains invalid data. So you add 1 bit of metadata:
the Valid bit.
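
Putting the Valid bit and the tag together, a hit check on one row
could look like this (a self-contained sketch that repeats the row
type and bit fields from above, with data and dirty bits left out):

    from dataclasses import dataclass

    @dataclass
    class CacheRow:
        valid: bool = False
        tag: int = 0

    OFFSET_BITS, INDEX_BITS = 4, 16

    def is_hit(cache, address):
        index = (address >> OFFSET_BITS) & ((1 << INDEX_BITS) - 1)
        tag = address >> (OFFSET_BITS + INDEX_BITS)
        row = cache[index]
        # A hit needs both: the row holds real data (Valid bit set)
        # and that data came from the right region of memory (tag match).
        return row.valid and row.tag == tag

    cache = [CacheRow() for _ in range(2**INDEX_BITS)]
    print(is_hit(cache, 0x12340))  # False: right after power-on, nothing is valid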

There are other bits you’ll learn about, used for optimization,
synchronization, etc… but these are the basic ones. 🙂
