77 questions
0
votes
0
answers
43
views
What protocol does the LLC directory use to synchronize parallel RFO signals?
The MESI or MOESI protocols need the LLC directory in order to work... and the directory needs to synchronize parallel RFO + snoop-invalidation calls in order for it to work
(in TSO architectures that ...
0
votes
1
answer
102
views
When an atomic variable becomes visible to a thread other than the writing thread, is it also immediately globally visible?
Suppose I have three threads. If x was written by thread2 and x is visible to thread1, do I have the guarantee that the latest value of x is also visible to thread3? In other words, can the new value ...
-1
votes
1
answer
230
views
In the CHI cache coherence protocol, how does the RN decide which READ transaction to send?
In the AMBA CHI cache coherence protocol, the RN sends requests (to the HN) such as ReadClean, ReadNotSharedDirty, ReadShared, ReadUnique, etc. But the CPU has sent only a READ instruction to the RN, so how ...
1
vote
0
answers
71
views
Where is the directory memory of a cache coherence protocol's directory controller stored in a real chip?
In the MESI_Three_Level protocol of the gem5 simulator, there are L0, L1, L2, dir, and dma state machines. The L0 and L1 cache controllers simulate private caches for processor cores, which are implemented ...
0
votes
1
answer
84
views
MESI protocol state transition when index bits are the same but tags differ
I'm trying to solve a MESI Cache problem. I have four processors (P0, P1, P2, P3) each with 4 states set to Invalid. Offset bits are to be ignored. If I read on P0 on address 11010 with two index bits ...
0
votes
1
answer
139
views
Confusions about the state transition of MSI/MESI directory protocol in the book "A Primer on Memory Consistency and Cache Coherence, Second Edition"
All questions come from the book "A Primer on Memory Consistency and Cache Coherence, Second Edition".
The first question comes from "Table 8.1: MSI directory protocol—cache controller ...
0
votes
2
answers
148
views
Why does a race condition occur when the hardware ensures coherency?
#include <iostream>
#include <thread>
#include <vector>
#include <chrono>
#include <mutex>
using namespace std::chrono;
const int nthreads = 4;
const int64_t ndata = ...
0
votes
1
answer
39
views
What happens if a processor performs a read operation while the cache line is still in the store buffer of another processor?
In the context of the MESI protocol with store buffers and invalidation queues, a write operation to a variable can be temporarily held in the store buffer waiting for the related ...
0
votes
2
answers
166
views
MESI: why do we need a write miss to move from Shared to Modified?
The book "Computer Architecture" by Hennessy/Patterson, 6th ed., page 394, includes an example with true sharing and false sharing misses with 2 processors.
Here is the example from the ...
-1
votes
1
answer
144
views
What happens to the store that "lost the race" to shared memory in the x86 TSO memory model?
I know that x86 processors use the TSO memory model and I am curious about one thing. I will explain it through an example.
We have two processors (P1 and P2) where P1 stores X=1 to its store buffer and P2 ...
0
votes
2
answers
284
views
MOESI Protocol: What happens when Owned is dirty and other processors read the line in Shared?
I've been thinking about the "owned" state of the MOESI protocol. So let's say the following situation exists:
P0 has line A in O state.
P1 has line A in S state.
P0 writes to line A in its ...
4
votes
0
answers
233
views
How do CPUs use the LOCK prefix to implement cache locking and ensure memory consistency?
In Java, adding the volatile keyword to a variable guarantees memory consistency (or visibility).
On the x86 platform, the HotSpot virtual machine implements volatile variable memory consistency by ...
0
votes
0
answers
121
views
Can a CPU load data from another CPU's cache using the LOCK CMPXCHG instruction on x86?
Let's say we have CPU-X and CPU-Y, each with its own L1d cache. First, on CPU-X we execute a simple read operation on memory location M that is stored in DRAM: after that CPU-X loads the value stored in ...
0
votes
1
answer
248
views
How is `memory_order_relaxed` enough in a TTAS spinlock for Arm64?
Consider the following spinlock implementation (the first Google result for the query "c++ spinlock implementation"):
struct spinlock {
std::atomic<bool> lock_ = {0};
void lock() noexcept {...
2
votes
0
answers
101
views
Invalidation of an Exclusive cache line
What happens if a CPU receives an invalidation message for a cache line in the Exclusive state? Can this message enter the invalidation queue?
If so, what happens if the same CPU attempts to to that ...