Liran Alon, Dec 16
Q: A producer/consumer ring is a common pattern for high-perf communication between 2 CPU cores, or between a CPU core and a device. So I expected Intel to have a non-temporal store instruction that writes to the LLC without polluting L1/L2. That would also be useful with device DDIO. But MOVNT* also bypasses the LLC. Why? (1/3)
Liran Alon
i.e. the producer isn't expected to read the descriptors it writes to the submission queue, so there is no need to load their cache lines into the producer's L1/L2. Doing so also hurts the consumer's latency when reading them. Thoughts? (2/3)
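For illustration, here is a minimal sketch of the pattern being discussed: a single-producer/single-consumer ring whose producer writes each descriptor with today's non-temporal stores. It does not provide the LLC-only store the thread asks for; it only shows the existing MOVNT* path. The names `desc_t`, `ring_t`, `RING_SLOTS`, `store_desc_nt`, `ring_push` and `ring_pop` are hypothetical, the 64-byte descriptor layout is an assumption, and builds without SSE2 fall back to a plain `memcpy`.

```c
#include <stdint.h>
#include <stdatomic.h>
#include <string.h>
#if defined(__SSE2__)
#include <emmintrin.h>  /* _mm_stream_si128, _mm_loadu_si128, _mm_sfence */
#endif

#define RING_SLOTS 8u   /* hypothetical ring size (power of two) */

/* Hypothetical descriptor: exactly one 64-byte cache line per slot. */
typedef struct {
    _Alignas(64) uint64_t words[8];
} desc_t;

typedef struct {
    desc_t slots[RING_SLOTS];
    _Atomic uint32_t head;  /* advanced by the consumer */
    _Atomic uint32_t tail;  /* advanced by the producer */
} ring_t;

/* Write one descriptor without pulling its line into the producer's caches
 * (on SSE2 targets). Non-temporal stores are weakly ordered, so an SFENCE
 * is issued before the caller publishes the new tail. */
static void store_desc_nt(desc_t *dst, const desc_t *src)
{
#if defined(__SSE2__)
    for (int i = 0; i < 4; i++) {
        __m128i v = _mm_loadu_si128((const __m128i *)src + i);
        _mm_stream_si128((__m128i *)dst + i, v);  /* MOVNTDQ */
    }
    _mm_sfence();
#else
    memcpy(dst, src, sizeof *dst);  /* portable fallback: ordinary stores */
#endif
}

/* Producer side: returns 1 on success, 0 if the ring is full. */
static int ring_push(ring_t *r, const desc_t *d)
{
    uint32_t tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
    uint32_t head = atomic_load_explicit(&r->head, memory_order_acquire);
    if (tail - head == RING_SLOTS)
        return 0;
    store_desc_nt(&r->slots[tail % RING_SLOTS], d);
    atomic_store_explicit(&r->tail, tail + 1, memory_order_release);
    return 1;
}

/* Consumer side: returns 1 and fills *out, or 0 if the ring is empty. */
static int ring_pop(ring_t *r, desc_t *out)
{
    uint32_t head = atomic_load_explicit(&r->head, memory_order_relaxed);
    uint32_t tail = atomic_load_explicit(&r->tail, memory_order_acquire);
    if (head == tail)
        return 0;
    *out = r->slots[head % RING_SLOTS];
    atomic_store_explicit(&r->head, head + 1, memory_order_release);
    return 1;
}
```

With stores like these the descriptor line goes toward memory rather than staying in the LLC, which is exactly the behavior the question is about: the consumer's first read of the slot is likely to miss.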
Liran Alon, Dec 16
Replying to @Liran_Alon
Given that Intel DDIO gives a device direct access to a limited set of LLC ways, I would also expect a non-temporal store instruction that not only writes directly to the LLC, but can also be hinted to write to the DDIO-accessible LLC ways, e.g. to accelerate NIC/NVMe submissions. (3/3)