Liran Alon 29 Dec
Encountered a strange x86 cache-coherency inconsistency: Intel guarantees that WCBs (write-combining buffers) are flushed on any read or write to UC memory, but AMD does so only on reads. If true, should Linux have a new flush_wcb_writeX() util that differs between CPU vendors? (1/3)
Liran Alon
This applies to some NIC drivers I recently reviewed. They have a feature where the Tx descriptor is written to a PCI BAR mapped as WC (instead of to memory) to avoid one DMA read. Thus, only on AMD do they require a wmb() before writing to the doorbell (UC). For example, the mlx4 BlueFlame feature. (2/3)
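The ordering described in this tweet can be sketched as a small userspace model. This is not kernel or mlx4 code: `model_wc_copy()`, `model_wmb()`, `model_writel_relaxed()`, `post_tx_desc()` and the `needs_wcb_flush` flag are all hypothetical stand-ins that only record the order of operations, assuming the claim that a UC write alone does not flush WCBs on some CPUs.

```c
/* Userspace model only: these helpers record ordering, they do not
 * issue real barriers or touch real MMIO. All names are hypothetical. */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum event { EV_WC_WRITE, EV_WMB, EV_DOORBELL };

#define MAX_EVENTS 16
static enum event event_log[MAX_EVENTS];
static size_t event_count;

static void record(enum event ev)
{
    assert(event_count < MAX_EVENTS);
    event_log[event_count++] = ev;
}

/* Stand-ins for a WC-mapped BAR region and a UC doorbell register. */
static uint8_t wc_bar[64];
static uint32_t doorbell_reg;

/* Models copying the Tx descriptor into the WC-mapped BAR. */
static void model_wc_copy(const void *desc, size_t len)
{
    assert(len <= sizeof(wc_bar));
    memcpy(wc_bar, desc, len);
    record(EV_WC_WRITE);
}

/* Models wmb(): in a real driver this is where WCBs would be flushed. */
static void model_wmb(void)
{
    record(EV_WMB);
}

/* Models writeX_relaxed() to the UC doorbell. */
static void model_writel_relaxed(uint32_t val)
{
    doorbell_reg = val;
    record(EV_DOORBELL);
}

/* needs_wcb_flush is an assumed per-vendor flag: nonzero where a UC
 * write is not trusted to flush the WCBs on its own. */
static void post_tx_desc(const void *desc, size_t len, uint32_t db_val,
                         int needs_wcb_flush)
{
    model_wc_copy(desc, len);      /* 1. descriptor into WC BAR */
    if (needs_wcb_flush)
        model_wmb();               /* 2. explicit flush before doorbell */
    model_writel_relaxed(db_val);  /* 3. ring the doorbell */
}
```

Calling `post_tx_desc(desc, len, val, 1)` logs the WC write, then the barrier, then the doorbell; with the flag set to 0 the barrier step is skipped, which is the case the thread argues is only safe where the UC write itself flushes WCBs.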
Liran Alon 29 Dec
Replying to @Liran_Alon
Having said that, I wonder if in these scenarios it's good enough to just do wmb()+writeX_relaxed() on the doorbell write, even though that executes an unnecessary SFENCE on Intel. Because it probably makes the implicit SFENCE on the UC write much faster? This is all very weird... (3/3)
Liran Alon 30 Dec
Replying to @Liran_Alon
Also, on ARM64, using wmb()+writeX_relaxed() instead of writel() changes dma_wmb() to wmb() unnecessarily, as dma_wmb()==DMB(OSHST) is sufficient to flush WCBs. I'm not sure whether the write to the doorbell (UC Device memory) does an implicit wmb()==DSB(ST) anyway, as on x86 Intel. Any ARM expert here?
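To make the trade-off concrete, here is a rough lookup of which barrier instruction each doorbell form is expected to place before the store. The arm64 entries follow the mappings stated in the thread; the x86-64 entries assume that wmb() is SFENCE and that writel() adds only a compiler barrier. `doorbell_barrier()` and both enums are hypothetical names for illustration, not kernel APIs.

```c
/* Illustrative lookup only; enum and function names are hypothetical. */
enum arch { ARCH_X86_64, ARCH_ARM64 };
enum doorbell_form { FORM_WRITEL, FORM_WMB_PLUS_RELAXED };

/* Returns the barrier instruction expected before the doorbell store,
 * or "none" when only a compiler barrier is assumed. */
static const char *doorbell_barrier(enum arch a, enum doorbell_form f)
{
    switch (a) {
    case ARCH_X86_64:
        /* wmb() assumed to be SFENCE; writel() assumed fence-free. */
        return f == FORM_WMB_PLUS_RELAXED ? "sfence" : "none";
    case ARCH_ARM64:
        /* Per the thread: wmb()==DSB(ST), dma_wmb()==DMB(OSHST). */
        return f == FORM_WMB_PLUS_RELAXED ? "dsb st" : "dmb oshst";
    }
    return "unknown";
}
```

Under these assumptions, switching every doorbell write to wmb()+writeX_relaxed() costs an extra SFENCE on x86-64 and upgrades a DMB to a DSB on arm64, which is the overhead the thread is questioning.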