InstLatX64
x86/x64, SIMD, AVX512, "Aha!" moments
1,042 Tweets, 0 Following, 1,651 Followers
InstLatX64 Oct 18
I forgot to mention in the Zen2 complaint list that FMAs also use FP3. This is documented only in the Zen1 optimization guide, but it applies to Zen2 too.
InstLatX64 Oct 18
Replying to @geofflangdale
Now or never. Even on a Zhaoxin-C4580, PDEP/PEXT takes only a constant 11 clks
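For readers unfamiliar with the two instructions named in this tweet, here is a minimal software sketch of what PEXT and PDEP compute. The function names `pext` and `pdep` are illustrative only, and the loop says nothing about how any CPU implements the instructions or how many clocks they take.

```python
def pext(src: int, mask: int) -> int:
    """Gather the bits of src selected by mask into the low bits of the result."""
    result = 0
    out_bit = 0
    while mask:
        lowest = mask & -mask          # isolate the lowest set bit of the mask
        if src & lowest:
            result |= 1 << out_bit     # copy the selected source bit
        out_bit += 1
        mask &= mask - 1               # clear the lowest set bit of the mask
    return result


def pdep(src: int, mask: int) -> int:
    """Scatter the low bits of src to the bit positions selected by mask."""
    result = 0
    in_bit = 0
    while mask:
        lowest = mask & -mask          # next destination position
        if (src >> in_bit) & 1:
            result |= lowest           # deposit the source bit there
        in_bit += 1
        mask &= mask - 1
    return result
```

With a mask of `0x0F0F0F0F`, `pext` packs the low nibble of each byte together, and `pdep` with the same mask spreads them back out.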
InstLatX64 Oct 17
My expectations
- Documented non-Zen2 feats:,, PKU, PCIDE
- Dual int store
- 300+ ROB
- Better L1, L2 latency to compensate for the bigger L3
- 4x256b, more symmetrical FPU (2x256b (v)shift, store, perm, div)
- Fewer ucoded insts
- 0 crosslane penalties
My Zen2 complaint list
InstLatX64 Oct 17
I looked for because in the Fam6 CPUID-Matrix there are a few missing items (e.g. 6Dh 6Fh). They should be the assigned but never released products e.g. CannonLakeSP, KnightsHill
InstLatX64 Oct 17
It seems in some AIC motherboard manuals a diagram hasn't been refreshed from to . (In the text it's ok.)
InstLatX64 Oct 15
6GHz 5950X ES in a Mac
InstLatX64 retweeted
Dayman Oct 15
Celerons finally get AVX2 support with TGL
InstLatX64 Oct 15
released Software Development Emulator 8.59.0 w/ (, , VEX-only little-core Gracemont), (, ) support. The Alderlake dump is missing the HYBRID flag; the CPUID is 906A0 (=ADL-P in coreboot)
InstLatX64 Oct 14
Non-crypto use of VGF2P8AFFINEQB, vol8: It enables real parallel byte-histogramming and can sustain a rate below 1 cycle/byte for arbitrary input. It reaches 0.88 on a Turbo-off Core i5-1135G7 using this gist of
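As background for the instruction named in this tweet, the following is a minimal sketch of the per-byte affine transform that GF2P8AFFINEQB applies, written after the pseudocode in Intel's instruction reference. It does not reproduce the histogram gist mentioned above. The helper names (`parity`, `gf2p8affine_byte`) and the two example matrix constants are illustrative choices, not taken from the tweet.

```python
def parity(x: int) -> int:
    """Return 1 if x has an odd number of set bits, else 0."""
    return bin(x).count("1") & 1


def gf2p8affine_byte(matrix_qword: int, src_byte: int, imm8: int = 0) -> int:
    """Emulate one byte lane: result bit i = parity(matrix.byte[7-i] & src) ^ imm8.bit[i]."""
    result = 0
    for i in range(8):
        row = (matrix_qword >> (8 * (7 - i))) & 0xFF   # matrix row used for result bit i
        bit = parity(row & src_byte) ^ ((imm8 >> i) & 1)
        result |= bit << i
    return result


# Example 8x8 bit matrices (assumed constants for illustration).
IDENTITY = 0x0102040810204080      # each result bit copies the same source bit
BIT_REVERSE = 0x8040201008040201   # result bit i takes source bit 7-i
```

Because each result bit is a parity over an arbitrary subset of input bits, one instruction can express any fixed bit permutation or XOR combination within a byte, which is what makes non-crypto uses like the one in this tweet possible.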
InstLatX64 Oct 12
QuadCore Core i5-1135G7 806C1 (-UP3) CPUID, x64 InstLat, MemLat dump, CPUID, GPGPU panel (Gen12)
InstLatX64 Oct 11
Replying to @JethroGB @science_dot
Perhaps Jintide uses custom CPUID microcode
InstLatX64 Oct 11
4c/8t Core i5-1135G7 806C1 2.4GHz Lat|Tp with counter-VPMOVM2D/Q pairs (VPMOVM2D/Q L|T = 1|.33,.33,.5 )
InstLatX64 Oct 10
2x 24-Core Montage Jintide C2460 (Skylake-SP variant) 50654 CPUID dump: It is a Xeon, except the CPUID 80000002-4 brand string - perhaps the FMA detection code is in need of a refresh Intro SKUs
InstLatX64 Oct 10
10-core 16-thread Core i9-10900K () A0655 pcHT (per-core HyperThreading) enabled CPUID dump added. Core 0,1,2,7,8,9: HTT On. Core 3,4,5,6: HTT Off
InstLatX64 Oct 9
Replying to @patrickschur_
InstLatX64 Oct 9
: RX-427BB (BaldEagle) 630F01 Stones CPUID is A10F0x (Zen3, SP5, thx, @patrickschur_ !) : Core i9-10900K (CometLakeS) A0655 Pentium Silver N5030 (GeminiLakeR) 706A8 Xeon Gold 5218 (CascadeLakeSP) 50656 (1 512b-FPU, thx, !) Commit:
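The hex values quoted throughout this feed (806C1, A0655, 50654, 906A0, 630F01, 706A8, 50656) are CPUID leaf 1 EAX signatures. A minimal sketch of how such a signature splits into family, model, and stepping, using the standard field layout; the function name `decode_cpuid_signature` is an illustrative choice.

```python
def decode_cpuid_signature(eax: int) -> dict:
    """Split a CPUID.1:EAX signature into display family, display model, stepping."""
    stepping = eax & 0xF
    base_model = (eax >> 4) & 0xF
    base_family = (eax >> 8) & 0xF
    ext_model = (eax >> 16) & 0xF
    ext_family = (eax >> 20) & 0xFF

    # The extended family is added only when the base family is 0xF.
    family = base_family + ext_family if base_family == 0xF else base_family
    # The extended model is prepended when the base family is 6 or 0xF.
    if base_family in (0x6, 0xF):
        model = (ext_model << 4) | base_model
    else:
        model = base_model
    return {"family": family, "model": model, "stepping": stepping}


if __name__ == "__main__":
    # Signatures as quoted in the tweets above.
    for sig in (0x806C1, 0xA0655, 0x50654, 0x906A0, 0x630F01, 0x706A8, 0x50656):
        d = decode_cpuid_signature(sig)
        print(f"{sig:X}: family {d['family']:X}h, model {d['model']:X}h, stepping {d['stepping']}")
```

For example, 806C1 decodes to family 6, model 8Ch, stepping 1, and 630F01 decodes to family 15h, model 30h, stepping 1.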
InstLatX64 Oct 3
released the 41st edition of the ISA Extensions Reference with , , in
InstLatX64 Sep 17
Replying to @InstLatX64
Strange: AVX2 -> AVX256, AVX512 -> AVX3
InstLatX64 Sep 17
This pdf has a comprehensive list of CPUID bits on p.116-123. It mentions a few new ones (at least for me): FZM, MPRR, SGX_TEM, SGX_KEYS, ULI, DEDUP, HRESET, Fast REP*s, and - according to the XFAM bits - -related and
InstLatX64 Sep 15
released the "Intel Key Locker Specification" 343965-001US pdf