@Underfox3
In this paper, researchers present SLEEF, a portable SIMD library that implements vectorized versions of all C99 real floating-point math functions, with performance comparable to Intel SVML.
arxiv.org/pdf/2001.09258… pic.twitter.com/zt70pOrkbL
Underfox
@Underfox3
Feb 2
Project page:
sleef.org pic.twitter.com/N3xEd9aW8w
Filippo Spiga
@filippospiga
Feb 2
This library is also used by @PyTorch
Underfox
@Underfox3
Feb 2
Yeah, I had forgotten to give a special mention about its use in #PyTorch
Thanks! pic.twitter.com/7kxAICHUcU
Federico Ficarelli
@fficarelli
Feb 2
Really curious to see how these kinds of portable SIMD abstractions could be adapted to target variable-length vector ISAs, where loop preambles and remainders are usually not needed in favor of predication tricks.
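To make the contrast concrete, here is a minimal scalar model of the two loop shapes, not real SIMD code and not how SLEEF or any particular ISA implements them. The lane count `VLEN` and both function names are hypothetical, chosen only for illustration, and `math.sin` stands in for a vector math call. Only the remainder case is shown; an alignment preamble would follow the same pattern at the start of the array.

```python
import math

VLEN = 4  # hypothetical fixed vector width in lanes, for illustration only


def sin_fixed_width(xs):
    """Fixed-width style: full-vector main loop plus a scalar remainder loop."""
    out = [0.0] * len(xs)
    n_main = len(xs) - len(xs) % VLEN
    # Main loop: each iteration stands in for one full-width vector operation.
    for i in range(0, n_main, VLEN):
        out[i:i + VLEN] = [math.sin(x) for x in xs[i:i + VLEN]]
    # Remainder loop: leftover elements are handled one at a time.
    for i in range(n_main, len(xs)):
        out[i] = math.sin(xs[i])
    return out


def sin_predicated(xs, vlen=VLEN):
    """Predicated style: a single loop where a per-lane mask disables
    lanes that would fall past the end of the array."""
    out = [0.0] * len(xs)
    for i in range(0, len(xs), vlen):
        # Predicate: lane k is active only while i + k is inside the array.
        mask = [i + k < len(xs) for k in range(vlen)]
        for k, active in enumerate(mask):
            if active:
                out[i + k] = math.sin(xs[i + k])
    return out
```

In the predicated version the loop body never assumes a particular `vlen`, which is the property that lets the same code shape serve hardware with different vector lengths.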
Filippo Spiga
@filippospiga
Feb 2
I can put you in contact with Francesco at @Arm ;-)
Gregory void★ Pakosz
@gpakosz
16 h
Do I understand correctly that calling Sleef_sin_u10() on a single double is still faster than calling the standard library's sin() function?
At first, when I read "vectorized math library", I expected to have to call functions on arrays.