|
Joshua Saxe
@joshua_saxe
|
|
Chief Scientist @ Sophos (views my own). Book: Malware Data Science. Recent paper: arxiv.org/pdf/1804.05020…. Interested in ML, cyber, social-sci, philosophy
|
|
|
865
Tweets
|
71
Following
|
1,311
Followers
|
| Tweets |
| Joshua Saxe retweeted | ||
|
iximeow
@iximeow
|
Feb 5 |
|
"what in the world, that's executable machine code?" you betcha, and there are more than a few tricks here to fit the logic into an ascii string: twitter.com/JohnLaTwC/stat…
|
||
|
|
||
| Joshua Saxe retweeted | ||
|
Catherine Olsson
@catherineols
|
19 h |
|
a very ingenious use of non-robust features (a la Ilyas et al 2019 Features Not Bugs):
deliberately inject *new* synthetic non-robust features into your dataset
check for them later as evidence that your dataset was used twitter.com/facebookai/sta…
|
||
|
|
||
|
Joshua Saxe
@joshua_saxe
|
10 h |
|
Yeah. You'd have to use the features they provide or derive some from the metadata they provide, so you're constrained in that way... but obviously the dataset has the huge benefit of being available to everyone, to compare methods.
|
||
|
|
||
|
Joshua Saxe
@joshua_saxe
|
10 h |
|
This was a private dataset from a government source... Perhaps @mrphilroth's EMBER dataset would be a good context in which to explore this kind of large scale visualization work, though.
|
||
|
|
||
|
Joshua Saxe
@joshua_saxe
|
10 h |
|
Cool - from skimming this is exceptionally well written! What I was going for when I generated these plots was a method that didn't require any model fitting (hence random projections) and could scale O(n) as I added samples. I don't think this fits that bill but I will read.
|
||
|
|
||
|
Joshua Saxe
@joshua_saxe
|
11 h |
|
Another visualization of "malware mordor" (110k sample malware dataset), this time via printable string minhashes used as indices into a 2d histogram. Lots of cluster structure. If I was more inspired I'd photoshop a victim operating system getting grilled over these flames :) pic.twitter.com/YlZf5MUANB
|
||
|
|
||
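One plausible reading of "printable string minhashes used as indices into a 2d histogram" can be sketched as follows. This is a guess at the approach, not the code behind the plot: the helper names (`printable_strings`, `salted_hash`, `minhash_cell`), the use of SHA-256 with two salts, and the 64x64 grid are all assumptions made for illustration. Two samples share a given minhash with probability equal to the Jaccard similarity of their string sets, so near-duplicates tend to pile into the same cells.

```python
import hashlib
import re
from collections import Counter

def printable_strings(data, min_len=4):
    """Extract runs of printable ASCII bytes, in the spirit of the `strings` utility."""
    return set(re.findall(rb"[\x20-\x7e]{%d,}" % min_len, data))

def salted_hash(salt, item):
    """Hypothetical helper: a 64-bit hash of `item`, varied by `salt`."""
    digest = hashlib.sha256(salt + item).digest()
    return int.from_bytes(digest[:8], "big")

def minhash_cell(strings, bins=64):
    """Map a set of strings to a (row, col) cell using two independent minhashes."""
    if not strings:
        return (0, 0)
    row = min(salted_hash(b"row", s) for s in strings) % bins
    col = min(salted_hash(b"col", s) for s in strings) % bins
    return (row, col)

# Toy "binaries": same printable strings, different non-printable padding.
sample_a = b"\x00\x01GetProcAddress\x00LoadLibraryA\x02\x03kernel32.dll"
sample_b = b"\xff\xfeGetProcAddress\x90\x90LoadLibraryA\x00kernel32.dll\x00\x00"

# The histogram is a count of samples per cell; one pass over the data, no model fitting.
histogram = Counter(minhash_cell(printable_strings(s)) for s in (sample_a, sample_b))
```

Because each sample is hashed independently, the cost grows linearly with the number of samples.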
|
Joshua Saxe
@joshua_saxe
|
11 h |
|
Two attempts to visualize the topology of "malware mordor" -- 110k malware samples random-projected onto a 2d surface, histogrammed to show concentration. Malware datasets contain huge volumes of near-duplicate binaries (~60%?). You can see that pretty clearly here. pic.twitter.com/7GwUeWlJAY
|
||
|
|
||
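A minimal sketch of the random-projection-plus-histogram idea, assuming each sample has already been turned into a fixed-length numeric feature vector (the tweet does not say which features were used). The function names, the Gaussian projection, and the 8x8 grid are illustrative assumptions. Nothing is fitted to the data, and each sample is touched once, so the work is O(n) in the number of samples.

```python
import random

def random_projection_matrix(n_features, seed=0):
    """Two random Gaussian direction vectors, one per output axis."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(n_features)] for _ in range(2)]

def project(vec, matrix):
    """Project one feature vector to an (x, y) point."""
    return tuple(sum(w * v for w, v in zip(row, vec)) for row in matrix)

def histogram_2d(points, bins=8):
    """Count points per cell of a bins x bins grid spanning the data range."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    x_lo, x_hi = min(xs), max(xs)
    y_lo, y_hi = min(ys), max(ys)
    grid = [[0] * bins for _ in range(bins)]
    for x, y in points:
        # `or 1.0` guards against a zero-width range when all values coincide.
        col = min(int((x - x_lo) / ((x_hi - x_lo) or 1.0) * bins), bins - 1)
        row = min(int((y - y_lo) / ((y_hi - y_lo) or 1.0) * bins), bins - 1)
        grid[row][col] += 1
    return grid

# Toy data: six exact duplicates plus three distinct samples.
samples = [[1, 0, 3, 2]] * 6 + [[0, 5, 1, 1], [4, 4, 0, 2], [2, 1, 1, 7]]
matrix = random_projection_matrix(n_features=4)
grid = histogram_2d([project(s, matrix) for s in samples], bins=8)
```

Duplicates project to the same point, so they stack in a single cell; that is the kind of concentration the plots make visible.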
|
Joshua Saxe
@joshua_saxe
|
11 h |
|
Ridiculing, belittling, and diminishing others as a way to make a point has always been the shadow side of hacker culture, in my experience. I'm very grateful that I haven't seen that, or heard of it happening, in the ML-sec niche. We seem to have a good thing going...
|
||
|
|
||
|
Joshua Saxe
@joshua_saxe
|
17 h |
|
Thanks!
|
||
|
|
||
|
Joshua Saxe
@joshua_saxe
|
19 h |
|
Hey, you're building the next Recorded Future ;)
|
||
|
|
||
|
Joshua Saxe
@joshua_saxe
|
19 h |
|
Yeah, that's a good point. Hey, do you have a link to the talk?
|
||
|
|
||
|
Joshua Saxe
@joshua_saxe
|
22 h |
|
I can think of some but I'm not going to say them on Twitter.
|
||
|
|
||
|
Joshua Saxe
@joshua_saxe
|
23 h |
|
v0.2 of my security learning model thanks to feedback from @eugk @taosecurity and @jacnah63. It's part of what makes security so exhilarating that many conversations (e.g. strategy around designing a threat response operation) require every layer as part of the conversation. pic.twitter.com/bDuT0Znx2t
|
||
|
|
||
|
Joshua Saxe
@joshua_saxe
|
Feb 5 |
|
Interesting bleak thought. Not sure. I'm not a cybersecurity international relations person. I imagine over the next few decades norms will be established around this sort of thing...
|
||
|
|
||
|
Joshua Saxe
@joshua_saxe
|
Feb 5 |
|
WWII ordnance is still recovered in Europe, dangerous 8 decades later. The shelf life of expensive cyberweapons, in contrast, is months, or a few years. To what degree will the incentive to "use zero days before you lose them" bear on future military/intel history?
|
||
|
|
||
|
Joshua Saxe
@joshua_saxe
|
Feb 5 |
|
Malware analysis by poking the malware: We run malware, and then run it again with added stimulus (eg keystrokes). Events that *only* occur with stimulus are high-signal and telling. I did the viz part, the paper authors did everything else. researchgate.net/publication/26… pic.twitter.com/J2mUS5jWO3
|
||
|
|
||
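The core of the "poke the malware" comparison can be sketched as a set difference, assuming each sandbox run yields a set of observed events. The event tuples and the function name below are hypothetical; the paper's actual event model and tooling are not described in the tweet.

```python
def stimulus_only_events(baseline_events, stimulated_events):
    """Return events observed with stimulus that never appeared in the baseline run."""
    return set(stimulated_events) - set(baseline_events)

# Hypothetical event logs from two runs of the same sample.
baseline = {
    ("file_read", "config.ini"),
    ("registry_query", "Run"),
}
with_keystrokes = {
    ("file_read", "config.ini"),
    ("registry_query", "Run"),
    ("file_write", "keys.log"),            # appears only after keystrokes are injected
    ("net_connect", "203.0.113.7:443"),    # documentation-range address, illustrative
}

high_signal = stimulus_only_events(baseline, with_keystrokes)
```

In practice runs are noisy, so repeating each condition several times and keeping only events that are consistently stimulus-dependent would likely be needed.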
|
Joshua Saxe
@joshua_saxe
|
Feb 5 |
|
This is great news! Especially after very few AI talks were accepted last year. twitter.com/OnInsecurity/s…
|
||
|
|
||
|
Joshua Saxe
@joshua_saxe
|
Feb 5 |
|
IMO respectful critique is the way. 90% of our community has gallows humor about their own marketing, understands it's bad, but does what works. We need to design a market in which dishonest signaling works worse than honest signaling; how do we do that is the discussion.
|
||
|
|
||
|
Joshua Saxe
@joshua_saxe
|
Feb 5 |
|
Yes that's just it. It's a market for lemons out there, with dishonest signalling distorting and polluting the conversation. @swagitda_'s not wrong to call that out. I'm interested in benchmarks, measurability, and other market shaping institutions that would improve matters.
|
||
|
|
||
|
Joshua Saxe
@joshua_saxe
|
Feb 5 |
|
Yes, agreed. If I implied you didn't understand the tech, and it came off as gendered, I'm really sorry. It's not what I meant, but that's not an excuse. That said, picking an easily identifiable startup, and then ridiculing them publicly, partly for humor, didn't feel kind.
|
||
|
|
||