|
Frank McSherry
@frankmcsherry
|
|
Co-founder and Chief Scientist, Materialize, Inc.
"Science advances one funeral at a time" -Planck
|
|
|
2,280
Tweets
|
93
Following
|
3,057
Followers
|
| Tweets |
| Frank McSherry retweeted | ||
|
Vasia Kalavri
@vkalavri
|
Jan 24 |
|
It's North East Database Day #NEDBDay on Monday! Excited to meet the local community and discuss our recent & ongoing stream processing work.
Catch my talk right after lunch and our 2 posters at 4:30pm: mitdbg.github.io/nedbday/2020/#…
|
||
|
|
||
| Frank McSherry retweeted | ||
|
Malte Sandstede
@MalteSandstede
|
Jan 24 |
|
While I won't be able to attend personally (apparently, there's an ocean in between), I'm really excited to see a poster of my thesis on online distributed dataflow analysis with timely dataflow making it to #NEDBDay 2020 and being presented by Vasia! 🎉 What's not to like? 😊 twitter.com/vkalavri/statu…
|
||
|
|
||
|
Frank McSherry
@frankmcsherry
|
Jan 23 |
|
Once you learn the correct answer, tell me and I'll go back in time and implement it in differential dataflow.
|
||
|
|
||
|
Frank McSherry
@frankmcsherry
|
Jan 20 |
|
Super interested to hear more about it! pic.twitter.com/BnKAiOnQL4
|
||
|
|
||
|
Frank McSherry
@frankmcsherry
|
Jan 20 |
|
I'll be talking at the North East Database Day (mitdbg.github.io/nedbday/2020/) on the 27th about what we are up to at @MaterializeInc. There is still a day or so to register, if you are within striking distance of Boston!
|
||
|
|
||
| Frank McSherry retweeted | ||
|
💧Riff Raff
@RichardAOB
|
Jan 11 |
|
Apparently wombats in fire-affected areas are not only allowing other animals to take shelter in their deep, fire-resistant burrows but are actively herding fleeing animals into them.
We’re seeing more leadership and empathy from these guys than the entire Federal government. pic.twitter.com/LGcpSu9x0M
|
||
|
|
||
|
Frank McSherry
@frankmcsherry
|
Jan 10 |
|
Skunkworks Friday at @MaterializeInc: we are now able to implement joins using delta queries, which bounds the incremental memory footprint of weird, multiway join queries to that of whatever aggregation is at the end.
Uses differential's dogs^3 internals
github.com/TimelyDataflow…
|
||
|
|
||
|
Frank McSherry
@frankmcsherry
|
Jan 10 |
|
My understanding is that it is not uniformly agreed [from a policy point of view] that random draws provide sufficient protection. Much disagreement.
Perhaps less contentious, random draws don't provide the same *guarantees* as DP (they are weaker, I think most would agree).
|
||
|
|
||
| Frank McSherry retweeted | ||
|
Vlad Magdalin
@callmevlad
|
Jan 8 |
|
This is hands down the best movie of the year. pic.twitter.com/iXcsgtPpNP
|
||
|
|
||
|
Frank McSherry
@frankmcsherry
|
Jan 10 |
|
But you could (and perhaps should) publish the raw measurements, alongside a suite of tools that help you answer questions about P(real_question | dp_observations). Synthetic data is a least common denominator, but there are more direct ways from dp measurements to conclusions.
|
||
|
|
||
|
Frank McSherry
@frankmcsherry
|
Jan 10 |
|
I'm not personally familiar with DP techniques that require post-processing... Most (of the ones I work with) take relatively direct measurements that are tricky for humans to interpret, and then spend some effort turning that information back into something that "feels better".
|
||
|
|
||
|
Frank McSherry
@frankmcsherry
|
Jan 10 |
|
But, correct me if I'm wrong here, TopDown is post-DP-measurement, right? So from the raw measurements and invariants, one could apply different cleaning routines to return the data to normalcy, to try and match their needs.
|
||
|
|
||
|
Frank McSherry
@frankmcsherry
|
Jan 10 |
|
(( context: I hit my head against MLE and LSE techniques for hierarchical data for a while, and they each had various flavors of defects; the MW one had the fewest that I ran into, but I'm sure it depends tremendously on the setting and requirements. ))
|
||
|
|
||
|
Frank McSherry
@frankmcsherry
|
Jan 10 |
|
But at the same time, the multiplicative weights updates come with appealing generalization guarantees that AFAIK the least-squares approaches don't have. At least of a flavor (relative entropy) that can work better for folks working with distributions.
|
||
|
|
||
|
Frank McSherry
@frankmcsherry
|
Jan 9 |
|
The connection to multiplicative weights ends up with some pleasant theoretical results about the progressive reduction of relative entropy. I could imagine the raking community have either reached similar conclusions or might be keen to!
|
||
|
|
||
|
Frank McSherry
@frankmcsherry
|
Jan 9 |
|
It seems operationally similar to what we ended up doing with the MWEM algorithm for DP measurements: repeatedly applying rescaling based on fit to various noisy linear measurements:
arxiv.org/pdf/1012.4763.…
It sounds like raking might be similar, for row and column marginals.
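For readers unfamiliar with raking, here is a minimal sketch of the "repeated rescaling to fit marginals" idea, written as plain iterative proportional fitting on a small two-way table. It is an illustration only, not the MWEM algorithm from the linked paper: MWEM selects queries privately and applies an exponential multiplicative-weights update to noisy measurements, whereas this sketch assumes exact row and column targets. The function name `rake`, the fixed round count, and the example numbers are all hypothetical.

```python
def rake(table, row_targets, col_targets, rounds=50):
    """Iterative proportional fitting (raking) on a 2-D table.

    Alternately rescales rows and columns multiplicatively so that their
    sums match the target marginals. Illustrative sketch only.
    """
    cells = [list(row) for row in table]  # work on a copy of the input
    for _ in range(rounds):
        # Rescale each row so its sum matches the row target.
        for i, target in enumerate(row_targets):
            total = sum(cells[i])
            if total > 0:
                cells[i] = [v * target / total for v in cells[i]]
        # Rescale each column so its sum matches the column target.
        for j, target in enumerate(col_targets):
            total = sum(row[j] for row in cells)
            if total > 0:
                for row in cells:
                    row[j] *= target / total
    return cells


# Hypothetical example: start from a uniform table and fit both marginals.
fitted = rake(
    [[1.0, 1.0], [1.0, 1.0]],
    row_targets=[30.0, 70.0],
    col_targets=[40.0, 60.0],
)
```

Each pass only multiplies existing cells by a correction factor, which is the operational similarity the tweet points at. With noisy or mutually inconsistent targets the row and column passes would keep pulling against each other, so a real use would need a stopping rule rather than a fixed number of rounds.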
|
||
|
|
||
| Frank McSherry retweeted | ||
|
Jonathan Aldrich
@JAldrichCMU
|
Jan 5 |
|
Join me! ACM members, sign a petition supporting open access, asking @TheOfficialACM to withdraw signature from anti-OA letter and make OA available at cost. OA is good for science, and good for ACM! change.org/p/association-…
|
||
|
|
||
| Frank McSherry retweeted | ||
|
Vasia Kalavri
@vkalavri
|
Jan 5 |
|
In other news, I'm beyond excited to have officially started at @BUCompSci last week!
I'll be teaching Data Stream Processing and Analytics this semester, covering fundamental and emerging topics, as well as hands-on @ApacheFlink and @apachekafka.
|
||
|
|
||
| Frank McSherry retweeted | ||
|
Nikolas Göbel
@NikolasGoebel
|
Dec 31 |
|
I've collected some of this year's notes on databases, IVM, software engineering, and other topics.
nikolasgoebel.com/2019/12/30/per…
|
||
|
|
||
|
Frank McSherry
@frankmcsherry
|
Dec 29 |
|
We built the (very neat) Lego treehouse set over the holidays. My sister is now investigating whether we can build the lego.com/en-us/product/… set next, to process the tree.
|
||
|
|
||