Andreas Kirsch
Did you know you can classify MNIST using gzip? 🤓 You can get 45% accuracy on binarized MNIST using class-wise compression and counting bits 🤗 🔥 No PyTorch or TensorFlow needed 🔥 BASH script and classifier 👉
Compression algorithms (like the well-known zip file compression) can be used for machine learning purposes, specifically for classifying hand-written digits (MNIST) - BlackHC/mnist_by_zip
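The class-wise compression trick can be sketched in a few lines of Python. This is a toy illustration with made-up byte strings, not the repo's actual BASH script; `classify` and `train_by_class` are hypothetical names. The idea: concatenate a test sample onto each class's training data and pick the class whose compressor pays the fewest extra bytes to encode it.

```python
import gzip

def gzip_len(data: bytes) -> int:
    """Compressed size in bytes, used as a rough proxy for information content."""
    return len(gzip.compress(data))

def classify(train_by_class: dict, x: bytes) -> int:
    """Pick the class c minimizing len(gzip(train_c + x)) - len(gzip(train_c)),
    i.e. the class whose data makes x cheapest to encode."""
    return min(
        train_by_class,
        key=lambda c: gzip_len(train_by_class[c] + x) - gzip_len(train_by_class[c]),
    )

# Toy demo: x repeats class 0's pattern, so gzip encodes it almost for free
# against class 0's data, but pays full literal cost against class 1's.
train_by_class = {0: b"0123456789" * 50, 1: b"abcdefghij" * 50}
x = b"0123456789" * 5
print(classify(train_by_class, x))
```

On real MNIST one would compress the binarized pixel arrays per digit class in the same way; the per-class subtraction cancels gzip's fixed header overhead.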
Andreas Kirsch · Jul 17
Replying to @PyTorch @TensorFlow and 3 others
This is getting more attention than expected, so full acknowledgements: thanks to Christopher Mattern (from ) who mentioned this to me about two years ago over Friday drinks as a fun fact, and to for a random afternoon conversation turning into a tiny project 🎉💕
Yann LeCun · Jul 17
Replying to @BlackHC @PyTorch and 2 others
Sure but...45% accuracy is not exactly good. You can get close to 88% with a linear classifier. You can get 95% with nearest-neighbor/L2 distance. No deep learning necessary. But if you want more than 99% without losing your computational shirt, go with ConvNets.
Andreas Kirsch · Jul 17
Replying to @ylecun @PyTorch and 2 others
Thanks! That's true 🤗 I would not recommend anyone use this classifier in seriousness 😇 I was surprised it works this well at all, and better than nearest-neighbor on pixel sums. At best, it's a simple proof-of-concept for information-theoretic approaches 😊
Marc G. Bellemare · Jul 17
Replying to @BlackHC @PyTorch and 2 others
Nice! For completeness, a link to some of the original classification-by-compression work:
Andreas Kirsch · Jul 17
Replying to @marcgbellemare @PyTorch and 3 others
Thanks! I had been looking around a bit for similar papers but hadn't found much. It seems well-known in the statistical compression community. Indeed, I have to thank Christopher Mattern (from ) for mentioning this over drinks three years ago as a fun fact/idea 😊
Sebastian Raschka · Jul 17
Replying to @BlackHC @PyTorch and 2 others
I was recently wondering about sth similar: you can probably just count the number of pixels (i.e., just do a sum over the pixel values) to classify MNIST images with ~50% accuracy, which isn't too bad.
Andreas Kirsch · Jul 17
Replying to @rasbt @PyTorch and 2 others
Actually, we have tried that 😊 You only get 20% accuracy. Zip compression indeed performs significantly better. If you scroll down in the Jupyter Notebook, you can see results for summing on both binarized MNIST and vanilla MNIST. 👉
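The pixel-sum idea above amounts to a one-feature nearest-class-mean classifier. A minimal sketch under that assumption, using hypothetical toy "images" (flat lists of pixel values) rather than actual MNIST data:

```python
def pixel_sum(img):
    """The single feature: total ink in the image."""
    return sum(img)

def fit_means(images, labels):
    """Mean pixel sum per class."""
    totals, counts = {}, {}
    for img, y in zip(images, labels):
        totals[y] = totals.get(y, 0) + pixel_sum(img)
        counts[y] = counts.get(y, 0) + 1
    return {y: totals[y] / counts[y] for y in totals}

def predict(means, img):
    """Assign the class whose mean pixel sum is nearest."""
    s = pixel_sum(img)
    return min(means, key=lambda y: abs(means[y] - s))

# Toy demo: '1'-like images use little ink, '8'-like images use a lot.
images = [[1, 0, 1, 0], [1, 1, 0, 0], [1, 1, 1, 1], [1, 1, 1, 0]]
labels = [1, 1, 8, 8]
means = fit_means(images, labels)
print(predict(means, [0, 1, 1, 0]))
```

Since ten digit classes overlap heavily in total ink, a single scalar feature separates them poorly, which is consistent with the ~20% figure reported here.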
ok boomer · Jul 17
Replying to @BlackHC @PyTorch and 2 others
Yaa, I mean ummm... It is definitely creative. Btw, isn't even like random coins should get like 50 percent accuracy?
Andreas Kirsch · Jul 18
Replying to @raxtechbits @PyTorch and 2 others
Random baseline accuracy is 10% ☺️
Kenneth Marino · Jul 17
Replying to @BlackHC @gokstudio and 3 others
“We are uncertain whether this is an appraisal of zip compression or an indictment of the MNIST dataset.”
Nelson Correa · Jul 17
Replying to @Kenneth_Marino @BlackHC and 4 others
After 30 years of optimizing on it, the MNIST test set is no longer a test set; it is rather a validation set.