Isaac R Caswell Oct 29
What do we need to scale NLP research to 1000 languages? We started off with a goal to build a monolingual corpus in 1000 languages by mining data from the web. Here’s our work documenting our struggles with Language Identification (LangID): 1/8