Flitto is a crowdsourced translation platform used by over 10 million users worldwide. Through our platform, we collect and provide a multilingual corpus of expressions used by native speakers, along with other data such as speech recordings and images of handwriting.
Thanks to our crowdsourced platform and our efficiency in managing large data sets, we can swiftly collect a wide range of language data and deliver it with the corresponding metadata (gender, age, region, etc.).
Multilingual Parallel Corpus
Flitto provides multilingual corpus data collected through our crowdsourced platform and thoroughly reviewed by professionals. Our text corpus data is used to train NLP engines and algorithms such as neural machine translation and AI chatbots.
Multilingual Speech Data
We collect and build multilingual speech data for NLP, STT, and TTS engines. Speech data is collected according to specific criteria, and metadata (gender, age, region, etc.) is either supplied with the data or generated for it.
Flitto collects and provides image data according to various criteria, including images that contain text, such as multilingual menus and handwriting. We also apply social tagging to those images, generating multiple tags for the same image.
Metadata is either supplied with the data or generated through sentiment analysis, object identification, and tag creation for various content.
For more information on Flitto's language data, click the URL below.