What kind of language data does Flitto collect and provide? – Flitto Support

Flitto is a crowdsourced translation platform used by over 10 million users worldwide. Through our platform, we collect and provide multilingual corpora of expressions used by native speakers, along with other data such as speech recordings and images of handwriting.

Thanks to our crowdsourced platform and our efficiency in managing large data sets, we can swiftly collect a wide range of language data and provide it together with the corresponding metadata (gender, age, region, etc.).

Multilingual Parallel Corpus

Flitto provides multilingual corpus data that is collected through our crowdsourced platform and then thoroughly reviewed by professionals. Our text corpus data is used to train NLP systems such as neural machine translation engines and AI chatbots.

Multilingual Speech Data

We collect and build multilingual speech data for NLP, speech-to-text (STT), and text-to-speech (TTS) engines. Speech data is collected according to specific criteria, and metadata (e.g., gender, age, region) is provided or generated.

Image Data

Flitto collects and provides image data according to various criteria, including images that contain text, such as multilingual menus and handwriting. We also apply social tagging to these images, generating multiple tags for the same image.

Data Annotation

Metadata is provided or generated through sentiment analysis, object identification, and tag creation for various types of content.

For more information on Flitto language data, follow the link below.

Go to Flitto language data