ByteDance/MTVQA
Visual Question Answering | Image To Text | MULTILINGUAL, AR, DE | cc-by-nc-4.0
Created by ByteDance in 2024, ByteDance/MTVQA is a multilingual visual question answering dataset (language tags: MULTILINGUAL, AR, DE) containing 8,794 records in Parquet format. It has 333 downloads and 42 likes. It is released under the cc-by-nc-4.0 license and falls in the 1K<n<10K size category.
About ByteDance/MTVQA
Dataset Card
The dataset targets visual question answering on multilingual text scenes in nine languages: Korean, Japanese, Italian, Russian, German, French, Thai, Arabic, and Vietnamese. The question-answer pairs are labe...
Details
- Task: Visual Question Answering, Image To Text
- Language: MULTILINGUAL, AR, DE
- Format: Parquet
- Rows / instances: 8,794
- Size: 1K<n<10K
- Creator: ByteDance
- Year: 2024
- License: cc-by-nc-4.0
- Downloads: 333
- Likes: 42
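For readers who want to inspect the records, the following is a minimal sketch of loading the dataset by the repository name listed above and comparing its size with the listed row count. It assumes the Hugging Face `datasets` package is installed and the Hub is reachable. The helper names (`load_mtvqa`, `total_rows`, `matches_listing`) are hypothetical and introduced only for illustration. No split or column names are assumed, since this page does not list them.

```python
"""Sketch: load ByteDance/MTVQA from the Hugging Face Hub and count its rows."""

DATASET_ID = "ByteDance/MTVQA"  # repository name as given on this page
LISTED_ROWS = 8794              # "Rows / instances" value from the details above


def load_mtvqa():
    """Download every available split of the dataset.

    Assumption: the `datasets` package is installed and the Hub is reachable.
    """
    # Imported lazily so the counting helpers below also work offline.
    from datasets import load_dataset

    return load_dataset(DATASET_ID)


def total_rows(splits):
    """Sum the row counts over a mapping of split name -> sized collection."""
    return sum(len(rows) for rows in splits.values())


def matches_listing(splits, expected=LISTED_ROWS):
    """Return True when the combined split sizes equal the listed row count."""
    return total_rows(splits) == expected
```

Calling `matches_listing(load_mtvqa())` would compare the downloaded splits against the listed figure. Whether the 8,794 figure counts all splits together is not stated on this page, so a mismatch may simply reflect a different counting convention. The cc-by-nc-4.0 license restricts use to non-commercial purposes.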