google/wit
Text Retrieval | Image To Text | AF, AR, AST | cc-by-sa-3.0
google/wit is a dataset for text retrieval and image-to-text tasks, tagged with the languages AF, AR, and AST and distributed in Parquet format. It is released under the cc-by-sa-3.0 license, falls in the 1M<n<10M size category, and has been downloaded 269 times.
About google/wit
The Wikipedia-based Image Text (WIT) Dataset is a large multimodal, multilingual dataset.
WIT is composed of a curated set of 37.6 million entity-rich image-text examples with 11.5 million unique images across 108 Wikipedia languages.
Its size enables ...
Details
- Task: Text Retrieval, Image To Text
- Language: AF, AR, AST
- Format: Parquet
- Rows / instances: N/A
- Size: 1M<n<10M
- Creator:
- Year: 2022
- License: cc-by-sa-3.0
- Downloads: 269
- Likes: 69
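
Since the listing gives the repository id (google/wit) and the Parquet format, a reader may want to peek at a few records without downloading everything. The following is a minimal sketch, not an official loading recipe: it assumes the Hugging Face `datasets` library is installed, that the repository can be streamed with `load_dataset`, and that a split named `train` exists (the split name is an assumption, not something stated on this page). The helper name `peek_examples` and its `loader` parameter are hypothetical and exist only for illustration.

```python
from itertools import islice
from typing import Any, Callable, Dict, Iterable, List, Optional

# Repository id taken from the listing title.
DATASET_ID = "google/wit"


def peek_examples(
    limit: int = 3,
    split: str = "train",  # assumed split name; adjust if the repository differs
    loader: Optional[Callable[..., Iterable[Dict[str, Any]]]] = None,
) -> List[Dict[str, Any]]:
    """Return the first `limit` records of the dataset as plain dicts.

    `loader` is a hypothetical injection point so the function can be
    exercised offline; by default it falls back to `datasets.load_dataset`.
    """
    if loader is None:
        # Imported lazily so that defining this function needs no extra packages.
        from datasets import load_dataset

        loader = load_dataset

    # streaming=True iterates over records on demand instead of
    # downloading the whole dataset up front.
    rows = loader(DATASET_ID, split=split, streaming=True)
    return list(islice(rows, limit))
```

Calling `peek_examples(limit=2)` with network access would fetch the first two records, provided the assumptions above hold. Streaming is chosen here because the listed size category (1M<n<10M) makes a full download expensive for a quick inspection.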