pixparse/cc12m-wds
Image To Text, English
pixparse/cc12m-wds is an English image-to-text dataset distributed in Parquet format.
About pixparse/cc12m-wds
Dataset Card for Conceptual Captions 12M (CC12M)
Dataset Summary
Conceptual 12M (CC12M) is a dataset of 12 million image-text pairs intended for vision-and-language pre-training.
Its data collection pipeline is ...
Details
- Task: Image To Text
- Language: English
- Format: Parquet
- Rows / instances: N/A
- Creator: pixparse
- Year: 2023
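
As a rough illustration of how a dataset like this might be inspected, the sketch below streams a few samples rather than downloading all 12 million pairs. It assumes the Hugging Face `datasets` package is installed, that the repository is reachable on the Hub under the name shown in the title, and that a split called `train` exists; none of this is stated in the card. The helper name `peek_samples` is hypothetical.

```python
from itertools import islice

# Repository name taken from the card title.
REPO_ID = "pixparse/cc12m-wds"


def peek_samples(n=2, split="train"):
    """Return the first n samples using streaming mode.

    Assumptions (not confirmed by the card): the `datasets` package is
    installed, the repository is available on the Hugging Face Hub, and a
    split named `split` exists.
    """
    # Imported lazily so this module can be loaded without the library.
    from datasets import load_dataset

    # streaming=True iterates over remote shards instead of downloading them all.
    stream = load_dataset(REPO_ID, split=split, streaming=True)
    return list(islice(stream, n))


# Example usage (requires network access):
#     for sample in peek_samples():
#         # The card does not list field names, so just show which keys exist.
#         print(sorted(sample.keys()))
```

Because the card does not document the field names, the usage example only prints the keys of each sample instead of assuming particular image or caption columns.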