liuhaotian/LLaVA-CC3M-Pretrain-595K
General NLP, EN
Created by liuhaotian in 2023, liuhaotian/LLaVA-CC3M-Pretrain-595K is a General NLP dataset in English (EN), distributed in Parquet format.
About liuhaotian/LLaVA-CC3M-Pretrain-595K
LLaVA Visual Instruct CC3M 595K Pretrain Dataset Card
Dataset details
Dataset type:
LLaVA Visual Instruct CC3M Pretrain 595K is a subset of the CC-3M dataset, filtered for a more balanced distribution of concept coverage.
Captions are also...
Details
- Task: General NLP
- Language: EN
- Format: Parquet
- Rows / instances: N/A
- Creator: liuhaotian
- Year: 2023