mlfoundations/MINT-1T-ArXiv
Image To Text, Text Generation, EN
The mlfoundations/MINT-1T-ArXiv dataset is an English image-to-text and text-generation resource released by mlfoundations in 2024.
About mlfoundations/MINT-1T-ArXiv
MINT-1T: Scaling Open-Source Multimodal Data by 10x: A Multimodal Dataset with One Trillion Tokens
MINT-1T is an open-source Multimodal INTerleaved dataset with 1 trillion text tokens and 3.4 billion images, a 10x scale-up from existing open-...
Details
- Task: Image To Text, Text Generation
- Language: EN
- Format: Parquet
- Rows / instances: N/A
- Creator: mlfoundations
- Year: 2024
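Since the listing gives the format as Parquet but no row count, one way to inspect the data is to stream a few records rather than download everything. The following is a minimal sketch, not a documented loading procedure. It assumes the dataset is hosted on the Hugging Face Hub under the ID shown above, that it has a split named "train", and that the `datasets` library is installed. The helper name `peek_rows` is hypothetical.

```python
# Sketch only: assumes the dataset is hosted on the Hugging Face Hub under
# this ID and exposes a "train" split; neither is stated in the listing.
DATASET_ID = "mlfoundations/MINT-1T-ArXiv"


def peek_rows(n=2, split="train"):
    """Stream the first n records without downloading every Parquet shard."""
    # Imported lazily so this module loads even if `datasets` is not installed.
    from datasets import load_dataset

    # streaming=True returns an iterable dataset that reads records on demand.
    stream = load_dataset(DATASET_ID, split=split, streaming=True)
    return list(stream.take(n))
```

Calling `peek_rows()` and printing `sorted(row.keys())` for each returned row would show which columns are present, since the listing does not document them.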