mlfoundations/MINT-1T-HTML

Image To Text, Text Generation, EN

mlfoundations/MINT-1T-HTML is an English-language dataset focused on image-to-text tasks, distributed in Parquet format.

About mlfoundations/MINT-1T-HTML

šŸƒ MINT-1T:Scaling Open-Source Multimodal Data by 10x: A Multimodal Dataset with One Trillion Tokens šŸƒ MINT-1T is an open-source Multimodal INTerleaved dataset with 1 trillion text tokens and 3.4 billion images, a 10x scale-up from existing op...

Details

Task: Image To Text, Text Generation
Language: EN
Format: Parquet
Rows / instances: N/A
Creator: mlfoundations
Year: 2026