TIGER-Lab/VisualWebInstruct
Question Answering, Visual Question Answering, Image Text To Text, EN
TIGER-Lab/VisualWebInstruct is an English-language (EN) question answering dataset distributed in Parquet format.
About TIGER-Lab/VisualWebInstruct
VisualWebInstruct: Scaling up Multimodal Instruction Data through Web Search
VisualWebInstruct is a large-scale, diverse multimodal instruction dataset designed to enhance vision-language models' reasoning capabilities. The dataset contains app...
Details
- Task: Question Answering, Visual Question Answering, Image Text To Text
- Language: EN
- Format: Parquet
- Rows / instances: N/A
- Creator: TIGER-Lab
- Year: 2025
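
The details above name the repository and the Parquet format but not how to read the data. The following is a minimal sketch in Python, assuming the identifier refers to a dataset repository on the Hugging Face Hub and that the `datasets` library is installed. The helper name `peek_rows`, the `split="train"` default, and the choice of the first available configuration are illustrative assumptions, not details taken from this listing.

```python
REPO_ID = "TIGER-Lab/VisualWebInstruct"  # identifier taken from the listing


def peek_rows(repo_id=REPO_ID, config=None, split="train", n=3):
    """Return the first n rows of one configuration as a list of dicts.

    Hypothetical helper: the configuration and split names are assumptions.
    """
    # Imported here so this module can be loaded without the library installed.
    from datasets import get_dataset_config_names, load_dataset

    if config is None:
        # Ask the Hub which configurations exist and take the first one.
        config = get_dataset_config_names(repo_id)[0]

    # streaming=True reads the remote files lazily instead of
    # downloading the whole dataset before returning.
    stream = load_dataset(repo_id, config, split=split, streaming=True)
    return list(stream.take(n))
```

Calling `peek_rows(n=1)` needs network access. Since the row count is listed as N/A, streaming a few rows is a cautious way to inspect the column names before committing to a full download.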