TIGER-Lab/VisualWebInstruct

TIGER-Lab/VisualWebInstruct is an English-language question answering dataset distributed in Parquet format.

About TIGER-Lab/VisualWebInstruct

VisualWebInstruct: Scaling up Multimodal Instruction Data through Web Search

VisualWebInstruct is a large-scale, diverse multimodal instruction dataset designed to enhance vision-language models' reasoning capabilities. The dataset contains app...

Details

Task: Question Answering, Visual Question Answering, Image Text To Text
Language: EN
Format: Parquet
Rows / instances: N/A
Creator: TIGER-Lab
Year: 2025