Visual Question Answering Datasets

There are 21 visual question answering datasets in our directory. Each entry links to its source, paper, and download. Browse the full list below or filter by language.

Visual question answering is the task of answering natural-language questions about the contents of an image.

Updated June 2026

What languages do visual question answering datasets cover?

Frequently asked questions