Question 1

What are Speech Recognition datasets used for?

Accepted Answer

Speech Recognition datasets are collections of labelled or raw data used to train, fine-tune, and evaluate models on the speech recognition task. This page lists 5 such datasets, each linking to its source and paper.

Question 2

Which Speech Recognition dataset is best for benchmarking?

Accepted Answer

None of the Speech Recognition datasets listed here is currently tracked as a standard LLM benchmark, although several are widely used for evaluation.

Question 3

How many Speech Recognition datasets are there?

Accepted Answer

We catalog 5 Speech Recognition datasets in one searchable directory.

Speech Recognition Datasets

Frequently asked questions