ATH-MaaS/Marco_Longspeech

Automatic Speech Recognition · Audio Classification · Text Generation · EN, ZH · Benchmark · apache-2.0

ATH-MaaS/Marco_Longspeech is an automatic speech recognition benchmark dataset in English and Chinese (EN, ZH) from ATH-MaaS, distributed in Parquet format under the apache-2.0 license. It falls in the 10K<n<100K size category and has been downloaded 11.8K times.

📊 This dataset is used as an LLM benchmark.

About ATH-MaaS/Marco_Longspeech

Marco-LongSpeech Dataset

Marco-LongSpeech is a multi-task long speech understanding dataset containing 8 different speech understanding tasks, designed to benchmark Large Language Models on lengthy audio inputs. 📊 Dataset Stati...

Details

Task: Automatic Speech Recognition, Audio Classification, Text Generation
Language: EN, ZH
Format: Parquet
Rows / instances: N/A
Size: 10K<n<100K
Creator: ATH-MaaS
Year: 2026
License: apache-2.0
Downloads: 11759
Likes: 18
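
The details above give a repository identifier and a Parquet format, but not how to read the data. The following is a minimal sketch of one way to open it, assuming the dataset is hosted on the Hugging Face Hub under the identifier shown on this page and that the `datasets` library is installed. The helper name `load_marco_longspeech` and its parameters are hypothetical, and the configuration and split names are not stated on this page, so they are left unset here.

```python
"""Sketch: open ATH-MaaS/Marco_Longspeech, ASSUMING it is hosted on the Hugging Face Hub."""

REPO_ID = "ATH-MaaS/Marco_Longspeech"  # identifier as listed on this page


def load_marco_longspeech(config_name=None, split=None, streaming=True):
    """Hypothetical helper around `datasets.load_dataset`.

    config_name and split are not documented on this page; pass them
    if the repository turns out to define several configurations.
    """
    # Imported lazily so this module can be imported without `datasets` installed.
    from datasets import load_dataset

    # streaming=True iterates over the Parquet shards instead of
    # downloading the whole dataset up front.
    return load_dataset(REPO_ID, name=config_name, split=split, streaming=streaming)


if __name__ == "__main__":
    # Requires network access; prints the available splits and features.
    data = load_marco_longspeech()
    print(data)
```

Streaming is chosen as the default because long audio recordings can make a full download large. If the repository defines multiple configurations (for example, one per task), `load_dataset` will ask for an explicit configuration name.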

FAQ