retkowski/ytseg
Token Classification, Automatic Speech Recognition, EN
retkowski/ytseg is an English-language token classification dataset by retkowski, distributed in Parquet format.
About retkowski/ytseg
YTSeg: A Benchmark for Audio Chaptering and Video Transcript Segmentation
We present YTSeg, a topically and structurally diverse benchmark for the audio chaptering and transcript segmentation task based on YouTube videos. The dataset comprises ...
Details
- Task: Token Classification, Automatic Speech Recognition
- Language: EN
- Format: Parquet
- Rows / instances: N/A
- Creator: retkowski
- Year: 2024
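
To make the task framing concrete, here is a minimal sketch of one common way to cast transcript segmentation as a classification problem: each sentence gets a label of 1 if it opens a new chapter and 0 otherwise. The function name `boundary_labels` and the toy transcript are hypothetical and chosen for illustration only. The card above does not describe the dataset's schema or its labeling scheme, so this should not be read as the layout YTSeg actually uses.

```python
from typing import List


def boundary_labels(chapters: List[List[str]]) -> List[int]:
    """Return one label per sentence: 1 for the first sentence of a chapter, else 0.

    `chapters` is a list of chapters, each a list of sentences.
    This input layout is an assumption made for this sketch.
    """
    labels: List[int] = []
    for chapter in chapters:
        if not chapter:
            continue  # an empty chapter contributes no sentences
        labels.append(1)  # the first sentence opens the chapter
        labels.extend([0] * (len(chapter) - 1))  # the rest continue it
    return labels


# Hypothetical toy transcript, not taken from YTSeg.
toy_chapters = [
    ["Welcome to the video.", "Today we cover two topics."],
    ["First, the setup.", "Install the tool.", "Run it once."],
    ["That is all."],
]

print(boundary_labels(toy_chapters))
```

Under this framing, a segmentation model predicts the label sequence from the flattened sentences, and chapter boundaries are recovered from the positions labeled 1.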