google-research-datasets/paws
Text Classification, EN, Benchmark
Created by google-research-datasets in 2022, google-research-datasets/paws is an English (EN) text classification benchmark dataset containing 751,450 records in Parquet format. It has 99.3K downloads and 40 likes. Its license is listed as "other", and it falls in the 100K<n<1M size category.
📊 This dataset is used as an LLM benchmark.
About google-research-datasets/paws
Dataset Card for PAWS: Paraphrase Adversaries from Word Scrambling
Dataset Summary
PAWS: Paraphrase Adversaries from Word Scrambling
This dataset contains 108,463 human-labeled and 656k noisily labeled pairs that feature the importan...
Details
- Task: Text Classification
- Language: EN
- Format: Parquet
- Rows / instances: 751,450
- Size: 100K<n<1M
- Creator: google-research-datasets
- Year: 2022
- License: other
- Downloads: 99,292
- Likes: 40
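
The figures above can be cross-checked with a little arithmetic. The following is a minimal sketch, not part of the dataset or any library: the helper names `in_size_bucket` and `short_count` are hypothetical, and it assumes the size label is a two-sided range with strict bounds, as the string `100K<n<1M` is written.

```python
import re

# Multipliers for the suffixes used in size labels (assumed convention).
_SUFFIX = {"": 1, "K": 1_000, "M": 1_000_000, "B": 1_000_000_000}


def _to_int(token: str) -> int:
    """Convert a token such as '100K' or '1M' to an integer."""
    match = re.fullmatch(r"(\d+)([KMB]?)", token.strip())
    if match is None:
        raise ValueError(f"unrecognized size token: {token!r}")
    return int(match.group(1)) * _SUFFIX[match.group(2)]


def in_size_bucket(n_rows: int, bucket: str) -> bool:
    """Return True if n_rows lies strictly inside a 'LOW<n<HIGH' label."""
    match = re.fullmatch(r"(\w+)<n<(\w+)", bucket.strip())
    if match is None:
        raise ValueError(f"unsupported bucket format: {bucket!r}")
    low, high = _to_int(match.group(1)), _to_int(match.group(2))
    return low < n_rows < high


def short_count(n: int) -> str:
    """Abbreviate a count in thousands with one decimal, e.g. downloads."""
    return f"{n / 1000:.1f}K"


# Values taken from the details listed above.
print(in_size_bucket(751_450, "100K<n<1M"))
print(short_count(99_292))
```

With the listed values, the row count of 751,450 falls inside the 100K<n<1M category, and 99,292 downloads abbreviates to the 99.3K quoted in the summary.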