Microsoft Research Paraphrase Corpus (MRPC)
Paraphrasing Identification · English · Benchmark
Microsoft Research Paraphrase Corpus (MRPC) is an English paraphrase identification benchmark dataset from Dolan et al., containing 5,801 sentence pairs in text format.
This dataset is used as an LLM benchmark.
About Microsoft Research Paraphrase Corpus (MRPC)
The dataset contains pairs of sentences extracted from online news sources, along with human annotations indicating whether the two sentences in each pair are paraphrases of each other, that is, semantically equivalent.
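To make the record structure concrete, the following is a minimal sketch of how one sentence pair and its binary annotation could be represented, together with a naive word-overlap baseline. The field names (`sentence1`, `sentence2`, `is_paraphrase`), the helper functions, and the threshold are assumptions chosen for illustration, not the corpus's official schema or an established method, and the two example pairs are invented rather than taken from MRPC.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SentencePair:
    """One record: two sentences plus a binary human judgment (hypothetical schema)."""
    sentence1: str
    sentence2: str
    is_paraphrase: bool  # True if annotators judged the pair semantically equivalent


def token_overlap(a: str, b: str) -> float:
    """Jaccard overlap between the lowercased word sets of two sentences."""
    set_a, set_b = set(a.lower().split()), set(b.lower().split())
    if not set_a and not set_b:
        return 0.0
    return len(set_a & set_b) / len(set_a | set_b)


def predict(pair: SentencePair, threshold: float = 0.5) -> bool:
    """Naive baseline: predict 'paraphrase' when word overlap reaches the threshold."""
    return token_overlap(pair.sentence1, pair.sentence2) >= threshold


def accuracy(pairs: list[SentencePair], threshold: float = 0.5) -> float:
    """Fraction of pairs where the baseline agrees with the human label."""
    correct = sum(predict(p, threshold) == p.is_paraphrase for p in pairs)
    return correct / len(pairs)


# Invented examples for illustration only; these are NOT taken from MRPC.
toy_pairs = [
    SentencePair(
        "The company reported higher profits on Tuesday",
        "On Tuesday the company reported higher profits",
        True,
    ),
    SentencePair(
        "The mayor opened the new bridge",
        "Shares fell sharply after the announcement",
        False,
    ),
]

if __name__ == "__main__":
    print(f"Baseline accuracy on toy pairs: {accuracy(toy_pairs):.2f}")
```

A word-overlap baseline like this is only a starting point: pairs drawn from the same news stories tend to share much of their vocabulary whether or not they are paraphrases, so overlap alone is a weak signal on real data.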
Details
- Task: Paraphrasing Identification
- Language: English
- Format: Text
- Rows / instances: 5,801
- Creator: Dolan et al.
- Year: 2005