SetFit/20_newsgroups
General NLP, English
SetFit/20_newsgroups is an English-language general NLP dataset distributed in Parquet format.
About SetFit/20_newsgroups
This is a version of the 20 newsgroups dataset provided in scikit-learn. From the scikit-learn docs:
The 20 newsgroups dataset comprises around 18000 newsgroups posts on 20 topics split in two subsets: one for training (or development) an...
Details
- Task: General NLP
- Language: English
- Format: Parquet
- Rows / instances: N/A
- Creator: SetFit
- Year: 2022
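As a rough illustration of how the dataset might be used, the sketch below loads it by the identifier shown in the page title and counts examples per topic. It assumes the third-party Hugging Face `datasets` package is installed and that the dataset is hosted under that identifier; the helper names `load_splits` and `label_counts`, and the `label_text` column name, are assumptions made for this example and are not confirmed by this page.

```python
from collections import Counter

# Identifier taken from the page title; assumed to resolve on the Hugging Face Hub.
DATASET_ID = "SetFit/20_newsgroups"


def load_splits(dataset_id=DATASET_ID):
    """Download all splits of the dataset (requires network access)."""
    # Imported lazily so the rest of the module works without the package.
    from datasets import load_dataset  # third-party: pip install datasets

    return load_dataset(dataset_id)


def label_counts(rows, label_key="label_text"):
    """Count examples per topic label.

    `rows` is any iterable of dict-like records; `label_key` is an
    assumed column name and may need adjusting to the actual schema.
    """
    return Counter(row[label_key] for row in rows)
```

Calling `load_splits()` should return a mapping from split name to split, so something like `label_counts(load_splits()["train"])` would give a per-topic tally, provided a `train` split and a `label_text` column exist. Check the split names and columns before relying on either.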