bigcode/bigcode-pii-dataset
Token Classification, CODE
The bigcode/bigcode-pii-dataset dataset is a token classification resource for source code, published by bigcode in 2023. It comprises 12,099 examples, which places it in the 10K<n<100K size category, and it has 18 downloads and 56 likes.
About bigcode/bigcode-pii-dataset
PII dataset
Dataset description
This is an annotated dataset for Personally Identifiable Information (PII) in code. The target entities are: Names, Usernames, Emails, IP addresses, Keys, Passwords, and IDs.
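To make the token classification framing concrete, the sketch below shows one common way to turn character-level PII spans into per-token BIO labels. It is only an illustration: the label strings (`NAME`, `IP_ADDRESS`, and so on), the whitespace tokenizer, the `(start, end, type)` span format, and the sample snippet are all assumptions, not the dataset's actual schema or annotation format.

```python
import re

# Entity types taken from the description above; the exact label
# strings are assumed here, not read from the dataset.
ENTITY_TYPES = ["NAME", "USERNAME", "EMAIL", "IP_ADDRESS", "KEY", "PASSWORD", "ID"]

# BIO label set: "O" for non-PII tokens, B-/I- for the first and
# following tokens of an entity.
LABELS = ["O"] + [f"{prefix}-{etype}" for etype in ENTITY_TYPES for prefix in ("B", "I")]
LABEL2ID = {label: i for i, label in enumerate(LABELS)}


def tokenize(text):
    """Hypothetical tokenizer: whitespace-separated tokens with character offsets."""
    return [(m.group(), m.start(), m.end()) for m in re.finditer(r"\S+", text)]


def tag_tokens(text, spans):
    """Assign a BIO label to each token.

    `spans` is a list of (start, end, entity_type) character spans,
    an assumed annotation format used only for this sketch.
    """
    tagged = []
    for token, start, end in tokenize(text):
        label = "O"
        for span_start, span_end, etype in spans:
            if start < span_end and end > span_start:  # token overlaps the span
                # The token that covers the span's first character gets B-, later ones I-.
                label = ("B-" if start <= span_start else "I-") + etype
                break
        tagged.append((token, label))
    return tagged


# Made-up code line containing an email address and an IP address.
code = 'smtp_user = "alice@example.com"  # server 10.0.0.1'
email, ip = "alice@example.com", "10.0.0.1"
spans = [
    (code.index(email), code.index(email) + len(email), "EMAIL"),
    (code.index(ip), code.index(ip) + len(ip), "IP_ADDRESS"),
]
tagged = tag_tokens(code, spans)
for token, label in tagged:
    print(f"{token:24} {label:14} {LABEL2ID[label]}")
```

A real pipeline would typically replace the whitespace split with a subword tokenizer and align labels to its offsets, but the overlap test stays the same.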
The annotation process invo...
Details
- Task: Token Classification
- Language: CODE
- Format: Parquet
- Rows / instances: 12,099
- Size: 10K<n<100K
- Creator: bigcode
- Year: 2023
- Downloads: 18
- Likes: 56