huggan/smithsonian_butterflies_subset
General NLP | English
The huggan/smithsonian_butterflies_subset dataset is an English General NLP resource published by huggan in 2022. It has about 2K downloads and 56 likes, and falls in the 1K<n<10K size category.
About huggan/smithsonian_butterflies_subset
This is a subset of the "ceyda/smithsonian_butterflies" dataset, with additional processing applied in order to train the "ceyda/butterfly_gan" model.
The preprocessing includes:
Adding a "sim_score" to images with a CLIP model using "pretty butterfly", "one butterfly", "...
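
The "sim_score" step above can be illustrated with a minimal sketch. It assumes the score is a cosine similarity between an image embedding and the embeddings of the text prompts, averaged over the prompts. The averaging, the function names `cosine_similarity` and `sim_score`, and the toy three-dimensional vectors are all assumptions made for illustration; the source does not state how the actual score was computed, and real CLIP embeddings would come from a CLIP model rather than hand-written lists.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_a * norm_b)


def sim_score(image_embedding, prompt_embeddings):
    """Hypothetical score: mean cosine similarity to all prompt embeddings.

    Averaging over prompts is an assumption, not the documented method.
    """
    if not prompt_embeddings:
        raise ValueError("at least one prompt embedding is required")
    scores = [cosine_similarity(image_embedding, p) for p in prompt_embeddings]
    return sum(scores) / len(scores)


# Toy stand-ins for CLIP embeddings (placeholders, not real model output).
image_embedding = [1.0, 0.0, 0.0]
prompt_embeddings = [
    [1.0, 0.0, 0.0],  # stand-in for the "pretty butterfly" prompt
    [0.0, 1.0, 0.0],  # stand-in for the "one butterfly" prompt
]

score = sim_score(image_embedding, prompt_embeddings)
print(f"sim_score = {score:.2f}")
```

A score like this could then be used to rank or filter images, for example by keeping only rows above a chosen threshold.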
Details
- Task: General NLP
- Language: English
- Format: Parquet
- Rows / instances: N/A
- Size: 1K<n<10K
- Creator: huggan
- Year: 2022
- Downloads: 2,018
- Likes: 56