HuggingFaceM4/WebSight
General NLP | EN | cc-by-4.0
Created by HuggingFaceM4 in 2024, HuggingFaceM4/WebSight is a General NLP dataset in English containing 2,745,658 records in Parquet format. It has about 9.3K downloads and 395 likes. It is released under the cc-by-4.0 license and falls in the 1M<n<10M size category.
About HuggingFaceM4/WebSight
Dataset Card for WebSight
Dataset Description
WebSight is a large synthetic dataset of HTML/CSS code for synthetically generated English websites, each paired with a corresponding screenshot.
This dataset serves ...
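As a rough illustration of how HTML/CSS-plus-screenshot records like these might be inspected, here is a minimal Python sketch. It assumes the Hugging Face `datasets` library is installed and that each record stores its markup under a `text` key and its screenshot under an `image` key; those column names, the `train` split, and both helper functions (`summarize_record`, `stream_records`) are assumptions made for illustration and are not taken from this page. The dataset may also require a configuration name, so check the dataset card before running it.

```python
from typing import Any, Dict, Iterator


def summarize_record(record: Dict[str, Any], text_key: str = "text") -> Dict[str, Any]:
    """Return a few cheap statistics about one HTML/CSS record.

    The "text" and "image" keys are assumed column names (hypothetical here).
    """
    html = record.get(text_key) or ""
    lowered = html.lower()
    return {
        "num_chars": len(html),
        "has_style_block": "<style" in lowered,
        "has_screenshot": record.get("image") is not None,
    }


def stream_records(
    repo_id: str = "HuggingFaceM4/WebSight", limit: int = 3
) -> Iterator[Dict[str, Any]]:
    """Yield the first `limit` records without downloading the full dataset."""
    # Imported lazily so summarize_record works without the library installed.
    from datasets import load_dataset

    # streaming=True reads records on demand instead of fetching every
    # Parquet shard up front. The split name is an assumption, and a
    # configuration name may need to be passed as a second argument.
    dataset = load_dataset(repo_id, split="train", streaming=True)
    for index, record in enumerate(dataset):
        if index >= limit:
            break
        yield record
```

With network access, something like `for record in stream_records(): print(summarize_record(record))` would print one summary per record. Streaming is used here because a dataset of roughly 2.7 million rows with images is expensive to download in full just to look at a few examples.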
Details
- Task: General NLP
- Language: EN
- Format: Parquet
- Rows / instances: 2,745,658
- Size: 1M<n<10M
- Creator: HuggingFaceM4
- Year: 2024
- License: cc-by-4.0
- Downloads: 9,255
- Likes: 395