Upstash/wikipedia-2024-06-bge-m3
General NLP | EN, DE, ES
Upstash/wikipedia-2024-06-bge-m3 is a General NLP dataset from Upstash, listed for EN, DE, and ES and distributed in Parquet format.
About Upstash/wikipedia-2024-06-bge-m3
Wikipedia Embeddings with BGE-M3
This dataset contains embeddings from the June 2024 Wikipedia dump for the 11 most popular languages. The embeddings are generated with the multilingual BGE-M3 model.
The dataset consists of Wikipedia articles s...
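As a rough illustration of how embeddings like these are typically used, the sketch below ranks a few rows by cosine similarity to a query vector. It is a toy under stated assumptions, not the dataset's actual loader or schema: the field names `title` and `embedding` are hypothetical, the three-dimensional vectors are made-up stand-ins for real BGE-M3 embeddings, and the helper names `cosine_similarity` and `top_k` are introduced only for this example.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query, rows, k=2):
    """Return (score, title) pairs for the k rows closest to the query."""
    scored = [
        (cosine_similarity(query, row["embedding"]), row["title"])
        for row in rows
    ]
    scored.sort(reverse=True)  # highest similarity first
    return scored[:k]


# Toy rows: hypothetical field names and made-up vectors, for illustration only.
rows = [
    {"title": "Berlin", "embedding": [0.9, 0.1, 0.0]},
    {"title": "Madrid", "embedding": [0.8, 0.2, 0.1]},
    {"title": "Photosynthesis", "embedding": [0.0, 0.1, 0.9]},
]

# In practice the query vector would come from the same embedding model.
query = [1.0, 0.0, 0.0]
results = top_k(query, rows, k=2)

for score, title in results:
    print(f"{title}: {score:.3f}")
```

A linear scan like this is only practical for small samples. For a full Wikipedia-scale collection, a vector index or vector database would be the usual choice.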
Details
- Task: General NLP
- Language: EN, DE, ES
- Format: Parquet
- Rows / instances: N/A
- Creator: Upstash
- Year: 2024