opendatalab/OmniDocBench

General NLP | English

Created by opendatalab in 2024, opendatalab/OmniDocBench is an English-language General NLP dataset distributed in Parquet format.

About opendatalab/OmniDocBench

OmniDocBench is an evaluation dataset for diverse document parsing in real-world scenarios, with the following characteristics: Diverse Document Types: The evaluation set contains 1651 PDF pages, covering 10 documen...

Details

Task: General NLP
Language: English
Format: Parquet
Rows / instances: N/A
Creator: opendatalab
Year: 2024