@datalab-to

Datalab

Developing state-of-the-art document intelligence models.

Pinned

  1. marker Public

    Convert PDF to markdown + JSON quickly with high accuracy

    Python · 25.7k stars · 1.7k forks

  2. surya Public

    OCR, layout analysis, reading order, table recognition in 90+ languages

    Python · 17.6k stars · 1.2k forks

  3. pdftext Public

    Extract structured text from PDFs quickly

    Python · 487 stars · 50 forks

Repositories

Showing 4 of 4 repositories
  • surya Public

    OCR, layout analysis, reading order, table recognition in 90+ languages

    Python · 17,585 stars · GPL-3.0 · 1,168 forks · 126 open issues · 8 open pull requests · Updated Jun 6, 2025
  • marker Public

    Convert PDF to markdown + JSON quickly with high accuracy

    Python · 25,688 stars · GPL-3.0 · 1,653 forks · 254 open issues · 25 open pull requests · Updated Jun 6, 2025
  • inference-mirror

    Python · 1 star · 0 forks · 0 open issues · 0 open pull requests · Updated Jun 6, 2025
  • pdftext Public

    Extract structured text from PDFs quickly

    Python · 487 stars · Apache-2.0 · 50 forks · 7 open issues · 3 open pull requests · Updated May 30, 2025

Top languages

Python
