Find open-source science resources
Cross-domain directory aggregating tools, AI models, datasets, and research resources from bio.tools, Bioconductor, HuggingFace, curated GitHub awesome-lists, and more.
10 of 5,674 resources
Parse scientific papers into structured fields (title, authors, sections, references)
Machine learning software for extracting structured metadata from scholarly documents
Large-scale PDF/LaTeX/JATS parsing to standardized JSON for millions of papers
High-accuracy PDF→Markdown/JSON/HTML conversion, specialized for tables/formulas/code blocks with benchmark scripts
Production-grade ETL for transforming complex documents into structured formats, with an open-source API
Advanced OCR with PP-StructureV3 document parsing; reports a 13% accuracy improvement and supports 80+ languages
Toolkit for linearizing academic PDFs into LLM-ready text with high accuracy and structure preservation, optimized for scientific literature extraction
Neural optical understanding for academic documents; transforms scientific PDFs into Markdown with mathematical formula support
Comprehensive toolkit for high-quality PDF content extraction with layout detection, formula recognition, and OCR
State-of-the-art multimodal document parsing model with 1.2B parameters, reported to outperform GPT-4o; converts PDFs to LLM-ready Markdown/JSON