Find open-source science resources
Cross-domain directory aggregating tools, AI models, datasets, and research resources from bio.tools, Bioconductor, HuggingFace, curated GitHub awesome-lists, and more.
3,084 of 5,674 resources
Showing 2,851–2,900
Extract figures, tables, captions, and section titles from scholarly PDFs
Docling-powered parsing with UI/CLI demonstration for rapid prototyping
Parse scientific papers to structured fields (title/author/sections/references)
Machine learning software for extracting structured metadata from scholarly documents
Large-scale PDF/LaTeX/JATS parsing to standardized JSON for millions of papers
High-accuracy PDF→Markdown/JSON/HTML conversion, specialized for tables/formulas/code blocks with benchmark scripts
Production-grade ETL for transforming complex documents into structured formats, with open-source API
Advanced OCR with PP-StructureV3 document parsing, 13% accuracy improvement, supports 80+ languages
Toolkit for linearizing academic PDFs into LLM-ready text with high accuracy and structure preservation, optimized for scientific literature extraction
Neural optical understanding for academic documents; transforms scientific PDFs to Markdown with mathematical formula support
Comprehensive toolkit for high-quality PDF content extraction with layout detection, formula recognition, and OCR
State-of-the-art multimodal document parsing model with 1.2B parameters, reported to outperform GPT-4o; converts PDFs to LLM-ready Markdown/JSON
Automated code generation from machine learning research papers into runnable implementations (4.5K+ stars, 2025)
Large-scale chart summarization datasets for training chart description capabilities
Universal chart comprehension and reasoning model
Automated academic illustration generation for AI scientists, converting research papers into publication-ready figures using VLMs and diffusion models with iterative refinement (PKU & Google Research, 6.2K+ stars, 2026)
Transform arXiv research papers into engaging presentations and YouTube-ready videos
First benchmark for automatic video generation from scientific papers (NeurIPS 2025)
Azure Semantic Kernel multi-agent PPT generation reference
Convert PDF files into editable slides with three lines of code
AI-powered tool that automatically converts academic papers (PDF) into presentation slides
Transform arXiv papers into Beamer slides using LLMs
Beyond text-to-slides generation with PPTEval multi-dimensional evaluation (EMNLP 2025)
Multimodal LLM for understanding and generating scientific charts and diagrams
Multi-agent system with a Parser-Planner-Painter architecture that converts `paper.pdf` to an editable `poster.pptx`; outperforms GPT-4o while using 87% fewer tokens
Comprehensive collection of 125+ ready-to-use scientific skill modules for Claude AI across bioinformatics, cheminformatics, clinical research, ML, and materials science
Programmatic data labeling and weak supervision
Multi-type data labeling and annotation tool
Secure text-to-visualization through standardized chart specifications
Automated data visualization with minimal code
Conversational data analysis using natural language
A curated list of molecular docking software, datasets, and other closely related resources.
A list of papers, data sets, and other resources for machine learning for small-molecule drug discovery.
A curated list of Python resources related to chemistry.
The chemoinformatics and drug discovery section of the deeplearning-biology repo.
A teaching platform for computer-aided drug design (CADD) using open source packages and data.
An automated workflow for the generation and storage of DFT calculations for organic molecules.
Python-centric Cookiecutter for Molecular Computational Chemistry Packages by [MolSSI](https://molssi.org/)
Parsers and algorithms for computational chemistry logfiles.
Analysis of molecular dynamics trajectories.
Automates and standardizes ligand preparation for AutoDock Vina.
Open source web framework for small molecule analysis based on Django.
[RDKit](http://www.rdkit.org/) and [OSRA](https://cactus.nci.nih.gov/osra/) in the [Bottle](http://bottlepy.org/docs/dev/) on [Tornado](http://www.tornadoweb.org/en/stable/).
A Python package for optimizing chemical reactions using machine learning (contains 10 algorithms and several benchmarks).
A Cheminformatics Modeling Laboratory for Fitting and Assessing Machine Learning Models in R.