Find open-source science resources
Cross-domain directory aggregating tools, AI models, datasets, and research resources from bio.tools, Bioconductor, HuggingFace, curated GitHub awesome-lists, and more.
365 of 5,674 resources
AI for chemical reaction prediction and synthesis planning
Universal medical image segmentation foundation model trained on 1.57M image-mask pairs across 10 imaging modalities and 30+ cancer types (Nature Communications 2024)
ECMWF's unified framework and command-line tool to run AI-based weather forecasting models (GraphCast, Aurora, Pangu, NeuralGCM, FourCastNet) with operational ECMWF data infrastructure, enabling standardized inference and benchmarking across state-of-the-art meteorological AI systems (ECMWF, 576+ stars)
Unified Python framework for bulk, single-cell, and spatial RNA-seq multi-omics analysis with deep learning deconvolution (VAE) and graph neural networks, bridging Bindea, scanpy and squidpy ecosystems (Nature Communications 2024)
Julia differential equations suite
Industrial-grade reinforcement-learning-based generative platform for de novo molecular design with transformer architectures, supporting multi-objective optimization, scaffold decoration, and curriculum learning (AstraZeneca MolecularAI, REINVENT 4, 2024)
Shanghai AI Lab's deep learning-based global weather forecasting model pushing skillful forecasts beyond a 10-day lead time, with open-source inference code and pretrained ONNX model weights (arXiv 2023)
E(3)-equivariant neural network interatomic potentials achieving DFT accuracy with up to 1000× less training data than invariant models, foundational architecture behind MACE and Allegro (Harvard, MIT, Nature Communications 2022)
Open-source framework for building physics-ML models at scale (renamed from Modulus, 2025)
Pretrained time series foundation model for zero-shot forecasting across diverse scientific and real-world domains; tokenizes continuous time series into discrete bins to train transformer language models on large-scale corpora, achieving strong zero-shot generalization and competitive performance with task-specific supervised models on climate, energy, and health benchmarks (5.3K+ stars, Apache 2.0, 2024-2026)
AI-powered pipeline converting papers into interactive websites, posters, and multimedia presentations with a "Let's Make Your Paper Alive!" philosophy
Transformer encoder-decoder for de novo peptide sequencing from tandem mass spectrometry, translating MS/MS spectra directly to peptide sequences without reference databases, enabling identification of novel peptides for immunopeptidomics, antibody repertoires, and metaproteomes (Noble Lab UW, Nature Communications 2024)
General multimodal protein design framework enabling DNA-encoding of chemistry for programmable enzyme design and diverse protein generation through diffusion-based generative modeling (190+ stars, Apache 2.0, 2026)
Interactive and hardware-agnostic SDK for laboratory automation, enabling programmatic control of liquid handlers, plate readers, and other lab instruments across multiple vendors; foundational infrastructure for self-driving laboratories and AI-driven experimental execution (447+ stars)
Agent skill for AI-assisted review of scientific manuscript writing, distilled from Stanford's *Writing in the Sciences* course, performing five sequential editorial audit passes on clarity, voice, structure, consistency, and integrity (2026)
Unified framework for state-of-the-art pre-trained bio foundation models across genomics and transcriptomics, providing standardized interfaces and pipelines for DNA, RNA, and single-cell models including Evo 2, Geneformer, scGPT, and UCE with streamlined inference, benchmarking, and fine-tuning workflows (213+ stars, 2024-2025)
LLM papers for scientific discovery
Equivariant graph attention Transformer (ICLR 2023)
First agentic LLM for autonomous data science with an end-to-end pipeline from data to analyst-grade reports
Open-source medical large language model for complex clinical reasoning, extending the o1 long-chain-of-thought paradigm to biomedical question answering and diagnostic inference (FreedomIntelligence, 1.3K+ stars)
Multimodal AI system generating virtual populations for tumor microenvironment modeling from H&E and multiplex immunofluorescence pathology images, enabling large-scale spatial analysis of cancer biology and therapeutic response prediction (Microsoft Research & Providence, 370+ stars)
Structure prediction and design of proteins with noncanonical amino acids, enabling AI-powered modeling of synthetic biology constructs and expanded genetic code systems (133+ stars, 2025)
LLM agents for working with the SRA (Sequence Read Archive) and associated bioinformatics databases, enabling natural language querying of high-throughput sequencing data and metadata across genomic repositories (Arc Institute, 169+ stars, 2024-2026)
Cross-platform library for differentiable programming of quantum computers with automatic differentiation, enabling hybrid quantum-classical machine learning for quantum chemistry, quantum physics, and NISQ algorithm research (Xanadu, 3k+ stars)
Universal pretrained neural network potential with charge and magnetic moment awareness, trained on 1.5M+ Materials Project inorganic structures for charge-informed molecular dynamics and phase diagram prediction (Berkeley, Nature Machine Intelligence 2023 Cover)
Deep learning library for chemistry based on TensorFlow
Foundation model for universal cell segmentation achieving state-of-the-art performance across bacteria, tissue, yeast, cell culture, and diverse imaging modalities (brightfield, fluorescence, phase), with pip-installable inference and Napari plugin (vanvalenlab/Caltech, bioRxiv 2024)
Diffusion model for scalable protein structure design with multi-motif scaffolding capabilities, achieving state-of-the-art designability, diversity, and novelty through SE(3)-equivariant attention and massive data augmentation (AlQuraishi Lab, 2024)
Pretrained time series foundation model for long-horizon forecasting across diverse scientific domains including climate variables, biomedical signals, and physical observations; decoder-only Transformer architecture with strong zero-shot generalization (19.8K+ stars, Apache 2.0, 2024-2025)
DeepMind's Olympiad-level geometry theorem prover combining a neural language model with a symbolic deduction engine; AlphaGeometry2 solves 84% of IMO geometry problems (42/50), a gold-medalist level (Nature 2024)
Co-create PowerPoint presentations with Generative AI from documents or topics
Comprehensive collection of Chinese medical datasets for AI research
Curated open dataset collection of 602M+ observational and perturbational single-cell profiles for accelerating virtual cell model creation, integrating Tahoe-100M and scBaseCount data with Google Cloud Marketplace distribution (Arc Institute, 2025-2026)
Probabilistic framework for inferring cell fate decisions and trajectory dynamics from multi-view single-cell data using Markov chains and machine learning, integrating RNA velocity, pseudotime, and metabolic labeling to predict differentiation paths and terminal states (scverse/Theis Lab, 449+ stars, BSD 3-Clause)
Google DeepMind's official collection of agentic science skills accelerating scientific workflows with better grounding and higher token efficiency, integrating insights from AlphaGenome, AFDB, UniProt and 30+ other databases and tools (2026)
Comprehensive survey of foundation models for weather and climate data understanding
Biomedical AI agents
Physics-informed ML and SciML
200+ AI for Science papers with Chinese interpretations
LLM agents across scientific domains
Autonomous AI scientist research
PINN research collection
LLM for scientific research papers
Curated scientific LLM papers (260+ models)
Parallel Computing and Scientific Machine Learning: MIT 18.337J/6.338J course materials (1.9k+ stars)
Graph neural network library for PyTorch enabling molecular modeling, materials discovery, protein interaction networks, and scientific knowledge graph learning (23.7k+ stars)
Computational fluid dynamics in JAX, enabling differentiable Navier-Stokes simulations with automatic differentiation for ML-accelerated CFD research, supporting turbulence modeling, convection-diffusion, and complex boundary conditions on CPUs and GPUs (Google Research, 947+ stars)
GPU-accelerated differentiable physics simulation engine built on NVIDIA Warp, supporting rigid and soft bodies, cloth, and gradient-based optimization for scientific ML, initiated by Disney Research, DeepMind, and NVIDIA (Linux Foundation, Apache 2.0, 2025)
End-to-end molecular dynamics engine built on PyTorch, enabling differentiable simulations with neural network potentials and GPU acceleration for machine learning-accelerated molecular dynamics (MIT License, 707+ stars)
Deep learning package for many-body potential energy representation and molecular dynamics, achieving quantum-mechanical accuracy with classical MD efficiency (DeepModeling, Gordon Bell Prize 2020, 1.9k+ stars)