Find open-source science resources

Autonomous Research Systems (2023-2025 Breakthroughs)

Skill operating layer for biomedical AI agents with 211 production-ready SKILL.md files across 7 domains (including biology, pharmacology, medicine, data science, and literature search), enabling modular dry-lab reasoning and protocol composition for Stanford LabOS-compatible agents

ToolUniverse

Autonomous Research Systems (2023-2025 Breakthroughs)

Democratizing AI scientists by transforming any LLM into research systems with 600+ scientific tools (Harvard MIMS)

freephdlabor

Autonomous Research Systems (2023-2025 Breakthroughs)

First fully customizable open-source multi-agent framework automating the complete research lifecycle, from idea conception to LaTeX papers, with dynamic workflows

InternAgent

Autonomous Research Systems (2023-2025 Breakthroughs)

Closed-loop multi-agent system from hypothesis to verification across 12 scientific tasks, #1 on MLE-Bench (36.44%)

AIDE (WecoAI, arXiv 2025)

Autonomous Research Systems (2023-2025 Breakthroughs)

LLM-driven machine learning engineering agent using agentic tree search to autonomously draft, debug and benchmark ML code; wins 4× more medals than the best linear agent on OpenAI's MLE-Bench (75 Kaggle competitions) (1.3K+ stars, MIT License)

AI-Researcher

Autonomous Research Systems (2023-2025 Breakthroughs)

Autonomous pipeline from literature review→hypothesis→algorithm implementation→publication-level writing with Scientist-Bench evaluation

AutoResearchClaw

Autonomous Research Systems (2023-2025 Breakthroughs)

Fully autonomous research from idea to paper with multi-agent debate, citation verification, and OpenClaw integration (11K+ stars, 2026)

AlphaResearch

Autonomous Research Systems (2023-2025 Breakthroughs)

Autonomous algorithm discovery combining evolutionary search with peer-review reward models, achieving best-known performance on circle packing problems

Kosmos

Autonomous Research Systems (2023-2025 Breakthroughs)

Extended autonomy AI scientist with 200 parallel agent rollouts, 42K lines of code execution, 1.5K papers analyzed per run, achieving 79.4% accuracy and 7 scientific discoveries (Edison Scientific)

DeepScientist

Autonomous Research Systems (2023-2025 Breakthroughs)

First system progressively surpassing human SOTA on frontier AI tasks (183.7%, 1.9%, 7.9% improvements), month-long autonomous discovery with 20,000+ GPU hours

Virtual Lab (Stanford Zou Group, Nature 2025)

Autonomous Research Systems (2023-2025 Breakthroughs)

AI-human collaborative research platform where a human researcher works with a team of LLM agents via team and individual meetings to perform scientific research; demonstrated by designing new SARS-CoV-2 nanobodies with wet-lab validation

OpenEvolve

Autonomous Research Systems (2023-2025 Breakthroughs)

Open-source implementation of AlphaEvolve's evolutionary coding agent paradigm, enabling LLMs to autonomously discover and optimize algorithms through iterative evolution, matching the approach behind DeepMind's breakthrough matrix multiplication discovery (6.2K+ stars, 2025)
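
The evolutionary loop that entries such as OpenEvolve and FunSearch describe can be illustrated with a minimal sketch. This is not the API of either project: `evolve`, `score`, and `mutate` are hypothetical names, the "programs" are just coefficient pairs for a line, and a random integer tweak stands in for the LLM that would propose code edits in a real system.

```python
import random


def evolve(score, mutate, seed, generations=200, population_size=8, rng=None):
    """Generic evolutionary search: mutate a survivor, keep the top scorers."""
    rng = rng or random.Random(0)
    population = [seed]
    for _ in range(generations):
        parent = rng.choice(population)        # pick a surviving candidate
        child = mutate(parent, rng)            # propose a variation of it
        population.append(child)
        population.sort(key=score, reverse=True)
        del population[population_size:]       # drop the weakest candidates
    return population[0]


# Toy task (assumption): recover a and b so that a*x + b fits y = 3x + 1.
TARGET = [(x, 3 * x + 1) for x in range(-5, 6)]


def score(candidate):
    """Higher is better: negative squared error against the target points."""
    a, b = candidate
    return -sum((a * x + b - y) ** 2 for x, y in TARGET)


def mutate(candidate, rng):
    """Stand-in for an LLM edit: nudge one coefficient up or down by 1."""
    child = list(candidate)
    index = rng.randrange(len(child))
    child[index] += rng.choice([-1, 1])
    return child


best = evolve(score, mutate, seed=[0, 0])
print(best, score(best))
```

Because the best candidate is never dropped from the pool, the top score cannot get worse from one generation to the next. That property is what makes an automatic evaluator essential in this style of search.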

FunSearch (DeepMind, Nature 2023)

Autonomous Research Systems (2023-2025 Breakthroughs)

First system to make novel, verifiable scientific discoveries by pairing LLMs with evolutionary search, finding new constructions for the cap set problem in combinatorics and more effective heuristics for online bin packing

Awesome-LLM-KG

Knowledge Graph Resources

Comprehensive collection of papers on unifying LLMs and knowledge graphs

KoPA

Knowledge Graph Construction

Structure-aware prefix adaptation for integrating LLMs with knowledge graphs (ACM MM 2024)

GraphGen

Knowledge Graph Construction

Knowledge graph-guided synthetic data generation for LLM fine-tuning, achieving strong performance on scientific QA (GPQA-Diamond) and math reasoning (AIME)

iText2KG

Knowledge Graph Construction

Incremental knowledge graph construction using LLMs with entity extraction and Neo4j visualization
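
Incremental construction, as the iText2KG entry describes it, means merging each new document's triples into an existing graph without duplicating entities. The sketch below is illustrative only and is not iText2KG's API: `IncrementalGraph` and `normalize` are hypothetical names, the triples are hard-coded where an LLM extractor would supply them, and entity matching is reduced to case and whitespace normalization.

```python
def normalize(name):
    """Collapse case and whitespace so surface variants map to one key."""
    return " ".join(name.lower().split())


class IncrementalGraph:
    """Toy knowledge graph that grows one document at a time."""

    def __init__(self):
        self.entities = {}    # normalized key -> first-seen surface form
        self.triples = set()  # (head key, relation, tail key)

    def _resolve(self, name):
        key = normalize(name)
        self.entities.setdefault(key, name.strip())
        return key

    def add_document(self, triples):
        """Merge one document's triples; return how many were new."""
        added = 0
        for head, relation, tail in triples:
            triple = (self._resolve(head), normalize(relation), self._resolve(tail))
            if triple not in self.triples:
                self.triples.add(triple)
                added += 1
        return added


graph = IncrementalGraph()
# Triples are hard-coded here; a real pipeline would extract them with an LLM.
first = graph.add_document([("GROBID", "extracts", "metadata"),
                            ("Marker", "converts", "PDF")])
second = graph.add_document([("grobid ", "Extracts", "Metadata"),
                             ("Nougat", "converts", "PDF")])
print(first, second, len(graph.entities), len(graph.triples))
```

The second document contributes only one new triple, because its first triple resolves to entities and a relation already in the graph.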

Claude Prism

Scientific Writing & Collaboration

Offline-first scientific writing workspace powered by Claude, integrating LaTeX, Python, and 100+ scientific skills with local execution, Zotero integration, and privacy-focused design (2026)

Obsidian Smart Connections

Scientific Writing & Collaboration

AI-powered note linking and research graph navigation

Zotero-GPT (MuiseDestiny)

Literature Management Plugins

Classic open-source plugin for document Q&A and summarization within Zotero

PapersGPT for Zotero

Literature Management Plugins

Multi-PDF conversation, retrieval, and citation in Zotero with commercial or local models (Ollama) and MCP support

llm-for-zotero

Literature Management Plugins

Research agent system deeply integrated with Zotero supporting Agent Mode, skills, multi-model backends (OpenAI-compatible, Claude Code, WebChat, Codex), and MinerU PDF parsing for literature Q&A, summarization, figure inspection, and source comparison (1.3K+ stars, 2026)

AutoR

Human-centered research OS with terminal-first harness and local browser Studio, turning research work into reproducible artifact-backed runs through a 9-stage workflow with human approval gates, resume/rollback controls, and venue-aware manuscript packaging (1K+ stars, 2026)

OpenBioMed

Open-source biomedical AI platform integrating multimodal foundation models (BioMedGPT, PharmolixFM, LangCell) with agentic workflows and 45+ Claude Code skills for drug discovery, protein engineering, and single-cell omics analysis (PharMolix & Tsinghua AIR, 1K+ stars, 2023-2026)

Notebook Intelligence (NBI)

AI coding assistant for JupyterLab with agent mode, supporting arbitrary LLM providers (2025+)

Jupyter AI (JupyterLab Extension)

Official Jupyter extension with `%%ai` magic commands and sidebar chat assistant, connecting multiple model providers and local inference

STORM

LLM agent system synthesizing Wikipedia-like long-form research articles from scratch through multi-perspective question asking, web retrieval, and citation-grounded report generation, with the Co-STORM extension for collaborative human-LLM knowledge curation conversations (Stanford OVAL, NAACL 2024 & EMNLP 2024)

paper-reviewer

Generate comprehensive reviews from arXiv papers and convert them to blog posts

Valsci

Self-hostable scientific claim-verification and literature-review tool combining Semantic Scholar retrieval, bibliometric scoring, and LLM-based evidence synthesis for large-batch validation workflows

OpenScholar

Retrieval-augmented LM synthesizing scientific literature from 45M papers with human-expert-level citation accuracy, outperforming GPT-4o by 5% on ScholarQABench (Nature 2026, UW & Ai2)

PaperQA2

High-accuracy RAG for scientific PDFs with citation support, agentic RAG, and contradiction detection

TableBank

Figure & Table Extraction

Large-scale table detection and recognition dataset with pre-trained models

PDFFigures2

Figure & Table Extraction

Extract figures, tables, captions, and section titles from scholarly PDFs

Mozilla document-to-markdown

Production Pipelines & Data Preparation

Docling-powered parsing with UI/CLI demonstration for rapid prototyping

Science-Parse / SPv2 (AllenAI)

Parse scientific papers to structured fields (title/author/sections/references)

GROBID

Machine learning software for extracting structured metadata from scholarly documents

S2ORC doc2json (AllenAI)

Large-scale PDF/LaTeX/JATS parsing to standardized JSON for millions of papers

Marker

High-accuracy PDF→Markdown/JSON/HTML conversion, specialized for tables/formulas/code blocks with benchmark scripts

Unstructured

Production-grade ETL for transforming complex documents into structured formats, with open-source API

PaddleOCR 3.0 (2024/2025)

Advanced OCR with PP-StructureV3 document parsing, a reported 13% accuracy improvement, and support for 80+ languages

olmOCR (AllenAI)

Toolkit for linearizing academic PDFs into LLM-ready text with high accuracy and structure preservation, optimized for scientific literature extraction

Nougat (Meta AI)

Neural optical understanding for academic documents, transforms scientific PDFs to Markdown with mathematical formula support

PDF-Extract-Kit (2024)

Comprehensive toolkit for high-quality PDF content extraction with layout detection, formula recognition, and OCR

MinerU (2024/2025)

State-of-the-art multimodal document parsing with a 1.2B-parameter model reported to outperform GPT-4o; converts PDFs to LLM-ready Markdown/JSON

Paper2Code

Automated Code Generation

Automated code generation from machine learning research papers into runnable implementations (4.5K+ stars, 2025)

Chart-to-Text Datasets

Chart-to-Code & Reproducibility

Large-scale chart summarization datasets for training chart description capabilities

ChartAssistant / ChartAst (ACL 2024)

Chart-to-Code & Reproducibility

Universal chart comprehension and reasoning model

PaperBanana

Figure & Illustration Generation

Automated academic illustration generation for AI scientists, converting research papers into publication-ready figures using VLMs and diffusion models with iterative refinement (PKU & Google Research, 6.2K+ stars, 2026)

paper2video

Video & Media Generation

Transform arXiv research papers into engaging presentations and YouTube-ready videos

Paper2Video