Find open-source science resources

Cross-domain directory aggregating tools, AI models, datasets, and research resources from bio.tools, Bioconductor, HuggingFace, curated GitHub awesome-lists, and more.

5,674 resources indexed

Showing 251-300

Foundation model for universal cell segmentation achieving state-of-the-art performance across bacteria, tissue, yeast, cell culture, and diverse imaging modalities (brightfield, fluorescence, phase), with pip-installable inference and Napari plugin (vanvalenlab/Caltech, bioRxiv 2024)

195
6 months ago
Python
NOASSERTION

An Apache-based persistent URL (PURL) service

5
2 weeks ago
HTML
MIT

Diffusion model for scalable protein structure design with multi-motif scaffolding capabilities, achieving state-of-the-art designability, diversity, and novelty through SE(3)-equivariant attention and massive data augmentation (AlQuraishi Lab, 2024)

192
1 year ago
Python
Apache-2.0

Experiments with expanded ensembles to explore chemical space.

199
6 months ago
Python
MIT

Pretrained time series foundation model for long-horizon forecasting across diverse scientific domains including climate variables, biomedical signals, and physical observations; decoder-only Transformer architecture with strong zero-shot generalization (19.8K+ stars, Apache 2.0, 2024-2025)

20.1K
6 days ago
Python
Apache-2.0

DeepMind's Olympiad-level geometry theorem prover, combining a neural language model with a symbolic deduction engine; AlphaGeometry2 solves 84% of IMO geometry problems (42/50) at gold-medalist level (Nature 2024)

4.8K
4 months ago
Python
Apache-2.0

Co-create PowerPoint presentations with Generative AI from documents or topics

358
2 weeks ago
Python
MIT

Comprehensive collection of Chinese medical datasets for AI research

280
1 year ago

A Swiss Army knife for genome arithmetic.

1K
1 year ago
C
MIT

ScANVI is a variational inference model for single-cell RNA-seq data that can learn an underlying latent space, integrate technical batches, and impute dropouts. Extending scVI, ScANVI is a semi-supervised model that can leverage labeled data to learn a cell-type classifier in the latent space…

0
2 months ago

ACE-V1.1: Brain Tumor Detection. CAUTION: MEDICAL RESEARCH USE ONLY. ACE-V1.1 is NOT a cleared medical device. It must not be used for primary diagnosis or clinical decision-making. All outputs must be verified by a qualified professional.

0
4 weeks ago

0
3 months ago

A Python package for protein dynamics analysis

546
2 months ago
Python
NOASSERTION

OpenChem is a deep learning toolkit for Computational Chemistry with PyTorch backend.

745
2 years ago
Python
MIT

Pythonic Access to the Ensembl database.

400
1 week ago
Python
Apache-2.0

lumpy: a general probabilistic framework for structural variant discovery.

342
3 months ago
C
MIT

The Context and Measurement Ontology (COMO) contains ontological terms to describe the context for various types of experimental data and measurements. It is useful in its current state for several different environmental microbiology projects. This ontology is used in multiple CORAL (Contextual Ontology-based Repository Analysis Library) deployments.

8
1 month ago
Python
AGPL-3.0-only

The Common Core Ontologies (CCO) comprise twelve ontologies that are designed to represent and integrate taxonomies of generic classes and relations across all domains of interest. CCO is a mid-level extension of Basic Formal Ontology (BFO), an upper-level ontology framework widely used to structure and integrate ontologies in the biomedical domain (Arp et al., 2015). BFO aims to represent the most generic categories of entity and the most generic types of relations that hold between them by defining a small number of classes and relations. CCO extends BFO in the sense that every class in CCO is asserted to be a subclass of some class in BFO, and CCO adopts the generic relations defined in BFO (e.g., has_part) (Smith and Grenon, 2004). Accordingly, CCO classes and relations are heavily constrained by the BFO framework, from which they inherit many of their basic semantic relationships.

331
5 days ago
Python
CC-BY-4.0

MIMIC-III is a dataset comprising health-related data associated with over 40,000 patients who stayed in critical care units of the Beth Israel Deaconess Medical Center between 2001 and 2012

146
2 years ago
PLpgSQL

6
8 years ago
Python

Use this database to browse the CMECS classification and to get definitions for individual CMECS Units. This database contains the units that were published in the Coastal and Marine Ecological Classification Standard.

8
6 months ago
CC0-1.0

An ontology that enables characterization of the nature or type of citations, both factually and rhetorically.

15
6 years ago
CC-BY-4.0

An ontology encoding the Common Information Model (CIM) schema

8
4 years ago

The Chromosome Ontology is an automatically derived ontology of chromosomes and chromosome parts.

16
5 days ago
Python
BSD-3-Clause

An ontology developed as part of the Chemical Analysis Metadata Project (ChAMP) as a resource to semantically annotate standards developed using the ChAMP platform. (source: CAO ontology)

0
1 year ago
Makefile
CC-BY-3.0

This is a code repository for the neXtProt project of the CALIPHO group at SIB (Swiss Institute of Bioinformatics). neXtProt is a comprehensive human-centric discovery platform that offers integration of and navigation through protein-related data. CALIPHO is an interdisciplinary team that aims to use a variety of methodologies to help uncover the function of uncharacterized human proteins.

2
2 years ago
CC-BY-4.0

SO is a collaborative ontology project for the definition of sequence features used in biological sequence annotation. It is part of the Open Biomedical Ontologies library.

105
8 months ago
Makefile

375
3 days ago
Python

Upper-Level ontology for Biology and Medicine. Compatible with BFO, DOLCE, and the UMLS Semantic Network

4
8 years ago
Perl
CC-BY-3.0

BioTools is a registry of databases and software with tools, services, and workflows for biological and biomedical research.

87
1 week ago
HTML
CC-BY-4.0

Bioschemas aims to improve the Findability on the Web of life sciences resources such as datasets, software, and training materials. It does this by encouraging people in the life sciences to use Schema.org markup in their websites so that they are indexable by search engines and other services. Bioschemas encourages the consistent use of markup to ease the consumption of the contained markup across many sites. This structured information then makes it easier to discover, collate, and analyse distributed resources. [from BioSchemas.org]

63
3 days ago
HTML

The Bioregistry is an integrative meta-registry of biological databases, ontologies, and nomenclatures that is backed by an open database.

143
4 days ago
HTML
CC0-1.0

Biofactoid is a web-based system that empowers authors to capture and share machine-readable summaries of molecular-level interactions described in their publications.

29
1 year ago
JavaScript
CC0-1.0

BioCompute is shorthand for the IEEE 2791-2020 standard for Bioinformatics Analyses Generated by High-Throughput Sequencing (HTS) to facilitate communication. This pipeline documentation approach has been adopted by a few FDA centers. The goal is to ease the communication burdens between research centers, organizations, and industries. This web portal allows users to build BioCompute Objects through the interface in a human- and machine-readable format.

17
1 year ago
HTML

The Bibframe vocabulary consists of RDF classes and properties used for the description of items cataloged principally by libraries, but may also be used to describe items cataloged by museums and archives. Classes include the three core classes - Work, Instance, and Item - in addition to many more classes to support description. Properties describe characteristics of the resource being described as well as relationships among resources. For example: one Work might be a "translation of" another Work; an Instance may be an "instance of" a particular Bibframe Work. Other properties describe attributes of Works and Instances. For example: the Bibframe property "subject" expresses an important attribute of a Work (what the Work is about), and the property "extent" (e.g. number of pages) expresses an attribute of an Instance.

54
5 months ago
Python
CC0-1.0
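
The relationships named in the Bibframe description above can be pictured with a minimal Python sketch. This is an illustration only, not the vocabulary's RDF definition: the class names Work, Instance, and Item and the properties "translation of", "instance of", "subject", and "extent" come from the description, while the Python field names, the `item_of` and `shelf_mark` fields, and all sample values are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Work:
    """Core class Work; field names are illustrative, not official terms."""
    title: str
    subjects: List[str] = field(default_factory=list)  # "subject": what the Work is about
    translation_of: Optional["Work"] = None            # "translation of" another Work


@dataclass
class Instance:
    """Core class Instance, tied to exactly one Work."""
    instance_of: Work  # "instance of" a particular Work
    extent: str        # "extent", e.g. number of pages


@dataclass
class Item:
    """Core class Item; both fields below are hypothetical names."""
    item_of: Instance  # assumed link from a copy to its Instance
    shelf_mark: str    # assumed locally assigned identifier


# Sample values are placeholders, not real catalog records.
original = Work(title="Original work", subjects=["Example subject"])
translation = Work(title="Translated work",
                   subjects=original.subjects,
                   translation_of=original)
paperback = Instance(instance_of=translation, extent="320 pages")
library_copy = Item(item_of=paperback, shelf_mark="EX-001")

# Walk from a physical copy back to the Work it ultimately derives from.
source = library_copy.item_of.instance_of.translation_of
print(source.title)
```

In the actual vocabulary these are RDF classes and properties rather than Python objects; the sketch only shows how a Work-level relationship ("translation of") differs from an Instance-level attribute ("extent").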

The Basic Register of Thesauri, Ontologies & Classifications (BARTOC) is a database of Knowledge Organization Systems and KOS related registries. The main goal of BARTOC is to list as many Knowledge Organization Systems as possible at one place in order to achieve greater visibility, highlight their features, make them searchable and comparable, and foster knowledge sharing. BARTOC includes any kind of KOS from any subject area, in any language, any publication format, and any form of accessibility. BARTOC’s search interface is available in 20 European languages and provides two search options: Basic Search by keywords, and Advanced Search by taxonomy terms. A circle of editors has gathered around BARTOC from all across Europe and BARTOC has been approved by the International Society for Knowledge Organization (ISKO).

27
3 days ago
JavaScript
PDDL-1.0

18
2 weeks ago
CC-BY-SA-4.0

0
4 months ago
CC-BY-4.0

The AOPO provides classes and relationships for the semantic representation of the Adverse Outcome Pathway framework.

A vocabulary for describing semantic assets, defined as highly reusable metadata (e.g. XML schemata, generic data models) and reference data (e.g. code lists, taxonomies, dictionaries, vocabularies).

2
5 days ago
HTML
CC-BY-4.0

An ontology to support disciplinary annotation of Arctic Data Center datasets.

1
4 years ago
R
CC-BY-4.0

In recent years, pre-trained language models (PLMs) have achieved the best performance on a wide range of natural language processing (NLP) tasks. While the first models were trained on general-domain data, specialized ones have emerged to handle specific domains more effectively.

1.4K
2 years ago
Python

In recent years, pre-trained language models (PLMs) have achieved the best performance on a wide range of natural language processing (NLP) tasks. While the first models were trained on general-domain data, specialized ones have emerged to handle specific domains more effectively.

309
2 years ago
Python

Specialized model for Chemical Entity Recognition - Identifies chemical compounds and substances in biomedical literature

52
9 months ago
Python

A patient-level disease classification model trained on single-cell RNA-seq data. Given a matrix of gene expression profiles (one row per cell), the model produces a disease-category prediction for the patient.

69
2 weeks ago
Python

In search engines, rerankers are crucial for improving the accuracy of your retrieval system.

12.7K
2 months ago
Python