Find open-source science resources

Cross-domain directory aggregating tools, AI models, datasets, and research resources from bio.tools, Bioconductor, HuggingFace, curated GitHub awesome-lists, and more.

5,674 resources indexed

Showing 351400

ScVI is a variational inference model for single-cell RNA-seq data that can learn an underlying latent space, integrate technical batches and impute dropouts. The learned low-dimensional latent representation of the data can be used for visualization and clustering.

02 months ago

Stereoscope is a variational inference model for single-cell RNA-seq data that can learn a cell-type specific rate of gene expression. The predictions of the model are meant to be afterward used for deconvolution of a second spatial transcriptomics dataset in Stereoscope.

02 months ago

CondSCVI is a variational inference model for single-cell RNA-seq data that can learn an underlying latent space. The predictions of the model are meant to be afterward used for deconvolution of a second spatial transcriptomics dataset in DestVI.

02 months ago

ScANVI is a variational inference model for single-cell RNA-seq data that can learn an underlying latent space, integrate technical batches and impute dropouts. In addition, to scVI, ScANVI is a semi-supervised model that can leverage labeled data to learn a cell-type classifier in the latent space…

02 months ago

ScVI is a variational inference model for single-cell RNA-seq data that can learn an underlying latent space, integrate technical batches and impute dropouts. The learned low-dimensional latent representation of the data can be used for visualization and clustering.

02 months ago

Stereoscope is a variational inference model for single-cell RNA-seq data that can learn a cell-type specific rate of gene expression. The predictions of the model are meant to be afterward used for deconvolution of a second spatial transcriptomics dataset in Stereoscope.

02 months ago

CondSCVI is a variational inference model for single-cell RNA-seq data that can learn an underlying latent space. The predictions of the model are meant to be afterward used for deconvolution of a second spatial transcriptomics dataset in DestVI.

02 months ago

ScANVI is a variational inference model for single-cell RNA-seq data that can learn an underlying latent space, integrate technical batches and impute dropouts. In addition, to scVI, ScANVI is a semi-supervised model that can leverage labeled data to learn a cell-type classifier in the latent space…

02 months ago

ScVI is a variational inference model for single-cell RNA-seq data that can learn an underlying latent space, integrate technical batches and impute dropouts. The learned low-dimensional latent representation of the data can be used for visualization and clustering.

02 months ago

Stereoscope is a variational inference model for single-cell RNA-seq data that can learn a cell-type specific rate of gene expression. The predictions of the model are meant to be afterward used for deconvolution of a second spatial transcriptomics dataset in Stereoscope.

02 months ago

CondSCVI is a variational inference model for single-cell RNA-seq data that can learn an underlying latent space. The predictions of the model are meant to be afterward used for deconvolution of a second spatial transcriptomics dataset in DestVI.

02 months ago

ScANVI is a variational inference model for single-cell RNA-seq data that can learn an underlying latent space, integrate technical batches and impute dropouts. In addition, to scVI, ScANVI is a semi-supervised model that can leverage labeled data to learn a cell-type classifier in the latent space…

02 months ago

ScVI is a variational inference model for single-cell RNA-seq data that can learn an underlying latent space, integrate technical batches and impute dropouts. The learned low-dimensional latent representation of the data can be used for visualization and clustering.

02 months ago
02 years ago

> A CMR-report contrastive model combining Vision Transformers and pretrained text encoders.

1410 months ago

Qwen3-8B-syco_med-gated-attention-FT is a plug-and-play gated attention weight released for AI safety research.

06 days ago
Python

PII Detection Model | 44M Parameters | Open Source

27K4 months ago
Python

FitCareer_AI Introduction

03 weeks ago

CodonTranslator is a protein-conditioned codon sequence generation model trained on the representative-only data_v3 release.

01 month ago

Model weights, configurations, and vocabularies for CpGPT: A Foundation Model for DNA Methylation.

03 months ago

Tahoe-x1 is a family of perturbation-trained single-cell foundation models with up to 3 billion parameters, developed by Tahoe Therapeutics. Pretrained on 266 million single-cell transcriptomic profiles including the Tahoe-100M perturbation compendium, Tahoe-x1 achieves state-of-the-art performance…

307 months ago

ScVI is a variational inference model for single-cell RNA-seq data that can learn an underlying latent space, integrate technical batches and impute dropouts. The learned low-dimensional latent representation of the data can be used for visualization and clustering.

02 months ago

CondSCVI is a variational inference model for single-cell RNA-seq data that can learn an underlying latent space. The predictions of the model are meant to be afterward used for deconvolution of a second spatial transcriptomics dataset in DestVI.

02 months ago

ScVI is a variational inference model for single-cell RNA-seq data that can learn an underlying latent space, integrate technical batches and impute dropouts. The learned low-dimensional latent representation of the data can be used for visualization and clustering.

02 months ago

CondSCVI is a variational inference model for single-cell RNA-seq data that can learn an underlying latent space. The predictions of the model are meant to be afterward used for deconvolution of a second spatial transcriptomics dataset in DestVI.

02 months ago

This repository contains pre-trained models from RadImageNet, a large-scale radiologic image dataset designed to facilitate transfer learning for medical imaging applications.

010 months ago
02 years ago

Target-Conditioned Molecular Ideation Model for Drug Discovery Research

05 days ago
Python

A diffusion language model for genome-scale perturbation prediction across diverse cellular contexts.

02 months ago
03 months ago

Unsloth Dynamic 2.0 achieves superior accuracy & outperforms other leading quants.

7.7K10 months ago
Python

This ontology integrates cell type markers for cells in the Cell Ontology from various sources along with details of marker context (anatomical context, assay), confidence (where available) and provenance. [from repository]

12 months ago
Python
624 years ago
HTML

The Bibliographic Ontology Specification provides main concepts and properties for describing citations and bibliographic references (i.e. quotes, books, articles, etc) on the Semantic Web.

42 months ago

netZooR unifies the implementations of several Network Zoo methods (netzoo, netzoo.github.io) into a single package by creating interfaces between network inference and network analysis methods. Currently, the package has 3 methods for network inference including PANDA and its optimized implementation OTTER (network reconstruction using mutliple lines of biological evidence), LIONESS (single-sample network inference), and EGRET (genotype-specific networks). Network analysis methods include CONDOR (community detection), ALPACA (differential community detection), CRANE (significance estimation of differential modules), MONSTER (estimation of network transition states). In addition, YARN allows to process gene expresssion data for tissue-specific analyses and SAMBAR infers missing mutation data based on pathway information.

This package implements an interactive, scientific analysis pipeline for high-dimensional cytometry data built using tidy data principles. It is specifically designed to play well with both the tidyverse and Bioconductor software ecosystems, with functionality for reading/writing data files, data cleaning, preprocessing, clustering, visualization, modeling, and other quality-of-life functions. tidytof implements a "grammar" of high-dimensional cytometry data analysis.

The toolkit 'µSTASIS', or microSTASIS, has been developed for the stability analysis of microbiota in a temporal framework by leveraging on iterative clustering. Concretely, the core function uses Hartigan-Wong k-means algorithm as many times as possible for stressing out paired samples from the same individuals to test if they remain together for multiple numbers of clusters over a whole data set of individuals. Moreover, the package includes multiple functions to subset samples from paired times, validate the results or visualize the output.

RgnTX allows the integration of transcriptome annotations so as to model the complex alternative splicing patterns. It supports the testing of transcriptome elements without clear isoform association, which is often the real scenario due to technical limitations. It involves functions that do permutaion test for evaluating association between features and transcriptome regions.

Identifies motifs that are significantly co-enriched from enhancer-promoter interaction data. While enhancer-promoter annotation is commonly used to define groups of interaction anchors, spatzie also supports co-enrichment analysis between preprocessed interaction anchors. Supports BEDPE interaction data derived from genome-wide assays such as HiC, ChIA-PET, and HiChIP. Can also be used to look for differentially enriched motif pairs between two interaction experiments.

granulator is an R package for the cell type deconvolution of heterogeneous tissues based on bulk RNA-seq data or single cell RNA-seq expression profiles. The package provides a unified testing interface to rapidly run and benchmark multiple state-of-the-art deconvolution methods. Data for the deconvolution of peripheral blood mononuclear cells (PBMCs) into individual immune cell types is provided as well.

This package provides users with the ability to query the Human Cell Atlas data repository for single-cell experiment data. The `projects()`, `files()`, `samples()` and `bundles()` functions retrieve summary information on each of these indexes; corresponding `*_details()` are available for individual entries of each index. File-based resources can be downloaded using `files_download()`. Advanced use of the package allows the user to page through large result sets, and to flexibly query the 'list-of-lists' structure representing query responses.

receptLoss identifies genes whose expression is lost in subsets of tumors relative to normal tissue. It is particularly well-suited in cases where the number of normal tissue samples is small, as the distribution of gene expression in normal tissue samples is approximated by a Gaussian. Originally designed for identifying nuclear hormone receptor expression loss but can be applied transcriptome wide as well.

An R package for fully unsupervised deconvolution of complex tissues. It provides basic functions to perform unsupervised deconvolution on mixture expression profiles by Convex Analysis of Mixtures (CAM) and some auxiliary functions to help understand the subpopulation-specific results. It also implements functions to perform supervised deconvolution based on prior knowledge of molecular markers, S matrix or A matrix. Combining molecular markers from CAM and from prior knowledge can achieve semi-supervised deconvolution of mixtures.

Implements bindings for SQL tables that are compatible with Bioconductor S4 data structures, namely the DataFrame and DelayedArray. This allows SQL-derived data to be easily used inside other Bioconductor objects (e.g., SummarizedExperiments) while keeping everything on disk.

This package allows users to perform DE analysis using multiple algorithms. It seeks consensus from multiple methods. Currently it supports "Voom", "EdgeR" and "DESeq". It uses RUV-seq (optional) to remove unwanted sources of variation.