tahoebio/Tahoe-x1

Maintenance lightby tahoebio3070updated 7 months ago

Tahoe-x1 is a family of perturbation-trained single-cell foundation models with up to 3 billion parameters, developed by Tahoe Therapeutics. Pretrained on 266 million single-cell transcriptomic profiles including the Tahoe-100M perturbation compendium, Tahoe-x1 achieves state-of-the-art performance…

README

license: apache-2.0 datasets: tahoebio/Tahoe-100M tags: biology single-cell RNA chemistry tahoebio pytorch Tahoe-x1 Tahoe-x1 is a family of perturbation-trained single-cell foundation models with up to 3 billion parameters, developed by Tahoe Therapeutics. Pretrained on 266 million single-cell transcriptomic profiles including the Tahoe-100M perturbation compendium, Tahoe-x1 achieves state-of-the-art performance on cancer-relevant tasks with 3–30× higher compute efficiency than other cell-state…

Source attribution

  • HuggingFacetahoebio/Tahoe-x1

Related resources

This model is a BERT-like sequence classifier for 221 human protein drug targets, fine-tuned from gbyuvd/chemselfies-base-bertmlm on a dataset derived ChemBL34 (Zdrazil et al. 2023). It predicts potential drug targets using chemical structures represented as SELFIES (Self-Referencing Embedded…

91 year ago
Python

# Model details ## Model description Nature Language Model (NatureLM) is a sequence-based science foundation model designed for scientific discovery. Pre-trained with data from multiple scientific domains, NatureLM offers a unified, versatile model that enables various applications including…

24011 months ago

ChemFIE-SA is a BERT-like sequence classifier for predicting synthesis accessibility given a SELFIES string of a compound, fine-tuned from gbyuvd/chemselfies-base-bertmlm on DeepSA's expanded dataset from Wang et al. 2023.

91 year ago
Python

# Model details ## Model description Nature Language Model (NatureLM) is a sequence-based science foundation model designed for scientific discovery. Pre-trained with data from multiple scientific domains, NatureLM offers a unified, versatile model that enables various applications including…

3311 months ago

This model is a lightweight model pre-trained on SELFIES (Self-Referencing Embedded Strings) representations of molecules. It is trained on 2.7M unique and valid molecules taken from COCONUTDB and ChemBL34, with 7.3M total generated masked examples.

137 months ago
Python

# InstaNovo: De novo Peptide Sequencing Model ## Model Description

362 weeks ago