BGI-HangzhouAI/Genos-m

text-generation
Actively maintainedby BGI-HangzhouAI241updated 4 days ago
Python

Genos-m is a foundation model for human-associated microbial genomes. It is trained to model microbial DNA sequences at single-nucleotide resolution and supports ultra-long genomic contexts up to one million tokens.

README

license: apache-2.0 language: dna tags: biology genomics microbe dna-language-model mixture-of-experts metagenomics library_name: transformers Genos-m Genos-m is a foundation model for human-associated microbial genomes. It is trained to model microbial DNA sequences at single-nucleotide resolution and supports ultra-long genomic contexts up to one million tokens. For instructions, details, benchmarks, and examples, please refer to Genos-m GitHub and paper. Model Specification | Specification |…

Source attribution

  • HuggingFaceBGI-HangzhouAI/Genos-m

Related resources

A compact protein language model distilled from ProtGPT2 using complementary-regularizer distillation---a method that combines uncertainty-aware position weighting with calibration-aware label smoothing to achieve 54% better perplexity than standard knowledge distillation at 9.4x compression.

52 months ago
Python

A compact protein language model distilled from ProtGPT2 using complementary-regularizer distillation---a method that combines uncertainty-aware position weighting with calibration-aware label smoothing to achieve 31% better perplexity than standard knowledge distillation at 3.8x compression.

662 months ago
Python

darkknight25/deepseek-16b-medical-GPT is a fine-tuned version of deepseek-ai/deepseek-l6b-moe-chat, optimized for medical question answering, reasoning, and clinical summarization using QLoRA and open-access healthcare datasets.

010 months ago
Python

A patient-level disease classification model trained on single-cell RNA-seq data. Given a matrix of gene expression profiles (one row per cell), the model produces a disease-category prediction for the patient.

692 weeks ago
Python

A compact protein language model distilled from ProtGPT2 using complementary-regularizer distillation---a method that combines uncertainty-aware position weighting with calibration-aware label smoothing to achieve 87% better perplexity than standard knowledge distillation at 20x compression.

202 months ago
Python