Find open-source science resources

A domain-optimized reasoning model built on DeepSeek-R1-Distill-Qwen-32B, refined through a multi-stage pipeline of GPTQ quantization-aware training and QLoRA fine-tuning. Achieves 84% on MedQA — within 4 points of GPT-4o — in a ~20GB package that fits on a single L40/L40s GPU.

819 · 1 month ago

ibm-research/biomed.sm.mv-te-84m-MoleculeNet-ligand_scaffold-CLINTOX-101

by ibm-research

biomed.sm.mv-te-84m is a multimodal biomedical foundation model for small molecules created using MMELON (Multi-view Molecular Embedding with Late Fusion), a flexible approach to aggregate multiple views (sequence, image, graph) of…

16 · 1 year ago

zhihan1996/DNA_bert_3

by zhihan1996

1.9K · 10 months ago

StanfordShahLab/clmbr-t-base

by StanfordShahLab

2.8K · 2 months ago

mradermacher/Qwen-3-32B-Medical-Reasoning-i1-GGUF

by mradermacher

For a convenient overview and download list, visit our model page for this model.

3.6K · 10 months ago

rajveer43/gemma-4-E4B-medical-legal-finance-qa

by rajveer43

Fine-tuned version of google/gemma-4-E4B-it across three professional domains (Medical, Legal, and Finance) using QLoRA (4-bit NF4) with Optuna-tuned hyperparameters, trained on a Kaggle T4 GPU.

1K · 1 month ago

ncfrey/ChemGPT-19M

by ncfrey

ChemGPT is based on the GPT-Neo model and was introduced in the paper Neural Scaling of Deep Chemical Models.

6.8K · 3 years ago

ibm-research/biomed.sm.mv-te-84m-MoleculeNet-ligand_scaffold-QM7-101

by ibm-research

biomed.sm.mv-te-84m is a multimodal biomedical foundation model for small molecules created using MMELON (Multi-view Molecular Embedding with Late Fusion), a flexible approach to aggregate multiple views (sequence, image, graph) of…

19 · 1 year ago

cambridgeltl/SapBERT-from-PubMedBERT-fulltext

by cambridgeltl

feature-extraction

Datasets: UMLS

1.8M · 2 years ago

ibm-research/biomed.sm.mv-te-84m-MoleculeNet-ligand_scaffold-SIDER-101

by ibm-research

biomed.sm.mv-te-84m is a multimodal biomedical foundation model for small molecules created using MMELON (Multi-view Molecular Embedding with Late Fusion), a flexible approach to aggregate multiple views (sequence, image, graph) of…

11 · 1 year ago

zeroentropy/zerank-2-reranker

by zeroentropy

text-ranking

In search engines, rerankers are crucial for improving the accuracy of your retrieval system.

150.1K · 2 weeks ago

ibm-research/biomed.omics.bl.sm.ma-ted-458m.moleculenet_clintox_tox

by ibm-research

Drugs must satisfy stringent criteria for both efficacy and safety. This model predicts the likelihood of failure in clinical toxicity trials for small-molecule drugs, represented using SMILES (Simplified Molecular Input Line Entry System) strings.

24 · 1 year ago

prov-gigatime/GigaTIME

by prov-gigatime

image-to-image

270 · 5 months ago

thelamapi/next-ocr

by thelamapi

image-text-to-text

3.3K · 2 months ago

google/medgemma-27b-it

by google

image-text-to-text

215.5K · 10 months ago

andrewdalpino/ESM2-150M-Protein-Molecular-Function

by andrewdalpino

text-classification

An Evolutionary-scale Model (ESM) for protein function prediction from amino acid sequences using the Gene Ontology (GO). Based on the ESM2 Transformer architecture, pre-trained on UniRef50, and fine-tuned on the AmiGO dataset, this model predicts the GO subgraph for a particular protein sequence -…

15 · 11 months ago

biohub/esm3-sm-open-v1

by biohub

3.8K · 1 year ago

ibm-research/biomed.sm.mv-te-84m-MoleculeNet-ligand_scaffold-ESOL-101

by ibm-research

biomed.sm.mv-te-84m is a multimodal biomedical foundation model for small molecules created using MMELON (Multi-view Molecular Embedding with Late Fusion), a flexible approach to aggregate multiple views (sequence, image, graph) of…

22 · 1 year ago

zhihan1996/DNA_bert_6

by zhihan1996

26.9K · 10 months ago

PocketDoc/Dans-PersonalityEngine-V1.3.0-24b

by PocketDoc

120 · 1 year ago

songlab/gpn-brassicales

by songlab

GPN trained on Arabidopsis thaliana and 7 other Brassicales. See https://github.com/songlab-cal/gpn for more details.

479 · 1 year ago

birder-project/vit_reg1_s14_ls_dino-v2-dist-bio

by birder-project

image-feature-extraction

vit_reg1_s14_ls_dino-v2-dist-bio is a compact Bio-DINO image encoder distilled from the larger Bio-DINO SoViT-150M/14 model. It keeps the same natural-photography biodiversity scope as the teacher model, but uses a much smaller ViT-S/14-style student with 21.7M backbone parameters and 384-dimensional…

1913 days ago

ibm-research/biomed.sm.mv-te-84m-MoleculeNet-ligand_scaffold-MUV-101

by ibm-research

biomed.sm.mv-te-84m is a multimodal biomedical foundation model for small molecules created using MMELON (Multi-view Molecular Embedding with Late Fusion), a flexible approach to aggregate multiple views (sequence, image, graph) of…

14 · 1 year ago

zhihan1996/DNA_bert_5

by zhihan1996

484 · 10 months ago

andrewdalpino/ESM2-35M-Protein-Biological-Process

by andrewdalpino

text-classification

25 · 11 months ago

nvidia/geneformer_V1_10M

by nvidia

Geneformer is a foundational transformer model pretrained on a large-scale corpus of single-cell transcriptomes to enable context-specific predictions in settings with limited data in network biology.

15 · 5 months ago

JThomas-CoE/coe-gemma4-biology-mmlu_pro-14b-a4b-q4

by JThomas-CoE

Base model: google/gemma-4-26b-it. Architecture: MoE, 26B total / ≈4B active parameters (1 shared expert + 8 routed from a pool of 128 per MoE layer, 30 MoE layers). Method: activation-directed expert surgery, 128 → 64 experts per layer (50% reduction). Quantization: Q4_K_M (≈9.7 GB on disk). Tags:…

58 · 1 week ago

nvidia/geneformer_V2_104M_CLcancer

by nvidia

13 · 5 months ago

Verdugie/STEM-Oracle-27B

by Verdugie

or·a·cle /ˈôrəkəl/: a source of wise counsel; one who provides authoritative knowledge. From Latin ōrāculum, meaning divine announcement. In computer science, an oracle is a black box that always returns the correct answer: you don't ask it how it knows, you ask and it answers.

136 · 2 months ago

keejkrej/cellpose-cpsam-onnx

by keejkrej

image-segmentation

ONNX export of the Cellpose cpsam (Cellpose-SAM) model for cell segmentation in microscopy images.

0 · 3 months ago

darkknight25/deepseek-16b-medical-GPT

by darkknight25

darkknight25/deepseek-16b-medical-GPT is a fine-tuned version of deepseek-ai/deepseek-16b-moe-chat, optimized for medical question answering, reasoning, and clinical summarization using QLoRA and open-access healthcare datasets.

0 · 10 months ago

nvidia/AMPLIFY_350M

by nvidia

Note: This model has been optimized using NVIDIA's TransformerEngine library. Slight numerical differences may be observed between the original model and the optimized model. For instructions on how to install TransformerEngine, please refer to the official documentation.

27 · 8 months ago

mradermacher/Dans-PersonalityEngine-V1.3.0-24b-i1-GGUF

by mradermacher

For a convenient overview and download list, visit our model page for this model.

389 · 10 months ago

SaltySander/MOSAIC

by SaltySander

0 · 11 months ago

nvidia/geneformer_V2_104M

by nvidia

24 · 5 months ago

Keylab/COMO

by Keylab

COMO (Closed-loop Optical Molecule recOgnition) is a deep learning framework for Optical Chemical Structure Recognition (OCSR). It recognizes chemical structure diagrams from images and predicts SMILES strings with atom-level 2D coordinates and bond matrices.

0 · 1 day ago

gbyuvd/chemembed-chemselfies

by gbyuvd

sentence-similarity

ChemFIE-BED is a sentence-transformers model based on gbyuvd/chemselfies-base-bertmlm, fine-tuned on around 2 million pairs (for now) of valid molecules' SELFIES (Krenn et al. 2020) taken from COCONUTDB (Sorokina et al. 2021) and ChemBL34 (Zdrazil et al. 2023).

632 · 6 months ago

gbyuvd/chemselfies-base-bertmlm

by gbyuvd

This model is a lightweight model pre-trained on SELFIES (Self-Referencing Embedded Strings) representations of molecules. It is trained on 2.7M unique and valid molecules taken from COCONUTDB and ChemBL34, with 7.3M total generated masked examples.

13 · 7 months ago

littleworth/protgpt2-distilled-tiny

by littleworth

A compact protein language model distilled from ProtGPT2 using complementary-regularizer distillation, a method that combines uncertainty-aware position weighting with calibration-aware label smoothing to achieve 87% better perplexity than standard knowledge distillation at 20x compression.

20 · 2 months ago

SaeedLab/ProteoRift

by SaeedLab

feature-extraction

5 · 2 months ago

InstaDeepAI/instanovo-phospho-v1.0.0

by InstaDeepAI

InstaNovo-P is a specialized transformer-based model for de novo peptide sequencing from phosphoproteomics mass spectrometry data. This model is specifically trained and optimized for identifying phosphorylated peptides and their modification sites.

28 · 2 weeks ago

andrewdalpino/ESM2-150M-Protein-Cellular-Component

by andrewdalpino

text-classification

26 · 11 months ago

BGI-HangzhouAI/Genos-m

by BGI-HangzhouAI

Genos-m is a foundation model for human-associated microbial genomes. It is trained to model microbial DNA sequences at single-nucleotide resolution and supports ultra-long genomic contexts up to one million tokens.

24 · 5 days ago

AIRI-Institute/moderngena-base

by AIRI-Institute

ModernGENA is a DNA foundation model based on ModernBERT (a modernized BERT-style encoder architecture) adapted for genomic sequence modeling. ModernGENA base is the 377M-parameter version introduced in the paper Back to BERT in 2026: ModernGENA as a Strong, Efficient Baseline for…

495 · 1 month ago

Acryl-aLLM/ALLM.H-Bv4-Gemma4-31B-BF16

by Acryl-aLLM

13 · 1 month ago

Junhauwong/Surge-Cognition-4x8B

by Junhauwong

325 days ago

prithivMLmods/Indian-Western-Food-34

by prithivMLmods

image-classification

27 · 1 year ago

BioMistral/BioMistral-7B-GGUF

by BioMistral

875 · 2 years ago

birder-project/dino_v2_vit_reg4_so150m_p14_ls_bio

by birder-project

This repository contains the full Bio-DINO DINOv2 training weights for a SoViT-150M/14 Vision Transformer trained on natural photographs of living organisms. It is the companion release to the Birder backbone checkpoints.

1324 days ago

scvi-tools/tabula-sapiens-large_intestine-scanvi

by scvi-tools