gbyuvd/chemembed-chemselfies
ChemFIE-BED is a sentence-transformers model based on gbyuvd/chemselfies-base-bertmlm, fine-tuned on roughly 2 million pairs (so far) of valid molecules' SELFIES (Krenn et al. 2020) taken from COCONUTDB (Sorokina et al. 2021) and ChEMBL34 (Zdrazil et al. 2023).
README
library_name: sentence-transformers
metrics:
- pearson_cosine
- spearman_cosine
- pearson_manhattan
- spearman_manhattan
- pearson_euclidean
- spearman_euclidean
- pearson_dot
- spearman_dot
- pearson_max
- spearman_max
pipeline_tag: sentence-similarity
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- loss:MatryoshkaLoss
- loss:CosineSimilarityLoss
- chemistry
- biology
- drug-discovery
- herbal
- coconutdb
- chembl34
- selfies
- drugs
- molecules
- compounds
widget:
- source_sentence: >-
    [N] [C] [=N] [C] [=N] [C] [=C] [Ring1]…
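The metadata above lists cosine-based metrics and both MatryoshkaLoss and CosineSimilarityLoss, which suggests that embeddings are meant to be compared by cosine similarity and may remain usable when cut down to a leading prefix of dimensions. The following is a minimal sketch of that comparison, not the author's own usage code. The helper names (`cosine_similarity`, `truncate`, `embed_selfies`) are hypothetical, and it is an assumption that the checkpoint loads through the standard `SentenceTransformer` API. Which prefix sizes the model was trained for is not stated on this page.

```python
import math
from typing import List, Sequence

# Repository name taken from this page.
MODEL_ID = "gbyuvd/chemembed-chemselfies"


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def truncate(vec: Sequence[float], dim: int) -> List[float]:
    """Keep only the first `dim` components (a Matryoshka-style prefix).

    Assumption: the prefix sizes that work well depend on how the model
    was trained, which this page does not specify.
    """
    if not 0 < dim <= len(vec):
        raise ValueError("dim must be between 1 and len(vec)")
    return list(vec[:dim])


def embed_selfies(selfies: List[str]) -> List[List[float]]:
    """Encode SELFIES strings with the published checkpoint.

    Assumption: the model loads with the standard sentence-transformers
    API. This needs the `sentence-transformers` package and network
    access, so the import is deferred and nothing runs on import.
    """
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer(MODEL_ID)
    return model.encode(selfies).tolist()


if __name__ == "__main__":
    # Toy vectors stand in for real embeddings so the demo runs offline.
    a = [1.0, 0.0, 2.0, 0.0]
    b = [2.0, 0.0, 4.0, 1.0]
    print("full vectors:   ", cosine_similarity(a, b))
    print("first 3 of 4:   ", cosine_similarity(truncate(a, 3), truncate(b, 3)))
```

To score real molecules, pass two SELFIES strings to `embed_selfies` and feed the resulting vectors to `cosine_similarity`. The widget example above separates tokens with spaces, so inputs should presumably follow the same convention (for instance `[C] [C] [O]` for ethanol, used here purely as an illustration).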
Source attribution
- HuggingFace — gbyuvd/chemembed-chemselfies
Related resources
This model is a BERT-like sequence classifier for 221 human protein drug targets, fine-tuned from gbyuvd/chemselfies-base-bertmlm on a dataset derived from ChEMBL34 (Zdrazil et al. 2023). It predicts potential drug targets using chemical structures represented as SELFIES (Self-Referencing Embedded…
This is a lightweight model pre-trained on SELFIES (Self-Referencing Embedded Strings) representations of molecules. It was trained on 2.7M unique, valid molecules taken from COCONUTDB and ChEMBL34, yielding 7.3M generated masked examples in total.
ChemFIE-SA is a BERT-like sequence classifier for predicting synthesis accessibility given a SELFIES string of a compound, fine-tuned from gbyuvd/chemselfies-base-bertmlm on DeepSA's expanded dataset from Wang et al. 2023.
datasets: - UMLS