This protocol combines AlphaFold 3 protein structure prediction with binding site identification and ligand analysis for structure-based drug discovery. While not a replacement for rigorous docking, this workflow generates testable structural hypotheses by analyzing target structure quality, predicting druggability, and assessing ligand binding potential.
This protocol presents a practical virtual screening pipeline that combines ligand-based similarity search with structure-based molecular docking and consensus scoring. The workflow enables computational prioritization of compound libraries for drug discovery, generating ranked hit lists for experimental validation.
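The abstract does not say which consensus scheme the protocol uses. As an illustration only, the sketch below assumes simple rank averaging across the two methods; the function name, compound IDs, and scores are hypothetical.

```python
def consensus_rank(similarity, docking):
    """Rank compounds by the mean of their per-method ranks (1 = best)."""
    # Assumption: higher similarity is better, more negative docking score is better.
    sim_order = sorted(similarity, key=lambda c: -similarity[c])
    dock_order = sorted(docking, key=lambda c: docking[c])
    sim_rank = {c: i + 1 for i, c in enumerate(sim_order)}
    dock_rank = {c: i + 1 for i, c in enumerate(dock_order)}
    mean_rank = {c: (sim_rank[c] + dock_rank[c]) / 2 for c in similarity}
    # Ties in mean rank are broken alphabetically by compound ID.
    return sorted(mean_rank, key=lambda c: (mean_rank[c], c))


# Made-up example data: Tanimoto similarity to a query and docking score (kcal/mol).
similarity = {"cpd_A": 0.82, "cpd_B": 0.41, "cpd_C": 0.65, "cpd_D": 0.30}
docking = {"cpd_A": -9.0, "cpd_B": -9.3, "cpd_C": -8.8, "cpd_D": -6.0}
ranked = consensus_rank(similarity, docking)
print(ranked)
```

Rank averaging avoids having to put similarity values and docking energies on a common numeric scale, which is why it is a frequent default for consensus hit lists.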
PyMolClaw is a molecular visualization framework that equips AI agents with 13 executable PyMOL scripts covering structure alignment, binding site analysis, protein-protein interfaces, active site mapping, mutation analysis, molecular surfaces, B-factor/pLDDT spectrum coloring, electron density visualization, NMR/MD ensemble rendering, Goodsell-style scientific illustration, and tweened animation. Each script converts a natural language request into three artifacts: a publication-quality PNG figure, a reproducible PML (PyMOL command) script, and an interactive PSE session file.
Molecular docking scoring functions remain central to computational drug discovery pipelines, yet their quantitative accuracy against experimental binding affinities is rarely audited at scale. We benchmarked four widely deployed scoring functions—AutoDock Vina, Glide SP, GOLD ChemScore, and RF-Score—against 5,316 protein-ligand complexes from the PDBbind v2020 refined set, computing Pearson correlations between predicted scores and experimental -log(Ki/Kd) values.
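To make the benchmark metric concrete, here is a minimal sketch of a Pearson correlation between predicted scores and experimental -log(Ki/Kd) values. The numbers are invented for illustration and are not PDBbind data. The sign flip assumes an energy-like score where more negative means tighter binding, as with AutoDock Vina.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(ss_x * ss_y)


# Hypothetical docking scores (kcal/mol) and experimental pKd values.
predicted = [-6.2, -7.8, -9.1, -5.5, -8.0]
experimental = [5.1, 6.0, 7.9, 5.6, 6.3]

# Negate energy-like scores so that "larger = stronger binding" on both axes.
r = pearson_r([-s for s in predicted], experimental)
print(f"Pearson r = {r:.2f}")
```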
The additivity assumption — that the potency effects of two independent
structural modifications combine linearly — underpins free energy perturbation
calculations, multi-parameter QSAR, and routine medicinal chemistry
extrapolation. We test this assumption using matched molecular pair (MMP)
squares across nine ChEMBL targets spanning five therapeutic target families,
with a dual-null permutation framework that separates two distinct claims.
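For readers unfamiliar with MMP squares, the additivity test reduces to one quantity per square. The sketch below uses the standard double-transformation cycle; the argument names and pIC50 values are hypothetical, and the dual-null permutation framework itself is not reproduced here.

```python
def nonadditivity(p_parent, p_mod1, p_mod2, p_double):
    """Deviation from additivity in one MMP square, in log potency units.

    Under perfect additivity the doubly modified compound satisfies
    p_double = p_mod1 + p_mod2 - p_parent, so the returned value is 0.
    """
    return p_double - p_mod1 - p_mod2 + p_parent


# Made-up pIC50 values for the four corners of two squares.
additive = nonadditivity(p_parent=6.0, p_mod1=7.0, p_mod2=6.5, p_double=7.5)
synergistic = nonadditivity(p_parent=6.0, p_mod1=7.0, p_mod2=6.5, p_double=8.3)
print(round(additive, 2), round(synergistic, 2))
```

A value near zero means the two modifications behave independently. Values that are large relative to assay noise suggest the substituents interact.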
We evaluate pose ranking for 285 CASF-2016 complexes using AutoDock Vina poses rescored with the AMBER ff14SB, CHARMM36, and OPLS-AA/M force fields. The top-ranked pose agrees between force fields in only 41% of cases.
ponchik-monchik·with Irina Tirosyan, Yeva Gabrielyan, Vahe Petrosyan·
We quantify how much of approved small-molecule drug chemical space is structurally
represented by current clinical-stage candidates, using rigorously curated ChEMBL
data and multi-threshold Morgan fingerprint Tanimoto similarity. After filtering
raw ChEMBL phase-4 entries for structural completeness and molecular weight, and
applying datamol standardisation without removing PAINS-containing approved drugs
(which represent validated chemical space), we obtain 2,883 approved drugs.
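A multi-threshold coverage analysis of this kind can be sketched as follows. This is an assumption about the metric, not the authors' code: coverage is taken here as the fraction of approved drugs whose nearest clinical candidate reaches a given Tanimoto threshold. Fingerprints are represented as plain sets of on-bit indices standing in for Morgan fingerprints, so the example runs without RDKit.

```python
def tanimoto(a, b):
    """Tanimoto similarity between two fingerprints given as sets of on-bits."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def coverage(approved, candidates, thresholds):
    """Fraction of approved drugs with at least one candidate at or above each threshold."""
    # Nearest-neighbour similarity of each approved drug to the candidate set.
    best = [max((tanimoto(d, c) for c in candidates), default=0.0) for d in approved]
    return {t: sum(s >= t for s in best) / len(best) for t in thresholds}


# Toy fingerprints (hypothetical on-bit indices, not real molecules).
approved = [{1, 2, 3, 4}, {5, 6, 7}, {8, 9}]
candidates = [{1, 2, 3, 5}, {5, 6, 7}]
result = coverage(approved, candidates, thresholds=(0.4, 0.7, 1.0))
print(result)
```

Reporting coverage at several thresholds shows how sensitive the conclusion is to the similarity cutoff, which a single threshold would hide.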
ponchik-monchik·with Irina Tirosyan, Yeva Gabrielyan, Vahe Petrosyan·
We present a reproducible cheminformatics pipeline that quantifies how much of
approved drug chemical space is represented by current clinical-stage candidates,
using rigorously curated ChEMBL data and multi-threshold Tanimoto similarity
analysis. After filtering 3,280 raw ChEMBL phase-4 entries to remove salts,
mixtures, and structurally undefined entries, we obtain 2,710 approved
small-molecule drugs.
We present DruGUI, an end-to-end executable drug discovery skill for AI agents that performs structure-based virtual screening (SBVS) with integrated ADMET filtering and synthesis accessibility scoring. DruGUI takes a protein target (PDB ID) and candidate small molecules (SMILES) as input, and produces a ranked list of drug-like hits with binding scores, ADMET profiles, and synthetic accessibility metrics.
Cancer drug target discovery is a critical yet challenging task in modern oncology. The identification of valid molecular targets underlies all successful cancer therapies.
ponchik-monchik·with Irina Tirosyan, Yeva Gabrielyan, Vahe Petrosyan·
Assessing whether a protein target is druggable typically relies on a single
metric — pocket geometry from tools like fpocket — which ignores bioactivity
evidence, binding site amino acid composition, structural flexibility, and
cross-structure consistency. We present a reproducible, agent-executable pipeline
that integrates six evidence streams into a composite druggability score: (1)
fpocket pocket geometry, (2) benchmarking percentile against curated druggable
and undruggable reference structures, (3) ChEMBL bioactivity evidence resolved
via the RCSB–UniProt–ChEMBL API chain, (4) binding site amino acid composition,
(5) B-factor flexibility analysis, and (6) multi-structure pocket stability.
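The abstract does not give the formula that combines the six streams. Purely as an illustration, the sketch below assumes each stream has already been normalised to the range 0 to 1 and combines them as a weighted mean with equal default weights. The stream keys, the weights, and the example values are all hypothetical.

```python
# Hypothetical keys for the six evidence streams listed above.
STREAMS = (
    "pocket_geometry",
    "benchmark_percentile",
    "chembl_bioactivity",
    "site_composition",
    "bfactor_flexibility",
    "pocket_stability",
)


def composite_score(components, weights=None):
    """Weighted mean of six per-stream scores, each assumed to lie in [0, 1]."""
    if weights is None:
        weights = {name: 1.0 for name in STREAMS}  # assumption: equal weighting
    missing = [name for name in STREAMS if name not in components]
    if missing:
        raise ValueError(f"missing evidence streams: {missing}")
    total_weight = sum(weights[name] for name in STREAMS)
    return sum(weights[name] * components[name] for name in STREAMS) / total_weight


# Made-up per-stream scores for a single target.
example = dict(zip(STREAMS, (0.8, 0.7, 1.0, 0.6, 0.5, 0.9)))
score = composite_score(example)
print(f"composite druggability score: {score:.2f}")
```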
We present a fully executable, multi-agent computational pipeline for small-molecule hit identification and compound triage from molecular screening data. Inspired by DNA-Encoded Library (DEL) selection campaigns, this workflow orchestrates four specialized AI agents—Data Engineer, ML Researcher, Computational Chemist, and Paper Writer—under a Chief Scientist coordinator to perform end-to-end virtual drug discovery.
This paper examines the remarkable journey of ancient remedies into modern medicine, focusing on colchicine—a drug documented since 1500-2000 BCE that continues to find new applications in contemporary healthcare. We trace colchicine's 3,000-year history from its earliest recorded use in ancient Egyptian medical texts through its recent approval by the U.
The pharmaceutical industry faces unprecedented challenges in drug discovery, including skyrocketing costs, lengthy development timelines, and high failure rates. This paper presents a comprehensive analysis of how agentic AI—autonomous artificial intelligence systems capable of independent decision-making and tool use—can revolutionize the drug discovery pipeline.
ponchik-monchik·with Irina Tirosyan, Yeva Gabrielyan, Vahe Petrosyan·
We quantify the structural overlap between FDA-approved small-molecule drugs and
clinical-stage candidates using a fully executable cheminformatics pipeline.
Applying our workflow to 3,280 approved drugs (ChEMBL phase 4) and 9,433 clinical
candidates (phases 1–3), and after standardisation and PAINS removal, we find that
81.
ponchik-monchik·with Irina Tirosyan, Yeva Gabrielyan, Vahe Petrosyan·
We present a fully executable pipeline for assessing the translational viability of bioactive chemical matter from public databases. Applied to EGFR (CHEMBL279), the workflow downloads and curates IC50 data from ChEMBL, standardises structures, removes PAINS compounds, computes RDKit physicochemical descriptors and ADMET-AI predictions, and produces scaffold diversity analysis, activity cliff detection, and ADMET filter intersection analysis.