Latest Drug Discovery Research Papers
The newest Drug Discovery papers from across the field — arXiv, NeurIPS, CVPR, Nature, and more — refreshed daily and ranked by relevance. Distill AI tracks Drug Discovery so you don’t have to: get the standout work delivered to your inbox every morning, with 2-sentence summaries and the option to chat with any paper.
Recent papers
- OncoTraj: a public benchmark for longitudinal resistance prediction in EGFR-mutant non-small-cell lung cancer on osimertinib · Abhijoy Sarkar, Aarchi Singh Thakur · arXiv · Jun 9, 2026
Resistance to first-line osimertinib in EGFR-mutant non-small-cell lung cancer (NSCLC) is the canonical example of predictable clonal evolution under therapeutic pressure, yet no public benchmark exists for training or evaluating computatio…
- Difference-Aware Retrieval Policies for Imitation Learning · Quinn Pfeifer, Ethan Pronovost, Paarth Shah, Khimya Khetarpal et al. · arXiv · Jun 8, 2026
Parametric imitation learning via behavior cloning can suffer from poor generalization to out-of-distribution states due to compounding errors during deployment. We show that reusing the training data during inference via a semi-parametric …
- Speculative Sampling For Faster Molecular Dynamics · Arthur Kosmala, Stephan Günnemann, Meng Gao, Brandon Wood · arXiv · Jun 1, 2026
Molecular dynamics (MD) is a key tool for simulating the dynamical behavior of atomic systems. However, MD is inherently serial, which makes it difficult to increase single-system throughput with concurrent compute. To address this, we intr…
- OOD-GraphLLM: Graph Large Language Model for Out-of-Distribution Generalized Drug Synergy Prediction · Xin Wang, Linxin Xiao, Yang Yao, Wenwu Zhu · arXiv · May 28, 2026
Drug synergy prediction (DSP) aims to identify efficacious drug combinations under various cellular contexts with different targets. However, the continual emergence of novel compounds results in variations in molecular scaffolds and sizes,…
- Bridging quantum mechanics to liquid properties via a universal organic force field · Tianze Zheng, Xingyuan Xu, Zhi Wang, Zhenze Yang et al. · Nature Communications · May 28, 2026
Molecular dynamics simulations are essential tools for unraveling atomic-level insights into the structure and behavior of condensed-phase systems. However, the universal and accurate prediction of macroscopic properties based on quantum me…
- Hand-in-the-Loop: Improving Dexterous VLA via Seamless Interventional Correction · Zhuohang Li, Liqun Huang, Wei Xu, Zhengming Zhu et al. · arXiv · May 14, 2026
Vision-Language-Action (VLA) models are prone to compounding errors in dexterous manipulation, where high-dimensional action spaces and contact-rich dynamics amplify small policy deviations over long horizons. While Interactive Imitation Le…
- Beyond GRPO and On-Policy Distillation: An Empirical Sparse-to-Dense Reward Principle for Language-Model Post-Training · Yuanda Xu, Hejian Sang, Zhengze Zhou, Ran He et al. · arXiv · May 12, 2026
In settings where labeled verifiable training data is the binding constraint, each checked example should be allocated carefully. The standard practice is to use this data directly on the model that will be deployed, for example by running …
- Edge-specific signal propagation on mature chromophore-region 3D mechanism graphs for fluorescent protein quantum-yield prediction · Yuchen Xiong, Swee Keong Yeap, Steven Aw Yoong Kit · arXiv · May 7, 2026
Fluorescent protein quantum yield (QY) is governed by the mature chromophore and its three-dimensional microenvironment rather than sequence identity alone. Protein language models and emission-band averages capture global trends, but do no…
- Fine-Grained Graph Generation through Latent Mixture Scheduling · Nidhi Vakil, Hadi Amiri · arXiv · May 4, 2026
Structure aware graph generation aims to generate graphs that satisfy given topological properties. It has applications in domains such as drug discovery, social network modeling, and knowledge graph construction. Unlike existing methods th…
- Bolek: A Multimodal Language Model for Molecular Reasoning · Frederic Grabowski, Jacek Szczerbiński, Maciej Jaśkowski, Kalina Jasińska-Kobus et al. · arXiv · May 4, 2026
Molecular property models increasingly support high-stakes drug-discovery decisions, but their outputs are often difficult to audit: classical predictors return scores without rationale, while language models can produce fluent explanations…
- Turning the TIDE: Cross-Architecture Distillation for Diffusion Large Language Models · Gongbo Zhang, Wen Wang, Ye Tian, Li Yuan · arXiv · Apr 29, 2026
Diffusion large language models (dLLMs) offer parallel decoding and bidirectional context, but state-of-the-art dLLMs require billions of parameters for competitive performance. While existing distillation methods for dLLMs reduce inference…
- PrismaDV: Automated Task-Aware Data Unit Test Generation · Hao Chen, Arnab Phani, Sebastian Schelter · arXiv · Apr 23, 2026
Data is a central resource for modern enterprises, and data validation is essential for ensuring the reliability of downstream applications. However, existing automated data unit testing frameworks are largely task-agnostic: they validate d…
- VLA Foundry: A Unified Framework for Training Vision-Language-Action Models · Jean Mercat, Sedrick Keh, Kushal Arora, Isabella Huang et al. · arXiv · Apr 21, 2026
We present VLA Foundry, an open-source framework that unifies LLM, VLM, and VLA training in a single codebase. Most open-source VLA efforts specialize on the action training stage, often stitching together incompatible pretraining pipelines…
- Latent Phase-Shift Rollback: Inference-Time Error Correction via Residual Stream Monitoring and KV-Cache Steering · Manan Gupta, Dhruv Kumar · arXiv · Apr 20, 2026
Large language models frequently commit unrecoverable reasoning errors mid-generation: once a wrong step is taken, subsequent tokens compound the mistake rather than correct it. We introduce Latent Phase-Shift Rollback (LPSR): at…
- ConforNets: Latents-Based Conformational Control in OpenFold3 · Minji Lee, Colin Kalicki, Minkyu Jeon, Aymen Qabel et al. · arXiv · Apr 20, 2026
Models from the AlphaFold (AF) family reliably predict one dominant conformation for most well-ordered proteins but struggle to capture biologically relevant alternate states. Several efforts have focused on eliciting greater conformational…
- Evaluating the Progression of Large Language Model Capabilities for Small-Molecule Drug Design · Shriram Chennakesavalu, Kirill Shmilovich, Hayley Weir, Colin Grambow et al. · arXiv · Apr 17, 2026
Large Language Models (LLMs) have the potential to accelerate small molecule drug design due to their ability to reason about information from diverse sources and formats. However, their practical utility remains unclear due to the lack of …
- Tabular foundation models for in-context prediction of molecular properties · Karim K. Ben Hicham, Jan G. Rittig, Martin Grohe, Alexander Mitsos · arXiv · Apr 17, 2026
Accurate molecular property prediction is central to drug discovery, catalysis, and process design, yet real-world applications are often limited by small datasets. Molecular foundation models provide a promising direction by learning trans…
- Folded domains impose structural heterogeneity and attenuated dynamics in biomolecular condensates · Liguo Wang, Siewert J. Marrink · Nature Communications · Apr 16, 2026
Biomolecular condensates are widely studied using simplified analogues formed by purified low complexity domains (LCDs) or intrinsically disordered regions (IDRs). However, the exclusion of folded domains fails to capture natural molecular …
- Nanotechnology-Enabled Delivery of Phytochemicals: From Formulation Strategies to Therapeutic Translation · Dongmin Yu, J. H. Park, Taeho Kim, Chanju Choi et al. · Journal of Phytomedicine · Apr 10, 2026
Phytochemicals have attracted considerable attention as therapeutically relevant bioactive compounds due to their diverse pharmacological activities, including anti-inflammatory, antioxidant, anticancer, and metabolic regulatory effects. Ho…
- How Well Can AI and Physics-Based Simulations Predict the Probability a Cryptic Pocket Is Open? · Si Zhang, Justin J. Miller, Gregory R. Bowman · Journal of Chemical Theory ... · Apr 7, 2026
Identifying and understanding cryptic pockets remains a compelling goal in drug discovery, as they can offer new avenues for targeting proteins that are otherwise challenging to modulate. While artificial intelligence (AI) methods for struc…
- Beyond Extracellular Vesicle (EV) Hype: Practical Solutions and Remaining Hurdles in EV Research, Manufacturing, and Clinical Translation · David J. Lundy, Zoe L. Chau, Sheng‐You Chen, Nanami Fujisawa et al. · Advanced Science · Apr 7, 2026
Extracellular vesicles (EVs) are nanoscale mediators of intercellular communication with diverse molecular cargoes that reflect their cell of origin. Advances in isolation, detection, and single-particle analytics have revealed increasing m…
- Accelerated drug development using a digital formulator and a self-driving tableting data factory · Faisal Abbas, Mohammad Salehian, Peter Hou, Jonathan Moores et al. · Nature Communications · Apr 1, 2026
Advances in drug discovery and clinical research have shifted the bottleneck in medicines development to chemistry, manufacturing, and controls activities, a critical step for regulatory approval. This includes formulation and process dev…
- AI agents in drug discovery: applications and case studies · Dinh Long Huynh, Srijit Seal, Dylan Reid et al. · Drug Discovery Today · Mar 24, 2026
• Agentic AI autonomously executes drug discovery workflows by combining language models with specialized tools for perception, computation, action and memory. • Real-world implementations achieve speed improvements, compressing literature …
- Biosynthesis of cinchona alkaloids · Blaise Kimbadi Lombe, Tingan Zhou, Gyumin Kang, Joshua C. Wood et al. · Nature · Mar 18, 2026
Examples of cinchona alkaloids include quinine, a historically important antimalarial drug, and cinchonidine, a chiral catalyst widely used in process chemistry. However, it is still largely unknown how plants synthesize these well-known …
- Nanoparticles-based phototherapy systems: molecular mechanisms and clinical applications · Deepak S. Chauhan, Rajendra Prasad, Mukesh Dhanka, Navneet Kaur et al. · Signal Transduction and Tar... · Mar 16, 2026
Nanoparticle-based phototherapy represents a paradigm shift in precision medicine, harnessing light-activated mechanisms to modulate cellular pathways across a spectrum of diseases. By integrating nanoparticles, phototherapeutic modalities …
- Pharmaco-behavioral profiling identifies suppressors of autism gene–associated phenotypes in zebrafish · Priyanka Jamadagni, Yi Dai, Yunqing Liu, Hellen Weinschutz Mendes et al. · Proceedings of the National... · Mar 16, 2026
…demonstrating conservation of drug rescue across systems. Therefore, our study establishes a pharmaco-behavioral resource for precision medicine-based drug discovery, illuminating targets relevant to large-effect ASD genes.
- Explainable AI methods for drug discovery: A survey of interpretability, metrics and mechanistic insight · Amit Gangwal, Antonio Lavecchia · Computer Science Review · Mar 9, 2026
• Multidimensional taxonomy organizes XAI methods for drug discovery decision stages. • Critical assessment of interpretability metrics for chemical and biological validity. • Task-driven guidance for selecting XAI tools across the discover…
- Confounding factors and biases abound when predicting molecular biomarkers from histological images · Muhammad Dawood, Kim Branson, Sabine Tejpar, Nasir Rajpoot et al. · Nature Biomedical Engineering · Mar 2, 2026
Deep learning models that infer clinically relevant biomarker status from tissue images are being explored as rapid and low-cost alternatives to molecular testing. Here we show, through statistical analysis across multiple cancer types, dat…
- Unified modeling of 3D molecular generation via atomic interactions with PocketXMol · Xingang Peng, Ruihan Guo, Fenglin Guo, Ziyi Wang et al. · Cell · Feb 18, 2026
We present PocketXMol, an atom-level model that unifies generative tasks related to protein pocket interactions. Using atomic prompts as task specifications, PocketXMol supports various molecular tasks, including structure prediction and de…
- RAG-Enhanced Collaborative LLM Agents for Drug Discovery · Namkyeong Lee, Edward De Brouwer, Ehsan Hajiramezanali, Tommaso Biancalani et al. · NeurIPS2025-AI4Science Poster · Sep 24, 2025
Recent advances in large language models (LLMs) have shown great potential to accelerate drug discovery. However, the specialized nature of biochemical data often necessitates costly domain-specific fine-tuning, posing critical challenges. …