Latest Transfer Learning Research Papers

The newest Transfer Learning papers from across the field — arXiv, NeurIPS, CVPR, Nature, and more — refreshed daily and ranked by relevance. Distill AI tracks Transfer Learning so you don’t have to: get the standout work delivered to your inbox every morning, with 2-sentence summaries and the option to chat with any paper.

Recent papers

Online Variance Reduction for Domain Adaptation on Streaming Data
Andrea Napoli · arXiv · Jul 22, 2026
This paper studies the problem of stochastic variance reduction (SVR) for the maximum mean discrepancy (MMD) and correlation alignment (CORAL) loss functions. Although various offline SVR algorithms for these losses have been proposed, thes…
Variance-reduced Domain Adaptation using Paired Sampling
Andrea Napoli · arXiv · Jul 22, 2026
Correlation alignment and the maximum mean discrepancy are two widely used distribution-matching frameworks for unsupervised domain adaptation (UDA). However, high variance in these losses has been shown to undermine their effectiveness in …
The Blessing of Dimensionality: How Near-Orthogonality in High-Dimensional Spaces Explains Temporal Portability
Abigail Woodring, Adrian Chan, Rana Muhammad Shahroz Khan, Sukwon Yun et al. · arXiv · Jul 22, 2026
Fine-tuning has been widely used to adapt large language models (LLMs) for domain-specific tasks. Parameter efficient fine-tuning (PEFT) methods such as low-rank adaptation (LoRA) are frequently used to reduce computational costs. PortLLM i…
CircuitKIT: Circuit Discovery, Evaluation, and Application Toolkit for Mechanistic Interpretability
Pratinav Seth, Hem Gosalia, Aditya Kasliwal, Vinay Kumar Sankarapu · arXiv · Jul 21, 2026
Circuit analysis can support not only model explanation but also downstream interventions such as pruning, editing, steering, and selective fine-tuning. However, conducting such analyses currently requires stitching together separate implem…
Verifier-Based Reinforcement Fine-Tuning of Reasoning Models for Thermal Energy Storage Control
Takumi Shioda, Kohei Terashima, Tatsuo Nagai · arXiv · Jul 14, 2026
Buildings are expected to shift cooling loads in response to grid conditions. Thermal energy storage (TES) enables this shift, but scheduling it well requires planning hours ahead under storage constraints. Model predictive control (MPC) an…
TriA Pipeline: A Large-Scale Automatic Audio Annotation Pipeline For Audio Classification In Specific Scenarios
Hong Lyu, Mingru Yang, Qianhua He, Yanxiong Li et al. · arXiv · Jul 7, 2026
There are some datasets of varying scales for audio classification (AC) applied to different tasks. However, annotated data is limited for most scenarios, such as domestic environments. To address this challenge, we propose an $\textbf{A}$u…
ZO-Act: Efficient Zeroth-Order Fine-Tuning via One-Shot Activation-Informed Low-Rank Subspaces
Xun Dong, Yibo Xu, Naigang Wang, Xin Li et al. · arXiv · Jul 1, 2026
Zeroth-order (ZO) optimization enables fine-tuning large language models when backpropagation is unavailable or memory-prohibitive, but existing methods often perturb full model weights or randomly constructed low-dimensional subspaces, yie…
Learning from Mistakes: Rollout-Retrieval Lifelong Policy Learning for Autonomous Driving
Cheng Gong, Haoyang Wang, Chao Lu, Zirui Li et al. · arXiv · Jun 29, 2026
Autonomous driving policies should be able to improve continually as deployment exposes them to increasingly diverse and long-tail traffic situations. However, most learning-based policies are trained or fine-tuned on expert demonstrations …
ITSPACE: Monotone Gaussian Optimal Transport Updates
Woojoo Na, Jennifer Dy · arXiv · Jun 29, 2026
Covariance matrices serve as compact descriptors of feature distributions in many machine-learning pipelines, including domain adaptation and Gaussian embeddings. Under a centered Gaussian approximation, the unregularized Wasserstein-2 opti…
Physics-Informed Neural Network with Transfer Learning for State Estimation in Lithium-Ion Batteries using the Single Particle Model with Electrolyte
Gift Modekwe, Qiugang Lu · arXiv · Jun 26, 2026
Physics-informed neural networks (PINNs) have emerged as a powerful tool for solving nonlinear partial differential equations (PDEs), including battery electrochemical models. They typically enforce conservation laws within the loss functi…
A Multi-Fidelity Convolutional Autoencoder-Transfer Learning Framework for Guided-Wave-Based Damage Diagnosis Using Large Simulated and Limited Experimental Datasets
Santosh Kapuria, Abhishek · arXiv · Jun 25, 2026
Guided wave-based structural health monitoring (GWSHM) with onboard transducers offers significant potential for the early diagnosis of damage in engineering structures. However, the practical deployment of deep learning models is often hin…
Taxonomy-aware deep learning for hierarchical marine species classification in underwater imagery
Dan Zimmerman, Dimitris A. Pados, George Sklivanitis · arXiv · Jun 24, 2026
Automated classification of marine species from underwater imagery is essential for scalable ocean biodiversity monitoring and conservation policy. Existing approaches struggle with severe domain shift across collection platforms, fine-grai…
WinDOM: Self-Family Distillation for Small-Model GUI Grounding
Chengheng Li-Chen, Zhiqian Zhou, Hao Chen, Nicolas Chauvin · arXiv · Jun 24, 2026
Small ($\sim$2B) GUI-grounding agents are attractive for on-device deployment, accessibility tooling, and low-cost iteration, but at this scale they face two open recipe questions: how to obtain bounding-box training data without expensive …
ASALT: Adaptive State Alignment for Lateral Transfer in Multi-agent Reinforcement Learning
Anurag Akula, Satheesh K. Perepu, Abhishek Sarkar, Kaushik Dey · arXiv · Jun 23, 2026
Multi-agent reinforcement learning (MARL) addresses the problem of training multiple agents that pursue collaborative, competitive, or mixed objectives. Prior work has investigated transfer learning between source and target domains in MARL…
An LLM-based Two-Stage Transformer Framework for Cross-Domain Bearing Fault Diagnosis with Limited Data
Jinghan Wang, Feng Cheng, Wentao Wu, Hang Li et al. · arXiv · Jun 23, 2026
Bearing fault diagnosis faces critical challenges when dataset heterogeneity, operating condition variations, and limited labeled data occur simultaneously in industrial environments. Existing approaches address these issues in isolation an…
RECALL: Recovery Experience Collection for Active Lifelong Learning in Vision-Language-Action Models
Ulas Berk Karli, Tesca Fitzgerald · arXiv · Jun 22, 2026
Vision-Language-Action (VLA) models are commonly fine-tuned through passive imitation learning, where additional demonstrations are collected for tasks where the policy performs poorly. This approach incurs several downsides: it requires th…
Hedgementation = Hedgerow Segmentation: A Remote Sensing Benchmark
Nathan Senyard, Salem Hamdani, Astrid Zhang, Derek Wang et al. · arXiv · Jun 22, 2026
We propose Hedgementation: a new benchmark to evaluate machine learning models for hedgerow mapping from remote sensing data at country scale and 10m$^2$ spatial resolution. We combine and harmonize multiple remote sensing data products and…
Scaling Linear Mode Connectivity and Merging to Billion Parameter Pretrained Transformers
Tianyi Li, Zhiqiang Shen · arXiv · Jun 22, 2026
Linear mode connectivity (LMC) provides a promising foundation for understanding and merging independently trained neural networks, but existing methods typically optimize the interpolation path from only one model endpoint, limiting their …
Probe-and-Refine Tuning of Repository Guidance for Coding Agents
Asa Shepard, Jeannie Albrecht · arXiv · Jun 18, 2026
LLM-based coding agents need higher-level operational knowledge about a repository (which files house which subsystems, how to run the test suite, which workflows have historically led to wrong fixes) that does not exist in the code itself.…
Does VLA Even Know the Basics? Measuring Commonsense and World Knowledge Retention in Vision-Language-Action Models
Nikita Kachaev, Andrey Moskalenko, Matvey Skripkin, Nikita Kurlaev et al. · arXiv · Jun 17, 2026
Embodied Vision-Language-Action (VLA) models are typically obtained by fine-tuning powerful pretrained VLMs on robotics data, yet it is unclear how much commonsense and factual knowledge they retain after adaptation. Failures on knowledge-s…
From Reasoning Traces to Reusable Modules: Understanding Compositional Generalization in Language Model Reasoning
Lingjing Kong, Xin Liu, Guangyi Chen, Martin Q. Ma et al. · arXiv · Jun 16, 2026
Post-training pipelines that combine supervised fine-tuning (SFT) with reinforcement learning (RL) have emerged as the key recipe for transforming large language models (LLMs) into robust reasoners. We argue that this combined success is dr…
Hierarchical Advantage Weighting for Online RL Fine-Tuning of VLAs from Sparse Episode Outcomes
Tongyan Fang, Siyuan Huang, Naiyu Fang, Ganlong Zhao et al. · arXiv · Jun 15, 2026
When pretrained VLA policies are fine-tuned through online RL, each rollout episode produces only a single binary outcome (success or failure), yet the actor update requires per-transition supervision. Existing approaches commonly reduce th…
Selection Without Signal, Recovery Through Expression: A Measurement Study of Post-Hoc Falsification Operators for Frozen Small Code Models
Mehmet Iscan · arXiv · Jun 15, 2026
Frozen small code models (<=1.5B parameters, run locally without fine-tuning) suit offline and privacy-constrained use, but often emit plausible-but-wrong programs. A natural remedy is a post-hoc operator that selects, verifies, repairs,…
A Comparative Study of Deep Learning Architectures for Multi-Horizon Behavioural Forecasting for Mobile Health
Pavlos Nicolaou, Kleanthis Malialis, Artemis Kontou, Panayiotis Kolios · arXiv · Jun 12, 2026
Wearable devices and smartphones generate rich behavioural time series that can support proactive health interventions, yet systematic comparisons of modern forecasting architectures for these data are lacking. In particular, it remains unc…
Claw-SWE-Bench: A Benchmark for Evaluating OpenClaw-style Agent Harnesses on Coding Tasks
Mengyu Zheng, Kai Han, Boxun Li, Haiyang Xu et al. · arXiv · Jun 10, 2026
General-purpose agents such as OpenClaw are increasingly used as autonomous tool users, but their coding ability is difficult to measure under SWE-bench: a generic agent does not by itself satisfy the clean Docker workspace, patch, and pred…
ALIGNBEAM: Inference-Time Alignment Transfer via Cross-Vocabulary Logit Mixing
Chirag Chawla, Pratinav Seth, Vinay Kumar Sankarapu · arXiv · Jun 10, 2026
Domain fine-tuning degrades the safety of large language models: fine-tuned specialists readily comply with harmful prompts framed in domain language. Existing inference-time defenses that mix logits from a safe anchor model require both mo…
Harness In-Context Operator Learning with Chain of Operators
Minghui Yang, Ling Guo, Liu Yang · arXiv · Jun 10, 2026
Neural operators approximate mappings between function spaces, but often generalize poorly to other operators and usually require fine-tuning or retraining. In-Context Operator Networks (ICON) addresses this issue by prompting the model wit…
A Unifying Lens on Supervised Fine-Tuning Through Target Distribution Design
Tong Xie, Yuanhao Ban, Yunqi Hong, Sohyun An et al. · arXiv · Jun 9, 2026
Supervised fine-tuning (SFT) typically maximizes the likelihood of every token in a demonstrated trajectory. However, an observed token can be non-unique, noisy, or misaligned with the model prior. Strictly fitting toward this one-hot targe…
OncoTraj: a public benchmark for longitudinal resistance prediction in EGFR-mutant non-small-cell lung cancer on osimertinib
Abhijoy Sarkar, Aarchi Singh Thakur · arXiv · Jun 9, 2026
Resistance to first-line osimertinib in EGFR-mutant non-small-cell lung cancer (NSCLC) is the canonical example of predictable clonal evolution under therapeutic pressure, yet no public benchmark exists for training or evaluating computatio…
Data Synthesis and Parameter-Efficient Fine-Tuning for Low-Resource NMT: A Case Study on Q'eqchi' Mayan
Alexander Chulzhanov, Soeren Eberhardt, Arjun Mukherjee · arXiv · Jun 8, 2026
Neural machine translation for digitally low-resource Indigenous languages is often hindered by extreme data scarcity, prompting reliance on extractive web-scraping. To ensure data sovereignty, this study introduces a data synthesis methodo…
