Latest Transfer Learning Research Papers
The newest Transfer Learning papers from across the field — arXiv, NeurIPS, CVPR, Nature, and more — refreshed daily and ranked by relevance. Distill AI tracks Transfer Learning so you don’t have to: get the standout work delivered to your inbox every morning, with 2-sentence summaries and the option to chat with any paper.
Recent papers
- A Unifying Lens on Supervised Fine-Tuning Through Target Distribution Design
Tong Xie, Yuanhao Ban, Yunqi Hong, Sohyun An et al. · arXiv · Jun 9, 2026
Supervised fine-tuning (SFT) typically maximizes the likelihood of every token in a demonstrated trajectory. However, an observed token can be non-unique, noisy, or misaligned with the model prior. Strictly fitting toward this one-hot targe…
- OncoTraj: a public benchmark for longitudinal resistance prediction in EGFR-mutant non-small-cell lung cancer on osimertinib
Abhijoy Sarkar, Aarchi Singh Thakur · arXiv · Jun 9, 2026
Resistance to first-line osimertinib in EGFR-mutant non-small-cell lung cancer (NSCLC) is the canonical example of predictable clonal evolution under therapeutic pressure, yet no public benchmark exists for training or evaluating computatio…
- Data Synthesis and Parameter-Efficient Fine-Tuning for Low-Resource NMT: A Case Study on Q'eqchi' Mayan
Alexander Chulzhanov, Soeren Eberhardt, Arjun Mukherjee · arXiv · Jun 8, 2026
Neural machine translation for digitally low-resource Indigenous languages is often hindered by extreme data scarcity, prompting reliance on extractive web-scraping. To ensure data sovereignty, this study introduces a data synthesis methodo…
- TailLoR: Protecting Principal Components in Parameter-Efficient Continual Learning
Marius Dragoi, Ioana Pintilie, Alexandra Dragomir, Antonio Barbalau et al. · arXiv · Jun 4, 2026
Parameter-efficient finetuning methods based on spectral decomposition have enabled progress in Continual Learning. In this paper we introduce TailLoR, which utilizes the singular bases U and V of the pre-trained weights as a fixed referenc…
- RREDCoT: Segment-Level Reward Redistribution for Reasoning Models
Mykyta Ielanskyi, Kajetan Schweighofer, Lukas Aichberger, Sepp Hochreiter · arXiv · Jun 4, 2026
Recent advancements in reasoning language models have been driven by Reinforcement Learning (RL) fine-tuning. Most often, these rely on the Group Relative Policy Optimization (GRPO) algorithm or modifications thereof to steer the models to …
- Towards Accurate Model Selection in Deep Unsupervised Domain Adaptation
Kaichao You, Ximei Wang, Mingsheng Long, Michael I. Jordan · arXiv (Cornell University) · Jun 3, 2026
Deep unsupervised domain adaptation (Deep UDA) methods successfully leverage rich labeled data in a source domain to boost the performance on related but unlabeled data in a target domain. However, algorithm comparison is cumbersome in Deep…
- Drifting Preference Optimization for One-Step Generative Models
Zhou Jiang, Yandong Wen, Zhen Liu · arXiv · Jun 1, 2026
One-step text-to-image generators are attractive for deployment because they generate an image with a single forward pass, but preference finetuning them remains difficult: standard alignment methods often rely on policy likelihoods, denois…
- On the Scaling of PEFT: Towards Million Personal Models of Trillion Parameters
Mind Lab: Song Cao, Vic Cao et al. · arXiv · Jun 1, 2026
Parameter-efficient fine-tuning (PEFT) is usually treated as a cheaper alternative to full fine-tuning. We study a broader role: small trainable adapters as persistent local state on top of strong shared foundation models. In this framing, …
- Efficient Test-Time Finetuning of LLMs via Convex Reconstruction and Gradient Caching
Alaa Khamis, Alaa Maalouf · arXiv · May 28, 2026
Test-time finetuning (TTFT) is a rapidly evolving paradigm that adapts a language model to each prompt by retrieving related sequences, updating the model on them, and then evaluating the prompt. However, TTFT is only practical if it is fas…
- How LoRA Remembers? A Parametric Memory Law for LLM Finetuning
Ziwen Xu, Haiwen Hong, Linsong Yu, Benglei Cui et al. · arXiv · May 28, 2026
Large Language Models (LLMs) must continuously learn and update knowledge to remain effective in dynamic real-world environments. While Low-Rank Adaptation (LoRA) is widely used for such memory updates, existing studies mainly rely on quali…
- PEFT-Arena: Understanding Parameter-Efficient Finetuning from a Stability-Plasticity Perspective
Yangyi Huang, Ruotian Peng, Zeju Qiu, Jiale Kang et al. · arXiv · May 27, 2026
Parameter-efficient finetuning (PEFT) has become the standard approach for adapting large language models, yet evaluations largely emphasize downstream accuracy while overlooking the retention of pretrained capabilities. We argue that PEFT …
- LLM Zeroth-Order Fine-Tuning is an Inference Workload
Zelin Li, Caiwen Ding · arXiv · May 27, 2026
Zeroth-order (ZO) fine-tuning is attractive for large language models because it replaces backpropagation with forward objective evaluations. Existing implementations nevertheless execute ZO algorithms inside conventional training loops, ev…
- Extrapolative Weight Averaging Reveals Correctness-Efficiency Frontiers in Code RL
Kunhao Zheng, Pierre Chambon, Juliette Decugis, Jonas Gehring et al. · arXiv · May 27, 2026
Linear interpolation between fine-tuned checkpoints has been shown to trace the Pareto front between competing objectives, but whether extrapolative weight averaging can extend such frontiers to new checkpoints useful at inference time, wit…
- Transfer Learning using 66 Diseases for Disease Forecasting Applications
Lauren J Beesley, Alexander C Murph, Dave Osthus, Lauren A Castro · arXiv · May 26, 2026
Disease forecasting models typically rely on a single data stream, making models brittle when histories are short or noisy. Recent top-performing models have shown that synthesizing multiple reporting systems for the same disease improves p…
- The Matching Principle: A Geometric Theory of Loss Functions for Nuisance-Robust Representation Learning
Vishal Rajput · arXiv · May 21, 2026
Robustness, domain adaptation, photometric and occlusion invariance, compositional generalisation, temporal robustness, alignment safety, and classical anisotropic regularisation are usually treated as separate problems with separate method…
- SeqLoRA: Bilevel Orthogonal Adaptation for Continual Multi-Concept Generation
Javad Parsa, Enis Simsar, Amir Joudaki, Thomas Hofmann et al. · arXiv · May 21, 2026
Parameter-efficient fine-tuning enables fast personalization of text-to-image diffusion models, but composing multiple custom concepts remains challenging due to representation interference. Existing modular methods either rely on expensive…
- ORBIT: Preserving Foundational Language Capabilities in GenRetrieval via Origin-Regulated Merging
Neha Verma, Nikhil Mehta, Shao-Chuan Wang, Naijing Zhang et al. · arXiv · May 12, 2026
Despite the rapid advancements in large language model (LLM) development, fine-tuning them for specific tasks often results in the catastrophic forgetting of their general, language-based reasoning abilities. This work investigates and addr…
- PET-Adapter: Test-Time Domain Adaptation for Full and Limited-Angle PET Image Reconstruction
Rüveyda Yilmaz, Yuli Wu, Johannes Stegmaier, Volkmar Schulz · arXiv · May 8, 2026
Positron Emission Tomography (PET) image reconstruction is inherently challenged by Poisson noise and physical degradation factors, which are further exacerbated in limited-angle acquisitions. While deep learning methods demonstrate promisi…
- Optimizer-Model Consistency: Full Finetuning with the Same Optimizer as Pretraining Forgets Less
Yuxing Liu, Jianyu Wang, Tong Zhang · arXiv · May 7, 2026
Optimizers play an important role in both pretraining and finetuning stages when training large language models (LLMs). In this paper, we present an observation that full finetuning with the same optimizer as in pretraining achieves a bette…
- Crafting Reversible SFT Behaviors in Large Language Models
Yuping Lin, Pengfei He, Yue Xing, Yingqian Cui et al. · arXiv · May 7, 2026
Supervised fine-tuning (SFT) induces new behaviors in large language models, yet imposes no structural constraint on how these behaviors are distributed within the model. Existing behavior interpretation methods, such as circuit attribution…
- PSK at SemEval-2026 Task 9: Multilingual Polarization Detection Using Ensemble Gemma Models with Synthetic Data Augmentation
Srikar Kashyap Pulipaka · arXiv · May 6, 2026
We present our system for SemEval-2026 Task 9: Multilingual Polarization Detection, a binary classification task spanning 22 languages. Our approach fine-tunes separate Gemma 3 models (12B and 27B parameters) per language using Low-Rank Ada…
- Adaptive Policy Selection and Fine-Tuning under Interaction Budgets for Offline-to-Online Reinforcement Learning
Alper Kamil Bozkurt, Xiaoan Xu, Shangtong Zhang, Miroslav Pajic et al. · arXiv · May 6, 2026
In offline-to-online reinforcement learning (O2O-RL), policies are first safely trained offline using previously collected datasets and then further fine-tuned for tasks via limited online interactions. In a typical O2O-RL pipeline, candida…
- Label-Efficient School Detection from Aerial Imagery via Weakly Supervised Pretraining and Fine-Tuning
Zakarya Elmimouni, Fares Fourati, Mohamed-Slim Alouini · arXiv · May 5, 2026
Accurate school detection is essential for supporting education initiatives, including infrastructure planning and expanding internet connectivity to underserved areas. However, many regions around the world face challenges due to outdated,…
- Ecologically-Constrained Task Arithmetic for Multi-Taxa Bioacoustic Classifiers Without Shared Data
Ragib Amin Nihal, Benjamin Yen, Runwu Shi, Takeshi Ashizawa et al. · arXiv · May 5, 2026
Training data for bioacoustics is scattered across taxa, regions, and institutions. Centralizing it all is often infeasible. We show that independently fine-tuned BEATs encoders can be composed into a unified 661-species classifier via task…
- On Adaptivity in Zeroth-Order Optimization
Hassan Dbouk, Nidham Gazagnadou, Matthias Reisser, Christos Louizos · arXiv · May 5, 2026
We investigate the effectiveness of adaptive zeroth-order (ZO) optimization for memory-constrained fine-tuning of large language models (LLMs). Contrary to prior claims, we show that adaptive ZO methods such as ZO-Adam offer no convergence …
- Turning the TIDE: Cross-Architecture Distillation for Diffusion Large Language Models
Gongbo Zhang, Wen Wang, Ye Tian, Li Yuan · arXiv · Apr 29, 2026
Diffusion large language models (dLLMs) offer parallel decoding and bidirectional context, but state-of-the-art dLLMs require billions of parameters for competitive performance. While existing distillation methods for dLLMs reduce inference…
- MoRFI: Monotonic Sparse Autoencoder Feature Identification
Dimitris Dimakopoulos, Shay B. Cohen, Ioannis Konstas · arXiv · Apr 29, 2026
Large language models (LLMs) acquire most of their factual knowledge during the pre-training stage, through next token prediction. Subsequent stages of post-training often introduce new facts outwith the parametric knowledge, giving rise to…
- Benchmarking Pathology Foundation Models for Breast Cancer Survival Prediction
Fredrik K. Gustafsson, Constance Boissin, Johan Vallon-Christersson, David A. Clifton et al. · arXiv · Apr 27, 2026
Pathology foundation models (PFMs) have recently emerged as powerful pretrained encoders for computational pathology, enabling transfer learning across a wide range of downstream tasks. However, systematic comparisons of these models for cl…
- Zero-Shot Morphological Discovery in Low-Resource Bantu Languages via Cross-Lingual Transfer and Unsupervised Clustering
Hillary Mutisya, John Mugane · arXiv · Apr 24, 2026
We present a method for discovering morphological features in low-resource Bantu languages by combining cross-lingual transfer learning with unsupervised clustering. Applied to Giriama (nyf), a language with only 91 labeled paradigms, our p…
- Fine-Tuning Regimes Define Distinct Continual Learning Problems
Paul-Tiberiu Iordache, Elena Burceanu · arXiv · Apr 23, 2026
Continual learning (CL) studies how models acquire tasks sequentially while retaining previously learned knowledge. Despite substantial progress in benchmarking CL methods, comparative evaluations typically keep the fine-tuning regime fixed…