Latest Dialogue Systems Research Papers

The newest Dialogue Systems papers from across the field (arXiv, NeurIPS, CVPR, Nature, and more), refreshed daily and ranked by relevance. Distill AI tracks Dialogue Systems so you don’t have to: get the standout work delivered to your inbox every morning, with two-sentence summaries and the option to chat with any paper.

Recent papers

Fostering Engagement through a Latency-Optimized LLM-based Dialogue System for Multimodal ECA Responses - Supplemental Material
Kühlem, Konstantin W., Ehret, Jonathan, Kuhlen, Torsten W., Bönsch, Andrea · Zenodo (CERN European Organ... · Dec 18, 2026
Theatre of the Tech-Oppressed: Exploring algorithmic injustice through participatory performances
Babai, Najme, Duvall, Eva, Iivari, Netta, Kinnula, Marianne et al. · OpenAlex · Dec 5, 2026
As artificial intelligence (AI) becomes embedded in systems that shape everyday decisions, it is critical to engage young people in making sense of these technologies and their sociotechnical implications. This paper explores how participat…
RUMBA: Russian User Memory Benchmark
Elizaveta Shevtsova, Inna Glebkina, Mark Baushenko, Pavel Gulyaev et al. · arXiv · Jul 23, 2026
The ability to handle long-term memory in LLMs is becoming increasingly critical, yet existing benchmarks remain English-centric and rely on aggregate retrieval metrics, failing to capture interactions between long-range context, temporal i…
Word meaning co-determines vowel-inherent spectral change. A corpus-based investigation of conversational Mandarin
Xiaoyun Jin, Mirjam Ernestus, R. Harald Baayen · arXiv · Jul 23, 2026
This study investigates vowel-inherent spectral change (VISC) in spontaneous conversational Mandarin. Using the generalized additive model and word embeddings from distributional semantics, we show that, when controlling for variables such …
CultureTalk-ID: A Multi-Task Dialogue Benchmark for Cultural Commonsense in Indonesian Local Languages
Muhammad Dehan Al Kautsar, Salsabila Pranida, Bilal Elbouardi, Fajri Koto · arXiv · Jul 23, 2026
Culture is lived through conversation, yet existing Indonesian cultural commonsense benchmarks evaluate LLMs on short and isolated prompts, stripping away the dialogic context in which cultural nuances actually surface. We introduce Culture…
Memory-Driven Self-Disclosure and Relational Turning Points: A Longitudinal Multimodal Study of Human-AI Interaction
Ryuichi Sumida, Mao Saeki, Masaki Eguchi, Sadahiro Yoshikawa et al. · arXiv · Jul 16, 2026
As conversational AI systems are designed for repeated use, a central question is how a series of interactions becomes a relationship. We present a longitudinal multimodal study of a memory-augmented conversational agent (24 participants x …
Penny: Transition Network Analysis of Learner-Chatbot Interactions in Scaffolded EFL Writing
Steve Woollaston, Brendan Flanagan, Yuko Toyokawa, Hiroaki Ogata · arXiv · Jul 16, 2026
Generative AI chatbots promise to transform English as a Foreign Language (EFL) writing by providing immediate, personalised feedback. However, their pedagogical value depends on how learners engage with them - a process often treated as a …
Epistemic Stance Flexibility Probing: Measuring Prompt-Conditioned Register Shift in Large Language Models
Binwen Liu, Yilin Ren · arXiv · Jul 14, 2026
A language model may be asked either what experts believe about a contested claim or what it believes about the claim itself. A trustworthy conversational agent should distinguish these two requests and respond in different epistemic regist…
Relational Positioning as a Measurable Risk Object: History-Carried Lock-in and Self-Confabulation in Multi-Turn Human-AI Dialogue
Jihong Chen · arXiv · Jul 13, 2026
In long, multi-turn dialogue a large language model maintains an implicit relational stance toward the user, spanning from "push the user toward real-world others" to "position itself as the user's sole support." When it slides toward the l…
The Paternalistic Filter: Epistemic Injustice and Differential Refusal in LLM-Mediated History Education for Marginalized Romanian Students
Alexis Popovici, Andrei Ionascu, Adrian-Marius Dumitran · arXiv · Jul 13, 2026
As Large Language Models (LLMs) are increasingly deployed as conversational tutors, they risk institutionalizing systemic inequalities. This study presents a systematic API audit of four LLMs acting as history tutors, evaluating 1,800 respo…
FreyaTTS Technical Report
Ahmet Erdem Pamuk, Ömer Yentür, Ahmet Tunga Bayrak, Yavuz Alp Sencer Öztürk et al. · arXiv · Jul 10, 2026
We introduce Freya-TTS, a compact, tokenizer-free, Turkish-first text-to-speech model designed for highly reliable and efficient conversational synthesis. Freya-TTS is a 183.2M-parameter non-autoregressive conditional flow-matching Diffusio…
Towards Detecting Inconsistencies in End-to-end Generated TODs
Tiziano Labruna, Giovanni Bonetta, Bernardo Magnini · arXiv · Jul 10, 2026
Generative AI is profoundly transforming the core technologies behind conversational systems, shifting from component-based to end-to-end approaches. However, Large Language Models (LLMs) may still generate inconsistencies, a critical issue…
The complexities of patient-centred conversational artificial intelligence
João Matos, Olivia Buege, Donny Cheung, Gary S. Collins et al. · arXiv · Jul 9, 2026
Consumer-facing health chatbots powered by large language models (LLMs) are increasingly used for symptom assessment. However, chatbot development and evaluation often rely on cooperative, articulate, simulated patients. We analysed 2,053 r…
Improving Ad-hoc Search Effectiveness for Conversational Information Retrieval via Model Merging
Ahmed Rayane Kebir, Jose G. Moreno, Lynda Tamine · arXiv · Jul 9, 2026
Conversational information retrieval is challenging since it requires the consideration of the conversation history which potentially gives rise to topic shifts and coreference resolution across previous turns. To address these challenges, …
Cognitive-structured Multimodal Agent for Multimodal Understanding, Generation, and Editing
Feng Wang, Canmiao Fu, Zhipeng Huang, Chen Li et al. · arXiv · Jul 9, 2026
Recent unified multimodal models show a single architecture can jointly perform vision/language understanding and image generation/editing. However, they repeatedly feed all historical visual and textual inputs into a shared context window,…
Diarization-Guided Qwen-ASR Adaptation for Multilingual Two-Speaker Conversational Speech
Hao Wu, RongQi Han, Zhen Wang, Wei Liang et al. · arXiv · Jul 9, 2026
This paper describes our self-designed system for Task 1 of the MLC-SLM 2026 Challenge for multilingual two-speaker conversational speech. The system combines a modular speaker diarization front end with a challenge-adapted Qwen3-ASR-1.7B r…
Multimodal Voice Activity Projection for Turn-Taking in Social Robots with Voice-Activity-Related Pretrained Encoders
Antonio Cano, Guillermo Pérez, Luis Merino, Randy Gomez · arXiv · Jul 8, 2026
Turn-taking prediction is a key requirement for social robots involved in human-human interaction, particularly in mediator settings, where the robot must anticipate conversational dynamics rather than merely react to pauses. This work pres…
Using embeddings to predict spoken word duration and pitch in Mandarin monosyllabic words
Xiaoyun Jin, Mirjam Ernestus, R. Harald Baayen · arXiv · Jul 2, 2026
Time-normalized f0 contours of Mandarin words in conversational speech have been shown to be predictable in part from their contextualized embeddings (CEs). The present study investigates whether CEs also predict spoken word duration for 74…
Beyond Supervised Clarification: Input Rewriting with LLMs for Dialogue Discourse Parsing
Yiming Liu, Ziyue Zhang, Zhichao Xu, Xin Yu et al. · arXiv · Jul 2, 2026
Rewriting inputs to improve frozen downstream models has become a common strategy in modern NLP pipelines. Prior work on incremental dialogue discourse parsing (DDP) shows that supervised clarification models can rewrite fragmentary or unde…
DiPS: Dialogue Policy Selection for High-Stakes Persuasion Agents
Tianyi Zhang, Mousumi Das, Abrar Anwar, Jesse Thomason et al. · arXiv · Jul 2, 2026
Large Language Models (LLMs) often struggle with persuasion in high-stakes scenarios. People's individual personalities and concerns require tailored strategies rather than a one-size-fits-all approach. To address this challenge, we focus o…
Towards Developing a Multimodal Chat Assistant for University Stakeholders: RAG-based Approach
Md Abu Hanif Shaikh, Abdullah Al Shafi · arXiv · Jul 1, 2026
University stakeholders often face difficulties in accessing timely and reliable information, especially in developing countries, where there are very few intelligent support systems. Existing rule-based chatbots are unable to handle comple…
Behavior-Adaptive Conversational Agents: Toward a Fluid Personality Framework
Hasibur Rahman, Smit Desai · arXiv · Jul 1, 2026
Large language model (LLM)-based conversational agents (CAs) are now ubiquitous, creating new opportunities for AI-mediated behavior change. Their capacity to project nuanced personalities and adopt diverse metaphorical roles raises a desig…
Quantifying the Affective Gap: A Zero-Shot Evaluation of LLMs on Fine-Grained Emotion Taxonomies
Lawrence Obiuwevwi, Krzysztof J. Rechowicz, Jessica M. Johnson, Vikas Ashok et al. · arXiv · Jul 1, 2026
Emotion recognition in natural language is a foundational challenge in affective computing, with critical implications for human-computer interaction, mental health support, and conversational AI. This paper presents a rigorous, unified zer…
Persona Non Grata: LLM Persona-Driven Generations in MCQA are Unstable in Distinct Dimensions
César Guerra-Solano, Xiang Lorraine Li · arXiv · Jul 1, 2026
Persona-driven generations (PDGs) have seen prolific use in research and industry applications, where a large language model (LLM) takes on a 'persona' while completing some task. While persona expressed through free-form text (like dialogu…
StochasT: Learning with Stochastic Turn Depth for Visual Instruction Tuning
Yuan Qing, Chengzhi Mao, Boqing Gong · arXiv · Jul 1, 2026
Large Vision-Language Models (LVLMs) rely extensively on Visual Instruction Tuning (VIT) to elicit their multimodal reasoning capabilities. However, we find a discrepancy: VIT often packs multiple language tasks about the same image for con…
TRACE: State-Aware Query Processing over Temporal Evidence Graphs for Conversational Data
Maolin Wang, Yu Wang, Zichun Liu, Baiyuan Qiu et al. · arXiv · Jul 1, 2026
Conversational data is increasingly used as a persistent source of user state for long-running assistants and AI agents. However, querying this data remains challenging because conversations naturally evolve: plans are revised, preferences …
Scaling the Horizon, Not the Parameters: Reaching Trillion-Parameter Performance with a 35B Agent
Lei Bai, Zongsheng Cao, Yang Chen, Zhiyao Cui et al. · arXiv · Jun 29, 2026
We introduce Agents-A1, a 35B Mixture-of-Experts Agentic Model that reaches trillion-parameter-level performance by scaling the agent horizon. We investigate agent-horizon scaling from two perspectives: scaling long-horizon trajectories and…
TRACE: Temporal Relationship-Aware Conversational Entrainment Detection in Dyadic Speech
Sathvik Manikantan Napa Ugandhar, Hao Zhang, Alison Gunzler, Yuzhe Wang et al. · arXiv · Jun 29, 2026
With the proliferation of speech AI agents, understanding emotional entrainment in conversational interaction has become increasingly important. Emotional entrainment is shaped by social relationships and conversational context, influencing…
SIMAX: A Scalable and Interpretable Framework for Multi-Fidelity and Annotated Clinician-Patient Dialogue Simulation
Zhuhan Bao, Rui Yang, Bohao Yang, Zhiyi Liu et al. · arXiv · Jun 29, 2026
Background. The widespread deployment of ambient digital scribes is driving large-scale capture of clinician-patient dialogues. Human coding of clinical communication data remains costly, inconsistent, and difficult to scale, motivating AI-…
