Latest Computer Vision Research Papers

The newest Computer Vision papers from across the field, including arXiv, NeurIPS, CVPR, Nature, and more, refreshed daily and ranked by relevance. Distill AI tracks Computer Vision so you don’t have to: get the standout work delivered to your inbox every morning, with two-sentence summaries and the option to chat with any paper.

Recent papers

Beyond the Cash Kickback: Contractual Control as Remuneration in Private-Equity-Backed Physician Practices
Maxwell Dandrea · Seton Hall University eRepo... · Jan 1, 2027
When a private equity firm acquires a physician practice, it...
The Doctrine of Christian Discovery: How Medieval Papal Authority Shaped U.S. Property Law and Continues to Deny Justice to Indigenous Nations
Colin Sumner · eYLS (Yale Law School) · Jan 1, 2027
A House of Cards: Humphrey’s Executor, Trump v. Slaughter, Latombe, and the Structural Vulnerability of the EU-US Data Privacy Framework
Noah Jaffe · Seton Hall University eRepo... · Jan 1, 2027
After the Privacy Shield dissolved, the Commission entered into talks with the US to adopt a new adequacy decision that would meet essential equivalency and the CJEU standard under GDPR Article 45. The US adopted Executive Order 14,086 (…
TaIlored ManagEment of Sleep (TIMES) for people with dementia and mild cognitive impairment in primary care in England: protocol for a feasibility cluster-randomised controlled trial
Jayden van Horik, Louise Allan, Aidin Aryankhesal, Niall M. Broomfield et al. · Aston Publications Explorer... · Dec 31, 2026
BACKGROUND: People living with dementia (PLWD) and mild cognitive impairment (MCI), and their family carers, often experience sleep disturbance which can impair daily living and care. There are limited options for effective long-term pharma…
Portfolio Optimisation under Transaction Costs with Recursive Preferences
Martin Herdegen, David Hobson, Alex S. L. Tse · UCL Discovery (University C... · Dec 31, 2026
The Merton investment-consumption problem is fundamental, both in the field of finance and in stochastic control. An important extension of the problem adds transaction costs, which is highly relevant from a financial perspective but also c…
SOAR: Smooth Online Activation Routing for Stable Neural Learning from Evolving Streams
Sizhen Niu · Open MIND · Dec 31, 2026
Online neural learning requires models that update after each incoming example, remain calibrated under distributional change, and avoid brittle gradient transmission. The original version of this work used a small static benchmark, a shall…
Hubungan Antara Pola Asuh Otoriter Dengan Agresivitas Pada Remaja Di Surabaya
Laila Nafisatus Sholicha · Universitas Airlangga Repos... · Dec 5, 2026
Laila Nafisatus Sholicha, 111911133054, Hubungan Antara Pola Asuh Otoriter Dengan Agresivitas Pada Remaja Di Surabaya (The Relationship Between Authoritarian Parenting and Aggressiveness in Adolescents in Surabaya), undergraduate thesis, Faculty of Psychology, Universitas Airlangga, 2025, xvii + 100 pages, 3 appendices. This study aims to…
Catalyst poisoning influences from various functional groups of energy carriers towards electrochemical oxidation reactions on non-noble high-entropy alloy anodes in acidic media
Tahawy Rafat, Muflihah Salma Aridha, Hara Kosuke, Ohto Tatsuhiko et al. · Institutional Repositories ... · Dec 1, 2026
Electrolytic synthesis of energy carriers using renewable energy and fuel cells that use energy carriers for regeneration are important technologies for achieving our carbon-neutral society. However, electrochemical reactions in electrolyte…
When Lawyers Attack the Rule of Law: The Rise of Autocracy in America
Scott L. Cummings · eYLS (Yale Law School) · Dec 1, 2026
Upholding the Rule of Law in a Backsliding Democracy: Taking Stands, Pushing Back, and the Role of the Public. Thank you for the honor of speaking with you today. Let me start with the obvious. We are all here because we are deeply concerned ab…
Binder-Free Mesoporous Vanadium Oxide Electrode: Anodic Electrodeposition, Characterization, and Supercapacitor Application
M Kazazi, Soheila Kazemi, Behzad Koozegar Kaleji · DOAJ (DOAJ: Directory of Op... · Oct 1, 2026
Mesoporous vanadium oxide (V2O5) was galvanostatically electrodeposited into nickel foam to obtain a binder-free electrode with a three-dimensional (3D) porous structure for supercapacitive performance. The anodic electrodeposition process …
Natur – Tiere – Krieg. Beziehungen und Wechselwirkungen von der Antike bis zur Gegenwart
TU Dortmund University · Open MIND · Sep 16, 2026
From 16 to 18 September 2026, TU Dortmund will host an international conference on “Natur – Tiere – Krieg. Beziehungen und Wechselwirkungen von der Antike bis zur Gegenwart” (Nature, Animals, War: Relationships and Interactions from Antiquity to the Present). Its aim: the hitherto often…
Tales, Traditions, and Tides: The Little Logan River and its Role in Cache Valley's History, Culture, and Conservation
Emily A Kovacic · Digital Commons - USU (Utah... · Aug 1, 2026
The Little Logan River (LLR) plays a vital role in the cultural, historical, and ecological landscape of Cache Valley, Utah. Amid growing concerns raised by the Logan River Watershed Project (LRWP), which threatens to alter the river’s envi…
Spatial Prediction Under Uncertainty: Methodological and Computational Advances in Bayesian Maximum Entropy
Kinspride K. Duah · Utah State Research and Sch... · Aug 1, 2026
Environmental decisions such as infrastructure design, water management, and snow load estimation depend on spatial data that are often incomplete or uncertain. In many cases, measurements are not exact values but ranges, reflecting limitat…
Multiaxial fatigue assessment of aluminium-to-steel welded joints
C.T. Ng, L. Susmel · White Rose Research Online ... · Aug 1, 2026
Three Essays in Discrete Choice Analysis: Valuation, Attention Level, and Spatial Heterogeneity
Marcelo Pignatari · Digital Commons - USU (Utah... · Aug 1, 2026
Understanding how people value food labels and environmental goods is essential for designing policies, incentives, and market mechanisms that improve the well-being of the population. In this study, we found that a local label indicating t…
Time integrals under the Black-Scholes-Merton and Margrabe economies
José Carlos Dias, Mark B. Shackleton, Fernando da Silva, Rafał M. Wojakowski · Lancaster EPrints (Lancaste... · Jul 31, 2026
The problem of integrating the Black, Scholes, and Merton (BSM) formula with respect to the time variable is paramount for an economist. Inspired by the real options literature, Shackleton and Wojakowski offer analytic formulae for valuing …
STRUCTURAL AND ELECTROCHEMICAL PROPERTIES OF CHITOSAN/GQDS/ZIF-67 COMPOSITES
Nessa Gina Sonia, Irwana Nainggolan, Andriayani Andriayani et al. · BIOTIK: Scientific Journal ... · Jul 31, 2026
The development of materials based on Zeolitic Imidazolate Framework-67 (ZIF-67) has attracted attention in the field of electrochemical sensors due to their large surface area, good porosity, and the presence of Co²⁺ active sites, which pl…
The role of profitability in mediating the effect of intellectual capital and ESG disclosure on firm value in Islamic banking in the Asean region
Miftahul Farichah, Eka Wahyu Hestya Budianto · Research Repository Univers... · Jul 29, 2026
The purpose of this research is to determine the role of profitability in mediating the influence of Intellectual Capital and ESG Disclosure on Firm Value. The sample used in this study is Islamic banking in the ASEAN region that published …
ATSplat: Compact Feed-forward 3D Gaussian Splatting with Adaptive Token Expansion
Cho In, Jeonghwan Cho, Mijin Yoo, Gim Hee Lee et al. · arXiv · Jul 22, 2026
3D Gaussian Splatting (3DGS) achieves high-quality novel-view synthesis by optimizing freely placed primitives in 3D and adaptively densifying them in under-reconstructed regions. However, this scene-adaptive capacity allocation is largely …
PercepCap: Video Captioner with Structured Spatio-Temporal Perception
Yifan Xu, Zihao Wang, Zhixiao Wang, Jiaming Zhang et al. · arXiv · Jul 22, 2026
Video captioning requires fine-grained spatio-temporal understanding of videos, including spatial perception of where objects are located and temporal perception of when events occur. Existing MLLMs usually generate captions directly from v…
Persian Pixel: A large-scale synthetic OCR dataset for Persian language
Pouria Mahdi, Haq Nawaz Malik · arXiv · Jul 22, 2026
Optical Character Recognition (OCR) for Persian remains substantially less mature than for Latin-script languages despite Persian being spoken by more than 110 million people across multiple countries. This gap arises from two fundamental c…
Self Gradient Forcing: Native Long Video Extrapolation
Junhao Zhuang, Shiyi Zhang, Yuxuan Bian, Yaowei Li et al. · arXiv · Jul 22, 2026
Recent autoregressive video diffusion methods are increasingly built upon Self Forcing, where the student is trained on histories produced by its own rollout rather than ground-truth video contexts. This reduces exposure bias, but the histo…
Look Less, Think Faster: Joint Token-Compute Adaptation for Multimodal LLMs
Pengcheng Wang, Zhiquan Wang, Jayoung Lee, Zhuoyan Xu et al. · arXiv · Jul 22, 2026
Multimodal Large Language Models (MLLMs) have recently demonstrated strong performance across vision-language tasks. However, their high inference cost, arising from both the large number of input visual tokens and the heavy computation of …
Test-Time Training for Modality Order Consistency in Vision-Language Models
Aditi Gupta, Yossi Gandelsman · arXiv · Jul 22, 2026
We find that vision-language models are sensitive to a specific semantically irrelevant change: the order in which the image and question are presented. Across three models and three benchmarks, image-first prompting consistently outperform…
Toward Reliable RGB-D Semantic Segmentation: Handling Missing Modalities via Condition Dropout
Xuchen Zhu, Yajuan Wei, Shuang Hao, Jiwei Jiang et al. · arXiv · Jul 22, 2026
RGB-D semantic segmentation has achieved remarkable progress, yet most models assume that RGB and depth are always available. In practice, failures or occlusions of surveillance sensors often remove one modality. Although RGB or depth alone…
Evolving Cache Schedules for Fast Diffusion Policy Inference
Siying Wang, Kangye Ji, Di Wang, Fei Cheng · arXiv · Jul 22, 2026
Diffusion policies achieve strong visuomotor control by iteratively denoising action chunks, but repeated denoising makes real-time deployment computationally demanding. Cache-based methods reduce inference cost by reusing intermediate acti…
Diverse-Intent Multi-Turn Fashion Image Retrieval
Mingqiang Tang, Haokun Wen, Meng Liu, Yupeng Hu et al. · arXiv · Jul 22, 2026
Real-world fashion search involves interactive retrieval across multiple turns. However, existing multi-turn retrieval methods are built on a restrictive assumption that every interaction follows the same attribute-editing paradigm, leaving…
Multimodal Large Language Models for Remote Sensing Image Understanding: Domain-Specific or General-Purpose?
Qiwei Ma, Chunping Qiu, Xinjun Cheng, Xiaoyu Zhang et al. · arXiv · Jul 22, 2026
The rapid development of multimodal large language models (MLLMs) has introduced a flexible paradigm for remote sensing image scene understanding (RSISU), enabling natural-language interaction with remote sensing imagery. However, a systema…
Self-supervision drives representational convergence in medical foundation models more than clinical supervision
Soroosh Tayebi Arasteh, Sebastian Ziegelmayer, Mahshad Lotfinia, Lisa Adams et al. · arXiv · Jul 22, 2026
Medical image encoders from different groups are increasingly treated as interchangeable, on the assumption that scale and clinical supervision concentrate their representations onto a shared structure. Whether this convergence is real, wha…
How Does Urban Context Relate to Residential Building Health? A Vision-POI Fusion Framework for Building-Level Housing Inspection
Kun Zhao, Helei Ren, Guilin Tang, Tianyi Chen et al. · arXiv · Jul 22, 2026
Housing-level urban physical examination is essential for identifying residential building problems and supporting targeted urban renewal. Existing automated inspection studies primarily rely on individual images and rarely examine whether …
