SUMMARY
Focus-dLLM reduces the computational cost of long-context diffusion language models through confidence-guided sparse attention. It addresses the difficulty of estimating the attention importance of tokens during non-autoregressive decoding.