HuggingFace · Shared via Distill AI

microsoft/Florence-2-large

SUMMARY

Microsoft's Florence-2-large takes an image and a text prompt as input and generates text in response. Its image-text-to-text architecture covers visual question answering, image captioning, and other multimodal understanding tasks. #cv #ai #huggingface
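As a rough illustration of the image-plus-prompt-in, text-out flow described above, here is a minimal sketch using the Hugging Face transformers library. It is an assumption-laden example, not the model authors' reference code: the task tokens such as <CAPTION> and the post_process_generation call follow the model card's published usage as best recalled, and build_prompt and run_florence are hypothetical helper names introduced only for this example.

```python
from typing import Any

MODEL_ID = "microsoft/Florence-2-large"

# Task tokens assumed from the model card's usage notes; the summary above
# does not list them, so treat this set as illustrative, not exhaustive.
KNOWN_TASKS = {
    "<CAPTION>",
    "<DETAILED_CAPTION>",
    "<OD>",
    "<OCR>",
    "<CAPTION_TO_PHRASE_GROUNDING>",
}


def build_prompt(task: str, text: str = "") -> str:
    """Hypothetical helper: join a task token and optional extra text."""
    if task not in KNOWN_TASKS:
        raise ValueError(f"unknown task token: {task!r}")
    return task + text


def run_florence(image_path: str, task: str = "<CAPTION>", text: str = "") -> Any:
    """Hypothetical helper: run one image + prompt through the model."""
    # Heavy imports stay local so build_prompt works without them installed.
    from PIL import Image
    from transformers import AutoModelForCausalLM, AutoProcessor

    # The model ships custom code on the Hub, hence trust_remote_code=True.
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, trust_remote_code=True)
    processor = AutoProcessor.from_pretrained(MODEL_ID, trust_remote_code=True)

    image = Image.open(image_path).convert("RGB")
    inputs = processor(
        text=build_prompt(task, text), images=image, return_tensors="pt"
    )
    generated_ids = model.generate(
        input_ids=inputs["input_ids"],
        pixel_values=inputs["pixel_values"],
        max_new_tokens=256,
        num_beams=3,
    )
    # Keep special tokens: the post-processing step parses them.
    raw = processor.batch_decode(generated_ids, skip_special_tokens=False)[0]
    return processor.post_process_generation(
        raw, task=task, image_size=(image.width, image.height)
    )
```

Calling run_florence("photo.jpg") with a local image path (the filename here is a placeholder) would download the weights on first use and return the parsed result for the chosen task. Keeping the prompt construction in its own function makes it easy to check the task token before paying the cost of loading the model.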

