-
microsoft/OmniParser
Image-Text-to-Text • Updated • 312 • 1.71k -
Maya: An Instruction Finetuned Multilingual Multimodal Model
Paper • 2412.07112 • Published • 29 -
Visual Representation Alignment for Multimodal Large Language Models
Paper • 2509.07979 • Published • 84 -
ModernVBERT: Towards Smaller Visual Document Retrievers
Paper • 2510.01149 • Published • 33
Frank Sommers PRO
fsommers
AI & ML interests
None yet
Recent Activity
upvoted a paper about 3 hours ago
Unified Multimodal Autoregressive Modeling with Shared Context-Visual Tokenizer is Key to Unification upvoted a collection about 3 hours ago
Qwen3.6 upvoted an article 2 days ago
DenseOn with the LateOn: Open State-of-the-Art Single and Multi-Vector Models