-
iVideoGPT: Interactive VideoGPTs are Scalable World Models
Paper • 2405.15223 • Published • 17 -
Meteor: Mamba-based Traversal of Rationale for Large Language and Vision Models
Paper • 2405.15574 • Published • 55 -
An Introduction to Vision-Language Modeling
Paper • 2405.17247 • Published • 91 -
Matryoshka Multimodal Models
Paper • 2405.17430 • Published • 35
Collections
Discover the best community collections!
Collections including paper arxiv:2406.13923
-
InstructDoc: A Dataset for Zero-Shot Generalization of Visual Document Understanding with Instructions
Paper • 2401.13313 • Published • 5 -
BAAI/Bunny-v1_0-4B
Text Generation • 4B • Updated • 22 • 10 -
What matters when building vision-language models?
Paper • 2405.02246 • Published • 104 -
Jina CLIP: Your CLIP Model Is Also Your Text Retriever
Paper • 2405.20204 • Published • 37
-
Dissecting In-Context Learning of Translations in GPTs
Paper • 2310.15987 • Published • 6 -
Monolingual or Multilingual Instruction Tuning: Which Makes a Better Alpaca
Paper • 2309.08958 • Published • 2 -
X-LLM: Bootstrapping Advanced Large Language Models by Treating Multi-Modalities as Foreign Languages
Paper • 2305.04160 • Published • 2 -
Ziya-VL: Bilingual Large Vision-Language Model via Multi-Task Instruction Tuning
Paper • 2310.08166 • Published • 1
-
iVideoGPT: Interactive VideoGPTs are Scalable World Models
Paper • 2405.15223 • Published • 17 -
Meteor: Mamba-based Traversal of Rationale for Large Language and Vision Models
Paper • 2405.15574 • Published • 55 -
An Introduction to Vision-Language Modeling
Paper • 2405.17247 • Published • 91 -
Matryoshka Multimodal Models
Paper • 2405.17430 • Published • 35
-
Super-NaturalInstructions: Generalization via Declarative Instructions on 1600+ NLP Tasks
Paper • 2204.07705 • Published • 2 -
Knowledge-Driven CoT: Exploring Faithful Reasoning in LLMs for Knowledge-intensive Question Answering
Paper • 2308.13259 • Published • 2 -
MAmmoTH: Building Math Generalist Models through Hybrid Instruction Tuning
Paper • 2309.05653 • Published • 11 -
MetaMath: Bootstrap Your Own Mathematical Questions for Large Language Models
Paper • 2309.12284 • Published • 19
-
iVideoGPT: Interactive VideoGPTs are Scalable World Models
Paper • 2405.15223 • Published • 17 -
Meteor: Mamba-based Traversal of Rationale for Large Language and Vision Models
Paper • 2405.15574 • Published • 55 -
An Introduction to Vision-Language Modeling
Paper • 2405.17247 • Published • 91 -
Matryoshka Multimodal Models
Paper • 2405.17430 • Published • 35
-
iVideoGPT: Interactive VideoGPTs are Scalable World Models
Paper • 2405.15223 • Published • 17 -
Meteor: Mamba-based Traversal of Rationale for Large Language and Vision Models
Paper • 2405.15574 • Published • 55 -
An Introduction to Vision-Language Modeling
Paper • 2405.17247 • Published • 91 -
Matryoshka Multimodal Models
Paper • 2405.17430 • Published • 35
-
InstructDoc: A Dataset for Zero-Shot Generalization of Visual Document Understanding with Instructions
Paper • 2401.13313 • Published • 5 -
BAAI/Bunny-v1_0-4B
Text Generation • 4B • Updated • 22 • 10 -
What matters when building vision-language models?
Paper • 2405.02246 • Published • 104 -
Jina CLIP: Your CLIP Model Is Also Your Text Retriever
Paper • 2405.20204 • Published • 37
-
Super-NaturalInstructions: Generalization via Declarative Instructions on 1600+ NLP Tasks
Paper • 2204.07705 • Published • 2 -
Knowledge-Driven CoT: Exploring Faithful Reasoning in LLMs for Knowledge-intensive Question Answering
Paper • 2308.13259 • Published • 2 -
MAmmoTH: Building Math Generalist Models through Hybrid Instruction Tuning
Paper • 2309.05653 • Published • 11 -
MetaMath: Bootstrap Your Own Mathematical Questions for Large Language Models
Paper • 2309.12284 • Published • 19
-
Dissecting In-Context Learning of Translations in GPTs
Paper • 2310.15987 • Published • 6 -
Monolingual or Multilingual Instruction Tuning: Which Makes a Better Alpaca
Paper • 2309.08958 • Published • 2 -
X-LLM: Bootstrapping Advanced Large Language Models by Treating Multi-Modalities as Foreign Languages
Paper • 2305.04160 • Published • 2 -
Ziya-VL: Bilingual Large Vision-Language Model via Multi-Task Instruction Tuning
Paper • 2310.08166 • Published • 1