8 docs tagged with "multimodal"

CheXagent

Paper title: Towards a Foundation Model for Chest X-Ray Interpretation

CLIP

Paper title: Learning Transferable Visual Models From Natural Language Supervision

DALLE

DALLE: from text to image.

LLaVA

Paper title: Visual Instruction Tuning

VisionLLM

Paper title: VisionLLM: Large Language Model is also an Open-Ended Decoder for Vision-Centric Tasks