Xuhui Zhan
Projects
Inverse-LLaVA: Eliminating Alignment Pre-training Through Text-to-Vision Mapping
We’ve flipped the script on multimodal fusion: instead of forcing visual features into discrete text space, we map text embeddings into continuous visual space, eliminating costly alignment pre-training while achieving competitive performance.
Website
Code
Paper
Knowledge Injection for Text Generation in Social Network
A fusion network that integrates graph neural networks (GNNs) and LLMs to generate text-related edge features in social networks via knowledge injection and parameter-efficient fine-tuning (PEFT).
Repo
AI for Negotiation
AI for coding negotiation transcripts and for simulating real-world negotiation scenarios.
Repo
Ancient Mortars Classification
A vision transformer application for classifying ancient mortars.
Repo