Xuhui (Daniel) Zhan is a Data Scientist and Machine Learning Engineer with over three years of experience specializing in NLP, recommender systems, AI engineering, deployment, data science, and full-stack software development. Currently pursuing a Master’s in Data Science at Vanderbilt University (GPA: 3.99/4.0), he combines robust theoretical knowledge with practical, real-world applications.
At Vanderbilt University’s AI Negotiation Lab, as Lead AI Data Science Researcher, Xuhui significantly enhanced model performance, raising human-model agreement rates from 30% to 80% and drastically reducing annotation costs by more than 99%. Additionally, he successfully trained and deployed scalable models on datasets containing over 100,000 transcripts, gaining adoption from leading institutions including MIT, CMU, and Northwestern University.
In his recent role as a Graduate Research Assistant at Vanderbilt’s Network and Data Science Lab, Xuhui pioneered the integration of Temporal Graph Attention Networks (TGAT) with Large Language Models (LLMs), exceeding prior state-of-the-art results by designing custom fusion architectures and optimizing pipelines across massive datasets.
Prior to Vanderbilt, as an Algorithm Development Engineer, Xuhui developed comprehensive ML pipelines for automated warehouse robotics, optimizing computer vision and reinforcement learning algorithms, and overseeing scalable deployments across more than 100 industrial sites.
Xuhui earned his Bachelor’s degree in Data Science with First Class Honours and Highest Distinction from Beijing Normal-Hong Kong Baptist University, showcasing his commitment to academic excellence.
MSc in Data Science, 2023-Present
Vanderbilt University
BSc in Data Science, 2018-2022
Beijing Normal-Hong Kong Baptist University
BSc in Data Science, 2018-2022
Hong Kong Baptist University
Advisor: Ray Friedman (AI Negotiation Lab)
Advisor: Tyler Derr (NDS Lab)
Developed a fusion network that integrates Graph Neural Networks (TGAT) with LLMs (Llama 3), using not token-based fusion but a highly customized knowledge-injection approach that adapts ideas from PEFT (LoRA and DoRA) and the differential transformer, enabling the LLM to understand topological information and messages for personalized generation in social-network settings.
Implemented from scratch, rewriting parts of the transformers library to meet customization needs; first on a Venmo dataset (about 3 months to collect; 3 million users), then on the Amazon Review dataset (adapted from the UCSD collection; raw data exceeds 78 GB after compression).
The approach shows potential to extend to unified multi-modality fusion.
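A minimal sketch of the knowledge-injection idea described above: rather than concatenating graph tokens into the input sequence, a LoRA-style low-rank adapter maps a graph embedding directly into the LLM's hidden space. This is an illustrative NumPy toy, not the actual implementation; the class name, dimensions, and zero-initialization scheme are assumptions for the sketch.

```python
import numpy as np

# Hypothetical dimensions for illustration; real TGAT/Llama 3 sizes differ.
D_GRAPH, D_HIDDEN, RANK = 16, 32, 4

rng = np.random.default_rng(0)

class LoRAKnowledgeInjector:
    """LoRA-style adapter: injects a graph embedding g into an LLM hidden
    state h via a low-rank update h + (g @ A) @ B, instead of token-based
    fusion (prepending graph tokens to the sequence)."""
    def __init__(self, d_graph: int, d_hidden: int, rank: int):
        self.A = rng.normal(scale=0.02, size=(d_graph, rank))  # down-projection
        self.B = np.zeros((rank, d_hidden))  # zero-init: injection starts as a no-op

    def __call__(self, hidden: np.ndarray, graph_emb: np.ndarray) -> np.ndarray:
        return hidden + graph_emb @ self.A @ self.B

injector = LoRAKnowledgeInjector(D_GRAPH, D_HIDDEN, RANK)
h = rng.normal(size=(1, D_HIDDEN))   # one transformer hidden state
g = rng.normal(size=(1, D_GRAPH))    # one TGAT node embedding
out = injector(h, g)
# With B zero-initialized (as in LoRA), the adapter is initially the identity,
# so training can deviate from the pretrained LLM only as needed.
```

Zero-initializing `B` follows the LoRA convention: the fused model starts exactly at the pretrained LLM's behavior, and the graph signal is learned gradually during fine-tuning.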
Ideas are cheap; experiments are expensive.
Graduate Teaching Fellowship
Responsible for organizing labs, creating quizzes, grading assignments, and hosting office hours.
Graduate level:
Undergrad level:
Advisor: Markus Eberl
Worked as an algorithm engineer, taking charge of all machine learning applications and efficient data analysis.
Responsibilities include: