Xuanhe Zhou (周煊赫)

[Google Scholar] [Personal Github] [Group Github]

Tenure-Track Assistant Professor at the Dept. of Computer Science, Shanghai Jiao Tong University.

Faculty member of the Zhiyuan Honors Program, Shanghai Jiao Tong University.

Assistant Researcher at Shanghai AI Laboratory (with Conghui He).

Technical Consultant of 4Paradigm.

Contact Email: [email protected] ; [email protected] (Prof. Fan Wu)

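Xuanhe Zhou is a tenure-track assistant professor at the School of Computer Science, Shanghai Jiao Tong University, a part-time assistant researcher at Shanghai AI Laboratory, and a technical consultant at 4Paradigm. His research focuses on AI-integrated data analytics, data infrastructure for ML/LLM, and autonomous database systems (AI4DB). He has received one First-Class Science and Technology Award from the Institute of Communications (通信学会) and two paper awards at CCF-A international conferences. He has published dozens of papers in CCF-A conferences and journals such as SIGMOD, VLDB, NIPS, and TKDE, including highly cited NIPS, VLDB, and ICDE papers from the past five years, and his work has been included in courses at Carnegie Mellon University, Cornell University, and other universities. His Google Scholar citation count is close to 3,000. His honors include the ACM Jim Gray Doctoral Dissertation Honorable Mention (first in Mainland China)【1】, the VLDB 2023 Best Industry Paper Runner-up Award (first author)【2】, the CCF Doctoral Dissertation Award, the World Artificial Intelligence Conference (WAIC) Yunfan Award【3】, the Microsoft Research Asia Fellowship, the ByteDance Fellowship, and the Outstanding Scholarship of Tsinghua University (清华特奖). His representative work, OpenMLDB, has been deployed in 4Paradigm's Sage (先知) AIOS platform【4】 and is used at scale in over a hundred real-world scenarios in finance, e-commerce, energy, and other sectors.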

【1】 https://mp.weixin.qq.com/s/OsjD01C1E_-Cel3YkCpFpA

【2】 https://mp.weixin.qq.com/s/FJWva702Q8QCjBq8gm501Q

【3】 https://mp.weixin.qq.com/s/uzOpuSwU6VM4SQdgnkQSqQ

【4】 https://mp.weixin.qq.com/s/UtvvnDz-C-JTkNWhfYrh9g

Student Collaborators

  • Wei Zhou (PhD)

  • Zirui Tang (PhD)

  • Bangrui Xu (PhD)

  • Xuzhou Zhu (PhD)

  • Xufei Wu (PhD)

  • Haodong Chen (PhD, w/ Fan Wu)

  • Shaokun Han (M.S.)

  • Haoyu Zhao (M.S.)

  • Jun Zhou (M.S.)

  • Yumou Liu (M.S.)

  • Shuhang Lu (M.S.)

  • Changdong Liu (Postdoc, w/ Fan Wu)

  • Research Assistant List (coming soon 🪐)

We build data infrastructure (e.g., data annotation and synthesis platforms) for the LLM era, empowering intelligent applications (e.g., agentic AI) and system optimization [see My Resume].

We are always looking for strong, self-motivated PhD students, master's students, and undergraduate interns. Our team has enough GPUs (e.g., H100s) for experiments.

  • (Fall 2027 admission) 1–2 PhD positions, 1 Master position

Build the data foundation in the AGI era together 🌤

News

Sep 29, 2025 | The MinerU2.5 Technical Report is released. MinerU is the world’s leading open-source data parsing engine for LLM/RAG/Agent (Led by Conghui He & Bin Wang) 🎉

July 27, 2025 | World Artificial Intelligence Conf. (WAIC) Yunfan Award (one of 15 global recipients) 🎉

July 23, 2025 | 4+ Papers (NIPS, SIGMOD, VLDB) are selected into the 2025 Highly-Cited List (Google Scholar) 🌤

April 20, 2025 | ACM Jim Gray Dissertation Honorable Mention Award (in memory of the Turing Award winner Jim Gray; first in Mainland China) 🎊

Nov 25, 2024 | First-Class Science and Technology Award for "An Open-Source Database for Large-Scale Enterprise Applications" (Led by Prof. Li) 🎉

Nov 4, 2024 | BMTools (with 2.9k stars on GitHub) is accepted by CSUR 🌤

August 20, 2024 | BIRD-SQL is adopted by OpenAI to promote their finetuning service [news] 🎉

July 24, 2024 | D-Bot is Now Sponsored by Azure AI 🎉

June 9, 2024 | Two Papers (VLDB, ICDE) are selected into the 2024 Highly-Cited List (2019-2023) 🌤

July 24, 2023 | FEBench wins the VLDB 2023 Best Industry Paper Runner-up Award [website] 🌤

October 20, 2022 | Microsoft Research Asia Fellow [news] 🎉

Surveys

A Survey of LLM × DATA. https://github.com/weAIDB/awsome-data-llm 🔥

Xuanhe Zhou, Junxuan He, Wei Zhou, Haodong Chen, Zirui Tang, Haoyu Zhao, Xin Tong, Guoliang Li, Youmin Chen, Jun Zhou, Zhaojun Sun, Binyuan Hui, Shuo Wang, Conghui He, Zhiyuan Liu, Jingren Zhou, Fan Wu. [Affiliations: SJTU, Tsinghua, Shanghai AI Lab, Alibaba]

LLM/Agent-as-Data-Analyst: A Survey. [paper]

Zirui Tang, Weizheng Wang, Zihang Zhou, Yang Jiao, Bangrui Xu, Boyu Niu, *Xuanhe Zhou, Guoliang Li, Yeye He, Wei Zhou, Yitong Song, Cheng Tan, Bin Wang, Conghui He, Xiaoyang Wang, Fan Wu. [Affiliations: SJTU, Tsinghua, Microsoft Research, Shanghai AI Lab, Fudan]

Example Projects

🪐 OpenMLDB (https://openmldb.ai)

🪐 MinerU (https://mineru.net/)

🪐 openGauss (https://opengauss.org)

Peer-Reviewed Publications

🟡 Data-Centric AI & Data Infra 🔴 Data Analytics (for the future) 🟤 Data Management (for the future)

(*indicates equal contribution; ^indicates corresponding author)

2026

(SIGMOD) ST-Raptor: LLM-Powered Semi-Structured Table Question Answering. [paper] [code] 🔴

Zirui Tang, Boyu Niu, ^Xuanhe Zhou, Boxiu Li, Wei Zhou, Jiannan Wang, Guoliang Li, Xinyi Zhang, Fan Wu.

2025

(SIGMOD) OpenMLDB: A Real-Time Relational Data Feature Computation System for Online ML. [paper] [code] 🟡 https://openmldb.ai Over 1.6k stars

Xuanhe Zhou, Wei Zhou, Liguo Qi, Hao Zhang, Dihao Chen, ^Bingsheng He, Mian Lu, Guoliang Li, Fan Wu, Yuqiang Chen.

(SIGMOD) Cracking SQL Barriers: An LLM-based Dialect Translation System. [paper] [code] 🟤

Wei Zhou, Yuyang Gao, ^Xuanhe Zhou, ^Guoliang Li.

(SIGMOD Demo) D-Bot: An LLM-Powered DBA Copilot.

Zhaoyan Sun, Xuanhe Zhou, Jianming Wu, Wei Zhou, ^Guoliang Li.

(NeurIPS) PARROT: A Benchmark for Evaluating LLMs in Cross-System SQL Translation. [paper] [code] [leaderboard] 🟤 (4.75/5)

Wei Zhou, Guoliang Li, Haoyu Wang, Yuxing Han, Xufei Wu, Fan Wu, ^Xuanhe Zhou.

(VLDB) R-Bot: An LLM-based Query Rewrite System. [code]

Zhaoyan Sun, Xuanhe Zhou, Guoliang Li.

(KDD) Revolutionizing Database QA with Large Language Models: Comprehensive Benchmark and Evaluation. [paper] [code]

Yihang Zheng, Bo Li, Zhenghao Lin, Yi Luo, Xuanhe Zhou, Chen Lin, Jinsong Su, Guoliang Li, Shifu Li.

(ICDE) MemQ: A Graph-Based Query Memory Prediction Framework for Effective Workload Scheduling. [code]

Yang Wu, Xuanhe Zhou, Xiaoguang Li, Feifei Li, Yong Zhang.

2024

(SIGMOD) Robustness of Updatable Learning-based Index Advisors. [paper] [code]

Yihang Zheng, Chen Lin, Xian Lyu, Xuanhe Zhou, Guoliang Li, Tianqing Wang.

(VLDB) D-Bot: Database Diagnosis System using Large Language Models. [paper] [code] BenchCouncil Top 100 Open Achievements

Xuanhe Zhou, Guoliang Li, Zhaoyan Sun, Zhiyuan Liu, Weize Chen, Jianming Wu, Jiesi Liu, Ruohang Feng, Guoyang Zeng.

(VLDB) Breaking It Down: An In-depth Study of Index Advisors [EA&B]. [code] [pypi] Direct Accept with Shepherding

Wei Zhou*, Chen Lin*, Xuanhe Zhou*, Guoliang Li.

(VLDB Demo) Chat2Data: An Interactive Data Analysis System with RAG, Vector Databases and LLMs. [paper] [code]

Xinyang Zhao, Xuanhe Zhou, Guoliang Li.

(VLDB Tutorial) LLM for Data Management. [slides] [repo]

Guoliang Li, Xuanhe Zhou, Xinyang Zhao.

(ICDE) TRAP: Tailored Robustness Assessment for Index Advisors via Adversarial Perturbation. [paper] [code]

Wei Zhou, Chen Lin, Xuanhe Zhou, Guoliang Li.

(TKDE) Automatic Index Tuning: A Survey. [paper]

Yang Wu*, Xuanhe Zhou*, Yong Zhang, Guoliang Li.

(CSUR) Tool Learning with Foundation Models. [paper] [repo] 2.9k stars

2023

(SIGMOD) Grep: A Graph Learning Based Database Partitioning System. [paper] [code]

Xuanhe Zhou, Guoliang Li, Wei Guo, Luyang Liu.

(VLDB Industry) FEBench: A Benchmark for Real-Time Relational Data Feature Extraction. [paper] [code] Best Industry Paper Runner-up Award

Xuanhe Zhou*, Cheng Chen*, Kunyi Li, Bingsheng He, Mian Lu, Qiaosheng Liu, Wei Huang, Guoliang Li, Zhao Zheng, Yuqiang Chen.

(VLDB Demo) A Learned Query Rewrite System. [demo]

Xuanhe Zhou, Guoliang Li, Jianming Wu, Jiesi Liu, Zhaoyan Sun, Xinning Zhang.

(VLDB) Learned Index: A Comprehensive Experimental Evaluation. [paper] [code]

Zhaoyan Sun, Xuanhe Zhou, Guoliang Li.

(ICDE) DBAugur: An Adversarial-based Trend Forecasting System for Diversified Workloads. [paper] [code]

Yuanning Gao, Xiuqi Huang, Xuanhe Zhou, Xiaofeng Gao, Guoliang Li.

(NeurIPS) Can LLM Already Serve as A Database Interface? A BIg Bench for Large-Scale Database Grounded Text-to-SQLs. [paper] [code] Spotlight

Led by Damo Academy (Binyuan Hui, Yongbin Li, et al) and University of Hong Kong (Jinyang Li)

(TKDE) Automatic Database Knob Tuning: A Survey. [paper] [code]

Xinyang Zhao*, Xuanhe Zhou*, Guoliang Li.

(DSE) DB-GPT: Large Language Model Meets Database. [paper] [code]

Xuanhe Zhou*, Zhaoyan Sun*, Guoliang Li.

2022

(SIGMOD) LearnedSQLGen: Constraint-Aware SQL Generation using Reinforcement Learning. [paper] [code]

Lixi Zhang, Chengliang Chai, Xuanhe Zhou, Guoliang Li.

(VLDB) A Learned Query Rewrite System using Monte Carlo Tree Search. [paper] [code]

Xuanhe Zhou, Guoliang Li, Chengliang Chai, Jianhua Feng.

(ICDE) AutoIndex: An Incremental Index Management System for Dynamic Workloads. [paper] [code]

Xuanhe Zhou, Luyang Liu, Wenbo Li, Lianyuan Jin, Tianqing Wang, Shifu Li.

(ICDE) Adaptive Code Learning for Spark Configuration Tuning. [paper] [code]

Chen Lin, Junqing Zhuang, Jiadong Feng, Hui Li, Xuanhe Zhou, Guoliang Li.

(ICDE Tutorial) Machine Learning for Data Management: A System View. [paper] [slide]

Guoliang Li, Xuanhe Zhou.

2021

(VLDB Industry) openGauss: An Autonomous Database System. [paper] [code] Over 2.7k stars

Guoliang Li, Xuanhe Zhou (first student author), Ji Sun, Xiang Yu, Yue Han, Lianyuan Jin, Wenbo Li, Tianqing Wang, Shifu Li.

(VLDB Demo) DBMind: A Self-Driving Platform in openGauss. [paper] [code]

Xuanhe Zhou, Lianyuan Jin, Ji Sun, Xinyang Zhao, Xiang Yu, Shifu Li, Tianqing Wang, et al.

(SIGMOD Tutorial) AI Meets Database: AI4DB and DB4AI. [paper] [slide]

Guoliang Li, Xuanhe Zhou, Lei Cao, Chengliang Chai.

(VLDB Tutorial) Machine Learning for Databases. [paper]

Guoliang Li, Xuanhe Zhou, Lei Cao.

(Journal of Software) Survey of Data Management Techniques for Supporting Artificial Intelligence. [paper]

Guoliang Li, Xuanhe Zhou.

2020

(VLDB) Query Performance Prediction for Concurrent Queries using Graph Embedding. [code] [paper]

Xuanhe Zhou, Ji Sun, Guoliang Li, Jianhua Feng.

(TKDE) Database meets artificial intelligence: A survey. [paper]

Xuanhe Zhou, Chengliang Chai, Guoliang Li, Ji Sun.

(Chinese Journal of Computers) Overview of database technology based on machine learning. [paper]

Guoliang Li, Xuanhe Zhou, Ji Sun, Xiang Yu, Haitao Yuan, Jiabin Liu, Yue Han.

2019

(VLDB) QTune: A Query-Aware Database Tuning System with Deep Reinforcement Learning. [paper]

Guoliang Li, Xuanhe Zhou, Shifu Li, Bo Gao.

(Data Eng.) Xuanyuan: An AI-Native Database. [paper]

Guoliang Li, Xuanhe Zhou, Sihao Li.

(Journal of Computer) A Survey of Machine-Learning-based Database Techniques. [paper]

Guoliang Li, Xuanhe Zhou, Ji Sun, Xiang Yu, Haitao Yuan, Jiabin Liu, Yue Han.

Honors and Awards

2025 - WAIC Yunfan Award (15 global recipients)

2025 - ACM Jim Gray Dissertation Honorable Mention Award (first in Mainland China)

2024 - CCF Doctoral Dissertation Award (CCF 优博, 10 recipients in China)

2024 - Tsinghua Doctoral Dissertation Award (清华优博)

2023 - VLDB Best Industry Paper Runner-up Award (first author)

2023 - Top 100 Open Source Achievements by BenchCouncil

2022 - Outstanding Scholarship of Tsinghua University (清华特奖)

2022 - ByteDance Fellowship (10 recipients in China)

2022 - MSRA Fellowship (12 recipients in the Asia-Pacific region)

2021 - Zhongshimo Fellowship (钟士模奖学金)

2021 - Apple Scholars in AI/ML Nomination

2023, 2017 - National Scholarship

Services

Journals - TKDE, VLDB Journal, ACM CSUR, Transactions on Computers, Journal of Big Data

Teaching

Spring 2025, Data Structures, SJTU, Teaching

Spring 2025, Database Systems, SJTU, Teaching

2019-2022 Database Systems, THU/30240262
