Xuanhe Zhou (周煊赫)

[Google Scholar] [Personal Github] [Group Github]

Tenure-Track Assistant Professor at the Dept. of Computer Science, Shanghai Jiao Tong University.

Faculty of Zhiyuan Honors Program (for selected 1st-year undergraduate students).

I am committed to building data infrastructure for the AI era (e.g., data computation for online ML, data preparation for LLMs), and to empowering both upper-level applications (e.g., analytics and QA tasks) and underlying systems [see My Resume].

I received my Ph.D. from the Department of Computer Science at Tsinghua University, advised by Prof. Guoliang Li.

We have published papers in top-tier data management and AI conferences and journals, with 2000+ citations on Google Scholar and 2 main conference awards (SIGMOD 2025 | VLDB 2023).

We maintain highly-starred paper lists on GitHub (LLM✖️Data [link], AI✖️Data [link]).

Contact Email: zhouxh@cs.sjtu.edu.cn ; fwu@cs.sjtu.edu.cn (Prof. Fan Wu)

I am actively seeking strong, self-motivated PhD students, master's students, and undergraduate interns. Our team has enough GPUs (e.g., H100s) for experiments.

Build the data foundation in the AGI era together 🌤

News

April 20, 2025 | ACM Jim Gray Dissertation Honorable Mention Award (an honor in memory of Jim Gray, the Turing Award winner known for his key contributions to data management)

Nov 25, 2024 | 1st Level Science and Technology Award for "An Open-Source Database for Large-Scale Enterprise Applications" (Led by Prof. Li) 🎉

Nov 4, 2024 | BMTools (with 2.9k stars on GitHub) is accepted by CSUR 🌤

August 20, 2024 | BIRD-SQL is adopted by OpenAI to showcase its fine-tuning service [news] 🎉

July 24, 2024 | D-Bot is Now Sponsored by Azure AI 🎉

June 9, 2024 | Two Papers (VLDB, ICDE) are selected for the 2024 Highly-Cited List (2019-2023) 🌤

July 24, 2023 | FEBench wins the [website] 🌤

October 20, 2022 | Microsoft Research Asia Fellow [news] 🎉

Surveys

A Survey of LLM × DATA. https://github.com/weAIDB/awsome-data-llm 🔥

Xuanhe Zhou, Junxuan He, Wei Zhou, Haodong Chen, Zirui Tang, Haoyu Zhao, Xin Tong, Guoliang Li, Youmin Chen, Jun Zhou, Zhaojun Sun, Binyuan Hui, Shuo Wang, Conghui He, Zhiyuan Liu, Jingren Zhou, Fan Wu. [Affiliations: SJTU, Tsinghua, Shanghai AI Lab, Alibaba]

Representative Projects

👉 OpenMLDB: Real-Time Feature Computation for Online ML (https://openmldb.ai)

An open-source machine learning system that computes consistent features for training and inference.

[system, SIGMOD 2025] [benchmark, VLDB 2023]

👉 D-Bot: LLM-Based DBA Copilot (http://dbgpt.dbmind.cn)

An LLM-based administrator that can acquire maintenance experience from textual sources and provide reasonable, well-founded, and timely optimization advice for cloud instances.

[system, VLDB 2024] [demo, SIGMOD 2025]

👉 DBMind: A Self-Driving Database Platform (with openGauss)

Full-process autonomous database operation and maintenance capabilities, e.g., anomaly detection, root cause analysis, slow SQL optimization, index recommendation, and fault self-repair.

Students

  • Wei Zhou (PhD)

  • Zirui Tang (PhD)

  • Haodong Chen (PhD, w/ Fan Wu)

  • Changdong Liu (Postdoc, w/ Fan Wu)

  • Shaokun Han (M.S.)

  • Haoyu Zhao (M.S.)

  • Jun Zhou (M.S.)

Peer-Reviewed Publications

(*indicates equal contribution; ^indicates corresponding author)

2025

(SIGMOD) OpenMLDB: A Real-Time Relational Data Feature Computation System for Online ML. [paper] [code] https://openmldb.ai Over 1.6k stars

Xuanhe Zhou, Wei Zhou, Liguo Qi, Hao Zhang, Dihao Chen, ^Bingsheng He, Mian Lu, Guoliang Li, Fan Wu, Yuqiang Chen.

(SIGMOD) Cracking SQL Barriers: An LLM-based Dialect Translation System. [paper] [code]

Wei Zhou, Yuyang Gao, ^Xuanhe Zhou, ^Guoliang Li.

(SIGMOD Demo) D-Bot: An LLM-Powered DBA Copilot.

Zhaoyan Sun, Xuanhe Zhou, Jianming Wu, Wei Zhou, ^Guoliang Li.

(KDD) Revolutionizing Database QA with Large Language Models: Comprehensive Benchmark and Evaluation. [paper] [code]

Yihang Zheng, Bo Li, Zhenghao Lin, Yi Luo, Xuanhe Zhou, Chen Lin, Jinsong Su, Guoliang Li, Shifu Li.

(ICDE) MemQ: A Graph-Based Query Memory Prediction Framework for Effective Workload Scheduling. [code]

Yang Wu, Xuanhe Zhou, Xiaoguang Li, Feifei Li, Yong Zhang.

2024

(SIGMOD) Robustness of Updatable Learning-based Index Advisors. [paper] [code]

Yihang Zheng, Chen Lin, Xian Lyu, Xuanhe Zhou, Guoliang Li, Tianqing Wang.

(VLDB) D-Bot: Database Diagnosis System using Large Language Models. [paper] [code] BenchCouncil Top 100 Open Achievements

Xuanhe Zhou, Guoliang Li, Zhaoyan Sun, Zhiyuan Liu, Weize Chen, Jianming Wu, Jiesi Liu, Ruohang Feng, Guoyang Zeng.

(VLDB) Breaking It Down: An In-depth Study of Index Advisors [EA&B]. [code] [pypi] Direct Accept with Shepherding

Wei Zhou*, Chen Lin*, Xuanhe Zhou*, Guoliang Li.

(VLDB Demo) Chat2Data: An Interactive Data Analysis System with RAG, Vector Databases and LLMs. [paper] [code]

Xinyang Zhao, Xuanhe Zhou, Guoliang Li.

(VLDB Tutorial) LLM for Data Management. [slides] [repo]

Guoliang Li, Xuanhe Zhou, Xinyang Zhao.

(ICDE) TRAP: Tailored Robustness Assessment for Index Advisors via Adversarial Perturbation. [paper] [code]

Wei Zhou, Chen Lin, Xuanhe Zhou, Guoliang Li.

(TKDE) Automatic Index Tuning: A Survey. [paper]

Yang Wu*, Xuanhe Zhou*, Yong Zhang, Guoliang Li.

(CSUR) Tool Learning with Foundation Models. [paper] [repo] 2.9k stars

2023

(SIGMOD) Grep: A Graph Learning Based Database Partitioning System. [paper] [code]

Xuanhe Zhou, Guoliang Li, Wei Guo, Luyang Liu.

(VLDB Industry) FEBench: A Benchmark for Real-Time Relational Data Feature Extraction. [paper] [code] Best Industry Paper Runner-Up Award

Xuanhe Zhou*, Cheng Chen*, Kunyi Li, Bingsheng He, Mian Lu, Qiaosheng Liu, Wei Huang, Guoliang Li, Zhao Zheng, Yuqiang Chen.

(VLDB Demo) A Learned Query Rewrite System. [demo]

Xuanhe Zhou, Guoliang Li, Jianming Wu, Jiesi Liu, Zhaoyan Sun, Xinning Zhang.

(VLDB) Learned Index: A Comprehensive Experimental Evaluation. [paper] [code]

Zhaoyan Sun, Xuanhe Zhou, Guoliang Li.

(ICDE) DBAugur: An Adversarial-based Trend Forecasting System for Diversified Workloads. [paper] [code]

Yuanning Gao, Xiuqi Huang, Xuanhe Zhou, Xiaofeng Gao, Guoliang Li.

(NeurIPS) Can LLM Already Serve as A Database Interface? A BIg Bench for Large-Scale Database Grounded Text-to-SQLs. [paper] [code] 🖐🏻 Spotlight

Led by Damo Academy (Binyuan Hui, Yongbin Li, et al.) and University of Hong Kong (Jinyang Li)

(TKDE) Automatic Database Knob Tuning: A Survey. [paper] [code]

Xinyang Zhao*, Xuanhe Zhou*, Guoliang Li.

(DSE) DB-GPT: Large Language Model Meets Database. [paper] [code]

Xuanhe Zhou*, Zhaoyan Sun*, Guoliang Li.

2022

(SIGMOD) LearnedSQLGen: Constraint-Aware SQL Generation using Reinforcement Learning. [paper] [code]

Lixi Zhang, Chengliang Chai, Xuanhe Zhou, Guoliang Li.

(VLDB) A Learned Query Rewrite System using Monte Carlo Tree Search. [paper] [code]

Xuanhe Zhou, Guoliang Li, Chengliang Chai, Jianhua Feng.

(ICDE) AutoIndex: An Incremental Index Management System for Dynamic Workloads. [paper] [code]

Xuanhe Zhou, Luyang Liu, Wenbo Li, Lianyuan Jin, Tianqing Wang, Shifu Li.

(ICDE) Adaptive Code Learning for Spark Configuration Tuning. [paper] [code]

Chen Lin, Junqing Zhuang, Jiadong Feng, Hui Li, Xuanhe Zhou, Guoliang Li.

(ICDE Tutorial) Machine Learning for Data Management: A System View. [paper] [slide]

Guoliang Li, Xuanhe Zhou.

2021

(VLDB Industry) openGauss: An Autonomous Database System. [paper] [code] Over 2.7k stars

Guoliang Li, Xuanhe Zhou (first student author), Ji Sun, Xiang Yu, Yue Han, Lianyuan Jin, Wenbo Li, Tianqing Wang, Shifu Li.

(VLDB Demo) DBMind: A Self-Driving Platform in openGauss. [paper] [code]

Xuanhe Zhou, Lianyuan Jin, Ji Sun, Xinyang Zhao, Xiang Yu, Shifu Li, Tianqing Wang, et al.

(SIGMOD Tutorial) AI Meets Database: AI4DB and DB4AI. [paper] [slide]

Guoliang Li, Xuanhe Zhou, Lei Cao, Chengliang Chai.

(VLDB Tutorial) Machine Learning for Databases. [paper]

Guoliang Li, Xuanhe Zhou, Lei Cao.

(Journal of Software) Survey of Data Management Techniques for Supporting Artificial Intelligence. [paper]

Guoliang Li, Xuanhe Zhou.

2020

(VLDB) Query Performance Prediction for Concurrent Queries using Graph Embedding. [code] [paper]

Xuanhe Zhou, Ji Sun, Guoliang Li, Jianhua Feng.

(TKDE) Database meets artificial intelligence: A survey. [paper]

Xuanhe Zhou, Chengliang Chai, Guoliang Li, Ji Sun.

(Chinese Journal of Computers) Overview of database technology based on machine learning. [paper]

Guoliang Li, Xuanhe Zhou, Ji Sun, Xiang Yu, Haitao Yuan, Jiabin Liu, Yue Han.

2019

(VLDB) QTune: A Query-Aware Database Tuning System with Deep Reinforcement Learning. [paper]

Guoliang Li, Xuanhe Zhou, Shifu Li, Bo Gao.

(Data Eng.) Xuanyuan: An AI-Native Database. [paper]

Guoliang Li, Xuanhe Zhou, Sihao Li.

(Journal of Computer) A Survey of Machine-Learning-based Database Techniques. [paper]

Guoliang Li, Xuanhe Zhou, Ji Sun, Xiang Yu, Haitao Yuan, Jiabin Liu, Yue Han.

Honors and Awards

2025 - ACM Jim Gray Dissertation Honorable Mention Award (first winner in China)

2024 - CCF Doctoral Dissertation Award (CCF 优博, 10 recipients in China)

2024 - Tsinghua Doctoral Dissertation Award (清华优博)

2023 - VLDB Best Industry Paper Runner-Up Award (first author)

2023 - Top 100 Open Source Achievements by BenchCouncil

2022 - Outstanding Scholarship of Tsinghua University (清华特奖)

2022 - ByteDance Fellowship (10 recipients in China)

2022 - MSRA Fellowship (12 recipients in the Asia-Pacific region)

2021 - Zhongshimo Fellowship (钟士模奖学金)

2021 - Apple Scholars in AI/ML Nomination

2023, 2017 - National Scholarship

Activities (archive)

2021.10 - Tutorial of ML for Databases, AIMLSystems Conference. [website]

2021.08 - Invited Talk, The LADSIOS Workshop, VLDB Conference. [website]

2021.06 - SIGMOD Onsite Volunteer. [website]

Services

Conferences - ICDE'26; ICDE'25 (session chair); DBML'23; AIDB'23

Journals - TKDE, VLDB Journal, ACM CSUR, Transactions on Computers

Datasets

https://github.com/TsinghuaDatabaseGroup/datasets (Public archive)

Teaching

2019-2022 Database Systems (THU/30240262), teaching assistant

  • Online Tutorial for Basic functions: https://thu-db.github.io/dbs-tutorial/

  • Online Tutorial for Advanced functions: https://thu-db.github.io/dbtrain-tutorial/