Homepage

Advisor: Prof. Guoliang Li
Tsinghua Database Group

Brief Biography

I am Xuanhe Zhou, a Ph.D. candidate in computer science at Tsinghua University, advised by Prof. Guoliang Li. My research interest lies in autonomous databases, i.e., automating and optimizing the logical and physical designs of databases to better serve AI and DB users. I have published 10+ papers at SIGMOD/VLDB/ICDE and 2 surveys in TKDE (700+ citations in total). I have led several open-source projects, e.g., openGauss-dbmind, febench, and dbgpt (d-bot). My work has been applied at enterprises such as 4Paradigm and Intel. Honors include the VLDB Best Industry Paper Runner-up (first author), Microsoft Scholar, and the Tsinghua Outstanding Scholarship.
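Xuanhe Zhou is a Ph.D. candidate in the Department of Computer Science and Technology at Tsinghua University, advised by Prof. Guoliang Li. He specializes in autonomous database systems, i.e., automating the logical and physical design of database systems. He has published more than 10 papers at top database venues, with over 700 citations, and has led several open-source projects, such as openGauss-dbmind, febench, and dbgpt (d-bot). His work has been applied at major enterprises including Postal Savings Bank of China, ICBC, 4Paradigm, and Intel. His honors include a VLDB Best Industry Paper nomination, Microsoft Scholar, and the Tsinghua Outstanding Scholarship.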

Research Directions

The relevant research problems include, but are not limited to:
(1) Database Optimization (e.g., query rewrite);
(2) Database Configuration (e.g., parameter/index/partition);
(3) System Diagnosis (e.g., root cause analysis);
(4) AI-relevant Techniques. I also explore AI techniques that can benefit the above aspects (e.g., online feature store, tool4LLM, multi-LLM).

Open Projects

One of our current goals, together with my excellent collaborators, is to open-source a series of practical, out-of-the-box tools (outside or inside database kernels) for both AI and DB users, e.g.,
An LLM-based database administrator that can acquire database maintenance experience from textual sources and provide reasonable, well-founded, and timely diagnosis advice for databases.
https://github.com/TsinghuaDatabaseGroup/DB-GPT
A novel benchmark specifically designed for real-time feature extraction (RTFE) in online AI inference services. These services are rapidly being deployed in diverse applications, including finance, retail, manufacturing, energy, and media.
https://github.com/decis-bench/febench
Full-process autonomous database operation and maintenance capabilities, e.g., slow SQL root cause analysis, workload index recommendation, multi-metric correlation mining, and fault self-repair, as well as anomaly detection and root cause analysis.

Peer-reviewed Papers

🟡 self monitoring
🟢 self optimization
🔴 self configuration
🔵 data structure design
auto database
🟤 data generation
🖐🏻 AI techniques
(* indicates equal contribution)

2023

(SIGMOD) Grep: A Graph Learning Based Database Partitioning System. [paper] [code]
🔴
Xuanhe Zhou, Guoliang Li, Wei Guo, Luyang Liu.
(VLDB Industry) FEBench: A Benchmark for Real-Time Relational Data Feature Extraction. [paper] [code] 🖐🏻 Best Industry Paper Award (Runner-up)
Xuanhe Zhou*, Cheng Chen*, Kunyi Li, Bingsheng He, Mian Lu, Qiaosheng Liu, Wei Huang, Guoliang Li, Zhao Zheng, Yuqiang Chen.
(VLDB Demo) A Learned Query Rewrite System. [demo]
🟢
Xuanhe Zhou, Guoliang Li, Jianming Wu, Jiesi Liu, Zhaoyan Sun, Xinning Zhang.
(VLDB) Learned Index: A Comprehensive Experimental Evaluation. [paper] [code]
🔵
Zhaoyan Sun, Xuanhe Zhou, Guoliang Li.
(ICDE) DBAugur: An Adversarial-based Trend Forecasting System for Diversified Workloads. [paper] [code]
🟡
Yuanning Gao, Xiuqi Huang, Xuanhe Zhou, Xiaofeng Gao, Guoliang Li.
(NeurIPS) Can LLM Already Serve as A Database Interface? A BIg Bench for Large-Scale Database Grounded Text-to-SQLs. [paper] [code] 🖐🏻 Spotlight
Led by Damo Academy (Binyuan Hui, Yongbin Li, et al.) and the University of Hong Kong (Jinyang Li).
(TKDE) Automatic Database Knob Tuning: A Survey. [paper] [code]
🔴
Xinyang Zhao*, Xuanhe Zhou*, Guoliang Li.
(DSE) DB-GPT: Large Language Model Meets Database. [paper] [code]
🟢
🔴
Xuanhe Zhou, Zhaoyan Sun, Guoliang Li.

2022

(SIGMOD) LearnedSQLGen: Constraint-Aware SQL Generation using Reinforcement Learning. [paper] [code]
🟤
Lixi Zhang, Chengliang Chai, Xuanhe Zhou, Guoliang Li.
(VLDB) A Learned Query Rewrite System using Monte Carlo Tree Search. [paper] [code]
🟢
Xuanhe Zhou, Guoliang Li, Chengliang Chai, Jianhua Feng.
(ICDE) AutoIndex: An Incremental Index Management System for Dynamic Workloads. [paper] [code]
🔴
Xuanhe Zhou, Luyang Liu, Wenbo Li, Lianyuan Jin, Tianqing Wang, Shifu Li.
(ICDE) Adaptive Code Learning for Spark Configuration Tuning. [paper] [code]
🔴
Chen Lin, Junqing Zhuang, Jiadong Feng, Hui Li, Xuanhe Zhou, Guoliang Li.
(ICDE Tutorial) Machine Learning for Data Management: A System View. [paper] [slide]
Guoliang Li, Xuanhe Zhou.

2021

(VLDB Industry) openGauss: An Autonomous Database System. [paper] [code]
Guoliang Li, Xuanhe Zhou (first student author), Ji Sun, Xiang Yu, Yue Han, Lianyuan Jin, Wenbo Li, Tianqing Wang, Shifu Li. (Over 1.5k stars.)
(VLDB Demo) DBMind: A Self-Driving Platform in openGauss. [paper] [code]
Xuanhe Zhou, Lianyuan Jin, Ji Sun, Xinyang Zhao, Xiang Yu, Shifu Li, Tianqing Wang, et al.
(SIGMOD Tutorial) AI Meets Database: AI4DB and DB4AI. [paper] [slide]
Guoliang Li, Xuanhe Zhou, Lei Cao, Chengliang Chai.
(VLDB Tutorial) Machine Learning for Databases. [paper]
Guoliang Li, Xuanhe Zhou, Lei Cao.
(Journal of Software) Survey of Data Management Techniques for Supporting Artificial Intelligence. [paper] 🖐🏻
Guoliang Li, Xuanhe Zhou.

2020

(VLDB) Query Performance Prediction for Concurrent Queries using Graph Embedding. [paper] [code]
🟡
Xuanhe Zhou, Ji Sun, Guoliang Li, Jianhua Feng.
(TKDE) Database Meets Artificial Intelligence: A Survey. [paper]
Xuanhe Zhou, Chengliang Chai, Guoliang Li, Ji Sun.
(Chinese Journal of Computers) Overview of Database Technology Based on Machine Learning. [paper]
Guoliang Li, Xuanhe Zhou, Ji Sun, Xiang Yu, Haitao Yuan, Jiabin Liu, Yue Han.

2019

(VLDB) QTune: A Query-Aware Database Tuning System with Deep Reinforcement Learning. [paper]
🔴
Guoliang Li, Xuanhe Zhou, Shifu Li, Bo Gao.
(Data Eng.) Xuanyuan: An AI-Native Database. [paper]
Guoliang Li, Xuanhe Zhou, Sihao Li.
(Journal of Computer) A Survey of Machine-Learning-based Database Techniques. [paper]
Guoliang Li, Xuanhe Zhou, Ji Sun, Xiang Yu, Haitao Yuan, Jiabin Liu, Yue Han.

Activities

2021.10 - Tutorial of ML for Databases, AIMLSystems Conference. [website]
2021.08 - Invited Talk, The LADSIOS Workshop, VLDB Conference. [website]
2021.06 - SIGMOD Onsite Volunteer. [website]

Services

PC Member - DBML'23 (ICDE workshop), AIDB'23 (VLDB workshop)
Journal Reviewer - VLDB Journal, JCST

Selected Awards

"A best paper award committee comprising of 7 members from the program committee selected one paper for the best paper award, and two runner ups based on quality and impact."
2023 - National Scholarship
"The Outstanding Scholarship of Tsinghua University is the highest honor for students at Tsinghua University. Each year, 10 graduate students are selected as recipients of this award."
"This program initiated by ByteDance aims to support over 10 innovative and tech-savvy talents, encouraging students to contribute to society and lead the future through technology."
"This year, 164 distinguished Ph.D. candidates from 47 leading research universities or institutions had applied for fellowships. ... and only 12 extremely outstanding students have been awarded fellowships."
"Highest Honor in the Department of Computer Science and Technology of Tsinghua"
2021 - National Scholarship
2021 - 84 Future Innovation Scholarship, Tsinghua
2021 - Baidu Scholarship Top 40
2021 - Apple Scholars in AI/ML Nomination
2020 - 84 Future Innovation Scholarship, Tsinghua
2017 - National Scholarship

Open Datasets

A collection of open-source datasets (under construction).
Scenarios: AI applications; DB applications.

Teaching Assistant

2019-2022 Database Systems (THU/30240262)
Online tutorials on building a basic relational database are available.
Advanced functions (by Wenbo and Haowen): https://thu-db.github.io/dbtrain-tutorial/