Homepage

Advisor: Prof. Guoliang Li
Tsinghua Database Group

Brief Biography

I am Xuanhe Zhou, a fifth-year Ph.D. student in Computer Science at Tsinghua University. My research interest lies in autonomous database systems, i.e., automating and optimizing the logical and physical design of database systems to better serve AI and DB users. I have published 8 full-length papers as first author or first student author in top-tier (CCF-A) database conferences and journals, with over 600 citations on Google Scholar. I have led or made major contributions to open-source projects (e.g., openGauss-dbmind, FEBench, DB-GPT, BMTools). My research has been deployed in 10+ scenarios, including China Mobile, ICBC, and Postal Savings Bank of China (core transaction business). My awards include the VLDB Best Industry Paper Runner-up (as first author), Tsinghua Outstanding Scholarship, Microsoft Research Asia Fellowship, ByteDance Scholarship, Zhong Shimou Scholarship, and National Scholarship.

Research Directions

The relevant research problems include, but are not limited to:
(1) Self-optimization (e.g., query rewrite);
(2) Self-configuration (e.g., knob/index/partition management);
(3) Self-diagnosis (e.g., root cause analysis);
(4) AI acceleration techniques. I also explore AI techniques that can benefit the above aspects (e.g., real-time feature stores, tool usage with foundation models).

Projects

One of our current targets is to open-source, together with my excellent collaborators, a series of practical, out-of-the-box tools (outside or inside database kernels) for both AI and DB users, e.g.,
Full-process autonomous database operation and maintenance capabilities, e.g., slow SQL root cause analysis, workload index recommendation, multi-metric correlation mining, fault self-repair, anomaly detection, etc.
An LLM-based database administrator that can acquire database maintenance experience from textual sources and provide reasonable, well-founded, timely diagnosis advice for databases.
A novel benchmark specifically designed for real-time feature extraction (RTFE) in online AI inference services. These services are rapidly being deployed in diverse applications, including finance, retail, manufacturing, energy, and media.

Peer-reviewed Papers

🟡 self monitoring
🟢 self optimization
🔴 self configuration
🔵 data structure design
auto database
🟤 data generation
🖐🏻 AI techniques
(*indicates equal contribution)

2023

(SIGMOD) Grep: A Graph Learning Based Database Partitioning System. [paper] [code]
🔴
Xuanhe Zhou, Guoliang Li, Wei Guo, Luyang Liu.
(VLDB Industry) FEBench: A Benchmark for Real-Time Relational Data Feature Extraction. [paper] [code] 🖐🏻 Best Industry Paper Award (Runner up)
Xuanhe Zhou*, Cheng Chen*, Kunyi Li, Bingsheng He, Mian Lu, Qiaosheng Liu, Wei Huang, Guoliang Li, Zhao Zheng, Yuqiang Chen.
(VLDB Demo) A Learned Query Rewrite System. [demo]
🟢
Xuanhe Zhou, Guoliang Li, Jianming Wu, Jiesi Liu, Zhaoyan Sun, Xinning Zhang.
(VLDB) Learned Index: A Comprehensive Experimental Evaluation. [paper] [code]
🔵
Zhaoyan Sun, Xuanhe Zhou, Guoliang Li.
(ICDE) DBAugur: An Adversarial-based Trend Forecasting System for Diversified Workloads. [paper] [code]
🟡
Yuanning Gao, Xiuqi Huang, Xuanhe Zhou, Xiaofeng Gao, Guoliang Li.
(NeurIPS) Can LLM Already Serve as A Database Interface? A BIg Bench for Large-Scale Database Grounded Text-to-SQLs. [paper] [code] 🖐🏻 Spotlight
Led by Damo Academy (Binyuan Hui, Yongbin Li, et al.) and University of Hong Kong (Jinyang Li).
(TKDE) Automatic Database Knob Tuning: A Survey. [paper] [code]
🔴
Xinyang Zhao*, Xuanhe Zhou*, Guoliang Li.
(DSE) DB-GPT: Large Language Model Meets Database. [paper] [code]
🟢
🔴
Xuanhe Zhou, Zhaoyan Sun, Guoliang Li.

2022

(SIGMOD) LearnedSQLGen: Constraint-Aware SQL Generation using Reinforcement Learning. [paper] [code]
🟤
Lixi Zhang, Chengliang Chai, Xuanhe Zhou, Guoliang Li.
(VLDB) A Learned Query Rewrite System using Monte Carlo Tree Search. [paper] [code]
🟢
Xuanhe Zhou, Guoliang Li, Chengliang Chai, Jianhua Feng.
(ICDE) AutoIndex: An Incremental Index Management System for Dynamic Workloads. [paper] [code]
🔴
Xuanhe Zhou, Luyang Liu, Wenbo Li, Lianyuan Jin, Tianqing Wang, Shifu Li.
(ICDE) Adaptive Code Learning for Spark Configuration Tuning. [paper] [code]
🔴
Chen Lin, Junqing Zhuang, Jiadong Feng, Hui Li, Xuanhe Zhou, Guoliang Li.
(ICDE Tutorial) Machine Learning for Data Management: A System View. [paper] [slide]
Guoliang Li, Xuanhe Zhou.

2021

(VLDB Industry) openGauss: An Autonomous Database System. [paper] [code]
Guoliang Li, Xuanhe Zhou (first student author), Ji Sun, Xiang Yu, Yue Han, Lianyuan Jin, Wenbo Li, Tianqing Wang, Shifu Li. (Code repository: over 1.5k stars.)
(VLDB Demo) DBMind: A Self-Driving Platform in openGauss. [paper] [code]
Xuanhe Zhou, Lianyuan Jin, Ji Sun, Xinyang Zhao, Xiang Yu, Shifu Li, Tianqing Wang, et al.
(SIGMOD Tutorial) AI Meets Database: AI4DB and DB4AI. [paper] [slide]
Guoliang Li, Xuanhe Zhou, Lei Cao, Chengliang Chai.
(VLDB Tutorial) Machine Learning for Databases. [paper]
Guoliang Li, Xuanhe Zhou, Lei Cao.
(Journal of Software) Survey of Data Management Techniques for Supporting Artificial Intelligence. [paper] 🖐🏻
Guoliang Li, Xuanhe Zhou.

2020

(VLDB) Query Performance Prediction for Concurrent Queries using Graph Embedding. [code] [paper]
🟡
Xuanhe Zhou, Ji Sun, Guoliang Li, Jianhua Feng.
(TKDE) Database meets artificial intelligence: A survey. [paper]
Xuanhe Zhou, Chengliang Chai, Guoliang Li, Ji Sun.
(Chinese Journal of Computers) Overview of database technology based on machine learning. [paper]
Guoliang Li, Xuanhe Zhou, Ji Sun, Xiang Yu, Haitao Yuan, Jiabin Liu, Yue Han.

2019

(VLDB) QTune: A Query-Aware Database Tuning System with Deep Reinforcement Learning. [paper]
🔴
Guoliang Li, Xuanhe Zhou, Shifu Li, Bo Gao.
(Data Eng.) Xuanyuan: An AI-Native Database. [paper]
Guoliang Li, Xuanhe Zhou, Sihao Li.
(Journal of Computer) A Survey of Machine-Learning-based Database Techniques. [paper]
Guoliang Li, Xuanhe Zhou, Ji Sun, Xiang Yu, Haitao Yuan, Jiabin Liu, Yue Han.

Activities

2021.10 - Tutorial of ML for Databases, AIMLSystems Conference. [website]
2021.08 - Invited Talk, The LADSIOS Workshop, VLDB Conference. [website]
2021.06 - SIGMOD Onsite Volunteer. [website]

Services

PC Member - DBML'23 (ICDE workshop), AIDB'23 (VLDB workshop);
Journal Reviewer - VLDB Journal, JCST

Major Awards

VLDB Best Industry Paper Runner-up - "A best paper award committee comprising of 7 members from the program committee selected one paper for the best paper award, and two runner ups based on quality and impact."
Tsinghua Outstanding Scholarship - "The Outstanding Scholarship of Tsinghua University is the highest honor for students at Tsinghua University. Each year, 10 graduate students are selected as recipients of this award."
ByteDance Scholarship - "This program initiated by ByteDance aims to support over 10 innovative and tech-savvy talents, encouraging students to contribute to society and lead the future through technology."
Microsoft Research Asia Fellowship - "This year, 164 distinguished Ph.D. candidates from 47 leading research universities or institutions had applied for fellowships. ... and only 12 extremely outstanding students have been awarded fellowships."
Zhong Shimou Scholarship - "Highest Honor in the Department of Computer Science and Technology of Tsinghua"
2021 - National Scholarship
2021 - 84 Future Innovation Scholarship, Tsinghua
2021 - Baidu Scholarship Top 40
2021 - Apple Scholars in AI/ML Nomination
2020 - 84 Future Innovation Scholarship, Tsinghua
2017 - National Scholarship

Datasets

A collection of open-source datasets (under construction)!
Scenarios: AI applications; DB applications.

Teaching

2019-2022 Database Systems (THU/30240262), teaching assistant
Online tutorials on building a basic relational database are available!!
Advanced functions (by wenbo, haowen): https://thu-db.github.io/dbtrain-tutorial/