About me

Ziyu Li

I'm currently a postdoctoral researcher at the Web Information Systems group at Delft University of Technology, actively seeking employment opportunities.

I am a PhD student at the Web Information Systems group, Department of Software Technology, Faculty of EEMCS, Delft University of Technology. The PhD project belongs to the HyperEdge project with Cognizant. I’m supervised by Alessandro Bozzon and Asterios Katsifodimos.


My research lies at the intersection of machine learning and data analysis. In particular, my research investigates how metadata about different artifacts (e.g., models, data, hardware settings) can be used to improve the effectiveness and efficiency of machine learning workflows.

Publications

Ziyu Li, Hilco van der Wilk, Danning Zhan, Megha Khosla, Alessandro Bozzon, Rihan Hai
In Proceedings of the 2024 IEEE 40th International Conference on Data Engineering, ICDE 2024

Ziyu Li, Wenbo Sun, Danning Zhan, Yan Kang, Lydia Chen, Alessandro Bozzon, and Rihan Hai.
IEEE Transactions on Knowledge and Data Engineering (2024).

Rihan Hai, Christos Koutras, Andra Ionescu, Ziyu Li, Wenbo Sun, Jessie van Schijndel, Yan Kang, Asterios Katsifodimos
In Proceedings of the 2023 IEEE 39th International Conference on Data Engineering, ICDE 2023 p.3729-3739, IEEE.

Metadata Representations for Queryable Repositories of Machine Learning Models

Ziyu Li, Henk Kant, Rihan Hai, Asterios Katsifodimos, Marco Brambilla, Alessandro Bozzon
In IEEE Access Volume 11 p.125616-125630 (2023).

Macaroni: Crawling and Enriching Metadata from Public Model Zoos

Ziyu Li, Rihan Hai, Asterios Katsifodimos, Alessandro Bozzon
In International Conference on Web Engineering p.376-380 (2023).

Optimizing ML Inference Queries Under Constraints

Ziyu Li, Wenbo Sun, Rihan Hai, Alessandro Bozzon, Asterios Katsifodimos
In International Conference on Web Engineering p.51-66 (2023).

Optimizing Machine Learning Inference Queries for Multiple Objectives

Ziyu Li, Mariette Schonfeld, Rihan Hai, Alessandro Bozzon, Asterios Katsifodimos
In Proceedings - 2023 IEEE 39th International Conference on Data Engineering Workshops, ICDEW 2023 p.74-78, Institute of Electrical and Electronics Engineers (IEEE) (2023)

Susceptible-infected-spreading-based network embedding in static and temporal networks

Xiu Xiu Zhan, Ziyu Li, Naoki Masuda, Petter Holme, Huijuan Wang
In EPJ Data Science Volume 9 (2020).

Projects

Model Selection with Model Zoo via Graph Learning

Description:

In this study, we introduce TransferGraph, a novel framework that reformulates model selection as a graph learning problem. TransferGraph constructs a graph using extensive metadata extracted from models and datasets, while capturing their inherent relationships.

Effectiveness:

Through comprehensive experiments across 16 real datasets, covering both images and text, we demonstrate TransferGraph’s effectiveness in capturing essential model-dataset relationships, yielding up to a 32% improvement in the correlation between predicted performance and actual fine-tuning results compared to state-of-the-art methods.
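To make the evaluation criterion concrete, here is a minimal sketch of how a correlation between predicted scores and actual fine-tuning results could be computed. It assumes Pearson correlation, which the summary above does not specify, and the scores are made-up illustrative values, not results from the paper.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equally long sequences of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Covariance and variances, all without the 1/n factor (it cancels out).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Hypothetical data: one predicted score and one measured fine-tuning
# accuracy per candidate model (illustrative values only).
predicted_scores = [0.20, 0.50, 0.90, 0.40]
fine_tuned_accuracy = [0.61, 0.70, 0.88, 0.66]

print(pearson(predicted_scores, fine_tuned_accuracy))
```

A value close to 1 means the predicted ranking tracks the real fine-tuning outcome well, which is what a model selection method aims for.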

Optimizing ML Inference Queries under Constraints

Description:

We propose a method for optimizing ML inference queries that selects the most suitable ML models to use, as well as the order in which those models are executed. We formally define the constraint-based ML inference query optimization problem and formulate it as a Mixed Integer Programming (MIP) problem.

Effectiveness:

We develop an optimizer that maximizes accuracy given constraints. This optimizer is capable of navigating a large search space to identify optimal query plans on various model zoos.
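The idea of maximizing accuracy under a constraint can be illustrated with a toy sketch. It is not the MIP formulation from the paper: it uses brute-force enumeration instead of a solver, ignores execution order, and assumes a single cost budget. The task names, model names, accuracies, and costs are all hypothetical.

```python
from itertools import product


def best_plan(candidates, budget):
    """Pick one model per task, maximizing mean accuracy within a cost budget.

    candidates maps a task name to a list of (model_name, accuracy, cost).
    Returns (mean_accuracy, total_cost, {task: model_name}), or None if no
    combination fits within the budget.
    """
    tasks = list(candidates)
    best = None
    # Enumerate every combination of one model per task (exponential search).
    for combo in product(*(candidates[t] for t in tasks)):
        total_cost = sum(cost for _, _, cost in combo)
        if total_cost > budget:
            continue  # violates the constraint
        mean_acc = sum(acc for _, acc, _ in combo) / len(combo)
        if best is None or mean_acc > best[0]:
            assignment = {t: name for t, (name, _, _) in zip(tasks, combo)}
            best = (mean_acc, total_cost, assignment)
    return best


# Hypothetical model zoo: (name, accuracy, cost per query).
zoo = {
    "detect": [("det_small", 0.80, 2.0), ("det_large", 0.92, 6.0)],
    "classify": [("cls_small", 0.85, 1.0), ("cls_large", 0.95, 5.0)],
}

print(best_plan(zoo, budget=8.0))
```

Enumeration only works for tiny zoos because the number of combinations grows exponentially with the number of tasks, which is the motivation for handing the problem to a MIP solver.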

Macaroni: Metadata Representations for Queryable Repositories of Machine Learning Models

Description:

Metadata plays a crucial role in reporting, auditing, ensuring reproducibility, and enhancing interpretability. Despite the growing adoption of descriptive formats like datasheets and model cards, the metadata available in existing model zoos remains notably limited. Moreover, existing formats have limited expressiveness, which constrains the potential of model repositories to serve purposes beyond mere storage of pre-trained models.

Contributions:

This paper proposes a unified metadata representation format for model zoos. We illustrate that comprehensive metadata enables a diverse range of applications, encompassing search, reuse, comparison, and composition of ML models. We also detail the design and highlight the implementation of an advanced model zoo system built on top of our proposed metadata representation.
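As a rough illustration of what structured metadata makes possible, the sketch below runs a simple model search over a handful of records. The record fields, the `search` helper, and the example entries are hypothetical and are not the representation proposed in the paper.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ModelRecord:
    """Hypothetical metadata record for one model in a zoo."""
    name: str
    task: str
    dataset: str
    accuracy: float
    num_parameters: int


def search(records, task, min_accuracy=0.0, max_parameters: Optional[int] = None):
    """Return records for a task that meet the filters, best accuracy first."""
    hits = [
        r for r in records
        if r.task == task
        and r.accuracy >= min_accuracy
        and (max_parameters is None or r.num_parameters <= max_parameters)
    ]
    return sorted(hits, key=lambda r: r.accuracy, reverse=True)


# Illustrative entries only; names and numbers are invented.
records = [
    ModelRecord("img_a", "image-classification", "dataset_x", 0.91, 25_000_000),
    ModelRecord("img_b", "image-classification", "dataset_x", 0.87, 5_000_000),
    ModelRecord("txt_a", "text-classification", "dataset_y", 0.89, 110_000_000),
]

for record in search(records, "image-classification", max_parameters=10_000_000):
    print(record.name, record.accuracy)
```

Queries like this are only possible when the zoo records fields such as task, accuracy, and model size consistently for every model.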

Explore more on the projects with the posts

Amalur

The Convergence of Data Integration and Machine Learning