Zhijia Zhao

/* zh sounds like j */

Associate Professor
RIPLE Research Group
Computer Science and Engineering, UC Riverside

Short Bio: I received my Ph.D. in Computer Science from the College of William and Mary in 2015 under the supervision of Prof. Xipeng Shen. In the same year, I joined UC Riverside as an assistant professor, and later became an associate professor in 2020. I am fortunate and grateful to be a recipient of NSF CAREER Award (2018), Regents Faculty Fellowship (2018), Hellman Fellowship (2018), Best Paper Runner-up Award at MobiSys'18, and Best Paper Award at ASPLOS'20 and ASPLOS'22.

Research Interests

My research group develops algorithms and software libraries for high-performance data analytics, focusing on non-tabular (NoSQL) data such as graphs, semi-structured data (e.g., JSON/XML), markup data (e.g., HTML), raw text, and binary serialization formats (e.g., ProtoBuf). We enhance performance through parallelism, redundancy elimination, and improved data locality for efficient traversal. Additionally, we explore program analysis and AI/ML techniques to address correctness and performance challenges in real-world software systems.

Program optimization and parallelization (with multicore, SIMD, and GPU)
Data analytics (e.g., for JSON and graph data)
Program analysis (e.g., for mobile and web apps)
AI (knowledge representation and reasoning, and LLM)

Selected Works

Streaming and Indexing of Semi-Structured Data [PPoPP'17, ASPLOS'19, VLDB'21, ASPLOS'22 (Best Paper Award), SIGMOD'23]

Semi-structured data, like JSON and its variations, is the de facto standard for information exchange on the web and the basis for modern NoSQL document stores (e.g., MongoDB and Firebase). With nested structures, it is difficult to process such data in a scalable way. To address this, we leverage the automata theory and parallelization techniques to automatically generate data-parallel code with small memory footprint.

Graph Analytics: GPU Acceleration [ASPLOS'18, EuroSys'20] , Incremental [EuroSys'21, SoCC'24] and Concurrent Evaluation [ASPLOS'23]

Many real-world computation problems can be modeled as graph problems. As the graphs are often large and irregular by nature, it is difficult to process them efficiently. We developed novel graph transformations to make graph more ``regular'' so that they can be processed efficiently on GPUs. To deal with the case that the graph is too large to fit into GPU memory, we designed a technique that can efficiently seperate the active part of the graph and only load that into the GPU memory. As the graph changes dynamically, we found a general solution that can incrementally evaluate graph queries (based on graph triangle inequalities) to reduce the latency. Finally, in the senario of concurrent graph queries, we proposed an alignment method that reduces the ``misalignments'' among graph queries that are evaluated together.

Analyzing and Preserving Mobile App State [MobiSys'18 (Best Paper Runner-up Award), OOPSLA'20]

Mobile systems (e.g., Android) expose changes on the device configuration (e.g., orientation, screen size, or keyboard availability) to the apps at runtime to enable rich user experiences. When handled inappropriately, these changes can cause unresponsiveness, state loss, or even app crash. In this work, we analyze the common causes to various runtime change issues and design novel runtime change handling solutions to address them.

Parallelizing Finite-State Machine (FSM) Computations [ASPLOS'14, ASPLOS'15, PACT'16, ICS'17, ASPLOS'20 (Best Paper Award), ASPLOS'21]

FSM serves as the computation model for a variety of performance-critical applications (e.g., pattern matching and decoding). However, the model runs in serial – each transition depends on the prior state. In a series of works, we "broke" the state dependencies with disciplined speculation (ASPLOS'14 and '15), introduced multi-level parallelism (PACT'16), examined and improved the scalability of FSM parallelization (ICS'17, ASPLOS'21). In addition, we found that bitstream programs that rely on bitwise operations (e.g., AND, SHIFT, and XOR) to process long sequences of bits, can be effectively modeled by FSMs, hence parallelizable with FSM parallelization techniques. To achieve this, we proposed bit-level dependency analysis and modeling techniques to reduce bitstream programs to FSMs (ASPLOS'20).