lijl at comp.nus.edu.sg
COM2 #03-16
15 Computing Drive
Singapore 117418
I am an Assistant Professor in the School of Computing at the National University of Singapore. Before joining NUS, I received my PhD from the University of Washington, working with Dan Ports, Arvind Krishnamurthy, and Tom Anderson. Before that, I received my bachelor's degree from the University of Michigan where I did some undergraduate research with Valeria Bertacco and Peter Chen.
I am broadly interested in operating systems and distributed systems research. More recently, I have worked on the following research topics: systems design for reconfigurable hardware, co-designing distributed systems with datacenter networks, and system software testing.
Pegasus: Tolerating Skewed Workloads in Distributed Storage with In-Network Coherence Directories
Jialin Li, Jacob Nelson, Ellis Michael, Xin Jin, and Dan R. K. Ports
Proceedings of the 14th USENIX Symposium on Operating Systems Design and Implementation (OSDI '20)
pdf
Meerkat: Scalable Replicated Transactions Following the Zero-Coordination Principle
Adriana Szekeres, Michael Whittaker, Naveen Kr. Sharma, Jialin Li, Arvind Krishnamurthy, Irene Zhang, and Dan R. K. Ports
Proceedings of the 15th ACM SIGOPS EuroSys (EuroSys '20), Heraklion, Crete, Greece, April 2020
pdf
Harmonia: Near-Linear Scalability for Replicated Storage with In-Network Conflict Detection
Hang Zhu, Zhihao Bai, Jialin Li, Ellis Michael, Dan R. K. Ports, Ion Stoica, and Xin Jin
Proceedings of the VLDB Endowment vol. 13 (3), November 2019 (pp. 376-389)
pdf
Eris: Coordination-Free Consistent Transactions Using In-Network Concurrency Control
Jialin Li, Ellis Michael, and Dan R. K. Ports
Proceedings of the 26th Symposium on Operating Systems Principles (SOSP '17)
pdf
Just Say NO to Paxos Overhead: Replacing Consensus Overhead with Network Ordering
Jialin Li, Ellis Michael, Naveen Kr. Sharma, Adriana Szekeres, and Dan R. K. Ports
Proceedings of the 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI '16)
pdf
Specifying and Checking File System Crash-consistency Models
James Bornholt, Antoine Kaufmann, Jialin Li, Arvind Krishnamurthy, Emina Torlak, and Xi Wang
Proceedings of the 21st International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS '16)
pdf
Arrakis: The Operating System Is the Control Plane
Simon Peter, Jialin Li, Irene Zhang, Dan R. K. Ports, Doug Woos, Arvind Krishnamurthy, Thomas Anderson, and Timothy Roscoe
ACM Transactions on Computer Systems vol. 33 (4)
pdf
Designing Distributed Systems Using Approximate Synchrony in Data Center Networks
Dan R. K. Ports, Jialin Li, Vincent Liu, Naveen Kr. Sharma, and Arvind Krishnamurthy
Proceedings of the 12th USENIX Symposium on Networked Systems Design and Implementation (NSDI '15)
Best Paper Award
pdf
Tales of the Tail: Hardware, OS, and Application-level Sources of Tail Latency
Jialin Li, Naveen Kr. Sharma, Dan R. K. Ports, and Steven D. Gribble
Proceedings of the 5th Symposium on Cloud Computing (SOCC '14)
pdf
Arrakis: The Operating System is the Control Plane
Simon Peter, Jialin Li, Irene Zhang, Dan R. K. Ports, Doug Woos, Arvind Krishnamurthy, Thomas Anderson, and Timothy Roscoe
Proceedings of the 11th USENIX Symposium on Operating Systems Design and Implementation (OSDI '14)
Best Paper Award
pdf
Towards High-Performance Application-Level Storage Management
Simon Peter, Jialin Li, Doug Woos, Irene Zhang, Dan R. K. Ports, Thomas Anderson, Arvind Krishnamurthy, and Mark Zbikowski
Proceedings of the 5th Hot Topics in Storage and File Systems (HotStorage '14)
pdf
Bridging Pre- and Post-silicon Debugging with BiPeD
Andrew DeOrio, Jialin Li, and Valeria Bertacco
Proceedings of the International Conference on Computer-Aided Design (ICCAD '12)
pdf
NOPaxos
NOPaxos takes a new approach to achieving replication in
the datacenter without the performance cost of traditional
methods, by carefully dividing replication responsibility
between the network and protocol layers. The network orders
requests but does not ensure reliable delivery, using a new
ordered unreliable multicast (OUM) primitive. Our new
replication protocol, Network-Ordered Paxos (NOPaxos),
exploits network ordering to provide strongly consistent
replication without coordination. The resulting system
yields throughput within 2% and latency within 16 μs of an
unreplicated system.
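To make the split between ordering and reliability concrete, here is a minimal sketch of the idea in Python. It is an illustration only, not the NOPaxos implementation: the Sequencer and Receiver classes, the DROP and DELIVER labels, and the demo function are hypothetical names, and the sketch assumes that ordering is expressed as a per-message sequence number so that a receiver can notice a dropped message as a gap.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Sequencer:
    """Hypothetical stand-in for the network element that orders requests."""
    next_seq: int = 1

    def stamp(self, payload: str) -> Tuple[int, str]:
        # Assign the next sequence number; delivery is NOT guaranteed.
        seq = self.next_seq
        self.next_seq += 1
        return seq, payload


@dataclass
class Receiver:
    """Hypothetical replica endpoint: delivers in order, reports gaps."""
    expected: int = 1
    log: List[Tuple[str, object]] = field(default_factory=list)

    def receive(self, seq: int, payload: str) -> None:
        if seq < self.expected:
            return  # stale or duplicate message, ignore it
        # Every skipped sequence number is a message the network dropped.
        for missing in range(self.expected, seq):
            self.log.append(("DROP", missing))
        self.log.append(("DELIVER", payload))
        self.expected = seq + 1


def demo() -> List[Tuple[str, object]]:
    sequencer = Sequencer()
    stamped = [sequencer.stamp(p) for p in ["a", "b", "c", "d"]]
    receiver = Receiver()
    for seq, payload in stamped:
        if seq == 3:
            continue  # simulate the network losing message 3
        receiver.receive(seq, payload)
    return receiver.log
```

In this sketch the receiver never has to agree with its peers on the order of requests; it only has to react when it sees a gap, which is the rare case the replication protocol must then handle.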
Eris
Eris takes a different approach to strongly consistent
distributed transactions. It moves a core piece of
concurrency control functionality into the datacenter
network. This network primitive takes on the responsibility
for consistently ordering transactions, and a new
lightweight transaction protocol ensures atomicity. The
resulting system avoids both replication and transaction
coordination overhead: it achieves performance within 3% of
a non-transactional, unreplicated system on TPC-C.
Speculative Paxos
Speculative Paxos explores the co-design of distributed
systems with their network layer. We leveraged properties
of the datacenter network to build a Mostly-Ordered
Multicast primitive which provides a best-effort ordering
guarantee. We then co-designed a new replication protocol,
Speculative Paxos, that relies on the network to order
requests in the normal case. The resulting system provides
substantially higher throughput and lower latency than the
standard Paxos protocol.
Arrakis
Arrakis is a new operating system that eliminates the OS
kernel entirely from fast-path I/O operations. The kernel
only sets up execution environments and interacts with an
application in rare cases where resources need to be
reallocated or name conflicts need to be resolved.
Applications get the full power of unmediated hardware
through an application-specific library, providing
significantly better performance, reliability and
customizability.
Predictable Tail-Latency Systems
Modern datacenter applications struggle with the need to
access thousands of servers while still providing a fast
response time to the user. In this project, we conducted an
extensive measurement study at the operating system level
to identify factors that can cause low-latency
applications to have a latency tail orders of magnitude
worse than the median. Using a set of modifications to the kernel
scheduler, network stack, and application architecture, we
can reduce the tail latency to within a few percent of
optimal.
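As a rough illustration of the metric involved, the following Python sketch compares a high percentile of a latency sample against its median. The function names, the nearest-rank percentile definition, and the synthetic numbers in the usage note are assumptions made for this example; they are not taken from the measurement study.

```python
import math
from typing import Sequence


def percentile(samples: Sequence[float], p: float) -> float:
    """Nearest-rank percentile: the smallest sample such that at least
    p percent of all samples are less than or equal to it."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[min(rank, len(ordered)) - 1]


def tail_ratio(samples: Sequence[float], p: float = 99.9) -> float:
    """How many times worse the p-th percentile is than the median."""
    return percentile(samples, p) / percentile(samples, 50)
```

For a synthetic sample of 990 requests at 100 μs and 10 requests at 10,000 μs, tail_ratio reports a 99.9th percentile that is 100 times the median, even though 99% of requests are fast.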