I am an Assistant Professor in the Computer Science
Department at Tufts University. My research
focuses on building low-latency, scalable systems. I am broadly interested in operating systems,
distributed systems, and systems for ML. My goal is to build the next generation of adaptive and resilient
systems capable of meeting the ever-growing demands of modern datacenter applications.
[July 2017] My first paper was accepted at APSys 2017.
Publications
Refereed conference publications
Dynamic NUMA-Aware Replication for Data Structures
Erika Hunhoff, Zack McKevitt, Ankit Bhardwaj, Reto Achermann, Gerd Zellweger, Marcos Aguilera,
Eric Keller
40th International Parallel & Distributed Processing Symposium (IPDPS '26)
[ Paper ]
[ Talk ]
Unleashing The Potential of Datacenter SSDs by Taming Performance Variability
Gohar Irfan Chaudhry, Ankit Bhardwaj, Zhenyuan Ruan, Adam Belay
23rd USENIX Symposium on Networked Systems Design and Implementation (NSDI '26)
[ Paper ]
[ Talk ]
Checkmate: Zero Performance Overhead Model Checkpointing via Network Gradient Replication
Ankit Bhardwaj*, Weiyang Wang*, Jeremy Carin, Adam Belay, Manya Ghobadi
(*equal contribution)
23rd USENIX Symposium on Networked Systems Design and Implementation (NSDI '26)
[ Preprint ]
[ Talk ]
[ Webpage ]
Auto-reconfiguration for Latency Minimization in CPU-based DNN Serving
Ankit Bhardwaj, Amar Phanishayee, Deepak Narayanan, Ryan Stutsman
42nd International Conference on Machine Learning (ICML '25)
[ Paper ]
[ Poster ]
[ Webpage ]
NrOS: Effective Replication and Sharing in an Operating System
Ankit Bhardwaj, Chinmay Kulkarni, Reto Achermann, Irina Calciu, Sanidhya Kashyap, Ryan Stutsman,
Amy Tai, and Gerd Zellweger
15th USENIX Symposium on Operating Systems Design and Implementation (OSDI '21)
[ Paper ]
[ Talk ]
[ Webpage ]
Adaptive Placement for In-memory Storage Functions
Ankit Bhardwaj, Chinmay Kulkarni, and Ryan Stutsman
2020 USENIX Annual Technical Conference (ATC '20)
[ Paper ]
[ Talk ]
Refereed workshop publications
ObjectTier: Non-invasively Boosting Memory Tiering Performance
Vinita Pawar, Ankit Bhardwaj, Ryan Stutsman
8th Workshop on Hot Topics in Cloud Computing Performance (HotCloudPerf '25)
[ Paper ]
Cache-Coherent Accelerators for Persistent Memory Crash Consistency
Ankit Bhardwaj, Todd Thornley, Vinita Pawar, Reto Achermann, Gerd Zellweger, Ryan Stutsman
14th ACM Workshop on Hot Topics in Storage and File Systems (HotStorage '22)
Best Paper Award
[ Paper ]
[ Slides ]
[ Talk ]
On the Impact of Isolation Costs on Locality-aware Cloud Scheduling
Ankit Bhardwaj, Meghana Gupta, and Ryan Stutsman
12th USENIX Workshop on Hot Topics in Cloud Computing (HotCloud '20)
[ Paper ]
[ Talk ]
A Preliminary Performance Model for Optimizing Software Packet Processing Pipelines
Ankit Bhardwaj, Atul Shree, Bhargav Reddy V, and Sorav Bansal
8th ACM SIGOPS Asia-Pacific Workshop on Systems (APSys '17)
[ Paper ]
[ Poster ]
[ Slides ]
[ Talk ]
[ Full Paper (unpublished) ]
Research
I build low-latency, scalable systems for modern datacenters. My current work spans ML infrastructure,
storage systems, and OS design.
Checkmate enables zero-overhead model checkpointing for distributed training by replicating gradients
over the network, eliminating training pauses caused by checkpoint I/O.
[NSDI '26]
Packrat is an auto-reconfiguration framework for CPU-based DNN serving that discovers optimal
operator parallelism and batching strategies to minimize latency.
[ICML '25]
Sandook addresses SSD performance variability in datacenters, unlocking their full potential for
latency-sensitive applications.
[NSDI '26]
NrOS is a Rust-based operating system that replicates kernel state per NUMA node and uses
operation logging to synchronize replicas, scaling efficiently across multi-socket machines.
[OSDI '21]
Splinter is a multi-tenant in-memory key-value store that leverages Rust's type system to provide
microsecond-scale tenant isolation while enabling runtime storage function extensibility.
[ATC '20]
Bacus applies compiler-driven optimizations to software packet processing, improving the
performance and usability of DSLs such as P4.
[APSys '17]
vLAB is a system for provisioning and managing
large numbers of virtual machines on commodity hardware using distributed object storage.