I am an Assistant Professor in the Computer Science Department at Tufts University. My research focuses on building low-latency, scalable systems. I am broadly interested in operating systems, distributed systems, and systems for ML. My goal is to build the next generation of adaptive and resilient systems capable of meeting the ever-growing demands of modern datacenter applications.

Earlier, I was a postdoctoral researcher at MIT. I earned my Ph.D. from the University of Utah, where I worked on controlled data and compute placement techniques to scale datacenter applications.

Contact
ankit@cs.tufts.edu
JCC #465, 177 College Ave
Medford, MA 02155

Hiring: I am hiring 2+ new PhD students. If you are interested in building scalable systems, apply to the Tufts CS PhD program and mention my name.

Updates

  • [ Dec 2025 ] Sandook paper is accepted to appear at NSDI 2026.
  • [ Aug 2025 ] Started as an Assistant Professor at Tufts University.
  • [ July 2025 ] Checkmate paper is accepted to appear at NSDI 2026.
  • [ May 2025 ] Packrat paper is accepted to appear at ICML 2025.
  • [ Sept 2023 ] Started postdoc position at MIT CSAIL.
  • [ June 2023 ] Successfully defended my PhD dissertation.
  • [ May 2022 ] Started internship at Microsoft Research, hosted by Amar Phanishayee.
  • [ May 2022 ] PAX paper is accepted to appear at HotStorage '22.
  • [ Feb 2022 ] Named 2022 Meta PhD Research Fellowship finalist.
  • [ May 2021 ] Started internship at VMware Research, hosted by Irina Calciu and Gerd Zellweger.
  • [ March 2021 ] NrOS paper is accepted to appear at OSDI '21.
  • [ May 2020 ] Started internship at VMware Research Group, hosted by Gerd Zellweger.
  • [ May 2020 ] Sandstorm paper is accepted to appear at HotCloud '20.
  • [ Apr 2020 ] ASFP paper is accepted to appear at ATC '20.
  • [ Aug 2018 ] Joined the School of Computing, University of Utah as a PhD student.
  • [ Apr 2018 ] Received an offer to join the University of Massachusetts, Amherst as a graduate student.
  • [ Jan 2018 ] Received an offer to join the University of Utah as a graduate student.
  • [ Sep 2017 ] Attended APSys 2017 and presented our work.
  • [ July 2017 ] My first paper was accepted to APSys 2017.

Publications

Refereed conference publications
  1. Dynamic NUMA-Aware Replication for Data Structures
    Erika Hunhoff, Zack McKevitt, Ankit Bhardwaj, Reto Achermann, Gerd Zellweger, Marcos Aguilera, Eric Keller
    40th International Parallel & Distributed Processing Symposium (IPDPS '26)
    [ Paper ] [ Talk ]

  2. Unleashing The Potential of Datacenter SSDs by Taming Performance Variability
    Gohar Irfan Chaudhry, Ankit Bhardwaj, Zhenyuan Ruan, Adam Belay
    23rd USENIX Symposium on Networked Systems Design and Implementation (NSDI '26)
    [ Paper ] [ Talk ]

  3. Checkmate: Zero Performance Overhead Model Checkpointing via Network Gradient Replication
    Ankit Bhardwaj*, Weiyang Wang*, Jeremy Carin, Adam Belay, Manya Ghobadi
    (*equal contribution)
    23rd USENIX Symposium on Networked Systems Design and Implementation (NSDI '26)
    [ Preprint ] [ Talk ] [ Webpage ]

  4. Auto-reconfiguration for Latency Minimization in CPU-based DNN Serving
    Ankit Bhardwaj, Amar Phanishayee, Deepak Narayanan, Ryan Stutsman
    42nd International Conference on Machine Learning (ICML '25)
    [ Paper ] [ Poster ] [ Webpage ]

  5. NrOS: Effective Replication and Sharing in an Operating System
    Ankit Bhardwaj, Chinmay Kulkarni, Reto Achermann, Irina Calciu, Sanidhya Kashyap, Ryan Stutsman, Amy Tai, and Gerd Zellweger
    15th USENIX Symposium on Operating Systems Design and Implementation (OSDI '21)
    [ Paper ] [ Talk ] [ Webpage ]

  6. Adaptive Placement for In-memory Storage Functions
    Ankit Bhardwaj, Chinmay Kulkarni, and Ryan Stutsman
    2020 USENIX Annual Technical Conference (ATC '20)
    [ Paper ] [ Talk ]

Refereed workshop publications
  1. ObjectTier: Non-invasively Boosting Memory Tiering Performance
    Vinita Pawar, Ankit Bhardwaj, Ryan Stutsman
    8th Workshop on Hot Topics in Cloud Computing Performance (HotCloudPerf '25)
    [ Paper ]

  2. Cache-Coherent Accelerators for Persistent Memory Crash Consistency
    Ankit Bhardwaj, Todd Thornley, Vinita Pawar, Reto Achermann, Gerd Zellweger, Ryan Stutsman
    14th ACM Workshop on Hot Topics in Storage and File Systems (HotStorage '22)
    Best Paper Award
    [ Paper ] [ Slides ] [ Talk ]

  3. On the Impact of Isolation Costs on Locality-aware Cloud Scheduling
    Ankit Bhardwaj, Meghana Gupta, and Ryan Stutsman
    12th USENIX Workshop on Hot Topics in Cloud Computing (HotCloud '20)
    [ Paper ] [ Talk ]

  4. A Preliminary Performance Model for Optimizing Software Packet Processing Pipelines
    Ankit Bhardwaj, Atul Shree, Bhargav Reddy V, and Sorav Bansal
    8th ACM SIGOPS Asia-Pacific Workshop on Systems (APSys '17)
    [ Paper ] [ Poster ] [ Slides ] [ Talk ] [ Full Paper (unpublished) ]

Research

I build low-latency, scalable systems for modern datacenters. My current work spans ML infrastructure, storage systems, and OS design.

  • Checkmate enables zero-overhead model checkpointing for distributed training by replicating gradients over the network, eliminating training pauses caused by checkpoint I/O. [NSDI '26]
  • Packrat is an auto-reconfiguration framework for CPU-based DNN serving that discovers optimal operator parallelism and batching strategies to minimize latency. [ICML '25]
  • Sandook addresses SSD performance variability in datacenters, unlocking their full potential for latency-sensitive applications. [NSDI '26]
  • NrOS is a Rust-based operating system that replicates kernel state per NUMA node and uses operation logging to synchronize replicas, scaling efficiently across multi-socket machines. [OSDI '21]
  • Splinter is a multi-tenant in-memory key-value store that leverages Rust's type system to provide microsecond-scale tenant isolation while enabling runtime storage function extensibility. [ATC '20]
  • Bacus applies compiler-driven optimizations to software packet processing, improving the performance and usability of DSLs such as P4. [APSys '17]
  • vLAB is a system for provisioning and managing large numbers of virtual machines on commodity hardware using distributed object storage.

Teaching