2026
- MLSys | Stream2LLM: Overlap Context Streaming and Prefill for Reduced Time-to-First-Token. The Ninth Annual Conference on Machine Learning and Systems (MLSys'26). Rajveer Bachkaniwala, Chengqi Luo, Richard So, Divya Mahajan, Kexin Rong. [Will be available soon!]
2024
- HotInfra | Lotus: Characterize Architecture Level CPU-based Preprocessing in Machine Learning Pipelines. The 2nd Workshop on Hot Topics in System Infrastructure (HotInfra’24), co-located with SOSP’24. Rajveer Bachkaniwala, Harshith Lanka, Kexin Rong, Ada Gavrilovska. [PDF] [Project Page] [Code] [Slides] [DeepWiki]
- IISWC | Lotus: Characterization of Machine Learning Preprocessing Pipelines via Framework and Hardware Profiling. 2024 IEEE International Symposium on Workload Characterization (IISWC 2024). Rajveer Bachkaniwala, Harshith Lanka, Kexin Rong, Ada Gavrilovska. [PDF] [Project Page] [Code] [Slides] [DeepWiki] Artifact Info: Available, Reviewed, Reproduced. Awards: 🏆 Best paper nominee.