Data, Systems and HPC (DaSH) Lab

BITS Pilani K. K. Birla Goa Campus, India


Dynamic virtual machine placement in cloud computing

Arnab K. Paul, Bibhudatta Sahoo

Resource Management and Efficiency in Cloud Computing Environments, pp. 136-167, IGI Global, 2017.


2022 - Present

2024


Tarazu: An Adaptive End-to-End I/O Load Balancing Framework for Large-Scale Parallel File Systems

Arnab K. Paul, Sarah Neuwirth, Bharti Wadhwa, Feiyi Wang, Sarp Oral, Ali R Butt

ACM Transactions on Storage, 2024

An End-to-End High-Performance Deduplication Scheme for Docker Registries and Docker Container Storage Systems

Nannan Zhao, Muhui Lin, Hadeel Albahar, Arnab K. Paul, Zhijie Huang, Subil Abraham, Keren Chen, Vasily Tarasov, Dimitrios Skourtis, Ali Anwar, Ali R Butt

ACM Transactions on Storage, 2024

I/O Performance Analysis of Machine Learning Workloads on Leadership Scale Supercomputer

Ahmad Maroof Karimi, Arnab K. Paul, Feiyi Wang

Performance Evaluation Journal (PEVA), 2022

2015 - 2020 (Ph.D.)

2020


Large-scale analysis of Docker images and performance implications for container storage systems

Nannan Zhao, Vasily Tarasov, Hadeel Albahar, Ali Anwar, Lukas Rupprecht, Dimitrios Skourtis, Arnab K. Paul, Keren Chen, Ali R Butt

IEEE Transactions on Parallel and Distributed Systems (Volume: 32, Issue: 4, 01 April 2021)


2022 - Present

2025


  1. Sarang S, Druva Dhakshinamoorthy, Aditya Shiva Sharma, Yuvraj Singh Bhadauria, Siddharth Chaitra Vivek, Arihant Bansal, Arnab K. Paul, “UnifyFL: Enabling Decentralized Cross-Silo Federated Learning”, arXiv preprint arXiv:2504.18916.
  2. Sarang S, Harsh D Chothani, Qilei Li, Ahmed M Abdelmoniem, Arnab K. Paul, “Benchmarking Mutual Information-based Loss Functions in Federated Learning”, arXiv preprint arXiv:2504.11877.
  3. Aishwarya Parab, Prakhar Pradhan, Yogesh Simmhan, Arnab K. Paul, “A Blockchain-Enabled Framework for Storage and Retrieval of Social Data”, arXiv preprint arXiv:2503.20497.
  4. Saumya Mathkar, Shreya Aiyer, Yashovardhan Bapat, Pinki Pinki, Arnab K. Paul, Vinayak Naik, “Scheduling Big Machine Learning Tasks on Clusters of Heterogeneous Edge Devices”, 2025 17th International Conference on COMmunication Systems and NETworks (COMSNETS).
  5. Ahmad Hossein Yazdani, Arnab K. Paul, Ahmad Maroof Karimi, Feiyi Wang, Ali Butt, “User-based I/O Profiling for Leadership Scale HPC Workloads”, Proceedings of the 26th International Conference on Distributed Computing and Networking.

2024


  1. Vijay Dharmaji, Manit Tanwar, Subroto Majumder, M Mustafa Rafique, Arnab K. Paul, “Towards Pre-Training Data Evaluation for Client Selection in Federated Learning”, 2024 IEEE 31st International Conference on High Performance Computing, Data and Analytics Workshop (HiPCW).
  2. Shashank Rana, Vimarsh Shah, Aishwarya Jayashankar, Ayush Bhardwaj, Arnab K. Paul, “Understanding Infrastructure Drift in Federated Learning Systems”, 2024 IEEE 31st International Conference on High Performance Computing, Data and Analytics Workshop (HiPCW).
  3. Advik Raj Basani, Siddharth Chaitra Vivek, Advaith Krishna, Arnab K. Paul, “When less is more: Achieving faster convergence in distributed edge machine learning”, 2024 IEEE 31st International Conference on High Performance Computing, Data, and Analytics (HiPC).
  4. Redwan Ibne Seraj Khan, Arnab K. Paul, Yue Cheng, Xun Steve Jian, Ali R Butt, “FedCaSe: Enhancing federated learning with heterogeneity-aware caching and scheduling”, Proceedings of the 2024 ACM Symposium on Cloud Computing.
  5. Arnav Gupta, Druva Dhakshinamoorthy, Arnab K. Paul, “Studying the Effects of Asynchronous I/O on HPC I/O Patterns”, 2024 IEEE International Conference on Cluster Computing Workshops (CLUSTER Workshops).

2023


  1. Arnav Borkar, Joel Tony, Tushar Barman, Yash Bhisikar, TM Sreenath, Arnab K. Paul, “Does Varying BeeGFS Configuration Affect the I/O Performance of HPC Workloads?”, 2023 IEEE International Conference on Cluster Computing Workshops (CLUSTER Workshops).
  2. Debasmita Biswas, Sarah Neuwirth, Arnab K. Paul, Ali R Butt, “An I/O Performance Evaluation of Varying CephFS Striping Patterns”, 2023 IEEE International Conference on Cluster Computing Workshops (CLUSTER Workshops).
  3. Ahmad Maroof Karimi, Arnab K. Paul, Jong Youl Choi, Lipeng Wan, Feiyi Wang, “Analyzing File Access Patterns on Large-Scale HPC Systems: Opportunities for File Prefetching”, 2023 31st International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems (MASCOTS).
  4. Debasmita Biswas, Arnab K. Paul, Sarah Neuwirth, Ali R Butt, “Modeling the Impact of System-Level Parameters on I/O Performance of HPC Applications”, 2023 31st International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems (MASCOTS).
  5. S Sai Vineet, Natasha Meena Joseph, Kunal Korgaonkar, Arnab K. Paul, “A Data-Centric Approach for Analyzing Large-Scale Deep Learning Applications”, Proceedings of the 24th International Conference on Distributed Computing and Networking.
  6. Natasha Meena Joseph, S Sai Vineet, Kunal Korgaonkar, Arnab K. Paul, “Characteristics of Deep Learning Workloads in Industry, Academic Institutions and National Laboratories”, Proceedings of the 24th International Conference on Distributed Computing and Networking.
  7. Redwan Ibne Seraj Khan, Ahmad Hossein Yazdani, Yuqi Fu, Arnab K. Paul, Bo Ji, Xun Jian, Yue Cheng, Ali R Butt, “SHADE: Enable Fundamental Cacheability for Distributed Deep Learning Training”, 21st USENIX Conference on File and Storage Technologies (FAST 23).

2022


  1. Awais Khan, Arnab K. Paul, Christopher Zimmer, Sarp Oral, Sajal Dash, Scott Atchley, Feiyi Wang, “HVAC: Removing I/O bottleneck for large-scale deep learning applications”, 2022 IEEE International Conference on Cluster Computing (CLUSTER).
  2. Arnab K. Paul, Jong Youl Choi, Ahmad Maroof Karimi, Feiyi Wang, “Machine Learning Assisted HPC Workload Trace Generation for Leadership Scale Storage Systems”, Proceedings of the 31st International Symposium on High-Performance Parallel and Distributed Computing.
  3. Jean Luca Bez, Ahmad Maroof Karimi, Arnab K. Paul, Bing Xie, Suren Byna, Philip Carns, Sarp Oral, Feiyi Wang, Jesse Hanley, “Access Patterns and Performance Behaviors of Multi-layer Supercomputer I/O Subsystems under Production Load”, Proceedings of the 31st International Symposium on High-Performance Parallel and Distributed Computing.
  4. Hadeel Albahar, Shruti Dongare, Yanlin Du, Nannan Zhao, Arnab K. Paul, Ali R Butt, “Schedtune: A heterogeneity-aware GPU scheduler for deep learning”, 2022 22nd IEEE International Symposium on Cluster, Cloud and Internet Computing (CCGrid).

2021 (Post Doc)

2021


  1. Debasmita Biswas, Sarah Neuwirth, Arnab K. Paul, Ali R Butt, “Bridging Network and Parallel I/O Research for Improving Data-Intensive Distributed Applications”, 2021 IEEE Workshop on Innovating the Network for Data-Intensive Science (INDIS).
  2. Arnab K. Paul, Ahmad Maroof Karimi, Feiyi Wang, “Characterizing machine learning I/O workloads on leadership scale HPC systems”, 2021 29th International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems (MASCOTS).
  3. Sarah Neuwirth, Arnab K. Paul, “Parallel I/O Evaluation Techniques and Emerging HPC Workloads: A Perspective”, 2021 IEEE International Conference on Cluster Computing (CLUSTER).

2015 - 2020 (Ph.D.)

2020


  1. Arnab K. Paul, Olaf Faaland, Adam Moody, Elsa Gonsiorowski, Kathryn Mohror, Ali R Butt, “Understanding HPC Application I/O Behavior using system level statistics”, 2020 IEEE 27th International Conference on High Performance Computing, Data, and Analytics (HiPC).
  2. Breno Dantas Cruz, Arnab K. Paul, Zheng Song, Eli Tilevich, “Stargazer: A deep learning approach for estimating the performance of edge-based clustering applications”, 2020 IEEE International Conference on Smart Data Services (SMDS).
  3. Subil Abraham, Arnab K. Paul, Redwan Ibne Seraj Khan, Ali R Butt, “On the use of containers in high performance computing environments”, 2020 IEEE 13th International Conference on Cloud Computing (CLOUD).
  4. Arnab K. Paul, “An application-attuned framework for optimizing HPC storage systems”, Ph.D. Dissertation.
  5. Arnab K. Paul, Brian Wang, Nathan Rutman, Cory Spitz, Ali R Butt, “Efficient metadata indexing for HPC storage systems”, 2020 20th IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing (CCGRID).

2019


  1. Arnab K. Paul, Ryan Chard, Kyle Chard, Steven Tuecke, Ali R Butt, Ian Foster, “Fsmonitor: Scalable file system monitoring for arbitrary storage systems”, 2019 IEEE International Conference on Cluster Computing (CLUSTER).
  2. Bharti Wadhwa, Arnab K. Paul, Sarah Neuwirth, Feiyi Wang, Sarp Oral, Ali R Butt, Jon Bernard, Kirk W. Cameron, “IEZ: Resource Contention Aware Load Balancing for Large-Scale Parallel File Systems”, IEEE International Parallel and Distributed Processing Symposium (IPDPS).
  3. Hyogi Sim, Arnab K. Paul, Eli Tilevich, Ali R Butt, Muhammad Shahzad, “Cslim: Automated extraction of IoT functionalities from legacy C codebases”, Proceedings of the 20th International Conference on Distributed Computing and Networking.
  4. Arnab K. Paul, Olaf Faaland, Adam Moody, Elsa Gonsiorowski, Kathryn Mohror, Ali R Butt, “Improving I/O Performance of HPC Applications Using Intra-Job Scheduling”, Proceedings of the 4th Joint International Workshop on Parallel Data Storage & Data Intensive Scalable Computing Systems (PDSW-DISC'19) in conjunction with SC'19.

2017


  1. Arnab K. Paul, Arpit Goyal, Feiyi Wang, Sarp Oral, Ali R Butt, Michael J Brim, Sangeetha B Srinivasa, “I/O Load Balancing for Big Data HPC Applications”, 2017 IEEE International Conference on Big Data (Big Data).
  2. Arnab K. Paul, Steven Tuecke, Ryan Chard, Ali R Butt, Kyle Chard, Ian Foster, “Toward Scalable Monitoring on Large-scale Storage for Software defined Cyberinfrastructure”, Proceedings of the 2nd Joint International Workshop on Parallel Data Storage & Data Intensive Scalable Computing Systems.

2016


  1. Arnab K. Paul, Wenjie Zhuang, Luna Xu, Min Li, M Mustafa Rafique, Ali R Butt, “Chopper: Optimizing data partitioning for in-memory data analytics frameworks”, 2016 IEEE International Conference on Cluster Computing (CLUSTER).

2015


  1. Arnab K. Paul, “Dynamic virtual machine placement in cloud computing”, Master's Dissertation.

2014

2014


  1. Arnab K. Paul, Sourav Kanti Addya, Bibhudatta Sahoo, Ashok Kumar Turuk, “Application of greedy algorithms to virtual machine distribution across data centers”, 2014 Annual IEEE India Conference (INDICON).
  2. Arjun Datta, Arnab K. Paul, “Online Compiler as a Cloud Service”, 2014 IEEE International Conference on Advanced Communications, Control and Computing Technologies.