Kathryn Mohror
Title
Cited by
Year
Design, modeling, and evaluation of a scalable multi-level checkpointing system
A Moody, G Bronevetsky, K Mohror, BR De Supinski
SC'10: Proceedings of the 2010 ACM/IEEE International Conference for High …, 2010
Cited by 836, 2010
There goes the neighborhood: performance degradation due to nearby jobs
A Bhatele, K Mohror, SH Langer, KE Isaacs
Proceedings of the International Conference on High Performance Computing …, 2013
Cited by 253, 2013
Design and modeling of a non-blocking checkpointing system
K Sato, N Maruyama, K Mohror, A Moody, T Gamblin, BR de Supinski, ...
SC'12: Proceedings of the International Conference on High Performance …, 2012
Cited by 148, 2012
An ephemeral burst-buffer file system for scientific applications
T Wang, K Mohror, A Moody, K Sato, W Yu
SC'16: Proceedings of the International Conference for High Performance …, 2016
Cited by 145, 2016
MCREngine: A scalable checkpointing system using data-aware aggregation and compression
TZ Islam, K Mohror, S Bagchi, A Moody, BR De Supinski, R Eigenmann
SC'12: Proceedings of the International Conference on High Performance …, 2012
Cited by 137, 2012
VeloC: Towards high performance adaptive asynchronous checkpointing at large scale
B Nicolae, A Moody, E Gonsiorowski, K Mohror, F Cappello
2019 IEEE International Parallel and Distributed Processing Symposium (IPDPS …, 2019
Cited by 99, 2019
A large-scale study of MPI usage in open-source HPC applications
I Laguna, R Marshall, K Mohror, M Ruefenacht, A Skjellum, N Sultana
Proceedings of the International Conference for High Performance Computing …, 2019
Cited by 98, 2019
The Popper convention: Making reproducible systems evaluation practical
I Jimenez, M Sevilla, N Watkins, C Maltzahn, J Lofstead, K Mohror, ...
2017 IEEE International Parallel and Distributed Processing Symposium …, 2017
Cited by 88, 2017
A 1 PB/s file system to checkpoint three million MPI tasks
R Rajachandrasekar, A Moody, K Mohror, DK Panda
Proceedings of the 22nd International Symposium on High-Performance Parallel …, 2013
Cited by 83, 2013
ADAPT: Algorithmic differentiation applied to floating-point precision tuning
H Menon, MO Lam, D Osei-Kuffuor, M Schordan, S Lloyd, K Mohror, ...
SC18: International Conference for High Performance Computing, Networking …, 2018
Cited by 81, 2018
A user-level InfiniBand-based file system and checkpoint strategy for burst buffers
K Sato, K Mohror, A Moody, T Gamblin, BR De Supinski, N Maruyama, ...
2014 14th IEEE/ACM International Symposium on Cluster, Cloud and Grid …, 2014
Cited by 81, 2014
Entropy-aware I/O pipelining for large-scale deep learning on HPC systems
Y Zhu, F Chowdhury, H Fu, A Moody, K Mohror, K Sato, W Yu
2018 IEEE 26th International Symposium on Modeling, Analysis, and Simulation …, 2018
Cited by 76, 2018
I/O characterization and performance evaluation of BeeGFS for deep learning
F Chowdhury, Y Zhu, T Heer, S Paredes, A Moody, R Goldstone, ...
Proceedings of the 48th International Conference on Parallel Processing, 1-10, 2019
Cited by 75, 2019
Evaluating and extending user-level fault tolerance in MPI applications
I Laguna, DF Richards, T Gamblin, M Schulz, BR de Supinski, K Mohror, ...
The International Journal of High Performance Computing Applications 30 (3 …, 2016
Cited by 60, 2016
Evaluating similarity-based trace reduction techniques for scalable performance analysis
K Mohror, KL Karavanic
Proceedings of the Conference on High Performance Computing Networking …, 2009
Cited by 52, 2009
Managing I/O interference in a shared burst buffer system
S Thapaliya, P Bangalore, J Lofstead, K Mohror, A Moody
2016 45th International Conference on Parallel Processing (ICPP), 416-425, 2016
Cited by 50, 2016
Efficient user-level storage disaggregation for deep learning
Y Zhu, W Yu, B Jiao, K Mohror, A Moody, F Chowdhury
2019 IEEE International Conference on Cluster Computing (CLUSTER), 1-12, 2019
Cited by 46, 2019
Ad hoc file systems for high-performance computing
A Brinkmann, K Mohror, W Yu, P Carns, T Cortes, SA Klasky, A Miranda, ...
Journal of Computer Science and Technology 35, 4-26, 2020
Cited by 44, 2020
Recorder 2.0: Efficient parallel I/O tracing and analysis
C Wang, J Sun, M Snir, K Mohror, E Gonsiorowski
2020 IEEE International Parallel and Distributed Processing Symposium …, 2020
Cited by 39, 2020
FMI: Fault tolerant messaging interface for fast and transparent recovery
K Sato, A Moody, K Mohror, T Gamblin, BR de Supinski, N Maruyama, ...
2014 IEEE 28th International Parallel and Distributed Processing Symposium …, 2014
Cited by 39, 2014