Enhancing I/O observability and storage reliability in high-performance computing systems