

Visualization for Supercomputer System: A Survey

    Abstract: With the rapid development of science and technology, supercomputer systems keep growing in size and complexity, which makes them harder to use and maintain. Visualization presents the running status and anomalies of a supercomputer system in an intuitive, easy-to-understand form, giving users a deeper understanding of the system's internal mechanisms and helping them use and optimize it. Based on a literature review, this survey analyzes existing work on supercomputer system visualization and, following the workflow of system usage, classifies it into three categories: status monitoring, performance optimization, and system maintenance. The analysis shows that, for status monitoring, visualization helps analyze the state of internal processes and data at run time; for performance optimization, it helps locate and resolve performance bottlenecks; and for system maintenance, it helps detect runtime anomalies and trace them to their source. Finally, we discuss in detail the opportunities and challenges that supercomputer system visualization faces in directions such as large-scale data processing, global optimization, and downstream task migration.


