ModelLogVis: Log Anomaly Detection Visual Analysis Method for Model Service

卢裕弘; 朱琳; 封颖超杰; 王斯加; 林正轩; 潘嘉铖; 陈为

doi:10.3724/SP.J.1089.2024.19902

Abstract: Using the massive logs produced while deep learning models are trained, operated, and maintained to optimize models and troubleshoot faults is an active research topic in artificial intelligence operations. Existing work, however, lacks support for analyzing model workflows. To address this gap, we propose ModelLogVis, a visual analysis approach for diagnosing log anomalies in model services. The approach applies a log anomaly detection method to locate potential faults in the model workflow, helping users focus on the main fault types. An integrated visual interface presents workflow events from multiple perspectives, including dataflow, status, instance performance, and original logs, and supports interactive, progressive analysis so that users can troubleshoot problems quickly and accurately. Case studies on real model service data and expert interviews show that ModelLogVis efficiently helps users uncover anomalous information in logs.
