Abstract:
Using the massive volume of logs produced by deep learning models for model optimization and troubleshooting has recently become an active research topic. To address the challenge of model workflow analysis, we propose ModelLogVis, a visual analysis approach for diagnosing log anomalies in model services. Our approach employs a log anomaly detection method to locate potential faults in the model workflow, guiding users to focus on the most significant fault types. Our integrated visual interface presents workflow events from multiple perspectives, including dataflow, status, instance performance, and original logs, and supports users in progressively analyzing faults in the workflow. Case studies on real datasets and expert interviews demonstrate that our approach helps users quickly uncover anomalous information in logs.