Abstract:
Survival prediction is vital for treating gastric cancer patients. As one of the gold standards for tumor diagnosis, histopathological images have received much attention for tumor prognosis prediction in recent years. However, conventional survival prediction methods usually rely on unimodal data alone, ignoring the correlation and complementarity between different data modalities. Moreover, most pathology images lack pixel-level labels, which makes effective supervised learning difficult. Therefore, this paper proposes a survival prediction algorithm for gastric cancer patients based on multi-modal multi-instance learning. The method first extracts features from clinical data and histopathology images. It then adopts global-aware multi-instance learning to extract bag-level embeddings at high magnification, and uses average pooling to obtain instance-level embeddings of histopathology images at low magnification. Next, a multi-modal fusion approach combines the bag-level features, instance-level features, and clinical data features, enabling information interaction between modalities and making full use of the image information at different magnifications. Experimental results show that, compared with traditional unimodal approaches, the proposed multi-modal multi-instance learning method significantly improves survival prediction accuracy for gastric cancer patients.
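
The data flow summarized above can be illustrated with a minimal NumPy sketch. It is not the paper's implementation: the abstract does not specify the global-aware multi-instance module, the fusion operator, or the prediction head. The sketch therefore assumes a generic attention-weighted pooling as a stand-in for the bag-level step, plain concatenation as the fusion step, and a linear risk score as the output. All function names, feature dimensions, and the random inputs are hypothetical.

```python
import numpy as np

def attention_mil_pool(instances, v, w):
    """Pool a bag of instance features (n, d) into one bag-level embedding (d,).

    Assumption: generic attention-based MIL pooling, used here only as a
    stand-in for the global-aware module named in the abstract.
    """
    scores = np.tanh(instances @ v) @ w           # one score per instance, shape (n,)
    scores = scores - scores.max()                # shift for numerical stability
    weights = np.exp(scores) / np.exp(scores).sum()  # softmax over instances
    return weights @ instances, weights

def mean_pool(instances):
    """Average pooling over low-magnification patch features (m, d) -> (d,)."""
    return instances.mean(axis=0)

def fuse(bag_emb, low_emb, clinical):
    """Assumption: fusion by simple concatenation of the three feature vectors."""
    return np.concatenate([bag_emb, low_emb, clinical])

# Hypothetical, randomly generated inputs standing in for extracted features.
rng = np.random.default_rng(0)
high_mag = rng.normal(size=(50, 16))   # 50 high-magnification patches, 16-dim features
low_mag = rng.normal(size=(8, 16))     # 8 low-magnification patches, 16-dim features
clinical = rng.normal(size=5)          # 5 clinical variables

# Hypothetical, untrained parameters.
v = rng.normal(size=(16, 8))           # attention projection
w = rng.normal(size=8)                 # attention scoring vector
beta = rng.normal(size=16 + 16 + 5)    # linear risk coefficients

bag_emb, attn = attention_mil_pool(high_mag, v, w)
low_emb = mean_pool(low_mag)
fused = fuse(bag_emb, low_emb, clinical)
risk = float(fused @ beta)             # higher value = higher predicted risk (by convention)
```

In a trained model the parameters would be learned from survival data rather than sampled at random; the sketch only shows how bag-level, instance-level, and clinical features could be brought into a single vector.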