Traditional image processing and machine learning approaches to crack detection are typically tailored to specific scenarios. When the scene changes, their detection accuracy degrades significantly, so they lack robustness across application scenarios. To adapt to diverse application contexts, we improve upon the original crack detection method, CrackFormer, and propose a detection approach based on the Gaussian Scale Mixture (GSM) model, termed GSM-CrackFormer. First, a Gaussian Scale Mixture model is constructed to describe the Gaussian distributions related to crack features. Next, a signal transformer incorporating a gating mechanism is designed to convert the crack feature information generated by these distributions into guiding signals that enhance semantic features. In addition, a novel up- and down-sampling strategy is introduced to better balance the model's receptive field against its ability to capture detailed features. The loss function is also adjusted to mitigate the imbalance between crack and non-crack pixels. Finally, we evaluate the proposed method on the CrackSeg9k dataset. The experimental results show that GSM-CrackFormer outperforms state-of-the-art methods, achieving an optimal dataset scale (ODS) score of 0.784, an optimal image scale (OIS) score of 0.785, and a mean intersection over union (mIoU) of 0.828.
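For readers unfamiliar with the three reported metrics, the following is a minimal sketch of how ODS, OIS, and mIoU are commonly computed for binary crack maps. It is an illustration under stated assumptions, not the evaluation code behind the numbers above: the function names (`ods_ois`, `mean_iou`), the threshold grid of 0.01 to 0.99, and the strict pixel-wise matching without a distance tolerance are all hypothetical choices, and the protocol actually used may differ.

```python
import numpy as np

def f_measure(tp, fp, fn):
    # F1 = 2PR / (P + R), written in terms of pixel counts.
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom > 0 else 0.0

def counts(prob, gt, t):
    # Binarize the probability map at threshold t and count TP / FP / FN pixels.
    pred = prob >= t
    gt = gt.astype(bool)
    return (int(np.sum(pred & gt)),
            int(np.sum(pred & ~gt)),
            int(np.sum(~pred & gt)))

def ods_ois(probs, gts, thresholds=None):
    # Assumed threshold grid; not taken from the source.
    if thresholds is None:
        thresholds = np.linspace(0.01, 0.99, 99)
    # per_image has shape (num_images, num_thresholds, 3).
    per_image = np.array([[counts(p, g, t) for t in thresholds]
                          for p, g in zip(probs, gts)])
    # ODS: one threshold shared by the whole dataset, chosen to maximize F1
    # on the pooled counts.
    pooled = per_image.sum(axis=0)
    ods = float(max(f_measure(*c) for c in pooled))
    # OIS: the best threshold is chosen per image, then the F1 scores are averaged.
    ois = float(np.mean([max(f_measure(*c) for c in img) for img in per_image]))
    return ods, ois

def mean_iou(pred, gt):
    # Mean IoU over the two classes (crack, background) of a binary mask.
    pred, gt = pred.astype(bool), gt.astype(bool)
    ious = []
    for cls in (True, False):
        inter = np.sum((pred == cls) & (gt == cls))
        union = np.sum((pred == cls) | (gt == cls))
        if union > 0:  # skip a class absent from both masks
            ious.append(inter / union)
    return float(np.mean(ious))
```

Because OIS is free to pick a different threshold for every image, it is typically at least as high as ODS under this convention, which is consistent with the reported 0.785 versus 0.784.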