Yu Minzhu, Yu Xiaomin, Wang Yang, Chen Kaixin, Shan Guihua, Jin Zhong. Research on the Visual Analysis of Literature Clustering Results[J]. Journal of Computer-Aided Design & Computer Graphics, 2020, 32(10): 1645-1654. DOI: 10.3724/SP.J.1089.2020.18469

Research on the Visual Analysis of Literature Clustering Results

Abstract: In the information age, the volume of literature data is growing explosively. For massive collections of unlabeled literature, unsupervised text clustering can reorganize and summarize large-scale data quickly and efficiently. However, many factors affect the quality of literature clustering: from data preprocessing to text representation to the choice of clustering algorithm, a different choice at any step can produce very different results. Moreover, the variety of methods available at each step makes clustering results hard to interpret and evaluate, which poses a major obstacle to effective literature clustering. To address this, this paper proposes a complete visual analysis framework for literature clustering results. The framework covers data preprocessing, text representation, text clustering, and visual analysis of clustering results, and it uses corpus structure visualization, corpus content visualization, text vector dimension visualization, and visual interaction to interpret, analyze, evaluate, adjust, and optimize the clustering results. Based on this framework, a visual analysis system for literature clustering results is designed and implemented, and the influence of different text representation methods and clustering algorithms on the clustering results is studied. Finally, three case studies verify the effectiveness of the framework.
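The pipeline stages named in the abstract (data preprocessing, text representation, text clustering, and interpretation of the result) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the toy corpus, the function names `preprocess`, `represent`, `cluster`, and `top_terms`, the smoothed IDF weighting, and the use of cosine-similarity k-means with farthest-first seeding. The paper's system may use different representations and algorithms.

```python
import math
from collections import Counter

# Toy corpus (hypothetical titles, not data from the paper).
DOCS = [
    "graph layout visualization citation network analysis",
    "interactive graph visualization network analysis",
    "protein folding molecular dynamics simulation analysis",
    "molecular dynamics protein simulation analysis",
]

def preprocess(text):
    """Data preprocessing: lowercase and split on whitespace."""
    return text.lower().split()

def represent(docs, use_idf):
    """Text representation: term-frequency or TF-IDF vectors over a shared vocabulary."""
    tokens = [preprocess(d) for d in docs]
    vocab = sorted({t for doc in tokens for t in doc})
    n = len(docs)
    df = Counter(t for doc in tokens for t in set(doc))  # document frequency
    # Smoothed IDF (an assumed weighting); a weight of 1.0 gives plain term frequency.
    weight = {
        t: (math.log((1 + n) / (1 + df[t])) + 1.0) if use_idf else 1.0
        for t in vocab
    }
    vectors = []
    for doc in tokens:
        counts = Counter(doc)
        vectors.append([counts[t] * weight[t] for t in vocab])
    return vocab, vectors

def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def cluster(vectors, k, max_iter=20):
    """Text clustering: k-means with cosine similarity and deterministic seeding."""
    # Farthest-first seeding: start from document 0, then repeatedly add the
    # document least similar to the centroids chosen so far.
    centroids = [vectors[0]]
    while len(centroids) < k:
        centroids.append(
            min(vectors, key=lambda v: max(cosine(v, c) for c in centroids))
        )
    labels = None
    for _ in range(max_iter):
        # Assign each document to its most similar centroid.
        new_labels = [
            max(range(k), key=lambda j: cosine(v, centroids[j])) for v in vectors
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Recompute each centroid as the mean of its member vectors.
        for j in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

def top_terms(centroid, vocab, n=3):
    """Interpretation aid: the n highest-weighted terms of a cluster centroid."""
    ranked = sorted(zip(centroid, vocab), reverse=True)
    return [term for _, term in ranked[:n]]

if __name__ == "__main__":
    # Run the same clustering under two text representations and compare.
    for name, use_idf in [("term frequency", False), ("TF-IDF", True)]:
        vocab, vectors = represent(DOCS, use_idf)
        labels, centroids = cluster(vectors, k=2)
        print(name, labels, [top_terms(c, vocab) for c in centroids])
```

Running the same clustering under two representations and printing the top-weighted terms per cluster is a text-only stand-in for the kind of comparison the framework supports visually. On a corpus this small both representations agree; differences between representations would only be expected to show up on larger, noisier collections.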

     
