Computer Integrated Manufacturing Systems ›› 2023, Vol. 29 ›› Issue (12): 4256-4266. DOI: 10.13196/j.cims.2022.1045


Improved graph node embedding and clustering method for fault short text

QIU Jingxiong1,2, SUN Linfu1,2+, HAN Min1,2

  1. School of Computing and Artificial Intelligence, Southwest Jiaotong University
  2. Manufacturing Industry Chains Collaboration and Information Support Technology Key Laboratory of Sichuan Province
  • Online: 2023-12-31  Published: 2024-01-11
  • Supported by:
    National Key Research and Development Program of China (No. 2018YFB1701500, 2018YFB1701502).

Abstract: To effectively mine cross-text associations between words in fault short texts, construct global feature representations of fault entity nodes, and thereby obtain clustering labels for those nodes, an improved graph node embedding and clustering method for fault short text is proposed. First, during graph construction, a new edge-weight calculation is introduced to distinguish the associations between words at different distances within the same window. Second, the method for obtaining node structural features is improved so that differences in node degree are reflected in the embedding. The structural and relational features of the nodes are then fused to strengthen the similarity between same-class nodes that have similar neighbor nodes. In the clustering stage, a parameter called the alternative nodes number is designed to reduce sensitivity to the cut-off distance. Parameter analysis and performance evaluation were carried out on public datasets and real business data, and the results show that the proposed method obtains accurate and effective clustering results for fault entity nodes.

Keywords: fault short text, graph node embedding, local density, graph node clustering
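
The idea of distance-aware edge weights within a co-occurrence window can be illustrated with a minimal sketch. The abstract does not give the paper's actual formula, so the 1/d decay, the function name `build_cooccurrence_graph`, and the sample tokens below are all assumptions made for illustration only. The sketch only shows how two words that sit closer together inside the same window could receive a larger weight than two words further apart, and how weights accumulate across texts.

```python
from collections import defaultdict


def build_cooccurrence_graph(texts, window=3):
    """Build an undirected weighted word graph from tokenized short texts.

    Assumption (not from the paper): a pair of words that are d positions
    apart, with 1 <= d < window, adds 1/d to the weight of their edge.
    """
    weights = defaultdict(float)
    for tokens in texts:
        for i, left in enumerate(tokens):
            # Pair each token with the tokens that follow it inside the window.
            for d in range(1, window):
                j = i + d
                if j >= len(tokens):
                    break
                right = tokens[j]
                if left == right:
                    continue  # skip self-loops
                edge = tuple(sorted((left, right)))
                weights[edge] += 1.0 / d  # closer pairs contribute more
    return dict(weights)


# Hypothetical fault short texts, already tokenized.
texts = [
    ["engine", "oil", "leak"],
    ["oil", "leak", "gasket"],
]
graph = build_cooccurrence_graph(texts, window=3)
for edge, weight in sorted(graph.items()):
    print(edge, weight)
```

In this toy input, "oil" and "leak" are adjacent in both texts, so their edge accumulates the largest weight, while "engine" and "leak" co-occur only once at distance 2 and receive a smaller one. Because the weights are summed over all texts, the graph captures associations that span more than one short text.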

CLC number: