2021, Vol. 27, Issue (9): 2636-2646. DOI: 10.13196/j.cims.2021.09.016

Algorithm for single-firing sequence sudden drift detection

  

  • Online: 2021-09-30  Published: 2021-09-30
  • Supported by:
    Project supported by the National Natural Science Foundation, China (No. 62002310), the Major Project of Science and Technology of Yunnan Province, China (No. 202002AD080002), the Yunnan Provincial Natural Science Foundation, China (No. 202101AT070004, 2019FB135), the Open Fund Project of Yunnan Provincial Software Engineering Key Laboratory, China (No. 2020SE404), the Data-Driven Software Engineering Provincial Science and Technology Innovation Team Project of Yunnan University, China (No. 2017HC012), the “Dong Lu Young-backbone Teacher” Training Program of Yunnan University, China (No. C176220200), and the Yunnan Provincial Philosophy and Social Science Youth Project, China (No. QN2020024).

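YUAN Jiayi1,6, ZHU Rui1,2+, LIN Leilei3, LI Tong2,4, ZHENG Ming5

  1. School of Software, Yunnan University
  2. Key Laboratory of Software Engineering of Yunnan Province
  3. School of Software, Tsinghua University
  4. College of Big Data, Yunnan Agricultural University
  5. School of Information, Yunnan University
  6. School of Teacher Education, Shanxi Normal University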

Abstract: Existing drift detection algorithms are not suited to the drift problem of a single firing sequence, so a sudden drift detection method based on changes in activity distance is proposed. First, the activity relation matrix in each sliding window is extracted to obtain the feature vector of the relations. Second, to reduce the dimension of the relation matrix, it is converted into a Jaccard distance distribution matrix by calculating the Jaccard distance between the activities of sliding windows. Third, Kullback-Leibler (KL) divergence is used to compare the change in probability distribution between adjacent distance matrices and so locate the drift interval. Finally, to resolve the uncertainty caused by the choice of granularity, the positions of loop relations are taken as window sizes and traversed in turn, and the intersection of the resulting drift intervals locates the drift point. The method was evaluated on simulated data sets covering 12 change patterns, with 5 logs of different sizes per pattern, and on real data sets of execution logs from two software repositories. The results show that the proposed method can effectively locate sudden drift in a single firing sequence.

Key words: sudden drift, single firing sequence, Jaccard distance, Kullback-Leibler divergence, drift detection algorithm
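
The pipeline summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation; several details are assumptions made only for illustration: the "relation" is taken to be direct succession between activities, windows are non-overlapping, the distance distribution is a smoothed histogram of per-activity Jaccard distances, the candidate window sizes are passed in by the caller (the abstract derives them from the positions of loop relations), and exactly one sudden drift is assumed. All function names, the bin count, the smoothing constant and the threshold are hypothetical.

```python
from math import log


def successor_sets(window):
    """Map each activity to the set of activities that directly follow it."""
    succ = {}
    for a, b in zip(window, window[1:]):
        succ.setdefault(a, set()).add(b)
    return succ


def jaccard_distance(s, t):
    """1 - |s & t| / |s | t|; two empty sets count as identical."""
    union = s | t
    return 0.0 if not union else 1.0 - len(s & t) / len(union)


def distance_histogram(w1, w2, activities, bins=5, eps=1e-6):
    """Smoothed histogram of per-activity Jaccard distances between two windows.

    Assumption: this histogram stands in for the paper's distance distribution.
    """
    s1, s2 = successor_sets(w1), successor_sets(w2)
    counts = [eps] * bins  # eps keeps every bin positive so KL stays finite
    for act in activities:
        d = jaccard_distance(s1.get(act, set()), s2.get(act, set()))
        counts[min(int(d * bins), bins - 1)] += 1
    total = sum(counts)
    return [c / total for c in counts]


def kl_divergence(p, q):
    """KL(p || q) for two strictly positive discrete distributions."""
    return sum(pi * log(pi / qi) for pi, qi in zip(p, q))


def drift_intervals(sequence, size, threshold):
    """Index ranges (start, end) where adjacent distance histograms diverge."""
    activities = sorted(set(sequence))
    windows = [sequence[i:i + size]
               for i in range(0, len(sequence) - size + 1, size)]
    hists = [distance_histogram(windows[i], windows[i + 1], activities)
             for i in range(len(windows) - 1)]
    # hists[i] and hists[i + 1] together cover windows i, i + 1 and i + 2.
    return [(i * size, (i + 3) * size)
            for i in range(len(hists) - 1)
            if kl_divergence(hists[i], hists[i + 1]) > threshold]


def locate_drift(sequence, sizes, threshold=1.0):
    """Intersect the flagged intervals over all window sizes (single drift assumed)."""
    intervals = [iv for s in sizes
                 for iv in drift_intervals(sequence, s, threshold)]
    if not intervals:
        return None
    lo = max(start for start, _ in intervals)
    hi = min(end for _, end in intervals)
    return (lo, hi) if lo < hi else None
```

As a usage example, for the toy sequence `list("abc" * 20 + "acb" * 20)`, whose behaviour changes at index 60, calling `locate_drift(seq, sizes=(6, 8, 10))` returns an index range that contains 60. Each window size alone yields fairly wide intervals; intersecting them is what narrows the location, which is the role the abstract assigns to traversing several window sizes.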
