Computer Integrated Manufacturing Systems ›› 2022, Vol. 28 ›› Issue (9): 2927-2938. DOI: 10.13196/j.cims.2022.09.023


Global sensitivity analysis under data contamination

XIE En, MA Yizhong+, LIU Lijun, LIN Chenglong

  1. School of Economics and Management, Nanjing University of Science and Technology
  • Online: 2022-09-30 Published: 2022-10-10
  • Supported by:
    National Natural Science Foundation of China (No. 71871119); Jiangsu Funding Program for Excellent Postdoctoral Talent (No. 2022ZB259); Postgraduate Research and Practice Innovation Program of Jiangsu Province (No. KYCX19_0349).

Abstract: Variance-based global sensitivity analysis is widely used in many fields. Under data contamination, however, the sample mean and sample variance are easily distorted by outliers. To address this problem, an improved Sobol' Monte Carlo simulation method is proposed that combines robust statistics with the double-loop reordering approach. The sample median or the Hodges-Lehmann estimator replaces the sample mean as the estimate of the location parameter, and the median absolute deviation or the Shamos estimator replaces the sample variance as the estimate of the scale parameter. Under data contamination or skewed distributions, the resulting robust double-loop reordering approach supports factor prioritization and factor mapping by correctly identifying the importance of the model input factors, so that uncertainty in the model output can be traced back to its sources. Simulation results show that the proposed method is robust and outlier-resistant when the proportion of outliers in the data does not exceed 29.3%. It is effective not only for clean, normally distributed data but also for contaminated or skewed data.

Key words: data contamination, global sensitivity analysis, robust statistics, Sobol' indices
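
The robust estimators named in the abstract can be illustrated with a short Python sketch. It is not the authors' implementation. The function names, the toy model y = 2*x1 + x2, the bin count, and the 5% contamination level are all assumptions made for illustration, and `first_order_index` is only one plausible reading of a double-loop reordering estimate (sort the output by one input, estimate a location per bin, then compare the spread of those estimates with the spread of the output). The constants 1.4826 and 1.0483 are the usual factors that make the median absolute deviation and the Shamos estimator consistent for the standard deviation under normality.

```python
import numpy as np


def hodges_lehmann(x):
    """Location estimate: median of the Walsh averages (x_i + x_j) / 2, i <= j."""
    x = np.asarray(x, dtype=float)
    i, j = np.triu_indices(len(x))  # all index pairs with i <= j
    return float(np.median((x[i] + x[j]) / 2.0))


def mad(x, c=1.4826):
    """Scale estimate: scaled median absolute deviation from the median."""
    x = np.asarray(x, dtype=float)
    return float(c * np.median(np.abs(x - np.median(x))))


def shamos(x, c=1.0483):
    """Scale estimate: scaled median of pairwise distances |x_i - x_j|, i < j."""
    x = np.asarray(x, dtype=float)
    i, j = np.triu_indices(len(x), k=1)  # all index pairs with i < j
    return float(c * np.median(np.abs(x[i] - x[j])))


def first_order_index(x, y, n_bins=20, loc=np.mean, scale=np.std):
    """Hypothetical given-data estimate of a first-order index.

    Inner loop: sort y by x, split into bins, take a location estimate per bin.
    Outer loop: squared ratio of the spread of those estimates to the spread of y.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    bins = np.array_split(y[np.argsort(x)], n_bins)
    cond_loc = np.array([loc(b) for b in bins])
    return float((scale(cond_loc) / scale(y)) ** 2)


def demo(seed=0, n=2000):
    """Toy comparison on y = 2*x1 + x2 (assumed model, not from the paper)."""
    rng = np.random.default_rng(seed)
    x1, x2 = rng.standard_normal(n), rng.standard_normal(n)
    y = 2.0 * x1 + x2  # analytic first-order index of x1 is 4/5
    y_bad = y.copy()
    bad = rng.choice(n, size=n // 20, replace=False)  # contaminate 5% of outputs
    y_bad[bad] += 50.0
    return {
        "clean_classic": first_order_index(x1, y),
        "bad_classic": first_order_index(x1, y_bad),
        "bad_robust": first_order_index(x1, y_bad, loc=np.median, scale=mad),
    }


if __name__ == "__main__":
    for name, value in demo().items():
        print(f"{name}: {value:.3f}")
```

Passing `loc=hodges_lehmann` or `scale=shamos` swaps in the pairwise estimators. Both build all index pairs, so their memory use grows quadratically with the sample size. Squaring a ratio of robust scales only approximates a variance ratio when the quantities involved are roughly normal, which is why the toy model uses normal inputs.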
