Parallel frequent pattern growth algorithm optimization in cloud manufacturing environment

• Article •

WANG Jie, DAI Qing-hao, ZENG Yu, YANG Dong-ri

1. School of Management, Capital Normal University, Beijing 100089, China; 2. Beijing Computing Center, Beijing 100094, China; 3. College of Computing and Communication Engineering, Graduate University of the Chinese Academy of Sciences, Beijing 100049, China

Online: 2012-09-15 Published: 2012-09-25

Abstract

To address massive data mining tasks in the cloud manufacturing environment, the implementation of the existing parallel frequent pattern growth algorithm and its shortcomings were analyzed. The counting and grouping steps of the algorithm were optimized by using a key-value store system. Taking advantage of the simple, auto-incrementing, and ordered storage model of a key-value store, the counting and grouping information was kept in a key-value database. By reducing reads and writes to the Distributed File System (DFS) and executing the counting and grouping processes in parallel, the optimized algorithm lowered the network and memory overhead on storage nodes. Experiments on real datasets compared the performance and file system I/O cost of the algorithm before and after optimization.

Key words: cloud manufacturing, parallel frequent pattern growth algorithm, key-value storage system, data mining, algorithm optimization
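The counting and grouping steps summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `InMemoryKVStore` class is a hypothetical in-process stand-in for a key-value database with an atomic increment, and the round-robin assignment of frequent items to groups is an assumption made only for illustration. The point is that per-item counts are accumulated directly in the store by parallel workers instead of being written to and re-read from intermediate files.

```python
import threading
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


class InMemoryKVStore:
    """Hypothetical stand-in for a key-value store with atomic increments."""

    def __init__(self):
        self._data = defaultdict(int)
        self._lock = threading.Lock()

    def incr(self, key, amount=1):
        # Atomically add `amount` to the counter stored under `key`.
        with self._lock:
            self._data[key] += amount
            return self._data[key]

    def items(self):
        # Return a snapshot of all (key, count) pairs.
        with self._lock:
            return list(self._data.items())


def count_shard(store, transactions):
    # Count each item once per transaction (its support).
    for transaction in transactions:
        for item in set(transaction):
            store.incr(item)


def parallel_count(store, shards):
    # One worker per shard; all workers update the same store.
    with ThreadPoolExecutor(max_workers=max(1, len(shards))) as pool:
        list(pool.map(lambda shard: count_shard(store, shard), shards))


def assign_groups(store, min_support, num_groups):
    # Keep frequent items, order them by descending count (ties by name),
    # then spread them over groups round-robin (an illustrative assumption).
    frequent = [(item, n) for item, n in store.items() if n >= min_support]
    frequent.sort(key=lambda pair: (-pair[1], pair[0]))
    return {item: rank % num_groups for rank, (item, _) in enumerate(frequent)}


# Toy data: two shards of transactions.
shards = [
    [["a", "b"], ["b", "c"]],
    [["a", "b", "c"], ["b"]],
]
store = InMemoryKVStore()
parallel_count(store, shards)
groups = assign_groups(store, min_support=2, num_groups=2)
print(groups)
```

In a real deployment the store would be a networked key-value database, and the lock would be replaced by that system's own atomic increment operation.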

CLC Number: TP312

WANG Jie, DAI Qing-hao, ZENG Yu, YANG Dong-ri. Parallel frequent pattern growth algorithm optimization in cloud manufacturing environment[J]. .
