Computer Integrated Manufacturing Systems ›› 2023, Vol. 29 ›› Issue (8): 2550-2562. DOI: 10.13196/j.cims.2023.08.004


Multi-start, multi-target AGV path planning based on a deep Q-network

黄岩松,姚锡凡+,景轩,胡晓阳   

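  1. School of Mechanical and Automobile Engineering, South China University of Technology
  • Publication date: 2023-08-31 Release date: 2023-09-11
  • Funding:
    Cooperation and exchange project between the National Natural Science Foundation of China and the Royal Society of Edinburgh (51911530245); Guangdong Basic and Applied Basic Research Foundation projects (2021A1515010506, 2022A1515010095).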

DQN-based AGV path planning for situations with multi-starts and multi-targets

HUANG Yansong, YAO Xifan+, JING Xuan, HU Xiaoyang

  1. School of Mechanical and Automobile Engineering, South China University of Technology
  • Online: 2023-08-31 Published: 2023-09-11
  • Supported by:
    Project supported by the Cooperation and Exchange Foundation between the National Natural Science Foundation of China and the Royal Society of Edinburgh, China (No. 51911530245), and the Guangdong Provincial Basic and Applied Basic Research Foundation, China (No. 2021A1515010506, 2022A1515010095).

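Abstract: Automatic guided vehicles (AGVs) carry materials between different nodes in a factory, and when the globally optimal path is considered, AGV path planning has to cover multiple starting points and multiple ending points. Existing deep reinforcement learning studies mostly address planning from a single starting point to a single ending point, and they generalize poorly once multiple starting points and multiple ending points are involved. To address this, a deep Q-network (DQN) based model for global AGV path planning is proposed. First, the AGV state used as the algorithm's input and the reward function settings are improved to make convergence more efficient. Next, the initial position used in training is varied to enrich the data and the model's perception of the environment, which improves how well the model generalizes to path planning from different starting points to a single ending point. Finally, AGV state data for different ending points are inserted during training so that the model acquires the ability to plan paths to multiple ending points. Simulations in environments of different scales, comparison experiments against the A* algorithm and the rapidly-exploring random tree algorithm, and model scalability experiments verify the method's path planning ability in the multiple-ending-point case.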

Keywords: deep reinforcement learning, deep Q-network, multiple ending points, automatic guided vehicle, path planning

Abstract: Automatic Guided Vehicles (AGVs) transport materials between different places in a factory. To obtain the globally optimal path, an AGV needs to perform path planning with multiple starting points and multiple ending points. Existing deep reinforcement learning algorithms mostly consider path planning from a single starting point to a single ending point, and they generalize poorly to situations with multiple starting points and multiple ending points. To address this problem, a Deep Q-Network (DQN) based solution model for global AGV path planning is proposed. First, convergence is accelerated by improving the input AGV state and the reward function setting. Then, the starting point position is varied during training to enhance the data richness and the model's perception of the environment, which improves the model's generalization in path planning with different starting points and a single ending point. Finally, states of AGVs at different targets are inserted into the training process to achieve multi-target path planning. The proposed method is compared with the A* and Rapidly-exploring Random Tree (RRT) algorithms in environments of different scales, and model expansion experiments are conducted; the results verify its path planning ability in the case of multiple targets.

Key words: deep reinforcement learning, deep Q-networks, multi-targets, automatic guided vehicle, path planning
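
The training scheme outlined in the abstract (varying the starting point and mixing in states for different targets) can be illustrated with a minimal sketch. This is not the authors' model: it swaps the deep Q-network for a plain tabular Q-learning table so that it runs without any deep learning library, and the 5x5 grid, the target set, the reward values, and all function names are assumptions made only for illustration. The state is the pair (AGV position, target), so one table serves every start and every target.

```python
import random

# Hypothetical 5x5 map: 0 = free cell, 1 = obstacle (not taken from the paper).
GRID = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 0, 0, 0, 1],
    [0, 1, 0, 0, 0],
    [0, 0, 0, 1, 0],
]
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right
FREE = [(r, c) for r in range(5) for c in range(5) if GRID[r][c] == 0]
GOALS = [(0, 4), (4, 0), (4, 4)]  # assumed set of target nodes


def step(pos, action, goal):
    """Apply one move. Assumed rewards: +10 at the target,
    -1 for a blocked move (AGV stays put), -0.1 for any other step."""
    r, c = pos[0] + action[0], pos[1] + action[1]
    if not (0 <= r < 5 and 0 <= c < 5) or GRID[r][c] == 1:
        return pos, -1.0, False
    if (r, c) == goal:
        return (r, c), 10.0, True
    return (r, c), -0.1, False


def train(episodes=20000, alpha=0.5, gamma=0.95, eps=0.2, seed=0):
    """Tabular Q-learning over (position, target) states."""
    rng = random.Random(seed)
    q = {}  # (pos, goal) -> list of 4 action values
    for _ in range(episodes):
        goal = rng.choice(GOALS)                          # vary the target
        pos = rng.choice([p for p in FREE if p != goal])  # vary the start
        for _ in range(50):
            qs = q.setdefault((pos, goal), [0.0] * 4)
            # Epsilon-greedy action selection.
            a = rng.randrange(4) if rng.random() < eps else qs.index(max(qs))
            nxt, reward, done = step(pos, ACTIONS[a], goal)
            future = 0.0 if done else max(q.setdefault((nxt, goal), [0.0] * 4))
            qs[a] += alpha * (reward + gamma * future - qs[a])
            pos = nxt
            if done:
                break
    return q


def greedy_path(q, start, goal, max_steps=50):
    """Follow the highest-valued action from start until the target or max_steps."""
    path, pos = [start], start
    for _ in range(max_steps):
        if pos == goal:
            break
        qs = q.get((pos, goal), [0.0] * 4)
        pos, _, _ = step(pos, ACTIONS[qs.index(max(qs))], goal)
        path.append(pos)
    return path


if __name__ == "__main__":
    table = train()
    print(greedy_path(table, (0, 0), (4, 4)))
```

Because every episode samples both a start and a target, the table is updated for all three targets from all free cells, which is the small-scale analogue of the data enrichment the abstract describes. A DQN would replace the dictionary with a network that takes the same (position, target) information as input.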

CLC number: