Computer Integrated Manufacturing System, 2024, Vol. 30, Issue (5): 1571-1586. DOI: 10.13196/j.cims.2024.0098


Neural-symbolic system for multimodal visual reasoning towards digital twin

ZHENG Hangbin¹, LIU Tianyuan¹, ZHENG Hanyao¹, ZUO Daiyue¹, BAO Jinsong¹⁺, WANG Sen²

  1. School of Mechanical Engineering, Donghua University
  2. Shanghai Baosight Software Limited Company
  • Online: 2024-05-31 Published: 2024-06-12

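  • About the authors:
    ZHENG Hangbin (郑杭彬, born 1998), male, from Shaoxing, Zhejiang, PhD candidate; research interests: multimodal intelligence, digital twin; E-mail: zhb@mail.dhu.edu.cn
    LIU Tianyuan (刘天元, born 1992), male, from Fuyang, Anhui, associate professor, PhD; research interests: industrial artificial intelligence, computer vision; E-mail: tianyuan.liu@dhu.edu.cn
    ZHENG Hanyao (郑汉垚, born 1999), male, from Shangrao, Jiangxi, master's student; research interests: multimodal intelligence, medical image analysis; E-mail: 854789617@qq.com
    ZUO Daiyue (左戴悦, born 2000), female, from Shanghai, master's student; research interest: multimodal intelligence; E-mail: zuodaiyueszx@163.com
    + BAO Jinsong (鲍劲松, born 1972), male, from Lujiang, Anhui, professor, PhD, corresponding author; research interests: industrial intelligence, intelligent manufacturing systems; E-mail: bao@dhu.edu.cn
    WANG Sen (王森, born 1966), male, from Shanghai, senior engineer, PhD; research interests: digital twin, information software engineering; E-mail: 734364221@qq.com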

Abstract: To address the heterogeneity and dynamics of multimodal visual data fusion in digital twins, a neural-symbolic approach is proposed that combines the analytical capabilities of deep learning with the structured reasoning of symbolic intelligence. Deep neural networks analyze the visual data in real time, while the knowledge and event-response rules stored in a symbolic system enable autonomous management of complex reasoning processes. To improve the system's adaptability to changes in the physical world, an augmented reasoning mechanism is proposed that integrates multimodal information with external knowledge. This mechanism consolidates real-time sensor data with information from historical knowledge bases to support more accurate and rational decision-making. A case study on the disassembly of retired lithium batteries shows that the method identifies and analyzes multimodal data with high accuracy and generates coherent, logically consistent operational recommendations based on its reasoning capabilities, effectively improving disassembly efficiency and safety.

Keywords: digital twin, multimodal, visual reasoning, neural-symbolic system, lithium battery disassembly
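
The division of labor described in the abstract (neural perception feeding a symbolic layer of knowledge and event-response rules) can be illustrated with a minimal sketch. This is not the authors' implementation: the Detection type, the KNOWLEDGE_BASE entries, the labels, the confidence threshold, and the two rules are all hypothetical and chosen only to show the general pattern.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    """One output of a hypothetical neural perception model."""
    label: str         # e.g. "hot_spot"
    confidence: float  # model score in [0, 1]
    modality: str      # e.g. "rgb" or "thermal"


# Hypothetical external knowledge; these entries are illustrative assumptions.
KNOWLEDGE_BASE = {
    "swollen_cell": {"hazard": "thermal_runaway"},
    "hot_spot": {"hazard": "thermal_runaway"},
    "screw": {"tool": "screwdriver"},
}


def to_facts(detections, threshold=0.5):
    """Turn neural detections into symbolic facts, dropping low-confidence ones."""
    return {d.label for d in detections if d.confidence >= threshold}


def recommend(facts, kb=KNOWLEDGE_BASE):
    """Apply simple event-response rules over the facts and the knowledge base."""
    actions = []
    hazards = {kb[f]["hazard"] for f in facts if "hazard" in kb.get(f, {})}
    # Rule 1: any thermal-runaway hazard overrides normal disassembly steps.
    if "thermal_runaway" in hazards:
        actions.append("pause disassembly and isolate the pack")
    # Rule 2: with no hazard present, suggest the known tool for each part.
    for fact in sorted(facts):
        tool = kb.get(fact, {}).get("tool")
        if tool and not hazards:
            actions.append(f"remove {fact} with {tool}")
    return actions


if __name__ == "__main__":
    observed = [
        Detection("screw", 0.9, "rgb"),
        Detection("hot_spot", 0.8, "thermal"),
        Detection("swollen_cell", 0.3, "rgb"),  # below threshold, ignored
    ]
    print(recommend(to_facts(observed)))
```

Keeping the rules outside the perception model means a new hazard or tool can be added to the knowledge base without retraining the network, which is one common motivation for this kind of hybrid design.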

