面向海洋的多模态智能计算:挑战、进展和展望
聂婕1, 左子杰1, 黄磊1, 王志刚1, 孙正雅2, 仲国强1, 王鑫3, 王玉成4, 刘安安5, 张弘6, 董军宇1, 魏志强1,4
(1. 中国海洋大学, 青岛 266100; 2. 中国科学院自动化研究所, 北京 100190; 3. 清华大学计算机科学与技术系, 北京 100084; 4. 青岛海洋科学与技术试点国家实验室, 青岛 266061; 5. 天津大学电气自动化与信息工程学院, 天津 300072; 6. 北京航空航天大学宇航学院, 北京 100083)
Abstract
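The ocean is a key arena for high-quality development. The growth of marine science big data brings opportunities for understanding and managing the ocean, and it also introduces new challenges. Marine science big data is markedly super-multimodal, yet no multimodal intelligent computing theory or technical framework tailored to the marine domain has been established so far. This paper therefore presents, for the first time from the perspective of multimodal data technology, a systematic review of interdisciplinary research progress on intelligent perception, cognition, and prediction of marine phenomena/processes. First, by tracing the stages of the full life cycle of marine science big data, we clarify the research objects, scientific problems, and typical application scenarios of marine multimodal intelligent computing. Second, we systematically review existing work in three typical application scenarios: content analysis, reasoning and prediction, and high-performance computing for marine multimodal big data. Finally, considering the differences in marine data distribution and computing patterns, we identify the challenges within four key scientific problems (representation modeling, cross-modal correlation, reasoning and prediction, and high-performance computing for marine multimodal big data) and offer an outlook on future work.
Keywords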
Marine oriented multimodal intelligent computing: challenges, progress and prospects
Nie Jie1, Zuo Zijie1, Huang Lei1, Wang Zhigang1, Sun Zhengya2, Zhong Guoqiang1, Wang Xin3, Wang Yucheng4, Liu An'an5, Zhang Hong6, Dong Junyu1, Wei Zhiqiang1,4
(1. Ocean University of China, Qingdao 266100, China; 2. Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China; 3. Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China; 4. Pilot National Laboratory for Marine Science and Technology (Qingdao), Qingdao 266061, China; 5. School of Electrical and Information Engineering, Tianjin University, Tianjin 300072, China; 6. School of Astronautics, Beihang University, Beijing 100083, China)
Abstract
Marine research is essential to the high-quality development of human society, yet less than 5% of the ocean system is currently understood. To understand the ocean, big marine data is acquired through observation, monitoring, investigation, and statistics. Thanks to the development of multi-scale ocean observation systems, the volume of multimodal marine data has grown extensively, in forms such as remote sensing images, spatio-temporal analyses, simulation data, literature reviews, and video and audio monitoring. In support of the sustainable development of human society, deep analysis and mining of multimodal ocean data have advanced the understanding of ocean dynamic processes, energy and material cycles, the evolution of blue life, scientific discovery, healthy environments, and rapid response to extreme weather and climate change. Compared with traditional big data, multimodal big ocean data has unique features: a super-giant system (covering 71% of the Earth's surface), a large daily increment (10 TB), super multi-perspectives ("land-sea-air-ice-earth" coupling and "hydrometeorological-acoustical-optical-electromagnetic" polymorphism), a super spatial scale (from centimeters to hundreds of kilometers), and a super temporal scale (from microseconds to inter-decadal). These features challenge existing multimodal intelligent computing technology with problems such as cross-scale and multimodal fusion analysis, multi-disciplinary and multi-domain coordinated reasoning, and multi-architecture-compatible applications that demand large computing power. We systematically introduce the interdisciplinary research on intelligent perception, cognition, and prediction of marine phenomena/processes based on multimodal data technology.

First, we clarify the research objects, scientific problems, and typical application scenarios of marine multimodal intelligent computing by analyzing how the life cycle of marine science big data evolves. Next, targeting the differences in ocean data distribution and computing patterns, we illustrate the uniqueness and scientific challenges of multimodal big marine data in terms of modeling description, cross-modal correlation, inference prediction, and high-performance computing. 1) To bridge the "task gap" between big data and specific tasks, modeling description focuses on extracting task-relevant features that are causal, discriminative, salient, and robust. The ocean-specific differences and challenges are discussed from six aspects: dynamic changes of physical structure, complex environmental noise, large intra-class differences, lack of reliable labels, unbalanced samples, and few public datasets. 2) For heterogeneous data spanning multiple spheres, scales, and perspectives, cross-modal correlation modeling aims at reasonable multimodal integration and effective cross-modal reasoning, bridging the "heterogeneous gap" among modalities through task matching, semantic consistency, and spatio-temporal correlation. In the ocean domain, this problem is mainly affected by four aspects: uneven data, large scale span, strong spatio-temporal constraints, and highly correlated dimensions. 3) To fill the "unknown gap" left by missing spatio-temporal information in the evolution of the ocean, reasoning and prediction require modeling domain prior knowledge, experience, and reasoning ability. The main differences in the ocean domain are reflected in three issues: dynamic evolution, spatio-temporal heterogeneity, and non-independent samples. 4) To reduce the "computing gap" between the complex computation of marine super-giant systems and real-time online analysis, high-performance computing must handle the challenges of huge data volumes arising from problems such as increased resolution and refined online response analysis of ocean processes.

In addition, we review existing work in typical application scenarios: marine multimedia content analysis, visual analysis, big data prediction, and high-performance computing. 1) Multimedia content analysis: we compare the technical features of existing marine research methods in five aspects, namely target recognition, target re-identification, target retrieval, phenomenon/process recognition, and open datasets. 2) Visual analysis of marine big data: we summarize how existing work addresses dynamic changes of physical structure, highly correlated dimensions, and large scale spans from the perspectives of visualization, visual analysis, and visualization systems. 3) Reasoning and prediction for ocean multimodal big data: we review existing research from the perspectives of data-driven forecasting and prediction of marine environmental variables, construction of marine knowledge graphs, and knowledge reasoning. 4) High-performance computing for ocean multimodal big data: we introduce and compare relevant work from three perspectives, namely memory-computing collaboration, multi-model acceleration, and giant-system evaluation. Finally, we discuss the open problems of ocean multimodal intelligent computing and its future directions.
Keywords
marine big data; multimodal; marine multimedia content analysis; marine knowledge graph; marine big data prediction; marine oriented high performance computing; re-identification of marine objects