2026-06-26 04:00 UTC (updated 2026-06-26 08:08 UTC)

A multi-task spatiotemporal deep neural network for predicting penetration depth and morphology in laser welding

This paper introduces an innovative multi-task deep learning model that accurately predicts penetration state, depth, and weld seam morphology in laser penetration welding. The model uses weld pool images from a CMOS camera and welding parameters, integrating spatiotemporal features via CNNs and state space models. Test results show 99.35% accuracy for penetration state, 1.79 mm error for depth, and 95.65% accuracy for weld cross-section reconstruction.

Source: arXiv Computer Vision

Authors: Sen Li, Haichao Cui, Chendong Shao, Yaqi Wang, Xinhua Tang


[Submitted on 24 Jun 2026]


Abstract: In laser penetration welding, assessing the penetration state and weld seam morphology is crucial to determining weld quality. This paper introduces an innovative multi-task deep learning model that predicts penetration state, penetration depth, and weld seam morphology with high accuracy. The monitoring platform relies on weld pool images captured during the laser welding process with a complementary metal-oxide-semiconductor (CMOS) camera. The proposed model integrates spatiotemporal features extracted from top weld pool images with welding parameters, in a deep learning framework built on convolutional neural networks and state space models for more efficient extraction and processing of spatiotemporal information. A reliable method for constructing the dataset is also proposed to improve the robustness and generalization capability of the model. Results on the test set show that prediction accuracy for penetration state reaches 99.35%, prediction error for penetration depth is 1.79 mm, and accuracy of weld cross-section reconstruction is 95.65%. The study provides new insights and methods for in-situ quality control in laser penetration welding systems.
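The data flow described in the abstract (per-frame spatial features, a state space model over time, fusion with welding parameters, then one output per task) can be illustrated with a minimal pure-Python sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the CNN encoder is replaced by precomputed per-frame feature vectors, the state space model is a generic diagonal linear recurrence, and the names `ssm_scan`, `predict`, the head layout, and all weights are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple

Vector = List[float]
Head = Tuple[List[Vector], Vector]  # (weight rows, bias), hypothetical layout


def ssm_scan(frames: Sequence[Vector], a: Vector, b: Vector, c: Vector) -> List[Vector]:
    """Diagonal linear state space recurrence over per-frame features.

    h_t = a * h_{t-1} + b * x_t,   y_t = c * h_t   (all elementwise).
    A generic stand-in; the paper's actual state space block is not specified here.
    """
    h = [0.0] * len(a)
    outputs = []
    for x in frames:
        h = [ai * hi + bi * xi for ai, hi, bi, xi in zip(a, h, b, x)]
        outputs.append([ci * hi for ci, hi in zip(c, h)])
    return outputs


def linear(x: Vector, weights: List[Vector], bias: Vector) -> Vector:
    """Plain affine layer: one output per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) + b0 for row, b0 in zip(weights, bias)]


def softmax(logits: Vector) -> Vector:
    """Numerically stable softmax over class logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict(frames: Sequence[Vector], weld_params: Vector,
            ssm: Dict[str, Vector], heads: Dict[str, Head]) -> Dict[str, object]:
    """Fuse the last temporal output with welding parameters, then run three heads."""
    temporal = ssm_scan(frames, ssm["a"], ssm["b"], ssm["c"])[-1]
    fused = temporal + list(weld_params)
    return {
        "state_probs": softmax(linear(fused, *heads["state"])),  # penetration state
        "depth": linear(fused, *heads["depth"])[0],              # penetration depth
        "profile": linear(fused, *heads["profile"]),             # cross-section points
    }


# Toy inputs: three frames of 2-D features and two normalized welding parameters.
frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
ssm = {"a": [0.5, 0.5], "b": [1.0, 1.0], "c": [1.0, 2.0]}
heads = {
    "state": ([[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]], [0.0, 0.0]),
    "depth": ([[0.5, 0.25, 1.0, 0.0]], [0.0]),
    "profile": ([[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 1.0]],
                [0.0, 0.0, 0.0]),
}
result = predict(frames, [0.5, 0.25], ssm, heads)
```

Sharing one fused representation across the classification, regression, and reconstruction heads is what makes the sketch multi-task. In a trained model the three heads would typically be optimized jointly with a weighted sum of per-task losses, a detail the abstract does not specify.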

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)

Cite as: arXiv:2606.26260 [cs.CV]

(or arXiv:2606.26260v1 [cs.CV] for this version)

https://doi.org/10.48550/arXiv.2606.26260

arXiv-issued DOI via DataCite (pending registration)

Related DOI: https://doi.org/10.1016/j.engappai.2025.113641

Submission history

From: Sen Li

[v1] Wed, 24 Jun 2026 18:02:53 UTC (4,356 KB)
