
Three-Dimensional Reconstruction of Welding Pool Surface by Binocular Vision


Current binocular vision systems mainly need to resolve the camera’s intrinsic parameters before three-dimensional (3D) objects can be reconstructed. The classical Zhang’s calibration can hardly account for all errors caused by perspective distortion and lens distortion. Also, the image-matching algorithm of the binocular vision system still needs to be improved to accelerate the reconstruction of welding pool surfaces. In this paper, a preset coordinate system was utilized for camera calibration instead of Zhang’s calibration. The binocular vision system was modified to capture images of welding pool surfaces by suppressing the strong arc interference during gas metal arc welding. By combining and improving the speeded up robust features, binary robust invariant scalable keypoints, and KAZE algorithms, the feature information of points (i.e., RGB values, pixel coordinates) was extracted as the feature vector of the welding pool surface. Based on the characteristics of the welding images, a mismatch-elimination algorithm was developed to increase the accuracy of image-matching algorithms. The world coordinates of matching feature points were calculated to reconstruct the 3D shape of the welding pool surface. The effectiveness and accuracy of the reconstruction of welding pool surfaces were verified by experimental results. This research develops binocular vision algorithms that can reconstruct the surface of welding pools accurately, with the aim of realizing intelligent welding control systems in the future.

1 Introduction

Gas metal arc welding (GMAW) is widely applied in modern manufacturing industries. To improve the weld quality of manual GMAW, welders correct either the welding parameters or the position of the welding gun based on their expertise and on information from the welding pool surface acquired by sight [1]. This means that the welding pool surface contains important visual information that allows skilled welders to control the GMAW process. To decrease the workload and occupational risks for welders, much research has been done in the last few decades to transfer this inspection work to intelligent control systems [2].

Caprio [3] utilized the oscillation of the welding pool to estimate its penetration during the Laser Powder Bed Fusion process. Thomas [4] reconstructed the thermal images of cold metal transfer process to monitor the porosity defects and improper weld beads. Vasilev [5] utilized an ultrasonic thickness measurement system to control the welding current and welding speed. These methods have mostly focused on improving the welding process with information from one-dimensional or two-dimensional welding pool surface data. However, it is clear that three-dimensional (3D) information of the welding pool surface can better reflect the weld quality.

The 3D reconstruction methods for the welding pool can be categorized into the 3D structured light method [6], shape from shading method [7], and binocular stereo vision method [8]. To avoid arc interference during measurement, Zhang et al. [9] proposed a novel structured laser vision system for reconstructing the 3D welding pool surface. In their experiments, a structured laser was projected onto the welding pool surface and was specularly reflected onto a preset imaging plane. The images of the laser points on the plate were captured by a camera. Finally, the geometry of the welding pool surface was reconstructed by the Edge-point algorithm and One-point algorithm [10]. However, the deviation of this method can reach a maximum of 1.22 mm. Chen [11] proposed a new image processing framework that could extract key characteristics of 3D welding pool geometry from its two-dimensional passive vision images. It was found that the support vector machine classification model was accurate enough to obtain the width and height of the welding pool. The camera’s exposure time had to be carefully adjusted for different welding currents, as it remarkably affected the image quality of the welding pool and the accuracy of the results. Zhong et al. [12] put forward measures to increase the effectiveness and accuracy of the shape from shading (SFS) method. An improved algorithm was used to reconstruct the weld surface of aluminum alloy for gas tungsten arc welding (GTAW). However, the SFS method is difficult to use in the GMAW process, as its spatter and arc interference are much more severe, and its welding stability much worse, than in GTAW. An accurate and anti-jamming 3D reconstruction method for GMAW that can cope with strong electromagnetic interference, high heat radiation interference, weld fumes, and other conditions is still needed.

The binocular stereo vision method is more accurate and efficient in acquiring reconstruction results. Mnich [13] attempted to use this method to reconstruct a GMAW pool, yet the results showed that the binocular vision system needed to be improved, as extra-bright areas on the pool surface could not be reconstructed. Richardson [14] used a particle image velocimeter (PIV) to track the movement of oxide particles to determine the velocity field on the welding pool and then transferred the above data into a 3D fluid velocity by the stereo vision method. The calibration process was the main factor that influenced measurement precision during 3D reconstruction; the average distance error between the tested corner points was 0.14 mm. Liang [15] established a biprism stereo vision system to characterize weld penetration based on the geometry of the welding pool surface under pulsed GMAW. A two-step stereo matching algorithm was proposed to reconstruct the 3D shape of the welding pool surface. However, some regions on the surface were discontinuous, and the accuracy of this algorithm was not discussed. Xiong [16] also developed a biprism stereo vision system to reconstruct welding pool geometry in GMAW. A global-based iterative matching algorithm and triangle measuring method were optimized. The results were validated by reconstructing a standard cylinder with a clear checkerboard, yet the maximum height error was 4.15%, showing that the accuracy and usability of a global-based iterative matching algorithm for the GMAW process without a checkerboard still need to be verified and improved.

In this paper, an improved binocular vision system was developed to observe the welding pool surface. An automatic 3D reconstruction method was proposed to suppress arc interference on welding pool information. Mathematical models for the detection, description, and matching of feature points were established to calculate the world coordinates of the welding pool surface effectively and robustly under different welding conditions. This study lays the foundation for on-line intelligent control of welding pool behavior in the GMAW process.

2 Experimental System

The binocular vision system is given in Figure 1. Two Basler acA2000-165uc cameras (color) were used, and their CMOS sizes were both 2/3". The focal length of the camera lens was 35 mm with a 650-nm cut-off filter and 7# welding goggle. The aperture was F/5.6. Two cameras were connected with a synchronizer trigger to generate two synchronous images. The frequency of sampling was 200 frames/s with a resolution of 600×500 pixels and exposure time of 40 μs.

Figure 1 Binocular vision system

The weld workpiece was fixed on a moveable workbench. Bead-on-plate welding tests were carried out on Q235 mild steel plates with dimensions of 250.0 mm × 70.0 mm × 5.0 mm. The welding wire material was H08Mn2Si with a diameter of 1.2 mm. The chemical compositions of the welding wire and base metal are presented in Table 1. The direct current electrode negative mode was used, and the distance from the wire tip to the workpiece surface was 18.0 mm. Other welding parameters are listed in Table 2. The torch and cameras were stationary during the welding process, while the workpiece and workbench were moved at a preset welding speed.

Table 1 Chemical compositions of welding wire and base metal (wt%)
Table 2 Welding parameters

3 Calibration System

To avoid errors caused by perspective distortion and lens distortion (including radial distortion, centrifugal distortion, and prism distortion), a preset coordinate system for camera calibration [17] was used to establish the correspondence between the world coordinates and the image coordinates via a calibration target with a point matrix placed at different positions. The two cameras were fixed on a tripod, as illustrated in Figure 1. Figure 2 shows how the target paper was moved along the y direction from y=0.0 mm to y=20.0 mm with a step length of 0.5 mm. The target paper was captured by the two cameras at each position. The red dot on the target paper represents the coordinate origin of the point matrix. Oxyz and Ox0y0z0 denote the world coordinate system and the pixel coordinate system, respectively.

Figure 2 The 3D point array

The world coordinates of the black dots in each target image were calculated from their distance to the red dot. That is, when the black dot Pb was located in the a-th column to the left of the red dot, its world coordinate was xd = a × 1 mm = a mm; when the black dot was located in the b-th row above the red dot, its world coordinate was zd = b × 1 mm = b mm, as shown in Figure 3. The world coordinates of the black dot are therefore [xd, yd, zd] = [a, ym, b], where ym is the y position of the target plane. When the world coordinates of all grid points on the target were assigned, the world coordinate system of the calibration system was successfully constructed.

Figure 3 Captured images of the calibration target by two cameras at ym

It is assumed that a point Pa was located in the area of calibration that was captured by two cameras, as shown in Figure 4.

Figure 4 A schematic of the mapping between the two image coordinates and their world coordinates

The world coordinates of Pa could be reconstructed from the following procedure:

(1) The image coordinates of Pa were P1(X1, Y1) in camera 1 and P2(X2, Y2) in camera 2. On the calibration plane y = yp, the four nearest neighbors of P1 in the captured image were A(XA, YA), B(XB, YB), C(XC, YC), and D(XD, YD). Their world coordinates, which had been calibrated beforehand, were A′(xA′, yp, zA′), B′(xB′, yp, zB′), C′(xC′, yp, zC′), and D′(xD′, yp, zD′). P′(x′yp, yp, z′yp) could be calculated from the four nearest neighbors of P1 as follows:

$$\begin{gathered} \left[ {\begin{array}{*{20}c} {x_{{A^{\prime}}} } & {x_{{B^{\prime}}} } & {x_{{C^{\prime}}} } & {x_{{D^{\prime}}} } \\ {z_{{A^{\prime}}} } & {z_{{B^{\prime}}} } & {z_{{C^{\prime}}} } & {z_{{D^{\prime}}} } \\ \end{array} } \right] \times \left[ {\begin{array}{*{20}c} {X_{A} } & {X_{B} } & {X_{C} } & {X_{D} } \\ {Y_{A} } & {Y_{B} } & {Y_{C} } & {Y_{D} } \\ \end{array} } \right]^{ - 1} \hfill \\ \times \left[ {\begin{array}{*{20}c} {X_{1} } \\ {Y_{1} } \\ \end{array} } \right] = \left[ {\begin{array}{*{20}c} {x^{\prime}_{yp} } \\ {z^{\prime}_{yp} } \\ \end{array} } \right]. \hfill \\ \end{gathered}$$

Similarly, all the corresponding world coordinates of P2 at 41 planes can be obtained as P′′(x′′yp, yp, z′′yp)(yp=0.5i, i = –20, –19, …, 19, 20).

(2) The distance Dyp between P′(x′yp, yp, z′yp) and P′′(x′′yp, yp, z′′yp) at all the y-planes was calculated as follows:

$$\begin{gathered} D_{yp} = \sqrt {(x^{\prime}_{yp} - x^{\prime\prime}_{yp} )^{2} + (z^{\prime}_{yp} - z^{\prime\prime}_{yp} )^{2} } , \hfill \\ (y_{p} = 0.5i,\,\;i = - 20,\; - 19, \cdots ,\;19,\;20). \hfill \\ \end{gathered}$$

When Pa was in a certain y-plane (y = 0.5i), P′ and P′′ overlapped, and Dyp was 0. Then x′yp = x′′yp and z′yp = z′′yp, and the world coordinates of Pa were (x′yp, yp, z′yp). Otherwise, Pa was located between the two adjacent calibration planes for which the sum of Dyp1 and Dyp2 was minimal. If yp1=0.5i and yp2=0.5(i+1), the world coordinates were as follows:

$$x = \frac{1}{2}(x^{\prime\prime}_{yp1} + x^{\prime}_{yp1} ),$$
$$y = \frac{{(z^{\prime}_{yp2} - z^{\prime\prime}_{yp2} ) \times y_{p1} + (z^{\prime\prime}_{yp1} - z^{\prime}_{yp2} ) \times y_{p2} }}{{(z^{\prime}_{yp2} - z^{\prime\prime}_{yp2} ) + (z^{\prime\prime}_{yp1} - z^{\prime}_{yp2} )}},$$
$$z = \frac{{(z^{\prime}_{yp2} - z^{\prime\prime}_{yp2} ) \times z^{\prime}_{yp1} + (z^{\prime\prime}_{yp1} - z^{\prime}_{yp2} ) \times z^{\prime\prime}_{yp2} }}{{(z^{\prime}_{yp2} - z^{\prime\prime}_{yp2} ) + (z^{\prime\prime}_{yp1} - z^{\prime}_{yp2} )}},$$

where P′yp1 and P′yp2 were points captured by camera 1, and their world coordinates were (x′yp1, yp1, z′yp1) and (x′yp2, yp2, z′yp2), respectively. Similarly, P′′yp1 and P′′yp2 were points captured by camera 2, and their world coordinates were (x′′yp1, yp1, z′′yp1) and (x′′yp2, yp2, z′′yp2), respectively. Thus, the world coordinates of point Pa were obtained. Verification experiments showed that the maximum relative error between the recovered and measured results was less than 0.6% [17].
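The first two steps of this procedure can be illustrated with a short Python sketch. It is not the authors' implementation: the function names are hypothetical, and because the 2×4 pixel matrix in the plane-mapping equation is not square, its inverse is interpreted here as the Moore-Penrose pseudo-inverse, which is an assumption. The second function only locates the pair of adjacent planes with the smallest summed distance; the final interpolation is left to the equations above.

```python
import numpy as np

def map_to_plane(neighbors_px, neighbors_world, point_px):
    """Map a pixel (X, Y) to world (x, z) on one calibration plane.

    neighbors_px:    2x4 array, pixel coordinates of A, B, C, D (rows: X, Y)
    neighbors_world: 2x4 array, world coordinates of A', B', C', D' (rows: x, z)
    Assumption: the "inverse" of the 2x4 matrix is the pseudo-inverse.
    """
    point = np.asarray(point_px, dtype=float)
    return np.asarray(neighbors_world) @ np.linalg.pinv(neighbors_px) @ point

def bracketing_planes(p1_world, p2_world):
    """Find adjacent planes i, i+1 that minimise D[i] + D[i+1].

    p1_world, p2_world: (n_planes, 2) arrays holding (x, z) of P' and P''
    on every calibration plane. Returns (i, D).
    """
    diff = np.asarray(p1_world, dtype=float) - np.asarray(p2_world, dtype=float)
    D = np.hypot(diff[:, 0], diff[:, 1])   # per-plane distance between P' and P''
    i = int(np.argmin(D[:-1] + D[1:]))     # lower plane of the best adjacent pair
    return i, D
```

If some D[i] is exactly zero, the point lies on that plane and no interpolation is needed.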

4 Pre-processing of Welding Pool Images

The size of the welding pool under the arc is small, and there are various noises in the process of image acquisition and quantification. In addition, the existence of a strong arc will also affect the extracting and matching processes of feature points, which decreases the accuracy of 3D reconstruction. Therefore, image processing algorithms, which mainly include gray-level transformations, image filtering, and edge detection, are needed to pre-process the welding pool images [18]. Figure 5 shows the original images of the welding pool under the arc. Figure 6 shows the processed images, which can be used for feature-point detection.

Figure 5 Captured images of the welding pool under the arc (240 A, 33 V, 0.6 m/min)

Figure 6 Pre-processed images of the welding pool under the arc (240 A, 33 V, 0.6 m/min)

5 SURF-BRISK-KAZE Feature Point Matching Algorithm

Finding the matching feature points accurately and efficiently is a high priority for welding pool surface reconstruction. The algorithms need to adapt to various requirements, such as the small scale of the welding pool, affine transformation of camera angles, noise interference of spatter and arc electromagnetism, severe illumination of arc, and nonlinear image distortion.

5.1 SURF Feature Point Detection

For the welding pool images captured by cameras 1 and 2, box filters [19] were used to approximate the Gaussian second-order derivatives. Box filters of different sizes were selected to establish the scale space for feature-point detection. The feature points were extracted by a Hessian matrix and a threshold on each scale-space layer. For an image I(x, y), a Hessian matrix with scale \(\lambda\) at point (x, y) was defined as follows:

$$H = \left[ {\begin{array}{*{20}c} {L_{xx} (x,y,\lambda )} & {L_{xy} (x,y,\lambda )} \\ {L_{yx} (x,y,\lambda )} & {L_{yy} (x,y,\lambda )} \\ \end{array} } \right],$$

where Lxx is the convolution of the Gaussian second-order partial derivative (Gaussian filter) with the input image I(x, y) at point (x, y); Lxy, Lyx, and Lyy are defined analogously. The Gaussian is defined as

$$g(\lambda ) = 0.5\exp ( - \frac{{x^{2} + y^{2} }}{{2\lambda^{2} }})/(2{\uppi }\lambda^{2} ).$$

To facilitate the calculation, the convolutions Dxx, Dxy, and Dyy of the box filters with the input image were used to replace Lxx, Lxy, and Lyy to construct the fast Hessian matrix. Box filters of size 9×9 were substituted for the Gaussian filter with λ = 1.2. The relative weight ω between the box-filter responses and the Gaussian responses is as follows:

$$\omega = \frac{{\left\| {L_{xy} (1.2)} \right\|_{F} \left\| {D_{xx} (9)} \right\|_{F} }}{{\left\| {L_{xx} (1.2)} \right\|_{F} \left\| {D_{xy} (9)} \right\|_{F} }} \approx 0.9,$$

where ||∙||F is the Frobenius norm, and ω is used to balance the relative weights in the expression for the Hessian’s determinant. In practical application, a value of 0.9 was adopted to obtain an approximate Hessian matrix determinant:

$$\det (H_{essian} ) = D_{xx} D_{yy} - (0.9D_{xy} )^{2} .$$

A threshold was needed to screen the feature points detected by the Hessian matrix. When the value of the Hessian determinant at a test point was greater than this threshold, non-maximum suppression in a 3×3×3 neighborhood was applied. When the value at the test point was greater than that of its 26 neighboring points, it was selected as a feature point. Maxima whose Hessian determinant was less than the threshold were excluded, which accelerated maxima detection. Figure 7 shows the feature points extracted by the speeded up robust features (SURF) algorithm. It can be seen that the SURF feature points (red dots), which have the advantage of affine invariance, were evenly distributed in the welding pool; however, they cannot fully reflect the edge profile.
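As a rough illustration of the thresholding and 3×3×3 non-maximum suppression described above, the following Python sketch assumes that the box-filter responses have already been computed and stacked into an array indexed by (scale, row, column). The function names and the array layout are hypothetical and are not taken from the original implementation.

```python
import numpy as np

def approx_hessian_det(dxx, dyy, dxy, weight=0.9):
    """Approximate Hessian determinant from box-filter responses."""
    return dxx * dyy - (weight * dxy) ** 2

def is_feature_point(det_stack, s, r, c, threshold):
    """Check one interior candidate of a (scales, rows, cols) determinant stack.

    The candidate is kept only if it exceeds the threshold and is strictly
    greater than its 26 neighbours in the 3x3x3 cube around it.
    """
    value = det_stack[s, r, c]
    if value <= threshold:
        return False                      # rejected before any neighbour comparison
    cube = det_stack[s - 1:s + 2, r - 1:r + 2, c - 1:c + 2]
    # strict maximum: the centre value is the largest and occurs only once
    return bool(value >= cube.max() and np.count_nonzero(cube == value) == 1)
```

Rejecting candidates below the threshold first means the 26 comparisons are only paid for the few strong responses.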

Figure 7 SURF feature points of the welding pool image under the arc (240 A, 33 V, 0.6 m/min)

5.2 BRISK Feature Point Detection

In order to obtain the characteristic of scale invariance on the edges of the welding pool, the scale space was composed of four inner layers ci and four middle layers di (i = 0, 1, 2, 3) in the frame structure of binary robust invariant scalable keypoint (BRISK) feature detection [20]. Each inner layer image was obtained by 0.5-times down-sampling of the previous inner layer image, where the original image corresponded to the c0 layer. Each middle layer di was located between two adjacent inner layers ci and ci+1. The first middle layer d0 was obtained by 1.5-times down-sampling of the original image c0, and the remaining middle layers were obtained by 0.5-times down-sampling of the previous middle layer. With δ representing the scale, \(\delta(c_{i}) = 2^{i}\) and \(\delta(d_{i}) = 1.5 \times 2^{i}\).
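The layer scales can be tabulated with a few lines of Python. This is only a sketch of the scale relations stated above; the helper names are hypothetical, and the layer sizes are obtained by simple truncation, which may differ by a pixel from what an actual BRISK implementation produces.

```python
def brisk_scales(n_layers=4):
    """Scales of the inner layers c_i and middle layers d_i."""
    inner = [2.0 ** i for i in range(n_layers)]          # 1, 2, 4, 8
    middle = [1.5 * 2.0 ** i for i in range(n_layers)]   # 1.5, 3, 6, 12
    return inner, middle

def layer_size(width, height, scale):
    """Approximate layer size after down-sampling by `scale` (truncated)."""
    return int(width / scale), int(height / scale)
```

For the 600×500 pixel images used in this study, the coarsest inner layer c3 would then be about 75×62 pixels.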

After the scale space of BRISK was constructed, the feature points were extracted on each scale. The extreme points detected in the spatial domain and the scale domain were regarded as feature points, so the BRISK corner points had scale invariance for the welding pool edges. BRISK feature point detection was determined by the following equation:

$$N = \sum {\left| {I(x) - I(R)} \right|} > \varepsilon ,$$

where I(R) was the gray value of the central pixel R, I(x) was the gray value of the pixels surrounding R, and ε was the threshold, which was 0.00001 in this study. If N was greater than ε, the candidate point was taken as a feature point. BRISK feature points on the welding pool images are shown in Figure 8. It can be seen that the BRISK feature points reflected the edge profile well. However, the image on each layer of the scale space constructed by the SURF and BRISK algorithms was blurred, which affected the accuracy of the feature points. It also affected the number of feature points detected in the blurry area created by welding pool oscillation.

Figure 8 BRISK feature points of the welding pool image under the arc (240 A, 33 V, 0.6 m/min)

5.3 KAZE Feature Point Detection

The additive operator splitting algorithm and the variable conductance diffusion method were used to construct the nonlinear scale space in this study. Logarithmic steps were used in the nonlinear scale space to generate O groups and S layers [21]. Different groups and layers were identified by group index o and layer index s, respectively. The relationship between group, layer, and scale parameter σ was expressed as follows:

$$\begin{gathered} \sigma_{i} (o,s) = \sigma_{0} 2^{o + s/S} ,o \in [0, \cdots ,O - 1], \hfill \\ s \in [0, \cdots ,S - 1],i \in [0, \cdots ,O \times S - 1], \hfill \\ \end{gathered}$$

where σ0 was the initial scale parameter, which flattens the original image to reduce the noise induced by the arc’s magnetic field. The relationship between time parameter t and scale parameter σi is:

$$t_{i} = \frac{1}{2}\sigma_{i}^{2} ,i \in [0, \cdots ,O \times S - 1].$$
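The two relations above can be evaluated directly. The following Python sketch lists the scale parameters and the corresponding evolution times in order of increasing scale; the function name is hypothetical, and the value of σ0 used in the usage note is only illustrative, since the value adopted in this study is not stated.

```python
def kaze_scales(sigma0, n_groups, n_layers):
    """Return (sigmas, times) for O = n_groups groups and S = n_layers layers."""
    sigmas = [sigma0 * 2.0 ** (o + s / n_layers)
              for o in range(n_groups)      # group index o
              for s in range(n_layers)]     # layer index s within the group
    times = [0.5 * sigma ** 2 for sigma in sigmas]   # t_i = sigma_i^2 / 2
    return sigmas, times
```

For example, `kaze_scales(1.6, 2, 2)` yields four levels whose scale doubles from one group to the next.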

The nonlinear scale space is expressed as:

$$L^{i + 1} = \left( {I - (t_{i + 1} - t_{i} ) \cdot \sum\limits_{l = 1}^{m} {A_{l} (L^{i} )} } \right)^{ - 1} L^{i} ,$$

where L is the luminance of the image, I is the identity matrix, and Al is a matrix that encodes the image conductivities for each dimension l.

Combining Eqs. (11) and (13), the KAZE feature points on two images captured by cameras 1 and 2 are shown in Figure 9. The results show that this method was effective in extracting feature points from smooth surfaces with small brightness differences, especially in the white-dotted-line area.

Figure 9 KAZE feature points of the welding pool image under the arc (240 A, 33 V, 0.6 m/min)

5.4 SURF-BRISK-KAZE Feature Description and Matching

In order to match the feature points in the SURF, BRISK, and KAZE methods, a descriptor was determined for the extracted points. The details of the descriptor, which is a 64-dimensional vector, have been discussed in reference [19].

The 64-dimensional vector obtained by the above method only included grayscale information of the welding pool images; the color information of the images was not considered, which might decrease the matching accuracy of welding pool images.

The information of the original welding pool images (color) was much richer than that of the pre-processed welding pool images (gray) in Section 4. However, the signal interference was also included in the original images. To utilize the color information of the original images effectively, only the information of the feature points in Sections 5.1‒5.3 and their eight neighboring pixels was used to improve the above 64-dimensional vector. The steps were as follows:

(1) It was assumed that r, g, b were the values of color information for the feature point (x, y). The value n was the total number of pixels considered, i.e., the feature point (x, y) and its neighboring pixels. Here, r, g, b ∈ [0, 255], and n was set at 9 in this study.

(2) The mean values μr, μg, and μb of the feature point (x, y) and its eight neighboring pixels were calculated. The value E was the 3D vector composed of μr, μg, and μb. The equations were as follows:

$$\left\{ {\begin{array}{*{20}l} {\mu_{r} = \frac{1}{n}\sum\limits_{i = 1}^{n} {r_{i} } ,} \hfill \\ {\mu_{g} = \frac{1}{n}\sum\limits_{i = 1}^{n} {g_{i} } ,} \hfill \\ {\mu_{b} = \frac{1}{n}\sum\limits_{i = 1}^{n} {b_{i} } ,} \hfill \\ {E = (\mu_{r} ,\mu_{g} ,\mu_{b} ).} \hfill \\ \end{array} } \right.$$

(3) The variances δr, δg, and δb of the feature point (x, y) and its eight neighboring pixels were calculated as follows. F was the 3D vector composed of δr, δg, and δb.

$$\left\{ {\begin{array}{*{20}l} {\delta_{r} = \frac{1}{n}\sum\limits_{i = 1}^{n} {(r_{i} - \mu_{r} )^{2} } ,} \hfill \\ {\delta_{g} = \frac{1}{n}\sum\limits_{i = 1}^{n} {(g_{i} - \mu_{g} )^{2} } ,} \hfill \\ {\delta_{b} = \frac{1}{n}\sum\limits_{i = 1}^{n} {(b_{i} - \mu_{b} )^{2} } ,} \hfill \\ {F = (\delta_{r} ,\delta_{g} ,\delta_{b} ).} \hfill \\ \end{array} } \right.$$

(4) The vectors E and F were normalized as follows:

$$\left\{ {\begin{array}{*{20}c} {E_{n} = \left( {\frac{{\mu_{r} }}{255},\frac{{\mu_{g} }}{255},\frac{{\mu_{{_{b} }} }}{255}} \right),} \\ {F_{n} = \left( {\frac{{\delta_{r} }}{255},\frac{{\delta_{g} }}{255},\frac{{\delta_{{_{b} }} }}{255}} \right).} \\ \end{array} } \right.$$

(5) The normalized mean value vector En and variance vector Fn were combined to form a six-dimensional RGB color classification descriptor vector Vc.

$$V_{c} = \left( {\frac{{\mu_{r} }}{255},\frac{{\mu_{g} }}{255},\frac{{\mu_{{_{b} }} }}{255},\frac{{\delta_{r} }}{255},\frac{{\delta_{g} }}{255},\frac{{\delta_{{_{b} }} }}{255}} \right).$$

(6) By appending Vc in Eq. (17) to the 64-dimensional vector Vs=(i1, i2,…, i64), the descriptor V was expressed as follows:

$$V = \left( {i_{1} ,i_{2} , \cdots ,i_{64} ,\frac{{\mu_{r} }}{255},\frac{{\mu_{g} }}{255},\frac{{\mu_{{_{b} }} }}{255},\frac{{\delta_{r} }}{255},\frac{{\delta_{g} }}{255},\frac{{\delta_{{_{b} }} }}{255}} \right).$$
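Steps (1) to (6) can be sketched in Python with NumPy as follows. The function names are hypothetical; the sketch assumes an RGB image stored as an (H, W, 3) array with values in [0, 255], a feature point given as (column x, row y) that does not lie on the image border, and an already computed 64-dimensional descriptor.

```python
import numpy as np

def color_descriptor(image_rgb, x, y):
    """Six-dimensional colour vector Vc from the 3x3 patch around (x, y)."""
    patch = image_rgb[y - 1:y + 2, x - 1:x + 2, :].reshape(-1, 3).astype(float)
    mean = patch.mean(axis=0)          # mu_r, mu_g, mu_b over n = 9 pixels
    var = patch.var(axis=0)            # delta_r, delta_g, delta_b (1/n form)
    return np.concatenate([mean / 255.0, var / 255.0])

def extend_descriptor(desc64, image_rgb, x, y):
    """Append Vc to the 64-dimensional vector to obtain the 70-dimensional V."""
    return np.concatenate([np.asarray(desc64, dtype=float),
                           color_descriptor(image_rgb, x, y)])
```

Because only nine pixels per feature point are read from the color image, the extra cost per descriptor is small compared with computing the 64-dimensional part.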

The matching process was carried out as follows. A feature point P1 (image of camera 1) was compared with a feature point P2 (image of camera 2) by calculating the Euclidean distance between their descriptors V. The Euclidean distance is:

$$d_{E} = \sqrt {\left( {i^{\prime}_{1} - i^{\prime\prime}_{1} } \right)^{2} + \left( {i^{\prime}_{2} - i^{\prime\prime}_{2} } \right)^{2} + \cdot \cdot \cdot + \left( {\frac{{\delta^{\prime}_{g} }}{255} - \frac{{\delta^{\prime\prime}_{g} }}{255}} \right)^{2} + \left( {\frac{{\delta^{\prime}_{b} }}{255} - \frac{{\delta^{\prime\prime}_{b} }}{255}} \right)^{2} } ,$$

where V′=(i′1, i′2, …, δ′g/255, δ′b/255) is the descriptor of P1 and V″=(i″1, i″2, …, δ″g/255, δ″b/255) is the descriptor of P2. A matching pair was detected if the ratio of their distance to the distance of the second nearest neighbor was less than 0.6.
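The nearest-neighbor ratio criterion can be sketched as a brute-force search in Python. The function name is hypothetical, the descriptors are assumed to be stored row-wise in NumPy arrays, and the second image is assumed to contain at least two descriptors.

```python
import numpy as np

def ratio_test_matches(desc1, desc2, ratio=0.6):
    """Return index pairs (i, j) that pass the nearest/second-nearest ratio test.

    desc1: (n1, d) descriptors from camera 1; desc2: (n2, d) from camera 2, n2 >= 2.
    """
    desc1 = np.asarray(desc1, dtype=float)
    desc2 = np.asarray(desc2, dtype=float)
    matches = []
    for i, v in enumerate(desc1):
        dist = np.linalg.norm(desc2 - v, axis=1)   # Euclidean distance to every candidate
        order = np.argsort(dist)
        best, second = order[0], order[1]
        if dist[best] < ratio * dist[second]:      # keep only unambiguous matches
            matches.append((i, int(best)))
    return matches
```

A lower ratio rejects more ambiguous candidates at the price of fewer matching pairs.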

Figure 10 shows the matching pairs obtained by the SURF-BRISK-KAZE algorithms whose descriptor was 64-dimensional. Figure 11 shows the matching pairs obtained by the improved 70-dimensional descriptor V. Though the number of mismatching pairs was decreased because of the use of V, some mismatching pairs still remained, and mismatch elimination was needed.

Figure 10 Traditional SURF-BRISK-KAZE matching pairs and mismatching area (240 A, 33 V, 0.6 m/min)

Figure 11 Improved SURF-BRISK-KAZE matching pairs and mismatching area (240 A, 33 V, 0.6 m/min)

6 Improved RANSAC Algorithm

There were some mismatching pairs (gross errors) in the above matching pairs. A random sample consensus (RANSAC) algorithm was used to eliminate the mismatches in this work. The traditional RANSAC algorithm estimates the model parameters through data iteration and result validation. It can also reduce the number of gross errors to increase the matching accuracy [22]. However, the traditional RANSAC algorithm also has shortcomings. For example, its iteration time depends on the experimental data and the valid data rate. When the amount of valid data is small, the number of iterations will increase exponentially because of the large number of mismatching pairs, which greatly increases the running time of the algorithm. Meanwhile, the initial model parameters are calculated from a subset of the experimental data. When the valid data rate of the subset is not high, the initial model will be extremely unreasonable, and the verification of this unreasonable model will consume a lot of time, which seriously affects the overall efficiency of the algorithm. The traditional RANSAC algorithm was therefore improved as follows.

6.1 Data Preprocessing

The data pre-processing model was defined as M(P, C). P was the set of matching pairs (P1, P2) obtained in Section 5.4, which was written as:

$$P = \left\{ {P_{1} [i],P_{2} [j]|i = j = 0,\;1, \cdots ,\;n - 1} \right\}.$$

C was the criterion for eliminating mismatching point pairs of P, and it was expressed as:

$$C = (\left| {k_{i} - k} \right| \le \Delta k)\& \& (\left| {d_{i} - d} \right| \le \Delta d), \, i = 0,\;1, \cdots ,\;n - 1,$$

where ki is the slope of the line connecting the two matched points, calculated from their pixel coordinates, and k is the median of ki. Δk is the threshold value, which was 0.9 in this study. With di* denoting the Euclidean distance between the two matched points in pixel coordinates, di comprises the smallest 80% of the di* values. d is the median of di, and Δd is the threshold value, which was 13 in this study. The ki and di were expressed as:

$$k_{i} = \frac{{y_{j} - y_{i} }}{{x_{j} - x_{i} }},$$
$$d_{i} = \sqrt {(x_{j} - x_{i} )^{2} + (y_{j} - y_{i} )^{2} } ,$$

where xi and yi are the pixel coordinates of P1[i] captured by camera 1, and xj and yj are the pixel coordinates of P2[j] captured by camera 2.

To increase the proportion of valid data in the data set P, ki and di were first calculated for all the matching pairs with Eqs. (22) and (23). Any matching pair that did not satisfy Eq. (21) was deleted from P, and the others formed a new data set Q. Q contained enough matching pairs to estimate the homography matrix model as accurately as possible. The proportion of valid data in Q was larger, which greatly reduced the number of iterations needed to find the maximum set of valid data. As a result, the efficiency of the algorithm was improved. The above work completed the pre-purification of the raw data for the RANSAC algorithm. Figure 12 shows the matching pairs pre-processed by the data pre-processing model. Compared to Figure 11, the number of mismatching pairs was decreased.
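A minimal Python sketch of this pre-filter is given below. It rests on two assumptions that the text leaves open: the reference distance d is taken as the median of the smallest 80% of the pair distances, and the criterion C is then applied to every pair. The function name is hypothetical, and the horizontal offset between matched points is assumed to be non-zero so that the slope is defined.

```python
import numpy as np

def prefilter_matches(pts1, pts2, delta_k=0.9, delta_d=13.0, keep_fraction=0.8):
    """Boolean mask of the matching pairs that satisfy criterion C.

    pts1, pts2: (n, 2) pixel coordinates of P1[i] and P2[i] (same index = one pair).
    """
    pts1 = np.asarray(pts1, dtype=float)
    pts2 = np.asarray(pts2, dtype=float)
    dx = pts2[:, 0] - pts1[:, 0]
    dy = pts2[:, 1] - pts1[:, 1]
    k_i = dy / dx                              # slope of each matching line (dx != 0 assumed)
    d_all = np.hypot(dx, dy)                   # length of each matching line
    n_keep = max(1, int(np.ceil(keep_fraction * len(d_all))))
    d_ref = np.median(np.sort(d_all)[:n_keep])  # median over the smallest 80%
    k_ref = np.median(k_i)
    return (np.abs(k_i - k_ref) <= delta_k) & (np.abs(d_all - d_ref) <= delta_d)
```

The pairs where the mask is true would form the data set Q that is passed on to RANSAC.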

Figure 12 Matching pairs preprocessed by the data preprocessing model (240 A, 33 V, 0.6 m/min)

6.2 Pre-test Model

In the traditional RANSAC algorithm, the subset (S) is randomly extracted from the experimental data (P) of Section 5.4, and the corresponding initial model is estimated from the subset. The initial model is then tested against all the remaining matching points (CPS), which do not belong to the above subset. This iterative process needs much verification time because many initial models are not reasonable, especially when the proportion of mismatching pairs is high.

To increase the efficiency of the RANSAC algorithm, a pre-testing model was proposed to screen the initial models before the traditional testing of the RANSAC algorithm. The steps of the pre-testing model were as follows.

(1) n pairs of matching points were randomly selected from the new data set (Q) of Section 6.1.

(2) The improved initial model T was estimated from m of these pairs of matching points (m < n) and was written as follows:

$$T = \left[ {\begin{array}{*{20}c} {t_{11} } & {t_{12} } & {t_{13} } \\ {t_{21} } & {t_{22} } & {t_{23} } \\ {t_{31} } & {t_{32} } & {t_{33} } \\ \end{array} } \right],$$
$$\beta \left[ {\begin{array}{*{20}c} {x_{1} } \\ {y_{1} } \\ 1 \\ \end{array} } \right] = \left[ {\begin{array}{*{20}c} {t_{11} } & {t_{12} } & {t_{13} } \\ {t_{21} } & {t_{22} } & {t_{23} } \\ {t_{31} } & {t_{32} } & {t_{33} } \\ \end{array} } \right]\left[ {\begin{array}{*{20}c} {x_{2} } \\ {y_{2} } \\ 1 \\ \end{array} } \right],$$

where t11, t12, …, t33 are the parameters of T, (x1, y1) are the pixel coordinates of P1[i], and (x2, y2) are the pixel coordinates of P2[j]. m is the number of matching pairs needed to solve Eq. (25), which was 4 in this study.

(3) The remaining n-m pairs were used to verify the reasonability of T by the following equation:

$$E_{s} = \sum\limits_{i = 1}^{n - m} {\left[ {\left( {x_{1i} - \frac{{t_{11} x_{2i} + t_{12} y_{2i} + t_{13} }}{{t_{31} x_{2i} + t_{32} y_{2i} + t_{33} }}} \right)^{2} + \left( {y_{1i} - \frac{{t_{21} x_{2i} + t_{22} y_{2i} + t_{23} }}{{t_{31} x_{2i} + t_{32} y_{2i} + t_{33} }}} \right)^{2} } \right]} ,$$

where Es is the test error calculated from the remaining n−m matching pairs. The threshold tp for Es was 0.6 in this study.

(4) If the Es values of all n−m pairs were less than tp, the initial model T was then tested against the remaining matching points (CQn) of Section 6.1, as in the traditional RANSAC algorithm; otherwise, the initial model T was discarded directly, and n pairs of matching points were re-selected for the next iteration.
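One iteration of the pre-test can be sketched in Python as follows. This is an illustration under stated assumptions rather than the authors' code: T is estimated with a plain direct linear transform (DLT) solved by singular value decomposition, which the paper does not specify, and the threshold tp is applied to the squared reprojection error of each of the n−m check pairs. All function names are hypothetical.

```python
import numpy as np

def estimate_homography(src, dst):
    """DLT estimate of T such that dst ~ T @ src, from at least 4 point pairs."""
    rows = []
    for (x2, y2), (x1, y1) in zip(src, dst):
        rows.append([x2, y2, 1, 0, 0, 0, -x1 * x2, -x1 * y2, -x1])
        rows.append([0, 0, 0, x2, y2, 1, -y1 * x2, -y1 * y2, -y1])
    _, _, vt = np.linalg.svd(np.asarray(rows, dtype=float))
    return vt[-1].reshape(3, 3)            # null-space vector reshaped to 3x3

def squared_errors(T, src, dst):
    """Squared reprojection error of every pair under T."""
    src_h = np.column_stack([src, np.ones(len(src))])
    proj = src_h @ T.T
    proj = proj[:, :2] / proj[:, 2:3]      # divide by the homogeneous scale
    return np.sum((np.asarray(dst, dtype=float) - proj) ** 2, axis=1)

def pretest(pts1, pts2, m=4, t_p=0.6):
    """Estimate T from the first m pairs and check it on the other n - m.

    pts1, pts2: (n, 2) pixel coordinates of the n sampled pairs
    (camera 1 and camera 2). Returns (passed, T).
    """
    pts1 = np.asarray(pts1, dtype=float)
    pts2 = np.asarray(pts2, dtype=float)
    T = estimate_homography(pts2[:m], pts1[:m])
    errors = squared_errors(T, pts2[m:], pts1[m:])
    return bool(np.all(errors < t_p)), T
```

Only models that pass this inexpensive check would go on to the full verification against the rest of Q.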

The unreasonable initial models were discarded quickly by the above calculation, which reduced the time spent verifying initial models. A flow chart of the improved RANSAC algorithm is shown in Figure 13. The real detection time of the traditional and improved RANSAC algorithms is discussed in Section 6.3. Figure 14 shows the matching point pairs of the welding pool under the arc processed by the improved RANSAC algorithm.

Figure 13 Flow chart of the improved RANSAC algorithm

Figure 14 Matching pairs of the improved RANSAC algorithm (240 A, 33 V, 0.6 m/min)

6.3 Estimation of Computing Time

To obtain at least one subset that could pass through the verification of the traditional RANSAC algorithm under a certain confidence p (0.95‒0.99), the minimum sampling times k must meet the following requirement:

$$1 - (1 - (1 - \omega )^{{m_{t} }} )^{k} = p,$$

where ω is the proportion of mismatching pairs in P, and mt is the minimum number of samples that each subset needs to calculate the parameters of the initial model, which was 4 in this study.
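Solving the confidence equation above for k gives k = ln(1 − p) / ln(1 − (1 − ω)^mt), rounded up to the next integer. The short sketch below evaluates this expression; the function name `min_samples` and the example mismatch ratio of 0.5 are hypothetical and are not values reported in this study.

```python
import math

def min_samples(p, omega, m_t=4):
    """Smallest k with 1 - (1 - (1 - omega)**m_t)**k >= p."""
    if not 0.0 < p < 1.0:
        raise ValueError("confidence p must lie strictly between 0 and 1")
    all_correct = (1.0 - omega) ** m_t  # chance that one subset has no mismatch
    if all_correct <= 0.0:
        raise ValueError("no subset can be free of mismatches")
    if all_correct >= 1.0:
        return 1                        # every subset is free of mismatches
    return math.ceil(math.log(1.0 - p) / math.log(1.0 - all_correct))

# Hypothetical example: half of the matching pairs are mismatches.
for p in (0.95, 0.99):
    print(p, min_samples(p, omega=0.5))
```

The required k grows quickly with both ω and mt, which is why the number of samplings dominates the overall cost.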

Suppose that P contains nP matching pairs, that one random sampling from the data set takes ts seconds, and that calculating the initial model takes tc seconds. tj is the time to test the initial model against one of the remaining matching pairs, so the total time to test one initial model is (nP − mt)tj, that is, (nP − 4)tj in this study. Therefore, the calculation time Tt of the traditional RANSAC algorithm with k random samplings is:

$$T_{t} = k(t_{s} + t_{c} ) + k(n_{P} - m_{t} )t_{j} .$$

In this study, the average k was 500, ts was 2.21 × 10⁻⁴ s, tc was 5.45 × 10⁻⁵ s, and tj was 3.67 × 10⁻⁵ s. The computer used an Intel(R) Core(TM) i7-9750H CPU @ 2.60 GHz with 8 GB of memory.

For the improved RANSAC algorithm, the minimum number of samplings k′ must also satisfy Eq. (27), with mt replaced by n from Section 6.2. Eq. (28) then becomes:

$$T^{\prime} = k^{\prime}(t_{s} + t_{c} ) + k^{\prime}(n - m)t_{j} + k^{\prime}(n_{Q} - n)(1 - \omega_{Q} )^{n} t_{j} ,$$

where T′ is the calculation time of the improved RANSAC algorithm, nQ is the number of matching pairs in Q, and ωQ is the proportion of mismatching pairs in Q.

The time reduced by the improved RANSAC algorithm is as follows:

$$\begin{gathered} \Delta T = T_{t} - T^{\prime} = (k - k^{\prime})(t_{s} + t_{c} ) + \hfill \\ \left[ {k(n_{P} - m_{t} ) - k^{\prime}(n - m) - k^{\prime}(n_{Q} - n)(1 - \omega_{Q} )^{n} } \right]t_{j} . \hfill \\ \end{gathered}$$

In this study, k′ was 1500. Although the total sampling and model-calculation time k′(ts + tc) increased because k′ was larger than k, it was an order of magnitude smaller than the time for testing the initial model. Therefore, the improved RANSAC algorithm avoided a series of unnecessary operations and improved the efficiency of the algorithm. Its calculation accuracy was not lower than that of the traditional RANSAC algorithm. The values of k′, n, and m were not yet optimal and can be optimized in the future.
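The two time models can be compared numerically with a short sketch. The timings ts, tc, tj and the sampling counts k and k′ are the values reported above. The set sizes nP and nQ, the subset size n, and the mismatch ratio ωQ are hypothetical placeholders chosen only to make the example run, so the printed times are illustrative rather than measured.

```python
def time_traditional(k, n_P, t_s, t_c, t_j, m_t=4):
    """T_t = k (t_s + t_c) + k (n_P - m_t) t_j."""
    return k * (t_s + t_c) + k * (n_P - m_t) * t_j

def time_improved(k2, n, m, n_Q, omega_Q, t_s, t_c, t_j):
    """T' = k' (t_s + t_c) + k' (n - m) t_j + k' (n_Q - n) (1 - omega_Q)^n t_j."""
    pass_rate = (1.0 - omega_Q) ** n  # share of samples that survive the pre-test
    return (k2 * (t_s + t_c)
            + k2 * (n - m) * t_j
            + k2 * (n_Q - n) * pass_rate * t_j)

# Timings and sampling counts reported in this study.
t_s, t_c, t_j = 2.21e-4, 5.45e-5, 3.67e-5
k, k2 = 500, 1500
# Hypothetical set sizes and mismatch ratio, for illustration only.
n_P, n_Q, n, m, omega_Q = 300, 200, 8, 4, 0.3

T_t = time_traditional(k, n_P, t_s, t_c, t_j)
T_imp = time_improved(k2, n, m, n_Q, omega_Q, t_s, t_c, t_j)
print(f"T_t = {T_t:.3f} s, T' = {T_imp:.3f} s, saved = {T_t - T_imp:.3f} s")
```

The factor (1 − ωQ)^n is what makes the last term small: only samples whose n pairs are all correct trigger the full test against the remaining nQ − n pairs.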

Compared with the traditional RANSAC algorithm, the calculating efficiency of the improved RANSAC algorithm increased by at least 160% in this study, as shown in Table 3, which indicates considerable potential for reducing the calculation time.

Table 3 Comparison between the improved RANSAC algorithm and the traditional RANSAC algorithm

7 Reconstruction Results and Validation

The GMAW experiments were carried out to capture pairs of the welding pool images under the arc. The parameters of the welding process are shown in Table 2. The feature-matching points on each pair of welding pool images were extracted according to Sections 5 and 6. The world coordinates of the welding pool surface were reconstructed according to Section 3. The results for 240 A and 33 V are shown in Figure 15.

Figure 15 Point cloud of the welding pool under the arc

7.1 Surface Reconstruction

The point cloud was converted into a 3D reconstructed surface using the LOWESS regression smoothing algorithm, a locally weighted scatter-point estimation algorithm that depends on adjacent points [23]. The algorithm adds a sliding window to the least squares method. If the sliding window is wide, more scattered points fall within it, which yields a smooth welding pool surface but loses much of the original data information. Conversely, if the sliding window is narrow, the welding pool surface is rough, but the accuracy of the reconstructed surface is higher.
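The idea of a windowed, locally weighted least squares fit can be sketched as follows. This is a simplified illustration, not the implementation used in this study: it fits a local plane with tricube weights inside a fixed-radius window and omits the robustness iterations of the full LOWESS procedure. The function name `lowess_surface` and the synthetic point cloud are hypothetical.

```python
import numpy as np

def lowess_surface(points, query_xy, width):
    """Locally weighted planar fit of z over (x, y), evaluated at query_xy."""
    points = np.asarray(points, dtype=float)
    query_xy = np.asarray(query_xy, dtype=float)
    xy, z = points[:, :2], points[:, 2]
    out = np.full(len(query_xy), np.nan)
    for i, q in enumerate(query_xy):
        d = np.linalg.norm(xy - q, axis=1)
        inside = d < width                 # points covered by the sliding window
        if inside.sum() < 3:               # a plane needs at least three points
            continue
        w = (1.0 - (d[inside] / width) ** 3) ** 3   # tricube weights
        A = np.column_stack([np.ones(inside.sum()), xy[inside] - q])
        sw = np.sqrt(w)
        coef, *_ = np.linalg.lstsq(A * sw[:, None], z[inside] * sw, rcond=None)
        out[i] = coef[0]                   # fitted height at the query point
    return out

# Hypothetical check: points sampled from a plane are reproduced by the fit.
gx, gy = np.meshgrid(np.arange(10.0), np.arange(10.0))
cloud = np.column_stack([gx.ravel(), gy.ravel(),
                         2.0 * gx.ravel() + 3.0 * gy.ravel() + 1.0])
queries = np.array([[4.5, 4.5], [2.2, 7.1]])
z_fit = lowess_surface(cloud, queries, width=3.0)
print(z_fit)
```

A larger `width` averages over more neighbours and gives a smoother surface, while a smaller one follows the raw points more closely, which mirrors the trade-off described above.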

In this experiment, a window width of 6 pixels was selected based on comparative tests; it provided high accuracy for the 3D reconstructed data and produced a smooth reconstructed surface without trimming. The point with the smallest z value in the point cloud was selected as the center of the welding pool. Denoting the world coordinate of this point as (x0, y0, z0), the final surface reconstruction was obtained by a symmetric treatment of the smoothed surface about the plane y = y0. The final surface reconstruction of Figure 15 is shown in Figure 16. It can be seen that the reconstructed welding pool surface was consistent with the actual welding pool surface in GMAW.
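One possible reading of the symmetric treatment is a reflection of the smoothed points about the plane y = y0. The sketch below follows that reading; it is an assumption made for illustration, and the function name `mirror_about_center` and the sample points are hypothetical.

```python
import numpy as np

def mirror_about_center(points):
    """Pick the lowest point as the pool center and reflect about y = y0."""
    points = np.asarray(points, dtype=float)
    center = points[np.argmin(points[:, 2])]        # (x0, y0, z0), smallest z
    mirrored = points.copy()
    mirrored[:, 1] = 2.0 * center[1] - points[:, 1]  # y -> 2*y0 - y
    return center, np.vstack([points, mirrored])

# Hypothetical smoothed points (x, y, z) in world coordinates.
pts = np.array([[0.0, 0.0, 1.0],
                [1.0, 2.0, -1.0],
                [2.0, 3.0, 0.5]])
center, symmetric = mirror_about_center(pts)
print(center)
print(symmetric)
```

Points that already lie on the plane y = y0, including the center itself, appear twice in the output and could be deduplicated if needed.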

Figure 16 3D reconstruction of the welding pool under the arc

7.2 Validation

To verify the validity of the above feature point extraction and matching algorithms for welding pool image pairs during GMAW, a verification experiment was carried out. The diameter of the silicon nitride tracer particles was slightly smaller than that of the blind holes, so the tracer particles could be embedded tightly in the workpiece, as shown in Figure 17. The tracer particles floated on the surface of the welding pool once the solid metal near the blind hole melted. This phenomenon was captured by the binocular vision system, as shown in Figure 18. The feature points at the junction of the tracer particle and the welding pool could be selected manually (yellow dots in Figure 18), so the yellow dots were known to be correct feature matching points. Their world coordinates, calculated directly according to Section 3, were used to verify the accuracy of the neighboring feature matching points (red dots in Figure 18) extracted by the improved algorithms. Here, the heights of the yellow and red dots were assumed to be similar because they were close to each other on the smooth surface of the welding pool, although their world coordinates in the x and y directions were different.

Figure 17 Tracer particles embedded in workpiece

Figure 18 Tracer particle verification (240 A, 33 V, 0.6 m/min)

Table 4 compares the heights of the yellow dots and red dots. The maximum absolute error in the z direction was less than 0.07 mm, and the maximum relative error was less than 6.0%. These results indicate that the proposed algorithms can reconstruct the 3D surface of the welding pool in GMAW with high accuracy and efficiency.

Table 4 Comparison between the tracer particle and the matching point pairs
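The error measures used in this comparison can be computed as in the sketch below. The height values are hypothetical placeholders, not the data of Table 4, and the function name `height_errors` is introduced only for illustration. The manually selected points are taken as the reference.

```python
import numpy as np

def height_errors(z_ref, z_est):
    """Maximum absolute and relative height error of z_est against z_ref."""
    z_ref = np.asarray(z_ref, dtype=float)
    z_est = np.asarray(z_est, dtype=float)
    abs_err = np.abs(z_est - z_ref)
    rel_err = abs_err / np.abs(z_ref)   # assumes no reference height is zero
    return abs_err.max(), rel_err.max()

# Hypothetical heights in mm: reference (yellow) and extracted (red) points.
z_yellow = [1.00, 2.00]
z_red = [1.05, 1.90]
max_abs, max_rel = height_errors(z_yellow, z_red)
print(f"max absolute error = {max_abs:.3f} mm, max relative error = {max_rel:.1%}")
```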

8 Conclusions

(1) A new method was proposed to determine the feature matching of the welding pool captured by a binocular stereo vision system. It included improved SURF-BRISK-KAZE algorithms, improved RANSAC algorithms, and surface reconstruction algorithms, which realized the 3D reconstruction of the welding pool surface under the GMAW arc.

(2) The feature point descriptor was improved by considering the color information of images, which increased the matching accuracy of welding pool images. A data preprocessing step and a pre-test model were added to the RANSAC algorithm to improve its calculating efficiency.

(3) The experimental results showed high accuracy and efficiency of the new method in the reconstruction of the welding pool surface. For the reconstructed data, the maximum relative error was smaller than 6.0%.

(4) The quantitative relationship between the welding pool surface and weld quality should be established based on the experience of skilled welders, and the computation should be accelerated by parallel computing with CUDA to realize real-time 3D reconstruction of the welding pool surface in the future.


References

  1. Y Liu, Y Zhang. Control of 3D welding pool surface. Control Engineering Practice, 2013, 21(11): 1469-1480.

  2. B C Wang, S J Hu, L Sun, et al. Intelligent welding system technologies: State-of-the-art review and perspectives. Journal of Manufacturing Systems, 2020, 56: 373-391.

  3. L Caprio, A G Demir, B Previtali. Observing molten pool surface oscillations during keyhole processing in laser powder bed fusion as a novel method to estimate the penetration depth. Additive Manufacturing, 2020, 36: 101470.

  4. K R Thomas, S Unnikrishnakurup, P V Nithin, et al. Online monitoring of cold metal transfer (CMT) process using infrared thermography. Quantitative Infrared Thermography Journal, 2017, 14(1): 68-78.

  5. M Vasilev, C MacLeod, Y Javadi, et al. Feed forward control of welding process parameters through on-line ultrasonic thickness measurement. Journal of Manufacturing Processes, 2021, 64: 576-584.

  6. R Zong, J Chen, C S Wu, et al. Undercutting formation mechanism in gas metal arc welding. Welding Journal, 2016, 95: 174s-184s.

  7. L Yang, E Li, T Long, et al. A welding quality detection method for arc welding robot based on 3D reconstruction with SFS algorithm. International Journal of Advanced Manufacturing Technology, 2018, 94: 1209-1220.

  8. J Chen, Z L Feng. Strain and distortion monitoring during arc welding by 3D digital image correlation. Science and Technology of Welding and Joining, 2018, 23(6): 536-542.

  9. W J Zhang, J Xiao, Y M Zhang. A mobile sensing system for real-time 3D welding pool surface measurement in manual GTAW. Measurement Science and Technology, 2016, 27: 045102.

  10. J K Huang, W Pan, J S Chen, et al. The transient behaviours of free surface in a fully penetrated welding pool in gas tungsten arc welding. Journal of Manufacturing Processes, 2018, 36: 405-416.

  11. Z Chen, J Chen, Z Feng. 3D welding pool surface geometry measurement with adaptive passive vision images. Welding Journal, 2019, 98: 379s-386s.

  12. J Zhong, C Yang, Y Xu, et al. Research on reconstruction of welding pool surface based on shape from shading during robot aluminum alloy pulse GTAW. Advances in Intelligent Systems and Computing, 2015, 363: 525-538.

  13. C Mnich, F Al Bayat, C Debrunner, et al. In situ welding pool measurement using stereovision. Japan-USA Symposium on Flexible Automation, Denver, USA, 2004.

  14. Z Liang, H Chang, Q Wang, et al. 3D reconstruction of welding pool surface in pulsed GMAW by passive biprism stereo vision. IEEE Robotics and Automation Letters, 2019, 4(3): 3091-3097.

  15. J Xiong, Y Liu, Z Yin. Passive vision measurement for robust reconstruction of molten pool in wire and arc additive manufacturing. Measurement, 2020, 153: 107407.

  16. L Wang, J Chen, X H Fan, et al. Influence of fluid flow on humping bead during high-speed GMAW. Welding Journal, 2019: 315-327.

  17. R Mehrotra, S M Zhan. Computational approach to zero-crossing-based two-dimensional edge detection. Graphical Models and Image Processing, 1996, 58(1): 1-17.

  18. T Janumala, K B Ramesh. Development of an algorithm for vertebrae identification using speeded up robust features (SURF) technique in scoliosis x-ray images. Image Processing and Capsule Networks (ICIPCN 2020), Bangkok, Thailand, May 6-7, 2020: 54-62.

  19. A Khare, B R Mounika, M Khare. Keyframe extraction using binary robust invariant scalable keypoint features. Twelfth International Conference on Machine Vision (ICMV 2019), Amsterdam, Netherlands, Nov 16-18, 2019: UNSP 1143308.

  20. B Ramkumar, R Laber, H Bojinov, et al. GPU acceleration of the KAZE image feature extraction algorithm. Journal of Real-Time Image Processing, 2020, 17(5): 1169-1182.

  21. H C Shih, C H Ma, C L Lu. An efficient fragment reconstruction using RANSAC algorithm. 2019 IEEE 8th Global Conference on Consumer Electronics, Osaka, Japan, Oct. 15-18, 2019: 529-530.

  22. W S Cleveland. Robust locally weighted regression and smoothing scatterplots. Journal of the American Statistical Association, 1979, 74(368): 829-836.


Not applicable.


Supported by National Natural Science Foundation of China (Grant No. 51775313), Major Program of Shandong Province Natural Science Foundation (Grant No. ZR2018ZC1760), and Young Scholars Program of Shandong University (Grant No. 2017WLJH24).

Author information



JC was in charge of the whole research; ZG and JC wrote the manuscript; CW assisted with sampling and laboratory analyses. All authors read and approved the final manuscript.

Authors’ information

Zunan Gu, born in 1995, is currently a master candidate at Key Laboratory for Liquid-Solid Structural Evolution & Processing of Materials, Ministry of Education, Institute of Materials Joining, Shandong University, China.

Ji Chen, born in 1982, is currently a professor at Key Laboratory for Liquid-Solid Structural Evolution & Processing of Materials, Ministry of Education, Institute of Materials Joining, Shandong University, China. He received his PhD degree from Shandong University, China, in 2009. His research interests include intelligent control in welding processes and numerical simulation.

Chuansong Wu, born in 1959, is currently a professor at Key Laboratory for Liquid-Solid Structural Evolution & Processing of Materials, Ministry of Education, Institute of Materials Joining, Shandong University, China. He received his PhD degree from Harbin Institute of Technology, China, in 1988. His main research interests include intelligent control in welding processes, solid-state welding, and numerical simulation.

Corresponding author

Correspondence to Ji Chen.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit


About this article


Cite this article

Gu, Z., Chen, J. & Wu, C. Three-Dimensional Reconstruction of Welding Pool Surface by Binocular Vision. Chin. J. Mech. Eng. 34, 47 (2021).
