Improving Ultrasonic Testing by Using Machine Learning Framework Based on Model Interpretation Strategy
Chinese Journal of Mechanical Engineering volume 36, Article number: 127 (2023)
Abstract
Ultrasonic testing (UT) is increasingly combined with machine learning (ML) techniques for intelligently identifying damage. Extracting significant features from UT data is essential for efficient defect characterization. Moreover, the hidden physics behind ML is unexplained, reducing the generalization capability and versatility of ML methods in UT. In this paper, a generally applicable ML framework based on the model interpretation strategy is proposed to improve the detection accuracy and computational efficiency of UT. Firstly, multidomain features are extracted from the UT signals with signal processing techniques to construct an initial feature space. Subsequently, a feature selection method based on model interpretable strategy (FSMIS) is innovatively developed by integrating Shapley additive explanation (SHAP), filter method, embedded method and wrapper method. The most effective ML model and the optimal feature subset with better correlation to the target defects are determined self-adaptively. The proposed framework is validated by identifying and locating side-drilled holes (SDHs) with 0.5λ central distance and different depths. An ultrasonic array probe is adopted to acquire full matrix capture (FMC) datasets experimentally from several aluminum alloy specimens containing two SDHs. The optimal feature subset selected by FSMIS is set as the input of the chosen ML model to train and predict the times of arrival (ToAs) of the scattered waves emitted by adjacent SDHs. The experimental results demonstrate that the relative errors of the predicted ToAs are all below 3.67% with an average error of 0.25%, significantly improving the time resolution of UT signals. On this basis, the predicted ToAs are assigned to the corresponding original signals for decoupling overlapped pulse-echoes and reconstructing high-resolution FMC datasets. The imaging resolution is enhanced to 0.5λ by implementing the total focusing method (TFM). The relative errors of hole depths and central distance are no more than 0.51% and 3.57%, respectively. Finally, the superior performance of the proposed FSMIS is validated by comparing it with the initial feature space and conventional dimensionality reduction techniques.
1 Introduction
Recently, the demand for defect characterization and damage identification in materials/structures has been growing in various industrial applications, such as aerospace, nuclear, oil and gas, to ensure high performance and safety [1]. To this end, ultrasonic nondestructive testing (UT) is widely used owing to its low cost, low power consumption and the fact that it does not alter the materials/structures [2]. UT employs the scattered/diffracted/reflected waves to detect and characterize unknown defects by signal processing [3,4,5] and imaging processing techniques [6,7,8]. With the development of computer science and artificial intelligence, data-driven machine learning (ML) techniques have been adopted in the UT area to facilitate signal interpretation [9, 10]. ML provides a powerful tool to find and establish the complicated nonlinear relationship between observed UT data and physical properties of the probed structures owing to its high speed and strong fitting ability [11]. Compared to manual interpretation, ML eliminates the influence of subjective factors and realizes the intelligent identification of defects [12,13,14]. For instance, Yuan et al. [15] proposed a neural network model to identify echoes from defects in B-scans of train wheels, and the accuracy of defect recognition was improved to 92%. It should be noted that the performance of ML techniques is primarily dependent on the features extracted from UT data [16]. Original UT data contain a number of invalid and redundant features. Applying raw UT data directly in defect characterization increases the complexity and computational time of ML models [12, 16]. Therefore, the extraction and selection of defect features with meaningful information are crucial for improving defect characterization accuracy and computational efficiency [17].
Current research on UT combined with ML typically focuses on the extraction of sensitive features from the time domain, frequency domain and time-frequency domain by statistical techniques [10], Fourier transform [18], wavelet transform [19] or empirical mode decomposition (EMD) [20]. Then, an appropriate ML model, such as support vector regression (SVR) [21], artificial neural network (ANN) [22] or extreme learning machine (ELM) [23], is established by using these features to predict defect parameters. On this basis, feature selection methods preserve important information and remove redundant features without changing the physical meaning of the original feature set, reducing the overfitting of the prediction model.
Ma et al. [24] developed a back propagation neural network optimizing Gaussian process regression (BPGPR) algorithm to predict the porosity of thermal barrier coating. The features extracted from the ultrasonic reflection coefficient amplitude spectrum were optimized by combining a BP neural network and the high determination coefficient rule. The predictive accuracy of BPGPR was 32% and 48% higher than that predicted only by the BP neural network or the GPR algorithm, respectively. Bai et al. [25] extracted the scattering matrix from array data to characterize the sizes and orientation angles of small defects. A dimensionality reduction approach based on locality preserving projection was proposed to separate the scattering matrices of unfavorably oriented defects. In addition, the filter method [26], embedded method [27] and wrapper method [28] used in the field of fault diagnosis are typically implemented according to the divergence or correlation of features to optimize the feature space. Nazir et al. [29] monitored the tool conditions of ultrasonic metal welding via sensor fusion and ML. The filter method, embedded method and wrapper method were employed to select ten sensitive features from the initial feature space containing 97 features. The classification accuracies of tool conditions for training and testing datasets were both close to 100%. Besides, dimensionality reduction techniques, such as principal component analysis (PCA) [30] and factor analysis (FA) [31], can also be used for feature selection by fusing the high-dimensional feature set into significant lower-dimensional features [16]. Lv et al. [32] adopted the non-contact laser ultrasonic technique and the identified ML algorithm to quantify the widths and depths of subsurface defects simultaneously. PCA was applied to reduce mutually correlated features and improve detection accuracy. The highest recognition rate of subsurface defects was 98.48%.
However, the applications of ML methods in UT still face some challenges, e.g., the hidden physics behind ML and the unexplained contribution of each feature [11]. The intrinsic black-box character of ML models reduces their generalization performance and applicability [33, 34], and the lack of knowledge is a barrier to the deployment of ML in the UT area. The higher the interpretability of an ML model, the easier it is for someone to comprehend the decisions or predictions it makes, and the interpretability is strongly reliant on the contribution of each feature [35]. For example, Xu et al. [34] proposed an explainable ensemble tree model to identify pipeline leakage scenarios. The optimized feature space of each pipe leakage state was summarized and analyzed by Shapley additive explanation (SHAP). While retaining the advantages of ML, the method overcomes the problem that the black-box character prevents the correlation between the results and the feature space from being analyzed. Consequently, to obtain an interpretable ML model, various signal processing techniques should be applied to extract multidomain features for comprehensively mining the useful information and intrinsic properties from UT data. Then, the sensitive features in the initial feature space are selected by establishing the relation between features and predicted results based on the model interpretation strategy, eliminating deficiencies such as dependence on expert experience and poor universal applicability of features.
In this paper, a generally applicable ML framework based on model interpretation strategy is proposed by combining UT methods, signal processing techniques and ML algorithms for improving defect characterization accuracy and computational efficiency. The remainder of this paper is organized as follows. Section 2 gives an overview of the proposed ML framework. In Section 3, two illustrative examples are given to show the effectiveness of the proposed framework by identifying and locating side-drilled holes (SDHs) with sub-wavelength spacing. In Section 4, some comparisons are conducted to highlight the superiority of the proposed feature selection method. Conclusions are drawn in the final section.
2 Generally Applicable ML Framework
The input pulse used in UT is transmitted into the material under test, and the presence of material discontinuities or defects gives rise to scattered/diffracted/reflected signals [36]. On this basis, the identification and characterization of defects are carried out by appropriate signal processing techniques. ML can learn the complicated relationship between observed data and physical properties of the probed structure through adaptive learning and training, as shown in Figure 1, and has been widely used in the UT area. ML maps inputs (or features) to outputs (or target variables) during training to produce a model that accurately predicts the outputs of previously unseen input data [37]. However, the implementation process requires expertise to extract and select appropriate features from UT data as model inputs, determine the ML algorithm and find a suitable set of model hyperparameters.
The environmental noise accompanying measured UT signals obstructs damage diagnosis. The acquired signals are first preprocessed by filtering, smoothing and normalization to suppress noise. However, original UT signals contain invalid and redundant information. To reduce the complexity of ML models, various signal processing techniques are applied to the preprocessed signals to deeply mine and extract the effective multidomain features (e.g., time-domain, frequency-domain and time-frequency domain features). Every raw UT signal is transformed into a set of features with physical and statistical meaning related to the target defect, and an initial high-dimensional feature space is constructed.
Next, a feature selection method based on model interpretable strategy (FSMIS) is proposed to self-adaptively obtain an optimal feature subset that is more physically interpretable. The filter method and embedded method [38] are used to perform feature pre-selection from the initial feature space by considering two aspects: (1) whether the feature diverges or converges; and (2) the correlation between the feature and the target. Moreover, the optimal ML model depends on the issue to be addressed and is determined by evaluating the predictive capability of several models commonly used in complex and nonlinear problems. Support vector regression (SVR) is a powerful learning model that minimizes structural risk and offers good generalization capability based on statistical theory [21]. Gradient boosted regression (GBR) is an ensemble learning algorithm that promotes a series of weak learners to strong learners through iterative calculations [39]. As an extended variant of bagging in ensemble learning, random forest regression (RFR) introduces random attribute selection in the training process of the decision trees to achieve powerful performance in prediction and regression [39]. The extreme gradient boosting (XGBoost) model uses a second-order Taylor expansion of the loss function and adds a regularization term, offering low computational complexity, fast running speed and high accuracy [40]. The backpropagation neural network (BPNN) is a multilayer feedforward neural network based on the error back propagation algorithm [41]. By continuously adjusting the weight values of the network, the final network outputs are brought as close as possible to the expected outputs to achieve the purpose of training. The hyperparameters of the aforementioned ML models are determined by grid search [42].
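Two statistical indexes, mean squared error (MSE) and determination coefficient (R^{2}), are introduced to evaluate the model performance. A smaller MSE and a larger R^{2} indicate better reliability and predictive accuracy.
\[ \mathrm{MSE} = \frac{1}{m}\sum_{i=1}^{m}\left(T_{i} - H_{i}\right)^{2} \tag{1} \]
\[ R^{2} = \frac{\left[\sum_{i=1}^{m}\left(T_{i} - \overline{T}\right)\left(H_{i} - \overline{H}\right)\right]^{2}}{\sum_{i=1}^{m}\left(T_{i} - \overline{T}\right)^{2}\,\sum_{i=1}^{m}\left(H_{i} - \overline{H}\right)^{2}} \tag{2} \]
where m represents the number of samples; T_{i} and H_{i} are respectively the expected and predicted values, and \(\overline{T}\) and \(\overline{H}\) are the averages of expected and predicted values, respectively.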
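The two indexes can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' code: it assumes R^{2} takes the squared-correlation form suggested by the use of both averages \(\overline{T}\) and \(\overline{H}\), and the ToA values below are hypothetical.

```python
def mse(expected, predicted):
    """Mean squared error between expected (T) and predicted (H) values."""
    m = len(expected)
    return sum((t - h) ** 2 for t, h in zip(expected, predicted)) / m


def r_squared(expected, predicted):
    """R^2 in squared-correlation form (assumed form, uses both means)."""
    m = len(expected)
    t_bar = sum(expected) / m
    h_bar = sum(predicted) / m
    cov = sum((t - t_bar) * (h - h_bar) for t, h in zip(expected, predicted))
    var_t = sum((t - t_bar) ** 2 for t in expected)
    var_h = sum((h - h_bar) ** 2 for h in predicted)
    return cov ** 2 / (var_t * var_h)


# Hypothetical ToA values in microseconds, for illustration only.
expected = [14.0, 15.0, 16.0, 17.0]
predicted = [14.1, 14.9, 16.2, 16.8]

print("MSE:", mse(expected, predicted))
print("R^2:", r_squared(expected, predicted))
```

A perfect prediction gives an MSE of 0 and an R^{2} of 1, which is the direction of "better" used throughout the comparisons that follow.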
It is difficult to understand the model decisions and the influences of features due to the intrinsic black-box character of ML models [2]. Therefore, SHAP [43] is incorporated to explore the importance of each feature on the predicted results and self-adaptively sort out highly sensitive features with more defect information, further reducing the dimension of the feature space. SHAP is a model interpreter based on the Shapley value, a concept from game theory [33]. The SHAP value ϕ_{P} of feature P is the average of its marginal contributions across all possible permutations and combinations considered [33].
\[ \phi_{P} = \sum_{V \subseteq n \setminus \{P\}} \frac{|V|!\,\left(|n| - |V| - 1\right)!}{|n|!}\left[F\left(V \cup \{P\}\right) - F(V)\right] \tag{3} \]
where F(V) corresponds to the output of the ML model to be explained using a set V of features, and n is the complete set of all features.
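To make the averaging over feature subsets concrete, the sketch below computes exact Shapley values by brute-force enumeration for a tiny, hypothetical set function. It illustrates the definition only; the SHAP library referenced in the paper relies on approximations for real models, and `toy_model` and its feature names are invented for the example.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, features):
    """Exact Shapley value of each feature for a set function `model`."""
    n = len(features)
    values = {}
    for p in features:
        others = [f for f in features if f != p]
        phi = 0.0
        for size in range(n):
            # Weight of every subset V of this size that excludes p.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                v = frozenset(subset)
                phi += weight * (model(v | {p}) - model(v))
        values[p] = phi
    return values


def toy_model(subset):
    """Hypothetical model output: additive terms plus one interaction."""
    score = 0.0
    if "P3" in subset:
        score += 2.0
    if "P4" in subset:
        score += 1.0
    if "P3" in subset and "P5" in subset:
        score += 1.0  # interaction, shared equally by P3 and P5
    return score


phi = shapley_values(toy_model, ["P3", "P4", "P5"])
print(phi)
```

The values sum to the difference between the model output with all features and with none, which is the additivity property that lets SHAP values be stacked per feature.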
Notably, features that appear irrelevant to the target individually may become highly relevant when combined with others [28]. The impact of feature combination should also be considered. Hence, the wrapper method is utilized to assess the potential feature subsets [38, 44]. Multiple combinations of the available features are tested, and the feature subset presenting the best performance is finally chosen [38]. In this paper, the optimal feature subset is determined by comparing the predictive performance of the feature subsets obtained by sequential forward selection (SFS) [45] and sequential backward selection (SBS) [46].
Finally, the ML model trained with the feature subset selected by FSMIS is assessed for optimality according to its predictive accuracy. If the outputs deviate greatly from the expected values, the initial features are re-extracted. The above processes are repeated until the most effective ML model and the optimal feature subset highly correlated to the target characteristics are acquired. Overall, Figure 2 shows the proposed ML framework, which can be applied to different UT scenarios for locating and characterizing defects quantitatively.
3 Experiments
3.1 Specimens and Experimental Details
To evaluate the performance of the proposed framework, the experiments were conducted on six 180 mm × 95 mm × 15 mm 6061 aluminum alloy specimens containing adjacent SDHs. The longitudinal wave velocity was 6300 m/s, and the corresponding wavelength λ in aluminum alloy was about 2.8 mm at 2.25 MHz inspection frequency. As schematically illustrated in Figure 3a, the central distances of the SDHs in the specimens are 1.40 mm (0.5λ) ~ 2.80 mm (1.0λ) with a step of 0.28 mm (0.1λ), and the diameter and central depth of the SDHs are 1.0 mm and 50 mm, respectively.
The full matrix capture (FMC) technique is introduced to capture all the possible independent information from the array elements and provide plenty of flexibility for post-processing [47]. For an array with N elements, N^{2} signals are obtained by FMC. Figure 3a shows the ultrasonic path from the ith element (with coordinates (x_{i}, 0)) to the jth element (with coordinates (x_{j}, 0)) through a potential scatterer located at coordinates (x_{ref}, z_{ref}), and y_{ij}(t) denotes the corresponding A-scan signal. The Eddyfi M2M PANTHER and a linear array probe (64 elements, 0.6 mm pitch and 2.25 MHz central frequency) are employed to acquire FMC data with 100 MHz sampling frequency from the top and bottom surfaces of each specimen, as shown in Figure 3b. Therefore, the actual to-be-measured SDH depths included 45 mm and 50 mm. To reduce data redundancy, only the A-scan signals transmitted and received by the left 32 elements were considered according to the symmetry and reciprocity of the inspection model. Therefore, 12288 time-domain signals corresponding to 12 FMC datasets were obtained from experiments.
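The wavelength, hole spacings and signal count quoted above follow from simple arithmetic, sketched here as a quick check. All inputs are taken from the text; the variable names are illustrative only.

```python
# Acquisition parameters stated in the text.
C_LONGITUDINAL = 6300.0   # longitudinal wave velocity, m/s
F_CENTER = 2.25e6         # inspection frequency, Hz

# Wavelength in millimetres: lambda = c / f.
wavelength_mm = C_LONGITUDINAL / F_CENTER * 1e3

# Central distances from 0.5*lambda to 1.0*lambda in steps of 0.1*lambda.
distances_mm = [round(k / 10 * wavelength_mm, 2) for k in range(5, 11)]

# Signal count: 6 specimens x 2 surfaces, 32 x 32 transmit-receive pairs each.
n_specimens = 6
surfaces = 2
elements_used = 32
signals_per_dataset = elements_used ** 2
n_datasets = n_specimens * surfaces
total_signals = n_datasets * signals_per_dataset

print(wavelength_mm, distances_mm, n_datasets, total_signals)
```

The six spacings correspond to the six specimens, and the two inspection surfaces per specimen give the 12 FMC datasets.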
The representative time traces of the scattered waves are plotted in Figure 4a, where the pulses from two SDHs overlap due to the low time resolution [3]. The time resolution depends on the spatial pulse length (SPL) of the probing signal, and the theoretical resolution limit in UT is equal to half the SPL [48]. The SPL in this study was about 1.08 μs, so the resolution limit was 0.54 μs. Taking the SDHs with 0.5λ central distance at 45 mm and 50 mm depths as examples, the pulse-echoes in 2048 signals were strongly coupled, since the calculated interval of the times of arrival (ToAs) of scattered waves ranged from 0.0048 to 0.19 μs. It is desirable to improve the time resolution of each A-scan signal in the FMC datasets for accurately locating the SDHs.
Moreover, post-processing imaging techniques, such as the total focusing method (TFM), can be performed on the FMC data to obtain high-resolution ultrasonic images [49]. TFM is a delay-and-sum beamforming algorithm, in which the array signals are synthetically focused on each point in the region of interest [50].
As shown in Figure 3a, the delay law is calculated based on the ray path from each array element to point Q, and the corresponding intensity I(x_{ref}, z_{ref}) is given by
\[ I\left(x_{ref}, z_{ref}\right) = \left|\sum_{i=1}^{N}\sum_{j=1}^{N} y_{ij}\left(t_{ij}\right)\right| \tag{4} \]
where t_{ij} represents the travel time from the ith element through focus point Q to the jth element.
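A minimal delay-and-sum sketch of the intensity at one focal point is given below. It is an illustration under simplifying assumptions that the paper does not state: a straight ray path in a homogeneous medium, nearest-sample lookup instead of interpolation, and raw A-scans rather than their analytic (Hilbert) envelope. The helper `synthetic_fmc` and all numeric values other than the pitch, velocity and sampling rate are hypothetical.

```python
from math import hypot


def tfm_intensity(fmc, element_x, x_ref, z_ref, velocity, fs):
    """Delay-and-sum intensity at focal point (x_ref, z_ref).

    fmc[i][j] is the sampled A-scan for transmitter i and receiver j.
    Units must be consistent, e.g. mm, mm/us and samples/us.
    """
    total = 0.0
    for i, xi in enumerate(element_x):
        d_tx = hypot(xi - x_ref, z_ref)           # transmit leg
        for j, xj in enumerate(element_x):
            d_rx = hypot(xj - x_ref, z_ref)       # receive leg
            t_ij = (d_tx + d_rx) / velocity       # travel time via the point
            k = round(t_ij * fs)                  # nearest sample, no interpolation
            if 0 <= k < len(fmc[i][j]):
                total += fmc[i][j][k]
    return abs(total)


def synthetic_fmc(element_x, scatterer, velocity, fs, n_samples):
    """Hypothetical FMC data: one unit spike per A-scan from a point scatterer."""
    xs, zs = scatterer
    fmc = [[[0.0] * n_samples for _ in element_x] for _ in element_x]
    for i, xi in enumerate(element_x):
        for j, xj in enumerate(element_x):
            t = (hypot(xi - xs, zs) + hypot(xj - xs, zs)) / velocity
            fmc[i][j][round(t * fs)] = 1.0
    return fmc


# Four elements at 0.6 mm pitch, 6.3 mm/us velocity, 100 samples/us.
elements = [-0.9, -0.3, 0.3, 0.9]
data = synthetic_fmc(elements, (0.0, 10.0), 6.3, 100.0, 1000)
at_scatterer = tfm_intensity(data, elements, 0.0, 10.0, 6.3, 100.0)
away = tfm_intensity(data, elements, 0.0, 20.0, 6.3, 100.0)
print(at_scatterer, away)
```

All 16 transmit-receive contributions add coherently at the true scatterer position and vanish elsewhere, which is the focusing effect exploited when the image is evaluated over a grid of points.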
The TFM images of the SDHs with 0.5λ central distance at 45 mm and 50 mm depths are presented in Figure 4b. It is challenging to distinguish and locate the SDHs with sub-wavelength spacing due to the diffraction limit [51]. Focusing on the above two basic issues in UT, the proposed ML framework based on model interpretation strategy is applied to ultrasonic signal analysis and image processing to simultaneously improve the time and imaging resolutions and verify the performance.
3.2 Construction of Feature Space
Considering that the key to improving time and imaging resolutions is to decouple the overlapped pulse-echoes from two closely spaced scatterers [49], the corresponding ToAs t_{1} and t_{2} were adopted as the outputs of the ML model. As given by Eq. (5), the predicted ToAs of the scattered waves are assigned to the corresponding original signal to decouple the overlapped pulse-echoes. If and only if t = t_{1} or t_{2}, the signal amplitude is 1; otherwise, the amplitude is 0. The schematic diagrams of the raw and decoupled time-domain signals are shown in Figure 5.
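The decoupling rule described above amounts to replacing each A-scan by a two-spike signal. The sketch below is one possible reading, assuming that each ToA is mapped to its nearest sample at the 100 MHz sampling rate; the function name and the ToA values are hypothetical.

```python
def decouple(n_samples, fs, toas):
    """Return a signal that is 1 at each predicted ToA sample and 0 elsewhere.

    fs is in samples per microsecond and toas are in microseconds.
    """
    signal = [0.0] * n_samples
    for t in toas:
        k = round(t * fs)      # nearest sample index (assumption)
        if 0 <= k < n_samples:
            signal[k] = 1.0
    return signal


# Hypothetical ToAs 0.15 us apart, sampled at 100 MHz (100 samples/us).
decoupled = decouple(n_samples=2000, fs=100.0, toas=(14.30, 14.45))
spike_indices = [k for k, a in enumerate(decoupled) if a == 1.0]
print(spike_indices)
```

Two echoes 0.15 μs apart fall well inside the 0.54 μs resolution limit of the raw signal, yet they occupy distinct samples once reduced to spikes.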
The initial feature space was established by extracting 82 features from each A-scan signal in the FMC datasets based on various signal processing techniques. There were 21 statistical features associated with signal amplitude and time information extracted in the time domain, including peak value, ToA of peak value, root-mean-square, peak-to-peak value, variance and skewness [52], etc. Shannon entropy [53] is a measurement of uncertainty and depicts the distribution and variation of UT signals. The entropy at given scales of the UT signal from SDHs always varies with central distance and can be considered as another important feature for defect characterization [54].
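A few of the time-domain statistics named above can be computed as follows. This is a sketch only: the exact definitions used by the authors (for example population versus sample variance) are not given in the text, so the forms below are assumptions, and the five-sample signal is hypothetical.

```python
from math import sqrt


def time_domain_features(signal, fs):
    """Six time-domain statistics of one A-scan (fs in samples per microsecond)."""
    n = len(signal)
    mean = sum(signal) / n
    var = sum((s - mean) ** 2 for s in signal) / n      # population variance
    third = sum((s - mean) ** 3 for s in signal) / n    # third central moment
    peak_index = max(range(n), key=lambda k: abs(signal[k]))
    return {
        "peak": abs(signal[peak_index]),
        "toa_of_peak": peak_index / fs,
        "rms": sqrt(sum(s * s for s in signal) / n),
        "peak_to_peak": max(signal) - min(signal),
        "variance": var,
        "skewness": third / var ** 1.5 if var > 0 else 0.0,
    }


# Hypothetical five-sample signal at 100 samples/us.
features = time_domain_features([0.0, 1.0, -2.0, 1.0, 0.0], fs=100.0)
print(features)
```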
Frequency domain analysis extracts the features advantageous in defect identification [16]. For example, the intervals between the extreme values in the frequency spectrum are related to the path/time difference of the scattered waves from adjacent defects [5]. A total of 22 features were extracted from the frequency spectrum obtained by fast Fourier transform (FFT), such as maximum amplitude, mean square frequency, −6 dB bandwidth, resonant frequency, gravity frequency and frequency variance [10], etc. In addition, autoregressive (AR) spectrum extrapolation has the ability to extend the effective frequency band and compress the time-domain pulse width to improve time resolution [55]. To this end, AR spectrum extrapolation was implemented on each A-scan signal. The AR parameters were determined by knowledge-based methods [49], and the AR coefficients were extracted as frequency-domain features [56, 57].
In time-frequency domain analysis, wavelet packet transform (WPT) with 'DB5' mother wavelet and 4 decomposition layers was used to decompose each A-scan signal into 16 frequency band signals. The Shannon entropy of each frequency band and its ratio of the total energy were extracted as the time-frequency domain features, resulting in a total of 32 features. Furthermore, as an adaptive time-frequency analysis method, EMD [58] was introduced to decompose UT signals into a finite number of stationary intrinsic mode functions (IMFs). The largest eigenvalue of the covariance matrix constructed by all IMFs (except the residual IMF) [20], along with the normalized energy and energy moment of the first three IMFs, were adopted as the time-frequency features.
3.3 Selection of Features and Regression Model
Feature selection has a significant influence on the predictive accuracy of ML models. Determining suitable features can reduce the complexity and overfitting, alleviate the effect of the curse of dimensionality and improve the generalization capability and interpretability [26]. In this paper, FSMIS was proposed by integrating SHAP, filter method, embedded method and wrapper method to reduce the dimension of the initial feature space and make feature selection more physically interpretable. The optimal ML model was determined simultaneously in this process, and the sensitive features with minimum redundancy and maximum relevance to target defects were selected self-adaptively.
Firstly, the filter method was implemented to select features. Features whose variances fell below the threshold of 0.05 were removed, and 66 features were retained, since a feature with low variance is not beneficial to the discrimination of different samples [29].
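A variance filter of this kind can be sketched as follows. The threshold of 0.05 is taken from the text; whether the features were normalized before thresholding is not stated, so the sketch simply assumes comparable scales, and the two feature columns are hypothetical.

```python
from statistics import pvariance


def variance_filter(columns, threshold=0.05):
    """Keep the feature columns whose population variance reaches the threshold."""
    return {
        name: values
        for name, values in columns.items()
        if pvariance(values) >= threshold
    }


# Hypothetical feature columns, one value per A-scan.
columns = {
    "peak": [0.1, 0.9, 0.5, 0.3],          # variance 0.0875, kept
    "flat": [0.50, 0.51, 0.49, 0.50],      # variance 0.00005, removed
}
kept = variance_filter(columns)
print(list(kept))
```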
Mutual information (MI) was used to measure the linear or nonlinear relationship between each feature and the ToAs. The irrelevant features with the maximal information coefficient (MIC) equal to 0 were removed from the feature space.
\[ \mathrm{MI}(A, B) = \sum_{a \in A}\sum_{b \in B} p(a, b)\,\log\frac{p(a, b)}{p(a)\,p(b)} \tag{6} \]
where p(a) and p(b) are respectively the probability of input a and output b, and p(a, b) is the joint distribution probability of a and b.
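For discrete variables, mutual information can be estimated directly from empirical frequencies, as sketched below. Continuous features and ToAs would first need binning or a nearest-neighbour estimator, which the text does not specify, so this is an illustration of the quantity rather than of the authors' estimator. The sample vectors are hypothetical.

```python
from collections import Counter
from math import log


def mutual_information(a, b):
    """Empirical mutual information (in nats) of two discrete sequences."""
    n = len(a)
    count_a = Counter(a)
    count_b = Counter(b)
    count_ab = Counter(zip(a, b))
    mi = 0.0
    for (x, y), c in count_ab.items():
        p_xy = c / n
        mi += p_xy * log(p_xy / ((count_a[x] / n) * (count_b[y] / n)))
    return mi


# Hypothetical binned feature and target values.
dependent = mutual_information([0, 0, 1, 1], [0, 0, 1, 1])
independent = mutual_information([0, 0, 1, 1], [0, 1, 0, 1])
print(dependent, independent)
```

A feature that is statistically independent of the target scores 0 and would be discarded at this step, whereas a fully informative binary feature scores log 2.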
The embedded method integrates feature selection and the training of the learner, which are completed in the same optimization process. A total of 20 important features with weights higher than the average were determined by the random forest method [59], as shown in Table 1.
A total of 12288 A-scan signals (12 FMC datasets) were acquired by experiments and randomly divided into 80% training data and 20% testing data. SVR, GBR, RFR, XGBoost and BPNN were adopted to establish regression models. The hyperparameters of each model were found by grid search. The ten-fold cross-validated average MSE and R^{2} were calculated to evaluate the accuracy of the above models. As shown in Figure 6, the BPNN model has the best overall performance with the lowest MSE and highest R^{2}, since it has a strong ability for data mining and solving inverse problems with highly nonlinear correlations [10], and the mapping between input and output data can be obtained by adaptive training with sufficient samples. Consequently, BPNN was chosen as the optimal ML model in the following parts.
In addition, strong correlations may exist among the 20 selected features. If one feature provides enough information, other highly correlated features contribute nothing further. The Pearson correlation coefficient was therefore calculated to identify correlated features and overcome the influence of multicollinearity. Figure 7a shows the degree of correlation between features. Each grid cell gives the Pearson correlation coefficient of the feature pair on the horizontal and vertical axes, and a darker color indicates a higher degree of correlation between the two features. The 20 features were divided into six groups of relevant features (the absolute value of Pearson correlation coefficient > 0.9) and five independent features (P_{10}, P_{11}, P_{12}, P_{13} and P_{20}). Subsequently, SHAP was incorporated to analyze the importance of each feature on the outputs of the BPNN model. Figure 7b depicts the stacks of the mean absolute SHAP values of each feature for the two outputs (t_{1} and t_{2}), and a higher sum indicates a greater impact during the prediction process [35]. It can be seen that the importance differs between features. For each group of relevant features, the feature with the highest SHAP value (P_{4}, P_{5}, P_{3}, P_{6}, P_{16} and P_{19}) was selected and retained together with the five independent features, resulting in a total of 11 features.
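The grouping-and-selection step can be sketched as follows. The 0.9 threshold comes from the text, while the greedy grouping rule (a feature joins the first group containing a strongly correlated member) is a simplifying assumption, and the feature columns and importance scores are hypothetical stand-ins for the mean absolute SHAP values.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def select_representatives(columns, importance, threshold=0.9):
    """Group features with |r| > threshold, keep the most important per group."""
    groups = []
    for name in columns:
        for group in groups:
            if any(abs(pearson(columns[name], columns[g])) > threshold for g in group):
                group.append(name)
                break
        else:
            groups.append([name])   # no correlated group found: independent so far
    return [max(group, key=lambda g: importance[g]) for group in groups]


# Hypothetical feature columns and importance scores.
columns = {
    "A": [1.0, 2.0, 3.0, 4.0],
    "B": [2.0, 4.0, 6.0, 8.1],   # nearly proportional to A
    "C": [4.0, 1.0, 3.0, 2.0],   # weakly correlated with A and B
}
importance = {"A": 0.2, "B": 0.5, "C": 0.1}
selected = select_representatives(columns, importance)
print(selected)
```

Here A and B form one group and only the more important member survives, while C is kept as an independent feature.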
To further reduce the redundancy of the feature space, it is necessary to consider the contribution of feature combination. Two greedy wrapper methods (SFS and SBS) were adopted to select the optimal feature subset from the 11 features. A total of 12288 A-scan signals were split into the training set and testing set at a ratio of 8:2, and the ten-fold cross-validated average MSE and R^{2} were used to test the predictive accuracy of different feature subsets.

(1) SFS starts with an empty set and iteratively selects one feature at a time until no improvement in predictive accuracy can be achieved. As shown in Figure 8a, the feature set P1 = (P_{3}, P_{4}, P_{5}, P_{6}, P_{11}, P_{12}, P_{13}, P_{19}, P_{20}) has the smallest MSE = 0.0050 and the largest R^{2} = 0.99197.

(2) SBS starts with the set of all features and progressively eliminates the least promising one. This process stops if the performance of the learning algorithm drops below a given threshold. As shown in Figure 8b, the feature set P2 = (P_{3}, P_{4}, P_{5}, P_{6}, P_{11}, P_{13}, P_{16}, P_{19}, P_{20}) has the smallest MSE = 0.0049 and the largest R^{2} = 0.99198.
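The forward variant of the greedy search can be sketched as below. In the study the score would be the ten-fold cross-validated MSE of the BPNN; here `toy_mse` is a hypothetical stand-in so that the example runs on its own, and the feature gains are invented.

```python
def sequential_forward_selection(features, score):
    """Greedy SFS: add the feature that most lowers `score` until none helps."""
    selected = []
    remaining = list(features)
    best = float("inf")
    while remaining:
        # Try each remaining feature and keep the one giving the lowest score.
        trial = min(remaining, key=lambda f: score(selected + [f]))
        trial_score = score(selected + [trial])
        if trial_score >= best:
            break                      # no improvement, stop
        selected.append(trial)
        remaining.remove(trial)
        best = trial_score
    return selected, best


# Hypothetical error model: useful features lower the error, every added
# feature costs a small complexity penalty.
USEFUL = {"P3": 0.5, "P4": 0.3, "P19": 0.1}


def toy_mse(subset):
    gain = sum(USEFUL.get(f, 0.0) for f in subset)
    return 1.0 - gain + 0.01 * len(subset)


subset, best_score = sequential_forward_selection(["P3", "P4", "P12", "P19"], toy_mse)
print(subset, best_score)
```

SBS mirrors this loop, starting from the full set and removing the feature whose elimination hurts the score least. Because both searches are greedy, they can end at different subsets, as the two results above show.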
The two feature subsets determined by SFS and SBS both contained nine features, of which only one feature was different. Considering that the MSE and R^{2} of P1 and P2 were almost the same, the feature subset P2, which performed marginally better, was selected as the optimal feature subset in this study. The results demonstrated that time-domain features, frequency-domain features and the features obtained by wavelet decomposition were identified as the most significant features for predicting ToAs.
3.4 Experimental Results and Analysis
3.4.1 Enhancement of Time Resolution in UT
As mentioned in Section 3.1, 12288 A-scan signals (12 FMC datasets) were acquired from the aluminum alloy specimens, where the central distances of SDHs were varied from 0.5λ to 1.0λ. Taking the 0.5λ central distance SDHs at 45 mm and 50 mm depths as examples, some typical UT signals captured by different transmitter-receiver pairs are presented in Figures 9a, c, respectively. The pulse-echoes from SDHs overlap, and it is challenging to extract the ToA of the respective scattered wave. Nine features (P_{3}, P_{4}, P_{5}, P_{6}, P_{11}, P_{13}, P_{16}, P_{19}, P_{20}) were extracted from each A-scan signal to construct the feature set used as the inputs of the BPNN model. Meanwhile, the ToAs (t_{1} and t_{2}) of the scattered waves from adjacent SDHs were set as the outputs. The 10240 signals collected from the SDHs with 0.6λ ~ 1.0λ central distance were employed to train the model for obtaining the optimized weights and biases, while the remaining 2048 signals corresponding to 0.5λ central distance SDHs were used to test the model.
The calculated R^{2} and MSE are respectively equal to 0.99 and 0.0055, indicating that the trained BPNN model has excellent predictive accuracy and generalization capability [21].
Figures 10a, b present the predicted ToAs of the testing data. The discrete points are well located around the solid line with a slope of 1, indicating that the predicted values are approximately the same as the expected values. The band lines in the figures show that about 98% of predicted values are within 1% deviation from the expected values. Figures 10c, d show the relative errors of the predicted ToAs, which are all below 3.67% with an average error of 0.25%. Such low errors suggest that the proposed ML framework based on model interpretation strategy effectively separates the overlapped UT signals and improves the time resolution, i.e., t_{2} − t_{1}.
3.4.2 Enhancement of Imaging Resolution in UT
The predicted ToAs presented in Section 3.4.1 were applied to reconstruct new FMC datasets containing decoupled signals for TFM imaging. As shown in Figure 11a, the SDHs with 0.5λ central distance at different depths are identified from the delay-and-sum images. The relative measurement errors of hole depths and central distances are no more than 0.51% and 3.57%, respectively.
Two key parameters, i.e., the peak to central intensity difference (τ) and the array performance indicator (API) [60], were introduced to describe the TFM images quantitatively. Smaller τ and API values indicate better imaging performance. Figure 11b presents the cross-sections taken through the centers of the SDHs in TFM images with raw FMC datasets and reconstructed high-resolution FMC datasets. The API values for the latter are reduced by 92.71% and 87.39% compared to those for the former. It is difficult to determine τ values from the original TFM images. In contrast, the τ values for the TFM images with reconstructed FMC datasets are −17.32 dB and −16.42 dB, both less than −6 dB. The experimental results demonstrate that the proposed framework is suitable for determining the optimal ML model and feature subset, accurately predicting the ToAs of the scattered waves from adjacent defects. The imaging resolution can be improved to the sub-wavelength scale by combining the proposed framework and TFM, breaking the diffraction limit and highlighting the target characteristics with accurate location.
4 Discussion
The proposed FSMIS was validated by comparing it with four dimensionality reduction methods commonly used for feature selection, namely PCA, FA, kernel principal component analysis (KPCA) and independent component analysis (ICA). PCA is a linear dimensionality reduction technique representing the maximum variance in the data [30]. KPCA is a nonlinear PCA developed with the kernel method by transforming the input features into a high-dimensional space through a nonlinear mapping function and performing PCA to achieve feature fusion and dimension reduction [61]. FA describes the variability among the original features in terms of fewer variable factors [31]. The original features are modeled as linear combinations of factors plus error. ICA is a statistical and computational technique for revealing hidden information underlying the feature set [31]. The original features in ICA are transformed into new features which are mutually statistically independent [62]. In short, these four methods fuse the high-dimensional initial feature space into significant low-dimensional features.
The above methods were used to reduce the dimensionality of the initial feature space. The first two principal components in PCA had a cumulative explained variance proportion of 0.99 and were chosen for evaluation. The first five principal component features were obtained by KPCA with the polynomial kernel method. Ten factors were selected by FA according to the variance percentage. The FastICA algorithm was applied to ICA, and five independent components were extracted from the initial feature space. The feature sets determined by the aforementioned methods were used independently as the inputs to predict the ToAs in the BPNN model. The dataset with 12288 A-scan signals was randomly split into the training set and testing set at a ratio of 8:2. The ten-fold cross-validated average MSE and R^{2} were employed to test the predictive performance of each feature set. As shown in Figure 12, FSMIS has the lowest MSE (0.0048) and the highest R^{2} (0.99), i.e., the best overall performance compared to the unsupervised techniques (PCA, KPCA, FA and ICA). Unsupervised dimensionality reduction is implemented based on the features alone rather than on the effect of each feature and feature combination on the targets. In contrast, the proposed FSMIS method is able to self-adaptively obtain the optimal feature subset by integrating SHAP, filter method, embedded method and wrapper method, quantitatively analyzing the contributions of each feature and feature combination.
To demonstrate the advantage of the FSMIS method in improving computational efficiency, we compared the performance of BPNN models trained with the nine features selected by FSMIS and with all 82 initial features. Of the 12288 experimental signals in Section 3, the 10240 signals corresponding to SDHs with central distances of 0.6λ to 1.0λ were employed for training the model, and the remaining 2048 signals corresponding to SDHs with 0.5λ central distance were used to test it. On this basis, the 82 features extracted from each A-scan signal were used as inputs to predict the ToAs. The resulting R^{2} and MSE were 0.99 and 0.0092, respectively. Compared with the results for nine features, the model trained with 82 features still performs at a high level, with only slightly lower predictive accuracy. However, its training time is 167.25 s, whereas that for nine features is only 5.41 s. The results demonstrate that the proposed FSMIS method improves computational efficiency while maintaining high predictive accuracy.
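The hold-out scheme above differs from a random split: the model is tested on a central distance it never saw during training. A minimal sketch of such a group-wise split follows. The 0.1λ spacing steps and the equal count of 2048 signals per spacing are assumptions chosen to be consistent with the stated totals, and `split_by_spacing` is a hypothetical helper name.

```python
def split_by_spacing(spacings, held_out):
    """Return (train, test) index lists; every signal whose central
    distance equals held_out goes to the test set."""
    train = [i for i, s in enumerate(spacings) if s != held_out]
    test = [i for i, s in enumerate(spacings) if s == held_out]
    return train, test

# Assumed layout: six central distances (in units of lambda), 2048 signals each.
levels = [0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
spacings = [s for s in levels for _ in range(2048)]

train_idx, test_idx = split_by_spacing(spacings, held_out=0.5)

# Ratio of the reported training times (82 features vs. nine features).
speedup = 167.25 / 5.41
print(len(train_idx), len(test_idx), round(speedup, 1))
```

Under these assumptions the split reproduces the 10240 and 2048 signal counts, and the reported training times correspond to a speed-up of roughly 31 times for the nine-feature model.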
As given by Eq. (5), the ToAs predicted using 82 features were also employed to decouple the overlapped pulse-echoes. Figure 13 shows the relative errors between the predicted ToAs and the expected values for the testing dataset; the average error is 0.37%, which is 0.12 percentage points higher than in Figures 10c, d. Subsequently, the predicted ToAs were employed to reconstruct high-resolution FMC datasets, and TFM imaging was conducted by delay-and-sum beamforming. As illustrated in Figure 14a, the SDHs with 0.5λ central distance at depths of 45 mm and 50 mm are resolved, but the maximum measurement errors of hole depth and central distance are 0.71% and 59.59%, much larger than those observed in Figure 11a. Figure 14b presents the τ values and API values of the TFM images obtained with the different feature sets. Compared with the TFM images obtained with 82 features, the τ and API values corresponding to nine features are reduced significantly. The experimental results demonstrate that the feature subset selected by FSMIS describes the intrinsic properties of UT signals well and accurately predicts the ToAs of the scattered waves from adjacent defects. The proposed ML framework based on the model interpretation strategy improves the accuracy of defect characterization and the calculation efficiency, meeting the requirements of nondestructive testing and evaluation.
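The delay-and-sum step of TFM mentioned above can be sketched as follows. This is a generic textbook formulation, not the authors' implementation: it assumes a linear array on the surface z = 0, a single constant wave speed, nearest-sample lookup and no envelope detection. The function name `tfm_image`, the array geometry, the wave speed and the sampling rate are illustrative assumptions.

```python
import numpy as np

def tfm_image(fmc, elem_x, grid_x, grid_z, c, fs):
    """Delay-and-sum TFM. fmc has shape (n_tx, n_rx, n_samples);
    elements lie on the line z = 0 at lateral positions elem_x."""
    n_tx, n_rx, n_samples = fmc.shape
    xx, zz = np.meshgrid(grid_x, grid_z)               # each (nz, nx)
    # One-way path length from every element to every pixel: (n_elem, nz, nx)
    dist = np.sqrt((xx[None] - elem_x[:, None, None]) ** 2 + zz[None] ** 2)
    image = np.zeros(xx.shape)
    for tx in range(n_tx):
        for rx in range(n_rx):
            # Round-trip time of flight converted to a sample index.
            idx = np.rint((dist[tx] + dist[rx]) / c * fs).astype(int)
            valid = idx < n_samples
            image[valid] += fmc[tx, rx, idx[valid]]
    return np.abs(image)

# Toy FMC data set: one ideal point scatterer, unit-impulse echoes.
c, fs = 6.3, 100.0                    # mm/us and samples/us (assumed values)
elem_x = (np.arange(8) - 3.5) * 0.6   # 8 elements, 0.6 mm pitch (assumed)
grid_x = np.linspace(-5.0, 5.0, 21)   # lateral pixel positions in mm
grid_z = np.linspace(40.0, 50.0, 21)  # depth pixel positions in mm
sx, sz = grid_x[12], grid_z[10]       # scatterer placed on a grid node

d = np.sqrt((sx - elem_x) ** 2 + sz ** 2)
fmc = np.zeros((8, 8, 2000))
for tx in range(8):
    for rx in range(8):
        fmc[tx, rx, int(np.rint((d[tx] + d[rx]) / c * fs))] = 1.0

image = tfm_image(fmc, elem_x, grid_x, grid_z, c, fs)
print(image[10, 12], image.max())
```

At the scatterer pixel all transmit-receive pairs add coherently, so the pixel value equals the number of pairs. In this picture, an error in the ToAs shifts the echoes in the reconstructed FMC data and therefore the position at which they focus, which is consistent with the larger depth and central-distance errors reported for the 82-feature model.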
5 Conclusions and Further Work

(1) A generally applicable ML framework for UT based on the model interpretation strategy is proposed to improve the accuracy and efficiency of defect characterization. Signal processing techniques are used to extract multidomain features from the UT signals and construct a typical feature space. The FSMIS method is developed to self-adaptively determine the optimal feature subset, which correlates better with the target defects, and to make the feature selection more physically interpretable.

(2) The experimental results indicate that the proposed framework can decouple the overlapped pulse-echoes from SDHs with 0.5λ central distance and improve the time resolution of UT signals. The relative errors of the predicted ToAs are all below 3.67%, with an average error of 0.25%. On this basis, the ultrasonic imaging resolution is enhanced to 0.5λ by combining the framework with TFM. The relative measurement errors of hole depth and central distance are no more than 0.51% and 3.57%, respectively.

(3) FSMIS is adopted to visualize the contributions of each feature and feature combination to the targets by integrating SHAP, the filter method, the embedded method and the wrapper method. Compared with the initial feature space and the features determined by conventional dimensionality reduction techniques, the feature subset selected by FSMIS improves the predictive accuracy and computational efficiency of ML models.

(4) In future work, more diverse datasets covering defects of various sizes, shapes and locations will be incorporated to detect and characterize unknown damage accurately. In addition, we will explore the comprehensive impact of structural noise, originating from grain boundaries and structural features in multiphase materials, on the predictive performance of the ML framework.
Availability of Data and Materials
All data generated or analyzed during this study are included in this published article.
References
Z Wang, Z C Fan, X D Chen, et al. Modeling and experimental analysis of roughness effect on ultrasonic nondestructive evaluation of microcrack. Chinese Journal of Mechanical Engineering, 2021, 34: 114.
A L Bowler, M P Pound, N J Watson. A review of ultrasonic sensing and machine learning methods to monitor industrial processes. Ultrasonics, 2022, 124: 106776.
J Chen, E Y Wu, H T Wu, et al. Enhancing ultrasonic time-of-flight diffraction measurement through an adaptive deconvolution method. Ultrasonics, 2019, 96: 175-180.
X Sun, L Lin, Z Y Ma, et al. Enhancement of time resolution in ultrasonic time-of-flight diffraction technique with frequency-domain sparsity-decomposability inversion (FDSDI) method. IEEE Transactions on Ultrasonics, Ferroelectrics, and Frequency Control, 2021, 68(10): 3204-3215.
S J Jin, B Zhang, X Sun, et al. Reduction of layered dead zone in Time-of-Flight Diffraction (TOFD) for pipeline with spectrum analysis method. Journal of Nondestructive Evaluation, 2021, 40(2): 48.
C G Fan, L Yang, Y Zhao. Ultrasonic multi-frequency time-reversal-based imaging of extended targets. NDT & E International, 2020, 113: 102276.
J Rao, J Yang, M Ratassepp, et al. Multiparameter reconstruction of velocity and density using ultrasonic tomography based on full waveform inversion. Ultrasonics, 2020, 101: 106004.
X Sun, L Lin, S J Jin. Resolution enhancement in ultrasonic TOFD imaging by combining sparse deconvolution and synthetic aperture focusing technique (Sparse-SAFT). Chinese Journal of Mechanical Engineering, 2022, 35: 94.
H Lee, B Koo, A Chattopadhyay, et al. Damage detection technique using ultrasonic guided waves and outlier detection: Application to interface delamination diagnosis of integrated circuit package. Mechanical Systems and Signal Processing, 2021, 160: 107884.
K X Zhang, G L Lv, S F Guo, et al. Evaluation of subsurface defects in metallic structures using laser ultrasonic technique and genetic algorithm-back propagation neural network. NDT & E International, 2020, 116: 102339.
J Tong, M Lin, X Wang, et al. Deep learning inversion with supervision: A rapid and cascaded imaging technique. Ultrasonics, 2022, 122: 106686.
S J Farley, J F Durodola, N A Fellows, et al. High resolution nondestructive evaluation of defects using artificial neural networks and wavelets. NDT & E International, 2012, 52: 69-75.
F Nafiah, A Sophian, M R Khan, et al. Quantitative evaluation of crack depths and angles for pulsed eddy current nondestructive testing. NDT & E International, 2019, 102: 180-188.
W Xu, X Li, J Zhang, et al. Ultrasonic signal enhancement for coarse grain materials by machine learning analysis. Ultrasonics, 2021, 117: 106550.
M Yuan, J Li, Y Liu, et al. Automatic recognition and positioning of wheel defects in ultrasonic B-scan image using artificial neural network and image processing. Journal of Testing and Evaluation, 2020, 48(1): 308-322.
S Buchaiah, P Shakya. Bearing fault diagnosis and prognosis using data fusion based feature extraction and feature selection. Measurement: Journal of the International Measurement Confederation, 2022, 188: 110506.
C Yang, B Hou, B Ren, et al. CNN-based polarimetric decomposition feature selection for PolSAR image classification. IEEE Transactions on Geoscience and Remote Sensing, 2019, 57(11): 8796-8812.
J Liu, G Xu, L Ren, et al. Defect intelligent identification in resistance spot welding ultrasonic detection based on wavelet packet and neural network. The International Journal of Advanced Manufacturing Technology, 2017, 90: 2581-2588.
Y Wang. Wavelet transform based feature extraction for ultrasonic flaw signal classification. Journal of Computers, 2014, 9(3): 725-732.
M Mousavi, M S Taskhiri, D Holloway, et al. Feature extraction of wood-hole defects using empirical mode decomposition of ultrasonic signals. NDT & E International, 2020, 114: 102282.
L Lin, W Zhang, Z Y Ma, et al. Porosity estimation of abradable seal coating with an optimized support vector regression model based on multiscale ultrasonic attenuation coefficient. NDT & E International, 2020, 113: 102272.
D W Huang, S H Tang, D J Zhou, et al. NOx emission estimation in gas turbines via interpretable neural network observer with adjustable intermediate layer considering ambient and boundary conditions. Measurement, 2022, 189: 110429.
L C Silva, E F Simas Filho, M C S Albuquerque, et al. Segmented analysis of time-of-flight diffraction ultrasound for flaw detection in welded steel plates using extreme learning machines. Ultrasonics, 2020, 102: 106057.
Z Y Ma, W Zhang, Z B Luo, et al. Ultrasonic characterization of thermal barrier coatings porosity through BP neural network optimizing Gaussian process regression algorithm. Ultrasonics, 2020, 100: 105981.
L Bai, M Liu, N Liu, et al. Dimensionality reduction of ultrasonic array data for characterization of inclined defects based on supervised locality preserving projection. Ultrasonics, 2022, 119: 106625.
K Zhang, Y Li, P Scarf, et al. Feature selection for high-dimensional machinery fault diagnosis data using multiple models and Radial Basis Function networks. Neurocomputing, 2011, 74(17): 2941-2952.
C Lin, H Chen, Y Wu. Study of image retrieval and classification based on adaptive features using genetic algorithm feature selection. Expert Systems with Applications, 2014, 41(15): 6611-6621.
I A Gheyas, L S Smith. Feature subset selection in large dimensionality domains. Pattern Recognition, 2010, 43(1): 5-13.
Q Nazir, C Shao. Online tool condition monitoring for ultrasonic metal welding via sensor fusion and machine learning. Journal of Manufacturing Processes, 2021, 62: 806-816.
H Abdi, L J Williams. Principal component analysis. Wiley Interdisciplinary Reviews: Computational Statistics, 2010, 2(4): 433-459.
D Salas-Gonzalez, J M Gorriz, J Ramirez, et al. Feature selection using factor analysis for Alzheimer's diagnosis using 18F-FDG PET images. Medical Physics, 2010, 37(11): 6084-6095.
G L Lv, S F Guo, D Chen, et al. Laser ultrasonics and machine learning for automatic defect detection in metallic components. NDT & E International, 2023, 133: 102752.
R Rodríguez-Pérez, J Bajorath. Interpretation of machine learning models using Shapley values: application to compound potency and multitarget activity predictions. Journal of Computer-Aided Molecular Design, 2020, 34(10): 1013-1026.
W N Xu, S D Fan, C P Wang, et al. Leakage identification in water pipes using explainable ensemble tree model of vibration signals. Measurement, 2022, 194: 110996.
T Ye, M Dong, Y Liang, et al. Modeling and optimization of the NO_{X} generation characteristics of the coal-fired boiler based on interpretable machine learning algorithm. International Journal of Green Energy, 2021: 1-15.
L Bai, F Le Bourdais, R Miorelli, et al. Ultrasonic defect characterization using the scattering matrix: A performance comparison study of Bayesian inversion and machine learning schemas. IEEE Transactions on Ultrasonics, Ferroelectrics, and Frequency Control, 2021, 68(10): 3143-3155.
Z Ge, Z Song, S X Ding, et al. Data mining and analytics in the process industry: The role of machine learning. IEEE Access, 2017, 5: 20590-20616.
E M Mahjoub, Y Slah, R José, et al. Feature selection techniques for identifying the most relevant damage indices in SHM using Guided Waves. 8th European Workshop on Structural Health Monitoring (EWSHM 2016), 2016.
Y Peng, H Liu, X Li, et al. Machine learning method for energy consumption prediction of ships in port considering green ports. Journal of Cleaner Production, 2020, 264: 121564.
M G Li, F Wang, X J Jia, et al. Multisource data fusion for economic data analysis. Neural Computing and Applications, 2021, 33: 4729-4739.
D Rumelhart, G E Hinton, R J Williams. Learning representations by back-propagating errors. Nature, 1986, 323(6088): 533-536.
S Bedi, A Samal, C Ray, et al. Comparative evaluation of machine learning models for groundwater quality assessment. Environmental Monitoring and Assessment, 2020, 192(12): 1-23.
L S Shapley. A value for n-person games. Contributions to the Theory of Games, Princeton Univ Press, Princeton, NJ, USA, 1953: 307-317.
W C Zhao, C Zheng, B Xiao, et al. Composition refinement of 6061 Aluminum alloy using active machine learning model based on Bayesian optimization sampling. Acta Metallurgica Sinica, 2021, 57(6): 797-810. (in Chinese)
A W Whitney. A direct method of nonparametric measurement selection. IEEE Transactions on Computers, 1971, 20(9): 1100-1103.
S F Cotter, K Kreutz-Delgado, B D Rao. Backward sequential elimination for sparse vector subset selection. Signal Processing, 2001, 81: 1849-1864.
H Zhou, Z Han, D Du. An improved ultrasonic imaging method for Austenitic welds based on grain orientation distribution inversion algorithm. Journal of Nondestructive Evaluation, 2020, 39(3): 54.
S K Shastri, S Rudresh, R Anand, et al. Axial super-resolution in ultrasound imaging with application to nondestructive evaluation. Ultrasonics, 2020, 108: 106183.
S Q Shi, L Lin, Z B Luo, et al. Resolution enhancement of ultrasonic imaging at oblique incidence by using WTFM based on FMCAR. Measurement, 2021, 183: 109798.
X Y Zhao, Z M Ma, J Y Zhang. Simplified matrix focusing imaging algorithm for ultrasonic nondestructive testing. Chinese Journal of Mechanical Engineering, 2022, 35: 19.
N Laroche, S Bourguignon, E Carcreff, et al. An inverse approach for ultrasonic imaging from full matrix capture data. Application to resolution enhancement in NDT. IEEE Transactions on Ultrasonics, Ferroelectrics, and Frequency Control, 2020, 67(9): 1877-1887.
H Yun, R Rayhana, S Pant, et al. Nonlinear ultrasonic testing and data analytics for damage characterization: A review. Measurement, 2021, 186: 110155.
M Meng, Y J Chua, E Wouterson, et al. Ultrasonic signal classification and imaging system for composite materials via deep convolutional neural networks. Neurocomputing, 2017, 257: 128-135.
Z Gao, Y Liu, Q Wang, et al. Ensemble empirical mode decomposition energy moment entropy and enhanced long short-term memory for early fault prediction of bearing. Measurement: Journal of the International Measurement Confederation, 2022, 188: 110417.
F Honarvar, H Sheikhzadeh, M Moles, et al. Improving the time-resolution and signal-to-noise ratio of ultrasonic NDE signals. Ultrasonics, 2004, 41(9): 755-763.
P Li, Z Q Lang, L Zhao, et al. System identification-based frequency domain feature extraction for defect detection and characterization. NDT & E International, 2018, 98: 70-79.
G R B Ferreira, M G de Castro Ribeiro, A C Kubrusly, et al. Improved feature extraction of guided wave signals for defect detection in welded thermoplastic composite joints. Measurement, 2022, 198: 111372.
M Mousavi, A H Gandomi. Wood hole-damage detection and classification via contact ultrasonic testing. Construction and Building Materials, 2021, 307: 124999.
S T Yang, L J Gu, X F Li, et al. Crop classification method based on optimal feature selection and hybrid CNN-RF networks for multitemporal remote sensing imagery. Remote Sensing, 2020, 12(19): 3119.
C G Fan, M H Caleap, M C Pan, et al. A comparison between ultrasonic array beamforming and super resolution imaging algorithms for nondestructive evaluation. Ultrasonics, 2014, 54(7): 1842-1850.
J Shen, F Xu. Method of fault feature selection and fusion based on poll mode and optimized weighted KPCA for bearings. Measurement, 2022, 194: 110950.
J Lee, C Yoo, I Lee. Statistical process monitoring with independent component analysis. Journal of Process Control, 2004, 14: 467-485.
Acknowledgements
Not applicable.
Funding
Supported by National Natural Science Foundation of China (Grant Nos. U22B2068, 52275520, 52075078) and National Key Research and Development Program of China (Grant No. 2019YFA0709003).
Author information
Contributions
SS wrote the draft manuscript and conducted the experiments; LL and SJ were in charge of the whole trial and checked and improved the writing of the manuscript; DZ, JL and DF gave advice on the manuscript. All authors read and approved the final manuscript.
Authors' Information
Siqi Shi, born in 1995, is currently a PhD candidate at NDT & E Laboratory, Dalian University of Technology, China. Her research interests include ultrasonic signal processing and machine learning.
Shijie Jin, born in 1984, is currently an associate professor at NDT & E Laboratory, Dalian University of Technology, China. His research interest is nondestructive testing and evaluation for materials.
Donghui Zhang, born in 1974, is currently a center manager at China Nuclear Industry 23 Construction Co., Ltd., China. His research interests include construction of nuclear engineering and nondestructive testing.
Jingyu Liao, born in 1985, is currently a director of research at China Nuclear Industry 23 Construction Co., Ltd., China. Her research interests include construction of nuclear engineering and nondestructive testing.
Dongxin Fu, born in 1996, is currently a verification service engineer at China Nuclear Industry 23 Construction Co., Ltd., China. Her main research interest is nondestructive testing and evaluation for materials.
Li Lin, born in 1970, is currently a professor at NDT & E Laboratory, Dalian University of Technology, China. Her main research interest is nondestructive testing and evaluation for materials.
Ethics declarations
Competing Interests
The authors declare no competing financial interests.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Shi, S., Jin, S., Zhang, D. et al. Improving Ultrasonic Testing by Using Machine Learning Framework Based on Model Interpretation Strategy. Chin. J. Mech. Eng. 36, 127 (2023). https://doi.org/10.1186/s10033-023-00960-z
DOI: https://doi.org/10.1186/s10033-023-00960-z