 Original Article
 Open access
A Dual-Task Learning Approach for Bearing Anomaly Detection and State Evaluation of Safe Region
Chinese Journal of Mechanical Engineering, volume 37, Article number: 4 (2024)
Abstract
Predictive maintenance has emerged as an effective tool for curbing maintenance costs, yet prevailing research predominantly concentrates on the abnormal phases. Within the ostensibly stable healthy phase, the reliance on anomaly detection to preempt equipment malfunctions faces the challenge of sudden anomaly discernment. To address this challenge, this paper proposes a dual-task learning approach for bearing anomaly detection and state evaluation of safe regions. The proposed method transforms the execution of the two tasks into an optimization problem of the hypersphere center. By leveraging the monotonicity and distinguishability pertinent to the tasks as the foundation for optimization, it reconstructs the support vector data description (SVDD) model to ensure equilibrium in the model's performance across the two tasks. Subsequent experiments verify the proposed method's effectiveness, which is interpreted from the perspectives of parameter adjustment and enveloping trade-offs. In the meantime, the experimental results also reveal two deficiencies in anomaly detection accuracy and state evaluation metrics. Their theoretical analysis inspires us to focus on feature extraction and data collection to achieve improvements. The proposed method lays the foundation for realizing predictive maintenance in the healthy stage by improving condition awareness in safe regions.
1 Introduction
The development of equipment maintenance techniques has experienced three stages so far: reactive maintenance, preventive maintenance, and predictive maintenance [1]. Predictive maintenance effectively reduces maintenance costs [2], and its attainment relies on the awareness of state changes. Since the state change in the abnormal stage is more significant than in the healthy stage, existing research on predictive maintenance mainly focuses on the abnormal stage. Before anomalies occur, anomaly detection provides a qualitative means of condition monitoring, which can reduce accidents before equipment damage [3]. Given its pivotal role in safeguarding the operation of industrial equipment and systems [4], anomaly detection has garnered extensive research attention. This paper organizes anomaly detection techniques into three groups of methods: reconstruction-based methods, classification-based methods, and distance-based methods.
Reconstruction-based methods define anomalies by analyzing deviations in the domain mapping and data reconstruction processes. Huang et al. [5] advanced a memory residual regression autoencoder to improve detection accuracy; it used the reconstruction errors and surprisal values to indicate the abnormal condition of the bearing. Jiang et al. [6] proposed a generative adversarial network to realize sample reconstruction and common feature learning, which overcomes the problem of data imbalance. Mao et al. [7] presented a new self-adaptive mapping strategy for incipient fault online detection. An autoencoder was used to extract features in data reconstruction, and a classifier was introduced to distinguish these features. An adaptive threshold method was proposed in Ref. [8] based on extreme value theory, and the network reconstruction residuals were used for anomaly detection and location. This type of method's reconstruction error is generally insensitive to incipient anomalies [5].
Classification-based methods treat anomaly detection as a classification task, including one-class classification and two-class classification. The one-class classification method arises from the reality that collected operation data often only include healthy data. Vos et al. [9] combined a long short-term memory network and a one-class support vector machine to solve this problem for vibration data. Combining the two methods can also handle data sequences of variable length [10]. Zhang et al. [11] advanced an end-to-end algorithm, which designed a novel loss function to jointly learn shapelets and the support vector data description (SVDD) decision boundary. Zhao et al. [12] proposed a dynamic-radius SVDD to detect the anomalies of aircraft engines; an angle calculation was introduced in the feature space to address the neglected irregularities of the hypersphere. When both healthy and abnormal data are accessible, two-class classification methods can be applied for anomaly detection. Song et al. [13] used a meta-learning-based method to achieve few-shot anomaly detection. In Ref. [14], the oil and bearing temperatures were treated as two univariate variables to construct a dual support vector machine model and analyze the adaptive threshold of the binary classification model. In addition, time-frequency analysis methods can detect anomalies by diagnosing specific faults [15]; the equipment is considered healthy when no faults are identified.
Distance-based methods detect anomalies by measuring distances in a designated metric space. Montechiesi et al. [16] advanced a modified artificial immune system to achieve anomaly detection through similarity calculation between antigens based on Euclidean distance minimization. Liu et al. [17] decreased the false alarm rate of anomalies based on the information fusion of two distance dimensions: the spatial dimension considered the differences between different locations at the same time, whereas the time dimension considered the differences between actual and predicted values at different times in the same location.
Regardless of which of the three types is used, anomaly detection amounts to constructing a safe region; the methods differ in how the region is constructed and in which metric space. As illustrated in Figure 1, since the monitoring data in the healthy stage is generally stable, the safe region is treated as a black box whose state changes before the anomalies occur are ignored. It is assumed that the health state is immutable within a safe region. However, this qualitative treatment makes the identification of anomalies necessarily a sudden event. Correspondingly, preventive measures are taken to deal with this uncertainty, especially in scenarios where anomalies are not allowed or the abnormal stage is very short. These bring many disadvantages, including increased downtime, maintenance costs, and safety risks.
To deal with these problems, quantifying the time-varying health state is considered in implementing anomaly detection. That is, health state evaluation in the safe region (HSESR) is introduced into the construction of the safe region. The state information helps us understand the operating status of equipment better, so that reasonable maintenance can be performed to extend equipment life, reduce safety risks, and improve production efficiency. For HSESR, the main difficulty is that health indicators are generally stable and cannot reflect the changing trend of the health state. Based on the irreversibility of mechanical degradation [18], monotonic feature extraction is the key problem of HSESR.
Boundaries tend to tightly wrap the healthy data when constructing the safe region to prevent misjudgment. However, the anomalies of bearings always lead to significant changes in the distribution of the monitoring data, which makes abnormal data distinguishable from healthy data. In these cases, a tight envelope is no longer necessary, and it is very cost-effective to relax the envelope for additional gain without affecting anomaly detection. Following this idea, the improvement of feature monotonicity is set as the 'additional gain' for HSESR.
Taking advantage of the good feature extraction performance in kernel space, we propose a new anomaly detection scheme based on a variant of the classic SVDD modeling method. By introducing the time-varying health state into the construction of the safe region, the proposed scheme prevents the suddenness of anomalies without compromising anomaly detection. In this way, anomaly detection and HSESR are unified under the same framework. The contributions of this paper are summarized below.

1)
This paper opens a new way of thinking to deal with the suddenness of anomalies, that is, to explore trend indicators to characterize the state information in a safe region.

2)
This paper deduces the solution principle of the univariate SVDD model under the condition of a fixed hypersphere center.

3)
A new framework is proposed that unifies anomaly detection and HSESR.
2 Methodology
The proposed method is inspired by SVDD, which has been proven effective in detecting anomalies. It constructs a hypersphere in a high-dimensional kernel space as the boundary of the safe region, and anomaly detection is achieved based on distances defined in the kernel space. To achieve anomaly detection and HSESR simultaneously, we reorganize the hypersphere-solving process. In particular, the collaborative optimization of the center and boundary is changed to two independent optimizations. The center optimization obtains the health indicator (HI) sequence to evaluate the health state, and the boundary optimization combines the optimized center for anomaly detection. The framework of the proposed method is shown in Figure 2. It consists of four portions: data preparation, hypersphere center optimization, boundary solving, and the implementation of anomaly detection and HSESR. The following assumptions must be satisfied for application: (1) the chronological order of the data must be clear; (2) healthy and abnormal data can be distinguished well; (3) degradation features can be captured in the safe region.
2.1 Data Preparation
Data preparation is to obtain the data required by the model. It consists of three steps: dataset division, feature extraction, and feature selection.
Dataset division splits the data into a training dataset and a test dataset. Since healthy data accounts for the majority of real industrial data whereas abnormal data is rare, half of the abnormal data are randomly assigned to the test set to prevent either dataset from becoming too small. At the same time, the same amount of healthy data is randomly sampled into the test set, and the remaining data is regarded as training data. Feature extraction calculates the features that reflect the health condition of the bearing, and feature selection screens the features that meet specific requirements. To present state information alongside anomaly detection, two feature indicators need to be determined. The first emphasizes the realization of anomaly detection, so the ability to distinguish between the healthy and abnormal phases is key. The second focuses on reflecting changes in the health state, so monotonicity is prioritized: the degradation process of mechanical components is theoretically irreversible [18], and the true inherent health condition is commonly assumed to deteriorate over time. The monotonicity is generally defined as follows:
$$Mon_{1} = \frac{1}{K - 1}\left| {\sum\limits_{t = 1}^{K - 1} {\delta \left( {H(t + 1) - H(t)} \right)} - \sum\limits_{t = 1}^{K - 1} {\delta \left( {H(t) - H(t + 1)} \right)} } \right|,$$(1)where K is the total number of samples; δ(·) is the simple unit step function; and H(t) refers to the value of the HI at time t.
However, Eq. (1) merely focuses on local monotonicity and neglects the influence of each point on the global monotonicity. Figure 3 provides two examples, and their corresponding metric results are shown in Table 1. Line 2 of Figure 3(a) is regarded as having better monotonicity than the other according to Eq. (1), because only point C weakens its monotonicity while points b and d weaken Line 1's. This contradicts the observable fact. The mistake stems from neglecting the influence of points D, E, and F on the monotonicity from a global view.
Additionally, the monotonicity evaluation of each point in Eq. (1) is qualitative, so it cannot precisely reflect differences in monotonicity. As another instance, the monotonicity of the two lines in Figure 3(b) is the same according to Eq. (1). They are not actually the same, because the influences of the two local minimum points R and r on the monotonicity differ.
To describe the monotonicity accurately, this paper introduces the inverse number as a supplement to Eq. (1). In mathematics, for a real array A containing N numbers, if i < j and A[i] > A[j], then (A[i], A[j]) is called an inverse pair. The total number of inverse pairs in an array is called its inverse number. The inverse number considers monotonicity both from the perspective of adjacent points and from the overall relationship among all points. On this basis, a new evaluation index for monotonicity is devised, transforming the inverse number into a metric that falls within the [0, 1] range:
$$Mon_{2} = 1 - \frac{2I(A)}{{N(N - 1)}},$$(2)where I(A) returns the inverse number of array A, and N(N − 1)/2 is the maximum possible number of inverse pairs.
The metric values for the four lines depicted in Figure 3 are presented in Table 1, corroborating the preceding analysis. In Figure 3(a), points D, E, and F all produce inverse pairs in Line 2, while only points c and e do in Line 1. Thus, Mon2 judges correctly because it considers the global monotonicity. In addition, Mon2 achieves a quantitative assessment of monotonicity. In Figure 3(b), point R produces one more inverse pair than point r, which allows us to distinguish the subtle difference in the monotonicity of the two lines. Accordingly, the inverse number is used to select the feature indicator that best reflects the health-state change in the safe region.
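To make the two metrics concrete, the sketch below computes the step-function monotonicity of Eq. (1) and the inverse-number variant in Python. The brute-force pair count and the exact normalization of Mon2 (assumed here to be 1 − I/C(N, 2)) are illustrative assumptions, as are the function names.

```python
import numpy as np

def mon1(h):
    # Traditional metric: net fraction of rising vs. falling steps
    # between adjacent points (a purely local view).
    h = np.asarray(h, dtype=float)
    d = np.diff(h)
    return abs(np.sum(d > 0) - np.sum(d < 0)) / (len(h) - 1)

def inverse_number(a):
    # Count inverse pairs (i < j with a[i] > a[j]) by brute force.
    a = np.asarray(a, dtype=float)
    n = len(a)
    return sum(1 for i in range(n) for j in range(i + 1, n) if a[i] > a[j])

def mon2(h):
    # Inverse-number metric mapped to [0, 1]: 1 means no inversions,
    # i.e., a perfectly non-decreasing HI sequence.
    n = len(h)
    return 1.0 - inverse_number(h) / (n * (n - 1) / 2)

# A monotone HI sequence vs. one with a single dip: the dip produces
# inverse pairs with every earlier larger value, so it is penalized
# globally, not just locally.
print(mon1([1, 2, 0, 4, 5, 6]), mon2([1, 2, 0, 4, 5, 6]))
```

Note how a single dip lowers Mon2 in proportion to how many points it inverts against, which is exactly the global sensitivity the inverse number adds.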
After determining the feature indicators, the proposed method fuses them into the HI. We draw on the idea of SVDD, mapping the input features to the kernel space and using distances in that space to construct HIs. The traditional SVDD model is concerned only with the envelope, so it is obtained by jointly balancing two core elements: the center a and the radius R. To consider both anomaly detection and HSESR, the model construction of the proposed method is split into independent computations; accordingly, it is attained by hypersphere center optimization and boundary construction.
2.2 Hypersphere Center Optimization
Anomaly identification and state assessment depend on the HI array, while the HI array has a one-to-one correspondence with the hypersphere center. Therefore, the center optimization is a problem of HI array optimization.
2.2.1 Optimization Model
Let the sample feature set be denoted as {s_{i}}. Referring to the traditional SVDD model, the hypersphere center a is defined as [19]:
$$\varvec{a} = \sum\limits_{i} {\gamma_{i} \phi (s_{i} )} ,\quad {\text{s}}.{\text{t}}.,\,\sum\limits_{i} {\gamma_{i} = 1} ,$$(3)where γ_{i} is a weight factor reflecting the contribution of the feature s_{i} to the center, and ϕ is a mapping function that implicitly maps the data to the feature space.
Then, the HI array is expressed by the distances between the samples and the hypersphere center:
$$d_{i}^{2} = \left\| {\phi (s_{i} ) - \varvec{a}} \right\|^{2} .$$(4)Bringing Eq. (3) into Eq. (4), we have
$$d_{i}^{2} = k\left( {s_{i} ,s_{i} } \right) - 2\sum\limits_{j} {\gamma_{j} k\left( {s_{i} ,s_{j} } \right)} + \sum\limits_{ij} {\gamma_{i} \gamma_{j} k\left( {s_{i} ,s_{j} } \right)} ,$$(5)
where k(s_{i}, s_{j}) is the kernel function, and the Gaussian RBF kernel is adopted in this article:
$$k\left( {s_{i} ,s_{j} } \right) = \exp \left( { - \frac{{\left\| {s_{i} - s_{j} } \right\|^{2} }}{{\sigma^{2} }}} \right),$$(6)where σ is the bandwidth, controlling the radial range of action.
If we denote the HI array as H, then H = {d_{1}^{2}, d_{2}^{2}, …, d_{N}^{2}}, where N is the number of samples. Except for the hyperparameter σ, H is determined by the group of weight factors {γ_{1}, γ_{2}, …, γ_{N}}. Therefore, the problem of HI array optimization is further expressed by the implicit function model G(γ_{1}, γ_{2}, …, γ_{N}).
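The kernel-only distance computation described above can be sketched as follows; the Gaussian bandwidth convention and the function names are assumptions of this illustration.

```python
import numpy as np

def rbf(x, y, sigma=1.0):
    # Gaussian RBF kernel; the bandwidth convention is an assumption.
    return np.exp(-np.sum((x - y) ** 2) / sigma ** 2)

def hi_array(S, gamma, sigma=1.0):
    # Squared distance from each sample to the weighted center
    # a = sum_i gamma_i * phi(s_i), expanded purely in kernel terms:
    # d_i^2 = k(s_i, s_i) - 2 sum_j gamma_j k(s_i, s_j) + gamma^T K gamma.
    n = len(S)
    K = np.array([[rbf(S[i], S[j], sigma) for j in range(n)]
                  for i in range(n)])
    const = gamma @ K @ gamma   # center self-similarity, fixed per gamma
    return np.array([K[i, i] - 2.0 * K[i] @ gamma + const for i in range(n)])

S = np.array([[0.0], [1.0], [2.0]])   # toy 1-D features
gamma = np.ones(3) / 3                # uniform weights summing to 1
print(hi_array(S, gamma))
```

With uniform weights the center is the kernel-space mean, so the middle sample is closest; the genetic algorithm of Section 2.2.2 searches over the weights instead of fixing them.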
The optimization of the model includes monotonicity optimization with only healthy data and distinguishability optimization with healthy and abnormal data.

(1)
Optimization Model Considering Monotonicity
To reflect the condition degradation inside the safe region, monotonicity is utilized for the model construction. Denote the feature set of the healthy training data as {s_{i}^{h}, i = 1, 2, …, NH}; the subscript h refers to the corresponding parameters of the healthy data. Bringing them into Eqs. (3) and (5), we have
$$\varvec{a}_{1} = \sum\limits_{i} {\gamma_{i}^{h} \phi (s_{i}^{h} )} ,\quad {\text{s}}.{\text{t}}.,\,\sum\limits_{i} {\gamma_{i}^{h} = 1} ,$$(7)$$\left( {d_{i}^{h} } \right)^{2} = k\left( {s_{i}^{h} ,s_{i}^{h} } \right) - 2\sum\limits_{j} {\gamma_{j}^{h} k\left( {s_{i}^{h} ,s_{j}^{h} } \right)} + \sum\limits_{ij} {\gamma_{i}^{h} \gamma_{j}^{h} k\left( {s_{i}^{h} ,s_{j}^{h} } \right)} .$$(8)Then, the HI array is obtained as H^{h} = {(d_{1}^{h})^{2}, (d_{2}^{h})^{2}, …, (d_{NH}^{h})^{2}}. The inverse number is applied as the monotonicity metric, and its calculation function is marked as I(·); the optimization model is expressed as
$$G_{1} \left( {\gamma_{1}^{h} ,\gamma_{2}^{h} ,...,\gamma_{NH}^{h} } \right) = \min I(H^{h} ),\quad {\text{s}}.{\text{t}}.,\,\sum\limits_{i} {\gamma_{i}^{h} } = 1.$$(9)
(2)
Optimization Model Considering Distinguishability
Distinguishability is also introduced into the model construction to describe the difference between healthy and abnormal data. In addition to the healthy data, the rare but valuable abnormal data is exploited to indirectly improve the boundary's distinguishability through center optimization. Specifically, the center described by the healthy data is kept as far away from the abnormal data as possible. Denote the feature set of the abnormal training data as {s_{p}^{a}, p = 1, 2, …, NA}; the subscript a refers to the corresponding parameters of the abnormal data. The distances from each abnormal point to the center can be expressed as
$$\left( {d_{p}^{a} } \right)^{2} = \left\| {\phi (s_{p}^{a} ) - \varvec{a}_{2} } \right\|^{2} ,$$(10)$$\varvec{a}_{2} = \sum\limits_{i} {\gamma_{i}^{a} \phi \left( {s_{i}^{h} } \right)} ,\quad {\text{s}}.{\text{t}}.,\,\sum\limits_{i} {\gamma_{i}^{a} } = 1.$$(11)Bringing Eq. (11) into Eq. (10), we have
$$\left( {d_{p}^{a} } \right)^{2} = k\left( {s_{p}^{a} ,s_{p}^{a} } \right) - 2\sum\limits_{i} {\gamma_{i}^{a} k\left( {s_{i}^{h} ,s_{p}^{a} } \right)} + \sum\limits_{ij} {\gamma_{i}^{a} \gamma_{j}^{a} k\left( {s_{i}^{h} ,s_{j}^{h} } \right)} .$$(12)Their HI array is obtained as H^{a} = {(d_{1}^{a})^{2}, (d_{2}^{a})^{2}, …, (d_{NA}^{a})^{2}}. The abnormal points are expected to be as far away from the safe region as possible to achieve distinguishability. Thus, the optimization model is expressed as
$$G_{2} \left( {\gamma_{1}^{a} ,\gamma_{2}^{a} ,...,\gamma_{NA}^{a} } \right) = \max \left( {\sum {H^{a} } } \right),\quad {\text{s}}.{\text{t}}.,\,\sum\limits_{i} {\gamma_{i}^{a} } = 1.$$(13)
2.2.2 Model Solution and Center Expression

(1)
Model Solving Based on Genetic Algorithm
The models of Eqs. (9) and (13) are problems of multivariate implicit function optimization. We introduce the genetic algorithm to automatically seek the optimal solution based on natural selection and genetic mechanisms. The optimization proceeds as follows.
Step 1. Parameter initialization. Suppose the population size is M; the individual P_{j} is expressed as
$$\begin{gathered} P_{j} = [\gamma_{1} ,\gamma_{2} , \cdots ,\gamma_{N} ]_{j} = [\gamma_{1j} ,\gamma_{2j} , \cdots ,\gamma_{Nj} ],\quad \forall j = 1,2, \cdots ,M, \hfill \\ {\text{s.t.}},\,\sum\limits_{i} {\gamma_{ij} = 1} ,\quad \forall j = 1,2, \cdots ,M. \hfill \\ \end{gathered}$$(14)Step 2. HI calculation. Based on the M groups of P_{j}, we can get the distance set {H_{1}, H_{2}, …, H_{M}} by Eq. (5).
Step 3. Fitness calculation. The fitness function is evaluated for each individual, using the objective of Eq. (9) or Eq. (13) as the index of its quality.
Step 4. Convergence strategy. Because the fitness fluctuates greatly with σ, a maximum allowed number of iterations is selected as the stop criterion instead of a specific threshold.
Step 5. Survival of the fittest through natural selection. After the selection, crossover, and mutation operations, a more adaptable population is obtained for further evolution until the stop criterion is met.
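The five steps can be sketched with a minimal genetic algorithm. The operators below (rank selection with elitism, arithmetic crossover, Gaussian mutation) are textbook placeholders, since the paper does not fix its exact configuration, and the toy fitness merely stands in for the objectives of Eqs. (9) and (13).

```python
import numpy as np

def genetic_optimize(fitness, n_weights, pop_size=30, n_iter=50, seed=0):
    # Minimize fitness(gamma) over weight vectors with sum(gamma) = 1.
    rng = np.random.default_rng(seed)

    def normalize(p):
        # Enforce the sum-to-one constraint (weights kept positive).
        p = np.clip(p, 1e-9, None)
        return p / p.sum()

    # Step 1: initialize a population of M feasible individuals.
    pop = np.array([normalize(rng.random(n_weights)) for _ in range(pop_size)])
    for _ in range(n_iter):                           # Step 4: fixed budget
        scores = np.array([fitness(p) for p in pop])  # Steps 2-3: evaluate
        pop = pop[np.argsort(scores)]                 # rank: lower is better
        elite = pop[: pop_size // 2]                  # Step 5: keep the fittest
        children = []
        while len(children) < pop_size - len(elite):
            pa, pb = elite[rng.integers(len(elite), size=2)]
            child = 0.5 * (pa + pb)                   # arithmetic crossover
            child += rng.normal(0.0, 0.05, n_weights) # Gaussian mutation
            children.append(normalize(child))
        pop = np.vstack([elite, children])
    scores = np.array([fitness(p) for p in pop])
    return pop[np.argmin(scores)]

# Toy objective: concentrate weight on the first sample.
best = genetic_optimize(lambda g: -g[0], n_weights=4)
print(best)
```

Every individual stays on the simplex throughout, so any returned weight vector is a feasible center parameterization.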

(2)
Fusion Representation of the Hypersphere Center
After applying the genetic algorithm to solve the optimizations for monotonicity and distinguishability, we obtain two sets of optimized parameters, which yield two optimized centers via Eqs. (7) and (11). Subsequently, the weight coefficient coef is introduced to balance them, and we obtain the final hypersphere center:
$$\varvec{a} = f(\varvec{a}_{1} ,\varvec{a}_{2} ,coef) = \left[ {coef \cdot P^{h} + (1 - coef) \cdot P^{a} } \right] \cdot \varvec{\phi}^{h} ,$$(15)where ϕ^{h} = [ϕ(s_{1}^{h}), ϕ(s_{2}^{h}), …, ϕ(s_{NH}^{h})]^{T} and the superscript T signifies the transpose.
$$P^{h} = \mathop {\arg \min }\limits_{{\gamma_{1}^{h} ,\gamma_{2}^{h} ,...,\gamma_{NH}^{h} }} I\left( {H^{h} } \right),$$(16)$$P^{a} = \mathop {\arg \max }\limits_{{\gamma_{1}^{a} ,\gamma_{2}^{a} ,...,\gamma_{NA}^{a} }} \left( {\sum {H^{a} } } \right).$$(17)The coef needs to be adjusted manually in the range [0,1] according to the characteristics of the data. The adjustment balances the monotonicity and distinguishability of the HI. When the distinguishability of feature 1 is good, a large coef is feasible to improve the monotonicity of the HI; otherwise, a smaller coef is necessary to avoid degrading anomaly detection.
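A minimal sketch of the weight fusion behind Eq. (15) (the function name is ours): since both optimized weight vectors are defined over the healthy samples and each sums to one, their convex combination automatically satisfies the sum-to-one constraint of the center.

```python
import numpy as np

def fuse_weights(p_h, p_a, coef):
    # Eq. (15): blend the monotonicity-optimized weights P^h and the
    # distinguishability-optimized weights P^a into one weight vector.
    if not 0.0 <= coef <= 1.0:
        raise ValueError("coef must lie in [0, 1]")
    return coef * np.asarray(p_h) + (1.0 - coef) * np.asarray(p_a)

p_h = np.array([0.5, 0.3, 0.2])  # hypothetical GA result for monotonicity
p_a = np.array([0.1, 0.1, 0.8])  # hypothetical GA result for distinguishability
gamma_new = fuse_weights(p_h, p_a, coef=0.6)
print(gamma_new)                  # a convex combination, still summing to 1
```

Sweeping coef from 0 to 1 moves the center continuously from the distinguishability-optimal position to the monotonicity-optimal one, which is the tuning knob described above.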
2.3 Boundary Construction
With the optimized hypersphere center, the boundary is solved based on a variant of SVDD. As a is fixed, the variant is a univariate model converted from the original bivariate model:
$$\min F(R,\xi_{i} ) = R^{2} + C\sum\limits_{i} {\xi_{i} } ,\quad {\text{s}}.{\text{t}}.,\;\left\| {\phi (s_{i}^{h} ) - \varvec{a}} \right\|^{2} \le R^{2} + \xi_{i} ,\;\xi_{i} \ge 0,$$(18)where C is a trade-off parameter that balances the volume and the errors, and ξ_{i} is the slack variable that allows more points to be contained in the hypersphere.
The Lagrange multipliers method is applied to incorporate the constraints into the model. The minimization problem is transformed into a maximum one, as shown below:
$$\max L = R^{2} + C\sum\limits_{i} {\xi_{i} } - \sum\limits_{i} {\alpha_{i} \left( {R^{2} + \xi_{i} - \left\| {\phi (s_{i}^{h} ) - \varvec{a}} \right\|^{2} } \right)} - \sum\limits_{i} {\beta_{i} \xi_{i} } ,$$(19)where α_{i} and β_{i} are Lagrange multipliers with the constraints α_{i} ≥ 0 and β_{i} ≥ 0, respectively.
Setting the partial derivatives of L with respect to R and ξ_{i} to 0, we have
$$\sum\limits_{i} {\alpha_{i} } = 1,\quad \alpha_{i} = C - \beta_{i} .$$(20)
By introducing Eqs. (15) and (20), Eq. (19) is converted into Eq. (21):
$$\max L = {\text{sum}}\left( {\varvec{A} \circ {\text{diag}}(\varvec{K})} \right) + \gamma_{new} \varvec{K}\gamma_{new}^{{\text{T}}} - 2\gamma_{new} \varvec{K}\varvec{A},$$(21)
where γ_{new} = coef·P^{h} + (1 − coef)·P^{a}; K = [k(s_{i}^{h}, s_{j}^{h})]_{NH×NH}, with i, j = 1, 2, …, NH; A = [α_{1}, α_{2}, …, α_{NH}]^{T}, with 0 ≤ α_{i} ≤ C and sum(A) = 1; sum(·) is the summation function and ∘ refers to the Hadamard product.
In Eq. (21), the first term equals 1 when the Gaussian RBF kernel is used. The second term is also a constant because γ_{new} has been determined by the genetic algorithm, and K can be calculated from the training data and the kernel function. Thus, the last term is the only one that changes with the elements of A. As only the data s_{i}^{h} with α_{i} > 0 describe the boundary, these data are called the support vectors of the description [20]. Letting R^{2} be the squared radius of the hypersphere, we have
$$R^{2} = \frac{1}{{N_{sv} }}\sum\limits_{sv} {\left\| {\phi (s_{sv} ) - \varvec{a}} \right\|^{2} } ,$$(22)where N_{sv} is the number of support vectors, and s_{sv} denotes the support vectors.
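Putting the pieces together, the radius and the resulting detection rule can be sketched as below. The multipliers α_i would come from maximizing Eq. (21) with a quadratic-programming solver (not shown), so fixed toy values are used here; the kernel bandwidth convention and the function names are assumptions.

```python
import numpy as np

def squared_dist(s, S_h, gamma, sigma=1.0):
    # d^2 from a sample to the fused center a = sum_i gamma_i phi(s_i^h),
    # expanded through the Gaussian RBF kernel (bandwidth convention assumed).
    k = lambda x, y: np.exp(-np.sum((x - y) ** 2) / sigma ** 2)
    K = np.array([[k(si, sj) for sj in S_h] for si in S_h])
    return (k(s, s) - 2.0 * sum(g * k(s, si) for g, si in zip(gamma, S_h))
            + gamma @ K @ gamma)

def radius_squared(S_h, gamma, alphas, sigma=1.0, C=1.0):
    # Support vectors are the training points with 0 < alpha_i < C; R^2 is
    # taken as their mean squared distance to the fixed center.
    sv = [s for s, a in zip(S_h, alphas) if 0.0 < a < C]
    return float(np.mean([squared_dist(s, S_h, gamma, sigma) for s in sv]))

def is_anomalous(s, S_h, gamma, R2, sigma=1.0):
    # Detection rule: a point falling outside the hypersphere is flagged.
    return squared_dist(s, S_h, gamma, sigma) > R2

S_h = np.array([[0.0], [0.5], [1.0]])  # toy healthy features
gamma = np.ones(3) / 3                 # fused weights (illustrative)
R2 = radius_squared(S_h, gamma, alphas=[0.5, 0.0, 0.5])
print(is_anomalous(np.array([10.0]), S_h, gamma, R2))
```

A point far from all healthy samples has near-zero kernel similarity, so its distance exceeds the support-vector radius and it is flagged, while points inside the healthy cloud stay below R².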
2.4 Implementation of Anomaly Detection and HSESR
In the proposed method, HSESR and anomaly detection are the two goals we want to achieve simultaneously.
HSESR can be achieved with the HI array of the healthy data. The HIs of the training and test data are all calculated to illustrate the implementation process and the effect of the proposed method. After obtaining the optimized hypersphere center, the HI array of the healthy data is acquired as the distances from each feature point to the hypersphere center in the kernel space. After sorting the elements of the array by time, the sorted sequence can be considered time-continuous. To make the trend more prominent, moving-average smoothing is adopted to remove the effect of volatility, and the smoothed indicator can display the state change of the healthy data over time.
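The smoothing step might look like the following minimal sketch; the window length is illustrative.

```python
import numpy as np

def smooth_hi(hi, window=5):
    # Moving average over a sliding window; 'valid' mode drops the edge
    # positions where the window does not fully overlap the data.
    hi = np.asarray(hi, dtype=float)
    kernel = np.ones(window) / window
    return np.convolve(hi, kernel, mode="valid")

raw = [0.10, 0.14, 0.09, 0.15, 0.18, 0.16, 0.21]
print(smooth_hi(raw, window=3))
```

The smoothed sequence is shorter than the input by window − 1 points, which should be kept in mind when aligning it with the time axis.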
Anomaly detection is judged with the HI arrays of both healthy and abnormal data. Unlike in HSESR, the array obtained in anomaly detection is temporally discrete. In addition, anomaly detection requires the introduction of the boundary for anomaly judgment. In this way, anomaly detection and HSESR are unified into the same framework and implemented simultaneously.
3 Experimental Validation and Discussion
There are two dataset types for bearing prognostics and health management. One is a fault dataset whose data is collected under different bearing faults, and the other is a degradation dataset that consists of timecontinuous operation data. The latter is selected for validation because it satisfies the data assumption of the proposed method. Accordingly, two bearing degradation benchmark datasets are used for experiments.
3.1 Experimental Dataset
The XJTU-SY bearing dataset contains a large amount of data and abundant failure types [21]. Three operating conditions were tested: a 12 kN load at 2100 r/min, an 11 kN load at 2250 r/min, and a 10 kN load at 2400 r/min. Five bearings were tested under each operating condition, and two accelerometers recorded vibration signals in the horizontal and vertical directions. The sampling frequency is 25.6 kHz, each record collects 1.28 s of data, and the interval between two adjacent collections is 1 min.
PRONOSTIA bearing dataset is the most widely used bearing degradation dataset [22]. Three different operating conditions were tested, including a 4 kN load at 1800 r/min, a 4.2 kN load at 1650 r/min, and a 5 kN load at 1500 r/min. Seven bearings were tested under the first two operating conditions, and three bearings were tested under the last operating condition. Two accelerometers were installed to record vibration signals in the horizontal and vertical directions. The sampling frequency is 25.6 kHz, each time record collects 0.1 s of data, and the interval between two adjacent collections is 10 s.
3.2 Performance Evaluation Metric
This paper evaluates the experimental results with reference to existing related research. The anomaly detection result is evaluated from the accuracy perspective, including the accuracy metrics of healthy data, abnormal data, and all data. The result of HSESR is judged from the perspectives of monotonicity and correlation. Monotonicity is measured by both the traditional metric of Eq. (1) and the newly introduced inverse-number metric; correlation is estimated by two classic metrics: the Pearson coefficient shows the linear correlation of the HI array with time, and the Spearman coefficient reflects the nonlinear (monotonic) correlation of the HI array.
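Both correlation metrics can be sketched with plain NumPy; the rank-based Spearman implementation below assumes no ties among the HI values, and the variable names are illustrative.

```python
import numpy as np

def pearson(x, y):
    # Linear correlation: covariance normalized by the standard deviations.
    x, y = np.asarray(x, float), np.asarray(y, float)
    xc, yc = x - x.mean(), y - y.mean()
    return (xc @ yc) / np.sqrt((xc @ xc) * (yc @ yc))

def spearman(x, y):
    # Monotonic correlation: Pearson correlation of the ranks
    # (valid when there are no ties).
    rank = lambda v: np.argsort(np.argsort(v)).astype(float)
    return pearson(rank(np.asarray(x)), rank(np.asarray(y)))

t = np.arange(6)                               # time index
hi = np.array([0.1, 0.3, 0.2, 0.5, 0.8, 1.0])  # toy HI sequence
print(pearson(t, hi), spearman(t, hi))
```

A strictly monotone but nonlinear HI yields Spearman 1 while Pearson stays below 1, which is why the paper reports both.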
3.3 Comparison Methods
Three distinguished methods from varied domains have been chosen for comparison: RMS, the negative entropy of the squared envelope spectrum (NESES), and the autoencoder. RMS is a cornerstone metric in vibration analysis, providing a perspective on system states by measuring vibration intensity. NESES is adept at discerning nuanced variations in earlystage faults by gauging signal complexity [23]. On the other hand, the autoencoder, a sophisticated approach widely employed in anomaly detection [24, 25], excels at seamlessly extracting anomalous features directly from raw data.
3.4 Experimental Results
The feature indicators are selected based on the calculation results of 35 classical statistical characteristics [26] commonly used as HIs. The feature of standard deviation frequency reflects the degradation in a safe region best, and it is chosen as the first feature. Besides, the RMS value is selected as the other candidate.
After specifying the features to be extracted, we first use the XJTU-SY dataset for the experiment. The metric results of anomaly detection are shown in Table 2. Bearings 1_4 and 3_5 are excluded due to their insufficient data volume.
Across most datasets, all four methods effectively distinguish between healthy and abnormal states. The proposed method, in particular, demonstrates exceptional robustness and consistently delivers the most outstanding average accuracy. Further, the healthy data is collated based on temporal continuity, and the corresponding metrics for HSESR are computed. The original SVDD method is also applied as an additional comparison to see the changes before and after the improvement. The monotonicity results are shown in Figure 4.
The result reveals that the monotonicity metrics of the proposed method see enhancements across all datasets, with the majority showing substantial improvements. Specifically, the Mon1 metric surges with an impressive average growth of 281.8%, and the Mon2 metric increases by 39.2%. Further, the correlation results are illustrated in Figure 5, where Cor1 and Cor2 correspond to the Pearson and Spearman coefficients.
Consistent with the data above, the results in Figure 5 highlight the proposed method's pronounced augmentation in the correlation of HIs over time, encompassing both linear and nonlinear correlations. To elaborate, the linear correlation metric Cor1 rises by 132.6%, and the nonlinear correlation metric Cor2 increases by 157.1%. After these enhancements, the two metrics reach 0.797 and 0.842, respectively. This transformation suggests a shift from a previously weak correlation to a very strong correlation.
The PRONOSTIA dataset was also tested to assess the proposed method's generalization performance. The metric results of anomaly detection are shown in Table 3, and the metric results of HSESR are shown in Figures 6 and 7.
For anomaly detection, the outcomes closely align with those on the XJTU-SY dataset. The performance of RMS is decent but exhibits some inconsistencies; NESES displays significant fluctuations. Both the autoencoder and the proposed method exhibit impressive performance; however, the former shows performance declines on certain datasets.
For HSESR, the proposed method brings about substantial improvements across all metrics. In terms of monotonicity metrics, the Mon1 sees a remarkable average rise of 270.2%, while the Mon2 increases by an average of 51.3%.
Regarding the correlation metrics, the linear correlation represented by Cor1 surges by 198.1%, whereas the nonlinear correlation, indicated by Cor2, grows by 120.1%. Following these optimizations, the values of the two metrics settle at 0.806 and 0.827, respectively. This marks a transition from an initially weak correlation to a very strong one.
Upon consolidating the metric results of the proposed method from both datasets, the following insights emerge. In anomaly detection, the proposed method shows consistently high accuracy, surpassing 99%. Shifting to state evaluation, the metrics show significant advancement: one monotonicity metric registers a surge of roughly 276%, while the other rises by about 45.3%. In parallel, the two correlation metrics grow robustly, with average increases of 165.4% and 138.6%, respectively. After optimization, the two metrics settle at averages of 0.802 and 0.835. This evolution marks a transformative leap from an initially weak correlation to a very strong one, attesting to the soundness of the state assessment. The proposed method thus performs state assessment during healthy intervals while maintaining high accuracy in detecting anomalies.
3.5 Result Discussion
Based on the experimental results, this section analyzes the proposed method from two aspects: effectiveness and deficiency.
The effectiveness refers to the attainment of anomaly detection and state evaluation. The achievement of anomaly detection is based on the second assumption of the proposed method, that is, healthy and abnormal data can be distinguished well. When abnormal data are not easily distinguished from healthy data, anomaly detection requires a tight envelope, and the boundary adjustment may affect the anomaly detection. When the abnormal data are distinguished from the healthy data, the proposed method can sacrifice part of the envelopment to achieve state assessment. Although the adjusted boundary becomes looser, the optimization of distinguishability is added in the adjustment process to ensure the effectiveness of anomaly detection.
The effectiveness of HSESR is interpreted from two perspectives. Perspective 1: Parameter adjustment. According to Eq. (4), the HI array is determined by the location of the hypersphere center, while the group of weight factors determines the center in Eq. (3). After applying a genetic algorithm to optimize the weight factors, the obtained HI array better reflects the health state in the safe region. As shown in Figure 8, the example with the XJTU-SY dataset intuitively shows that the proposed method improves the monotonicity of HI.
Perspective 2: Enveloping tradeoffs. The optimization of the HI sequence is based on relaxed envelope requirements, and Figure 9 shows the envelope plot of the data in Figure 8. Compared to the original SVDD model, the envelope boundary of the proposed method becomes looser, as shown in Figure 9(b). Owing to the good distinguishability of the data, the loose boundary can still separate healthy data from abnormal data very well, as Figure 9(a) shows. At the same time, the released envelope is directly transformed into the monotonic gain of the HI sequence through model optimization, which ensures the effectiveness of HSESR.
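The enveloping trade-off can be mimicked with a toy model (synthetic data, and a plain Euclidean ball rather than the kernelized SVDD boundary): when healthy and abnormal samples are well separated, even a deliberately loosened radius keeps the two classes apart, so the slack released from the envelope costs little detection accuracy:

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic, well-separated classes (matching the paper's second assumption).
healthy = 0.5 * rng.standard_normal((200, 2))
abnormal = rng.standard_normal((50, 2)) + 5.0

center = healthy.mean(axis=0)
d_h = np.linalg.norm(healthy - center, axis=1)
d_a = np.linalg.norm(abnormal - center, axis=1)

accs = {}
for name, radius in [("tight", np.quantile(d_h, 0.95)),   # tight envelope
                     ("loose", 1.5 * d_h.max())]:         # relaxed envelope
    # Correct = healthy inside the ball + abnormal outside it.
    correct = np.sum(d_h <= radius) + np.sum(d_a > radius)
    accs[name] = correct / (len(d_h) + len(d_a))
    print(f"{name} radius {radius:.2f}: accuracy {accs[name]:.3f}")
```

In this separable regime the loose boundary matches or beats the tight one, which is the margin the proposed method spends on improving HI monotonicity.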
Two deficiencies exist in the experimental results. The first is that the accuracy of several anomaly detection results is not very high, such as for Bearing 1_1, Bearing 1_2, and Bearing 2_2 of the PRONOSTIA dataset. The reason is that the distinguishability of the feature indicators is not good enough; in other words, the data do not meet the applicable conditions of the proposed method.
Another shortcoming is that the HSESR metrics of some data are still not good enough, even though these trend features have been greatly improved. These unsatisfactory performances are also closely related to the application hypotheses of the proposed method. For instance, the abnormal data for Bearing 2_5 of the PRONOSTIA dataset are not distinctly differentiated from the healthy data, indicating a weak alignment with the second hypothesis. Accordingly, the data envelope has to be tight to prevent the misjudgment of abnormality, and the HI monotonicity cannot be well considered. As another example, the HSESR of Bearing 3_2 in the XJTU-SY dataset is invalid because all HSESR metrics are too small to reflect the tendency. It can be attributed to the failure to satisfy the third hypothesis, i.e., the trend of extracted features in safe regions is too poor. The trend of the features is positively correlated with the HSESR metrics. For example, the extracted trend features do not reflect the condition degradation well for Bearing 1_2 in the XJTU-SY dataset and Bearing 2_1 of the PRONOSTIA dataset; the corresponding correlation metrics of their results are all below 70%, lower than those of the other data.
Therefore, future improvements can be made from the following aspects.
1) Better features must be explored to characterize the degradation in safe regions. As the degradation process of mechanical components is theoretically irreversible [18], it is commonly assumed that the true inherent health condition decreases with time [27]. Since the degradation inside the safe region is imperceptible, most studies assume no degradation exists, yet this degradation is theoretically inevitable. The study of healthy features is expected to improve predictive maintenance inside the safe region.
2) The generalization and robustness of the features should be improved so that the indicators remain effective on more, and more varied, data. Although the adopted features work well on most data, they still fail to apply in some cases.
3) More data are expected to be collected. The amount of data used in the modeling is insufficient, which limits the boundary's generalization ability on unknown data.
4 Summary and Conclusions
In this paper, a dual-task learning approach is proposed to deal with the problem of suddenness in anomaly detection. By considering both the monotonicity and distinguishability of the HIs in model construction, the proposed scheme unifies anomaly detection and HSESR under the same framework. Experimental outcomes from two datasets reveal that the proposed method ensures an average anomaly detection accuracy surpassing 99% and excels in state evaluation. The correlation indicators have risen by over 150%, reaching values beyond 0.8, signifying a shift from an initially weak correlation to an extremely strong one. Still, some results were suboptimal because the application assumptions were not satisfied. Analysis of the results showed that the amount of data and the extracted features are the key factors affecting the method's performance, which points to directions for future improvement. The proposed method lays the foundation for implementing predictive maintenance throughout the life cycle by improving state awareness in safe regions.
Availability of Data and Materials
The datasets generated and analyzed during the current study are available in the XJTU-SY dataset repository, https://biaowang.tech/xjtusybearingdatasets/, and the PRONOSTIA dataset repository, https://www.kaggle.com/datasets/alanhabrony/ieeephm2012datachallenge.
References
A Bousdekis, B Magoutas, D Apostolou, et al. Review, analysis and synthesis of prognostic-based decision support methods for condition-based maintenance. Journal of Intelligent Manufacturing, 2018, 29: 1303–1316.
A Cubillo, S Perinpanayagam, M Esperon-Miguez. A review of physics-based models in prognostics: Application to gears and bearings of rotating machinery. Advances in Mechanical Engineering, 2016, 8(8): 1687814016664660.
X L Ou, G R Wen, X Huang, et al. A deep sequence multi-distribution adversarial model for bearing abnormal condition detection. Measurement, 2021, 182: 109529.
R N Liu, B Y Yang, E Zio, et al. Artificial intelligence for fault diagnosis of rotating machinery: A review. Mechanical Systems and Signal Processing, 2018, 108: 33–47.
X Huang, G R Wen, S Z Dong, et al. Memory residual regression autoencoder for bearing fault detection. IEEE Transactions on Instrumentation and Measurement, 2021, 70: 1–12.
W Q Jiang, Y Hong, B T Zhou, et al. A GAN-based anomaly detection approach for imbalanced industrial time series. IEEE Access, 2019, 7: 143608–143619.
W T Mao, J X Chen, X H Liang, et al. A new online detection approach for rolling bearing incipient fault via self-adaptive deep feature matching. IEEE Transactions on Instrumentation and Measurement, 2020, 69(2): 443–456.
H S Zhao, H H Liu, W J Hu, et al. Anomaly detection and fault analysis of wind turbine components based on deep learning network. Renewable Energy, 2018, 127: 825–834.
K Vos, Z X Peng, C Jenkins, et al. Vibration-based anomaly detection using LSTM/SVM approaches. Mechanical Systems and Signal Processing, 2022, 169: 108752.
T Ergen, S S Kozat. Unsupervised anomaly detection with LSTM neural networks. IEEE Transactions on Neural Networks and Learning Systems, 2020, 31(8): 3127–3141.
J T Zhang, B Zeng, W M Shen, et al. A one-class Shapelet dictionary learning method for wind turbine bearing anomaly detection. Measurement, 2022, 197: 111318.
Y P Zhao, Y L Xie, Z F Ye. A new dynamic radius SVDD for fault detection of aircraft engine. Engineering Applications of Artificial Intelligence, 2021, 100: 104177.
W B Song, D Wu, W M Shen, et al. Meta-learning based early fault detection for rolling bearings via few-shot anomaly detection. arXiv:2204.12637, 2022.
H S Dhiman, D Deb, S M Muyeen, et al. Wind turbine gearbox anomaly detection based on adaptive threshold and twin support vector machines. IEEE Transactions on Energy Conversion, 2021, 36(4): 3462–3469.
Y F Li, X Zhang, Z G Chen, et al. Time-frequency ridge estimation: An effective tool for gear and bearing fault diagnosis at time-varying speeds. Mechanical Systems and Signal Processing, 2023, 189: 110108.
L Montechiesi, M Cocconcelli, R Rubini. Artificial immune system via Euclidean Distance Minimization for anomaly detection in bearings. Mechanical Systems and Signal Processing, 2016, 76-77: 380–393.
Y Z Liu, Y S Zou, Y Wu, et al. A novel abnormal detection method for bearing temperature based on spatiotemporal fusion. Proceedings of the Institution of Mechanical Engineers, Part F: Journal of Rail and Rapid Transit, 2021, 236(3): 317–333.
Y Y Yin, Z L Liu, J H Zhang, et al. An adaptive sampling framework for life cycle degradation monitoring. Sensors, 2023, 23(2): 965.
Z L Liu, M J Zuo, J L Kang, et al. Equipment health condition assessment method based on support vector data description. Chengdu: University of Electronic Science and Technology of China Press, 2022.
Z L Liu, J L Kang, X J Zhao, et al. Modeling of the safe region based on support vector data description for health assessment of wheelset bearings. Applied Mathematical Modelling, 2019, 73: 19–39.
B Wang, Y G Lei, N P Li, et al. A hybrid prognostics approach for estimating remaining useful life of rolling element bearings. IEEE Transactions on Reliability, 2020, 69(1): 401–412.
P Nectoux, R Gouriveau, K Medjaher, et al. PRONOSTIA: An experimental platform for bearings accelerated degradation tests. IEEE International Conference on Prognostics and Health Management, Denver, USA, June 18-21, 2012: 1–8.
J Antoni. The infogram: Entropic evidence of the signature of repetitive transients. Mechanical Systems and Signal Processing, 2016, 74: 73–94.
J U Ko, K Na, J S Oh, et al. A new autoencoder-based dynamic threshold to reduce false alarm rate for anomaly detection of steam turbines. Expert Systems with Applications, 2022, 189: 116094.
S Givnan, C Chalmers, P Fergus, et al. Anomaly detection using autoencoder reconstruction upon industrial motors. Sensors, 2022, 22(9): 3166.
Z L Liu, J Qu, M J Zuo, et al. Fault level diagnosis for planetary gearboxes using hybrid kernel feature selection and kernel Fisher discriminant analysis. The International Journal of Advanced Manufacturing Technology, 2013, 67(5): 1217–1230.
Z G Tian. An artificial neural network method for remaining useful life prediction of equipment subject to condition monitoring. Journal of Intelligent Manufacturing, 2012, 23(2): 227–237.
Acknowledgements
Not applicable.
Funding
Supported by Sichuan Provincial Key Research and Development Program of China (Grant No. 2023YFG0351), and National Natural Science Foundation of China (Grant No. 61833002).
Author information
Contributions
YY designed the study, performed the assays, prepared the manuscript, and contributed to its application part; ZL conducted the optimization and assay validation studies; BG assisted with data analysis and processing; YY, ZL, and MZ participated in discussing the results and revised the manuscript; All authors read and approved the final manuscript.
Authors' Information
Yuhua Yin, born in 1990, is currently a Ph.D. candidate at University of Electronic Science and Technology of China. He received an M.S. degree in mechanical engineering from Chongqing University, China, in 2014. His research interests include intelligent operation and maintenance, prognostic and health management.
Zhiliang Liu, born in 1984, is currently an Associate Professor at School of Mechanical and Electrical Engineering, University of Electronic Science and Technology of China. He received his Ph.D. degree in detection technology and automatic equipment from University of Electronic Science and Technology of China, in 2013. His research interests include intelligent maintenance for complex equipment using advanced signal processing and data mining methods.
Bin Guo, born in 1998, is currently pursuing an M.S. degree at University of Electronic Science and Technology of China. He received a B.S. degree in Mechanical Engineering from Chengdu University of Technology, China, in 2020. His research interests include signal processing and health maintenance of mechanical equipment.
Mingjian Zuo, born in 1962, is currently a principal scientist at Qingdao International Academician Park Research Institute, China, and a guest professor at University of Electronic Science and Technology of China. He received his Ph.D. degree in Industrial Engineering from Iowa State University, Ames, Iowa, U.S.A, in 1989. His research interests include system reliability analysis, maintenance modeling and optimization, signal processing, and fault diagnosis.
Ethics declarations
Competing Interests
The authors declare no competing financial interests.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Yin, Y., Liu, Z., Guo, B. et al. A Dual-Task Learning Approach for Bearing Anomaly Detection and State Evaluation of Safe Region. Chin. J. Mech. Eng. 37, 4 (2024). https://doi.org/10.1186/s10033-023-00978-3