 Original Article
 Open Access
Digital Twin-based Quality Management Method for the Assembly Process of Aerospace Products with the Grey-Markov Model and Apriori Algorithm
Chinese Journal of Mechanical Engineering volume 35, Article number: 105 (2022)
Abstract
The assembly process of aerospace products such as satellites and rockets has the characteristics of single or small-batch production, a long development period, high reliability, and frequent disturbances. How to predict and avoid quality abnormalities, quickly locate their causes, and improve product assembly quality and efficiency are urgent engineering issues. As the core technology to realize the integration of virtual and physical space, digital twin (DT) technology can make full use of the low cost, high efficiency, and predictability of digital space to provide a feasible solution to such problems. Hence, a quality management method for the assembly process of aerospace products based on DT is proposed. Given that traditional quality control methods for the assembly process of aerospace products are mostly post-inspection, the Grey-Markov model and T-K control chart are used with a small sample of assembly quality data to predict the value of quality data and the status of an assembly system. The Apriori algorithm is applied to mine the strong association rules related to quality data anomalies and uncontrolled assembly systems, so as to address the problem that the causes of abnormal quality are complicated and difficult to trace. The implementation of the proposed approach is described, taking the collected centroid data of an aerospace product’s cabin, one of the key quality data in the assembly process of aerospace products, as an example. A DT-based quality management system for the assembly process of aerospace products is developed, which can effectively improve the efficiency of quality management for the assembly process of aerospace products and reduce quality abnormalities.
Introduction
Aerospace products include satellites, missiles, and rockets, which are complex in their customer requirements, product composition, manufacturing processes, and project management. Assembly is the addition or connection of parts to form a complete product, which is important in the research and development (R&D) and production of aerospace products. As the assembly process is the most important link in the delivery of aerospace products, improving assembly quality and achieving quality management of the process is of great engineering significance [1].
Aerospace product assembly is a typical discrete assembly, with the characteristics of single or small-batch production and a long assembly cycle, involving many professional fields, scattered assembly data, and much rework and repair [2]. The assembly process generates much quality data, including indirect data such as the number of defective products, first pass rate, and repair rate, and direct data such as product length, weight, and assembly errors. Assembly quality data include data that can reflect product quality, such as quality cost loss, production batch, inventory backlog, and invalid operation time [3]. These data are the basis for the evaluation of assembly quality, which can be used to measure product quality and provide guidance for the continuous improvement of assembly quality.
Quality management is a process of organizing and coordinating related activities to meet certain quality requirements, i.e., to manage and control the above quality data. The complete quality management process aims to grasp the status of quality data, predict its future status, and adjust the assembly process accordingly, so as to maintain quality within a reasonable range. Assembly quality data are collected, stored, and managed through a manufacturing execution system (MES) [4]. However, assembly quality management adopts post-inspection, i.e., checking and controlling the quality of the final product. If the data meet the standard, the product is warehoused or enters the next process; otherwise, it is returned for repair or scrapped. Assembly process quality management for aerospace products has the following issues to be resolved.

(1) Quality prediction when the sample volume of data is small. A large amount of quality data is generated in the assembly of aerospace products. Their variety is wide, and the volume of a particular type of data may be insufficient for forecasting. Accuracy of data forecasting and sample data volume show some positive correlation. Hence, it can be difficult to obtain accurate predictions.
(2) Quick location of abnormal causes when influence factors of quality data are complicated. The assembly quality of aerospace products is affected by the factors of man, machine, material, method, measurement, and environment (5M1E), each having several influencing factors, making it difficult to directly determine specific causes of abnormal quality data.
(3) Quality control is time-consuming and has poor real-time performance. Assembly quality control is implemented based on the quality management process of post-optimization, including problem definition, investigation and measurement of relevant factors, analysis and determination of key factors, and control and improvement of influence factors. This can take a long time, precluding the avoidance of imminent quality problems.
The concept of the digital twin (DT) was first proposed by the Air Force Research Laboratory in 2011 and was first applied to industry [5]. Its purpose is to simulate and optimize in a virtual environment before production to avoid undesirable conditions. With the deepening of research, DT technology has been considered the key to realizing the interaction and fusion between the physical and information worlds [6, 7]. It provides a clear path for the implementation of a cyber-physical system (CPS) and introduces the idea of making full use of digital space to mirror, predict, guide, and control the physical space [8]. Additionally, it provides a novel way to assist humans in understanding the physical world from a multi-time dimension including the past, present, and future. Hence, the application of DT technology to quality management in the assembly process of aerospace products can realize quality status monitoring, prediction, and anomaly traceability in virtual spaces based on the collected real-time data and can enable adjustments in physical spaces. It conforms to the basic ideas of status monitoring and prediction, anomaly traceability, and the timely regulation of quality data. Therefore, DT provides a feasible approach to realize quality management of the assembly process for aerospace products.
The rest of this paper is organized as follows. Section 2 summarizes the state of the art of quality data prediction, traceability technology, and DT applications in the manufacturing phase. In Section 3, the implementation framework of quality management for the assembly process of aerospace products based on DT is established. Three key technologies are discussed in Section 4, including numerical prediction of quality data based on the Grey-Markov model, assembly system status prediction based on the T-K statistical control chart, and association rule mining for quality abnormalities based on the Apriori algorithm. In Section 5, taking the centroid data of an aerospace product’s cabin, one of the key quality data in the assembly process of aerospace products, as an example, the application process of the proposed method is elaborated in detail. Additionally, a DT-based assembly process quality management system for aerospace products is developed and verified. Finally, the main contributions of the paper and future work are summarized in Section 6.
Related Work
This paper focuses on the use of DT technology to achieve quality management for the assembly process of aerospace products. We start by reviewing quality management technology, including quality data prediction and traceability, and pointing out the future trend of taking full advantage of digital space, including low cost, high efficiency, multiple iterations, and predictability, to promote it. Then, the state of the art of DT applications in the production phase is reviewed.
Quality Data Prediction and Traceability Technology
The key to quality management is to forecast the future development trend of quality data, and even its specific value. Data prediction includes qualitative prediction, i.e., of change trends, and quantitative prediction, i.e., of specific values. We study quantitative prediction for assembly quality management of aerospace products. Quantitative data prediction includes time series analysis and causal analysis. Time series analysis methods include the simple sequential time mean, weighted sequential time mean, moving average, weighted moving average, exponential smoothing, seasonal trend prediction, and market life cycle prediction. Taylor et al. analyzed several time series algorithms based on intraday power demand data from 10 European countries and found the prediction results of exponential smoothing to be best [9]. Guo et al. presented a chaotic time series prediction algorithm to predict wind speed, modified the model with the parallel rule algorithm, and verified the method with real wind data [10]. Time series analysis focuses on predicting large amounts of time series data, but the volume of assembly quality data for aerospace products is generally small, and the data do not necessarily form a strict time series. Causal analysis includes linear regression, support vector machines, neural network prediction, Grey prediction, and Markov prediction. To improve data prediction accuracy, Zhou et al. investigated a fine-tuning approach for the parameters of least-squares support vector machines to predict one-step-ahead short-term wind speed [11]. For TFT–LCDs, which present a small-sample-size prediction problem, Li et al. developed a three-step approach: K-means clustering, attribute extension using the fuzzy membership function in each cluster, and feeding the data with the newly generated attributes into a back-propagation neural network (BPNN) [12].
Additionally, Grey prediction can discover the intrinsic correlation behind hazy phenomena and is suitable for predicting small datasets. Li et al. used trend price tracking to extract hidden information on the behavior of manufacturing sample data and constructed an adaptive Grey prediction model, AGM(1,1), to forecast industrial data [13]. Chang et al. presented a modified Grey forecasting model to forecast short-term manufacturing demand [14]. Markov models can also be used to predict the trends of small datasets and analyze the future trends of discrete random processes, i.e., to predict the future state of a variable from its present state and change trend [15]. However, how to improve prediction accuracy based on small volumes of sample data remains an open issue.
The most commonly used approach for quality data tracing, the relational data model, builds the corresponding management model and database in the assembly process to trace abnormal quality data. Materials are marked to trace quality data. Zhuang et al. realized material traceability and management in the assembly process based on workflow technology [16]. In addition, picture and video data are collected to proactively discover existing quality problems and trace the causes of quality abnormalities. On this basis, vision-based recognition methods from a feature perspective are applied to discover defects [17]. Such methods achieve traceability of quality data through correlation but do not mine the complicated associations among multi-source, heterogeneous, scattered data, so the efficiency of quality problem traceability is low.
However, due to the unstable process, long assembly and adjustment period, and strict quality control in the R&D stage, quality management still has problems such as much rework and repair, and difficulty tracing quality issues. With the rapid development of new-generation information technology such as the Internet of Things, big data, and artificial intelligence, the deep integration of digital and physical space has become a common bottleneck in various countries’ industrial strategies, including Industry 4.0, the Industrial Internet, and “Made in China 2025.” Therefore, fully utilizing the advantages of digital space to improve the quality management of the assembly process for aerospace products will be the development trend [18,19,20].
Overview of DT Applications in the Production Phase
DT technology has been widely applied in all stages of the product lifecycle [21, 22], including design [23], manufacture [24], and service [25]. In the production stage, Tao et al. introduced the digital twin shop-floor (DTS), a paradigm for shop-floor instances of cyber-physical production systems (CPPSs) [26]. Leng et al. incorporated DT in the parallel control of automated manufacturing systems [27]. Park et al. built a DT-based CPPS architectural framework for personalized production [28]. Bao et al. investigated an ontology-based modeling and evolution method of DT for the assembly shop-floor to deal with issues such as the discreteness of the assembly process, the diversity of assembly resources, and the complexity of dataflow in assembly task execution [29]. Son et al. presented a DT-based CPS for abnormal scenarios involving automotive body production lines, which can forecast whether a product can be manufactured when abnormal scenarios occur [30]. Zhang et al. presented an information modeling method for a CPPS based on DT and AutomationML, which integrated various physical resources into the CPPS to support information interaction between resources [31]. Yildiz et al. discussed the demonstration and evaluation of a DT-based virtual factory [32]. Additionally, Sun et al. investigated a DT-based assembly commissioning approach for high-precision products to solve the issues of low assembly efficiency and poor quality consistency caused by the traditional manual method [33]. Zhang et al. discussed a hybrid prediction approach combining a physical model and data to achieve quality assurance for composite components [34]. DT technology is the key to realizing the virtual-physical fusion of CPS, which makes full use of the digital and physical spaces and can improve the assembly quality control of aerospace products.
In the assembly process of aerospace products, the DT model can not only truly and dynamically mirror the quality status and process of its physical counterpart but can also be applied to achieve quality anomaly traceability and quality status prediction. Hence, DT can assist shop-floor managers in the physical space in carrying out quality optimization, effectively avoiding some quality anomalies and reducing the time needed to handle quality problems. The above research uses DT in the production phase but does not address its application to assembly process quality management and control.
Framework of DT-based Quality Management for the Assembly Process of Aerospace Products
There are two main quality abnormalities in the assembly process of aerospace products: abnormal product quality data and uncontrolled assembly systems. The first case indicates quality problems in the assembly process. The latter is a hidden quality problem. If the assembly system is uncontrolled, quality anomalies are more likely in the subsequent assembly process. Considering these two anomalies, we propose a framework of quality management for the assembly process of aerospace products based on DT, as shown in Figure 1.
On the physical assembly shop-floor, a shop-floor Internet of Things (IoT), including RFIDs, sensors, barcodes, industrial Ethernet, and wireless networks, is constructed to realize real-time perception of manufacturing resources and the collection and transmission of quality data.
On the shop-floor data layer, real-time quality data are dynamically collected through the shop-floor IoT and managed based on the product assembly BOM. These data include inspection data, measurement data, assembly process parameters, environmental data, equipment operation data, material usage data, process completion data, and technical problem data. To ensure that quality data are available in real time, the inspection and measurement data, such as centroid, moment of inertia, and weight, are automatically collected and transmitted using the corresponding inspection and measurement equipment. The assembly process parameters, environmental data, and equipment operation data are obtained using the corresponding sensors, such as displacement, speed, temperature, and humidity sensors. The material usage data are acquired through RFID (Radio Frequency Identification) and by scanning the barcode corresponding to the material. The process completion data and technical problem data are collected through human-computer interaction.
On the virtual assembly shop-floor, a DT model of quality management for the assembly process of aerospace products is constructed. The DT model is composed of two parts: one is the shop-floor visualization model, which can be used for quality monitoring, and the other is the quality prediction and traceability model, which can be used for calculation and decision-making. Therefore, the virtual shop-floor contains two levels.
At the DT-based monitoring level, data and model visualization are applied, based on the shop-floor visualization model, to mirror and monitor the product quality status, and warnings are issued if abnormalities occur.
At the DT-based quality prediction and traceability level, a Grey-Markov model forecasts the future value of quality data according to historical and current data. An abnormal predicted value indicates a quality abnormality in the current assembly process, triggering cause traceability; if the predicted value is normal, verification continues. Then, according to the Grey-Markov forecasting results and historical quality data values, observation samples are selected from different batches, a T-K statistical control chart is established according to the sample data, changes of the mean and standard deviation of the quality sample data are observed, and it is predicted whether the assembly system will be within the control range at the next moment. If an uncontrolled assembly system is forecast, quality anomalies are highly likely during subsequent assembly, triggering the cause traceability process of quality abnormality; if the assembly system is forecast to be in control, the process is repeated until no products are available. When the cause tracing process of quality abnormality is triggered, it is necessary to analyze the factors that may affect the quality data on the assembly shop-floor according to 5M1E. The values of these factors, together with the quality data values and assembly system status, form an item set. We then use the Apriori algorithm to mine the strong association rules related to quality data abnormalities and an uncontrolled assembly system. Through these strong association rules, the influencing factors related to quality anomalies are traced and stored in the quality management knowledge base.
The key technologies illustrated in Section 4 focus on the technologies included in the DT-based quality prediction and traceability level: numerical prediction of quality data using a Grey-Markov model, status prediction of the assembly system based on a T-K control chart, and mining of association rules for quality abnormalities based on the Apriori algorithm.
Key Technologies
Numerical Prediction of Quality Data Using Grey-Markov Model
Grey Model
When only part of the information in a system is known, the unknown information can be predicted using the Grey model (GM), i.e., a Grey system theory model. While appearing to be random, the quality data generated during the assembly process of aerospace products is actually ordered and time-dependent, so the Grey model can be used.
For the quality data of the assembly process of aerospace products, a GM(1,1) model is established, i.e., a first-order linear differential equation.
The GM(1,1) modeling process for assembly quality data of aerospace products is as follows.
Assume the original series of quality data is:
Calculate the scale ratio of the original series of quality data
If all scale ratios of quality data fall within the acceptable coverage interval \(Y=(e^{-\frac{2}{n+1}}, e^{\frac{2}{n+1}})\), the original series of quality data \({y}^{(0)}\) can be used to establish the GM(1,1) model. Otherwise, the data must be transformed, such as through a shift transformation
where c is a constant that brings all scale ratios within the acceptable coverage interval.
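The scale ratio check and shift transformation described above can be sketched as follows. This is a minimal illustration, assuming the usual GM(1,1) definition of the scale ratio as the quotient of consecutive observations; the function names, the sample series, and the step-wise search for the constant c are hypothetical and not taken from the paper.

```python
import math

def scale_ratios(series):
    """Scale ratios y(t-1) / y(t) for t = 2..n (assumed standard definition)."""
    return [series[t - 1] / series[t] for t in range(1, len(series))]

def admissible_interval(n):
    """Acceptable coverage interval (e^(-2/(n+1)), e^(2/(n+1)))."""
    bound = 2.0 / (n + 1)
    return math.exp(-bound), math.exp(bound)

def passes_ratio_check(series):
    """True if every scale ratio lies strictly inside the interval."""
    low, high = admissible_interval(len(series))
    return all(low < r < high for r in scale_ratios(series))

def shift_until_admissible(series, step=1.0, max_steps=1000):
    """Search for a shift constant c (a multiple of `step`) such that
    series + c passes the check. The search strategy is an assumption."""
    c = 0.0
    for _ in range(max_steps):
        shifted = [y + c for y in series]
        if passes_ratio_check(shifted):
            return c, shifted
        c += step
    raise ValueError("no admissible shift found")

# Hypothetical raw series that fails the check without a shift.
raw = [2.0, 5.0, 4.0, 6.0, 5.5]
c, shifted = shift_until_admissible(raw)
```

A larger shift always pushes the ratios toward 1, so the search terminates for positive data; the price is that the shifted series grows more slowly in relative terms, which is why the smallest workable c is usually preferred.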
New series can be obtained by ratio validation and processing of original data:
Accumulate the original quality data to weaken fluctuation and randomness that may exist in the random series to obtain the quality data accumulation series
Because the solutions of a first-order differential equation show an exponential growth trend, similar to that of the sequence \({x}^{(1)}(t)\), the sequence \({x}^{(1)}\) is considered to satisfy the first-order differential equation
where \(a\) is the development coefficient, whose effective interval is (−2, 2), and \(u\) is the Grey action; both are parameters to be determined. Once the parameters \(a\) and \(u\) are obtained, \({x}^{(1)}(t)\) can be obtained, as can the predicted value of \({x}^{(0)}\).
According to the definition of derivative,
i.e.,
The following matrices are obtained:
Let
Solving the Grey parameters by the least squares method, we obtain
We substitute \(\widehat{{\varvec{a}}}\) into Eq. (6) to obtain
which is the time response function model of GM(1,1).
On this basis, the forecasting equation of the original quality data is obtained as
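For orientation, the GM(1,1) relations referred to in this derivation can be written out in their usual textbook form. This is a sketch under stated assumptions: it uses the development coefficient \(a\) and the Grey action \(u\) as defined above, takes the first observation \(x^{(0)}(1)\) as the initial condition, and the exact indexing is assumed rather than taken from the paper's numbered equations.

```latex
% Whitening (first-order differential) equation of GM(1,1)
\[
\frac{\mathrm{d}x^{(1)}(t)}{\mathrm{d}t} + a\,x^{(1)}(t) = u
\]
% Time response function, with x^{(1)}(1) = x^{(0)}(1) as the initial condition
\[
\hat{x}^{(1)}(t+1) = \left(x^{(0)}(1) - \frac{u}{a}\right)e^{-at} + \frac{u}{a},
\qquad t = 0, 1, 2, \dots
\]
% Restored forecast of the original series by inverse accumulation
\[
\hat{x}^{(0)}(t+1) = \hat{x}^{(1)}(t+1) - \hat{x}^{(1)}(t)
\]
```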
After establishing the Grey model, it is necessary to check whether it can be used to forecast the target data. Three test methods are selected: residual, correlation, and posterior.
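Before turning to the individual tests, the modeling steps of this subsection (accumulation, least-squares estimation of \(a\) and \(u\), time response, and inverse accumulation) can be sketched in plain Python. The sketch assumes the standard GM(1,1) background value, i.e., the mean of consecutive accumulated values; the function names and the sample series are hypothetical.

```python
import math

def fit_gm11(x0):
    """Estimate the development coefficient a and Grey action u of GM(1,1)
    by least squares on x0(t) = -a * z(t) + u, t = 2..n."""
    x1, total = [], 0.0
    for value in x0:                      # accumulated series x1
        total += value
        x1.append(total)
    # Background values: mean of consecutive accumulated values (assumed).
    z = [0.5 * (x1[t] + x1[t - 1]) for t in range(1, len(x0))]
    y = x0[1:]
    m = len(y)
    sz, sy = sum(z), sum(y)
    szz = sum(v * v for v in z)
    szy = sum(zv * yv for zv, yv in zip(z, y))
    det = m * szz - sz * sz               # determinant of B^T B
    a = (sz * sy - m * szy) / det
    u = (szz * sy - sz * szy) / det
    return a, u

def predict_gm11(x0, a, u, horizon=1):
    """Fitted values for x0 followed by `horizon` forecasts.
    Assumes a != 0 (a constant series would need separate handling)."""
    def x1_hat(t):                        # t = 0 is the first observation
        return (x0[0] - u / a) * math.exp(-a * t) + u / a
    fitted = [x0[0]]
    for t in range(1, len(x0) + horizon):
        fitted.append(x1_hat(t) - x1_hat(t - 1))   # inverse accumulation
    return fitted

# Hypothetical series growing by 5% per step.
history = [10.0 * 1.05 ** k for k in range(6)]
a, u = fit_gm11(history)
series = predict_gm11(history, a, u, horizon=2)
```

The two-parameter least-squares problem is solved in closed form here to avoid a dependency on a linear algebra library; with a matrix library, the same estimate is the usual \((B^{T}B)^{-1}B^{T}y_n\) expression.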
(1) Residual Test
The residual test is relatively intuitive: it only requires comparing the predicted and original values and checking whether the relative error meets the requirements.
The residual of the original data column \({x}^{(0)}(t)\) and predicted data column \({\widehat{x}}^{(0)}(t)\) is
The relative error \({\Delta }_{t}\) and average relative error \(\overline{\Delta }\) are calculated as
The fitting accuracy is
The Grey prediction model of quality data passes the residual test if p is greater than 80%.
(2) Correlation Degree Test
The correlation degree test is a geometric test to study the similarity of model curves of the original and predicted value. The more similar the geometry of the two curves, the more the values are correlated.
The correlation coefficient of the original quality data column \({x}^{(0)}(t)\) and predicted quality data column \({\widehat{x}}^{(0)}(t)\) is
where \(\rho\) is the resolution coefficient, usually taking a value in (0,1). A larger \(\rho\) indicates a smaller difference between correlation coefficients, and weaker discrimination ability.
The correlation degree between the original quality data column \({x}^{(0)}(t)\) and predicted quality data column \({\widehat{x}}^{(0)}(t)\) is
The closer \(r\) is to 1, the better the forecast accuracy. If \(r\) is greater than 0.6, then Grey prediction passes the correlation degree test.
(3) Posterior Variance Test
A posterior variance test is based on the probability distribution of the residual predicted by quality data.
We calculate the variance of the original quality data \({{s}_{1}}^{2}\), and of the prediction residuals \({{s}_{2}}^{2}\)
Then the ratio of the mean squared error (MSE) is
The residual probability is
As shown in Table 1, a smaller C and larger P indicate a more accurate Grey model. Grade I indicates the highest prediction accuracy, and Grade IV the lowest. Generally, if a prediction model is evaluated as Grade I, II, or III, it can be considered to pass the posterior variance test.
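The three checks can be sketched together as follows. This is an illustration under stated assumptions: the correlation coefficient uses the common Grey relational form with resolution coefficient \(\rho\), variances are population variances (divided by the number of points), and the residual probability uses the conventional 0.6745 multiplier on \(s_1\); none of these details are confirmed by the text above. The function names and data are hypothetical.

```python
import math

def residual_test(actual, predicted):
    """Fitting accuracy p in percent: (1 - mean relative error) * 100."""
    rel = [abs((x - xh) / x) for x, xh in zip(actual, predicted)]
    return (1.0 - sum(rel) / len(rel)) * 100.0

def correlation_degree(actual, predicted, rho=0.5):
    """Mean Grey relational coefficient between the two series (assumed form)."""
    abs_res = [abs(x - xh) for x, xh in zip(actual, predicted)]
    smallest, largest = min(abs_res), max(abs_res)
    if largest == 0.0:                    # perfect fit
        return 1.0
    coeffs = [(smallest + rho * largest) / (e + rho * largest) for e in abs_res]
    return sum(coeffs) / len(coeffs)

def posterior_variance_test(actual, predicted):
    """Return (C, P): the ratio s2 / s1 and the share of residuals whose
    deviation from the mean residual is below 0.6745 * s1 (assumed rule)."""
    n = len(actual)
    res = [x - xh for x, xh in zip(actual, predicted)]
    mean_x = sum(actual) / n
    mean_e = sum(res) / n
    s1 = math.sqrt(sum((x - mean_x) ** 2 for x in actual) / n)
    s2 = math.sqrt(sum((e - mean_e) ** 2 for e in res) / n)
    small = sum(1 for e in res if abs(e - mean_e) < 0.6745 * s1)
    return s2 / s1, small / n

# Hypothetical original and predicted values.
actual = [10.0, 12.0, 11.0, 13.0]
predicted = [10.0, 11.5, 11.5, 12.5]
p = residual_test(actual, predicted)
r = correlation_degree(actual, predicted)
C, P = posterior_variance_test(actual, predicted)
```

With these sample values the residual test passes (p is above 80%), while the correlation degree comes out at 0.5 and would fail the 0.6 threshold, which illustrates that the tests are not redundant.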
Revision of Predicted Residuals Using Markov Model
The assembly quality data of aerospace products are greatly affected by external information such as operators and the operating environment. This external information is considered random, so the correlation between the changes of quality data is not strong. Therefore, the Markov method is used to forecast and correct the value residuals in the Grey model.
The residuals of Grey predicted values are divided into different states, and a state transition matrix is established, which is composed of all one-step transition probabilities of the random process,
where \({p}_{ij}\) is the one-step transition probability from state i to state j.
It is worth noting that all elements in the state transition matrix are nonnegative, and the elements of each row sum to 1. The state transition matrix is an important part of the Markov prediction model. After calculating it, the subsequent state of the residual value of quality data can be calculated according to the initial state of the residual value. The sketch diagram of the residual state transition, shown in Figure 2, is convenient for observing the transition probability of each residual state of quality data.
Figure 3 shows the process of correcting residual values of Grey forecast results using a Markov prediction model.
After confirming that the variation of residuals of assembly quality data is a Markov process, it is necessary to collect residual data and classify the residual states. The state transition matrix is built dynamically according to the specific changes of quality residual data and is used to predict the state of the assembly quality data residuals of aerospace products. On this basis, the quality data obtained by the Grey forecasting model are corrected.
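The correction procedure can be sketched as follows. The sketch makes several assumptions that the text does not specify: residuals are defined as original minus predicted value, states are fixed intervals of the residual, rows with no observed transitions fall back to a uniform distribution, the next state is the most probable one, and the correction adds the midpoint of the predicted state interval to the Grey forecast. All names and numbers are hypothetical.

```python
def classify(residual, bounds):
    """Index of the state whose interval [bounds[i], bounds[i+1]) contains
    the residual; values outside the range are clamped to the end states."""
    for i in range(len(bounds) - 1):
        if residual < bounds[i + 1]:
            return i
    return len(bounds) - 2

def transition_matrix(states, k):
    """One-step transition probabilities estimated from a state sequence."""
    counts = [[0] * k for _ in range(k)]
    for current, nxt in zip(states, states[1:]):
        counts[current][nxt] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        # Rows never visited get a uniform distribution (an assumption).
        matrix.append([c / total if total else 1.0 / k for c in row])
    return matrix

def most_probable_next(matrix, state):
    """State with the highest one-step probability from `state`."""
    row = matrix[state]
    return max(range(len(row)), key=row.__getitem__)

# Hypothetical residuals of the Grey model and three residual states.
residuals = [-0.8, 0.3, 0.9, -0.2, 0.4, -0.7, 0.2]
bounds = [-1.0, -0.33, 0.33, 1.0]
states = [classify(e, bounds) for e in residuals]
matrix = transition_matrix(states, k=len(bounds) - 1)

next_state = most_probable_next(matrix, states[-1])
midpoint = 0.5 * (bounds[next_state] + bounds[next_state + 1])
grey_forecast = 10.0                      # hypothetical Grey prediction
corrected = grey_forecast + midpoint      # residual = original - predicted
```

Each row of `matrix` sums to 1, matching the property of the state transition matrix noted above. With so few observations the estimated probabilities are coarse, which is why the matrix is rebuilt as new residuals arrive.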
Assembly System Status Prediction Using T-K Statistical Control Chart
The T-K control chart does not require a large sample and is independent of the standard deviation of the parent population. It is therefore applicable to predicting the assembly system status.
T-control Chart for Monitoring the Mean of Quality Data
The T-statistic is used to monitor the fluctuation of the mean value of quality data and is suitable for small samples. In the actual assembly process of aerospace products, the mean value of certain quality data is usually uncertain and changes constantly. Therefore, when constructing T-statistics, it is assumed that the mean value of quality data is unknown.
Let X denote the quality data item. For each batch of products, 15 observation samples are selected, and \(\{{X}_{i,j,1}^{(r)},\dots ,{X}_{i,j,n}^{(r)}\}\) denotes the group i sample, where i=1, 2, …; j indicates the serial number of the product type corresponding to the batch sample, j=1, 2, …, P; n is the sample size; and the superscript r indicates the serial number of a batch of the same type of product.
Quality data within a batch and between batches are independent, sample data of the same variety obey the same normal distribution, and sample data of different varieties obey different normal distributions, i.e.,
where \({\mu }_{j}\) and \({\sigma }_{j}\) are the mean and standard deviation of the distribution of product j quality data under the controlled state. The mean and standard deviation of group i samples are
The mean of the first r−1 batches of samples of different product types is defined as
Then the T-statistic is
The control limits of the T-control chart are
where \(G_{t}^{-1}(\cdot \mid n-1)\) is the inverse function of the cumulative T-distribution function with n−1 degrees of freedom, and α is the significance level. According to statistical process control theory, the upper and lower control limits correspond to the positions of ±3σ, so α is 0.0027.
During the assembly of aerospace products, if an assembly system is in a controlled state and the mean value of quality data does not deviate, then for different kinds of products, as long as the sample size of each group is the same, the T-statistics calculated from the groups are mutually independent, follow the same T-distribution, and share the same control limits. The T-statistics and control limits calculated from each batch of quality data can be used to plot the T-control chart and monitor assembly quality.
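The link between the ±3σ positions and α = 0.0027, and the resulting control limits, can be checked numerically with the standard library. The sketch below assumes two-sided limits at the α/2 and 1−α/2 quantiles of the T-distribution with n−1 degrees of freedom and takes n = 15 from the batch sample size mentioned above. The inverse CDF is obtained by numerical integration and bisection, a simple stand-in for a statistics library; the helper names are hypothetical.

```python
import math
from statistics import NormalDist

def t_pdf(x, df):
    """Density of the T-distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x, df, steps=2000):
    """CDF by Simpson's rule on [0, |x|], using symmetry about zero."""
    if x == 0:
        return 0.5
    h = abs(x) / steps
    total = t_pdf(0.0, df) + t_pdf(abs(x), df)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * t_pdf(k * h, df)
    half = total * h / 3
    return 0.5 + half if x > 0 else 0.5 - half

def t_quantile(p, df):
    """Inverse CDF by bisection on [-50, 50]; adequate for a sketch."""
    low, high = -50.0, 50.0
    for _ in range(60):
        mid = (low + high) / 2
        if t_cdf(mid, df) < p:
            low = mid
        else:
            high = mid
    return (low + high) / 2

# Probability mass outside +/- 3 sigma of a normal distribution.
alpha = 2 * (1 - NormalDist().cdf(3))

n = 15                                    # sample size per group (assumed)
ucl = t_quantile(1 - alpha / 2, n - 1)    # upper control limit
lcl = t_quantile(alpha / 2, n - 1)        # lower control limit
```

Because the T-distribution has heavier tails than the normal distribution, the limits obtained for a small n lie outside ±3, and they approach ±3 as n grows.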
K-control Chart for Monitoring the Standard Deviation of Quality Data
The K-statistic is used to monitor the fluctuation of the standard deviation of quality data and is suitable for small samples. Similar to the process of calculating T-statistics, the mean of quality data is considered unknown.
The mean of the variance of sample quality data for the first r−1 batches of different product types is
The intermediate variable is defined as
The K statistic is
where \(F_{\nu_1, \nu_2}\) is the cumulative distribution function of the F-distribution with first and second degrees of freedom \(\nu_1\) and \(\nu_2\), respectively.
Since the intermediate variable in Eq. (33) follows the F-distribution with first degree of freedom (n−1) and second degree of freedom (n−1)(r−1), the K-statistics are independent of each other and follow a normal distribution, and the control limits of the K-control chart are:
Association Rules Mining for Quality Exceptions Based on Apriori Algorithm
Analysis of Quality Influencing Factors
Quality abnormalities occurring in the assembly process of aerospace products may be caused by human, equipment, or environmental factors in the pre-assembly, assembly, or inspection stage, and the influencing factors of quality abnormalities differ by their type. We use the 5M1E analysis method to classify the influencing factors of quality anomalies into six categories.

(a) Man. The operator or inspector’s knowledge of quality, health condition, technical proficiency, and other factors may cause abnormal quality.
(b) Machine. Precision and maintenance of equipment can affect quality. Because the volume of aerospace products is large, the equipment of each workstation is usually fixed, so machine factors relate to an assembly workstation.
(c) Material. The composition and physical and chemical properties of materials may cause abnormal quality.
(d) Method. The assembly process, fixture selection, and operating procedures can affect quality. Assembly processes vary by product.
(e) Measurement. The measurement method adopted for use with inspection equipment can cause a quality exception.
(f) Environment. Temperature, humidity, lighting, and cleaning conditions in the workplace are possible factors affecting quality abnormalities.
Principle of Apriori Algorithm
An association rules algorithm is used to mine the hidden correlations behind complex data, the classic being the Apriori algorithm, which works as follows. First, all item sets in a transaction dataset whose support is greater than or equal to the minimum support are found; these are the frequent item sets. Then, association rules are generated from the frequent item sets and filtered according to the minimum confidence level to obtain strong association rules. Based on shop-floor assembly data, Apriori was selected to determine the relationships between influencing factors and assembly quality data values and system status.
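The level-wise search for frequent item sets can be sketched as follows. This is a compact illustration of the classic Apriori join and prune steps, not the system's implementation; the factor names in the transactions are hypothetical 5M1E values combined with a quality status, and the minimum support of 0.4 is chosen only for the example.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {item set: support} for every item set whose support
    (share of transactions containing it) is at least min_support."""
    tx = [frozenset(t) for t in transactions]
    n = len(tx)

    def support(itemset):
        return sum(1 for t in tx if itemset <= t) / n

    current = {frozenset([item]) for t in tx for item in t}
    frequent = {}
    k = 1
    while current:
        # Keep the candidates of size k that meet the minimum support.
        level = {}
        for candidate in current:
            s = support(candidate)
            if s >= min_support:
                level[candidate] = s
        frequent.update(level)
        k += 1
        # Join step: merge frequent sets into candidates of size k.
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every subset of size k-1 must itself be frequent.
        current = {c for c in joined
                   if all(frozenset(s) in level for s in combinations(c, k - 1))}
    return frequent

# Hypothetical transactions: influencing factors plus the quality status.
transactions = [
    {"operator_B", "humidity_high", "offset_abnormal"},
    {"operator_A", "humidity_low", "offset_normal"},
    {"operator_B", "humidity_high", "offset_abnormal"},
    {"operator_A", "humidity_high", "offset_abnormal"},
    {"operator_B", "humidity_low", "offset_normal"},
]
frequent = apriori(transactions, min_support=0.4)
```

The prune step is what distinguishes Apriori from brute-force enumeration: an item set can only be frequent if all of its subsets are frequent, so most candidates are discarded before their support is counted.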
An association rule is evaluated by support, confidence, and lift. Support is the probability that item sets X and Y occur together in a transaction set, i.e., the ratio of the number of transactions containing both X and Y to the total number of transactions N. It describes the universality of an association rule and is calculated as
\(\text{Support}(X\to Y)=P(X\cup Y)=\dfrac{\text{count}(X\cup Y)}{N}.\)
Confidence is the probability that Y occurs when X occurs, i.e., the ratio of the number of transactions containing both X and Y to the number of transactions containing X. It describes the authenticity of an association rule and is calculated as
\(\text{Confidence}(X\to Y)=P(Y\mid X)=\dfrac{\text{count}(X\cup Y)}{\text{count}(X)}.\)
Lift describes how much the occurrence of X changes the probability that Y occurs, i.e., the ratio of the confidence of \(X\to Y\) to the support of Y,
\(\text{Lift}(X\to Y)=\dfrac{\text{Confidence}(X\to Y)}{\text{Support}(Y)}=\dfrac{P(Y\mid X)}{P(Y)}.\)
If the lift is 1, there is no correlation between the two events. If it is greater than 1, the occurrence of X makes Y more likely; if it is less than 1, X and Y are negatively correlated, i.e., the occurrence of X makes Y less likely.
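The three measures can be checked on a small example. The sketch below assumes the standard definitions (lift taken as confidence divided by the support of the consequent); the function name `rule_metrics` and the toy records are hypothetical.

```python
def rule_metrics(transactions, lhs, rhs):
    """Return (support, confidence, lift) of the rule lhs -> rhs."""
    lhs, rhs = frozenset(lhs), frozenset(rhs)
    n = len(transactions)
    n_lhs = sum(lhs <= set(t) for t in transactions)
    n_rhs = sum(rhs <= set(t) for t in transactions)
    n_both = sum((lhs | rhs) <= set(t) for t in transactions)
    support = n_both / n
    confidence = n_both / n_lhs          # undefined if lhs never occurs
    lift = confidence / (n_rhs / n)      # undefined if rhs never occurs
    return support, confidence, lift

# Hypothetical coded records; "AN" marks an abnormal centroid offset.
records = [
    {"3MBE", "1OPA", "AN"},
    {"3MBE", "1OPB", "AN"},
    {"3MBE", "1OPA", "AN"},
    {"3MBD", "1OPA"},
    {"3MBD", "1OPB"},
]
support, confidence, lift = rule_metrics(records, {"3MBE"}, {"AN"})
```

In this toy data every record containing `3MBE` is abnormal, so the rule has support 0.6, confidence 1.0, and a lift of about 1.67: knowing the antecedent raises the probability of the consequent from 0.6 to 1.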
Case Study and System Implementation
Numerical Prediction Results of Quality Data
Data Selection
Taking the assembly process of an aerospace product as an example, the centroid offset data of cabin A of model S1 are collected to verify the proposed key technologies and method. Cabin A of model S1 has completed three batches of assembly since the last change in the production environment. Each batch contains 15 products, and the 15th product of the fourth batch is currently being assembled. It is assumed that the assembly time of each product is the same. The allowable error of the centroid offset of cabin A is 15 mm. The collected centroid offset data of cabin A of the fourth batch are shown in Table 2.
From these data, the 13 centroid offset data items A4-1 to A4-13 are used as raw data, A4-14 as validation data, and A4-15 as the item to be predicted. The raw data form the original series \({x}^{(0)}\) of centroid offsets.
The scale ratio (level ratio) between adjacent items of the original series \({{x}}^{({0})}\) for the centroid offset is calculated.
According to the Grey prediction model, the admissible coverage interval of the ratio is determined by the length of the series.
Each item of the ratio series falls within the admissible coverage interval, which indicates that the GM(1,1) model is applicable to the original series of centroid offsets.
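The applicability check can be sketched as follows, assuming the usual GM(1,1) conventions: the ratio of adjacent items \(x^{(0)}(k-1)/x^{(0)}(k)\) and the admissible interval \((e^{-2/(n+1)}, e^{2/(n+1)})\) for a series of length n. The 13 values below are hypothetical placeholders, not the data of Table 2.

```python
import math

def level_ratio_check(x0):
    """Return (ratios, (low, high), ok) for the GM(1,1) applicability test."""
    n = len(x0)
    ratios = [x0[k - 1] / x0[k] for k in range(1, n)]
    low, high = math.exp(-2 / (n + 1)), math.exp(2 / (n + 1))
    ok = all(low < r < high for r in ratios)
    return ratios, (low, high), ok

# Hypothetical centroid offsets in mm (13 items, like A4-1 to A4-13).
series = [12.1, 12.4, 12.9, 13.0, 13.4, 13.9, 14.1,
          14.6, 14.8, 15.3, 15.5, 15.9, 16.2]
ratios, (low, high), ok = level_ratio_check(series)
```

For 13 items the interval evaluates to roughly (0.867, 1.154). If any ratio fell outside it, the series would normally be transformed (for example by adding a constant shift) before fitting GM(1,1).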
Establishment of Grey Forecasting Model
The original sequence of the centroid offset is accumulated to weaken the volatility and randomness that may exist in it, which gives the accumulated sequence \({x}^{(1)}\) of centroid offsets.
According to the Grey model, the accumulated sequence of centroid offsets is processed to obtain the matrix \(\varvec{B}\) and the constant vector \(\varvec{{y}}_{{n}}\).
The Grey parameter vector \(\hat{ \varvec{a}}\) is then calculated by least squares as \(\hat{\varvec{a}} = (\varvec{B}^{\text{T}}\varvec{B})^{-1}\varvec{B}^{\text{T}}\varvec{y}_{n}\).
Substituting \(\hat{ \varvec{a}}\) into the prediction model of the centroid offset gives the prediction function \({\hat{x}}^{(1)}(t)\).
Expressions \({\hat{x}}^{(1)}(t+1)\) and \({\hat{{x}}}^{({1})}({{t}})\) are discretized and restored to the original sequence of centroid offsets by differencing, \({\hat{x}}^{(0)}(t+1) = {\hat{x}}^{(1)}(t+1) - {\hat{x}}^{(1)}(t)\).
This yields the predicted series of centroid offsets.
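The modelling steps above (accumulation, least-squares estimation of the Grey parameters, and restoration by differencing) can be sketched in plain Python. This is a generic GM(1,1) sketch under the standard textbook formulation, with parameters named `a` (development coefficient) and `b` (Grey input); the function names and the short input series are hypothetical and do not reproduce the paper's data.

```python
import math

def gm11_fit(x0):
    """Estimate the GM(1,1) parameters (a, b) by least squares."""
    n = len(x0)
    # Accumulated sequence x1(k) = x0(1) + ... + x0(k).
    x1, total = [], 0.0
    for v in x0:
        total += v
        x1.append(total)
    # Background values z(k) = 0.5 * (x1(k) + x1(k-1)), k = 2..n.
    z = [0.5 * (x1[k] + x1[k - 1]) for k in range(1, n)]
    y = x0[1:]
    # Rows of B are (-z(k), 1); solve the 2x2 normal equations directly.
    m = n - 1
    sz, szz = sum(z), sum(v * v for v in z)
    sy, szy = sum(y), sum(zi * yi for zi, yi in zip(z, y))
    det = m * szz - sz * sz
    a = (sz * sy - m * szy) / det
    b = (szz * sy - sz * szy) / det
    return a, b

def gm11_predict(x0, a, b, steps=1):
    """Return the fitted series for x0 followed by `steps` forecasts."""
    def x1_hat(t):
        # Time response of the accumulated series; t = 0 is the first item.
        return (x0[0] - b / a) * math.exp(-a * t) + b / a
    n = len(x0)
    # Restore to the original series by differencing the accumulated one.
    return [x0[0]] + [x1_hat(t) - x1_hat(t - 1) for t in range(1, n + steps)]

# Hypothetical centroid offsets in mm.
x0 = [10.0, 10.4, 10.9, 11.3, 11.8]
a, b = gm11_fit(x0)
fitted = gm11_predict(x0, a, b, steps=1)   # last element is the forecast
```

For a rising series the development coefficient `a` comes out negative, and the last element of `fitted` is the one-step-ahead forecast. The sketch does not guard against `a` being numerically zero, which would need a separate branch.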
Grey Prediction Model Test
(1) Residual Test
The predicted and original series of centroid offsets are put together, and the residual and relative error of each data item are calculated to obtain the fitting results in Table 3.
The fitting curve of centroid offsets drawn from the above fitting results is shown in Figure 4.
The calculated average relative error is 4.16%, and the fitting accuracy is 95.84%. Because the fitting accuracy is more than 80%, the forecast result passes the residual test.
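The residual test amounts to a few lines of arithmetic. The sketch below assumes that the residual is taken as actual minus predicted and that fitting accuracy is one minus the average relative error, which matches the 4.16% and 95.84% figures quoted above; the function name and sample values are hypothetical.

```python
def residual_test(actual, predicted):
    """Return (residuals, relative errors, mean relative error, accuracy)."""
    residuals = [a - p for a, p in zip(actual, predicted)]
    rel_errors = [abs(r) / abs(a) for r, a in zip(residuals, actual)]
    mean_rel = sum(rel_errors) / len(rel_errors)
    return residuals, rel_errors, mean_rel, 1.0 - mean_rel

# Hypothetical values, not the data of Table 3.
residuals, rel_errors, mean_rel, accuracy = residual_test(
    [10.0, 20.0], [9.0, 21.0])
```

Here the relative errors are 10% and 5%, so the average relative error is 7.5% and the fitting accuracy is 92.5%.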
(2) Correlation Degree Test
According to the above fitting curve, \(\min _{i = 1}^{14} \left| {{\hat y}^{(0)}}(i) - {y^{(0)}}(i) \right| = 0\), \(\max _{i = 1}^{14} \left| {{\hat y}^{(0)}}(i) - {y^{(0)}}(i) \right| = 1.314\), and \(\rho = 0.5\). The corresponding correlation coefficients are then calculated according to Eq. (18).
The correlation degree between the two sets is 0.794, which is greater than 0.6; hence, the prediction result passes the correlation degree test.
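A sketch of the correlation degree calculation follows. Eq. (18) is not reproduced in this section, so the code assumes the standard Grey relational coefficient \((\Delta_{\min} + \rho\Delta_{\max}) / (\Delta(i) + \rho\Delta_{\max})\) with the degree taken as the mean of the coefficients; the function name and sample values are hypothetical.

```python
def grey_relational_degree(actual, predicted, rho=0.5):
    """Return (coefficients, degree) under the assumed standard formula."""
    deltas = [abs(p - a) for a, p in zip(actual, predicted)]
    d_min, d_max = min(deltas), max(deltas)
    # Note: d_max == 0 (a perfect fit) would need special handling.
    coeffs = [(d_min + rho * d_max) / (d + rho * d_max) for d in deltas]
    return coeffs, sum(coeffs) / len(coeffs)

# Hypothetical values: absolute differences are 0, 0.5, and 1.
coeffs, degree = grey_relational_degree([1.0, 2.0, 3.0], [1.0, 2.5, 2.0])
```

With these differences the coefficients are 1, 0.5, and 1/3, giving a degree of about 0.611, which would just pass the 0.6 threshold used above.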
(3) Posterior Deviation Test
The column of residuals is extracted from the test results.
We calculate that the variance of the original series \({y}^{(0)}(t)\) of the centroid offset is \({s}_{1}^{2}=8.284\), the variance of the residuals is \({s}_{2}^{2}=0.165\), and the posterior variance ratio is \(C={s}_{2}/{s}_{1}=0.141\). According to the definition, the small-error probability is P = 1.
Referring to Table 1, the posterior variance ratio and small-error probability calculated above are in the range of Level I, but the average relative error is in the range of Level II.
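The posterior deviation test can be sketched as below. It assumes the common definitions \(C = s_2/s_1\) and \(P = P(|e_i - \bar{e}| < 0.6745\, s_1)\) with population standard deviations; whether the paper uses population or sample variance is not stated in this section, so that choice is an assumption, as are the function name and sample values.

```python
import math
import statistics

def posterior_test(actual, residuals):
    """Return (C, P): posterior variance ratio and small-error probability."""
    s1 = statistics.pstdev(actual)      # spread of the original series
    s2 = statistics.pstdev(residuals)   # spread of the residuals
    c = s2 / s1
    e_bar = statistics.mean(residuals)
    hits = sum(abs(e - e_bar) < 0.6745 * s1 for e in residuals)
    return c, hits / len(residuals)

# Hypothetical values.
c, p = posterior_test([10.0, 12.0, 14.0, 16.0, 18.0],
                      [0.1, -0.1, 0.2, -0.2, 0.0])

# Consistency check on the variances quoted in the text.
c_paper = math.sqrt(0.165) / math.sqrt(8.284)
```

The last line recovers the reported ratio: the square root of 0.165 divided by the square root of 8.284 is approximately 0.141.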
Case of Grey-Markov Model
Based on the Grey prediction results, it can be found that the minimum residual error of the predicted and actual values is −0.354, and the maximum is 1.314, i.e., the values of residual error are within the range [−0.36, 1.32].
According to the distribution of residual error values, the above large interval is divided into five subintervals: [−0.36, −0.24], [−0.24, −0.12], [−0.12, 0], [0, 0.12], [0.12, 1.32]. Each interval corresponds to a residual error state of the centroid offset, as shown in Table 4.
It is assumed that the predicted result of the centroid offset transfers among States I to V with certain probabilities. The state transition diagram of the predicted results is shown in Figure 5. If the current forecasting result is in State I, then the next forecasting result is in State I with probability \(Z_{11}\), State II with probability \(Z_{12}\), State III with probability \(Z_{13}\), State IV with probability \(Z_{14}\), and State V with probability \(Z_{15}\).
From the column of residual error values of the centroid offset, State I occurs three times, transferring twice to State I and once to State II. State IV occurs five times, transferring once each to States I, III, IV, and V, and once to the final state; at that moment the residual error of the predicted value lies to the left of the State I interval, so we treat it as State I. The state transitions of the residual error are shown in Table 5.
It is assumed that the interval between two consecutive measurements of the cabin centroid offset is one assembly cycle, which is taken as the step size of the Markov model, i.e., the time required for one state transition. According to the statistics above, the state transition matrix of the residual error is obtained.
According to the states of residual error, the residual error of the prediction value for A4-13 belongs to State IV, so the initial state vector is (0, 0, 0, 1, 0). According to the state transition matrix of Eq. (50), the state vector after a one-step transition is (2/5, 0, 1/5, 1/5, 1/5). State I has the highest probability, so the residual error of the prediction value of A4-14 is taken to be in State I.
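Counting transitions and normalizing each row is all that is needed to build the transition matrix. In the sketch below the state sequence is made up: it is chosen only so that the rows for States I and IV match the counts quoted in the text, while the other rows are not taken from Table 5. States I to V are indexed 0 to 4, and the function names are hypothetical.

```python
from fractions import Fraction

def transition_matrix(states, n_states):
    """Row-normalized transition counts for a sequence of state indices."""
    counts = [[0] * n_states for _ in range(n_states)]
    for cur, nxt in zip(states, states[1:]):
        counts[cur][nxt] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        matrix.append([Fraction(c, total) if total else Fraction(0)
                       for c in row])
    return matrix

def step(dist, matrix):
    """One Markov step: row vector times transition matrix."""
    size = len(matrix)
    return [sum(dist[i] * matrix[i][j] for i in range(size))
            for j in range(size)]

# Hypothetical sequence of 13 residual states plus the final state.
sequence = [2, 2, 3, 3, 4, 3, 2, 3, 0, 0, 0, 1, 3, 0]
P = transition_matrix(sequence, 5)

# Start in State IV (index 3), as for A4-13, and take one step.
one_step = step([0, 0, 0, 1, 0], P)      # 2/5, 0, 1/5, 1/5, 1/5
most_likely = max(range(5), key=lambda j: one_step[j])   # index 0, State I
```

Exact fractions are used so that the one-step vector can be compared directly with the (2/5, 0, 1/5, 1/5, 1/5) quoted above.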
The predicted centroid offset value of A4-14 from the Grey model is then corrected according to this state.
The results are compared with those of the Grey prediction model, as shown in Table 6.
Unlike the Grey prediction model, the Grey-Markov prediction model takes the fluctuation of data into account, and it reduces the relative error of the centroid offset prediction from 3.44% to 1.38%, which improves the prediction accuracy.
Similarly, the state vector after a two-step transition from the state of A4-13, (0, 0, 0, 1, 0), is calculated as (0.347, 0.133, 0.107, 0.373, 0.040), so the residual error of the A4-15 prediction is taken to be in State IV.
The Grey forecast value of A4-15 is corrected in the same way.
Therefore, the model forecasts that the centroid offset of cabin A of the 15th product will be 16.890 mm, which exceeds the allowable error of 15 mm, so a quality abnormality is expected to occur soon.
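The two-step prediction and the correction can be sketched as follows. Several things here are assumptions: only the rows for States I and IV of the matrix come from the text, while the rows for States II, III, and V are placeholders chosen so that the two-step vector reproduces the quoted figures; the correction adds the midpoint of the state interval to the Grey forecast, a common Grey-Markov convention that may differ from the paper's formula; and the Grey forecast of 16.0 is a made-up value.

```python
from fractions import Fraction as F

P = [
    [F(2, 3), F(1, 3), F(0), F(0), F(0)],        # State I   (from the text)
    [F(0), F(0), F(0), F(1), F(0)],              # State II  (placeholder)
    [F(0), F(0), F(1, 3), F(2, 3), F(0)],        # State III (placeholder)
    [F(2, 5), F(0), F(1, 5), F(1, 5), F(1, 5)],  # State IV  (from the text)
    [F(0), F(0), F(0), F(1), F(0)],              # State V   (placeholder)
]

# Residual error intervals of States I to V, as listed in the text.
INTERVALS = [(-0.36, -0.24), (-0.24, -0.12), (-0.12, 0.0),
             (0.0, 0.12), (0.12, 1.32)]

def n_step(dist, matrix, n):
    """Propagate a state distribution through n Markov steps."""
    size = len(matrix)
    for _ in range(n):
        dist = [sum(dist[i] * matrix[i][j] for i in range(size))
                for j in range(size)]
    return dist

def corrected(grey_value, state):
    """Assumed correction: Grey forecast plus the interval midpoint."""
    low, high = INTERVALS[state]
    return grey_value + (low + high) / 2

two_step = n_step([0, 0, 0, 1, 0], P, 2)
rounded = [round(float(p), 3) for p in two_step]
state = max(range(5), key=lambda j: two_step[j])   # index 3, State IV
value = corrected(16.0, state)                     # hypothetical forecast
```

The rounded two-step vector equals (0.347, 0.133, 0.107, 0.373, 0.04), in agreement with the figures above, and State IV is the most probable state.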
Assembly System Status Prediction Results
Raw Data Preprocessing
Based on 14 centroid offset data items of the fourth batch of cabin A of model S1 in Table 2 and the predicted data of product 15, we can obtain 15 data items for the fourth batch of cabin A. With the centroid offset data of the first three batches of cabin A of model S1, 60 pieces of centroid offset data are obtained, as shown in Table 7, which are used as part of the sample data to forecast the status of the assembly system by means of statistical process control.
Drawing on the idea of group technology, we collect the data of cabin A of model S2 similarly to those of cabin A of model S1. The other half of the sample data, the centroid offset data of four batches of cabin A of model S2, are shown in Table 8.
All the sample data required for statistical process control are thus obtained, so T-K control charts can be used to monitor the mean and standard deviation of the centroid data and to evaluate the stability of the assembly system.
Case of T Control Chart
In the selected case, the allowable error range of the centroid offset of cabin A is known but the mean value is not, which satisfies the condition for using the T statistic formula in Section 4.2.1. According to the original data in Tables 7 and 8, we first calculate the mean and standard deviation of the samples in group i and the mean \(\overline{\overline {{X_j}}}\) of the first \(r-1\) batches of samples, and then the T statistics, as shown in Table 9.
According to Eq. (30), we set n to 15, and the upper control limit of the T control chart is 3.636, the lower control limit is −3.636, and the center line is 0. We draw the T control chart of the centroid offset of cabin A, as shown in Figure 6, from which it can be found that the mean of the centroid offset for cabin A of models S1 and S2 is within the allowable range.
Case of K Control Chart
As with the T statistic, only the allowable error range of the centroid offset is known and the mean is not, which satisfies the condition for calculating the K statistic in Section 4.2.2. Based on the original data in Tables 7 and 8, we calculate the mean \(\overline {S_j^2}\) of the variances of the first \(r-1\) batches of sample data for each product type, and then the intermediate variable \(\lambda _{i,j}^{(r)}\) and the K statistic, as shown in Table 10.
According to Table 10 and Eq. (33), the K control chart of the centroid offset is shown in Figure 7.
The K control chart shows that the standard deviation of the centroid offset is within the allowable range. Combined with the results of the T control chart in Section 5.2.2, the assembly system can be considered to be in a controlled state according to the sample data. Since the centroid offset of the 15th product of the fourth batch is a predicted value, the results of the T-K control chart indicate that the assembly system will remain in a controlled state at the next moment.
Mining Association Rules of Influence Factors for Abnormal Quality Data
It is assumed that the assembly of cabin A in this case comprises three processes, that all materials required in each process come from the same batch, and that the equipment at each station is fixed. According to the six categories of quality influencing factors mentioned above, the fishbone diagram of the influencing factors for the centroid offset of cabin A is shown in Figure 8, which includes 28 factors.
Although 28 factors are identified from the analysis, how these are combined to cause an abnormal centroid offset still must be determined. We take 150 pieces of quality data collected from the assembly process of a certain cabin A as an example, as shown in Additional file 1: Appendix 1.
To keep the centroid offset within the qualified range as much as possible, the ideal error range of centroid offset is determined as within 12 mm after the influencing factors are analyzed. Over 12 mm is considered to be abnormal, which requires attention. The data encoding is shown in Figure 9.
Each encoding consists of three bytes. Byte 1 represents the process to which it belongs, such as processes 1–3. Byte 2 represents the attributes of influencing factors, such as operators and inspectors. Byte 3 represents the instantiation of influencing factors, such as operator A and inspector B. It is worth noting that when bytes 2 and 3 of two encodings are the same, the corresponding objects are the same regardless of whether byte 1 is the same. For example, 1OPE and 2OPE both represent operator E, who performs process 1 in the first case and process 2 in the second. A total of 108 codes appear in Additional file 1: Appendix 1, and their meanings are given in Additional file 2: Appendix 2.
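The encoding rule can be illustrated with a small parser. The field widths are an assumption inferred from the single example `1OPE` (one character for the process, two for the attribute, one for the instance); the actual layout is defined by Figure 9 and Additional file 2, and codes such as the quality labels may not follow it. The names `FactorCode`, `parse_code`, and `same_object` are hypothetical.

```python
from typing import NamedTuple

class FactorCode(NamedTuple):
    process: str     # field 1: process number
    attribute: str   # field 2: kind of influencing factor, e.g. "OP"
    instance: str    # field 3: concrete instance, e.g. operator "E"

def parse_code(code):
    """Split a code such as '1OPE' under the assumed 1-2-1 layout."""
    if len(code) != 4:
        raise ValueError(f"unexpected code length: {code!r}")
    return FactorCode(code[0], code[1:3], code[3])

def same_object(code_a, code_b):
    """True if two codes name the same object, ignoring the process field."""
    a, b = parse_code(code_a), parse_code(code_b)
    return (a.attribute, a.instance) == (b.attribute, b.instance)

first = parse_code("1OPE")
shared = same_object("1OPE", "2OPE")      # operator E in processes 1 and 2
different = same_object("1OPE", "1OPA")   # operators E and A
```

Comparing only the second and third fields mirrors the rule stated above: `1OPE` and `2OPE` refer to the same operator even though the process differs.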
According to the principle of the Apriori algorithm, association rules are mined from the 150 pieces of assembly quality data in Additional file 1: Appendix 1 using R. We set the minimum support to 0.16 and the minimum length of an item set to 2, which yields 7250 frequent item sets, from which 15 strong association rules with the right-hand-side (RHS) item “AN” are mined, as shown in Figure 10. The depth of color indicates the lift, and the size of each circle indicates the support.
Strong association rules with RHS item “AN” are arranged in descending order of lift in Table 11.
The lifts of the top 14 rules are greater than 3, which means that these rules are effective. The probability of an abnormal centroid offset is relatively high when the material batch of process 3 is E, and also when the material batch of process 1 is A, the product model is S1, the material batch of process 2 is C, or the pressure in the cabin is A. Managers can try to avoid the combinations of shop-floor resources described by these rules.
Mining Association Rules of Influence Factors for Assembly System Status
Because no strong association rule with RHS item “NCT” is found in the full data set, the 30 data items for which the assembly system is uncontrolled are extracted from the 150 collected pieces of data. To mine association rules from these data, we set the minimum support to 0.66 and the minimum length of an item set to 2, which yields 7250 frequent item sets. From these, 31 association rules with RHS item “NCT” are mined, as shown in Figure 11.
System Implementation
The proposed quality management method is realized through data value prediction, assembly system status prediction, and association rule mining based on the constructed DT model, providing early warning for the assembly shop-floor and helping to avoid future quality abnormalities. As the comparison in Table 6 shows, the Grey-Markov model accounts for the fluctuation of data, unlike the general Grey prediction model, and thus improves the accuracy of data prediction. A DT-based quality management system for the assembly process of aerospace products is developed, which monitors quality information and quickly traces quality problems, thereby reducing the frequency of quality problems on the shop-floor and improving the efficiency with which they are handled. The system has been applied in an aerospace enterprise, where the values and trends of more than seven types of key quality indicators on a final assembly shop-floor are monitored and predicted for different products.
The interface for monitoring assembly shop-floor quality based on the DT is shown in Figure 12. The lower-left quadrant enables monitoring of the operating status of the equipment, the bill of arrived materials, and the values of key quality indicators of the current station by clicking the corresponding button. The upper-left quadrant monitors quality data values and their trends for different indicators. The lower-right quadrant is a pop-up window for early warning if an exception occurs or may occur. The upper-right quadrant shows prediction results such as the utilization rate of each station and the maximum load area.
The interface for monitoring all of the assembly quality data on the assembly shop-floor is shown in Figure 13, including the prediction results at the next moment based on the Grey-Markov model. The left side monitors assembly quality data for the different stations on the shop-floor, including the loading preparation area, empty cylinder treatment area, and docking area. The right side monitors the quality data of a specific station. When a warning is raised, it appears in red font in the shop-floor quality data view on the left and on a red background in the station quality data view on the right. Quality abnormality records are generated automatically.
The interface for monitoring assembly shop-floor status is shown in Figure 14. The left side is an assembly quality data list for batches of aerospace product A. In the list, the values of key quality data of historical batches can be viewed, along with the collected real-time data and predicted data of the current batch. Users can also monitor the mean, standard deviation, and T and K statistics of the quality data in each batch, and can view the T-K statistical diagram on the right side to observe whether the assembly system is currently controlled. If the T-K statistics of a batch exceed the upper or lower limits, quality abnormality records are generated automatically.
When dealing with quality problems, a craftsman or inspector can switch to the interface for managing strong association rules for quality exceptions, as shown in Figure 15. The left side shows the product BOM, through which quality data and rules are managed. The upper right is a search bar, where the user selects the product model, quality data, support, confidence, and other conditions to search for rules. The matching rules are listed in the lower right of the interface and provide reference methods for handling quality problems. The list of strong association rules contains the combinations of shop-floor resources that lead to abnormal quality data, which are obtained through the proposed approach and imported into the knowledge base in Excel format.
Conclusions and Future Work
The assembly process of aerospace products is characterized by single or small-batch production and many exceptions. How to control assembly quality and improve production efficiency has long been an issue in engineering applications. By introducing DT technology, the advantages of digital space, including low cost, high efficiency, and predictability, can be fully utilized to improve the management and control of quality in the assembly process of aerospace products. The main contributions of this article include the following:

(1)
To obtain more accurate predictions from the small samples of aerospace product quality data, the fluctuation of data was considered and a Grey-Markov model-based quality data prediction algorithm was presented. Moreover, group technology was used to increase the sample size of quality data, and a T-K control chart was applied to monitor the mean and standard deviation of quality data, realizing the prediction of assembly system status and avoiding quality problems caused by an uncontrolled assembly system.

(2)
To improve the efficiency of quality problem tracing and handling in the assembly process of aerospace products, an Apriori algorithm-based method for tracing the influencing factors of quality anomalies was proposed. Strong association rules related to quality data anomalies and uncontrolled assembly systems were mined to trace the influencing factors of quality anomalies, which can assist related personnel in quickly locating the causes of abnormalities and improve the efficiency of quality control.

(3)
A DT-based quality management system for the assembly process of aerospace products was developed and has been applied in an aerospace enterprise, which promotes the application of DT in assembly quality management.
This paper explores the application of DT to improve quality management and problem tracing capability for the assembly process of aerospace products. Future research will focus on two areas: ① studying in depth the feedback link in quality control for the assembly process of aerospace products, so as to make the feedback process real-time, more automatic, and more intelligent; ② using text mining algorithms to obtain more knowledge about handling quality problems, so as to assist in the rapid processing of and decision-making on quality problems.
Acknowledgements
Not applicable.
Funding
Supported by the National Key Research and Development Program of China (Grant No. 2020YFB1710300), the National Natural Science Foundation of China (Grant No. 52005042), the National Defense Fundamental Research Foundation of China (Grant No. JCKY2020203B039), the Equipment Pre-research Foundation of China (Grant No. 80923010101), and the Beijing Institute of Technology Research Fund Program for Young Scholars.
Author information
Contributions
JL was in charge of the whole trial; CZ wrote the manuscript; ZL, HM, SZ, and YW assisted with sampling and laboratory analyses. All authors read and approved the final manuscript.
Authors’ Information
Cunbo Zhuang, born in 1991, is currently an associate research fellow at Laboratory of Digital Manufacturing, School of Mechanical Engineering, Beijing Institute of Technology, China. He received his PhD degree from Beijing Institute of Technology, China, in 2018. His research interests include digital twin, manufacturing execution system, and shop scheduling.
Ziwen Liu, born in 1999, is currently a master's degree candidate at Beijing Institute of Technology, China.
Jianhua Liu, born in 1977, is currently a professor at Laboratory of Digital Manufacturing, School of Mechanical Engineering, Beijing Institute of Technology, China.
Hailong Ma, born in 1983, is currently an engineer at Shanghai Institute of Spacecraft Equipment, China.
Sikuan Zhai, born in 1997, is currently a master's degree candidate at Beijing Institute of Technology, China.
Ying Wu, born in 1994, received her master's degree from Beijing Institute of Technology, China in 2020.
Ethics declarations
Competing Interests
The authors declare no competing financial interests.
Supplementary Information
Additional file 1. Assembly quality data of a certain cabin A.
Additional file 2. Query table of meaning for the assembly data coding.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Zhuang, C., Liu, Z., Liu, J. et al. Digital Twin-based Quality Management Method for the Assembly Process of Aerospace Products with the Grey-Markov Model and Apriori Algorithm. Chin. J. Mech. Eng. 35, 105 (2022). https://doi.org/10.1186/s10033-022-00763-8
Keywords
 Quality management
 Digital twin
 Assembly process
 Aerospace product
 Grey Markov model
 Apriori algorithm