Lifetime and Aging Degradation Prognostics for Lithium-ion Battery Packs Based on a Cell to Pack Method

Aging diagnosis of batteries is essential to ensure that the energy storage systems operate within a safe region. This paper proposes a novel cell to pack health and lifetime prognostics method based on the combination of transferred deep learning and Gaussian process regression. General health indicators are extracted from the partial discharge process. The sequential degradation model of the health indicator is developed based on a deep learning framework and is migrated for the battery pack degradation prediction. The future degraded capacities of both battery pack and each battery cell are probabilistically predicted to provide a comprehensive lifetime prognostic. Besides, only a few separate battery cells in the source domain and early data of battery packs in the target domain are needed for model construction. Experimental results show that the lifetime prediction errors are less than 25 cycles for the battery pack, even with only 50 cycles for model fine-tuning, which can save about 90% time for the aging experiment. Thus, it largely reduces the time and labor for battery pack investigation. The predicted capacity trends of the battery cells connected in the battery pack accurately reflect the actual degradation of each battery cell, which can reveal the weakest cell for maintenance in advance.


Introduction
Lithium-ion batteries have been widely used as energy storage systems in electrified applications, such as electrified transportation, smart grids, and consumer electronics, owing to their high energy and power density and long life span [1]. However, as electrochemical devices, lithium-ion batteries suffer from gradual capacity fade and resistance growth, which together are regarded as battery aging [2]. The health status of the batteries largely determines the safety and reliability of energy storage systems during operation [3]. Therefore, prognostics and health management (PHM) is essential for battery operation, and accurate health prognostics are the key to guiding predictive maintenance and cascade utilization [4]. Predictive maintenance is an important function of the battery management system: it diagnoses potential dangers in advance, which ensures safe operation and extends the lifetime. Battery health prognostics usually refer to the estimation of the state of health (SOH) and the prediction of the remaining useful life (RUL). SOH is usually defined as the ratio of the present capacity to the nominal capacity, or of the present resistance to the resistance of a fresh battery cell. RUL is defined as the number of cycles remaining before the end of the service life (usually when the capacity-defined SOH reaches 70% or 80% [5]). SOH reflects the battery health status at the present cycle, while RUL reflects the remaining service life of the battery until end of life (EOL). In recent years, the rapid development of materials has improved the energy density of batteries, giving them larger capacity [6]. As a result, aging experiments take a great deal of time and labor. SOH prognostics can be seen as state estimation from measured parameters, whereas RUL prediction is more challenging because it requires predicting future degradation.
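The two definitions above can be made concrete with a minimal sketch. The function names, the linear fade trajectory, and the 80% EOL threshold used here are illustrative assumptions, not values taken from the experiments in this paper.

```python
from typing import Optional, Sequence


def state_of_health(present_capacity_ah: float, nominal_capacity_ah: float) -> float:
    """Capacity-defined SOH: ratio of present capacity to nominal capacity."""
    return present_capacity_ah / nominal_capacity_ah


def remaining_useful_life(capacities_ah: Sequence[float], current_cycle: int,
                          nominal_capacity_ah: float,
                          eol_soh: float = 0.8) -> Optional[int]:
    """Cycles from current_cycle until SOH first drops to or below eol_soh.

    capacities_ah[i] is the capacity at cycle i. Returns None when EOL is
    not reached within the available trajectory.
    """
    for cycle in range(current_cycle, len(capacities_ah)):
        if state_of_health(capacities_ah[cycle], nominal_capacity_ah) <= eol_soh:
            return cycle - current_cycle
    return None


# Hypothetical trajectory: 100 Ah nominal capacity, losing 0.5 Ah per cycle.
trajectory = [100.0 - 0.5 * k for k in range(100)]
soh_now = state_of_health(trajectory[10], 100.0)
rul_now = remaining_useful_life(trajectory, current_cycle=10, nominal_capacity_ah=100.0)
rul_unknown = remaining_useful_life(trajectory[:20], current_cycle=10,
                                    nominal_capacity_ah=100.0)
```

In practice the future part of the trajectory is not available, which is why RUL has to be predicted rather than read off as in this toy example.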
RUL prediction can not only guide cascade utilization but also shorten battery design and development. Unlike a battery cell, a battery pack consists of many cells connected in series and/or parallel [7], which makes RUL prediction more complicated. In addition, cell inconsistency is one of the most important factors to be considered [8]. This is why RUL prediction for battery packs is much more difficult than for battery cells. Advanced machine learning-based technologies have been widely used in lithium-ion battery production and management [9]. This paper focuses on lifetime prognostics and degradation prediction for lithium-ion battery packs.
Generally, health prognostics and lifetime prediction for lithium-ion batteries can be divided into model-based, data-driven, and hybrid methods [1]. One type of model-based method relies on empirical or semi-empirical models of the degradation curve under specific aging conditions. Several factors are considered in the empirical expression, such as depth of discharge, state of charge, temperature, and charging/discharging rate [10]. Generally, the fitted expression is updated by a Kalman filter or particle filter with the parameters measured during operation [11], [12]. After fitting, the expression is applied directly to other test batteries for SOH estimation and RUL prediction, interpolating according to their aging conditions. Sometimes, an equivalent circuit model is constructed to identify certain parameters, such as capacity and resistance [13], and an empirical model is then used to describe the degradation of these parameters. However, this method generalizes poorly because it is tied to specific battery types and aging conditions. Also, it gives no insight into the internal chemical changes. Another type of model-based method relies on physics-based aging models, which consider physical changes during operation [14]. Physics-based models can effectively capture the internal chemical reactions of the battery by modelling the main factors that contribute to the degradation [15]. Many researchers argue that solid electrolyte interphase layer growth is the main factor that leads to battery aging [16]. Other mechanisms, including loss of lithium inventory [17], loss of active material [18], surface cracking [19], and electrolyte oxidation at the cathode [20], have also been considered and modelled to reflect the aging status of batteries.
The health status can be estimated by identifying specific parameters, and RUL can be predicted by simulating future data and counting the remaining cycles until the discharge capacity drops below the pre-defined threshold. However, a physics-based model is very difficult to build, all the more so as more chemical reactions are considered, and parameter changes are mostly approximated by empirical expressions, which may lack accuracy and generality.
Data-driven methods learn the mapping between inputs and outputs in a training process and can then be applied directly to health prognostics. Owing to the large amount of data collected by intelligent big data platforms, data-driven methods for PHM have shown promising progress and have developed rapidly in recent years [21]. One way to predict RUL is to extract health indicators (HIs) from experimental data and then develop a prediction model between HIs and RUL [22]. Usually, a simple linear regression model can meet the requirements. This method can provide accurate RUL predictions for one type of battery under specific aging conditions. However, a large amount of experimental data is required, and the degradation pattern is not predicted. In another method, the historical capacity data are split into two parts: the data before the current time and the data after it. A few parameters of the former cycles are regarded as inputs, and the same parameters of the latter cycles are used as outputs. In this way, the historical capacities are divided into inputs and outputs, and the sequential relationship is obtained by training data-driven models [23]. The future capacity can then be predicted and the RUL obtained. The main drawbacks of this method are that it carries no physical meaning and that the model diverges easily. Popular machine learning techniques include the support vector machine, the relevance vector machine, and Gaussian process regression (GPR) [24], [25]. Recently, deep learning methods have also been widely used in battery health prognostics [26]. Another obstacle is the difficulty of measuring the capacity in real applications, because the discharge process is usually incomplete. To solve this issue, researchers have proposed HIs to estimate the capacity first and then used the estimated capacity for regression training [27].
Various HIs, both measured and calculated, have been proposed in published papers and are summarized in Ref. [28]. Among them, incremental capacity (IC) analysis is regarded as a useful way of investigating battery degradation mechanisms [29]. IC analysis converts the voltage plateaus caused by electrode phase transitions into intuitive and identifiable peaks on the IC curve [30]. In addition, the general HIs proposed in Refs. [31], [32] also show strong physical coupling with battery capacity. The future capacities are predicted by iterating the regression model until the capacity reaches the failure threshold [33]. However, because the regression model lacks information on the entire lifetime, it can easily produce poor predictions. Therefore, researchers have used transfer learning (TL) to take advantage of the entire lifetime data available from other batteries [34]. The model pre-trained on the entire lifetime data from other batteries is used as the initial model of the test battery, and some parameters are then fine-tuned using data from the early cycles [35]. In this way, the model training time for the test battery is reduced while the prediction accuracy is improved. Hybrid methods combine model-based and data-driven methods, or different data-driven methods. For example, model parameters (such as internal resistance) can be identified by a model-based method, and their future values can be predicted by a data-driven method [36]. In addition, different machine learning and deep learning techniques can be hybridized to obtain more accurate predictions [37]. How well the different methods are combined largely determines the performance of the prognostics.
In summary, conventional methods follow one of two routes. The first uses only historical data for model training, so the prediction diverges easily for lack of knowledge about the future. The second transfers the capacity degradation model directly, which requires other battery packs with similar degradation; this is time- and labor-consuming and also lacks accuracy, because cell inconsistencies often evolve in different ways during degradation [38]. Also, the regression model of capacity degradation lacks physical meaning. Predicting the future capacity distribution of the battery cells in the pack is also valuable, since it can guide maintenance in advance. Unfortunately, these issues have not been considered in published works.
In this article, a method that combines transferred deep learning (TDL) and GPR is developed for battery pack health prognostics and future degradation prediction. The conventional capacity extrapolation method is reformulated as future degradation prediction based on physically meaningful HIs extracted from daily partial voltage curves. Firstly, general HIs are extracted from the partial discharge curve. Secondly, a usefulness evaluation strategy is used to assess the HIs. Thirdly, the capacity estimation models are built by GPR with a modified kernel, and the HI degradation model is constructed by a long short-term memory (LSTM) neural network. Next, the TDL is designed to transfer the information of the battery cells to the battery pack for lifetime prediction. Finally, experimental data are used for verification. The main advantages and contributions are summarized as follows.
(1) The knowledge from separate battery cells (SBCs) is transferred to realize the prediction of the battery pack, without the need for other battery packs that have similar degradation patterns.
[...] are predicted to reveal the inconsistency evolution and capacity distribution in the aging process, which can clearly reveal the weakest cell for advance maintenance.
(4) More than 85% of the time for the aging experiment can be saved by this method, which is very helpful for the testing and investigation of newly developed battery packs, which conventionally consume too much time and labor.
The remainder of this paper is organized as follows. Section 2 introduces the experimental data and the HI extraction method. Section 3 proposes the lifetime prognostic method. Experimental results are evaluated in Section 4, and the main conclusions are summarized in Section 5.

Experiment and HI Extraction
Generally, aging experiments are conducted through cyclic charging and discharging processes to accelerate battery aging, and the aging data for the verification of prognostics methods can be collected from the experiments. The dataset and HI extraction method are introduced in this section.

Experimental Dataset
Aging experiments are carried out for both battery cells and a battery pack. The aging process consists of constant current charging and constant current discharging with a rest between them. The battery has a LiFePO4 (LFP) cathode and a carbon anode; the nominal capacity is 100 Ah. Seven SBCs are aged at different environmental temperatures and current rates, and the test specifications are listed in Table 1. A battery pack with 16 connected battery cells (CBCs) of the same battery type in series is also used for the aging test. The voltage and temperature of each CBC are measured together with the pack voltage and current. The sampling interval is 10 s for the SBCs and 30 s for the battery pack. The capacity degradation curves of each battery cell and the battery pack are shown in Figure 1. Different cells show different capacity fading patterns even under the same aging condition. Therefore, it is challenging to apply a capacity estimation strategy trained on the data of one battery to other battery cells, and even more difficult to apply it to battery packs. According to Table 1, the SBCs are aged to different final capacities at different aging rates. The current rate (C-rate) is expressed relative to the nominal capacity; for example, 1 C refers to 100 A. In this paper, we use the data collected throughout the entire test instead of only the data from the fresh cell to 20% degradation (i.e., 80% remaining capacity).

General HI Extraction
For data-driven health prognostics, the extraction of HIs is a key step that determines the accuracy and reliability of the prediction. In this section, HI extraction based on the experimental data is introduced. HIs that have physical meaning indicate the aging status more convincingly. IC analysis is an accepted method that reflects the electrode phase transitions of the battery. Here, the data of Cell#2, which has the longest lifetime, are used for demonstration. The IC curves of Cell#2 over all cycles are shown in Figure 2(b). The IC curve is an effective way to convert the plateaus on the Q (charge amount)~V curve (shown in Figure 2(a)) into identifiable peaks [39]. It can be seen from Figure 2(b) that the main peak decreases gradually with aging cycles. Therefore, the peak value of the IC curve during the discharging process is selected as one HI. The IC is obtained by the ΔQ/ΔV calculation. Using the ΔQ sequence in another way, by calculating its standard deviation, another general HI denoted as std(Q(V)) or stdQ can be extracted [31]. The ΔQ sequence is obtained from 50 segments spanning the voltage range from 3.15 V to 3.3 V, which is similar to the voltage range of the IC peak. This allows online HI extraction from partial voltage curves. It can be seen from Figure 2(c) that the ΔQ sequence gets lower during aging, which makes stdQ decrease with the aging status. These two HIs are selected as the learning information for deep learning, since they have physically meaningful correlations with the aging status. It is worth mentioning that the capacity is calculated by ampere-hour counting during HI extraction from the experimental data. In real applications, however, many advanced estimation methods can be adopted for a more accurate Q calculation [7], [40]. The HIs for each CBC are extracted from the voltage of the CBC and the capacity of the battery pack to illustrate the cell inconsistencies.
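The extraction of the two HIs described above can be sketched as follows. This is one possible reading of the procedure, under stated assumptions: the charge is interpolated at 51 equally spaced voltage edges between 3.15 V and 3.3 V, the IC is taken as ΔQ/ΔV per segment, and stdQ is taken as the standard deviation of the ΔQ sequence. The function name `extract_his` and the sigmoid-shaped synthetic discharge curve are hypothetical and serve only for illustration.

```python
import numpy as np


def extract_his(voltage, charge_ah, v_low=3.15, v_high=3.3, n_segments=50):
    """Return (ic_peak, std_q) from one partial discharge curve."""
    voltage = np.asarray(voltage, dtype=float)
    charge_ah = np.asarray(charge_ah, dtype=float)
    # np.interp needs increasing x; voltage falls during discharge, so sort by voltage.
    order = np.argsort(voltage)
    edges = np.linspace(v_low, v_high, n_segments + 1)
    q_at_edges = np.interp(edges, voltage[order], charge_ah[order])
    delta_q = np.abs(np.diff(q_at_edges))   # charge passed in each voltage segment
    delta_v = (v_high - v_low) / n_segments
    ic = delta_q / delta_v                   # incremental capacity in Ah/V
    return float(ic.max()), float(delta_q.std())


def synthetic_discharge(capacity_ah, n=2000):
    """Hypothetical Q(V) curve with a single plateau near 3.22 V."""
    voltage = np.linspace(3.4, 3.0, n)
    charge = capacity_ah / (1.0 + np.exp((voltage - 3.22) / 0.01))
    return voltage, charge


fresh_ic_peak, fresh_std_q = extract_his(*synthetic_discharge(100.0))
aged_ic_peak, aged_std_q = extract_his(*synthetic_discharge(80.0))
```

On this synthetic curve both HIs shrink in proportion to the capacity, which mirrors the qualitative trend described for Figure 2. Real curves would need the ampere-hour counting and filtering steps mentioned in the text before this function is applied.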
It is worth noting that in real onboard use, noise may affect the extraction, and various filtering methods can be used in preprocessing to help extract the HIs effectively [28]. The extracted HIs of the SBCs and CBCs are shown in Figure 3, where stdQ is shown in Figure 3(a) and the IC peak is shown in Figure 3(b). The correlation coefficients, including the Pearson correlation coefficient (PCC) and the Spearman correlation coefficient (SCC) [41], are shown in Figure 3(c). PCC analysis evaluates linear dependence, while SCC analysis is better suited to evaluating the monotonic relationship between HIs and capacities. The numerical results are listed in the Appendix (Tables A1 and A2). The results show that both stdQ and the IC peak are highly correlated with the capacity (all PCC and SCC values are larger than 0.98). In addition, the HIs present different degradation patterns for the CBCs, clearly showing the inconsistencies.
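The difference between the two coefficients can be illustrated with a short NumPy sketch. The Spearman coefficient is computed here as the Pearson coefficient of the ranks, which assumes there are no tied values; the quadratic HI used as input is a made-up example, not data from the experiments.

```python
import numpy as np


def pearson(x, y):
    """Pearson correlation coefficient: strength of linear dependence."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc, yc = x - x.mean(), y - y.mean()
    return float((xc @ yc) / np.sqrt((xc @ xc) * (yc @ yc)))


def spearman(x, y):
    """Spearman coefficient as Pearson on ranks (assumes no tied values)."""
    def rank(values):
        return np.argsort(np.argsort(values)).astype(float)
    return pearson(rank(np.asarray(x)), rank(np.asarray(y)))


# Hypothetical HI that is monotonic but not linear in capacity.
capacity = np.linspace(100.0, 80.0, 50)
health_indicator = capacity ** 2
pcc = pearson(health_indicator, capacity)
scc = spearman(health_indicator, capacity)
```

Here the SCC is exactly 1 because the relation is monotonic, while the PCC stays slightly below 1 because the relation is not perfectly linear.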

Methodology
The method for battery pack lifetime prognostics is proposed in this section. Specifically, GPR models are developed for the capacity prediction of the battery pack and the CBCs in Section 3.1. Then, TDL is proposed for future degradation curve prediction in Section 3.2. Finally, the prognostics framework is proposed in Section 3.3.

Capacity Estimation Based on GPR
Data-driven capacity (or SOH) estimation generally includes HI extraction, model training, and model testing. The general HI extraction process was introduced in the previous section. For model training and testing, machine learning is widely used and has shown good estimation performance. Among the various machine learning algorithms, GPR stands out because it is based on Bayesian inference and provides probabilistic estimates [28], [32]. GPR has been widely used in battery pack state of charge and SOH estimation, where satisfactory results have been obtained [8], [42]. Therefore, GPR is adopted in this paper to construct the capacity estimation model. Generally, the relationship between the input x and the output y is given as follows, assuming the noise is additive, independent, and Gaussian [43]:

$$ y = f(x) + \varepsilon, \qquad \varepsilon \sim \mathcal{N}(0, \sigma_n^2) $$

where ε is white noise with variance σ_n^2, and f(x) is a latent function with the probability distribution

$$ f(x) \sim \mathcal{GP}\big(m(x),\, k(x, x')\big) $$

where m(x) is the mean function and k(x, x') is the covariance function, defined as

$$ m(x) = \mathbb{E}[f(x)], \qquad k(x, x') = \mathbb{E}\big[(f(x) - m(x))\,(f(x') - m(x'))\big] $$

The kernel function is parameterized by σ_f^2, which represents the output amplitude, and l, the characteristic length scale. The predicted mean value y* and the predicted covariance cov(y*) at a test input x* are expressed as

$$ y^* = K_f(x^*, x)\,\big[K_f(x, x) + \sigma_n^2 I_n\big]^{-1} y $$

$$ \mathrm{cov}(y^*) = K_f(x^*, x^*) - K_f(x^*, x)\,\big[K_f(x, x) + \sigma_n^2 I_n\big]^{-1} K_f(x, x^*) $$

where x and y represent the input matrix and output matrix, respectively, I_n is an n-dimensional unit matrix, and K_f is the kernel matrix. Based on probability theory, the 95% confidence interval (CI) can be obtained:

$$ \mathrm{CI}_{95\%} = y^* \pm 1.96\,\sqrt{\mathrm{cov}(y^*)} $$

Therefore, the predicted value is given by y* together with a probabilistic interval.
To predict the future capacity of the battery pack, two GPR models are constructed: one for the battery cells and the other for the battery pack. The capacities of the CBCs cannot be measured, so no measured data are available to develop a GPR model for CBC capacity estimation. Therefore, this model is established based on the source batteries, i.e., the SBCs. The inputs and outputs of the GPR model for battery cells are the HIs and the capacity at each cycle from the aging data of the source batteries, respectively. This allows the various aging conditions to be considered in the modeling. The GPR model for CBC capacity estimation is thus established using entire lifetime data. For the GPR model for battery pack capacity prediction, only the data of early cycles are available, but the actual capacity is known. To consider the impact of inconsistencies on the battery pack capacity, the information of each CBC is included by adding its HIs to the input matrix. The output of the battery pack GPR model is the pack capacity. The data of early cycles are used to construct the battery pack GPR model, and the model is then used for future capacity prediction with the HIs predicted by the algorithms described below. For different uses, the specific relationship between the HIs and the pack capacity can be learned from the specific early data.
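A minimal NumPy sketch of GPR-based capacity estimation from two HIs is given below. It uses the standard GP predictive mean and variance with a plain squared-exponential kernel, which is an assumption made for illustration and not the modified kernel of this paper. The hyperparameters are fixed by hand rather than optimized, and the HI and capacity data are synthetic.

```python
import numpy as np


def se_kernel(a, b, sigma_f=1.0, length=1.0):
    """Squared-exponential kernel (assumed here; not the paper's modified kernel)."""
    sq_dist = (np.sum(a ** 2, axis=1)[:, None]
               + np.sum(b ** 2, axis=1)[None, :] - 2.0 * a @ b.T)
    return sigma_f ** 2 * np.exp(-0.5 * np.maximum(sq_dist, 0.0) / length ** 2)


def gpr_predict(x_train, y_train, x_test, sigma_f=1.0, length=1.0, sigma_n=0.1):
    """Predictive mean and 95% interval of the latent function at x_test."""
    y_mean = y_train.mean()                  # constant mean function
    k_xx = se_kernel(x_train, x_train, sigma_f, length) + sigma_n ** 2 * np.eye(len(x_train))
    k_sx = se_kernel(x_test, x_train, sigma_f, length)
    k_ss = se_kernel(x_test, x_test, sigma_f, length)
    chol = np.linalg.cholesky(k_xx)          # stable alternative to a direct inverse
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train - y_mean))
    mean = k_sx @ alpha + y_mean
    v = np.linalg.solve(chol, k_sx.T)
    var = np.clip(np.diag(k_ss) - np.sum(v ** 2, axis=0), 0.0, None)
    half_width = 1.96 * np.sqrt(var)
    return mean, mean - half_width, mean + half_width


# Synthetic data: two noisy HIs that scale with SOH; capacity in Ah.
rng = np.random.default_rng(0)
soh = np.linspace(1.0, 0.8, 60)
his = np.column_stack([soh, 2.0 * soh]) + 0.001 * rng.standard_normal((60, 2))
capacity = 100.0 * soh
train_idx, test_idx = np.arange(0, 60, 2), np.arange(1, 60, 2)
mean, lower, upper = gpr_predict(his[train_idx], capacity[train_idx], his[test_idx],
                                 sigma_f=10.0, length=0.2, sigma_n=0.1)
mae_ah = float(np.mean(np.abs(mean - capacity[test_idx])))
```

For the pack model described above, the input matrix would simply have more columns (the HIs of every CBC) and the output would be the pack capacity; the prediction code itself would not change.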

Regression Extrapolation Based on TDL
The GPR models developed above are used for capacity estimation, where HIs are needed as the inputs. Therefore, the unknown future HIs are required for future capacity prediction. Conventional data-driven methods only use the historical information of the test battery to train the regression model. However, the lack of knowledge about the future trend usually leads to poor prediction performance. For example, the early trend may be linear while the later trend is exponential. TDL is an effective tool to improve the prediction accuracy of the task in the target domain by transferring and fine-tuning the known information from the source domain [44]. In this paper, TDL is used to transfer the pre-trained sequential degradation model of the HIs to the test battery and battery pack. The sequential relation of the HI is represented by an m-dimensional input and an n-dimensional output, which are

$$ x = [\mathrm{HI}_{k-m+1}, \mathrm{HI}_{k-m+2}, \cdots, \mathrm{HI}_{k-1}, \mathrm{HI}_{k}] $$

$$ y = [\mathrm{HI}_{k+1}, \mathrm{HI}_{k+2}, \cdots, \mathrm{HI}_{k+n-1}, \mathrm{HI}_{k+n}] $$

where m is chosen as 5 and n is 1 in this paper. Then, the relationship between inputs and outputs is modeled by the LSTM regression model. The network is shown in Figure, where an input layer, an LSTM hidden layer, a fully connected layer, and an output layer are included. The main advantage of the LSTM chosen here is its ability to avoid the vanishing and exploding gradient problems by controlling the information flow. It has also shown superior performance in battery health prognostics [34], [45]. The fully connected layer feeds the regression output and serves as a fine-tuning layer. The four gates of the LSTM are calculated as [46]:

Forget gate:
$$ f(t) = \sigma\big(W_f\,[h(t-1), x(t)] + b_f\big) $$

Input gate:
$$ i(t) = \sigma\big(W_i\,[h(t-1), x(t)] + b_i\big), \qquad \varphi(t) = \tanh\big(W_\varphi\,[h(t-1), x(t)] + b_\varphi\big) $$

Update:
$$ C(t) = f(t) \odot C(t-1) + i(t) \odot \varphi(t) $$

Output gate:
$$ o(t) = \sigma\big(W_o\,[h(t-1), x(t)] + b_o\big), \qquad h(t) = o(t) \odot \tanh\big(C(t)\big) $$

where W and b are the weights and biases, and σ and tanh are the sigmoid and tanh activation functions. h(t) and x(t) are the flowing and input information, respectively; f(t) is the remaining information; i(t) and φ(t) are the candidate information; C(t) is the cell state, and o(t) is the output information. The detailed structure of the four gates in the LSTM is shown in Figure 4.
The fully connected layer applies the relu (rectified linear unit) activation to its weighted input. Finally, the output HI_{k+1} is obtained from the output layer. In order to avoid overfitting, dropout (0.05) is applied [45]. Dropout is an effective way to mitigate overfitting by dropping some units during the information transmission process, as illustrated in Figure 4. The other parameters of the LSTM are set as follows: the time step is 3, the number of units of the LSTM layer is 15, and the input sequence is 5. During model training, the number of epochs is 300 and the learning rate is 0.0008.
TDL has a strong ability to improve prediction performance by adapting existing models. As for the transfer strategy in this paper, the HI degradation model is trained on the SBCs in the source domain and then fine-tuned for each CBC using the early data of the battery pack. Specifically, the input layer and the LSTM layer are frozen while the fully connected layer and the regression output layer are fine-tuned, which means that only the W and b of the last two layers are adjustable.
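The input/output windowing and a single LSTM time step can be sketched in plain NumPy as follows. This is an illustrative forward pass only: the weights are random rather than trained, each HI value is fed as one time step of size 1 (an assumption, since the text does not spell out the input layout), and the parameter names in `params` are hypothetical.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM time step; every gate acts on the concatenation [h(t-1), x(t)]."""
    z = np.concatenate([h_prev, x_t])
    f_t = sigmoid(params["W_f"] @ z + params["b_f"])     # forget gate
    i_t = sigmoid(params["W_i"] @ z + params["b_i"])     # input gate
    phi_t = np.tanh(params["W_c"] @ z + params["b_c"])   # candidate information
    c_t = f_t * c_prev + i_t * phi_t                     # cell-state update
    o_t = sigmoid(params["W_o"] @ z + params["b_o"])     # output gate
    h_t = o_t * np.tanh(c_t)
    return h_t, c_t


def make_windows(hi_series, m=5, n=1):
    """Split an HI series into inputs of length m and targets of length n."""
    hi_series = np.asarray(hi_series, dtype=float)
    count = len(hi_series) - m - n + 1
    x = np.stack([hi_series[k:k + m] for k in range(count)])
    y = np.stack([hi_series[k + m:k + m + n] for k in range(count)])
    return x, y


rng = np.random.default_rng(0)
units, n_in = 15, 1                                      # 15 LSTM units, scalar HI input
params = {f"W_{g}": 0.1 * rng.standard_normal((units, units + n_in)) for g in "fico"}
params.update({f"b_{g}": np.zeros(units) for g in "fico"})

hi_series = np.linspace(1.0, 0.8, 50)                    # hypothetical normalized HI
x_win, y_win = make_windows(hi_series, m=5, n=1)
h, c = np.zeros(units), np.zeros(units)
for value in x_win[0]:                                   # run the first window through the cell
    h, c = lstm_step(np.array([value]), h, c, params)
```

The final hidden state `h` is what the fully connected layer and the output layer would map to the next HI value. In a deep learning framework the same structure would normally be built from its LSTM and dense layers rather than written by hand.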
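The freeze-and-fine-tune strategy can be illustrated with a deliberately small NumPy stand-in. A tanh hidden layer plays the role of the frozen input and LSTM layers, and a linear head plays the role of the fine-tuned fully connected and output layers. The network, the synthetic HI trajectories, and the training settings are all assumptions made for illustration, not the configuration used in this paper.

```python
import numpy as np


def make_windows(series, m=5):
    """Inputs are m consecutive HI values; the target is the next value."""
    series = np.asarray(series, dtype=float)
    x = np.stack([series[k:k + m] for k in range(len(series) - m)])
    return x, series[m:].reshape(-1, 1)


def forward(p, x):
    hidden = np.tanh(x @ p["W1"] + p["b1"])       # stand-in for the frozen layers
    return hidden, hidden @ p["W2"] + p["b2"]     # stand-in for the fine-tuned layers


def train(p, x, y, trainable, lr=0.05, epochs=2000):
    """Full-batch gradient descent on MSE, updating only the keys in `trainable`."""
    p = {k: v.copy() for k, v in p.items()}
    for _ in range(epochs):
        hidden, pred = forward(p, x)
        err = (pred - y) / len(x)
        d_hidden = (err @ p["W2"].T) * (1.0 - hidden ** 2)
        grads = {"W2": hidden.T @ err, "b2": err.sum(axis=0),
                 "W1": x.T @ d_hidden, "b1": d_hidden.sum(axis=0)}
        for key in trainable:
            p[key] -= lr * grads[key]
    return p


def mse(p, x, y):
    return float(np.mean((forward(p, x)[1] - y) ** 2))


rng = np.random.default_rng(0)
init = {"W1": 0.5 * rng.standard_normal((5, 8)), "b1": np.zeros(8),
        "W2": 0.5 * rng.standard_normal((8, 1)), "b2": np.zeros(1)}
cycles = np.arange(400)
source_hi = 1.0 - 0.20 * (cycles / 400) ** 1.5    # full-life source trajectory
target_hi = 1.0 - 0.25 * (cycles / 400) ** 1.5    # target fades faster
x_src, y_src = make_windows(source_hi)
x_tgt, y_tgt = make_windows(target_hi[:50])       # only early target cycles are known

base = train(init, x_src, y_src, trainable=("W1", "b1", "W2", "b2"))
tuned = train(base, x_tgt, y_tgt, trainable=("W2", "b2"))
base_mse, tuned_mse = mse(base, x_tgt, y_tgt), mse(tuned, x_tgt, y_tgt)
```

Because only the head is updated, the features learned from the full-life source data are preserved while the output mapping adapts to the early target data. In a framework such as PyTorch or Keras the same effect is obtained by marking the early layers as non-trainable.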

Lifetime Prognostic for Battery Pack
The capacity estimation model and HIs prediction model have been established in the above section. Then the combination of them is constructed to predict the lifetime of battery packs in this section. The process of the proposed battery pack lifetime prognostics is shown in Figure 5. The solid line indicates the data flow; the dotted line is the model migration path; while the dash-dotted line is the model utilization path. The source domain contains the available data of the SBCs over the entire lifespan, while the target domain contains the early information of the battery pack. For the base model development, general HIs are firstly extracted from the partial discharging process and then evaluated by correlation analysis and estimated errors of battery capacities. This process can assess whether the HIs are suitable for battery pack lifetime prognostics. Finally, two models, including the cell capacity estimation model and HI degradation model, are constructed based on the data of batteries in the source domain by GPR and LSTM, respectively.
For the battery pack lifetime prognostics in the target domain, the same HIs are extracted based on the measured voltage of each CBC and pack capacity. Then, the base HI degradation model trained in the source domain is migrated and fine-tuned using the newly extracted HIs and known capacities to adapt the degradation pattern for each CBC in the target domain. Meanwhile, the battery pack capacity estimation model is constructed using GPR. The information (HIs) of each CBC is used as input while pack capacity is set as output, which considers the influence of inconsistency on the capacity of the battery pack. Finally, the pack future degradation is predicted by combining the pack capacity estimation model and the fine-tuned HI degradation model, while the degradation of each CBC is predicted by combining the SBC capacity model and the fine-tuned HI degradation model. The future HIs are firstly predicted by extrapolating the HI degradation model and then the future capacities are predicted using the corresponding capacity estimation model.
It can be seen from the prognostic process that both the future capacity of the battery pack and the future capacity evolution of the CBCs are predicted by the proposed method. Therefore, more comprehensive prognostics are obtained.
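The two-stage prediction described in this section (extrapolate the HIs first, then map them to capacity) can be sketched as a recursive loop. The one-step HI model and the capacity model are passed in as plain callables; the linear-trend predictor and the proportional capacity model used below are hypothetical stand-ins for the fine-tuned LSTM and the GPR models.

```python
from typing import Callable, List, Optional, Sequence


def extrapolate_hi(history: Sequence[float],
                   one_step_model: Callable[[List[float]], float],
                   horizon: int, m: int = 5) -> List[float]:
    """Roll the HI series forward by feeding each prediction back as input."""
    series = list(history)
    for _ in range(horizon):
        series.append(one_step_model(series[-m:]))
    return series[len(history):]


def predict_lifetime(history: Sequence[float],
                     one_step_model: Callable[[List[float]], float],
                     capacity_model: Callable[[float], float],
                     threshold: float, horizon: int, m: int = 5) -> Optional[int]:
    """Cycles until the predicted capacity first reaches the threshold, else None."""
    future_hi = extrapolate_hi(history, one_step_model, horizon, m)
    for step, hi in enumerate(future_hi, start=1):
        if capacity_model(hi) <= threshold:
            return step
    return None


def linear_trend(window: List[float]) -> float:
    """Stand-in HI model: continue the average slope of the window."""
    return window[-1] + (window[-1] - window[0]) / (len(window) - 1)


def proportional_capacity(hi: float) -> float:
    """Stand-in capacity model: capacity in Ah proportional to the HI."""
    return 100.0 * hi


early_hi = [1.0 - k / 1024 for k in range(50)]   # 50 known early cycles
lifetime = predict_lifetime(early_hi, linear_trend, proportional_capacity,
                            threshold=90.0, horizon=200)
```

Because each predicted HI is reused as an input, errors can accumulate over long horizons, which is one reason the reference degradation pattern transferred from the source domain matters.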

Results and Discussion
In this section, the results of the battery pack lifetime prognostics and degradation prediction are evaluated based on the experimental data. Firstly, the capacity estimation model is evaluated to verify the feasibility of future capacity prediction based on the predicted HIs. Then, the lifetime prediction of the battery pack, as well as the future capacity distribution of the CBCs are provided and discussed.

Capacity Estimation
Because the lifetime is directly reflected by the future capacities, the capacity estimation model is important and should be evaluated. Both the cell model and the pack model for capacity estimation are discussed in this section. The cell capacity model is used for CBC capacity prediction, using the first half of the data for model training and the rest for testing, and the results are shown in Figure 6 (the symbols on the error curve are drawn every few points). Figure 6(a) and Figure 6(b) show the capacity estimation results using stdQ as the HI for the batteries with the maximum and minimum error, respectively, i.e., cell#4 and cell#1. Figure 6(c) and Figure 6(d) show the results with the maximum and minimum error using the IC peak as the HI, i.e., cell#4 and cell#6. The goal of using a single HI for capacity estimation is to evaluate the performance of that HI. The 95% CIs are all narrow, which means the estimations are reliable. In addition, the statistical results with both HIs as inputs and with a single HI as input for all seven batteries are shown in Figure 7(a). The results illustrate that the capacity can be accurately estimated from the HIs. The combination of two HIs can compensate for the disadvantage of a single HI, so that the distribution of the errors is narrower. Overall, both the mean absolute error (MAE) and the root mean square error (RMSE) are less than 1.2%, which means the HIs correlate strongly with capacity and can be used in capacity estimation. As shown in Figure 5, the model for capacity estimation of the CBCs is trained on the source batteries, which are different from the batteries in the target domain. Therefore, it is necessary to verify the capacity estimation performance in this condition. The GPR model is constructed using the data from one battery, and the capacity estimations are then validated using the data from the other batteries.
Figure 7(b) shows the statistical results of the MAE and RMSE when using this strategy for verification. The x-axis is the SBC for model training. Most errors are less than 3%, indicating the model can be migrated for capacity estimation of different batteries, which means the CBC capacity prediction strategy is effective.
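The two error metrics used in this evaluation can be computed as below. The text reports them in percent without stating the normalization, so dividing by the nominal capacity is an assumption of this sketch, and the sample values are made up.

```python
import numpy as np


def mae_percent(estimated_ah, measured_ah, nominal_ah):
    """Mean absolute error as a percentage of nominal capacity (assumed normalization)."""
    error = np.asarray(estimated_ah, dtype=float) - np.asarray(measured_ah, dtype=float)
    return float(np.mean(np.abs(error)) / nominal_ah * 100.0)


def rmse_percent(estimated_ah, measured_ah, nominal_ah):
    """Root mean square error as a percentage of nominal capacity (assumed normalization)."""
    error = np.asarray(estimated_ah, dtype=float) - np.asarray(measured_ah, dtype=float)
    return float(np.sqrt(np.mean(error ** 2)) / nominal_ah * 100.0)


# Hypothetical estimates against measurements for a 100 Ah cell.
measured = [100.0, 94.0, 90.0]
estimated = [99.0, 96.0, 92.0]
mae = mae_percent(estimated, measured, nominal_ah=100.0)
rmse = rmse_percent(estimated, measured, nominal_ah=100.0)
```

RMSE is never smaller than MAE and penalizes occasional large errors more heavily, so reporting both indicates whether the error is spread evenly or dominated by a few cycles.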
Then, the capacity estimation results for the battery pack using two HIs are shown in Figure 8. Specifically, Figure 8 shows that the errors fluctuate while the proportion of training data is below 0.4, and then decrease more monotonically beyond it. The reason may be the fluctuation of the capacity curve in the early cycles. Figure 8(b) shows the results when proportions of 0.2 and 0.4 of the data are used for model training. Although the two cases have similar statistical errors, the 95% CI for 0.4 is narrower than that for 0.2. Therefore, the prediction results with 0.4 are more reliable.
Overall, the battery pack capacity estimation results are satisfactory where the inconsistency is considered. The capacity estimation models are used for lifetime prediction in the following sections.

Lifetime Prognostic
The capacity estimation results shown in the former section illustrate that the future capacity can be predicted by the trained model if the HIs are available for future cycles, which can be realized by extrapolating the HI degradation model. The future degradation is predicted in this section by the combination of the TDL-based HI prediction and GPR-based capacity prediction. Specifically, different proportions of data are used for model fine-tuning and the corresponding battery pack lifetime prognostics are given. It should be noted that there is not a specified threshold value for the end of life. On the contrary, we use the data until the last cycle. Besides, the lifetime prediction mainly focuses on the future trend instead of local fluctuation, and the fluctuation will also influence the performance of the extrapolation process. Therefore, the capacity and HIs are smoothed firstly in the lifetime prognostics process, where the moving average filter is adapted. Firstly, the superiority of the proposed TDL+GPR based lifetime prognostics method is demonstrated by comparing it to the conventional method that only uses the available historical data for model training. The conventional method is demonstrated by using the former data for model training and the latter data for prediction. For the sake of fairness, the data for fine-tuning of the proposed method and the training of the conventional method are both set to 200 cycles. The results are shown in Figure 9(a). The threshold is drawn by the real capacity of the last cycle. It shows that when the data of the first 200 cycles of the battery pack are available for model fine-tuning, the future capacity prediction can fit the real value with a narrow confidence interval. The predicted RUL is 212 cycles now, which is just 3 cycles less than the real RUL. And the 95% CI is [391 449], which means the prediction results are reliable. 
However, the results predicted by the conventional self-training-based method show poor performance. The predicted value largely deviates from the real value and even has an obviously different degradation pattern from the real capacity. The reason is that the TDL can transfer the characteristic of the future degradation to the testing battery pack using the proposed method, while the conventional method cannot achieve. Therefore, the proposed method shows great improvement for battery pack lifetime prognostics. Then, the data of different cycles are employed for the model fine-tuning to evaluate the TDL+GPR based battery pack lifetime prognostics method. The numerical results are listed in Table 2. The positive value means the predicted value is larger than the real value, and vice versa. The results show that the prediction became more accurate when more data is available for model fine-tuning before 200 cycles. But after that, the predicted errors remain small with fluctuations. This means the model is well-tuned. And it also proves that the proposed method can obtain accurate prediction based on only a few early data. The model can be accurate and reliable enough with less than 50% data; and even with about 10% data for fine-tuning, a satisfactory prediction is obtained. When only the data of the first 50 cycles are used, the predicted value is 23 cycles less than the real value, which means the error is about 6.3%. This means a satisfactory prediction is obtained. Figure 9(b) and Figure 9(c) show the results of the prediction when the cycles for fine-tuning are 100 and 300, respectively. It is shown in Figure 9(b) that although the predicted is only 11 cycles less than the real value, the predicted future capacities show an obvious difference from the real capacity, and the 95% is very large. This means an accurate prediction could also be obtained, but the reliability of the prediction is not so good. 
This is because the model relies more heavily on the reference model when fewer cycles are available, and a larger number of prediction steps also produces greater uncertainty. The situation improves significantly in Figure 9(c). The predicted capacities are almost identical to the real values, with a narrow confidence interval. The prediction error is 4 cycles and the 95% confidence interval is [409, 419], a width of 10 cycles. This interval is much narrower than that of Figure 9(a), indicating that the predictions are more reliable even though the accuracy does not improve further after 200 cycles. The results therefore indicate that the main error corrections are achieved before 200 cycles; although the final predictions are all close to the real lifetime after 200 cycles, the model continues to become more reliable and the predictions more convincing. Overall, the proposed method improves battery pack lifetime prediction over the conventional method: less than 50% of the data is enough for sufficient accuracy, and only 50 cycles are enough for an acceptable prediction.
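The smoothing, threshold-crossing, and error bookkeeping described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the centered window of 5 cycles, and the synthetic capacity curve are assumptions made for illustration only. The lifetime of 415 cycles in the usage lines is inferred from the text (200 fine-tuning cycles plus a real RUL of 215 cycles), and the relative error is taken with respect to the real RUL, which appears to reproduce the quoted value of about 6.3%.

```python
def moving_average(values, window=5):
    """Centered moving average; the window shrinks near both ends."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed


def first_crossing(capacity, threshold, start_cycle=1):
    """First cycle whose capacity is at or below the threshold, else None."""
    for offset, value in enumerate(capacity):
        if value <= threshold:
            return start_cycle + offset
    return None


def lifetime_error(predicted_eol, real_eol, last_known_cycle):
    """Signed error in cycles, and absolute error relative to the real RUL."""
    error = predicted_eol - real_eol
    real_rul = real_eol - last_known_cycle
    return error, abs(error) / real_rul


# Synthetic capacity curve (assumption): linear fade plus an alternating ripple.
predicted = [100.0 - 0.05 * k + (0.3 if k % 2 else -0.3) for k in range(1, 501)]
threshold = 80.0  # stands in for the real capacity at the last cycle
eol = first_crossing(moving_average(predicted, window=5), threshold)
print("predicted end-of-life cycle:", eol)

# Numbers from the text: 50 fine-tuning cycles, prediction 23 cycles too early.
error, relative = lifetime_error(predicted_eol=392, real_eol=415, last_known_cycle=50)
print(f"error = {error} cycles, relative to real RUL = {relative:.1%}")
```

Smoothing before the threshold search keeps a single noisy sample from triggering an early crossing, which matches the stated focus on the overall trend rather than on local fluctuations.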
Another advantage of the proposed method is that it can also predict the future capacities of the CBCs, which clearly shows the degradation evolution and the capacity distribution. This is significant because it provides key information for maintaining the weakest cell in advance. Therefore, the prediction of the future capacity distribution of the CBCs is provided and discussed last. Figure 10(a) shows that the predicted capacities follow a degradation trend similar to that estimated from the real HIs when only the data of the first 150 cycles are available for fine-tuning the HI degradation model. However, some battery cells are predicted to degrade faster than they actually do, so obvious prediction errors remain. In Figure 10(b), the predicted capacities are closer to those estimated from the real HIs, and in Figure 10(c) the predicted capacities almost coincide with those estimated from the real HIs. This illustrates that the future capacity degradation of each connected battery can be predicted well by this strategy; when about 50% of the data are used for fine-tuning, the future capacity predictions of the CBCs almost coincide with the real degradation. The method can thus show the degradation of each CBC accurately and clearly, which helps manage the weak cell early. Table 3 lists the mean MAE (mMAE) and the mean RMSE (mRMSE) over the 16 batteries when the predicted HIs are used for future capacity prediction, compared with the results estimated from the real HIs. The standard deviations of the MAE and RMSE across the cells (std_mMAE and std_mRMSE) are also given to show how well the model adapts to different cells. The results show that the predicted capacities have small errors relative to the capacities estimated from the real HIs. The std_mMAE and std_mRMSE are also quite small, which suggests that the prediction model is suitable for all the connected batteries. 
Even when only the data of the first 100 cycles are available, the mMAE is 0.665% and the mRMSE is 1.054%, which gives satisfactory accuracy; the std_mMAE and std_mRMSE are 0.307 and 0.567, respectively, which means that the predictions for all 16 CBCs are sufficiently accurate. When the data of the first 250 cycles are available, the mMAE and mRMSE decrease to 0.146% and 0.152%, respectively, and the std_mMAE and std_mRMSE both decrease to less than 0.1. The predictions are then accurate and reliable, which demonstrates that the degradation in later life can be predicted precisely from the early data. Predicting the future capacity degradation is important for PHM because it allows the weakest batteries to be managed in advance, and the method proposed in this paper provides a promising technique for this purpose.
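The cell-level metrics discussed above can be sketched as follows. This is an illustrative computation rather than the authors' code: it assumes that capacities are expressed in percent of nominal capacity, that the MAE and RMSE are first computed per cell and then averaged over the cells, and that the spread is a population standard deviation (the text does not say whether the population or sample form is used). The function names are hypothetical.

```python
import math
import statistics


def cell_errors(predicted, reference):
    """MAE and RMSE of one cell's predicted capacity against its reference."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("sequences must be non-empty and of equal length")
    diffs = [p - r for p, r in zip(predicted, reference)]
    mae = sum(abs(d) for d in diffs) / len(diffs)
    rmse = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    return mae, rmse


def pack_summary(predicted_by_cell, reference_by_cell):
    """Mean and spread of the per-cell MAE and RMSE over all cells."""
    per_cell = [
        cell_errors(pred, ref)
        for pred, ref in zip(predicted_by_cell, reference_by_cell)
    ]
    maes = [mae for mae, _ in per_cell]
    rmses = [rmse for _, rmse in per_cell]
    return {
        "mMAE": statistics.mean(maes),
        "std_mMAE": statistics.pstdev(maes),  # population form (assumption)
        "mMRSE_placeholder": None,  # removed below; kept out of the result
        "mRMSE": statistics.mean(rmses),
        "std_mRMSE": statistics.pstdev(rmses),
    } if False else {
        "mMAE": statistics.mean(maes),
        "std_mMAE": statistics.pstdev(maes),  # population form (assumption)
        "mRMSE": statistics.mean(rmses),
        "std_mRMSE": statistics.pstdev(rmses),
    }


# Toy data for two cells (made up; capacities in percent of nominal).
predicted_caps = [[90.0, 88.0], [91.0, 89.0]]
reference_caps = [[90.0, 90.0], [90.0, 90.0]]
summary = pack_summary(predicted_caps, reference_caps)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

A small standard deviation alongside a small mean indicates that no individual cell is predicted much worse than the others, which is the property the text relies on when it calls the model suitable for all connected cells.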

Conclusions
Health and lifetime prognostics of series-connected battery packs are essential for health management. Different working conditions and internal chemical reactions lead to various aging patterns, and the resulting inconsistency strongly influences the degradation of the battery packs. This paper proposes a novel method for battery pack lifetime prediction that combines TDL and GPR.
HIs are extracted and shown to correlate highly with capacity for both SBCs and battery packs. TDL is designed to predict the future HIs, while GPR uses the predicted HIs to estimate the future capacities. The proposed method provides not only the future degradation pattern of the battery pack but also the lifetime distribution of the CBCs, with probabilistic prognostics. The general HIs can be used for battery cell capacity estimation under different working conditions, and they account for the inconsistency in the capacity estimation of battery packs. An experimental data set is used to verify the method. The results show that the MAE and RMSE of the SBC capacity estimation models trained by GPR with the extracted HIs are less than 1.2% for self-estimation and less than 4% for the estimation of other batteries. The MAE and RMSE of the battery pack capacity estimation model are less than 3.5%, even when only a small amount of data is used for model training. The lifetime prognostics results show that the lifetime prediction errors are less than 25 cycles with only 50 cycles for model fine-tuning, and that the errors are reduced to less than 5 cycles when 200 cycles are available. The predicted future capacity distribution of the CBCs follows the real trend well, which clearly identifies the future weakest cell. Future work will focus on battery packs with different electrochemical systems under various working conditions. The aim is to achieve fast lifetime investigation for all kinds of batteries with a general method.

Appendix
See Tables A1 and A2.