Skip to main content

Optimization on the Crosswind Stability of Trains Using Neural Network Surrogate Model

Abstract

Under the influence of crosswinds, the running safety of trains will decrease sharply, so it is necessary to optimize the suspension parameters of trains. This paper studies the dynamic performance of high-speed trains under crosswind conditions, and optimizes the running safety of train. A computational fluid dynamics simulation was used to determine the aerodynamic loads and moments experienced by a train. A series of dynamic models of a train, with different dynamic parameters were constructed, and analyzed, with safety metrics for these being determined. Finally, a surrogate model was built and an optimization algorithm was used upon this surrogate model, to find the minimum possible values for: derailment coefficient, vertical wheel-rail contact force, wheel load reduction ratio, wheel lateral force and overturning coefficient. There were 9 design variables, all associated with the dynamic parameters of the bogie. When the train was running with the speed of 350 km/h, under a crosswind speed of 15 m/s, the benchmark dynamic model performed poorly. The derailment coefficient was 1.31. The vertical wheel-rail contact force was 133.30 kN. The wheel load reduction rate was 0.643. The wheel lateral force was 85.67 kN, and the overturning coefficient was 0.425. After optimization, under the same running conditions, the metrics of the train were 0.268, 100.44 kN, 0.474, 34.36 kN, and 0.421, respectively. This paper show that by combining train aerodynamics, vehicle system dynamics and many-objective optimization theory, a train’s stability can be more comprehensively analyzed, with more safety metrics being considered.

Introduction

The advantage of high-speed trains over conventional trains is their high speed [1]. Therefore, the existence and potential problem of crosswinds questions the advantage of high-speed trains [2]. It is necessary to study the crosswind stability of high-speed trains. In fact, vehicle dynamics and optimization have always been the subject of scholars’ research. In terms of dynamics, Baker et al. [3] added wind load to the train dynamic model, and studied the relationship between a train’s derailment coefficient, wheel load reduction coefficient and wind speed. Deng et al. [4] studied the dynamic performance of high-speed trains when passing through wind barriers, and focused on the effects of crosswind based on the two metrics of wheel load reduction and derailment coefficient. Yan et al. [5] used a spectral analysis method to study the overturning risk of high-speed trains subject to strong winds. Olmos et al. [6] used the derailment coefficient and wheel load reduction ratio as metrics to study the safety of trains under the vehicle-bridge-wind coupling model. Wei et al. [7] used an indirect test method to obtain the wheel-rail contact force of a high-speed train in a crosswind environment from an experimental point of view, and then analyzed the safety. Niu et al. [8] studied the influence of coupling regions on the aerodynamic performance of trains under different operating conditions. Yu et al. [9] studied the operational stability of high-speed trains under different wind models. They paid close attention to the differences in wheel load reduction of high-speed trains. In terms of optimization, Li et al. [10] optimized the active suspension system of electric vehicles with the three objectives of reducing the vertical component of the unbalanced electrode, eliminating the negative impact of electromechanical coupling on the vehicle system, and retaining the dynamic advantages of the vehicle’s active suspension. Fossati et al. [11] used driver’s seat vertical acceleration, root mean square of the wheels’ dynamic amplification factor, and maximum relative displacement between each wheel and car body as metrics to optimize the passive suspension vehicle’s parameters and obtained good results. By optimizing the shape of aircraft or trains to reduce the negative impact of aerodynamic loads on them, great progress has been made [1217]. One of the most important methods is the surrogate model method.

Considering the content introduced above, we have an initial impression of research progress, as shown in Figure 1. In the past, scholars’ research work about train stability can be divided into two categories. One is to combine train aerodynamics and vehicle system dynamics to study the impact of wind on vehicle safety. Another is the combination of vehicle system dynamics and optimization theory to optimize vehicle’s safety by adjusting suspension parameters. Both types of research have made good progress and guided later scholars’ work, but some areas are still worth studying. Firstly, the deterioration of train’s safety mainly occurs in some extreme situations, such as in a crosswind environment. The primary purpose of our study is to improve the safety of trains running in these situations. Therefore, it is necessary to combine train aerodynamics, vehicle system dynamics and optimization theory. Secondly, when previous scholars researched the dynamic performance of trains, especially with regards to safety, two or three metrics were usually selected, such as derailment coefficient and wheel load reduction ratio, or derailment coefficient and overturning coefficient. But in the industry standard, the number of safety metrics is more than two. Code GB5599-85 [18] is one of the national standards of the People's Republic of China, which specifies the dynamic performance evaluation metric of railway vehicles. There are at least five safety metrics specified in code GB 5599-85, which are derailment coefficient, vertical wheel-rail contact force, wheel lateral force, wheel load reduction rate, and overturning coefficient. If suspension parameters are optimized for only some of these metrics, the final optimization results may not be satisfactory. When the train is running, some metrics may be improved due to optimization, but other metrics may deteriorate due to the changes in suspension parameters. Therefore, the work of this paper is to combine the three areas mentioned above.

Figure 1
figure1

Research ideas of this paper

This paper is divided into five parts. The first part is an introduction, which mainly introduces the research progress of vehicle dynamics and optimization. The second part is the methodology, which mainly introduces the theories employed in this research. The third part is the model introduction and related verification. The fourth part is the research process and result analysis. The fifth part is the conclusions.

Methodology

Surrogate Model Method

The main idea of the surrogate model method is to fit a series of isolated sample points into a function, based on mapping relationships, as shown in Figure 2.

Figure 2
figure2

Principle of surrogate model method

Consider a function with only a 1-dimensional design variable and a 1-dimensional objective output. The points in Figure 2 represent sample points, and we know the input and output values of the sample points. If there are only three sampling points, we can also fit a curve with a functional relationship. This is our surrogate model. However, the surrogate model currently is very rough. Its mapping accuracy is very poor, because there are too few sample points. With the number of sample points increasing, our surrogate model will become more accurate until it can accurately express the real characteristics of input and output. It is obvious from Figure 2 that the minimum value of the function is in the red circle. Sometimes we don’t need the surrogate model to be sufficiently accurate globally, because meeting global accuracy often requires a large number of sample points. We can build a rough surrogate model with widely distributed sample points and find the interval where the minimum point may exist, as shown by the red circle. Then continuously increasing the number of sample points for this interval, we construct the surrogate model, until the surrogate model is sufficiently accurate for this small interval. What has been described above is a case with one-dimensional input and one-dimensional output. The dimensions of the design variables and output can be generalized to \(m\) and \(n\), and the principle remains the same.

It has been shown that neural networks have good data-fitting ability [19]. This means that even a two-layer neural network can fit almost any function as long as the parameters are appropriate. Therefore, this paper uses a BP neural network to build the surrogate model. A BP neural network uses the error back propagation method. It approximates real functions by continuously adjusting weights and offset values [20].

Suppose the input is \({\varvec{X}} = \left[ {x_{1} ,x_{2} , \cdots ,x_{n} } \right]\). Then, the output of a neuron can be expressed as y:

$$y = \sigma \left( {\sum\limits_{i = 1}^{n} {w_{i} x_{i} + b} } \right),$$
(1)

where wi is the weight vector, σ is activation function and b is offset.

Suppose the final output of the entire neural network is ol, and the real output of the system is yl. Assume that the cost function is the mean square error function:

$$E = \frac{1}{2}\sum\limits_{l = 1}^{L} {\left\| {{\varvec{o}}^{l} - {\varvec{y}}^{l} } \right\|}^{2} .$$
(2)

Finally, the weights and offsets are updated until the gradient is small enough and the value of the cost function no longer decreases:

$${\varvec{W}}^{{\varvec{k}}} = {\varvec{W}}^{{\varvec{k}}} - \eta \frac{\partial E}{{\partial {\varvec{W}}^{{\varvec{k}}} }},$$
(3)
$${\varvec{b}}^{{\varvec{k}}} = {\varvec{b}}^{{\varvec{k}}} - \eta \frac{\partial E}{{\partial {\varvec{b}}^{{\varvec{k}}} }},$$
(4)

where Wk and bk are the weights and offsets of the kth layer neural network, respectively.

Under normal circumstances, the structure of a neural network is as shown in Figure 3.

Figure 3
figure3

Structure of the neural network

Then the output of the neurons in each layer of the neural network can be expressed as:

$$\left\{ {\begin{array}{*{20}l} {{\varvec{h}}_{{1}} = \sigma_{{h_{1} }} \left( {{\varvec{W}}^{{\mathbf{1}}} {\varvec{x}} + {\varvec{b}}^{{\mathbf{1}}} } \right),} \hfill \\ {{\varvec{h}}_{{\varvec{k}}} = \sigma_{{h_{k} }} \left( {{\varvec{W}}^{{\varvec{k}}} {\varvec{h}}_{{{\varvec{k}} - {\mathbf{1}}}} + {\varvec{b}}^{{\varvec{k}}} } \right),} \hfill \\ {{\varvec{O}} = \sigma_{o} \left( {{\varvec{W}}^{{\user2{R + 1}}} {\varvec{h}}_{{\varvec{R}}} + {\varvec{b}}^{{\user2{R + }{\mathbf{1}}}} } \right),} \hfill \\ \end{array} \left( {2 \le k \le R} \right),} \right.$$
(5)

where \(\sigma_{{h_{1} }}\), \(\sigma_{{h_{k} }}\) and \(\sigma_{o}\) are the activation functions of each layer. o is the output of the neural network. hk is a hidden layer vector. \({\varvec{h}}_{{\varvec{k}}} = \left( {{\varvec{h}}_{{{\varvec{k}},{\mathbf{1}}}} ,{\varvec{h}}_{{{\varvec{k}},{\mathbf{2}}}} , \cdots ,{\varvec{h}}_{{{\varvec{k}},{\varvec{n}}}} } \right)^{{\text{T}}}\). n is the number of neurons in the hidden layer.

Many-Objective Optimization

Consider a maximization problem with two decision vectors \({\varvec{a}},{\varvec{b}} \in {\varvec{X}}\). Then, a is said to dominate b (also written as ab) if: \(\forall i \in \left\{ {1,2, \cdots ,n} \right\}:f_{i} \left( {\varvec{a}} \right) \ge f_{i} \left( {\varvec{b}} \right) \wedge \exists i\left\{ {1,2, \cdots ,n} \right\}:\)

$$f_{i} \left( {\varvec{a}} \right) > f_{i} ({\varvec{b}}).$$
(6)

Additionally, in this study a is said to cover b if ab or f(a)=f(b). The decision vectors that are non-dominated within the entire search space are denoted as pareto-optimal and are called the pareto-optimal front [21].

When the number of objectives is 2 ≤ n ≤ 3, the optimization is called multi-objective optimization, and when n > 3, it is called many-objective optimization. Deb et al. [22, 23] proposed the NSGA-III algorithm based on NSGA-II, which can be used to solve many-objective problems (n ≥ 3). Traditional algorithms, such as particle swarm multi-objective algorithms and NSGA-II algorithms often fail, when the number of optimization objectives exceeds 3. This problem forces us to use a better performing algorithm. Most of the operations of the NSGA-III algorithm are the same as NSGA-II, except that the necessary improvements are made to retained childhood. Assuming the number of the objectives is n and the population to be processed is St. The highest rank of elite level in St is Fl (the last level). Childhood is s. The objectives value of s is fi (s), 1 ≤ i ≤ n.

First, we need to compute the ideal point.

$${\varvec{z}}_{j}^{{{\text{min}}}} = \min_{{{\varvec{S}} \in S_{t} }} f_{j} \left( {\varvec{s}} \right).$$
(7)

Second, transform objectives.

$$f^{\prime}_{j} \left( {\varvec{s}} \right) = f_{j} \left( {\varvec{s}} \right) - z_{j}^{\min } ,\quad \forall {\varvec{s}} \in S_{t} .$$
(8)

Third, calculate the extreme values of St.

Fourth, normalize objectives (fn) by

$$f_{i}^{n} \left( x \right) = \frac{{f_{i}^{n} \left( x \right)}}{{a_{i} }},\quad i = 1,2, \cdots ,M.$$
(9)

Fifth, specify or evenly arrange reference points Zr.

Sixth, for each reference point \({\varvec{z}} \in Z^{r}\), we compute reference line \({\varvec{w}} = {\varvec{z}}\). Also, we calculate the distance between s and w by

$$d = \left\| {\left( {{\varvec{s}} - \frac{{{\varvec{w}}^{{\text{T}}} {\varvec{sw}}}}{{\left\| {\varvec{w}} \right\|^{2} }}} \right)} \right\|.$$
(10)

Seventh, assign \(\uppi ({\varvec{s}}) = {\varvec{w}}:\arg \min_{{w \in Z^{r} }} d\) and \(d({\varvec{s}}) = d({\varvec{s}},\uppi ({\varvec{s}}))\) respectively.

Finally, select the remaining new individuals from Fl according to the distance and distribution.

Model Introduction and Related Verification

CFD Model

The 1:8 scale train model is used as the research object, as is common in the field of train aerodynamics research. The model is divided into a head car, a middle car and a tail car, as shown in Figure 4.

Figure 4
figure4

Train geometry model

The characteristic height of the train h= 0.52 m, the length of the head and tail cars l1 = 3.45 m, and the length of the middle car l2 = 3.125 m. The cross-section area of train is 0.1687 m2.

Li et al. [24] used this model as to study the influence of inter-car gap length on aerodynamics, based on experiments in a windless environment. If you want to simulate a crosswind environment, you must make corresponding improvements to the calculation domain. Code TB/T 3503.4 [25] is one of the railway industry standards of the People’s Republic of China. It is a standard for numerical simulation of train aerodynamics. When a train is running in windless environments, it stipulates that the ratio of the cross-sectional area of the train to the cross-sectional area of the calculated domain should be less than 0.01. The calculation domain height L should not be less than 8 times the characteristic height h, the length of upstream LF should be not less than 8 times the characteristic height h, and the length of the downstream LB should be not less than 16 times the characteristic height h. When the train is running in windy conditions, the requirements for the size of the domain are similar to those in windless environments, but two additional conditions are added. The length of the upstream WU should not be less than 8 times the characteristic height h, and the length of the downstream WD should not be less than 16 times the characteristic height h. Therefore, to obtain the accurate aerodynamic load of the model under crosswind conditions, the calculation domain must meet both the windless requirements and the crosswind requirements. If the simulation results are not significantly different from the experimental results under windless conditions, we must continue to study the aerodynamic performance of the train under crosswind condition. In this paper the relative position of the train in the calculation domain is shown in the Figure 4(c).

The LF, LB, WU, WD and L meet the requirements of windless and crosswind conditions. In addition, in order to ensure the accuracy of aerodynamic calculations, three refine zones are set around the train, as shown by the dotted lines.

Related Settings of Aerodynamic Simulation

Aerodynamic simulations, presented here are based on the commercial software Fluent. The simulation mode is steady-state. Li et al. [26] studied the aerodynamic behavior of a train under crosswinds based on the SST k-ω turbulence model. Comparing 6 different turbulence models, the SST k-ω model is most suitable for the numerical simulation, and so, is used here. The solver is pressure-based. When the operating condition is windless, the speed of the velocity inlet v (x, y, z) = (60, 0, 0) m/s, because the wind speed in the wind tunnel test was 60 m/s. In this case, DAEH is the velocity inlet and BCGF is the pressure outlet. EABF, ABCD, HDCG and HEFG are the wall. HDCG is a sliding wall surface with a sliding speed of 60 m/s. When the train is running under the crosswind, the speed of the velocity inlet v(x, y, z) = (97.2, 15, 0) m/s, so that it can simulate the train running at 350 km/h while encountering a crosswind with a speed of 15 m/s. The boundary conditions are shown in Figure 4(c). HDCG is sliding wall surface with a sliding speed of 97.2 m/s. The number of boundary layers is 12 layers, y+ = 1. The height of the first boundary layer is 1 × 10−5 m. ρ = 1.225 kg/m3. The train grid and the boundary layer grid are shown in Figure 5.

Figure 5
figure5

Train grid and the boundary layer grid

Aerodynamic Model Verification

This paper not only has experimental data verification, but also does grid independence testing to eliminate the difference in results caused by the number of grids. The study by Li et al. [24] provided experimental data of the model's aerodynamic load when the train is running under windless conditions. Aerodynamic coefficients are solved using definition 11 and definition 12.

$$C_{d} = \frac{{F_{d} }}{{\frac{1}{2}\rho Av^{2} }},$$
(11)
$$C_{l} = \frac{{F_{l} }}{{\frac{1}{2}\rho Av^{2} }},$$
(12)

where Fd is drag force, and Fl is lift force of the train. v=60 m/s, A =0.1687 m2 and ρ=1.225 kg/m3.

A total of four sets of grids were used. As shown in the Table 1, EXP represents experimental values. Among them, the mesh amount of the mesh2 exceeds 30 million, and the simulation results are not significantly different from the mesh3, mesh4, and experimental results. The larger the number of grids, the greater computational cost. Based on the above considerations, mesh2 is selected for further research.

Table 1 Grid independence test

Vehicle Dynamic Model

The dynamic model is a single vehicle model, and both bogies are trailer bogies, as shown in Figure 6. The Dynamic model has a primary-suspension system and a secondary-suspension system. The mechanical parameters of some components have nonlinear characteristics. The letters in the Figure 6 represent the main components of train, as shown in Table 2.

Figure 6
figure6

Vehicle dynamic model

Table 2 Symbolic description of the vehicle dynamic model

The nonlinear characteristics of some components are shown in Figure 7. This paper selects the stiffness of steel spring of the primary-suspension in x, y, and z directions, the stiffness of the axle box arm positioning node in the x, y, z directions, the stiffness of the anti-rolling torsion bar, the damping of the shock absorber of the primary-suspension, and the damping of the yaw damper as design variables. There are nine variables. The damping force of yaw damper is velocity-dependent, as shown in the Figure 7(c). It has different damping values at different speed grades. It is inappropriate to artificially choose fixed damping values for the yaw damper. But we can specify the coefficient α to change the slope of the damping force function. The new damping force function can be set as αc3. Parameters of the benchmark model and the range of each design variable are shown in Table 3. We use Latin hypercube sampling technology to construct 200 initial samples in the design space. These dynamic model samples have different suspension parameters.

Figure 7
figure7

Damping force function and Elastic force function

Table 3 The main parameters of the benchmark model and the range of each design variable

Related Settings of Vehicle Dynamics Simulation

Vehicle dynamic simulations, presented in this paper are based on the commercial software Simpack. JingJin track spectrum is used as the track spectrum. The wheel tread type is LM tread. The wheel-rail contact force can be calculated by the simplified theory of Kalker (FASTSIM) [27]. The vehicle speed is 350 km/h. The train is run for 20 s. The number of sampling points is 5100. The wind load on the train consists of two parts: aerodynamic forces and aerodynamic moments. Lift and side forces are regarded as concentrated forces acting on the center of the car body. The roll moment, pitch moment and yaw moment obtained by CFD simulation are the moments of the aerodynamic force on the center of the car body. Therefore, in the vehicle dynamics model, they also act on the center of the car body. The wind load is applied to the centre of the car body from 0 s, and reaches full load after 2 s after which the wind load remains unchanged. When the simulation is completed, the wheel-rail vertical force and wheel-rail lateral force of each wheel are extracted.

Vehicle Dynamic Model Verification

The topological structure of all dynamic model samples is the same as that of the benchmark train. We verify whether this structure can respond correctly when excited. The train is only subjected to two types of loads, namely constant aerodynamic load and fluctuating track excitation. These two loads have been verified separately.

Aerodynamic Response Verification

For a moving train, side force has a very large impact on safety. Therefore, we use side force to verify whether the model behaves normally when encountering a constant load. When the train is running on a straight track (without the excitation effect of the track spectrum), the lateral resultant force of rail acting on the wheelset should be equal to the value of the side force of the train, but in the opposite direction. The time history curve of the side force and the resultant force of rail acting on the wheelset are shown in Figure 8. The results show that when the train runs stably, the resultant force on the wheelsets is 10 kN. This shows that the response of the model is normal.

Figure 8
figure8

Aerodynamic load verification

Rail Excitation Verification

In this section, we only consider the influence of track excitation on the dynamic behavior of the train, that is, with the train running without crosswinds. In the research process, the track spectrum we used is the JingJin spectrum. It is measured on the railway line. In order to check whether the wheelset response is normal, we replaced the JingJin spectrum with a sine function spectrum. The waveform of the track in the z direction is determined by Eq. (13):

$$z = A\sin \left( {\frac{{2{\uppi }x}}{\lambda }} \right),$$
(13)

where z is the unevenness of the track. A=0.5 mm. x is the longitudinal distance of the track. λ is the wavelength, λ=97.2 m.

When the train is excited by the track, the wheelsets will be forced to vibrate. If the model is accurate, the vibration frequency of the wheelset should be equal to the excitation frequency, when the state of motion is stable. The excitation frequency is as follows:

$$f = \frac{\omega }{{2{\uppi }}} = \frac{{2{\uppi }v}}{{2{\uppi }\lambda }} = 1\;{\text{Hz,}}$$
(14)

where v is the train speed, v=97.2 m/s.

The verification result of the vertical displacement of each wheelset is shown in Figure 9.

Figure 9
figure9

Vertical displacement of wheelsets

Figure 9 shows that in the time interval of 10‒20 s, the vertical displacement curve of each wheelset contains 10 cycles. The vibration frequency is 1 Hz. The response of the wheelset is correct. In summary, the model can respond correctly to both constant aerodynamic loads and fluctuating track excitations.

Optimization Process and Results Analysis

Aerodynamic Performance of High-Speed Train under Crosswind

The pressure distribution of the train is shown in Figure 10. More detail about the pressure distribution of head car and tail car as shown in Figure 11. The streamline distribution of the cross-section in each car as shown in Figure 12. The cross-sections were located in center point of each car. Figure 13 shows the vortex structure around the train in the form of vorticity.

Figure 10
figure10

Pressure distribution of the train (Pa)

Figure 11
figure11

Pressure distribution of the head car and tail car (Pa)

Figure 12
figure12

Streamlined distribution of each cross-section

Figure 13
figure13

Vortex structure around train

It can be seen from Figure 10 and Figure 11 that there is a large positive pressure zone on the windward side of the train. This is caused by the force of the crosswind. For the head car and the tail car, the streamlined area is most susceptible to crosswinds. On the windward side, there is a large area of positive pressure in the nose tip of the head car. On the leeward side, almost the entire streamlined area of the head car is a negative pressure area. And the value of negative pressure is very large. The huge difference in pressure between the two sides causes the head car to experience huge side forces. In the tail car, the situation was exactly the opposite. The area and value of negative pressure on the windward side of the streamlined area of the tail car are larger than those on the leeward side. This results in the streamlined area of the trailing car being mainly subjected to the pull by the crosswinds rather than a thrust. Figure 12 shows the streamline distribution of each cross-section. On the leeward side of the head car, the airflow has begun to form two vortexes. The cross section of the middle car shows that the vortex generated by the head car has become very large. As the vortex continues to develop, it will gradually move away from the tail car. Figure 13 shows the vortex structure in the form of vorticity. Steady-state simulation can get the main vortex structure around the train. We noticed that the area of the bogie, the leeward side of the head car and the streamline shape area of the tail car are the main areas where the vortex structure is generated. The head car produced two main vortex structures. The vortex structure located below the head car is relatively unstable, while another has been developing steadily along the train, and finally formed into the large vortex structure shown in Figure 12. Vortices may adversely affect the infrastructure around the railway, but these are not within the scope of this paper. This paper pays more attention to the aerodynamic loads on the train. When the train is running with the speed of 350 km/h, under the crosswind speed of 15 m/s, the drag force, side force, lift force and roll moment, pitching moment, and yaw moment of the scaled train are obtained. It is obviously inappropriate to apply the aerodynamic load of the scaled model directly to the full-scale dynamic model. In this paper, the Reynolds number is 3×106, and yaw angle is 8.7°. Although the Reynolds number of the scale model and the full-size model are different, the previous work has found that there are few effects of such Reynolds number on the aerodynamic coefficients of trains under different yaw angles between 0° and 30° [28, 29]. Based on the same coefficients of the two models, the aerodynamic load of the scaled model is converted into the aerodynamic load of the full-scale model, as shown in Table 4 and Table 5. However, in the field of train system dynamics, drag force is usually not considered. Among the remaining five types of aerodynamic loads, the values of the side force, roll moment, pitching moment and yaw moment of the head car are the largest. Although the lift force of head car is not the largest in the train, it is not small. Therefore, safety of the head car is probably the worst in the whole train. It is necessary to optimize the dynamics parameters of the head car to ensure that the high-speed train runs safely under crosswind.

Table 4 Aerodynamic force of full-size train
Table 5 Aerodynamic moments of full-size train

Safety Evaluation Metrics

We have 200 dynamic models with different suspension parameters. The aerodynamic loads shown in the Table 4 are applied to those dynamic models. The wheel-rail lateral and vertical forces of the train can be obtained through simulation, and the safety metrics can be calculated. Before research, it is necessary to describe the composition of the safety metrics. The safety metrics used in this study are determined according to GB/T 5599 and EN 14067-6 [18, 30].

  1. (1)

    Derailment coefficient Q/P

$$\frac{Q}{P} = \frac{\tan \alpha - \mu }{{1 + \mu \tan \alpha }} \le 0.8,$$
(15)

where Q and P are the wheel-rail lateral and vertical forces, respectively. α is rim angle and μ is wheel-rail friction coefficient.

  1. (2)

    Wheel-rail lateral force L

$$H \le 15 + \frac{{P_{0} }}{3} \le 55.5\;{\text{kN,}}$$
(16)

where P0 is the static axle load.

  1. (3)

    Vertical wheel-rail contact force Plim

$$P_{\lim } < 160\;{\text{kN}}{.}$$
(17)
  1. (4)

    Wheel load reduction rate \(\Delta P/\overline{P}\)

$$\frac{\Delta P}{{\overline{P}}} = \frac{{\left| {P_{R} - P_{L} } \right|/2}}{{P_{{{\text{st}}}} }} \le 0.8,$$
(18)

where ∆P is the wheel load reduction; PR, PL are the vertical forces on the left and right wheels, and Pst is the vertical static load of the wheel-rail.

  1. (5)

    overturning coefficient ∆P/Pst

$$\frac{\Delta P}{{P_{{{\text{st}}}} }} = 1 - \frac{{P_{i1} - P_{j1} }}{{P_{{{\text{st}}}} }} \le 0.9,$$
(19)

where Pi1 represents the wheel-rail vertical force of the front wheel on the load reduction side of the bogie, and Pj1 represents the wheel-rail vertical force of the rear wheel on the load reduction side of the bogie.

All metrics can be calculated from wheel-rail forces. Metric values change over time. We find the maximum value of each metrics and record it as the final safety value. Applying wind load to the benchmark model can obtain the safety metrics, as shown in the Table 6. We can see that the derailment coefficient and wheel-rail lateral force have exceeded the standard.

Table 6 Safety metrics of benchmark model

Structure of the Surrogate Model

We take the design variables of each dynamic model as input data, the safety metrics of the corresponding models as output data, and use neural networks to analyze this input and output. The neural network is our surrogate model. Before setting up a surrogate model, we must perform some processing on the data. There are nine design variables, whose range of values vary greatly. If we input them directly into the neural network, the mapping accuracy of the neural network may be poor. Therefore, it is necessary to normalize the input variables so that their values are between 0 and 1.

$$x_{{{\text{new}}}} = \frac{{x_{i} - x_{i\min } }}{{x_{i\max } - x_{i\min } }},$$
(20)

where xnew is the design variable after conversion, xi is the design variable before conversion, ximin is the minimum value of the xi, and ximax is the maximum value of xi.

Similarly, there are five metrics of safety, three of which are coefficients and the other two are forces. There is a huge difference between them. In order to avoid the loss of mapping accuracy of the neural network, we need to deal with it in advance. The difficulty was that, due to the uncertainty, we did not know the precise range of the five metrics before running the simulation. Usually the ranges of overturning coefficient, wheel load reduction ratio and derailment coefficient are between 0‒1. However, the derailment coefficient is special. It is the ratio of lateral force to vertical force. During the simulation, if the value of the lateral force is extremely large and the value of the vertical force is extremely small at the same instant, the maximum value of the derailment coefficient will be very large. In this case, the maximum value of the derailment coefficient of some samples is less than 1, and the maximum value of the derailment coefficient of other samples is much larger than 1. This situation is very unfavorable for the neural network to build a surrogate model. Usually, we don't really care about the specific value of the particularly large derailment coefficient. But, one of the objectives of our optimization is to make the derailment coefficient as small as possible. Therefore, after the dynamic simulation of a certain sample, if the maximum value of the derailment coefficient is greater than 1.5, we artificially set to be equal to 1.5 to avoid excessive differences between samples. The other two metrics that need to be dealt with are the vertical wheel-rail contact force and wheel-rail lateral force. Their values are much larger than 1. Because of large differences in magnitude between these two metrics and other metrics, it may be difficult for the surrogate model to establish a correct mapping relationship. However, the range of wheel-rail vertical force is typically within 0‒200 kN, and the range of wheel-rail lateral force is typically within 0‒150 kN. Thus we can use Eq. (20) to crudely normalize these two metrics.

The structure of the surrogate model is shown in Figure 14. It is a neural network with three hidden layers. The number of neurons in each hidden layer is different. Increasing the number of hidden layers and neurons will enhance the ability of the neural network to map non-linearly, but too many neurons and hidden layers will easily cause the surrogate model to overfit the data, which is not good for optimization. The nonlinear activation functions tanh, softplus and sigmoid are used to enhance the nonlinear fitting ability of the neural network. The optimizer is RMSprop, lr = 0.01, rho = 0.9.

Figure 14
figure14

Structure of surrogate model

Process of Optimization

The specific optimization process is shown in Figure 15. Minimizing the values of 5 objectives is the ultimate goal of optimization.

Figure 15
figure15

Process of optimization

As mentioned above, we already have two hundred sets of data. They comprise the design variables and corresponding safety metrics of each model. They are used as the input and output of the neural network to build a primary surrogate model. Of course, the mapping accuracy of the first attempt at a surrogate model is not sufficient. But NSGA-III algorithms can still find a series of non-dominated solutions based on this rough surrogate model. The population number of the NSGA-III algorithm is set to 300, and the evolutionary generation number is set to 300. In each cycle, we always can get a series of non-dominated solutions. Partial solutions in this non-dominated solution set are randomly selected. Design variables corresponding to these solutions are found and inputted into the dynamic model. Through dynamic simulation, the values of the safety metrics of each model can be obtained and compared with the predicted values obtained by the surrogate model. If the accuracy meets the requirements, it is determined that the optimization has been successful. The non-dominated solution set is the pareto-optimal front we need. Next, we can find the most suitable solution in the solution set according to preference. If the accuracy does not meet the requirements, the samples that have been simulated will be treated as new sample points and will be used in the expanded sample set together with the original samples in the next stage of neural network training. The accuracy of the surrogate model improves as the number of cycles increases. In terms of model structure, the greater the number of design variables and/or objectives, the more difficult it is to build a surrogate model. The train often experiences random excitations, caused by the effect of the track spectrum, which also makes it very difficult to build a very accurate surrogate model. Therefore, it is necessary to make some trade-offs between model accuracy and calculation cost. In each cycle, 20 solutions are randomly selected from the optimized pareto solution set. If these solutions satisfy Eq. (21), the accuracy requirements are met:

$$\frac{1}{20}\sum\limits_{j = 1}^{20} {\sum\limits_{i = 1}^{5} {\frac{{\left| {o_{i} - y_{i} } \right|}}{{y_{i} }}} } \times 100\% \le 35\% ,$$
(21)

where oi represents the ith predicted value of the solution by the surrogate model, and yi represents the ith simulation value obtained by the same solution put through the dynamic model.

In summary, \(\sum\nolimits_{i = 1}^{5} {\left| {o_{i} - y_{i} } \right|} /y_{i}\) represents the sum of the absolute values of the relative errors between the predicted values and the simulation values of each solution. \((1/20)\sum\nolimits_{j = 1}^{20} {\left( \bullet \right)}\) is to average the 20 individuals. Even so, we have paid a very high price. The initial 200 samples were far from enough, and a total of 670 samples were used throughout the study.

Optimization Result Analysis

A total of 210 pareto solutions were obtained. If the number of objectives is 2 or 3, a 2D plane coordinate system or a 3D space coordinate system could be used to represent the pareto solution. When the number of objectives is greater than 3, it is difficult to visualize the result by the traditional method. Parallel coordinates are introduced here to represent the pareto solution set, as shown in Figure 16. Before the pareto front is presented, it is necessary to perform numerical processing on each metric so that the five metrics having very different values can be represented on one graph, as shown in following equation.

$$\left( \frac{Q}{P} \right)_{{i{\text{new}}}} { = }\left( \frac{Q}{P} \right)_{i} /0.8,$$
(22)
$$H_{{i{\text{new}}}} = \frac{{H_{i} }}{55.5},$$
(23)
$$P_{{\lim_{{_{i} {\text{new}}}} }} = \frac{{P_{{\lim_{i} }} }}{160},$$
(24)
$$\left( {\frac{\Delta P}{{\overline{P}}}} \right)_{{i{\text{new}}}} = \left( {\frac{\Delta P}{{\overline{P}}}} \right)_{i} /0.8,$$
(25)
$$\left( {\frac{\Delta P}{{P_{{{\text{st}}}} }}} \right)_{{i{\text{new}}}} = \left( {\frac{\Delta P}{{P_{{{\text{st}}}} }}} \right)_{i} /0.9,$$
(26)

where i=1, 2, ..., 210, i is the serial number of the pareto solution \(\left( \frac{Q}{P} \right)_{inew}\), \(H_{{i{\text{new}}}}\), \(P_{{\lim_{{i{\text{new}}}} }}\), \(\left( {\frac{\Delta P}{{\overline{P}}}} \right)_{{i{\text{new}}}}\),\(\left( {\frac{\Delta P}{{P_{{{\text{st}}}} }}} \right)_{{i{\text{new}}}}\) are the derailment coefficient, wheel lateral force, wheel-rail vertical force, and wheel load reduction rate after conversion \(\left( \frac{Q}{P} \right)_{i}\), \(H_{i}\), \(P_{{\lim_{i} }}\), \(\left( {\frac{\Delta P}{{\overline{P}}}} \right)_{i}\),\(\left( {\frac{\Delta P}{{P_{{{\text{st}}}} }}} \right)_{i}\) are the derailment coefficient, wheel lateral force, wheel-rail vertical force, and wheel load reduction rate before conversion.

Figure 16
figure16

Pareto-optimal front

The purple line in Figure 16 represents the pareto-optimal front. All solutions are combined into a five-dimensional curved surface. The values of the safety metrics of the benchmark model are also shown on the graph. The pareto front is evenly distributed across the entire curved surface, indicating that pareto solutions are well distributed, proving the reliability of the algorithm. The result shows that four metrics are greatly improved compared to the benchmark model in derailment coefficient, vertical wheel-rail contact force, wheel-rail lateral force, and wheel load reduction rate. The improvement in overturning coefficient is not obvious. This shows that different objectives have different sensitivities to design variables. By changing the stiffness of the steel spring of the primary-suspension, the damping of the shock absorber of primary-suspension, the damping of the yaw damper, the stiffness of the anti-rolling torsion bar and the stiffness of the axle box arm positioning node, we can effectively reduce the derailment coefficient, vertical wheel-rail contact force, wheel-rail lateral force, and wheel load reduction rate of the train. However, it may not be effective to reduce the overturning coefficient only by changing the design variables mentioned above. If it is wished to further reduce the overturning coefficient, other methods may need to be considered. Because the overturning coefficient is quite unusual, it seems to have little correlation with the suspension parameters of the bogie. If we do not want an object to overturn, the most effective measures are to reduce the intensity of the crosswind or increase the mass of the object. By changing the shape of the car body, the effect of crosswind on trains can be reduced, thereby the overturning moment can be greatly reduced, which in turn reduces the overturning coefficient. We select a set of design variables from the pareto solution set, as shown in Table 7. The design variables are used as the input of the dynamic model to obtain the simulation values of the safety metrics. Predicted values by surrogate model and simulation values by dynamic model are shown in Table 8.

Table 7 Design variable
Table 8 Safety metrics after optimization

In Table 8, the difference between the predicted values and the simulation values is relatively small. Among the simulation values, the five metrics are all better than the benchmark model. The errors’ absolute values of derailment coefficient, vertical wheel-rail contact force, wheel load reduction rate, wheel-rail lateral force and overturning coefficient are 1.1%, 2.1%, 4.6%, 15.3% and 3.6% respectively. The total error of the 5 objectives is 26.7%. Of course, the improvement of the overturning coefficient is not obvious. The remaining four metrics are greatly improved, compared to the benchmark model. In particular, the derailment coefficient and the wheel-rail lateral force are no longer in excess of the safety standards, and they are far from the warning line. Table 8 proves that our idea is feasible. When the train is running in a crosswind environment, safety will be greatly reduced. Optimizing only two of these metrics does not guarantee train safety. Only when all safety metrics are taken into account can an optimization scheme be considered reasonable. And we can do this by combining the surrogate model method and advanced optimization algorithms. None of the five metrics exceeded the safety standards. They were all well below the alert values. Figure 17 shows the wheel-rail vertical and lateral forces of wheelset2 of the benchmark model and the optimized model (Wheelset2 being most representative of the differences in the forces experienced by these models). Where A is the lateral wheel-rail force of the right wheel, B is the lateral wheel-rail force of the left wheel, C is the vertical wheel-rail force of the right wheel, and D is the vertical wheel-rail force of the left wheel.)

Figure 17
figure17

Wheel-rail forces of benchmark model and optimized model

In Figure 17, we can see that both the benchmark model and optimized model's vertical force and lateral force of wheel-rail on the windward side are smaller than on the leeward side. Because the wind load is applied to the body center of the car body, the windward side wheel has reduced load and the leeward side wheel has increased load. Wheel-rail forces of both models fluctuate over time. This is caused by the synergistic effect of wind loads and track spectrum. However, it is obvious that the wheel-rail force fluctuations of the benchmark of the model are more severe than the optimized model. In particular, the benchmark model’s lateral and vertical forces of the wheel-rail on the leeward side have many protrusions on the time course curve. This phenomenon did not appear on the optimized model’s time course curve. The peak points of the protrusions are the maximum points of wheel-rail forces. In Figure 17(b), the maximum vertical force of the right wheel is 133.30 kN. This value is the Plim of the benchmark model. Other safety metrics are also obtained from wheel-rail forces, either directly, or indirectly. The extreme values of forces are the root cause of the train exceeding the safety standards. The optimization process is intended to adjust the dynamic parameters, to alleviate the abnormal fluctuation of wheel-rail force and reduce the peak value of the wheel-rail force. Reducing peaks is not only helpful for safety. We have reason to believe that this will also improve the smoothness and comfort for passengers when the train is running. But quantitative analysis needs further study.

Conclusions

  1. (1)

    Combining aerodynamics, the surrogate model method and the many-objective optimization meth-od, we can optimize train dynamic performance comprehensively, considering more safety metrics. These measures can make trains run more safely.

  2. (2)

    By changing the stiffness of the steel spring of the primary-suspension, the damping of the shock absorber of the primary-suspension, the damping of the yaw damper, the stiffness of the anti-rolling torsion bar and the stiffness of the axle box arm positioning node, we can effectively reduce the derailment coefficient, vertical wheel-rail contact force, wheel-rail lateral force, and wheel load reduction rate. However, an improvement of the overturning coefficient is not significant, and further research is needed.

  3. (3)

    When the train is running, the wheel-rail force may fluctuate drastically with time. Many protrusions features will be shown on the wheel-rail force time-history curve. The extreme points are the root cause of exceeding the safety standards. The optimization process is to adjust the parameters of the design variables to suppress the occurrence of fluctuation, thereby improving the safety metrics.

References

  1. [1]

    H Q Tian. Review of research on high-speed railway aerodynamics in China. Transportation Safety and Environment, 2119, 1(1): 1-21.

    Article  Google Scholar 

  2. [2]

    X He, H Li. Review of aerodynamics of high-speed train-bridge system in crosswinds. Journal of Central South University, 2020, 27(4): 1054-1073.

    Article  Google Scholar 

  3. [3]

    C Baker, H Hemida, S Iwnick, et al. Integration of crosswind forces into train dynamic modelling. Proceedings of the Institution of Mechanical Engineers, 2011, 225(2): 154-164.

    Article  Google Scholar 

  4. [4]

    E Deng, W C Yang, M F Lei, et al. Aerodynamic loads and traffic safety of high-speed trains when passing through two windproof facilities under crosswind: A comparative study. Engineering Structures, 2019, 188: 320-339.

    Article  Google Scholar 

  5. [5]

    N J Yan, X Z Chen, Y L Li. Assessment of overturning risk of high-speed trains in strong crosswinds using spectral analysis approach. Journal of Wind Engineering and Industrial Aerodynamics, 2018, 174: 103-118.

    Article  Google Scholar 

  6. [6]

    J M Olmos, M A Astiz. Non-linear vehicle-bridge-wind interaction model for running safety assessment of high-speed trains over a high-pier viaduct. Journal of Sound and Vibration, 2018, 419: 63-89.

    Article  Google Scholar 

  7. [7]

    L Wei, J Zeng, P B Wu, et al. Safety analysis of high speed trains under cross winds using indirect wheel-rail force measuring method. Journal of Wind Engineering and Industrial Aerodynamics, 2018, 183: 55-67.

    Article  Google Scholar 

  8. [8]

    J Q Niu, D Zhou, T H Liu, et al. Numerical simulation of aerodynamic performance of a couple multiple units high-speed train. Vehicle System Dynamics, 2017, 55: 681-703.

    Article  Google Scholar 

  9. [9]

    M G Yu, R C Jiang, Q Zhang, et al. Crosswind stability evaluation of high-speed train using different wind models. Chinese Journal of Mechanical Engineering, 2019, 32: 40.

    Article  Google Scholar 

  10. [10]

    Z Li, L Zheng, Y Ren, et al. Multi-objective optimization of active suspension system in electric vehicle with In-Wheel-Motor against the negative electromechanical coupling effects. Mechanical Systems and Signal Processing, 2019, 116: 545-565.

    Article  Google Scholar 

  11. [11]

    G Fossati, F Miguel, P Casas. Multi-objective optimization of the suspension system parameters of a full vehicle model. Optimization and Engineering, 2019, 20(1): 151-177.

    Article  Google Scholar 

  12. [12]

    L Zhang, J Y Zhang, T Li, et al. A multiobjective aerodynamic optimization design of a high-speed train head under crosswinds. Part F: Journal of Rail and Rapid Transit, 2018, 232(3): 895-912.

    Google Scholar 

  13. [13]

    J Liu, W P Song, Z H Hab, et al. Efficient aerodynamic shape optimization of transonic wings using a parallel infilling strategy and surrogate models. Structural and Multidisciplinary Optimization, 2016, 55(3): 925-943.

    Article  Google Scholar 

  14. [14]

    T Li, Z Y Dai, M G Yue, et al. Numerical investigation on the aerodynamic resistances of double-unit trains with different gap lengths. Journal of Engineering Applications of Computational Fluid Mechanics, 2021, 15(1): 549–560.

    Article  Google Scholar 

  15. [15]

    D Viúdez-Moreiras. Performance influences on metamodelling for aerodynamic surrogate-based optimization of an aerofoil. Engineering Optimization, 2019, 51(3): 427-446.

    MathSciNet  Article  Google Scholar 

  16. [16]

    J Munoz-Paniagua, J Garcia. Aerodynamic drag optimization of a high-speed train. Journal of Wind Engineering and Industrial Aerodynamics, 2020, 204.

  17. [17]

    L Zhang, T Li, J Zhang. Research on aerodynamic shape optimization of trains with different dimensional design variables. International Journal of Rail Transportation, 2020, Doi: https://doi.org/10.1080/23248378.2020.1817803.

    Article  Google Scholar 

  18. [18]

    Standardization Administration. GB5599-1985. Railway trains – Specification for evaluation the dynamic performance and accreditation test, Beijing: Standardization administration of P.R.C. Standards Press of China, 1985. (in Chinese)

  19. [19]

    K Funahashi. On the approximate realization of continuous mappings by neural networks. Neural Networks, 1989, 2: 183-192.

    Article  Google Scholar 

  20. [20]

    D E Rumelhart, G E Hinton, R J Williams. Learning representations by back-propagating errors. Nature, 1986, 323: 533-536.

    Article  Google Scholar 

  21. [21]

    M Q Zhang, L Wang, W A Guo, et al. Many-objective evolutionary algorithm based on relative non-dominance matrix. Information Sciences, 2021, 547: 963-983.

    MathSciNet  Article  Google Scholar 

  22. [22]

    K Deb, H Jain. An evolutionary many-objective optimization algorithm using reference-point-based nondominated sorting approach, Part I: Solving problems with box constraints. IEEE Transactions on Evolutionary Computation, 2014, 18(4): 577-601.

    Article  Google Scholar 

  23. [23]

    H Jain, K Deb. An evolutionary many-objective optimization algorithm using reference-point based non-dominated sorting approach, PartII: Handling constraints and extending to an adaptive approach. IEEE Transactions on Evolutionary Computation, 2014, 18(4): 602-622.

    Article  Google Scholar 

  24. [24]

    T Li, M Li, Z Wang. Effect of the inter-car gap length on the aerodynamic characteristics of a high-speed train. Proceedings of the Institution of Mechanical Engineers, Part F: Journal of Rail and Rapid Transit, 2019, 233: 448-465.

    Article  Google Scholar 

  25. [25]

    National Railway Administration of the People’s Republic of China, TB/T 3503.4-2018 Railway Applications-Aerodynamics-Part 4: Requirement for train aerodynamic simulation. Beijing: China Railway Culture Media Company Limited, 2018. (in Chinese)

  26. [26]

    T Li, D Qin, J Y Zhang. Effect of RANS turbulence model on aerodynamic behavior of trains in crosswind. Chinese Journal of Mechanical Engineering, 2019, 32: 85

    Article  Google Scholar 

  27. [27]

    J J Kalker. A fast algorithm for the simplified theory of rolling contact. Vehicle System Dynamic, 1982, 11: 1-13

    Article  Google Scholar 

  28. [28]

    F Dorigatti, M Sterling, C Baker, et al. Crosswind effects on the stability of a model passenger train-A comparison of static and moving experiments. Journal of Wind Engineering and Industrial Aerodynamics, 2015, 138: 36-51.

    Article  Google Scholar 

  29. [29]

    J Du, L Zhang, M Z Yang, et al. Moving model experiments on transient pressure induced by a high-speed train passing through noise barrier. Journal of Wind Engineering and Industrial Aerodynamics, 2020: 204.

  30. [30]

    European committee for standardization. EN14067-6-2018. Railway Applications-aerodynamics-part 6: Requirements and Test Procedures for Cross Assessment. Brussels: CEN-CENELEC Management Centre, 2018.

Download references

Acknowledgements

Not applicable.

Funding

Supported by The National Key Research and Development Program of China (Grant No. 2020YFA0710902), The National Natural Science Foundation of China (Grant No. 12172308), Sichuan Provincial Science and Technology Program of China (Grant No. 2019YJ0227), and State Key Laboratory of Traction Power of China (Grant No. 2019TPL_T02).

Author information

Affiliations

Authors

Contributions

LZ, TL and JZ were in charge of the whole trial; LZ wrote the manuscript; RP assisted with sampling and laboratory analyses. All authors read and approved the final manuscript.

Authors’ Information

Le Zhang, born in 1993, is currently a PhD candidate at State Key Laboratory of Traction Power, Southwest Jiaotong University, China. He received his bachelor degree from Southwest Jiaotong University, China, in 2016. His research interests include train aerodynamics and optimization.

Tian Li, born in 1984, is an associate professor at State Key Laboratory of Traction Power, Southwest Jiaotong University, China.

Jiye Zhang, born in 1965, is a professor at State Key Laboratory of Traction Power, Southwest Jiaotong University, China.

Ronghuan Piao, born in 1997, is currently a master candidate at State Key Laboratory of Traction Power, Southwest Jiaotong University, China.

Corresponding author

Correspondence to Jiye Zhang.

Ethics declarations

Competing Interests

The authors declare no competing financial interests.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Reprints and Permissions

About this article

Verify currency and authenticity via CrossMark

Cite this article

Zhang, L., Li, T., Zhang, J. et al. Optimization on the Crosswind Stability of Trains Using Neural Network Surrogate Model. Chin. J. Mech. Eng. 34, 86 (2021). https://doi.org/10.1186/s10033-021-00604-0

Download citation

Keywords

  • Safety
  • Surrogate model
  • Optimization
  • High-speed train
  • Crosswind