Physics-informed Deep Neural Network for Bearing Prognosis with Multi-sensory Signals

Prognosis of bearings is critical to improving the safety, reliability and availability of machinery systems. It provides a health condition assessment and, by predicting the remaining useful life (RUL), determines how long the machine will work before failure occurs. To overcome the drawbacks of purely data-driven methods and predict RUL accurately, a novel physics-informed deep neural network, named degradation consistency recurrent neural network, is proposed for RUL prediction by integrating the natural degradation knowledge of mechanical components. The degradation is monotonic over the whole life of bearings and is characterized by temperature signals. To incorporate this knowledge of monotonic degradation, a positive increment recurrence relationship is introduced to preserve the monotonicity. The proposed model is therefore relatively interpretable and capable of keeping the learning process consistent with physical degradation. The effectiveness and merit of RUL prediction using the proposed method are demonstrated with vibration signals collected from a set of run-to-failure tests.


Introduction
Prognosis and health management (PHM) of machine systems plays an important role in the digital transition of industry, which combines the digital, physical and human dimensions. It is a computation-based paradigm that leverages physical knowledge, monitoring data and human experience to achieve fault detection, degradation assessment, evolution prediction and remaining useful life prediction [1]. Considerable effort has been devoted to developing PHM techniques, including hardware (e.g., the Internet of Things and smart sensors) and software such as data analytics. PHM mainly comprises three aspects: fault diagnosis, evolution prognosis, and decisions for management. A large amount of current research focuses on fault diagnosis and prognosis, which are the prerequisites of health management [2].
The initial fault detection of mechanical components has been studied for many years and has achieved great success in many fields [3]. Once an initial fault is detected, it is more challenging to accurately predict how long the machine system will work before failure occurs, namely remaining useful life (RUL) prediction. RUL prediction methods are categorized into three classes: model-based methods, data-driven methods, and hybrid methods [4]. Model-based approaches usually take advantage of physical knowledge to model the degradation process for RUL prediction. Li et al. proposed an improved exponential model for RUL prediction of rolling element bearings, where an adaptive predicting time was developed based on the 3-sigma interval. A simulation and four tests of bearing degradation processes were employed to demonstrate its effectiveness [5]. Singleton et al. used both time and time-frequency domain features to track the degradation process of bearings and predicted the RUL under different operating conditions with an extended Kalman filter [6]. Model-based approaches require degradation models, which means one has to master the physical knowledge of the bearings' evolution. An alternative solution is to predict the RUL from historical data without physical models.
Data-driven methods are more widely investigated than model-based methods because they rely only on historical data and do not require a full understanding of the degradation models [7]. With the accumulation of monitoring data, machine learning models, including deep learning architectures, are built to predict the RUL without physical models. For example, Sun et al. proposed a deep transfer learning (DTL) network based on a sparse autoencoder for RUL prediction. RUL prediction of cutting tools using the DTL model achieved higher accuracy than other methods [8]. Ma et al. proposed a convolutional neural network for RUL prediction, where time-frequency features were adopted to capture long-term dependencies through the convolution operation [9].
Since deep learning has made breakthroughs in many applications such as image recognition, speech recognition, and language translation [10], it has been widely investigated for prognosis of mechanical components, such as bearings [11] and gears [12]. However, it is difficult to apply deep learning methods to real mechanical systems. A primary factor is the black-box nature of deep learning frameworks, which makes the learned features hard to understand. Even though deep learning models may achieve somewhat more accurate predictions, they do not provide the ability to understand the underlying processes. Moreover, an interpretable model that includes physical knowledge stands a better chance of safeguarding against spurious models built from historical data, which may generalize poorly. This is especially critical when making predictions for complex systems whose failure would cause significant accidents. As a first step toward moving beyond black-box deep learning models, physical knowledge is integrated with deep learning models to improve their interpretability.
Motivated by embedding physical knowledge into deep neural models, this study proposes a degradation-knowledge-based deep learning model for remaining useful life prediction. Instead of using purely data-driven methods, we embed well-known physical principles into recurrent neural networks. As stated in [13], the degradation process of mechanical systems is monotonic, which means that components cannot heal without repair. Thus, an ideal degradation indicator should be monotonic over time. In this study, temperature signals collected during bearing run-to-failure tests are used to describe the degradation process because they are more monotonic than vibration signals. To ensure that the learned features of deep models are consistent with the physical knowledge, i.e., the monotonic characteristic of the degradation process, a degradation consistency deep neural network is proposed which preserves the monotonicity of degradation.

Literature review
With the breakthroughs of deep learning models in many fields, there is growing interest in the scientific community in exploiting deep models for prognosis of mechanical components [14]. One can directly build the mapping functions from datasets of whole degradation trajectories, but this approach neglects physical knowledge. To overcome this drawback of purely data-driven methods, knowledge-guided data science has been investigated, which aims to leverage the wealth of physical information to improve the generalization of data-driven models [15]. Karpatne et al. proposed a physics-guided neural network (PGNN) that combines the scientific knowledge of physics-based models with neural networks for lake temperature modeling. By leveraging scientific knowledge to guide the modeling of the neural network, they demonstrated that PGNN has better generalizability and scientific consistency [16]. Raissi et al. introduced a physics-informed neural network to solve supervised learning problems while respecting given principles governed by nonlinear partial differential equations [17]. Its effectiveness is illustrated through cases in fluids, quantum mechanics, and other fields. Furthermore, models incorporating physical knowledge may be scientifically interpretable. There are various ways of embedding physical knowledge in deep neural networks. Daw et al. developed a physics-guided neural network framework that integrates the models with uncertainty quantification. The results show that the Monte Carlo estimates match the distribution of actual measurements correctly [18].
In the area of bearing RUL prediction, data-driven methods commonly neglect degradation knowledge. Over the whole life of a bearing, the degradation process is monotonic, a property that is usually ignored when predicting RUL with vibration signals. Thus, it is necessary to consider the degradation properties when constructing deep learning models. In this study, the degradation process is embedded into a deep neural network, which is expected to produce more interpretable models.

The proposed DcRNN for RUL prediction
In this study, a novel method, named degradation-consistency RNN (DcRNN), is proposed for prognosis of mechanical components. The framework of the RUL prediction procedure with DcRNN is shown in Figure 1.

The basic RNN architecture
Recurrent neural networks (RNNs) have been widely used for time-series prediction, speech recognition, language translation and many other applications by incorporating the sequential information of time-series signals [20]. RNNs model the sequential context among the signals by carrying a hidden state vector $h_{t-1}$ from the last step to the current time step $t$. In the LSTM form of this recurrence, the update is controlled by an input gate, a forget gate, and an output gate.
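The hidden-state recurrence described above can be sketched in a few lines of Python. This is a minimal illustration of a vanilla (ungated) RNN step, not the LSTM-based network of this paper; the function names `rnn_step` and `rnn_forward`, the tanh activation, and the toy weights are all assumptions chosen for illustration.

```python
import math

def rnn_step(x_t, h_prev, W_x, W_h, b):
    """One vanilla RNN update (assumed form): h_t = tanh(W_x x_t + W_h h_prev + b)."""
    h_t = []
    for i in range(len(b)):
        pre = b[i]
        pre += sum(w * x for w, x in zip(W_x[i], x_t))      # input contribution
        pre += sum(w * h for w, h in zip(W_h[i], h_prev))   # recurrent contribution
        h_t.append(math.tanh(pre))
    return h_t

def rnn_forward(xs, h0, W_x, W_h, b):
    """Run the recurrence over a sequence and return every hidden state."""
    states, h = [], h0
    for x_t in xs:
        h = rnn_step(x_t, h, W_x, W_h, b)
        states.append(h)
    return states

# Toy example (hypothetical values): 2 input features, 3 hidden units, 4 time steps.
W_x = [[0.5, -0.2], [0.1, 0.3], [-0.4, 0.2]]
W_h = [[0.1, 0.0, 0.2], [0.0, 0.1, -0.1], [0.3, -0.2, 0.1]]
b = [0.0, 0.1, -0.1]
xs = [[1.0, 0.5], [0.8, 0.7], [0.6, 0.9], [0.4, 1.1]]
states = rnn_forward(xs, [0.0, 0.0, 0.0], W_x, W_h, b)
```

Each entry of `states` depends on all earlier inputs through `h_prev`, which is how the network carries sequential context forward. A gated cell such as an LSTM replaces the single tanh update with gate-controlled updates but keeps the same step-by-step structure.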


Degradation-consistency RNN
To improve the generalizability and scientific interpretability of machine learning models, physical knowledge should be considered so that the models remain consistent with known principles. In this study, both the prediction loss in the target space $y$ and the violations of physical knowledge in the model outputs $p$ are leveraged. Both are used to compute the final loss function:
$$\arg\min_{W} \; \mathcal{L}(\hat{y}, y) + \lambda \, \mathcal{L}_{phy}(p),$$
where $\lambda$ is a trade-off hyper-parameter that controls the weight between physical consistency and empirical loss. In this way, the weights $W$ of the deep neural model are searched within a region that remains consistent with physical knowledge.
In this study, the degradation information of bearings is considered when building the deep neural model, so that the learning process of the model stays consistent with the physical degradation process. Since the degradation is monotonic and assumed to increase over time, the degradation trajectory is expressed as
$$d_t = d_{t-1} + \Delta d_t,$$
where $\Delta d_t \ge 0$ is the degradation increment over time due to working under loads. This means that the degradation process is irreversible; thus the learned features of the deep neural network should be consistent with the irreversible evolution of the bearing's health condition. The degradation consistency RNN is constructed on the basic LSTM architecture by embedding the degradation knowledge. The monotonic characteristic is modeled in the proposed DcRNN by building a monotonic trend relationship. To ensure that the deep learning model is consistent with physical knowledge, monotonicity is preserved by introducing the degradation change. Instead of using vibration signals for RUL prediction in an end-to-end way, the learned features are informed by degradation monotonicity, which is represented by physical intermediate variables that increase over time. However, it is hard to obtain the monotonic index directly because it cannot be measured through sensors.
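One way to realize a positive-increment recurrence together with a two-part loss is sketched below. It is an illustration under stated assumptions rather than the authors' implementation: the softplus mapping, the monotonicity-violation penalty, the mean-squared-error data term, the trade-off value `lam=0.5`, and all function names are hypothetical choices. The paper ties the latent features to temperature signals, which this sketch does not model.

```python
import math

def softplus(z):
    """Numerically stable log(1 + exp(z)); maps any real number to a non-negative one."""
    return math.log1p(math.exp(-abs(z))) + max(z, 0.0)

def monotone_trajectory(raw_increments, d0=0.0):
    """Positive-increment recurrence (assumed form): d_t = d_{t-1} + softplus(raw_t)."""
    trajectory, d = [], d0
    for z in raw_increments:
        d += softplus(z)          # the increment is never negative
        trajectory.append(d)
    return trajectory

def mse(pred, target):
    """Mean squared error, used here as the empirical (data) loss."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)

def monotonicity_violation(trajectory):
    """Mean size of decreases between consecutive steps; zero if non-decreasing."""
    drops = [max(a - b, 0.0) for a, b in zip(trajectory, trajectory[1:])]
    return sum(drops) / len(drops)

def total_loss(rul_pred, rul_true, latent, lam=0.5):
    """Data loss plus lam times the physics-violation penalty (lam is hypothetical)."""
    return mse(rul_pred, rul_true) + lam * monotonicity_violation(latent)

# Unconstrained raw outputs still yield a non-decreasing latent degradation state.
raw = [-2.0, 0.3, -0.7, 1.5]
latent = monotone_trajectory(raw)

# A non-monotone sequence is penalized by the physics term.
noisy = [0.1, 0.4, 0.3, 0.9]
penalty = monotonicity_violation(noisy)
```

Building the latent state as a running sum of non-negative increments enforces monotonicity by construction, whereas the penalty term only discourages violations. Which of the two a model relies on is a design choice; the sketch shows both so they can be compared.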

Experiment setup
Bearing run-to-failure tests were carried out on a specially designed test rig to observe the natural degradation process. The rig includes a power and drive system, a hydraulic loading system, a lubrication system, a control system and an independent data recording system. Its main part is a support beam structure in which two test bearings are installed at the two ends of the shaft, as shown in Figure 3.
Two sets of bearing run-to-failure tests were analyzed in this study. The failure modes of the bearing in Test I are inner race, outer race and rolling element faults, while that in Test II is an inner race fault. There are 4071 and 763 datasets for Test I and Test II, respectively. The vibration and temperature signals over the whole life of the bearings in Test I are presented in Figures 4 and 5. The bearing works in a healthy state for a long time; then an initial defect occurs and it enters a degradation stage. With fault development and damage accumulation, the bearing's performance deteriorates over time. Vibration and temperature signals in Test II are shown in Figures 6 and 7, where a similar degradation process is observed. The amplitude of the vibration signals decreased at 343.3 hours in Test I and 57.1 hours in Test II, but this does not mean that the bearing's health condition improved, because a bearing's degradation is irreversible without maintenance. Thus, the degradation should be monotonic over time. When RUL is predicted with vibration signals, the physical knowledge of monotonic degradation should be embedded into the neural network to improve performance.
The statistical features listed in Table 2 are also used as inputs of the model. To some extent, they are capable of reflecting the degradation process, as shown in Figures 8 and 9.
The mean absolute percentage error (MAPE) and root mean square error (RMSE) are defined as
$$\mathrm{MAPE} = \frac{100\%}{n} \sum_{i=1}^{n} \left| \frac{\hat{p}_i - p_i}{p_i} \right|, \qquad \mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( \hat{p}_i - p_i \right)^2},$$
where $\hat{p}$ and $p$ are the predicted and actual values, and $n$ is the number of samples.
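The two error metrics used for evaluation can be computed as in the sketch below, which assumes the standard definitions of MAPE and RMSE. The function names and the sample values are illustrative only and are not results from the tests.

```python
import math

def mape(predicted, actual):
    """Mean absolute percentage error in percent. Actual values must be non-zero."""
    n = len(actual)
    return 100.0 / n * sum(abs((p_hat - p) / p) for p_hat, p in zip(predicted, actual))

def rmse(predicted, actual):
    """Root mean square error, in the same unit as the inputs."""
    n = len(actual)
    return math.sqrt(sum((p_hat - p) ** 2 for p_hat, p in zip(predicted, actual)) / n)

# Hypothetical RUL values in hours, chosen so every prediction is off by 10 percent.
actual = [100.0, 80.0, 50.0, 20.0]
predicted = [110.0, 72.0, 55.0, 18.0]

mape_value = mape(predicted, actual)
rmse_value = rmse(predicted, actual)
```

Because MAPE divides by the actual value, it is undefined when the true RUL is zero and grows large near the end of life, where the true RUL is small. RMSE has no such restriction but is dominated by large absolute errors, so the two metrics are complementary.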
The prediction errors are shown in Table 3. It is seen that when the frequency features are used as inputs, the prediction errors are the

Comparison and Discussion
To demonstrate the advantages of the proposed model that embeds physical knowledge, a conventional LSTM is adopted as a baseline for RUL prediction with vibration signals. The comparison shows that the proposed DcRNN achieves higher prediction accuracy, which demonstrates the benefit of incorporating physical knowledge. During training, the learned features are forced to be consistent with the degradation process, which helps to improve the predicted results.

Conclusion
In this work, a novel physics-informed deep neural network, named DcRNN, is proposed for RUL prediction of bearings. Traditional deep learning models for RUL prediction are purely data-driven and ignore physical information. The proposed DcRNN is able to learn features that are consistent with scientific principles, which is a step toward constructing interpretable and generalizable deep neural models. More specifically, the latent variables are kept consistent with the degradation state, which is monotonic, and temperature signals are used to represent the degradation process. The latent features and vibration signals are then used for RUL prediction. Bearing run-to-failure tests were carried out to obtain historical data over the whole life. RUL prediction was performed with vibration and temperature signals using the proposed method. The results show that deep neural models that embed physical knowledge have the potential for accurate RUL prediction. In future work, models that include more physical knowledge should be constructed, such as the degradation knowledge of dynamic models. With more physical knowledge incorporated, deep neural networks are expected to be more generalizable and to perform better in prediction.
Karniadakis et al. summarized some prevailing trends in embedding physics into machine learning for forward and inverse problems, such as discovering hidden physics [19].

Figure 1. DcRNN paradigm aims to infuse degradation knowledge for RUL prediction.

Figures 4 and 5. Bearing's vibration and temperature signals of Test 1.
Figures 6 and 7. Bearing's vibration and temperature signals of Test 2.

The loss values over epochs of training data in Test I are shown in Figure 10. The loss function contains two parts: the data loss and the physical loss. Both loss values converge to a small value after 500 epochs, which means that the physical information is considered during the training process. To show the physical consistency in the training process, the physical degradation and the learned features are illustrated in Figure 11. The loss values of training data and physical consistency in Test II are shown in Figures 13 and 14, respectively. Compared with the physical degradation states, which are represented by temperature signals, the learned features follow the same trend, which means that the proposed model is capable of extracting features that are consistent with physical knowledge. The predicted RUL results are shown in Figures 12 and 15. To evaluate the performance of the proposed DcRNN for RUL prediction quantitatively, the mean absolute percentage error (MAPE) and root mean square error (RMSE) are used.

Figure 8. Statistical features of Test 1.

Figure 10. Loss values of training data in Test 1.
Figure 12. RUL prediction of Test 1.

Table 2 The
The conventional LSTM includes two layers with 128 hidden units each. The input features and the training and testing datasets are the same as those used for the proposed architecture. The RUL prediction results are shown in Table 4.

Table 3. RUL prediction with the proposed method