Wind Turbine Fault Detection Using Machine Learning And Neural Networks
By Mangesh Kale and Narayani Ghatwai, eInfochips
1. Introduction
The increasing demand for energy and the rapid rise of greenhouse gas emissions from fossil fuels have driven the search for new ways to generate renewable energy. Wind turbines have become one of the most popular renewable sources of electrical energy because they can generate reliable, clean energy at costs now comparable to conventional nuclear energy sources.
Wind turbines are massive pieces of equipment and are typically installed in locations characterized by extreme climates to exploit the high wind energy potential. Regular on-site inspection and preventative maintenance of this equipment are required to sustain long-term returns. In addition to the maintenance tasks, random electrical and mechanical failures can cause breakdowns and damage, leading to machine downtime and lost energy production.
Offshore wind turbines in farm locations are hard to reach and may pose problems in maintenance cycles, the cost of repair, and repair procedures. The smart solution is to utilize remote monitoring and diagnostics based on the sensor data.
2. Remote Fault Monitoring and Detection Concept
A fault is an unpermitted deviation of at least one parameter or characteristic property of the system from the acceptable or standard condition, whereas a failure is a permanent interruption of a system’s ability to perform the required function under specified operating conditions.
The repeated occurrence of some faults may cause a system failure; hence, early detection of faults is crucial in keeping the system running over the long term.
Fault detection methods are typically classified into two categories: model-based methods and signal processing based (feature-based) methods. Model-based methods are based on system modeling and model evaluation. In signal processing based methods, mathematical or statistical operations are performed or artificial intelligence (AI) techniques are applied to suitable signal features to extract the information about faults. Feature-based methods are best suited for remote monitoring as sensor data provide in-situ measurements and can be transported to the processing center by various means.
In order to develop a reliable fault-detection mechanism using feature-based methods, there is a need for information describing the state of each monitored component. Such information is extracted through various sensor signals. The signals used may include acoustic emission, vibration, torque, strain, temperature, electrical output, lubrication oil quality, and supervisory control signals.
3. Fault Diagnostics Process Flow
There are three major steps:
- Access and pre-process data:
The sensor data received from equipment need preprocessing, since not all of the data are useful. Preprocessing can be done using techniques such as:
  - Time series data synchronization: To align data that may contain missing values or that were sampled at different rates.
  - Advanced signal filtering: To eliminate noise from sensor data.
  - Feature selection, transformation and extraction: To determine which data will be helpful for predicting the failures.
- Develop fault detection models:
Fault detection models are based on mathematical, statistical or artificial intelligence algorithms for data clustering, classification, and system identification. These models are typically trained, validated, and tested using predictors and response data.
- Deploy models in production:
The developed fault detection model is then deployed to enterprise systems, machines, clusters, or clouds, and can also be targeted to real-time embedded hardware.
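The first step above can be illustrated with a minimal Python sketch. It assumes two hypothetical sensors sampled at different rates, aligns them on a common time grid by linear interpolation, and smooths one of them with a trailing moving average. The function names, window size, and sample values are illustrative assumptions, not part of the original solution.

```python
from bisect import bisect_left


def resample_linear(times, values, new_times):
    """Linearly interpolate (times, values) onto new_times.

    Samples outside the original time range are held at the nearest endpoint.
    """
    out = []
    for t in new_times:
        if t <= times[0]:
            out.append(values[0])
        elif t >= times[-1]:
            out.append(values[-1])
        else:
            i = bisect_left(times, t)  # times[i - 1] < t <= times[i]
            t0, t1 = times[i - 1], times[i]
            v0, v1 = values[i - 1], values[i]
            out.append(v0 + (v1 - v0) * (t - t0) / (t1 - t0))
    return out


def moving_average(values, window=3):
    """Trailing moving average; the first samples use a shorter window."""
    out = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        out.append(sum(chunk) / len(chunk))
    return out


# Hypothetical sensors: speed sampled every second, temperature every two seconds.
speed_t, speed_v = [0, 1, 2, 3, 4], [10.0, 10.2, 10.1, 10.4, 10.3]
temp_t, temp_v = [0, 2, 4], [40.0, 42.0, 41.0]

common_t = [0, 1, 2, 3, 4]
temp_aligned = resample_linear(temp_t, temp_v, common_t)  # fills the missing samples
speed_smooth = moving_average(speed_v, window=3)          # reduces sample-to-sample noise
```

A moving average is only one simple noise filter; the appropriate filter depends on the sensor and the fault signatures of interest.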
4. Fault Detection Model Development using AI
Faults can be detected from sensor data using artificial intelligence techniques such as machine learning and neural networks. These techniques learn from training data without being explicitly programmed. The trained algorithm can then be used to make predictions from new data collected from the sensors.
Machine learning tasks are mainly classified into two categories, depending on whether both input (predictor) and output (response) data are available to the learning system.
These are:
- Supervised learning: The algorithm is trained with example inputs and associated desired outputs and the goal is to learn a general rule that maps inputs to outputs.
- Unsupervised learning: In this scenario, labels are not provided to the learning algorithm, so it must find structure in the inputs on its own. Unsupervised learning is mainly used for discovering hidden patterns in data.
In the case of wind turbines, supervised learning is the most appropriate, as historical sensor data and the corresponding fault occurrence data can be used as predictors and responses, respectively, to train fault detection algorithms.
Figure 1. Supervised learning for fault diagnostics
Supervised learning can be classified into two different categories of algorithms:
- Classification: For categorical response values, where the data can be separated into specific “classes”
- Regression: For continuous-response value prediction
The fault detection problem falls under the category of classification, as the sensor values are used to predict a categorical response, “fault” or “no-fault”. There are various classification algorithms available. However, the following two algorithms are well suited to fault detection in wind turbines using sensor data:
- Support Vector Machines (SVM)
- Artificial Neural Networks
4.1 Support Vector Machines (SVM)
An SVM classifies data by identifying the best hyperplane that separates all the data points of one class from those of the other class. For an SVM, the best hyperplane is the one that has the largest margin between the two classes. By margin, we mean the maximal width of the slab parallel to the hyperplane that has no interior data points.
The support vectors are the data points that are closest to the separating hyperplane. These data points lie on the boundary of the slab. Figure 2 illustrates these definitions.
Figure 2. SVM concept
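The margin and support-vector definitions can be checked numerically on a toy example. The sketch below uses four hand-picked 2-D points and a hand-derived separating hyperplane (both are illustrative assumptions, not turbine data and not the output of a trained SVM). It computes the slab width as 2/||w|| and picks out the points that lie on the slab boundary.

```python
import math

# Toy 2-D data with labels +1 and -1 (illustrative values only).
points = [((2.0, 2.0), 1), ((3.0, 3.0), 1), ((0.0, 0.0), -1), ((-1.0, 0.0), -1)]

# Hand-derived maximum-margin hyperplane w.x + b = 0 for this toy set.
w, b = (0.5, 0.5), -1.0


def decision(x):
    """Signed score w.x + b; its sign is the predicted class."""
    return w[0] * x[0] + w[1] * x[1] + b


# Width of the empty slab between the two classes.
margin = 2.0 / math.hypot(*w)

# Support vectors lie on the slab boundary, where |w.x + b| = 1.
support_vectors = [x for x, _ in points if abs(abs(decision(x)) - 1.0) < 1e-9]

# Every point is on the correct side with score magnitude of at least 1.
all_separated = all(y * decision(x) >= 1.0 - 1e-9 for x, y in points)
```

Here the two support vectors are the closest pair of opposite-class points, and the margin equals the distance between them.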
The above illustration is an example of a linear classifier, i.e. a classifier that separates a set of objects into their respective groups with a line. Most classification tasks, however, are more complex and often need nonlinear structures to achieve an optimal separation.
Figure 3. SVM: Non-linear separation
In comparison to the previous schematic, a full separation of objects will require a curve (which is more complex than a line). Figure 4 shows how Support Vector Machines perform the separation operation in such a case.
Figure 4. Non-linear to linear transformation
It can be observed that the original objects as shown on the left side of the schematic are mapped, i.e., rearranged or reorganized using a set of mathematical functions known as kernels. The mapped objects (right side of the schematic), in this new setting are linearly separable. This is why it is only necessary to find an optimal line that can separate the objects instead of constructing the complex curve (left schematic).
There are a variety of kernels that can be used in Support Vector Machine models. These include polynomial, linear, sigmoid and radial basis function (RBF).
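As a rough illustration, the four kernels named above can be written as plain Python functions, and the mapping idea can be shown with an explicit lift into a third dimension. The parameter names (`gamma`, `degree`, `coef0`), their default values, and the toy points are assumptions chosen for this sketch.

```python
import math


def linear_kernel(x, z):
    return sum(a * b for a, b in zip(x, z))


def polynomial_kernel(x, z, degree=2, coef0=1.0):
    return (linear_kernel(x, z) + coef0) ** degree


def rbf_kernel(x, z, gamma=0.5):
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)


def sigmoid_kernel(x, z, gamma=0.5, coef0=0.0):
    return math.tanh(gamma * linear_kernel(x, z) + coef0)


def lift(x):
    """Explicit map (x1, x2) -> (x1, x2, x1^2 + x2^2)."""
    return (x[0], x[1], x[0] ** 2 + x[1] ** 2)


# The inner class sits near the origin and the outer class surrounds it, so no
# straight line in the original 2-D plane separates them.
inner = [(0.5, 0.0), (0.0, -0.5), (-0.3, 0.3)]
outer = [(2.0, 0.0), (0.0, -2.0), (-1.5, 1.5)]

# After lifting, a flat plane at height 1.0 on the third coordinate splits the classes.
threshold = 1.0
inner_below = all(lift(p)[2] < threshold for p in inner)
outer_above = all(lift(p)[2] > threshold for p in outer)
```

A kernel computes the inner product in such a mapped space directly, so an SVM does not need to construct the mapped coordinates explicitly; the `lift` function is shown only to make the rearrangement visible.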
4.2 Artificial Neural Networks
Artificial neural networks (NNs) are simplified models of the biological nervous systems. A NN can be described as a data processing system consisting of a large number of simple, highly interconnected processing elements (artificial neurons), in an architecture inspired by the structure of the cerebral cortex of the brain. The interconnected neural computing elements have the capability to learn and hence acquire knowledge and make it amenable for prediction of events. NNs have wide applications in areas such as image processing, pattern recognition, forecasting, optimization, and control systems.
A NN can be defined based on the following three characteristics:
1. Architecture:
This is based on the number of layers and the number of nodes (neurons) in each of the layers.
Figure 5. General architecture of Neural Network
Neurons are organized into layers: input, hidden and output. The input layer holds the input parameter values that act as inputs (along with external stimulus) to the next layer of neurons. The next layer is the hidden layer. There can be several hidden layers in the same neural network. The last layer is the output layer. In this layer, there is one node for each class. Based on the architecture of layers, NNs can be divided into three categories:
- Single layer feedforward network:
This type of network consists of two layers, namely the input layer and the output layer. The input layer neurons receive the input signals and the output layer neurons produce the output signals. Weighted links connect every input neuron to the output neurons, but not vice versa; such a network is said to be feedforward. Despite the presence of two layers, the network is termed single layer because the input layer only transmits the signals and the output layer alone performs the computation. Such networks find applications in association learning.
- Multilayer feedforward network:
This network is made of multiple layers. It possesses an input and an output layer and also has one or more intermediary layers called hidden layers. The computational units of the hidden layer are usually known as the hidden units or hidden neurons. These NNs are highly general and can be adapted to various applications by selecting suitable activation functions and training algorithms.
- Recurrent network:
Recurrent networks differ from feedforward network architectures in that they contain at least one feedback loop. A recurrent network may have a layer with feedback connections, and it may also have neurons with self-feedback links, where the output of a neuron is fed back into itself as input.
2. The learning mechanism applied for updating the weights of the connections:
Learning methods in NNs can be classified into two basic types:
- Supervised Learning: In supervised learning, every input pattern used to train the network is associated with a desired or target output pattern. During the learning process, a teacher is assumed to be present, and the correct expected output is compared with the output computed by the network to determine the error. Tasks that fall under this category include pattern recognition and regression.
- Unsupervised Learning: In this method, the target output is not provided to the network. The system learns on its own by discovering and adapting to structural features in the input patterns, since there is no teacher to present the desired patterns. Tasks that fall under this category include clustering, compression, and filtering.
3. The activation functions used in various layers:
Activation functions are used to limit the output of a neuron in a neural network to a certain range. Some of the commonly used activation functions are linear, step, ramp, sigmoid, hyperbolic tangent, and Gaussian.
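The architecture and activation functions described above can be sketched in a few lines of Python. The functions below use common textbook forms of each activation (the exact forms, the layer sizes, and the weights are assumptions chosen for illustration), and `feedforward` passes a signal through a multilayer feedforward network one layer at a time.

```python
import math


def linear(x):
    return x


def step(x):
    return 1.0 if x >= 0.0 else 0.0


def ramp(x):
    """Linear between 0 and 1, clipped outside that range."""
    return min(1.0, max(0.0, x))


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def gaussian(x):
    return math.exp(-x * x)


def layer_forward(inputs, weights, biases, activation):
    """One fully connected layer: activation(weights . inputs + bias) per neuron."""
    return [
        activation(sum(w * v for w, v in zip(row, inputs)) + bias)
        for row, bias in zip(weights, biases)
    ]


def feedforward(inputs, layers):
    """Pass inputs through each (weights, biases, activation) layer in order."""
    signal = inputs
    for weights, biases, activation in layers:
        signal = layer_forward(signal, weights, biases, activation)
    return signal


# Hypothetical network: 2 inputs, one hidden layer of 2 tanh neurons, 1 sigmoid output.
network = [
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0], math.tanh),
    ([[1.0, 1.0]], [0.0], sigmoid),
]
output = feedforward([0.0, 0.0], network)
```

Signals only flow from one layer to the next, which is what makes this a feedforward network; a recurrent network would additionally feed some outputs back as inputs.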
The back-propagation algorithm is one of the most popular NN training algorithms. It works as follows:
- Assign random weights to all the linkages that connect nodes to initiate the algorithm.
- Using inputs and input-hidden node linkages, find the activation rate (output) of hidden nodes.
- Using the activation rate of hidden nodes and linkages to output, find the activation rate of output nodes.
- Find error at the output node and recalibrate all the linkages between hidden nodes and output nodes.
- Using weights and error found in the output node, cascade down the error to hidden nodes.
- Recalibrate weights between hidden nodes and input nodes.
- Repeat process till the convergence criterion is met.
- Using final linkage weights, calculate the activation rate of output nodes.
The key is finding the right set of weights for all of the connections to make the correct decisions for output classification.
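The steps listed above can be sketched as a small, dependency-free Python routine. This is a minimal illustration, assuming sigmoid activations, a single hidden layer, a squared-error measure, and toy logical-OR data in place of real sensor features; the learning rate, hidden-layer size, and stopping tolerance are arbitrary choices. The hidden-node error terms are computed before the hidden-to-output weights are changed, which is the usual way to implement the error cascade.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def forward(x, w_ih, w_ho):
    """Return (hidden activations, output activation); last weight in each row is a bias."""
    hidden = [sigmoid(sum(w * v for w, v in zip(row, x + [1.0]))) for row in w_ih]
    output = sigmoid(sum(w * h for w, h in zip(w_ho, hidden + [1.0])))
    return hidden, output


def mean_squared_error(data, w_ih, w_ho):
    return sum((t - forward(x, w_ih, w_ho)[1]) ** 2 for x, t in data) / len(data)


def train(data, n_hidden=3, rate=0.5, max_epochs=5000, tol=1e-3, seed=0):
    rng = random.Random(seed)
    n_in = len(data[0][0])
    # Step 1: random initial weights on every link.
    w_ih = [[rng.uniform(-1, 1) for _ in range(n_in + 1)] for _ in range(n_hidden)]
    w_ho = [rng.uniform(-1, 1) for _ in range(n_hidden + 1)]
    initial_error = mean_squared_error(data, w_ih, w_ho)
    for _ in range(max_epochs):
        for x, target in data:
            # Steps 2-3: activations of hidden and output nodes.
            hidden, output = forward(x, w_ih, w_ho)
            # Step 4: output error term (sigmoid derivative is out * (1 - out)).
            delta_out = (target - output) * output * (1.0 - output)
            # Step 5: cascade the error to hidden nodes using the current weights.
            delta_hidden = [delta_out * w_ho[j] * hidden[j] * (1.0 - hidden[j])
                            for j in range(n_hidden)]
            # Step 4 (update): recalibrate hidden-to-output links.
            for j, h in enumerate(hidden + [1.0]):
                w_ho[j] += rate * delta_out * h
            # Step 6: recalibrate input-to-hidden links.
            for j in range(n_hidden):
                for i, v in enumerate(x + [1.0]):
                    w_ih[j][i] += rate * delta_hidden[j] * v
        # Step 7: stop once the convergence criterion is met.
        if mean_squared_error(data, w_ih, w_ho) < tol:
            break
    return w_ih, w_ho, initial_error


# Toy logical-OR data standing in for labelled sensor features.
or_data = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0), ([1.0, 0.0], 1.0), ([1.0, 1.0], 1.0)]
w_ih, w_ho, initial_error = train(or_data)
final_error = mean_squared_error(or_data, w_ih, w_ho)
# Step 8: final outputs computed with the trained weights.
predictions = [forward(x, w_ih, w_ho)[1] for x, _ in or_data]
```

Comparing `final_error` with `initial_error` gives a quick check that the weight updates are reducing the output error.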
5. Case Study: Fault Detection in Wind Turbines using SVM and NN
5.1 Wind Turbine Model Diagnostics
In this project, the eInfochips team designed and implemented real-time algorithms for on-board fault diagnostics using model-based development. The setup consists of a simulation of a 5MW wind turbine along with its speed and yaw control system. This setup allows insertion of system and sensor faults by making changes to parameters and by injecting additional signals at suitable points. The diagnostics algorithms must detect the faults in real time so that corrective actions can be initiated by the control system or a supervisory logic. Such algorithms operate on sensor data as well as prior design data. For high-value and/or critical engineering systems (such as wind turbines and aero-engines), the diagnostics must run in real time so that a fault is caught before it propagates within the system and leads to catastrophic situations or costly repairs.
eInfochips, in this context, has also created a solution for monitoring systems such as complex electro-mechanical systems and rotating machinery in real time. The fault detection algorithms library includes extended Kalman filters (EKFs), fast Fourier transform (FFT), fault tree tables (FTTs), a frequency analyzer, and machine learning algorithms such as Support Vector Machines (SVM) and Neural Networks (NN). This solution has also been evaluated on various prototyping and deployable embedded platforms based on TI AM57x, Freescale MPC56xx, and ARM Cortex-M4. Figure 6 shows the overall design and testing setup.
Figure 6: MBD of real-time diagnostics solution
A simulation setup has been developed to facilitate algorithm development and testing for fault detection. Fault scenarios are simulated by injecting fault conditions either into the sensor model or system dynamics at particular time instances. The various faults inserted into the model and respective patterns can be altered to observe and fine tune the fault detection algorithms. One such inserted fault is shown in Figure 7.
Figure 7. Inserted faults into model
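The idea of injecting a fault into a sensor signal at a particular time instance can be sketched as follows. The fault kinds (offset, gain, stuck), the function name, and the sample values are hypothetical and are not taken from the simulation model described here; the returned flags simply record where the fault was active so they can later serve as training responses.

```python
def inject_fault(signal, start, end, kind="offset", magnitude=1.0):
    """Return (faulty_signal, fault_active) with a fault on samples start..end-1.

    kind: "offset" adds a bias, "gain" scales the reading,
          "stuck" freezes the sensor at its last healthy value.
    fault_active: True where the fault was injected, False elsewhere.
    """
    faulty = list(signal)
    fault_active = [False] * len(signal)
    for k in range(start, min(end, len(signal))):
        if kind == "offset":
            faulty[k] = signal[k] + magnitude
        elif kind == "gain":
            faulty[k] = signal[k] * magnitude
        elif kind == "stuck":
            faulty[k] = signal[start - 1] if start > 0 else signal[0]
        else:
            raise ValueError(f"unknown fault kind: {kind}")
        fault_active[k] = True
    return faulty, fault_active


# Hypothetical generator-speed samples with an offset fault on samples 2 and 3.
speed = [120.0, 121.0, 122.0, 123.0, 124.0]
faulty_speed, fault_active = inject_fault(speed, start=2, end=4, kind="offset", magnitude=5.0)
stuck_speed, _ = inject_fault(speed, start=2, end=4, kind="stuck")
```

Changing `start`, `end`, `kind`, and `magnitude` gives different fault patterns against which a detection algorithm can be tuned.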
5.2 Fault Detection Algorithms using AI
Fault detection for the generator speed measurement is analyzed in particular, using simulation runs. Simulation data from the generator speed sensor is used to develop three signal features. These features are used as inputs (predictors), and the injected fault conditions, fault (0) or no-fault (1), are used as outputs (responses) to train and validate the SVM and NN models.
Figure 8. Fault prediction algorithm flow
Figure 9. Features: ωgm1(τ)- ωgm2(τ), ωgm1(τ)- ωgm2 (τ-1), ωgm2(τ)-ωgm1 (τ-1)
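The three features in the Figure 9 caption can be computed with a short helper. This sketch assumes that ωgm1 and ωgm2 denote two measured generator-speed series of equal length and that τ is the sample index; the function name and the sample values are illustrative.

```python
def residual_features(wg1, wg2):
    """Build the three difference features for each sample index tau >= 1.

    wg1, wg2: equal-length lists holding the two generator-speed measurement
    series (assumed here to be readings of the same speed).
    """
    if len(wg1) != len(wg2):
        raise ValueError("both series must have the same length")
    features = []
    for tau in range(1, len(wg1)):
        features.append((
            wg1[tau] - wg2[tau],      # wgm1(tau) - wgm2(tau)
            wg1[tau] - wg2[tau - 1],  # wgm1(tau) - wgm2(tau - 1)
            wg2[tau] - wg1[tau - 1],  # wgm2(tau) - wgm1(tau - 1)
        ))
    return features


# Illustrative values: the second series picks up a +5 offset on the last sample.
wg1 = [100.0, 101.0, 102.0, 103.0]
wg2 = [100.0, 101.0, 102.0, 108.0]
rows = residual_features(wg1, wg2)
```

With these values the first and third features jump on the last sample, while the earlier rows stay at the healthy pattern.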
An SVM model is developed using a radial basis (Gaussian) kernel and then validated using a 5-fold cross-validation process. The prediction accuracy achieved is approximately 97.7%. An NN is developed using a multilayer feedforward configuration with two hidden layers, having three neurons in the first and two neurons in the second hidden layer. This network is then trained with the scaled conjugate gradient algorithm. Typical fault prediction accuracy is found to be 97%. Both the SVM and the NN are successful in predicting the injected faults.
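The 5-fold cross-validation procedure itself can be sketched without any external library. To keep the example self-contained, a simple nearest-centroid classifier is used as a stand-in for the RBF-kernel SVM (an assumption made only for this sketch), and the feature rows are synthetic and deliberately well separated. In practice, the trained SVM or NN would take the place of `fit_centroids` and `predict`.

```python
def k_fold_indices(n_samples, k=5):
    """Split indices 0..n_samples-1 into k interleaved folds."""
    return [list(range(fold, n_samples, k)) for fold in range(k)]


def fit_centroids(rows, labels):
    """Stand-in classifier: mean feature vector per class."""
    centroids = {}
    for label in set(labels):
        members = [r for r, l in zip(rows, labels) if l == label]
        centroids[label] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


def predict(centroids, row):
    """Return the label whose centroid is closest to the row."""
    def sq_dist(c):
        return sum((a - b) ** 2 for a, b in zip(row, c))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


def cross_validate(rows, labels, k=5):
    """Mean held-out accuracy over k folds."""
    scores = []
    for test_idx in k_fold_indices(len(rows), k):
        held_out = set(test_idx)
        train_rows = [r for i, r in enumerate(rows) if i not in held_out]
        train_labels = [l for i, l in enumerate(labels) if i not in held_out]
        model = fit_centroids(train_rows, train_labels)
        hits = sum(predict(model, rows[i]) == labels[i] for i in test_idx)
        scores.append(hits / len(test_idx))
    return sum(scores) / k


# Synthetic feature rows; labels follow the convention 0 = fault, 1 = no-fault.
rows = [(0.1 * (i % 3), 1.0, 1.0) for i in range(10)] + \
       [(5.0 + 0.1 * (i % 3), 1.0, 6.0) for i in range(10)]
labels = [1] * 10 + [0] * 10
accuracy = cross_validate(rows, labels, k=5)
```

Each sample is held out exactly once, so the averaged score reflects performance on data the model did not see during training.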
5.3 Comparison of Kalman Filter Algorithm and Machine Learning Algorithms
The following figure shows plots of the faults inserted and the faults predicted by the SVM and NN as well as by the Extended Kalman filter algorithm. The EKF for fault diagnostics is a model-based method. All three algorithms perform reasonably well and are capable of diagnosing faults in real time.
Figure 10. Fault Prediction Algorithms Comparison
One key observation while evaluating various fault patterns is that there are potential retraining or re-design efforts associated with the use of neural networks and support vector machines that are not associated with the Extended Kalman filter. The neural network tends to have a weaker self-adaptivity than the Kalman filters. However, neural networks perform well for the patterns that are similar to the original training data. If the input pattern passes beyond the boundaries of the area where the neural network has been trained, the neural network accuracy usually declines. On the other hand, the Extended Kalman filter does not tend to suffer from this problem and can be used over a larger variety of fault patterns once the appropriate noise parameters have been accommodated.
Compared to Extended Kalman filter methods, the neural networks also have some advantages. The Kalman filter iteratively corrects the estimates over a finite period of time. The neural network, on the other hand, tends to localize in a single pass of the network execution. The Kalman filter uses the model (i.e. engineering design data of the system) to effectively diagnose the instance of a fault. If the detected fault is random and spontaneous, then the neural network’s ability to localize in a single execution pass usually results in more accurate fault estimates. The Kalman filter, however, requires several iterations to reach the accuracy of the neural network.
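The iterative behaviour of a Kalman filter can be seen in a deliberately simplified sketch. It uses a scalar, linear Kalman filter with a constant-state model rather than the Extended Kalman filter used in the case study, and the noise parameters, threshold, and signal values are assumptions chosen for illustration. A fault is flagged when the innovation (measurement minus model prediction) exceeds a threshold.

```python
def kalman_residual_monitor(measurements, x0, p0=1.0, q=1e-4, r=0.1, threshold=1.0):
    """Scalar Kalman filter for a constant-state model x[k] = x[k-1] + noise.

    Returns (estimates, flagged), where flagged lists the sample indices whose
    innovation (measurement minus prediction) exceeds the threshold.
    """
    x, p = x0, p0
    estimates, flagged = [], []
    for k, z in enumerate(measurements):
        p = p + q            # predict: state unchanged, uncertainty grows by q
        innovation = z - x   # residual between the sensor and the model
        if abs(innovation) > threshold:
            flagged.append(k)
        gain = p / (p + r)   # update: blend the prediction and the measurement
        x = x + gain * innovation
        p = (1.0 - gain) * p
        estimates.append(x)
    return estimates, flagged


# Illustrative signal: healthy at 10.0, then a +5.0 sensor offset from sample 20.
signal = [10.0] * 20 + [15.0] * 20
estimates, flagged = kalman_residual_monitor(signal, x0=10.0)
```

The innovation flags the change on the first faulty sample, while the state estimate itself moves toward the new level only gradually over the following iterations.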
Conclusion
Overall, if the noise and disturbance parameters in measurements are not expected to change substantially, then using neural networks for fault diagnostics may be a better option. Where minimal computational load and modest memory requirements are highly desirable due to embedded system constraints, Kalman filters are better suited as a diagnostic method. Also, the EKF is a more flexible and easily modifiable method for fault detection. However, the decision between Kalman filters and neural networks always depends on the application, the noise in the measurements, the characteristics of the fault patterns that need to be effectively diagnosed, and the constraints on the embedded platform where these real-time diagnostics algorithms are expected to be deployed.
About Authors
Dr. Mangesh Kale is a Senior Solution Architect and Key Accounts Manager at eInfochips. He has more than 18 years of industry experience in engineering, technology design and solutions for safety-critical control systems hardware and software. Mangesh leads the aerospace practice group at eInfochips with responsibility for new technology initiatives and research & development initiatives. Mangesh has a Ph.D. in flight control systems from The University of Southampton, UK, a Master of Engineering from the Indian Institute of Science, Bangalore, and a Bachelor of Engineering from the University of Pune, India.
Narayani Ghatwai is an engineer at eInfochips. Her areas of interest include embedded systems and automotive advancements and applications. She has a Master's degree in VLSI and Embedded Systems from the University of Pune.
About eInfochips
eInfochips is a product engineering and software R&D services company with over 20 years of experience, 500+ product developments, and 40M+ deployments in 140 countries across the world. Today, 60% of its revenues come from Fortune 500 companies and 80% from solutions around connected devices. From silicon to embedded systems to software, from design to development to sustenance, it maps the product journey of its customers. The company has the expertise and experience to deliver complex, critical, and connected products across multiple domains, for projects ranging from a one-time app development to a complete turnkey product design. With its R&D centers in the USA and India, eInfochips continuously invests in and fuels innovation in the areas of Product Engineering, Device Lifecycle Management, IoT & Cloud Frameworks, Intelligent Automation, and Video Management. The company has a sales presence in the USA, Japan, and India. Visit www.einfochips.com or contact marketing@einfochips.com to know more.