An application of a machine learning approach to fault detection of a synchronous machine

Jose Gregorio Ferreira
Europe IT Innovation, GE Healthcare
Kraków, Poland

Adam Warzecha
Institute of Electromechanical Energy Conversion, Cracow University of Technology
Kraków, Poland

Abstract—Accurate fault diagnosis systems should consider both the historical performance and the assessment of the current state of a machine. Manufacturing, installation, operation and maintenance are part of the machine's history and should be taken into account. The paper focuses on experimental procedures to develop a multi-criteria methodology to classify up to ten machine conditions. By combining machine learning with signal processing techniques, any deviation from a normal steady state can be categorized as abnormal behavior and, when confirmed, as a fault. To take advantage of machine learning algorithms, a significant amount of data is needed. To demonstrate the procedure, the authors examined a synchronous machine. The currents and voltages in the stator and rotor windings were recorded, as well as the rotational speed and the electromechanical torque. The collected signals were filtered and pre-processed, and 5038 features were calculated and transformed into a tidy dataset. The sparse Linear Discriminant Analysis algorithm was used to extract the most important of the defined features. The results are shown in 3D scatter plots in which each machine condition is represented; it is then possible to visualize the ability of the model to identify the most discriminant features. The same method can be used for the diagnosis of other types of machine conditions.

Keywords—machine learning; classification; fault diagnosis; synchronous machine

The research presented in this paper was funded by subsidies on science granted by the Polish Ministry of Science and Higher Education under the theme No. E-2/581/2016/DS.

I. INTRODUCTION

The development of a fresh approach to the multi-criteria diagnosis of electrical machines relies on the rapid growth in computational power and equipment and on more accurate and efficient numerical algorithms. The analysis aims to compare different features extracted from a group of signals monitoring the machine state, according to a variety of prescribed criteria. Efficient supervision, fault detection and fault diagnosis, considering causal fault-symptom relationships together with advanced methods for fault detection, were studied in [8]. Another model-based publication in the area of induction motors, considering a fast computational method to perform on-line monitoring, can be seen in [12]. The same main author in [13] highlights examples of the possibilities and limitations of using frequency analysis and mathematical models in faulty machines. An extended review of MCSA (Motor Current Signature Analysis), describing different types of faults, the signal signatures they generate and their diagnostic schemes, is presented in [10]. The papers [7, 14, 15, 16] introduce the utility of spectral analysis of currents. In [2] the authors described the recent trends in condition monitoring and fault diagnosis of rotating machinery and their interactions with the processes they are part of. All these publications share a set of assumptions that rarely hold in complex real-world systems composed of many interrelated components. Each machine should be identified and classified according to the environment in which it operates.
Accurate fault diagnosis systems should consider both the historical performance and the capability to assess the current state of the machine. Such possibilities are offered by methods and procedures of artificial intelligence. The paper [9] presents an application of neural networks for induction motor diagnosis. In [6] the spectra of the stator currents of a synchronous motor were treated as feature spaces and tested using a genetic algorithm. Apart from the specifics of the model, each machine exhibits variances related to manufacturing, installation, operation and maintenance. These are the reasons why real monitoring signals are noisy. Machine learning methods can recognize the state of a machine despite the noise and detect any deviation from normal operation [1, 3].

II. EXPERIMENTAL SYNCHRONOUS MACHINE AND DATA ACQUISITION SYSTEM

The synchronous machine experimental platform designed for diagnostic purposes was made available by the Institute of Electromechanical Energy Conversion at Cracow University of Technology. The 7.5 kW, 400 V salient-pole machine, with a starting squirrel cage embedded in the rotor, was adapted by the manufacturer so that various internal faults of different levels of severity can be introduced. The paper presents an example of using a machine learning procedure to recognize stator winding inter-turn faults of the machine operating as a motor under different loads. The two-layer stator winding has branches formed by a group of 4 coils in series with another group of 3 coils. These parallel branches, two per phase, create a 4-pole, nearly sinusoidally distributed magnetic field. To collect the signals from the sensors installed on the rotor, two groups of 25 specially designed slip-rings with brushes, identified as the Rotor Signal Collector (RSC), are connected through the shaft [5]. The total number of sensors installed is 37. In addition to the 21 sensors connected through the RSC, the apparatus includes 16 external sensors for the measurement of supply voltages, motor phase currents, torque, and rotational speed. The list of collected signals is as follows: currents of all phase windings and all parallel branches, excitation current, phase voltages, voltage between the machine neutral point and ground, voltage between the machine shaft and ground, mechanical rotational torque, rotational speed, voltages induced in Rogowski coils mounted on selected cage rotor bars, voltage between the terminals of each bar per pole, a noise-canceling membrane microphone, an omnidirectional membrane microphone, and analog Hall sensors on the surface of selected rotor poles. A National Instruments NI USB-6255 data acquisition card was used to collect the 37 signals; this card has 80 analog inputs (16-bit) sampled at up to 1.25 MS/s single-channel.

III. FAULT TYPES AND DATA COLLECTION

External resistors were placed in parallel with individual coils and used to create winding asymmetries by shunting part of the current path. The three different set-ups are represented in Fig. 1. Each fault type was introduced at three levels of the resistor current: {0.5 A, 1.0 A, 2.0 A}. The measurements were carried out for the motor running at half and full nominal load. All experiments were performed using the same winding configuration, and particular attention was given to guaranteeing repeatability.
The signals from the sensors were collected as time series at a sampling frequency of 19 kHz. To keep the file sizes workable, each collected dataset contains a 120-second recording of a given machine condition. Sixty datasets in total are available, six per machine condition, equivalent to 12 minutes of data per condition collected on different days (intended to capture the impact of different external disturbances). Enough information is thus gathered to train, test and validate different machine learning models.

The signals were filtered in two steps. First, spikes were identified as those data points with an absolute value greater than 1.5 standard deviations. This approach is based on the so-called three-sigma rule of thumb, but considers a more restrictive interval covering 86.64% of the available data points. The removed points were then replaced using a Piecewise Cubic Hermite Interpolating Polynomial (PCHIP). Finally, a median filter considering only three adjacent points was applied to smooth the signals. The observed attenuation of the signals is negligible. Even though the theoretical limit given by the Nyquist-Shannon sampling theorem allows frequencies of up to 9.5 kHz to be considered at the chosen sampling frequency, the frequencies evaluated while training the machine learning models lie between the fundamental (50 Hz) and the slot harmonic (2.1 kHz). As explained above, the filtering method implemented here can be seen as a low-pass single-pole recursive filter with a cut-off frequency of 3 kHz. After these steps, the signals are ready for the extraction of relevant features containing information about the different machine conditions. Three domain datasets (tidy data), identified as frequency features, Clarke transformation features and voltage-current features, were created from domain-related features. In the extended research, two more domain datasets were considered: statistical features and Park transformation features.

IV. MACHINE LEARNING PROCESS STEPS

Statistical signal processing aims to identify patterns that represent known boundaries, symbol sets or other known types of signal domains. Machine learning, on the other hand, is directed towards detecting or learning unknown patterns, without the underlying assumptions and simplifications of the original system. The idea is to obtain an effective pattern representation of real-world signals using signal processing and then feed these patterns to machine learning algorithms to learn the scenario. Data signal processing here refers to the ability to create a tidy dataset from raw sensor data. Each signal is defined as a variable, in numerical form, stored as a column in a table. Each row in the table is a sample representing a particular machine condition. Finally, these datasets are prepared for pre-processing, in which the information is transformed into features by considering shorter time intervals, or observations, each of which describes the condition of the machine winding for two different set points.

Fig. 1. Scheme of one phase branch of the stator winding. The three types of faults, identified as A31-A32, A31-A41 and A42-A41, are shown.

Taking into account that all signals were collected under steady-state conditions, no local maxima or peak values are expected. The signals were prepared using digital filters. All the signals (time series) were filtered in the same way; no selection of the signals was done in this step.
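For illustration, the pre-processing described above can be written as a short R sketch. It is only a minimal reconstruction under stated assumptions: the pracma package is assumed for PCHIP interpolation, the signal is a synthetic placeholder, and the injected spike positions are arbitrary examples rather than values from the measurements.

```r
# Minimal sketch of the two-step filtering: spike removal at 1.5 standard
# deviations with PCHIP replacement, followed by a 3-point median filter.
library(pracma)   # pchip(): Piecewise Cubic Hermite Interpolating Polynomial

fs <- 19000                              # sampling frequency, Hz
tt <- seq(0, 1, by = 1 / fs)             # one second of a placeholder time base
x  <- sin(2 * pi * 50 * tt)              # placeholder 50 Hz signal
x[c(500, 1200, 3000)] <- 3               # inject artificial spikes for the illustration

# Step 1: flag points whose absolute value exceeds 1.5 standard deviations
# (approx. 86.64% interval) and replace them by PCHIP interpolation.
spikes <- abs(x) > 1.5 * sd(x)
x[spikes] <- pchip(tt[!spikes], x[!spikes], tt[spikes])

# Step 2: median filter over three adjacent points to smooth the signal.
x_filt <- runmed(x, k = 3)
```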
All 37 signals were used to train the classification algorithms.

A. Defining the categorical variables

Table I contains the categorical variables and the numerical coding used to define the qualitative responses. Four categorical variables were assigned to describe the different machine conditions: condition, registering whether the machine is healthy or faulty; set-point, the working load in %; fault type, described in Fig. 1; and severity, representing three different levels of shunting current. The variable Class combines the other categories, so that each combination of machine conditions defines one class.

TABLE I. INITIAL TRAINING CLASS DEFINITION

Categorical variable | Description | Values
Condition | Registers whether the state of the machine is healthy or faulty. | 0 – Healthy; 1 – Faulty
Set-point | The level of load at which the machine is operated. | 1 – sp1 (50%); 2 – sp2 (100%)
Fault type | Stator winding short circuits simulated by an adjustable external resistor. | 0 – Healthy; 1 – A31-A32; 2 – A31-A41; 3 – A41-A42
Severity | The amount of shunting current in each fault type. | 0 – Healthy; 1 – 500 mA; 2 – 1000 mA; 3 – 2000 mA
Class | Aggregation of the other categorical variables, used during supervised training. | The combination of all the others defines one class

The class used to train the model follows the format represented in Fig. 2.

Fig. 2. Formatting used for the initial categorical variable Class.

B. Observations and sample window size

To select the number of samples per observation, the main criterion applied was to maintain both resolution and low processing cost, using a second-order Goertzel algorithm to compute multiple harmonics from 50 Hz up to 2.5 kHz. The Goertzel algorithm evaluates individual frequency bins more efficiently than a full N-point DFT when only a few harmonics are required. The frequencies of the harmonics in the DFT procedure depend on the length of the transform N. The chosen length is therefore N = 7600, which gives a minimum frequency resolution of 2.5 Hz and comprises 20 cycles of the fundamental harmonic. Inspired by P. Welch's methodology, time-averaged estimation of the spectrum was performed and contrasted with different window functions. The selected window of 7600 samples was overlapped by a factor of 10% (2 cycles) and multiplied by a Hamming window, so the effect of considering incomplete cycles was minimized. The validity of this method has been published in [4]. Each observation comprises a subset of 8360 samples, based on which the various features were calculated. After these transformations, each incorporating 120 seconds of the collected signals, four datasets containing 2990 observations each were generated.
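As an illustration of this harmonic evaluation, the following R sketch applies the second-order Goertzel recursion to one Hamming-weighted window of N = 7600 samples at 19 kHz. It is a minimal sketch under stated assumptions: the signal is a synthetic placeholder containing a single 150 Hz component, and only the magnitude of the selected bin is returned.

```r
# Minimal sketch: Goertzel evaluation of one harmonic over a Hamming-weighted
# window of N = 7600 samples at fs = 19 kHz (values taken from the text).
goertzel <- function(x, f_target, fs) {
  N <- length(x)
  k <- round(N * f_target / fs)          # DFT bin closest to the target frequency
  w <- 2 * pi * k / N
  coeff <- 2 * cos(w)
  s_prev <- 0; s_prev2 <- 0
  for (n in seq_len(N)) {                # second-order recursion
    s <- x[n] + coeff * s_prev - s_prev2
    s_prev2 <- s_prev
    s_prev  <- s
  }
  s_prev - exp(-1i * w) * s_prev2        # complex value of the selected DFT bin
}

fs <- 19000
N  <- 7600                               # 20 cycles of the 50 Hz fundamental
tt <- (0:(N - 1)) / fs
x  <- cos(2 * pi * 150 * tt)             # placeholder signal: a 150 Hz component
hamming <- 0.54 - 0.46 * cos(2 * pi * (0:(N - 1)) / (N - 1))

Mod(goertzel(x * hamming, f_target = 150, fs = fs))   # magnitude of the 150 Hz bin
```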
C. Features calculation: modeling of the signals

Each calculated dataset corresponds to a specific group of features, calculated with various functions related to three different domains or transformations. In the majority of the cases considered, the function was applied directly over the data frame containing all the signals. The pseudo-algorithm presented in Table II calculates the window size with an overlapping factor of 10% for each feature dataset.

TABLE II. ALGORITHM OF FEATURE CALCULATION

1.1 Load the filtered dataset for a given machine condition
1.2 Calculate the Tt iteration sequence from the t timestamp
1.3 Tt = from [min(t) + step] to [max(t)], step = 22 cycles / 50 Hz
1.4 For each jj along Tt do
1.5   Subset the data having t >= Tt(jj) – step AND t <= Tt(jj)
1.6   Normalize the data subset
1.7   Calculate frequency-domain features
1.8   Calculate Clarke transformation features
1.9   Calculate Park transformation features
1.10 End

• Frequency features: a periodic function can be represented by a Fourier series. In this case the first 50 harmonics are calculated. A feature is defined to contain the information on the magnitude and phase of each harmonic, from 50 to 2500 Hz. Phase A is taken as the phase reference, and all harmonics are divided by the fundamental. In total 3700 features were extracted.

• Clarke transformation features: the voltages and the currents of the main and parallel branches are referred to the 0αβ stationary two-axis reference frame, using the power-invariant transformation (a minimal sketch of this transformation is given after this list). The magnitude and phase of the main harmonics are then calculated, as well as the relationships between voltages and currents defining the electric power, using vector and scalar mathematical forms. The Clarke dataset contains 1224 features.

• Voltage-current features (electric power features): using the calculated vector magnitudes and the harmonic content, the relationships between currents and voltages were considered. The voltage-current dataset contains 114 features.
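The sketch below illustrates the power-invariant Clarke (0αβ) transformation assumed for the Clarke feature dataset. The three-phase samples are synthetic placeholders; the transformation matrix itself is the standard power-invariant form, not code taken from the study.

```r
# Minimal sketch of the power-invariant Clarke (0-alpha-beta) transformation.
time_s <- seq(0, 0.04, by = 1 / 19000)                 # two 50 Hz cycles at 19 kHz
abc <- cbind(a = cos(2 * pi * 50 * time_s),            # placeholder balanced three-phase set
             b = cos(2 * pi * 50 * time_s - 2 * pi / 3),
             c = cos(2 * pi * 50 * time_s + 2 * pi / 3))

C <- sqrt(2 / 3) * rbind(c(1 / sqrt(2), 1 / sqrt(2),  1 / sqrt(2)),  # zero component
                         c(1,          -1 / 2,        -1 / 2),       # alpha component
                         c(0,   sqrt(3) / 2,  -sqrt(3) / 2))         # beta component

x0ab <- abc %*% t(C)                                   # columns: 0, alpha, beta components
colnames(x0ab) <- c("x0", "xalpha", "xbeta")
```

Because the rows of C are orthonormal, the transformation preserves instantaneous power, which is why it is preferred when the electric-power relationships between voltages and currents are computed from the transformed components.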
D. Supervised learning

Supervised learning is the most commonly studied learning task. It relies on a set of input features, target features, and a set of training examples for which the mapping between the input and target features is given. The task is then to predict the target features for new inputs. Enough data was collected to balance the classes over all machine conditions. Hence, the supervised learning task is a multi-class problem used to detect the different fault conditions and severities, categorized into the target features indicated in Table I.

A common method for describing the performance of a classification model is the confusion matrix. Given m classes (where m ≥ 2), a confusion matrix is a table of at least size m by m. For a classifier to have good accuracy, most of the elements (TP, true positives, and TN, true negatives) should lie along the diagonal, with the remaining entries (FP, false positives, and FN, false negatives) being zero or close to zero. Table III presents the confusion matrix obtained after training a sparse Discriminant Analysis used to extract features from the voltage-current dataset.

TABLE III. VOLTAGE-CURRENT DATASET CONFUSION MATRIX, SP1

Predicted \ Reference | 0100 | 1111 | 1112 | 1113 | 1121 | 1122 | 1123 | 1131 | 1132 | 1133
0100 | 70 | 0 | 2 | 0 | 0 | 1 | 0 | 0 | 0 | 0
1111 | 0 | 79 | 3 | 0 | 1 | 2 | 0 | 0 | 0 | 2
1112 | 0 | 2 | 55 | 0 | 0 | 3 | 0 | 2 | 0 | 2
1113 | 1 | 0 | 0 | 67 | 0 | 1 | 2 | 0 | 0 | 1
1121 | 0 | 0 | 0 | 0 | 77 | 0 | 0 | 0 | 0 | 0
1122 | 5 | 1 | 10 | 1 | 0 | 57 | 0 | 3 | 0 | 0
1123 | 0 | 0 | 0 | 2 | 0 | 0 | 71 | 0 | 0 | 0
1131 | 0 | 1 | 3 | 0 | 0 | 7 | 0 | 61 | 0 | 2
1132 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 71 | 0
1133 | 0 | 1 | 12 | 1 | 0 | 0 | 0 | 0 | 0 | 62

In order to evaluate the learner's performance, the accuracy is calculated as the number of correctly classified observations expressed as a percentage of all examined observations. Then, to evaluate the ability to classify the individual classes correctly, the sensitivity is calculated as the true positive rate, i.e. the fraction of observations of a class that were correctly predicted, and the precision as the fraction of correct predictions for a given class.

E. Feature extraction for the multi-class problem

No rigid rule defining the number of features needed for each type of classification model was established. Classifiers that generalize easily, e.g. linear classifiers or naïve Bayes, tolerate a greater number of features since the classifier itself is less expressive. Classifiers that model non-linear decision boundaries very accurately tend not to generalize well; algorithms such as neural networks, decision trees, k-nearest-neighbors classifiers, etc., are prone to overfitting the data. It is important to highlight that overfitting can occur both when many parameters are estimated in a low-dimensional space and when relatively few parameters are estimated in a high-dimensional space. Based on the above, it was decided to process each dataset independently. The following subsections explain the extraction of the most important features for each set-point and dataset domain.

In datasets containing highly correlated features, it is desirable to remove them: many classifiers perform better when highly correlated features are removed, partial least squares being one exception. Choosing the threshold to identify highly correlated features can be a complicated task that requires several iterations to build a model with high accuracy and no overfitting. This step can be avoided by applying an ℓ2 penalty within the model regularization; this penalty helps to mitigate the effect of correlated predictors. It is also desirable to implement models that are easy to interpret and whose computation time, at least after the learning process, is fast enough to detect faults before they lead to machine breakdown, without compromising the accuracy of the machine-state assessment. Therefore, it is of interest to find a classification algorithm that identifies, per class, the most important features from each dataset. In this way it can be used as a feature extraction method and for building a sufficient matrix of knowledge for the development of a reliable, fast and robust fault classification model.

F. Sparse Linear Discriminant Analysis (sLDA)

The authors of [17] have developed and implemented an R package for the sparse version of LDA. This model is a regularized version of LDA with two tuning parameters: the Least Absolute Shrinkage and Selection Operator (LASSO), whose ℓ1 penalty eliminates unimportant predictors and hence provides feature selection, and the elastic net, whose ℓ2 penalty shrinks the discriminant coefficients towards zero. For a given data matrix X and an outcome vector y, the LASSO solves the problem

$\min_{\beta} \; \lVert y - X\beta \rVert_2^2 + \lambda \lVert \beta \rVert_1$    (1)

and the elastic net solves the problem

$\min_{\beta} \; \lVert y - X\beta \rVert_2^2 + \lambda \lVert \beta \rVert_1 + \gamma \lVert \beta \rVert_2^2$    (2)

where λ and γ are nonnegative tuning parameters and β are the coefficients used to fit the model. The pseudo-code of the calculation is shown in Table IV.

TABLE IV. ALGORITHM OF FEATURE EXTRACTION

1.1 For each feature dataset
1.2   Create a data partition: 75% training / 25% testing
1.3   Define the training control: 5-fold cross-validation repeated 3 times
1.4   Define the number of selected variables and the tuning grid of lambda values
1.5   Declare the parallel backend
1.6   Pre-process the training dataset: center and scale
1.7   Fit the sparseLDA model, metric: Accuracy
1.8   Extract the relevant predictors and importance scores, appending to previous iterations
1.9   Plot the fitting results
1.10 End
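A compact R sketch of the workflow in Table IV is given below, assuming the caret, sparseLDA and doParallel packages. The objects `features` (a data frame of predictors from one domain dataset) and `class` (a factor holding the Table I class labels) are hypothetical placeholders, and the tuning grid values are illustrative rather than those used in the study.

```r
# Minimal sketch of the feature-extraction workflow of Table IV (assumptions above).
library(caret)
library(doParallel)

set.seed(2017)
in_train <- createDataPartition(class, p = 0.75, list = FALSE)        # 1.2
x_train  <- features[in_train, ];  y_train <- class[in_train]
x_test   <- features[-in_train, ]; y_test  <- class[-in_train]

ctrl <- trainControl(method = "repeatedcv", number = 5, repeats = 3)  # 1.3
grid <- expand.grid(NumVars = seq(1, 15, by = 2),                     # 1.4 (illustrative grid)
                    lambda  = c(0, 0.1, 1))

registerDoParallel(cores = 4)                                         # 1.5

fit <- train(x_train, y_train,
             method     = "sparseLDA",
             preProcess = c("center", "scale"),                       # 1.6
             metric     = "Accuracy",                                 # 1.7
             tuneGrid   = grid,
             trControl  = ctrl)

varImp(fit)    # 1.8: importance scores of the selected predictors per class
plot(fit)      # 1.9: accuracy versus number of features and lambda (cf. Fig. 3)
```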
Two set points, at 50% and 100% of the nominal load respectively, were considered to evaluate the different conditions of the machine for each domain dataset under consideration. Each set-point was treated independently of the other; this doubled the total number of data classes. To decrease the processing time and the number of iterations, the range of features to be evaluated per class was limited to a maximum of fifteen for each case under consideration. The selected features can be repeated between classes. An illustration of the feature extraction process performed by the sLDA algorithm for the sp1 Clarke dataset is shown in Fig. 3. Before fitting the model, the dataset was standardized.

Fig. 3. Learning curves as accuracy versus the number of training features for different values of the parameter λ. The optimal model was achieved with λ = 1 and 7 features per class.

The chosen strategy was to split the data into a training subset containing 75% of all data points, balanced for every class, and to use the remaining 25% of the data to evaluate the performance of the model and to avoid overfitting. The sLDA training was then validated using repeated k-fold cross-validation.

V. RESULTS

The total number of features considered in the above subsections was 5038. Using the sLDA approach, it was possible to narrow this set down to the 477 features most relevant to the classification task of assessing the condition of the synchronous machine for three fault types at three different severity levels. The sLDA was set to select up to 15 features per class, with regularization to enhance feature selection among variables that might be correlated. The summary can be found in Table V.

TABLE V. SUMMARY OF SLDA FEATURE EXTRACTION: COMPARISON BETWEEN SP1 AND SP2

Dataset | Total features | Set-point | Extracted features | Process time | λ | Features per class
Clarke | 1224 | sp1 | 59 | 9.89 h | 1 | 7
Clarke | 1224 | sp2 | 117 | 7.84 h | 0 | 15
VI power | 114 | sp1 | 47 | 45.3 min | 0 | 9
VI power | 114 | sp2 | 62 | 40.5 min | 1 | 15
freq | 3700 | sp1 | 71 | 1.4 days | 1 | 8
freq | 3700 | sp2 | 121 | 1.03 days | 1 | 14
Totals | 5038 | – | 477 | 77.55 h | NA | 68

Comparing the differences between sp1 and sp2 yields:

• The processing time is 1.34 times higher for sp1 than for sp2.

• The number of optimal features needed per class and the total number of extracted features are roughly twice as large for sp2 as for sp1.

• The accuracy of unequivocal classification of all classes is greater for sp1.
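As a cross-check of the first bullet against Table V: the total sp1 processing time is 9.89 h + 45.3 min + 1.4 days ≈ 44.2 h, the total sp2 time is 7.84 h + 40.5 min + 1.03 days ≈ 33.2 h, and 44.2 h / 33.2 h ≈ 1.33, consistent with the reported factor of 1.34.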
The algorithm's ability to classify up to 10 machine conditions is presented in graphical form. The distribution of the observations for a given machine condition is represented in a 3D scatter plot containing the three most important features for each feature dataset and set-point. Results obtained for four conditions of the synchronous machine at 50% of the nominal load are presented in Fig. 4.

Fig. 4. Scatter plot of the observation points in the space of the three features in the Clarke coordinate system for four conditions of the machine at 50% of the nominal load.

The three most significant features arising from the sp1 Clarke dataset are Uα, the alpha component of the three-phase stator voltages, I1α, the alpha component of the currents in branches no. 1, and I2α, the alpha component of the currents in branches no. 2 of the three-phase coils. The cluster marked '0' represents the no-fault condition, cluster '1' a short circuit in coil 1, cluster '3' a short circuit in coil 2, and cluster '2' a short circuit in the whole branch no. 1. In each case, the current leakage due to the introduced short circuit equals 0.5 A. Fig. 5 presents the analogous scatter plot in the Iβ, I1α, I2α coordinate system, representing the conditions of the machine operating at the nominal load. It should be noticed that these three features are sufficient to distinguish between healthy and faulty conditions at 100% load.

Fig. 5. Scatter plot of the observation points in the 3D space of the three most important variables for the Clarke features dataset. Each cluster represents one of the ten conditions of the machine at 100% of the nominal load.

To classify the severity levels and the localization of the fault within a coil of one branch, it is necessary to include the information contained in the other features selected during the sLDA modeling. Using the function varImp() from the caret package in R [18], it was possible to evaluate the performance between classes by ROC curve analysis for each predictor. The analysis is carried out by decomposing the problem into pair-wise problems. The area under the curve (AUC) is calculated by the trapezoidal rule for each class pair.

CONCLUSIONS

The methodology presented here proposes a reference method for condition-monitoring based maintenance. The procedure's ability to distinguish between faulty and healthy machine conditions was demonstrated by means of machine learning, using sparse Discriminant Analysis. Using this classification algorithm, it was possible to extract the most important features. The method is considered to be effective for the early detection of winding short circuits. Although the time to compute all features, train the model and identify the most important features approaches 77 hours, the time to classify a new observation is close to real time. For the cases published in the paper, the time to process and classify a new observation was only 0.23 seconds, and the time to collect, store, pre-process and classify was less than 4 seconds. Next, it is proposed to improve the method by including other feature domains and to develop feature extraction and classification that are independent of the workload of the machine.

An application of this method clearly requires training of the algorithm on each machine. The history of the machine is part of the "knowledge" extracted and used to train the algorithm. The presented example of a machine learning application can be treated as an introduction to a multi-criteria classification of machine condition.

REFERENCES

[1] A. Bacchus, M. Biet, L. Macaire, Y. Le Menach, and A. Tounzi, "Comparison of supervised classification algorithms combined with feature extraction and selection: Application to a turbo-generator rotor fault detection," Diagnostics for Electric Machines, Power Electronics and Drives (SDEMPED), 9th IEEE International Symposium on, 2013, pp. 558–565.
[2] A. Bhattacharya and P.K. Dan, "Recent trend in condition monitoring for equipment fault diagnosis," International Journal of System Assurance Engineering and Management, Springer India, vol. 5, no. 3, 2014, pp. 230–244.
[3] R. Casimir, E. Boutleux and G. Clerc, "Fault diagnosis in an induction motor by pattern recognition methods," Diagnostics for Electric Machines, Power Electronics and Drives (SDEMPED), 4th IEEE International Symposium on, 2003, pp. 294–299.
[4] J.G. Ferreira, T.J. Sobczyk and A. Warzecha, "Multicriteria diagnosis of synchronous machine using the Welch method," Technical Transactions, Electrical Engineering (Czasopismo Techniczne, Elektrotechnika), iss. 1-E, 2015, pp. 343–352.
[5] J.G. Ferreira and T.J. Sobczyk, "Multicriteria diagnosis of synchronous machines—Rotor-mounted sensing system. Rotor signal collector construction," Control (CONTROL), 2014 UKACC International Conference on, 2014, pp. 450–455.
[6] Z. Glowacz and J. Kozik, "Feature selection of the armature winding broken coils in synchronous motor using genetic algorithm and Mahalanobis distance," Archives of Metallurgy and Materials, vol. 57, no. 3, 2012, pp. 829–835.
[7] Z. Glowacz and J. Kozik, "Detection of synchronous motor inter-turn faults based on spectral analysis of Park's vector," Archives of Metallurgy and Materials, vol. 58, no. 1, 2013, pp. 19–23.
[8] R. Isermann, Fault-Diagnosis Applications: Model-Based Condition Monitoring: Actuators, Drives, Machinery, Plants, Sensors, and Fault-Tolerant Systems. Springer, 2011.
[9] C. Kowalski and T. Orlowska-Kowalska, "Neural networks application for induction motor faults diagnosis," Mathematics and Computers in Simulation, vol. 63, no. 3-5, 2003, pp. 435–448.
[10] S. Nandi and H.A. Toliyat, "Fault diagnosis of electrical machines – a review," Electric Machines and Drives, International Conference IEMD, 1999, pp. 219–221.
[11] P. Neti and S. Nandi, "Stator interturn fault detection of synchronous machines using field current and rotor search-coil voltage signature analysis," IEEE Transactions on Industry Applications, vol. 45, no. 3, 2009, pp. 911–920.
[12] T.J. Sobczyk, "Frequency analysis of faulty machines – possibilities and limitations," Diagnostics for Electric Machines, Power Electronics and Drives (SDEMPED), 6th IEEE International Symposium on, 2007, pp. 121–125.
[13] T.J. Sobczyk, M. Zając, W. Juszczyk, Z.R. Kich, and M. Sulowicz, "Signal processing for extraction of characteristic features of induction motor stator currents at rotor faults," Diagnostics for Electric Machines, Power Electronics and Drives (SDEMPED), 5th IEEE International Symposium on, 2005, pp. 1–6.
[14] K. Weinreb and P. Drozdowski, "Detection of winding faults of a salient pole synchronous machine by a spectral analysis of currents," Proceedings of the International Conference on Electrical Machines (ICEM), vol. 2, 1994, pp. 56–61.
[15] K. Weinreb, M. Sułowicz and J. Petryna, "Faults detection in cage induction motor with parallel branches," Technical Transactions, Electrical Engineering (Czasopismo Techniczne, Elektrotechnika), iss. 2-E, 2016, pp. 53–64.
[16] S. Sahoo, P. Rodriguez, and M. Sulowicz, "Evaluation of different monitoring parameters for synchronous machine fault diagnostics," Electrical Engineering (Archiv für Elektrotechnik), 2016, article in press, doi: 10.1007/s00202-016-0381-6.
[17] L. Clemmensen et al., "Sparse discriminant analysis," Technometrics, vol. 53, no. 4, 2011, pp. 406–413.
[18] M. Kuhn, with contributions from J. Wing et al., caret: Classification and Regression Training, R package version 6.0-30, 2015.