Comparison of Neural Networks and Logistic Regression in Assessing the Occurrence of Failures in Steel Structures of Transmission Lines

A.C.G Bissacot1, *, S.A.B Salgado2, P.P Balestrassi1, *, A.P Paiva1, A.C Zambroni Souza2, R. Wazen3
1 Institute of Industrial Engineering - Federal University of Itajuba, Brazil
2 Institute of Electrical Engineering - Federal University of Itajuba, Brazil
3 Companhia Paranaense de Energia (COPEL), Brazil

© Bissacot et al.; Licensee Bentham Open.

open-access license: This is an open access article licensed under the terms of the Creative Commons Attribution-Non-Commercial 4.0 International Public License (CC BY-NC 4.0), which permits unrestricted, non-commercial use, distribution and reproduction in any medium, provided the work is properly cited.

* Address correspondence to these authors at the Institute of Industrial Engineering - Federal University of Itajuba, Brazil; Tel: +55 35 88776958; E-mail:


In this work, we evaluate the probability that metal structures of transmission lines will fall. Our objective is to extract knowledge about which variables influence the mechanical behavior of operating lines and can be used to diagnose towers at risk of falling. This information can serve as a basis for directing investments in structural reinforcement, avoiding long outages and the high costs that result from damage to transmission line towers. The results are obtained from the history of 181 metal structures currently in operation in the state of Paraná, Brazil. To classify the transmission lines susceptible to failure, we identify the most likely lines considering the following parameters: operating voltage, wind and relief of the region, air masses, temperature, land type, mechanical capacity, function and foundation structure. The classic technique for classifying binary events in this type of problem is logistic regression (LR). A more recent classification technique, Artificial Neural Networks (ANN), can also be applied. The results are compared through the area under the receiver operating characteristic (ROC) curves.

Keywords: Artificial Neural Networks, Fall of Metal Structures, Logistic Regression, ROC Curves, Transmission Lines.


Aerial transmission lines are exposed to various risks associated with the environment, changes in construction characteristics and climatic variations. Often, these risks lead to serious damage, including the fall of structures. The fall of a structure can interrupt the power supply of a location for a long period and generate costs for rebuilding stretches of the electrical system, lost profit for the concessionaire and compensation for damages caused by the lack of energy. Given the importance of aerial lines, a quantitative analysis of their characteristics, aimed at identifying and mitigating these risks, has much to contribute to the planning, operation and maintenance of lines.

The aim here is to extract knowledge about the parameters and variables that influence the mechanical behavior of operating lines and that can be used to diagnose towers at risk of falling. This information can serve as a basis for directing investments in structural reinforcement, avoiding long outages and the high costs that result from damage to transmission line towers. Few studies on the classification of failures in transmission lines are found in the literature, and they generally explore constructive aspects of lines, reliability and construction. Wazen et al. [1] evaluated the susceptibility of metal structures of a transmission line using logistic regression and approximate joint failures. The findings of this paper were obtained from a real case, using historical data on 181 metal structures currently in operation in the state of Paraná, Brazil. The dataset contains ten explanatory variables (voltage, wind, relief, cold air masses, hot air masses, temperature, land, capacity, function, foundation) and a binary response variable that indicates whether or not the metal structure fell. Classification models based on the LR and ANN methodologies were fitted to the data and compared through the area under the receiver operating characteristic curve (AUC).

The paper is organized as follows. Section 1 introduced the context of the analyzed problem. Section 2 presents a review of papers reporting the use and comparison of artificial neural networks and logistic regression in different domains. Section 3 presents the case under study highlighting the main aspects of transmission lines and the relevant variables in determining structural failure. Section 4 shows the classic technique of binary logistic regression and classification results for the historical dataset. Section 5 presents the classification methodology and results based on the use of ANNs. A comparative analysis between the proposed ANN and classical LR methods is also performed through the area under the ROC curves. The main results and conclusions are discussed in section 6.


Classification is one of the most frequently encountered decision-making tasks in human activity. A classification problem occurs when an object needs to be assigned to a predefined group or class based on a number of observed attributes related to that object. Traditionally, statistical classification procedures deal with this kind of problem, but one major limitation is that they work well only when the underlying assumptions of the model are satisfied. Because they rely less on such assumptions, ANNs have emerged as an important alternative tool for classification [2].

Over the years, an increasing number of papers have explored the use of ANNs as a promising alternative to the well-established LR methodology. The characteristics of each of the reviewed works are presented in Table 1. The first column identifies the work being analyzed. The second shows the nature of the paper, that is, whether the authors prioritized a conceptual, a review or an application approach. The third presents its objective and the fourth indicates which of the two methodologies performed better.

Table 1.

Characteristics of previous works.

Paper | Nature | Objective | Performance (ANN x LR)
Tu, 1996 [3] | Conceptual | Predict medical outcomes | Not conclusive
Schumacher et al., 1996 [4] | Application | General comparison of both methods | Not conclusive
Vach et al., 1996 [5] | Conceptual | General comparison of both methods | Not conclusive
Freeman et al., 2000 [6] | Application | Predict in-hospital death after angioplasty | Similar
Leung & Tran, 2000 [7] | Application | Predict shrimp disease outbreaks | ANN
Borque et al., 2001 [8] | Application | Predict pathological stage | Similar
Chun et al., 2007 [9] | Application | Predict the probability of prostate cancer | LR
Kawakami et al., 2008 [10] | Application | Predict the probability of prostate cancer | LR
Ottenbacher et al., 2001 [11] | Application | Predict rehospitalization for patients with stroke | Similar
Nguyen et al., 2002 [12] | Application | Predict death or limb amputation in meningococcal disease | Similar
DiRusso et al., 2002 [13] | Application | Analyze survival in pediatric trauma patients | ANN
Dreiseitl & Ohno-Machado, 2002 [14] | Review | General comparison of both methods | ANN
Hajmeer & Basheer, 2003 [15] | Application | Classify bacterial growth | ANN
Ottenbacher et al., 2004 [16] | Application | Address prediction questions epidemiological research | Similar
Lin et al., 2010 [17] | Application | Predict living setting following hip fracture | ANN
Ergün et al., 2004 [18] | Application | Classify carotid artery stenosis of patients with diabetes | ANN
Yesilnacar & Topal, 2005 [19] | Application | Analyze landslide susceptibility | ANN
Yilmaz, 2009 [20] | Application | Analyze landslide susceptibility | ANN
Pradhan & Lee, 2010 [21] | Application | Analyze landslide susceptibility | ANN
Choi et al., 2012 [22] | Application | Analyze landslide susceptibility | LR
Song et al., 2005 [23] | Application | Differentiate between malignant and benign breast masses | Similar
McLaren et al., 2009 [24] | Application | Detection and diagnosis of breast lesions | Similar
Green et al., 2006 [25] | Application | Predict acute coronary syndrome | ANN
Chiang et al., 2006 [26] | Application | Differentiate between web and traditional stores | ANN
Liew et al., 2007 [27] | Application | Predict illness on patients undergoing bariatric surgery | ANN
Gutiérrez et al., 2008 [28] | Application | Map Ridolfia Segetum (a persistent weed) infestation | Not conclusive
Kurt et al., 2008 [29] | Application | Predict coronary heart disease | ANN
Al Housseini et al., 2009 [30] | Application | Predict the risk of cesarean delivery in nulliparas | ANN
Caocci et al., 2010 [31] | Application | Predict the occurrence of acute graft-vs-host disease | ANN
Pavlekovic et al., 2010 [32] | Application | Recognize mathematically gifted children | Similar
Trtica-Majnaric et al., 2010 [33] | Application | Predict influenza vaccination outcome | ANN
Chen et al., 2012 [34] | Application | Differentiate between malignant and benign lung nodules | ANN
Larasati et al., 2012 [35] | Application | Psychological research | ANN
Pourshahriar, 2012 [36] | Application | Psychological research | Similar
Swiderski et al., 2012 [37] | Application | Assess the financial condition of a company | Not conclusive
Askin & Gokalp, 2013 [38] | Application | Assess students’ mathematics achievement | Similar
Morteza et al., 2013 [39] | Application | Predict the level of albuminuria in type 2 diabetes | Not conclusive
Vallejos & McKinnon, 2013 [40] | Application | Classify seismic records | Similar

Fig. (1) summarizes the characteristics observed, reporting the percentage of works in which ANNs outperformed LR and vice versa. Not conclusive or similar performances are also presented.

Fig. (1).

Percentage of works in which each methodology was better than the other.


Transmission lines are circuits that interconnect substations, power plants or energy distributors. These circuits are composed of self-supporting towers or poles and of the metal cables that carry the power. Their main function is to transport large volumes of electricity with the least possible loss of energy. Transmission can take place in alternating or direct current. In Brazil the most widely used system is alternating current, applied in three-phase circuits, with just one high-voltage transmission line in direct current. These lines interconnect the country's energy system, serving different consumer centers and supplying large industrial facilities. The Brazilian system operates at voltages of 69 kV, 88 kV, 138 kV, 230 kV, 345 kV, 525 kV and 750 kV.

According to Wazen et al. [1], in order to conveniently analyze the reasons for the discontinuation of energy transmission due to contingencies in transmission lines caused by external factors, it is necessary to describe the types and configurations of structures, cables and foundations.

  • Structures: The dimensions and shapes of the structures depend on the required arrangement of the conductors, the distance between them, the size and type of insulation, the design sag of the conductors, the minimum safety height and the number of circuits involved. The design of a transmission line structure depends both on the load to be transported (directly linked to the capacity and performance of the conductor) and on the size of the structure to be used. These values are calculated considering safety and performance in relation to the voltages and mechanical loads to which the structures will be subjected. Structures can be classified according to several criteria, described in Table 2.
Table 2.

Types of structure.

Type of Structure | Description | Classification | Classification Description
About its function | The function of a structure in a transmission line is associated with the efforts to which each structure must be submitted. | Suspension | Structures subjected to efforts of vertical and horizontal transverse components
 | | Anchoring | Structures subjected to vertical, horizontal transverse and longitudinal forces
About its resistance | How the structures transfer the sustained efforts to the materials and the region they were applied | Self-supporting | The fittings are capable to support all the efforts applied on the same
 | | Cable-stayed | It uses cables to connect the structure and the soil
About its composition | There are many ways to transmit power using structures of several materials | Metal | Usually made of carbon steel, normal or high strength profiled or tubular
 | | Concrete | Due to the material used, it must be made with the whole body, hindering its transport and assembly
 | | Wood | Despite being made of easy extraction equipment and low cost, the wooden pole does not have great mechanical strength

The types and configurations of structures in use are varied, and structural designs are not limited to the models already applied. Defining a new model, however, requires a large amount of information to find an efficient configuration, and the resulting design is not applied to just one structure within a series of planned lines. The variety of suitable materials and voltage levels makes it even more difficult to define standardized structures.

  • Cables: Cables can be differentiated according to the function they perform in a transmission line: power conductors, protection against atmospheric surges and overcurrent, or energy dissipation. The conductor cable can be regarded as the active part of a transmission line because it serves as a guide for the electric and magnetic fields. The great majority of power cables are made of aluminium with a galvanized steel core.
  • Foundations: The foundations are designed to balance the forces acting on the top of the structure and its equipment, and their design must take into account the type of soil where the structure will be located. These factors must be reconciled so that the foundations can be built without excess material and without underestimating the forces to which they will be exposed.
  • Concrete Foundations: Some concessionaires use concrete in some of their foundations, and its use depends on how the concrete is prepared. The concrete should be mixed mechanically in the proportions stipulated by the structure design. The amount of concrete prepared in each operation should be only what is strictly necessary for immediate use. Fresh concrete should be transported over the shortest possible distance and be poured immediately after mixing. When waterproofing is used for concreting in the presence of water, the time interval before pouring should be extremely short, so that the mass remains practically uniform. With these basic conditions for applying concrete in mind, the types of foundation applied to transmission lines are as follows. Among them, we highlight the caisson foundation, which is a deeper kind of foundation, excavated with shovel, pick or auger. Once the reinforcement is in place, the concrete is poured. Generally, reinforced concrete ties are used to brace the trunk (top layer of concrete) and the reinforced concrete footing, which is a shallower concrete foundation relative to the dimensions of its base. These footings typically have a square base and an inclined or vertical shaft.
  • Metallic Foundation: Metallic foundations are used in regions where the ground has good cohesion. The purely metallic foundations used for transmission line structures have a pyramidal shape: the connection to the structure is made at the top of the pyramid, and the interior is hollow and filled with soil. This shape turns the soil itself into a mechanical barrier that prevents the base of the foundation (the base of the pyramid) from rising or moving laterally. Variations on this type of foundation differ in height, in the dimensions of the stringers and in the sheet metal applied. One example is the metal grid, which is classified as a shallow foundation and is connected to the leg of the tower at ground level. The foundation base (the grid itself) consists of U-profile sections and angles. These are shallow foundations, 2-4 feet deep, recommended for clay or sandy soils that are dry, gain strength with depth and can be excavated in the open air.

Now that the structures, cables and foundations have been described, it is important to stress that every type of equipment can be subjected to high mechanical loads. A functional failure, which may even be the fall of a structure, can then occur. Fallen towers represent a critical issue and their causes must be examined. Table 3 presents part of the data set, which comprises 181 metal structures of transmission lines currently in operation in the state of Paraná, Brazil.

Table 3.

Part of the data set of 181 steel structures of transmission lines.

Case | Voltage | Wind | Relief | Cold Air | Hot Air | Temperature | Land | Capacity | Function | Foundation | Result
1 | 69 | 14 | plateau | parallel | perpendicular | 17 | C | high | anchorage | grid | none
2 | 69 | 14 | plateau | parallel | perpendicular | 17 | C | high | suspension | grid | none
3 | 69 | 14 | plateau | parallel | perpendicular | 17 | C | low | anchorage | grid | none
4 | 69 | 14 | plateau | parallel | perpendicular | 17 | C | low | anchorage | stub | none
5 | 69 | 14 | plateau | parallel | perpendicular | 17 | C | low | suspension | grid | none
6 | 69 | 15 | plateau | perpendicular | transversal | 18 | D | high | anchorage | grid | none
7 | 69 | 15 | plateau | perpendicular | transversal | 18 | D | high | suspension | grid | none
8 | 69 | 19 | plain | perpendicular | transversal | 21 | B | low | suspension | grid | fall
9 | 69 | 20 | plateau | parallel | perpendicular | 17 | B | low | suspension | grid | fall
10 | 69 | 20 | plateau | parallel | transversal | 19 | A | low | suspension | grid | fall
11 | 69 | 20 | plateau | transversal | perpendicular | 17 | A | low | suspension | grid | fall
12 | 69 | 23 | plain | transversal | parallel | 20 | A | low | suspension | grid | fall
13 | 88 | 20 | plateau | parallel | transversal | 22 | A | low | suspension | grid | fall
14 | 88 | 20 | plain | parallel | transversal | 22 | B | low | suspension | grid | fall
15 | 138 | 14 | ridge | parallel | transversal | 16 | B | low | anchorage | grid | none
16 | 138 | 14 | ridge | parallel | transversal | 16 | B | low | suspension | grid | none
179 | 230 | 26 | plateau | parallel | transversal | 17 | D | high | suspension | stub | none
180 | 230 | 26 | plateau | parallel | transversal | 17 | D | low | suspension | stub | none
181 | 525 | 17 | plain | perpendicular | transversal | 20 | B | low | suspension | grid | fall

The attributes selected for this article were: operating voltage, wind and relief of the region, air masses, temperatures in the region, land type, mechanical capacity of the structure, function and type of foundation structure.

The response of interest is dichotomous, i.e. if there was a structure falling or not. Among all selected explanatory variables or attributes in the data set, only three variables are quantitative and the others are qualitative. Quantitative variables vary within a certain range, according to its characteristic, and qualitative variables have different classifications according to their nature. Explanatory variables are described below.

  • Operating Voltage: The electrical system of the state of Paraná has transmission lines in the following voltages: {69, 138, 230, 525} where each value is given in kV.
  • Wind of the Region: Wind is an important feature that increases the susceptibility of occurrence of falling structure. It varies according to the region where the structure is located. The wind attribute has the following ranges {16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26} where each value is measured in km/h.
  • Relief: The land in which the structures are set shows formations that help in the visualization of the points where the structure has a greater chance of falling. This attribute is ranked {plain, plateau, ridge, valley}.
  • Air Masses: Because different air masses move in different directions, their interference is studied independently. Each mass generates lateral forces on the cables, of greater or lesser impact, which strain the structures that support the cables. When the air mass acts perpendicularly to the wires, the force applied to them is the greatest possible. The closer the incidence is to the direction parallel to the wires, the smaller the force applied by wind pressure. Therefore, the greatest force on the wires occurs with the air mass perpendicular to them and the lowest with the air mass parallel to them. The air mass attributes are thus classified as cold air masses, which can be {parallel, perpendicular, transversal}, and hot air masses, which can be {parallel, perpendicular, transversal}.
  • Temperature: The southern region is where the greatest variations in temperature throughout the day are registered, due to its distance from the tropics and the fact that it is in the region of strong influence of masses of cold and warm air. This attribute has the bands measured in degrees Celsius {16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27}.
  • Land: The wind regime is influenced by factors such as the topography and roughness of the land. This means that, although average values are used, the actual values can be smaller or larger at some points. The ground can be differentiated into four categories according to its roughness coefficient: A) vast expanses of water, coastal plains and flat deserts; B) open ground with few obstacles; C) land with numerous small obstacles; D) urbanized areas and land with many tall trees. The levels of this attribute are {A, B, C, D}.
  • Mechanical Capacity: A particular type of structure is selected according to the mechanical loads that will be applied to it, and this choice is made in the deployment project. This attribute is therefore classified as {high, low} mechanical capacity.
  • Function Structure: For this attribute, the structure can be applied as {suspension, anchorage}, without considering intermediate possibilities. It refers to the loads to which the tower is subjected; the fittings of anchorage structures are more heavily reinforced than those of suspension structures.
  • Foundation Structure: Two types of foundation are considered: concrete and metal. Concrete foundations can take various forms, but always with a metal frame that ties them to the tower body. As the vast majority of concrete foundations applied are of the stub type, this term is used in this work to refer to all varieties of concrete foundation. Metal foundations also come in different designs, but all are pyramidal in shape. Among the pyramidal foundations, the metal grid type is the most widely applied. The levels of this attribute are {stub, grid}.
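Before either model can be fitted, the attributes described above have to be turned into numbers. The following Python sketch shows one possible coding. The paper does not state how its variables were coded, so the one-hot (indicator) coding of the qualitative attributes is an assumption, and the names `CATEGORIES`, `NUMERIC` and `encode_structure` are hypothetical. The example row is case 8 of Table 3.

```python
# Category levels as listed in the attribute descriptions above.
CATEGORIES = {
    "relief": ["plain", "plateau", "ridge", "valley"],
    "cold_air": ["parallel", "perpendicular", "transversal"],
    "hot_air": ["parallel", "perpendicular", "transversal"],
    "land": ["A", "B", "C", "D"],
    "capacity": ["high", "low"],
    "function": ["suspension", "anchorage"],
    "foundation": ["stub", "grid"],
}
NUMERIC = ["voltage", "wind", "temperature"]  # the three quantitative variables


def encode_structure(row):
    """Return a flat vector: raw numeric values, then one-hot indicators."""
    vector = [float(row[name]) for name in NUMERIC]
    for name, levels in CATEGORIES.items():
        if row[name] not in levels:
            raise ValueError(f"unknown level {row[name]!r} for {name}")
        vector.extend(1.0 if row[name] == level else 0.0 for level in levels)
    return vector


# Case 8 from Table 3 (result: fall).
case_8 = {
    "voltage": 69, "wind": 19, "relief": "plain",
    "cold_air": "perpendicular", "hot_air": "transversal",
    "temperature": 21, "land": "B", "capacity": "low",
    "function": "suspension", "foundation": "grid",
}
x = encode_structure(case_8)
```

Under this assumed coding the vector has 3 + 20 = 23 entries, which happens to match the 23 inputs of the MLP 23-13-2 network reported in Table 6, although the paper does not confirm that this coding was used. A regression package would typically drop one indicator per attribute as a reference level.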

For the classification of transmission lines in their susceptibility to failures we propose applying two different models. First, a logistic regression model will be applied and discussed. Then the automated neural network model will be developed. The comparison of the results obtained will be made via the area under the receiver operating characteristics curve known as the area under the curve (AUC). Fawcett [41] affirms that the ROC curve is a two dimensional depiction of a classifier performance. To compare classifiers we may want to reduce ROC performance to a single scalar value, the AUC, which has an important statistical property: the AUC of a classifier is equivalent to the probability that the classifier will rank a randomly chosen positive instance higher than a randomly chosen negative instance.
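The probabilistic interpretation of the AUC quoted from Fawcett [41] can be checked directly on a small example. The Python sketch below computes the AUC as the fraction of (positive, negative) pairs in which the positive case receives the higher score. The function name `auc_by_ranking` and the scores are illustrative assumptions and are not taken from the study.

```python
def auc_by_ranking(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    A tie between a positive and a negative score counts as half a pair.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Made-up scores for illustration (1 = fall, 0 = none); not data from the paper.
scores = [0.9, 0.8, 0.4, 0.35, 0.1]
labels = [1, 1, 0, 1, 0]
auc = auc_by_ranking(scores, labels)  # 5 of the 6 pairs are ranked correctly
```

The double loop is quadratic in the number of cases, which is acceptable for a data set of 181 structures; library implementations usually obtain the same value from sorted ranks.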


Regression modeling is one of several statistical techniques that enable an analyst to predict a response based on a set of inputs. Linear regression models are commonly used when the range of the response is continuous and can theoretically take any value. Here, the model is used to estimate the probability that a steel structure of a transmission line will fall under certain conditions. As this output is restricted to the interval (0, 1), the assumption of an infinite range fails. An alternative is to use a logistic regression model [42]. The common form of a logistic model is

$$P[c \mid X_t] = \frac{1}{1 + e^{-\left(\beta_0 + \sum_{i} \beta_i x_{t,i}\right)}} \qquad (1)$$

where $P[c \mid X_t]$ is the conditional probability that the observation described by the input vector $X_t$, with components $x_{t,i}$, is a member of class c. What makes the logistic equation appropriate for probability modelling is the use of the sigmoid or “s” function,

$$G(o) = \frac{1}{1 + e^{-o}} \qquad (2)$$

The sigmoid function in equation (2) is a continuous mapping of the real line onto the interval (0, 1). Although this interval is open, whereas the probability interval [0, 1] is closed, it provides a method of modelling percentages and probabilities.
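The role of the sigmoid function can be illustrated with a short Python sketch. It evaluates the logistic sigmoid, 1 / (1 + exp(-o)), and uses it to turn a linear combination of inputs into a probability. The function names and the coefficient values are hypothetical and were chosen only to show the mechanics; they are not estimates from the study.

```python
import math


def sigmoid(o):
    """Logistic sigmoid 1 / (1 + exp(-o)), written to avoid overflow for large |o|."""
    if o >= 0:
        return 1.0 / (1.0 + math.exp(-o))
    e = math.exp(o)
    return e / (1.0 + e)


def logistic_probability(x, beta, beta0):
    """P[c | x] for a logistic model with coefficients beta and intercept beta0."""
    o = beta0 + sum(b * xi for b, xi in zip(beta, x))
    return sigmoid(o)


# Hypothetical coefficients: the linear predictor is -2 + 2*1 - 1*0 = 0.
p = logistic_probability(x=[1.0, 0.0], beta=[2.0, -1.0], beta0=-2.0)
```

A linear predictor of zero maps to a probability of 0.5, and the symmetry sigmoid(o) + sigmoid(-o) = 1 means that changing the sign of the predictor swaps the probabilities of the two classes.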

To compare the logistic regression model more easily with the feedforward neural network model, the logistic model can be written in matrix form:

$$P[c \mid X_t] = G\left(X_t^{T}\beta + \beta_0\right) \qquad (3)$$

In this matrix form, $X_t^{T}$ is the transpose of the vector of inputs, $\beta$ is the vector of estimated parameters, $\beta_0$ is the estimated intercept term and, as before, G(o) represents the sigmoid function.

Although the purpose of this model is to predict the expected probability of steel structure rupture, the logistic regression model provides an additional benefit: insight into the model inputs, or explanatory variables. The increase in the probability of a rupture when a variable is present, expressed as an odds ratio, is easily calculated from the estimated parameters. If input variable i has an estimated parameter βi, the odds ratio can be calculated using equation (4).

$$OR_i = e^{\beta_i} \qquad (4)$$
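In a logistic model the odds ratio of an input is the exponential of its coefficient. The Python sketch below computes it and verifies numerically that switching an indicator variable from 0 to 1 multiplies the odds by that amount. The coefficient and intercept values are hypothetical and are not taken from Table 4.

```python
import math


def odds_ratio(beta_i):
    """Multiplicative change in the odds for a one-unit increase in input i."""
    return math.exp(beta_i)


def odds(p):
    """Odds corresponding to a probability p in (0, 1)."""
    return p / (1.0 - p)


beta_i = 0.7   # hypothetical coefficient of an indicator variable
beta0 = -1.2   # hypothetical intercept
ratio = odds_ratio(beta_i)

# Probabilities with the indicator absent (0) and present (1).
p_absent = 1.0 / (1.0 + math.exp(-beta0))
p_present = 1.0 / (1.0 + math.exp(-(beta0 + beta_i)))
observed_ratio = odds(p_present) / odds(p_absent)
```

The value `observed_ratio` agrees with `ratio` up to rounding error, which is why the coefficients of a fitted logistic model can be read directly as effects on the odds.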


Whether a logistic regression model or a feedforward neural network is chosen, the response data are dichotomous. It is because of this ability to model dichotomous outputs that the logistic model is a common tool in many fields.

The main findings of the regression model for the steel structure data (Table 3) are described in the following tables and graphs. Minitab software was used to run the analysis, and comments are included. Table 4 describes the response information, factor information and logistic regression table given by the Minitab results. Table 5 shows the Minitab results for the G statistic, goodness-of-fit tests, table of frequencies and measures of association. Fig. (2) presents the Delta chi-square plots and their respective interpretation.

With the failure probabilities given by the LR, the ROC curve shown in Fig. (3) can be plotted. The calculated AUC was 0.983, which indicates excellent performance for the classifier.

Fig. (2). Delta Chi-Square plots and interpretation.


Artificial neural networks (ANNs) have been used increasingly as a promising modeling tool in almost all areas of human activities where quantitative approaches can be used to help decision making. They have already been treated as a standard nonlinear alternative to traditional models for pattern classification, time series analysis, and regression problems [43].

ANNs, first used in the fields of cognitive science and engineering, are universal and highly flexible function approximators [44]. As cited by Tsay [45], ANNs are general and flexible tools for forecasting applications:

A popular topic in modern data analysis is ANN, which can be classified as a semiparametric method. As opposed to the model-based nonlinear methods, ANNs are data-driven approaches which can capture nonlinear data structures without prior assumption about the underlying relationship in a particular problem.

Fig. (4) shows the ANN structure employed in the present study: A multilayer feedforward network trained with Backpropagation. The ANN has three types of layers, namely, the input layer, the output layer and the hidden layer, which is intermediate between the input and output layers. The number of hidden layers is usually one or two. Each layer consists of neurons, and the neurons in two adjacent layers are fully connected with respective weights, while the neurons within the same layer are not connected. In this paper, the output layer has just a single neuron, which represents the one-step forecasting based on previous points.

Fig. (3).

ROC curve of the LR classifier.

Each neuron in the input layer is designated to an attribute in the data, and produces an output which is equal to the (scaled) value of the corresponding attribute. For each neuron in the hidden or output layer, the following input-output transformation is employed:

Fig. (4).

Multilayer feedforward ANN structure.


$$v = f\left(w_0 + \sum_{h=1}^{H} w_h u_h\right) \qquad (5)$$

where v is the output, H is the total number of neurons in the previous layer, $u_h$ is the output of the hth neuron in the previous layer, $w_h$ is the corresponding connection weight and $w_0$ is the bias (or intercept). f is the nonlinear transformation function (or activation function), also used in the output layer. The following transformation function, for example, is employed very often:

$$f(z) = \frac{1}{1 + e^{-z}} \qquad (6)$$
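The input-output transformation of a single neuron, and of a fully connected layer built from such neurons, can be sketched in a few lines of Python. The sketch assumes the usual weighted sum plus bias passed through an activation function; the logistic activation is used as the default, and the weights are hypothetical values chosen so that the result is easy to verify by hand.

```python
import math


def logistic(z):
    """Logistic activation function."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron_output(u, w, w0, f=logistic):
    """v = f(w0 + sum over h of w[h] * u[h]) for one hidden or output neuron."""
    if len(u) != len(w):
        raise ValueError("u and w must both have length H")
    return f(w0 + sum(wh * uh for wh, uh in zip(w, u)))


def layer_output(u, weights, biases, f=logistic):
    """Apply neuron_output to every neuron of a fully connected layer."""
    return [neuron_output(u, w, w0, f) for w, w0 in zip(weights, biases)]


# Hypothetical layer with H = 2 inputs and 2 neurons; both weighted sums are 0.
u = [1.0, -1.0]
hidden = layer_output(u, weights=[[0.5, 0.5], [1.0, -1.0]], biases=[0.0, -2.0])
```

Stacking two such layers, with the outputs of the first passed as the inputs `u` of the second, gives the forward pass of a network with one hidden layer.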


When the ANN is trained using the Backpropagation algorithm the weights and biases are optimized. The objective function employed for optimization is the sum of the squares of the difference between a desirable output (ytarget) and an estimated output (ybpn).
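Written out, the objective described above can take the following form. This is a sketch in the notation of the text: the index n over the N training cases is an assumption introduced here, and some formulations include a factor of 1/2, which does not change the minimizer.

```latex
E = \sum_{n=1}^{N} \left( y_{\mathrm{target},n} - y_{\mathrm{bpn},n} \right)^{2}
```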

A review of ANNs from statistical and econometric perspectives can be found in [46]. Today ANNs are used in a variety of modeling and forecasting problems. Although many models commonly used in real problems are linear, the nature of most real data sets suggests that nonlinear models are more appropriate for forecasting and accurately describing them. ANNs play an important role in this kind of forecasting.

The literature on ANNs is enormous and their applications are spread over many scientific areas with varying degrees of success. In the M-Competition [47], M2-Competition [48] and M3-Competition [49] many participants used ANNs. The main reason for this increased popularity is that these models have been shown to be able to approximate almost any nonlinear function arbitrarily closely.

Several factors have been considered in the literature when training ANNs. Table 6 presents the characteristics of the ANN constructed, and details are given next. For the development of the network, the software Statistica (with the Automated Neural Network toolbox) was employed (Statsoft, 2008).

Table 6.

ANN characteristics.

Net. Name | Training Perf. | Test Perf. | Training Algorithm | Error Function | Hidden Activation | Output Activation
MLP 23-13-2 | 99.31034 | 94.44444 | BFGS 27 | SOS | Logistic | Identity

1. ANN Architecture/Net. name: ANNs are nonlinear modeling algorithms. Examples of ANN for nonlinear time series are Multilayer Perceptrons (MLP), Radial Basis Function (RBF), Support Vector Machine (SVM), among many others. The multilayer perceptron is the most common form of network and the one used here. It requires iterative training, which may be quite slow for large numbers of hidden units and large datasets, but the networks are quite compact, execute quickly once trained, and in most problems yield better results than the other types of networks. Each model has a name that depends on its type, e.g. MLP (Multilayer Perceptron), the number of inputs, the number of neurons in the hidden layer and the number of outputs. For example, the model named MLP 23-13-2 refers to a multilayer perceptron network with 23 inputs, 13 neurons in the hidden layer and 2 outputs.
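The naming convention also determines the size of the network. The Python sketch below parses a name such as MLP 23-13-2 and counts the adjustable parameters, assuming fully connected layers with one bias per hidden and output neuron. The function names are hypothetical, and the count is an illustration rather than a figure reported in the paper.

```python
def parse_mlp_name(name):
    """Turn a name such as 'MLP 23-13-2' into the layer sizes [23, 13, 2]."""
    kind, sizes = name.split()
    if kind != "MLP":
        raise ValueError("only MLP names are handled in this sketch")
    return [int(s) for s in sizes.split("-")]


def mlp_parameter_count(layer_sizes):
    """Number of weights and biases in a fully connected feedforward network."""
    weights = sum(a * b for a, b in zip(layer_sizes, layer_sizes[1:]))
    biases = sum(layer_sizes[1:])
    return weights + biases


sizes = parse_mlp_name("MLP 23-13-2")
n_params = mlp_parameter_count(sizes)  # 23*13 + 13*2 weights, 13 + 2 biases
```

Under these assumptions the network has 340 adjustable parameters, more than the 181 structures in the data set, which is one reason the performance on cases not used for training deserves attention.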

2. Training Performance/Test Performance: These columns indicate the performance of the network on the subsets used. The performance measure depends on the type of network target variable. For nominal variables (classification networks), the performance measure is the proportion of cases correctly classified, which is known as the classification rate.
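The classification rate is simple to compute once predicted and observed classes are available. The following Python sketch expresses it as a percentage, in line with the values shown in Table 6. The function name `classification_rate` and the labels are made up for illustration and are not the predictions of the study.

```python
def classification_rate(predicted, actual):
    """Percentage of cases whose predicted class equals the observed class."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(1 for p, a in zip(predicted, actual) if p == a)
    return 100.0 * correct / len(actual)


# Made-up labels for illustration; four of the five cases are classified correctly.
actual = ["none", "none", "fall", "fall", "none"]
predicted = ["none", "fall", "fall", "fall", "none"]
rate = classification_rate(predicted, actual)
```

As a point of arithmetic, the rates in Table 6 are consistent with, for example, 144 of 145 training cases and 17 of 18 test cases being classified correctly, although the paper does not state the sizes of its subsets.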

3. Training Algorithm: This column indicates the training algorithm chosen for the MLP. The available algorithms include:

  • Gradient Descent. Gradient descent is a first order optimization algorithm that attempts to move incrementally to successively lower points in search space in order to locate a minimum.
  • Conjugate Gradient Descent. Conjugate gradient descent is a fast training algorithm for multilayer perceptrons that proceeds by a series of line searches through error space. Succeeding search directions are selected to be conjugate (non-interfering). It is a good generic algorithm with generally fast convergence.
  • BFGS. BFGS (Broyden-Fletcher-Goldfarb-Shanno, a Quasi-Newton method) is a powerful second order training algorithm with very fast convergence but high memory requirements, because it stores an approximation of the Hessian matrix.
  The results report the algorithm used, followed by the number of epochs for which it ran (for iterative algorithms). For example, the code BFGS 27 indicates that the BFGS algorithm was used and that this network was found on the 27th cycle (the actual number of cycles used to train the model may be larger).

4. Error Function: This column indicates the error function used, either sum-of-squares (SOS) or cross entropy (CE). CE is used for classification tasks only. SOS can be used for both classification and regression tasks.

5. Hidden Activation: This column indicates the activation function used for the hidden layer. Possible activation functions for MLP networks include Identity, Logistic, Tanh, Exponential, Sine.

  • Identity. Uses the identity function. With this function, the activation level is passed on directly as the output of the neurons.
  • Logistic. Uses the logistic sigmoid function. This is an S-shaped (sigmoid) curve, with output in the range (0,1).
  • Tanh. Uses the hyperbolic tangent function (recommended). The hyperbolic tangent function (tanh) is a symmetric S-shaped (sigmoid) function, whose output lies in the range (-1, +1). Often performs better than the logistic sigmoid function because of its symmetry.
  • Exp. Uses the negative exponential activation function.
  • Sine. Uses the standard sine activation function.

6. Output Activation: This column indicates the activation function used for the output layer. Possible activation functions for MLP networks include Identity, Logistic, Tanh, Exponential, Sine, and Softmax. Softmax activation functions are used with the cross entropy error, which can be used only for classification tasks.
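The configuration listed in Table 6 can be illustrated with a minimal NumPy/SciPy sketch. This is not the Statistica implementation used in the study. It assumes synthetic stand-in data (not the tower data set), and every function name below is hypothetical. It builds a 23-13-2 perceptron with logistic hidden units, identity outputs and a sum-of-squares error, trains it with SciPy's BFGS routine capped at 27 iterations, and reports the classification rate.

```python
import numpy as np
from scipy.optimize import minimize

N_IN, N_HID, N_OUT = 23, 13, 2  # the "MLP 23-13-2" layout from Table 6


def logistic(z):
    return 1.0 / (1.0 + np.exp(-z))


def unpack(theta):
    """Split the flat parameter vector into weights and biases."""
    i = 0
    W1 = theta[i:i + N_IN * N_HID].reshape(N_IN, N_HID)
    i += N_IN * N_HID
    b1 = theta[i:i + N_HID]
    i += N_HID
    W2 = theta[i:i + N_HID * N_OUT].reshape(N_HID, N_OUT)
    i += N_HID * N_OUT
    b2 = theta[i:i + N_OUT]
    return W1, b1, W2, b2


def forward(theta, X):
    W1, b1, W2, b2 = unpack(theta)
    H = logistic(X @ W1 + b1)  # logistic hidden activation
    Y = H @ W2 + b2            # identity output activation
    return H, Y


def sos_loss_and_grad(theta, X, T):
    """Sum-of-squares error and its gradient (backpropagation)."""
    W1, b1, W2, b2 = unpack(theta)
    H, Y = forward(theta, X)
    E = Y - T
    loss = 0.5 * np.sum(E ** 2)
    dZ1 = (E @ W2.T) * H * (1.0 - H)  # logistic derivative is H * (1 - H)
    grad = np.concatenate([
        (X.T @ dZ1).ravel(), dZ1.sum(axis=0),
        (H.T @ E).ravel(), E.sum(axis=0),
    ])
    return loss, grad


def classification_rate(theta, X, labels):
    """Proportion of cases whose largest output matches the true class."""
    _, Y = forward(theta, X)
    return float(np.mean(np.argmax(Y, axis=1) == labels))


# Synthetic stand-in data: 120 cases, 23 inputs, 2 classes (assumption).
rng = np.random.default_rng(0)
X = rng.normal(size=(120, N_IN))
labels = (X[:, 0] + X[:, 1] > 0).astype(int)
T = np.eye(N_OUT)[labels]  # one-hot targets, one column per output neuron

n_params = N_IN * N_HID + N_HID + N_HID * N_OUT + N_OUT
theta0 = rng.normal(scale=0.1, size=n_params)
result = minimize(sos_loss_and_grad, theta0, args=(X, T), jac=True,
                  method="BFGS", options={"maxiter": 27})
train_rate = classification_rate(result.x, X, labels)
print(f"parameters: {n_params}, final SOS: {result.fun:.3f}, "
      f"training classification rate: {train_rate:.3f}")
```

The network has 340 free parameters, so the dense matrix that BFGS maintains has 340 x 340 entries. This is small here, but it grows quadratically with network size, which is the memory cost mentioned under item 3.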

Fig. (5) shows the receiver operating characteristic (ROC) curve for the MLP 23-13-2. The area under the curve was 0,994, numerically higher than the 0,979 obtained by the logistic regression model.
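For readers unfamiliar with the area under the ROC curve (AUC), it equals the probability that a randomly chosen positive case (here, a failure) receives a higher score than a randomly chosen negative case. The sketch below computes it by direct pair counting. The function name and the toy scores are illustrative assumptions, not data from this study.

```python
def auc_by_ranking(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as one half.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example (hypothetical scores): 3 of the 4 pairs are ordered correctly.
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_labels = [0, 0, 1, 1]
toy_auc = auc_by_ranking(toy_scores, toy_labels)
print(toy_auc)
```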

Several papers have discussed how to test the statistical significance of the difference between the areas under two dependent ROC curves. The methods of Hanley and McNeil [50] and DeLong et al. [51] are the most prominent in the papers reviewed. We tested the difference according to both methodologies, and the results are presented in Table 7. As the significance levels show (p-values > 0,05), there is insufficient evidence that one area is larger than the other. In other words, logistic regression and neural networks both have excellent and similar classification performance for the example under investigation. Fig. (6) shows both curves plotted on the same graph.
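Once the difference between the areas and its standard error are available, both tests reduce to a two-sided z test. The sketch below covers only that final step, using the differences and standard errors reported in Table 7 as inputs. It does not show how either method estimates the standard error, and the function names are hypothetical. Because the table entries are rounded, the output agrees with the published z statistics and p-values only approximately.

```python
from math import erf, sqrt


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def auc_difference_test(diff, se, z_crit=1.959964):
    """Two-sided z test and 95% confidence interval for an AUC difference."""
    z = diff / se
    p_value = 2.0 * (1.0 - normal_cdf(abs(z)))
    ci = (diff - z_crit * se, diff + z_crit * se)
    return z, p_value, ci


# Inputs taken from Table 7 (rounded values).
hanley = auc_difference_test(0.0108, 0.00790)
delong = auc_difference_test(0.0108, 0.00815)
for name, (z, p, ci) in (("Hanley-McNeil", hanley), ("DeLong", delong)):
    print(f"{name}: z = {z:.3f}, p = {p:.4f}, "
          f"95% CI = ({ci[0]:.5f}, {ci[1]:.5f})")
```

Both confidence intervals contain zero, which is the same conclusion as p > 0,05: the data do not show that one area is larger than the other.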

Fig. (5).

ROC curve of the ANN classifier.

Table 7.

Results of the tests reporting statistical significance of the difference between AUC.

Statistic | Hanley and McNeil's Method | DeLong et al. Method
Difference between areas | 0,0108 | 0,0108
Standard Error | 0,00790 | 0,00815
95% Confidence Interval | -0,00470 to 0,0263 | -0,00520 to 0,0268
z statistic | 1,365 | 1,322
Significance level | P = 0,1724 | P = 0,1860

Fig. (6).

LR and ANN ROC curves.


In this paper, we assessed the probability of failures in steel structures of transmission lines using two different techniques, logistic regression and artificial neural networks. The aim was to extract knowledge about which variables influence the mechanical behavior of the operating lines and can be used to diagnose potential falling towers. For the classification of transmission lines susceptible to failures, the following parameters have been considered: operating voltage, wind and relief of the region, air masses, temperature, land type, mechanical capacity, function and foundation structure.

The results of the logistic regression and neural network models indicate which structures are more susceptible to falling. From the logistic regression results we can infer that variables with p-values below 0,05 are significant and that those with large absolute coefficient values influence the outcome of interest more strongly. For example, the relief p-values are very low while the corresponding coefficients are high, demonstrating that this variable has considerable influence on the outcome under investigation. The wind p-value, on the other hand, is high, which indicates no significant influence on the outcome. With this preliminary evaluation of the vulnerable structures, studies, improvements and corrective actions can be planned in advance, minimizing the costs of load shedding and avoiding large lost profits and damages. For both the energy concessionaire and the general population, the risks and costs of a fallen tower are higher than those of acting preemptively.

Depending on the goals or the characteristics of the data, one model can be more adequate than the other. Artificial neural networks may be particularly useful when the main goal is outcome classification and important interactions or complex nonlinearities exist in the data set. They also require less formal statistical training and can be developed using several different training algorithms. A limitation of neural network models is that standardized coefficients and odds ratios corresponding to each variable cannot be easily calculated and presented as they are in regression models.

Logistic regression remains the clear choice when the primary goal of model development is to look for possible causal relationships between independent and dependent variables, and when the modeler wishes to understand the effect of the predictor variables on the outcome directly, since the model equation is available.
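The interpretability argument can be made concrete with the logistic model equation, p = 1 / (1 + exp(-(b0 + b1 x1 + ... + bk xk))), in which exp(bi) is the odds ratio for a one-unit increase in xi. The sketch below uses hypothetical coefficients chosen only for illustration. They are not the coefficients fitted in this study, and the function names are assumptions.

```python
from math import exp


def failure_probability(intercept, coefs, x):
    """Logistic model equation: probability of the event given predictors x."""
    eta = intercept + sum(b * xi for b, xi in zip(coefs, x))
    return 1.0 / (1.0 + exp(-eta))


def odds(p):
    return p / (1.0 - p)


def odds_ratio(coef):
    """Multiplicative change in the odds for a one-unit increase in a predictor."""
    return exp(coef)


# Hypothetical coefficients for two predictors (illustration only).
b0, b = -2.0, [1.2, 0.05]
p_low = failure_probability(b0, b, [0.0, 1.0])
p_high = failure_probability(b0, b, [1.0, 1.0])  # first predictor raised by one unit

print(f"p_low = {p_low:.3f}, p_high = {p_high:.3f}")
print(f"ratio of odds = {odds(p_high) / odds(p_low):.3f}, "
      f"exp(b1) = {odds_ratio(b[0]):.3f}")
```

The two printed ratios coincide, which is why a coefficient can be read directly as an effect on the odds. A trained neural network offers no comparable per-variable summary.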

Numerically, the performance of the artificial neural network was higher than that of the logistic regression model. However, the difference was not statistically significant, and both classifiers perform excellently. In other words, the models selected by ANN and LR performed quite similarly, and the two analytic methods were roughly equivalent in classification ability, as shown by their nearly equal AUC values. The ANN methodology is more robust (i.e., it does not require a high level of operator judgment), and it uses a sophisticated nonlinear model to achieve high classification performance. Logistic regression, on the other hand, may generate many sets of models that yield similar performance, and the operator must exercise judgment to select the best one.


The authors confirm that this article content has no conflict of interest.


Declared none.


[1] Wazen RN, Fernandes TS, Aoki AR, de Souza WE. Evaluation of the susceptibility of failures in steel structures of transmission lines. J Control Autom Electr Syst 2013; 24(1-2): 174-86.
[2] Zhang GP. Neural networks for classification: a survey. IEEE Trans Syst Man Cybern C Appl Rev 2000; 30(4): 451-62.
[3] Tu JV. Advantages and disadvantages of using artificial neural networks versus logistic regression for predicting medical outcomes. J Clin Epidemiol 1996; 49(11): 1225-31.
[4] Schumacher M, Robner R, Vach W. Neural networks and logistic regression: Part I. Comput Stat Data Anal 1996; 21: 661-82.
[5] Vach W, Robner R, Schumacher M. Neural networks and logistic regression: Part II. Comput Stat Data Anal 1996; 21(95): 683-701.
[6] Freeman RV, Eagle KA, Bates ER, et al. Comparison of artificial neural networks with logistic regression in prediction of in-hospital death after percutaneous transluminal coronary angioplasty. Am Heart J 2000; 140(3): 511-20.
[7] Leung P, Tran LT. Predicting shrimp disease occurrence: artificial neural networks vs. logistic regression. Aquaculture 2000; 187(1-2): 35-49.
[8] Borque A, Sanz G, Allepuz C, Plaza L, Gil P, Rioja LA. The use of neural networks and logistic regression analysis for predicting pathological stage in men undergoing radical prostatectomy: a population based study. J Urol 2001; 166(5): 1672-8.
[9] Chun FK, Graefen M, Briganti A, et al. Initial biopsy outcome prediction--head-to-head comparison of a logistic regression-based nomogram versus artificial neural network. Eur Urol 2007; 51(5): 1236-40.
[10] Kawakami S, Numao N, Okubo Y, et al. Development, validation, and head-to-head comparison of logistic regression-based nomograms and artificial neural network models predicting prostate cancer on initial extended biopsy. Eur Urol 2008; 54(3): 601-11.
[11] Ottenbacher KJ, Smith PM, Illig SB, Linn RT, Fiedler RC, Granger CV. Comparison of logistic regression and neural networks to predict rehospitalization in patients with stroke. J Clin Epidemiol 2001; 54(11): 1159-65.
[12] Nguyen T, Malley R, Inkelis S, Kuppermann N. Comparison of prediction models for adverse outcome in pediatric meningococcal disease using artificial neural network and logistic regression analyses. J Clin Epidemiol 2002; 55(7): 687-95.
[13] DiRusso SM, Chahine AA, Sullivan T, et al. Development of a model for prediction of survival in pediatric trauma patients: comparison of artificial neural networks and logistic regression. J Pediatr Surg 2002; 37(7): 1098-104.
[14] Dreiseitl S, Ohno-Machado L. Logistic regression and artificial neural network classification models: a methodology review. J Biomed Inform 2002; 35(5-6): 352-9.
[15] Hajmeer M, Basheer I. Comparison of logistic regression and neural network-based classifiers for bacterial growth. Food Microbiol 2003; 20(1): 43-55.
[16] Ottenbacher KJ, Linn RT, Smith PM, Illig SB, Mancuso M, Granger CV. Comparison of logistic regression and neural network analysis applied to predicting living setting after hip fracture. Ann Epidemiol 2004; 14(8): 551-9.
[17] Lin C-C, Ou Y-K, Chen S-H, Liu Y-C, Lin J. Comparison of artificial neural network and logistic regression models for predicting mortality in elderly patients with hip fracture. Injury 2010; 41(8): 869-73.
[18] Ergün UU, Serhatlioğlu S, Hardalaç F, Güler I. Classification of carotid artery stenosis of patients with diabetes by neural network and logistic regression. Comput Biol Med 2004; 34(5): 389-405.
[19] Yesilnacar E, Topal T. Landslide susceptibility mapping: A comparison of logistic regression and neural networks methods in a medium scale study, Hendek region (Turkey). Eng Geol 2005; 79(3-4): 251-66.
[20] Yilmaz I. Landslide susceptibility mapping using frequency ratio, logistic regression, artificial neural networks and their comparison: A case study from Kat landslides (Tokat—Turkey). Comput Geosci 2009; 35(6): 1125-38.
[21] Pradhan B, Lee S. Landslide susceptibility assessment and factor effect analysis: backpropagation artificial neural networks and their comparison with frequency ratio and bivariate logistic regression modelling. Environ Model Softw 2010; 25(6): 747-59.
[22] Choi J, Oh H-J, Lee H-J, Lee C, Lee S. Combining landslide susceptibility maps obtained from frequency ratio, logistic regression, and artificial neural network models using ASTER images and GIS. Eng Geol 2012; 124: 12-23.
[23] Song JH, Venkatesh SS, Conant EA, Arger PH, Sehgal CM. Comparative analysis of logistic regression and artificial neural network for computer-aided diagnosis of breast masses. Acad Radiol 2005; 12(4): 487-95.
[24] McLaren CE, Chen W-P, Nie K, Su M-Y. Prediction of malignant breast lesions from MRI features: a comparison of artificial neural network and logistic regression techniques. Acad Radiol 2009; 16(7): 842-51.
[25] Green M, Björk J, Forberg J, Ekelund U, Edenbrandt L, Ohlsson M. Comparison between neural networks and multiple logistic regression to predict acute coronary syndrome in the emergency room. Artif Intell Med 2006; 38(3): 305-18.
[26] Chiang WK, Zhang D, Zhou L. Predicting and explaining patronage behavior toward web and traditional stores using neural networks: a comparative analysis with logistic regression. Decis Support Syst 2006; 41(2): 514-31.
[27] Liew P-L, Lee Y-C, Lin Y-C, et al. Comparison of artificial neural networks with logistic regression in prediction of gallbladder disease among obese patients. Dig Liver Dis 2007; 39(4): 356-62.
[28] Gutiérrez PA, López-Granados F, Peña-Barragán JM, Jurado-Expósito M, Hervás-Martínez C. Logistic regression product-unit neural networks for mapping Ridolfia segetum infestations in sunflower crop using multitemporal remote sensed data. Comput Electron Agric 2008; 64(2): 293-306.
[29] Kurt I, Ture M, Kurum AT. Comparing performances of logistic regression, classification and regression tree, and neural networks for predicting coronary artery disease. Expert Syst Appl 2008; 34(1): 366-74.
[30] Al Housseini A, Newman T, Cox A, Devoe LD. Prediction of risk for cesarean delivery in term nulliparas: a comparison of neural network and multiple logistic regression models. Am J Obstet Gynecol 2009; 201(1): 113.e1-6.
[31] Caocci G, Baccoli R, Vacca A, et al. Comparison between an artificial neural network and logistic regression in predicting acute graft-vs-host disease after unrelated donor hematopoietic stem cell transplantation in thalassemia patients. Exp Hematol 2010; 38(5): 426-33.
[32] Pavlekovic M, Bensic M, Zekic-Susac M. Modeling children’s mathematical gift by neural networks and logistic regression. Expert Syst Appl 2010; 37(10): 7167-73.
[33] Trtica-Majnaric L, Zekic-Susac M, Sarlija N, Vitale B. Prediction of influenza vaccination outcome by neural networks and logistic regression. J Biomed Inform 2010; 43(5): 774-81.
[34] Chen H, Zhang J, Xu Y, Chen B, Zhang K. Performance comparison of artificial neural network and logistic regression model for differentiating lung nodules on CT scans. Expert Syst Appl 2012; 39(13): 11503-9.
[35] Larasati A, DeYong C, Slevitch L. The application of neural network and logistics regression models on predicting customer satisfaction in a student-operated restaurant. Procedia Soc Behav Sci 2012; 65: 94-9.
[36] Pourshahriar H. Correct vs. accurate prediction: A comparison between prediction power of artificial neural networks and logistic regression in psychological researches. Procedia Soc Behav Sci 2012; 32(2011): 97-103.
[37] Swiderski B, Kurek J, Osowski S. Multistage classification by using logistic regression and neural networks for assessment of financial condition of company. Decis Support Syst 2012; 52(2): 539-47.
[38] Askin OE, Gokalp F. Comparing the predictive and classification performances of logistic regression and neural networks: a case study on TIMSS 2011. Procedia Soc Behav Sci 2013; 106: 667-76.
[39] Morteza A, Nakhjavani M, Asgarani F, Carvalho FL, Karimi R, Esteghamati A. Inconsistency in albuminuria predictors in type 2 diabetes: a comparison between neural network and conditional logistic regression. Transl Res 2013; 161(5): 397-405.
[40] Vallejos JA, McKinnon SD. Logistic regression and neural network classification of seismic records. Int J Rock Mech Min Sci 2013; 62: 86-95.
[41] Fawcett T. An introduction to ROC analysis. Pattern Recognit Lett 2006; 27(8): 861-74.
[42] Hosmer DW, Lemeshow S, Sturdivant RX. Applied Logistic Regression. 3rd ed. Hoboken, NJ, USA: John Wiley & Sons 2013; p. 518.
[43] Zhang GP. Avoiding Pitfalls in neural network research. IEEE Trans Syst Man Cybern C 2007; 37(1): 3-16.
[44] Balestrassi PP, Popova E, Paiva AP, Marangon Lima JW. Design of experiments on neural network’s training for nonlinear time series forecasting. Neurocomputing 2009; 72(4-6): 1160-78.
[45] Tsay RS. Analysis of Financial Time Series. 3rd ed. USA: Wiley and Sons 2010; p. 712.
[46] Cheng B, Titterington DM. Neural networks: A review from a statistical perspective. Stat Sci 1994; 9(1): 2-54.
[47] Makridakis S, Andersen A, Carbone R, et al. The accuracy of extrapolation (time series) methods: Results of a forecasting competition. J Forecast 1982; 1(2): 111-53.
[48] Makridakis S, Chatfield C, Hibon M, et al. The M2-Competition: A real-time judgmentally based forecasting study. Int J Forecast 1993; 9: 5-22.
[49] Makridakis S, Hibon M. The M3-Competition: results, conclusions and implications. Int J Forecast 2000; 16: 451-76.
[50] Hanley JA, McNeil BJ. A method of comparing the areas under receiver operating characteristic curves derived from the same cases. Radiology 1983; 148(3): 839-43.
[51] DeLong ER, DeLong DM, Clarke-Pearson DL. Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. Biometrics 1988; 44(3): 837-45.