Business Case
for
Transformer On-line Monitoring
EPRI Diagnostic Conference
July 2006
John E. Skog P.E.
Maintenance and Test Engineering LLC
John.skog@mtec2000.com
Anthony Johnson P.E.
Southern California Edison Co.
anthony.johnson@sce.com
A study on the applicability of multi-gas monitors was made
on an important population of Southern California Edison’s transformers
commonly referred to as the “A Bank Population”. This population represented both a
significant financial and functional investment. While the
In order to reduce the risk associated with an “A Bank” failure and ensure maximum operating life is obtained from the fleet, the implementation of a multi-gas on-line monitoring system has been shown to be both technically and economically prudent.
Figure 1 Aging "A Bank" Transformer Fleet
Beyond the risk of failure, there is a need to consider pre-emptive replacement of older units but unfortunately there is not a clearly identifiable age at which transformer retirement should take place. While it is expected that the retirement of older units will improve overall fleet reliability and reduce risk, it comes with a significant capital cost.
Current PM programs and operation practices have provided
The business case that follows conclusively shows that the
implementation of continuous multi-gas on-line monitoring on “A Bank”
transformers is of significant benefit to
· Economic: Greatly reduces the impacts of failure. A payback period of less than 5 years is expected for each multi-gas monitor installation (see Table 1).
· Risk Reduction: Significantly reduces the overall risk of transformer failure even when all modes of failure and current maintenance practices are included. A 39% reduction in the overall failure risk for the existing fleet is anticipated (see Figure 2).
· Preservation of Capital: Allows
Figure 2 "A Bank" Risk Profiles for Various PM Approaches
Throughout the Electric Utility Industry, there has been significant emphasis on the subject of transformer life management. There are numerous reasons for this focused attention, among them:
· The average age of the transformer fleet is increasing.
· Organizational changes are segmenting the utility, reducing staffing levels resulting in fewer transformer and asset technical experts.
· The transmission grid is being operated with lower margins.
· The transformer is the single most costly item in a substation.
· Replacement transformer lead times can exceed one year.
· Financial pressure to reduce both capital and maintenance expenditures has resulted in increased load factors and reduced spares.
· New technologies and diagnostic techniques are making on-line monitoring practical and effective.
For large capital assets like transformers, the direct capital replacement cost is often the most significant driver for extending a transformer's operating life. The indirect costs associated with a transformer failure are becoming more and more significant, resulting in the need not just to extend the life of a transformer but to ensure that any life extension is accompanied by an increase in reliability. Indirect costs of significance include:
· Transmission grid congestion
· Loss of supply to key customers
· Generation impacts
· Environmental damage
· Political repercussions
Power transformers represent a significant capital investment by Southern California Edison. Replacement of these devices occurs for three general reasons:
· Technical inadequacy such as capacity or voltage
· Failure of the transformer or one of its key subcomponents
· Risk of imminent failure
Current PM Practices significantly reduce the likelihood of a transformer failure by correcting deficiencies, replacing worn components and determining the transformer’s operating health. As the operating health of the transformer diminishes, there is a heightened awareness of imminent failure. For many critical transformer components, there is no economically effective method for correcting these health deficiencies, so continuing to operate the transformers adds an increased burden of risk. If the health deficiencies indicate a quickly deteriorating condition, immediate response is required; if they indicate a slowly deteriorating condition, a delayed response may be appropriate. Unfortunately, it can be very difficult to determine the rate of deterioration, especially if the health analysis is based on data gathered at infrequent periodic intervals. To reduce the burden of risk created by the uncertainty of when an incipient failure may become catastrophic, retirement and replacement of the transformer is often the most prudent response. The resulting irony is that while maintenance reduces the probability of certain modes of failure, it heightens the risk awareness for other modes of failure.
The PM Practices employed by
· Improved reliability: reduction in catastrophic failures
· Extended operating life: longer return on initial capital investment
· Reduced risk
· An ability to overload the transformer without significant loss of life
The objective of this study is to determine if on-line, continuous measurement of dissolved gases in transformer oil is an effective method of reducing in-service failures of large power transformers, extending their in-service life and significantly reducing overall transformer operating risk. This study focuses on maintenance, financial and risk issues; it assumes that the on-line monitoring system being employed is accurate and reliable and that all other forms of maintenance are performed in a responsible manner. The study will also determine if the expense of installing and operating such monitoring systems provides significant benefit compared to traditional alternate forms of diagnostic testing and maintenance.
In order to meet this objective, the study will examine:
· Critical functions performed by the transformer
· Common modes and causes of transformer failure
· Failure mechanisms
· Industry failure statistics
·
· Costs
· Risks
· Data management
This study is focused on “A Bank” transformers installed at
the Southern California Edison Company and the application of multi-gas on-line
monitors. These 188 transformers are
characterized as large power transformers that supply
Specific characteristics of this transformer population include:
· 220 kV to 115 kV or 66 kV
· 12 to 280 MVA
· Single and Three Phase
· Average Age = 39 Years
· Max Age = 76 Years
· Replacement Costs $3M to $4M (on the pad)
Figure 3.
While the age distribution shown in Figure 3 depicts an aged population, with almost half of the transformers exceeding the 40 year “nominal” useful operating life, one must question if these units can be reliably operated for 60 or more years. One must also question if calendar age is a true measure of operating life. The need to replace any of these units should be based on:
· Technical obsolescence
· Incipient failure
· Excessive risk
This study will try to determine if the implementation of
on-line multi-gas monitoring systems can be an effective method of reducing the
risk of in-service failure of “A Bank” transformers allowing
A transformer is considered not to be an assembly of electrical and mechanical components but rather the integration of a number of specialized functional systems. Each of these systems performs a unique and important function having its own unique modes of failure. These critical functions must be preserved in order to prevent both major and minor failures. Important subsystems and their functions will be described in the following subsections.
Dielectric System:
The dielectric system provides electrical isolation between windings, phases and ground planes. It includes all major and minor insulation elements found in the power transformer. The insulation system must be capable of withstanding specified operational electric stresses, considering a permissible level of overloads. The elements of a dielectric system include:
· Paper insulation used in windings
· Solid insulation used for blocking
· Lead insulation
· Insulating oil
· Electrostatic shields
Electromagnetic Circuit:
The electromagnetic circuit includes those magnetic elements that create, contain and couple the magnetic flux. Elements include:
· Core
· Windings
· Magnetic shields
· Grounding circuit
Current Carrying Circuit:
The current carrying circuit includes all conducting elements that carry load current. Included are:
· Winding leads
· Winding conductors
Mechanical System:
The mechanical system provides structural support of the current carrying and magnetic circuits.
· Clamping
· Lead support
Voltage Regulating System:
The voltage regulating system includes controls and tapchangers used to change the effective turns-ratio of the transformer. Included are:
· Load tapchangers
· Diverter switch
· Selector switch
· Contacts
· Drive mechanism
· No load tapchangers
· Voltage regulating controls
Containment System:
The containment system provides a physical boundary between the transformer and the outside world. It ensures that the oil does not get contaminated with air or water and that the oil does not leak out into the environment. Included in the system are:
· Tank
· Inert gas system
· Conservator
· Pressure relief device
External Interface System:
The transformer interfaces with other equipment through sets of bushings. These bushings themselves comprise:
· Bushing dielectric system
· Insulating paper
· Oil
· Bushing conductor
· Bushing containment system
· Porcelain
· Ground sleeve
· Seals
Cooling System:
The cooling system includes all peripherals and auxiliary equipment required to ensure the transformer can carry rated load at a temperature that does not lead to a premature loss of life. Components of the cooling system include:
· Pumps
· Fans
· Radiators
· Temperature gages
· Cooling controls
It is extremely important to understand how a transformer fails before making the decision to implement any type of maintenance strategy, whether it is traditional maintenance or the application of an on-line monitoring system. Failure to correctly understand the failure mode distribution can result in the application of a technology that has little impact on improving overall reliability or extending the life of the transformer.
It is also important to differentiate between minor and major failures. For purposes of this document, IEC 60694 definitions will be used:
Major Failure:
A major failure will result in an immediate change in the system operating conditions, e.g. the backup protective equipment will be required to remove the fault, or will result in mandatory removal from service within 30 minutes for unscheduled maintenance or will result in unavailability for required service.
Minor Failure:
Failure of an equipment item other than a major failure or any failure, even complete, of a constructional element or a sub-assembly which does not cause a major failure of the equipment.
Modes of Failure
Each of the previously described systems and critical functions has a dominant mode of failure. Each of these failure modes generally has only a few leading causes and each cause has its own set of pre-cursors. If a pre-cursor exists, there is a good likelihood that the failure can be detected in its incipient stage. The challenge for successful incipient detection is to employ a maintenance practice that is sensitive enough to detect the pre-cursor and that is applied at an interval shorter than the time it takes for the incipient failure to become an actual failure.
Aging and Failure Rates
In order to pick the appropriate strategy for preventing failures and extending the life of an asset, one must clearly understand how the asset and its functional systems age. Along with understanding the aging mechanism, it is important to identify the precursor conditions prior to failure.
Effective condition monitoring strategies require that an event or condition take place prior to failure so that intervention can take place to prevent the failure. To be successful, the condition monitoring strategy must be able to clearly differentiate between an acceptable operating condition and an approaching failure, yet provide adequate time for pre-emptive maintenance response. Several of the previously described age-reliability curves suggested that a condition monitoring approach may be appropriate. To better determine if condition monitoring is applicable, the failure pre-cursor conditions must be understood. Four general pre-cursor conditions are described below:
Some failures occur with little or no warning or require an external event to initiate. These types of failures, while random, must be prevented by design enhancements or the addition of “safety features”. For power transformers, insulation failures caused by lightning or switching surges are of this type and difficult to predict; installation of Surge Arresters is an appropriate preventive approach.
Figure 10 "No Warning" Failure Pre-cursor
Some failures have a recurring pattern of failure followed by a short recovery period. These types of failure patterns can be corrected if the root cause of the problem is properly identified. Lubrication problems associated with infrequently used mechanical devices are a typical example; operation of the mechanical device temporarily rejuvenates the lubrication.
Figure 11 "Temporary" Failure Pre-cursor
Slowly deteriorating items are good candidates for condition monitoring if a reliable condition indicator can be identified. An “end-of-life” condition trigger must be properly identified so that pre-emptive action can take place and so that an unexpected increase in the rate of deterioration does not result in a failure that could have been avoided. Many of the pre-cursor failure conditions identified by DGA are of this type.
Figure 12 Slowly Deteriorating Pre-cursor
Rapidly deteriorating items are also good candidates for condition monitoring if a reliable condition indicator can be identified. An “end-of-life” condition trigger is easier to identify since the probability of prematurely taking pre-emptive action is low. Some of the pre-cursor failure conditions identified by DGA are of this type.
Figure 13 Rapidly Deteriorating Pre-cursor
It is clear that three of the four pre-cursor patterns identified above are good candidates for application of a condition monitoring maintenance strategy. The strategy must:
· Significantly reduce the probability of failure
· Be technically effective
· Be economically effective
· Provide Maintenance and Operations with sufficient warning so they can intervene
· Result in higher reliability than other traditional strategies
· Be continuously managed since the time between the onset of a failure and an actual failure may be quite short.
· Result in an overall reduction in risk.
The previous sections focused on the ways a transformer can fail. In order to develop a predictive failure model, these theoretical failure modes must be “calibrated” with actual failure and trouble experiences.
Industry Experience
The utility industry has been relatively active in sharing information about failures, but very inactive when it comes to sharing failure statistics. In other words, the industry knows about specific catastrophic failure events but knows very little about major failures that do not cause widespread outages, minor failures, failure rates and retirements, all of which is information necessary to develop accurate statistical transformer life models. One of the leading utility insurers, Hartford Steam Boiler Inspection and Insurance Company, has published some of its observations based on data collected by the International Association of Engineering Insurers (IMIA) for the years 1997 through 2001 involving 94 reported failures. These observations are summarized in Table 3 below and give us only limited insight into the failure process.
Cause | Impacted System | Percent of Reported Failures
Insulation failure | Dielectric | 26%
Manufacturing failure | Unknown | 24%
Unknown | Unknown | 16%
Loose connections | Current Carrying | 7%
Improper maintenance | Unknown | 5%
Overloading | Dielectric | 5%
Oil contamination | Dielectric | 4%
Line surges | Dielectric/Mechanical | 4%
Fire/explosions | Dielectric/Containment | 3%
Lightning | Dielectric | 3%
Floods | Containment | 2%
Moisture | Containment | 1%
Total | | 100%
Table 3 Cause of Transformer Failures as Reported to IMIA (Source: HSB and Cigré)
A summary of failures aggregated by the affected transformer system is shown in Figure 14. Only generalized conclusions can be drawn from the data published by HSB. These include:
· The number of reported failures appears to be much smaller than the actual number of failures experienced by the industry, making statistically valid conclusions impossible.
· Only relative relations between failure modes can be made.
· Unknown and insulation-related failures dominate the modes of failure.
Figure 14 Cause of Transformer Failure by Impacted System
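The aggregation by impacted system can be reproduced from the Table 3 percentages. The sketch below is only an illustration: the function name is hypothetical, and it keeps the combined labels (“Dielectric/Mechanical”, “Dielectric/Containment”) as their own categories, which is an assumption, since the source does not say how Figure 14 allocates them.

```python
from collections import defaultdict

# (cause, impacted system, percent of reported failures) as listed in Table 3.
TABLE_3 = [
    ("Insulation failure", "Dielectric", 26),
    ("Manufacturing failure", "Unknown", 24),
    ("Unknown", "Unknown", 16),
    ("Loose connections", "Current Carrying", 7),
    ("Improper maintenance", "Unknown", 5),
    ("Overloading", "Dielectric", 5),
    ("Oil contamination", "Dielectric", 4),
    ("Line surges", "Dielectric/Mechanical", 4),
    ("Fire/explosions", "Dielectric/Containment", 3),
    ("Lightning", "Dielectric", 3),
    ("Floods", "Containment", 2),
    ("Moisture", "Containment", 1),
]

def percent_by_system(rows):
    """Sum the reported-failure percentages for each impacted-system label.

    Assumption: combined labels such as "Dielectric/Mechanical" are kept
    as separate categories rather than split between systems.
    """
    totals = defaultdict(int)
    for _cause, system, percent in rows:
        totals[system] += percent
    return dict(totals)

if __name__ == "__main__":
    by_system = percent_by_system(TABLE_3)
    # Print systems from the largest share to the smallest.
    for system, percent in sorted(by_system.items(), key=lambda kv: -kv[1]):
        print(f"{system:25s} {percent:3d}%")
```

Grouped this way, the “Unknown” and “Dielectric” labels together account for most of the reported failures, consistent with the last conclusion in the list above.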
Figure 15 “A Bank” Failure Events
While root cause analysis did not take place for many of the failures, the system initiating the failure could be determined by assuming that the cause of the defect was associated with the system containing the defect. The distribution of failures by impacted systems is summarized in Figure 16 below.
When comparing the failure distribution experienced
by Edison to that reported by HSB, it is interesting to note that insulation
failures are the dominant mode of failure for both yet LTCs are also a
significant mode of failure for
Figure 16 “A Bank” Failure Distribution by Impacted System
Many times troubles are a sign of pending
failure. Current troubles being
experienced by
Table 5 Current “A Bank” Problems
Figure 17 “A Bank” Trouble Distribution
It is interesting to note that
while dielectric problems are the dominant cause of failures, they are not the
leading cause of troubles experienced by
Summary of Experiences
Both
The linkage between troubles and eventual failures appears to be clear for bushings but the linkage between insulation failures and reported trouble experiences is much less obvious even though one could argue that inert gas problems and oil leaks result in a degradation of the dielectric system.
The utility industry does not have a good grasp on the
expected life of a power transformer. Insurance companies predict an expected
life of 35 years, regulators and accountants use a book life of 40 years yet
· Linear or constant failure rate
· Hartford Steam Boiler (HSB) Transformer failure model
· Weibull equation based on
Linear Model
The linear or constant rate failure model is the simplest and most widely used predictor of transformer life. Simply stated,
Figure 28 Linear Failure Model (Estimated and Actual)
HSB Model
The Hartford Steam Boiler Inspection and Insurance Company has published several transformer failure models based on the 1825 work of a statistician named Benjamin Gompertz. The latest variation of the Gompertz model developed by W. M. Makeham is used in this analysis. The HSB model is:
$$f(t) = A + \alpha e^{\beta t}$$
Where:
A = 0.5, a constant for random failures caused by lightning, vandalism, switching, etc.
α = 0.00007346, scaling factor
β = 0.176190651, time constant
The expected annual failure rate over time predicted by the HSB model is shown in Figure 29 below.
Figure 29 HSB Failure Model (Estimated and Actual)
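As a rough illustration, the HSB expression can be evaluated directly with the constants listed above. The sketch assumes that f(t) is an annual failure rate expressed in percent and that t is the transformer age in years; the source gives the constants but not the units, so both are assumptions, as is the function name.

```python
import math

# Constants as listed for the HSB model above.
A = 0.5              # random-failure term (lightning, vandalism, switching, etc.)
ALPHA = 0.00007346   # scaling factor
BETA = 0.176190651   # time constant

def hsb_failure_rate(age_years: float) -> float:
    """Evaluate f(t) = A + alpha * exp(beta * t).

    Assumption: the result is an annual failure rate in percent and
    age_years is the transformer age in years.
    """
    return A + ALPHA * math.exp(BETA * age_years)

if __name__ == "__main__":
    # 39 and 76 years are the average and maximum "A Bank" ages quoted earlier.
    for age in (0, 20, 39, 60, 76):
        print(f"age {age:2d} years: f(t) = {hsb_failure_rate(age):.3f}")
```

Because the exponential term grows quickly with age, the predicted rate for the oldest units is far higher than for units near the average age.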
Weibull Model
The Weibull distribution is one of those generic models that has worked well to accurately describe the age-reliability relation of many failure modes. The primary advantage of Weibull analysis is the capability to provide accurate failure analysis and risk predictions with extremely small samples. Solutions are possible at the earliest stage of a problem without requirements to “fail a few more”. For purposes of determining the optimum maintenance interval, the two-parameter density function is used. The Weibull function is:
Where: t = failure time
β = wear characteristics
β < 1.0 indicates infant mortality
β = 1.0 indicates random failures
β > 1.0 indicates wear out failures
η = Characteristic life (time when 63.2% have failed)
The expected annual failure rate over time predicted by the Weibull model is shown in Figure 30 below.
Figure 30 Weibull Failure Model (Estimated and Actual)
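For reference, the standard two-parameter Weibull density that matches the symbol definitions above is f(t) = (β/η)(t/η)^(β−1) exp(−(t/η)^β), with cumulative failure probability F(t) = 1 − exp(−(t/η)^β) and hazard (instantaneous failure) rate h(t) = (β/η)(t/η)^(β−1). The sketch below assumes that standard form; the β and η values in the usage lines are placeholders chosen only to show the three β regimes and are not the fitted values used in this study.

```python
import math

def weibull_pdf(t: float, beta: float, eta: float) -> float:
    """Two-parameter Weibull density f(t), for t > 0."""
    return (beta / eta) * (t / eta) ** (beta - 1) * math.exp(-((t / eta) ** beta))

def weibull_cdf(t: float, beta: float, eta: float) -> float:
    """Cumulative probability of failure by age t."""
    return 1.0 - math.exp(-((t / eta) ** beta))

def weibull_hazard(t: float, beta: float, eta: float) -> float:
    """Failure rate at age t for units that have survived to t (t > 0)."""
    return (beta / eta) * (t / eta) ** (beta - 1)

if __name__ == "__main__":
    ETA = 100.0  # placeholder characteristic life in years (hypothetical)
    for beta in (0.5, 1.0, 3.0):  # infant mortality, random, wear out
        rates = [weibull_hazard(t, beta, ETA) for t in (10.0, 40.0, 70.0)]
        print(f"beta = {beta}: hazard at 10/40/70 years =",
              ", ".join(f"{r:.5f}" for r in rates))
    # At t = eta the cumulative failure probability is 1 - 1/e (about 63.2%).
    print(f"F(eta) = {weibull_cdf(ETA, 3.0, ETA):.3f}")
```

The hazard falls with age for β < 1, is constant for β = 1, and rises for β > 1, which is the behavior described by the three β cases above.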
Prediction Comparison
Each of the above models appears to reasonably estimate the expected failure rate of a relatively young transformer population. Application of the models to the current
Figure 31 Predicted Failure Rates for the
Current
Failure Pre-cursor Model
In order for any condition monitoring technique to be effective, there must be a monitored attribute that changes state prior to failure. This monitored attribute must have at least three distinctive states:
· Acceptable state
· Incipient failure state
· Failed state
The change of states must be recognized with ample response time to allow intervention and failure prevention. For transformer dielectric failures, the failure precursor or incipient state is the appearance of various dissolved gases in the oil. While the industry has significant successful experience detecting and mitigating slowly evolving dielectric failures using traditional DGA sampling techniques, it has less favorable experience with quickly developing failures. It is this latter set of failures that multi-gas on-line monitors will detect at an early stage, preventing the occurrence of a major failure.
To account for this varying failure pre-cursor, a simple model was employed that assumed a wide range of times between the onset of a failure and the eventual catastrophic loss of the transformer. The model assumed a normal distribution or “bell curve” of failure pre-cursor gas evolution. Normal distributions are a family of symmetric, bell-shaped distributions with more events concentrated in the middle than in the tails; the area under each curve is the same. A normal distribution can be specified mathematically in terms of two parameters: the mean (μ) and the standard deviation (σ).
In this analysis, parameters used to develop the normal distribution were:
· Mean incipient failure time in months (mean time from abnormal gas generation to failure)
· Standard deviation of incipient failure times.
A graphical presentation of the Failure Pre-cursor model is shown below in Figure 32. The model assumes:
· Mean time between the evolution of abnormal combustible gas and a major failure is 14 months (incipient fault time).
· The standard deviation for an incipient fault is 6 months
· Periodic DGA sampling is on a 12 month interval
· The probability of detecting a major failure with periodic DGA is 63%
Figure 32 Failure Pre-cursor Model
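The 63% detection figure can be reproduced from the other three assumptions. The sketch below takes the probability of detection to be the probability that the incipient fault time exceeds the sampling interval, that is, the worst case in which the abnormal gassing starts just after a sample is taken. That interpretation and the function names are assumptions, not something the source states explicitly.

```python
import math

def normal_cdf(x: float, mean: float, std: float) -> float:
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))

def detection_probability(sample_interval_months: float,
                          mean_months: float = 14.0,
                          std_months: float = 6.0) -> float:
    """P(incipient fault time > sampling interval).

    Assumption: a fault is detected only if the time from abnormal gas
    generation to failure is longer than the interval between samples.
    Defaults are the mean and standard deviation listed above.
    """
    return 1.0 - normal_cdf(sample_interval_months, mean_months, std_months)

if __name__ == "__main__":
    for interval in (12.0, 6.0, 3.0, 1.0):
        p = detection_probability(interval)
        print(f"sampling every {interval:4.1f} months: P(detect) = {p:.2f}")
```

With a 12 month interval the result is about 0.63, matching the figure in the list above; shorter intervals raise the probability, which is the argument for continuous monitoring. Note that a normal distribution assigns a small probability to negative fault times, so the values for very short intervals are approximate.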
The financial model looks only at the multi-gas sensor application for a typical transformer and will include:
· Installed cost
· Annual probability of failure
· Probability of catastrophic failure
· Collateral damage
· Probability of collateral damage
· Various PM Responses
· No Maintenance
· Periodic DGA
· On-line monitoring
· Replacement cost
· Indirect failure effects
· Environmental
· Political
· Contractual
· Regulatory
· NPV
· IRR
· Etc.
The Analysis
An economic analysis on the impacts of dielectric maintenance was performed on an “average A Bank” transformer. The analysis looked at the expenditures and benefits expected over a 20 year period. The analysis examined three maintenance scenarios:
· No maintenance of the dielectric system
· Periodic DGA
· On-line monitoring
Important parameters and assumptions employed by the model were:
| Parameter |
● | System Analyzed |
● | Average Age |
● | Replacement Cost (on the pad) |
● | Cost of Non-Catastrophic repair |
● | Percent of Major Failures Catastrophic |
● | Probability of Collateral Damage |
● | Failure Model |
● | DGA Effectiveness |
● | On-line Monitoring Effectiveness |
● | Insurance coverage |
● | Annual DGA Cost-Loaded (1 sample per year) |
● | Cost of On-line Monitor |
● | On-Monitor O&M Cost |
● | Weighted Average Cost of Capital (WACC) |
Table 6 Key Financial and Technical Parameters
Benefits Used in the Model
The model included direct benefits associated with reduced O&M costs as well as some indirect costs associated with public relations. Benefits available in the model are listed in Table 7.
| Benefit | Included |
● | Reduced Failure Rate | Yes |
● | Reduced Maintenance Costs | Yes |
● | Cost of Non-Catastrophic repair | Yes |
● | Probability of Collateral Damage | Yes |
● | Outage time | Yes |
● | Environmental Cleanup | Yes |
● | Customer Claims | Yes |
● | Public Relations | Yes |
● | CPUC | No |
● | Replacement Power | No |
Table 7 List of Benefits Available and Included in the Financial Model
The Failure Model
The Weibull failure model was chosen over the other two models for three key reasons:
· The model was derived from 15 years of actual
· The HSB Model predicts a failure rate much higher than
· The insulation failure mechanism is partially a function of the strength of the insulating paper. The strength of the paper decreases with age resulting in an expected higher probability of failure. A constant failure rate does not accurately model this known aging process.
The resulting Weibull aging model derived from 15 years of failure history is shown in Figure 33. The X-axis is the Cumulative Probability of Failure and the Y-axis is age. Since the available data included only 15 years of failures, the model predicted that 63% of the transformer insulation systems would have failed by 165 years of age (the characteristic life). This is not a flaw in the model but reflects the fact that 61 years of failure history were omitted from the analysis, since no failure data were available for years 1928 through 1988. In order to correct for this missing data, the model was “calibrated” to achieve a 2005 failure rate that equals the 15 year average. After this “calibration”, the model predicts that 63% of the transformer insulation systems would have failed by 101 years of age.
Figure 33 Weibull Model of
The Results
The economic analysis revealed positive results for on-line monitoring as compared to no maintenance. The on-line monitoring program has excellent financial results:
· Initial Investment = $34,000
· Annual O&M Costs = $3,000
· IRR = 42.03%
· NPV = $110,738
· Payback = 5 Years
Figure 34 Cumulative PV Cash Flow of On-line Monitoring vs. No Maintenance
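The three financial measures quoted above (NPV, IRR and payback) can be computed from a yearly cash-flow series as sketched below. Only the $34,000 initial investment, the $3,000 annual O&M cost and the 20 year horizon come from this paper. The annual benefit and the discount rate are hypothetical placeholders, because the avoided-failure benefits and the WACC used in the study are not reproduced here, so the output is not expected to match the reported figures.

```python
def npv(rate, cash_flows):
    """Net present value; cash_flows[0] occurs at year 0 (undiscounted)."""
    return sum(cf / (1.0 + rate) ** year for year, cf in enumerate(cash_flows))

def irr(cash_flows, low=-0.99, high=10.0, tol=1e-9):
    """Internal rate of return by bisection.

    Assumes NPV changes sign exactly once between low and high, which
    holds for an initial outlay followed by positive net benefits.
    """
    f_low = npv(low, cash_flows)
    if f_low * npv(high, cash_flows) > 0:
        raise ValueError("NPV does not change sign on the search interval")
    while high - low > tol:
        mid = (low + high) / 2.0
        f_mid = npv(mid, cash_flows)
        if f_low * f_mid <= 0:
            high = mid
        else:
            low, f_low = mid, f_mid
    return (low + high) / 2.0

def payback_year(cash_flows):
    """First year in which cumulative undiscounted cash flow is non-negative."""
    cumulative = 0.0
    for year, cf in enumerate(cash_flows):
        cumulative += cf
        if cumulative >= 0:
            return year
    return None

if __name__ == "__main__":
    INITIAL_INVESTMENT = 34_000   # from the results above
    ANNUAL_OM = 3_000             # from the results above
    ANNUAL_BENEFIT = 15_000       # hypothetical avoided-failure benefit
    DISCOUNT_RATE = 0.08          # hypothetical WACC
    flows = [-INITIAL_INVESTMENT] + [ANNUAL_BENEFIT - ANNUAL_OM] * 20
    print(f"NPV     = ${npv(DISCOUNT_RATE, flows):,.0f}")
    print(f"IRR     = {irr(flows):.2%}")
    print(f"Payback = year {payback_year(flows)}")
```

A flat annual benefit is used here for simplicity; in the study the benefit would vary from year to year with the age-dependent failure probability.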
While the business case for installing an on-line monitoring system is excellent compared to performing no maintenance, it must also be compared to the current periodic DGA program. It is no surprise that the current periodic DGA has even better financial performance since there are no initial capital costs and the technical aspects of DGA are well proven. The economic results associated with periodic DGA testing were:
· Initial Investment = $5,000 (need to start a new program)
· Annual O&M Costs = $250
· IRR = N/A
· NPV = $104,289
· Payback = Immediate
Annual cash flow comparisons of all three options are shown in Figure 35 below.
Periodic DGA can be summarized as
a low-cost investment with a fantastic rate of return. The benefit it has historically provided
[Figure 35 series labels: Annual Expenditures associated with No PM Maintenance program; Annual Expenditures associated with On-line Monitoring; Annual Expenditures associated with DGA; Annual Reductions in Expenditures associated with On-line Monitoring; Annual Reductions in Expenditures associated with DGA]
Figure 35 Cumulative PV Cash Flow of On-line Monitoring vs. DGA vs. No Maintenance
Deferred Transformer Replacement
One benefit excluded from the above financial model that has the potential of overshadowing all the other benefits is life extension. While most on-line monitoring will not renew an item (reducing its effective operational age), there is sometimes a desire to retire transformers because they are “old” and have served their expected useful life. The main reason for this early retirement is risk reduction and an assumed improvement in overall reliability. While there is merit to this approach, there is also the possibility that a portion of the transformer's available life goes unused. The implementation of on-line monitoring would address the risk and reliability issues and potentially result in additional operating years. These extra operating years result in a substantial amount of deferred capital expense, which can amount to several hundred thousand dollars per year (see Figure 36).
Figure 36 Financial Impacts of Extending the Useful Life of a Transformer with On-line Monitoring
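The order of magnitude of the deferred capital benefit can be checked with a simple present-value calculation. The sketch below assumes the replacement cost stays constant in real terms and uses a hypothetical 8% discount rate; the $3.5M cost is simply the midpoint of the $3M to $4M replacement cost range quoted earlier. The function name is illustrative only.

```python
def deferral_savings(replacement_cost: float,
                     discount_rate: float,
                     years_deferred: int) -> float:
    """Present-value saving from postponing a capital outlay.

    Saving = C - C / (1 + r)^n, assuming the replacement cost C does not
    escalate over the deferral period (an assumption of this sketch).
    """
    return replacement_cost * (1.0 - (1.0 + discount_rate) ** -years_deferred)

if __name__ == "__main__":
    COST = 3_500_000   # midpoint of the $3M to $4M "on the pad" range
    RATE = 0.08        # hypothetical discount rate (WACC not given here)
    for years in (1, 5, 10):
        saving = deferral_savings(COST, RATE, years)
        print(f"deferring {years:2d} year(s): PV saving = ${saving:,.0f}")
```

Under these assumptions a single year of deferral is worth a few hundred thousand dollars, which is consistent in scale with the statement above.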
It is of low benefit to make an investment in a transformer on-line monitoring system if its overall impact on transformer reliability is minimal. The investment must be framed in the context of all dominant modes of failure and the impact on long term risk. The risk model for the four dominant transformer failure modes will be analyzed along with the risk impacts of various PM responses.
For purposes of this document, Risk is defined as:
Risk = Probability of Failure × Severity of Failure
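The definition above can be applied per failure mode and summed to give a fleet-level figure, as in the sketch below. All probabilities and severities in the usage example are hypothetical placeholders chosen only to show the arithmetic; they are not values from this study.

```python
def risk(probability_of_failure: float, severity_of_failure: float) -> float:
    """Risk = probability of failure x severity of failure."""
    return probability_of_failure * severity_of_failure

def total_risk(modes: dict) -> float:
    """Sum the risk over failure modes given as {name: (probability, severity)}."""
    return sum(risk(p, s) for p, s in modes.values())

def risk_reduction(baseline: float, improved: float) -> float:
    """Fractional reduction in risk relative to the baseline."""
    return (baseline - improved) / baseline

if __name__ == "__main__":
    # Hypothetical annual failure probabilities and severities in dollars.
    current_pm = {
        "dielectric": (0.010, 3_500_000),
        "bushing": (0.004, 1_000_000),
    }
    # Hypothetical effect of monitoring: lower dielectric failure probability.
    with_monitoring = {
        "dielectric": (0.005, 3_500_000),
        "bushing": (0.004, 1_000_000),
    }
    base = total_risk(current_pm)
    improved = total_risk(with_monitoring)
    print(f"risk with current PM: ${base:,.0f} per year")
    print(f"risk with monitoring: ${improved:,.0f} per year")
    print(f"reduction:            {risk_reduction(base, improved):.0%}")
```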
The No Maintenance Risk
Figure 37 shows the expected modes of failure distribution for the current fleet of
Figure 37 Transformer Failure Risk Distribution if No Maintenance is Performed
How Maintenance Affects Risk
Obviously the implementation of a good PM program will reduce the risk of “A Bank” transformer failures. The question of how much risk reduction is achieved if various PM programs are implemented must be answered.
Using the same models described in sections 12 and
13, an analysis of the risk
· No Maintenance
· Existing PM program
· Existing PM with the inclusion of Multi-gas on-line monitoring
The results of this analysis showed that a significant decrease in risk (see Figure 38) is currently achieved with the present PM program and that a meaningful additional reduction in risk can be achieved by implementing a multi-gas on-line monitoring system.
Figure 38 Transformer Failure Risk Associated with Various PM Approaches
On-line monitoring places significant reliance on data and communication processes in order to reap the benefits of “just-in-time” maintenance decisions. To realize the benefits offered by any on-line monitoring system, one must:
· Have an integrated approach to the storage and analysis of maintenance data
· Correlate maintenance data with transformer operating data
· Have real-time operating data readily available for analysis and decision purposes
· Develop a 24 X 7 “on-call” process to respond to on-line data warnings
· Ensure communication links are highly available
· Be able to quickly take action when on-line data indicates a failure is imminent
“A Bank” transformers represent an important investment at
With an aging fleet of transformers,
· Improved “A-Bank” transformer reliability
· Reduced failure impacts
· Realization of full transformer useful life
· Identification of units in urgent need of repair/replacement.
· Substantial reduction in overall transformer operating risks
The models utilized in this analysis are at times overly
conservative but demonstrate the technical and economic value multi-gas on-line
monitoring would provide to