Maintenance Interval Optimization Program

 

John E. Skog

Doble Consultant

Maintenance and Test Engineering Co.

January 16, 1999

 

 

Introduction:

The goal of Reliability Centered Maintenance (RCM) is to preserve the critical functions of equipment and systems. In pursuit of this goal, RCM does an excellent job of identifying the most appropriate and effective maintenance task necessary to achieve maximum equipment and system reliability.  It is understood that the resultant maintenance tasks are both technically and economically effective.  What is not included in RCM is the exact determination of appropriate task frequency.  Task frequency determination is as much of an economic decision as a technical decision based upon the probability and consequence of failure.  The techniques presented in this paper will provide a basis for determining maintenance task frequencies. These techniques have provided good service in many cases but are open for modification and improvement.

 

The Keys to Interval Determination:

Fundamentally, the process of determining an optimal maintenance interval is based on one's knowledge and understanding of how the following five key items interact:

 

The cost of preventive maintenance, PM

The cost of repair or corrective maintenance, CM

The direct effects of a failure

The indirect effects of a failure

The aging mechanism of the item being maintained

 

These five keys will be discussed in greater depth.

 

The Effects of Failure:

As part of an RCM analysis, the effects of a failure are well documented.  These effects are generally categorized in terms of:

 

• Local Effects

• System Effects

• Remote Effects

 

Knowing the physical and operational effects of a failure makes quantifying the failure effects in economic terms relatively straightforward. To do so, one needs to assign costs to:

 

• Repair/replacement labor

• Repair/replacement materials

• Residential customer impacts

• Commercial customer impacts

• Industrial customer impacts

• Environmental impacts

• Revenue impacts

• Political impacts

• Litigation impacts

 

These failure effects will be discussed in more detail.

Repair Impacts:

The repair cost of a failure is generally easy to estimate.  Utilities know the cost of replacement parts and equipment as well as the book value of the failed item.  Material, replacement labor and overhead costs are either known from experience or accurately estimated.

 

Customer Impacts:

Customer impacts are significantly more difficult to estimate and are generally not openly acknowledged by the utility.  Customer outage impacts are important to understand, especially during this time of deregulation, since they provide new insight into the value of maintenance.  One way to determine these impacts is to understand the value of electric service and quantify what the loss of service means in terms of inconvenience and lost revenue.

 

An estimate of the total value of electric service associated with transmission and distribution outages contains four major elements:

 

• Average financial impact of an outage segregated by customer class

• An outage cost modifier based upon duration

• An outage cost modifier based upon season

• An area modifier reflecting different customer mixes

 

Outage Cost by Customer Class:

Outage costs are typically segregated by customer class. For residential customers, this impact is generally a function of the number of outage events and their duration. For commercial and industrial customers, the impact is also a function of kW or kWh. In order to incorporate standard outage estimates into a cost model, the following is assumed:

• Commercial and industrial outage costs per kW can be converted to average hourly loads for those groups based on the current daily, monthly or yearly consumption levels.

• Unless otherwise determined, the mix of customers on a typical feeder or substation is assumed to be the same as the overall customer mix.  A variation of this assumption is that the load from customers served directly from the transmission system is excluded and treated separately.

 

Cost estimates for outage impacts may be based upon a 1991 EPRI study.[1] This study indicated the amount customers were willing to pay to prevent an outage. The results of the EPRI study are summarized in Table 1.

 

Willingness To Pay     0.25 Hour a.m. Outage     1 Hour a.m. Outage     8 Hour Outage
Residential Class      $3.25*                    $3.75*                 $7.25*
Commercial Class       $0.00565116**             $0.00692053**          $0.00524739**
Industrial Class       $0.01551107**             $0.01847655**          $0.02183692**

Table 1.

Customer Willingness to Prevent Outages

*Outage cost per customer (residential only)

**$/Annual kWh (commercial and industrial)


Outage Duration:

Since the duration of an outage is variable, a linear approximation of the above is necessary. The above outage impacts may be restated in the form:

 

Outage cost = b + a * outage duration (hours)

 

Class           Accuracy Range (hours)     Fixed cost per event, b     Cost per hour duration, a
Residential     0.5-8                      $2.93*                      $0.50*
Commercial      0.25-8                     $50.32**                    $7.37**
Industrial      0.25-8                     $144.58**                   $5.97**

Table 2.

Peak Period Outage Cost by Class and Duration

*Outage cost per customer (residential only)

**$/average hourly kWh consumption (commercial and industrial)

 

The average outage costs per industrial and commercial customer derived from Table 2 reflect outages during the workday.  Because outages occur randomly throughout the week, adjustments may be necessary.  In order to adjust commercial and industrial outage costs, the following should be considered:

• The number of hours during the week the business is open

• Outage impacts while a business is in operation are five times those when the business is closed.

• Peak period loads for commercial and industrial customers are typically significantly higher than average hourly loads.

The net result of these considerations is a 50% reduction in the numbers found in Table 2.
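
The linear cost form and the 50% adjustment can be sketched in Python as follows. This is a minimal illustration, not the model's actual implementation: the function and dictionary names are hypothetical, and applying the 50% reduction only to the commercial and industrial classes is an interpretation of the text above.

```python
# Table 2 coefficients: class -> (fixed cost per event b, cost per hour a).
# Residential values are per customer; commercial and industrial values are
# per average hourly kWh of consumption.
TABLE_2 = {
    "residential": (2.93, 0.50),
    "commercial": (50.32, 7.37),
    "industrial": (144.58, 5.97),
}

# Assumption: the 50% reduction for randomly timed outages applies to the
# commercial and industrial classes only.
RANDOM_TIMING_FACTOR = {
    "residential": 1.0,
    "commercial": 0.5,
    "industrial": 0.5,
}


def outage_cost(customer_class, duration_hours, peak_period=False):
    """Linear outage cost b + a * duration for one customer class."""
    b, a = TABLE_2[customer_class]
    cost = b + a * duration_hours
    if not peak_period:
        # Outage occurs at a random time of the week, not during the workday.
        cost *= RANDOM_TIMING_FACTOR[customer_class]
    return cost


if __name__ == "__main__":
    for name in TABLE_2:
        print(name, round(outage_cost(name, 2.0), 2))
```

The linear form should only be trusted inside the accuracy range listed for each class in Table 2.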

Seasonal Modifiers:

Seasonal modifiers for the residential customer class appear to be in order.  In those areas with extreme climates, outage impacts to both the customer and the utility should reflect the effects of:

• Fuel adjustment costs

• Discomfort associated with loss of space heating or cooling

 

Seasonal adjustment factors in the range of 0.5 to 2.0 may be justified.

 

Outage Cost Equation:

The postulated equation for outage costs is:

 

\[ \text{Cost} = EN_i \cdot EP_i \cdot ED_i \cdot \sum_{c} \left( W_c \cdot OCF(c,t) \cdot S_c \cdot L_c \right) \]

Cost = outage cost in dollars

\(EN_i\) = expected number of customers for event i

\(EP_i\) = expected probability of event i occurring

\(ED_i\) = expected duration of event i

\(W_c\) = weight for class c

\(OCF(c,t)\) = outage cost as a function of class c and duration t

\(S_c\) = seasonal adjustment factor for class c

\(L_c\) = load for class c

 

If no seasonal adjustment factor is used, the equation can be reduced to:

 

\[ \text{Cost} = EN_i \cdot EP_i \cdot (ED_i \cdot a + b) \]
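
One way to arrive at single values of a and b for a mixed population is to weight the per-class coefficients by customer mix and load. The sketch below assumes that weighting; the paper does not spell out the procedure, the function names are hypothetical, and the loads in the usage example are made-up placeholders rather than study data.

```python
def blended_coefficients(mix, coeffs, loads, seasonal=None):
    """Class-weighted (a, b) per average customer.

    mix      -- class -> fraction of customers (weights)
    coeffs   -- class -> (fixed cost b, cost per hour a)
    loads    -- class -> load multiplier (use 1.0 for per-customer classes)
    seasonal -- optional class -> seasonal factor (defaults to 1.0)
    """
    a_total = 0.0
    b_total = 0.0
    for name, weight in mix.items():
        b, a = coeffs[name]
        s = seasonal.get(name, 1.0) if seasonal else 1.0
        scale = weight * s * loads[name]
        a_total += scale * a
        b_total += scale * b
    return a_total, b_total


def expected_outage_cost(en, ep, ed_hours, a, b):
    """Reduced equation: Cost = EN * EP * (ED * a + b)."""
    return en * ep * (ed_hours * a + b)


if __name__ == "__main__":
    mix = {"residential": 0.89, "commercial": 0.105, "industrial": 0.005}
    # Commercial and industrial coefficients already halved for random timing.
    coeffs = {
        "residential": (2.93, 0.50),
        "commercial": (25.16, 3.685),
        "industrial": (72.29, 2.985),
    }
    loads = {"residential": 1.0, "commercial": 10.0, "industrial": 50.0}  # hypothetical
    a, b = blended_coefficients(mix, coeffs, loads)
    print(expected_outage_cost(en=1000, ep=0.05, ed_hours=2.0, a=a, b=b))
```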

 

 

Example Customer Outage Costs:

The average customer outage cost (excluding transmission customers) for a utility having an 89% residential, 10.5% commercial and 0.5% industrial mix is:

Customer Outage Cost = Number of Customers * [(Outage Duration in Hours * $4.25) + $35.00]
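
As a quick check of the arithmetic, the example formula can be evaluated directly. The function name and the customer count and duration used below are illustrative choices, not values from the paper.

```python
def customer_outage_cost(n_customers, duration_hours,
                         cost_per_hour=4.25, fixed_cost=35.00):
    """Customer outage cost for the example 89/10.5/0.5 customer mix."""
    return n_customers * (duration_hours * cost_per_hour + fixed_cost)


if __name__ == "__main__":
    # 1,000 customers out for 2 hours: 1000 * (2 * 4.25 + 35.00) = 43,500
    print(customer_outage_cost(1000, 2.0))
```

Note that the fixed cost per event dominates for short outages: at two hours, the duration term contributes $8.50 of the $43.50 per customer.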

 

 

 

Environmental Impacts:

The failure of some electrical equipment may have significant environmental and clean-up consequences.  If the insulating medium contains PCBs, the cost of cleanup and disposal may far exceed any other cost element.

Revenue Impacts:

Costs associated with outages include the cost of restoration and the net revenue impact of lost kWh sales.  Lost sales revenue is in the range of $0.02 to $0.05 per kWh.  For the customer mix noted above, this lost revenue is in the range of $0.05 to $0.12 per customer outage hour.

Political Impacts:

Long-duration outages or those affecting sensitive loads sometimes receive adverse attention from the media or government regulators.  With performance-based ratemaking not far off, reliability will be tied directly to a utility’s rate of return.  The negative fallout of this attention may be economically quantified in terms of political impacts.

 

Litigation Impacts:

Energy is vital to all businesses and sectors of society.  Loss of electrical power, even for a short duration, has impacts on these groups.  Loss of electrical power due to negligence or imprudent maintenance practices may result in litigation. The cost of litigation may overshadow any of the other costs described above.

Aging Model:

Early RCM work in the aerospace industry revealed that failure processes were not as simple as traditionally thought.  The idea that most equipment follows the traditional bathtub age-reliability curve was dispelled.

 

 

 


Figure 1.

Typical Age-Reliability Curves

 

 

The key to determining the optimum maintenance interval for a system or device is first to understand which failure modes can benefit from periodic maintenance, and second to model the aging process accurately.

 

Various models have been developed to predict the onset of various failure modes.  Some models have rendered good generic service while others have been useful for only specific failure modes.

 

The Weibull distribution is one of those generic models that has worked well to describe the age-reliability relation of many failure modes.  The primary advantage of Weibull analysis is its ability to provide useful failure analysis and risk predictions with extremely small samples.  Solutions are possible at the earliest stage of a problem, without the need to “fail a few more”.  For purposes of determining the optimum maintenance interval, the three-parameter cumulative distribution function is used. The function is:

 

\[ F(t) = 1 - e^{-\left(\frac{t - t_0}{\eta}\right)^{\beta}} \]

Where:

\(t\) = failure time

\(t_0\) = guarantee period of no failures

\(\beta\) = wear characteristics

        \(\beta < 1.0\) indicates infant mortality

        \(\beta = 1.0\) indicates random failures

        \(\beta > 1.0\) indicates wear out failures

\(\eta\) = characteristic life (time when 63.2% have failed)
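
The failure function and its three wear regimes can be checked numerically with a short Python sketch. The parameter names eta (characteristic life), beta (wear parameter) and t0 (no-failure period) are naming choices made here, and the hazard-rate helper is the standard Weibull failure rate rather than something stated in the paper.

```python
import math


def weibull_cdf(t, eta, beta, t0=0.0):
    """Probability that a unit has failed by age t."""
    if t <= t0:
        return 0.0  # inside the guaranteed no-failure period
    return 1.0 - math.exp(-(((t - t0) / eta) ** beta))


def hazard_rate(t, eta, beta, t0=0.0):
    """Instantaneous failure rate (beta / eta) * ((t - t0) / eta)**(beta - 1)."""
    if t <= t0:
        return 0.0
    return (beta / eta) * ((t - t0) / eta) ** (beta - 1.0)


if __name__ == "__main__":
    # One characteristic life after t0, 1 - exp(-1) of the units have failed.
    print(weibull_cdf(12.0, eta=10.0, beta=3.0, t0=2.0))
    for beta in (0.5, 1.0, 3.0):
        rates = [hazard_rate(t, 10.0, beta) for t in (1.0, 5.0, 10.0)]
        print(beta, rates)
```

With beta below 1 the printed rates fall with age, with beta equal to 1 they stay constant, and with beta above 1 they rise, which is the distinction between infant mortality, random and wear-out failures.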

 

Maintenance Interval Determination:

Now that the total cost of a failure has been quantified and a model for the probability of failure determined, task frequency determination can commence.  The goal of a task frequency determination process is minimizing total ownership costs.  From our models, we know that as aging takes place, the probability of failure also increases.  We also know that maintenance may reduce the probability of failure by returning the equipment to an “AS NEW” condition.  The challenge is determining the maintenance interval at which total cost is minimized.

 

Total costs need to be reviewed from two separate directions: the customer’s and the utility’s. From the customer’s point of view, all costs need to be considered. From the utility’s point of view, customer impacts are ignored or discounted.

 

\[ \text{Total Cost}_i = \text{Cost}_m + EN_i \cdot EP_i \cdot (ED_i \cdot a + b) + EP_i \cdot ER_i \]

\(\text{Total Cost}_i\) = Total cost in dollars

\(\text{Cost}_m\) = Cost of maintenance

\(EN_i\) = Expected number of customers for event i

\(EP_i\) = Expected probability of event i occurring

\(ED_i\) = Expected duration of event i

\(ER_i\) = Expected repair cost of event i

\(a\) = Cost per minute of outage

\(b\) = Fixed cost per outage
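
The search for a minimum-cost interval can be sketched as follows. Several things here are assumptions rather than the paper's stated method: the probability of the failure event is taken to be the Weibull probability of failing within one maintenance interval, intervals are compared on cost per aging unit (total cost divided by interval length), and the search is a simple grid over candidate intervals. All names and input values are hypothetical.

```python
import math


def weibull_cdf(t, eta, beta, t0=0.0):
    """Probability of failure by age t (three-parameter Weibull)."""
    if t <= t0:
        return 0.0
    return 1.0 - math.exp(-(((t - t0) / eta) ** beta))


def total_cost(cost_m, en, ep, ed, a, b, er):
    """Total cost = Cost_m + EN * EP * (ED * a + b) + EP * ER."""
    return cost_m + en * ep * (ed * a + b) + ep * er


def cost_rate(interval, cost_m, en, ed, a, b, er, eta, beta, t0=0.0):
    """Total cost per aging unit for a given maintenance interval.

    Assumption: EP is the Weibull probability of failing within one interval.
    """
    ep = weibull_cdf(interval, eta, beta, t0)
    return total_cost(cost_m, en, ep, ed, a, b, er) / interval


def optimum_interval(candidates, **params):
    """Candidate interval with the lowest cost per aging unit (grid search)."""
    return min(candidates, key=lambda t: cost_rate(t, **params))


# Hypothetical inputs, for illustration only.
CANDIDATES = [0.5 * k for k in range(1, 41)]  # 0.5 to 20 aging units
PARAMS = dict(cost_m=500.0, ed=2.0, a=4.25, b=35.0, er=20000.0,
              eta=10.0, beta=3.0)

if __name__ == "__main__":
    print("with customer costs:", optimum_interval(CANDIDATES, en=1000, **PARAMS))
    print("PM and CM costs only:", optimum_interval(CANDIDATES, en=0, **PARAMS))
```

Setting the number of customers to zero gives the utility-only view. With these inputs the interval that includes customer costs comes out shorter than the one based on PM and CM costs alone, because a failure is more expensive when customer impacts are counted.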

 

Model Inputs:

The optimization model has several user inputs necessary to compute customer and utility failure impacts as well as to model the system/equipment failure mechanism. These inputs are described below in the sequence found on the model’s input page.

 

Failure Mode:

The failure mode is a general description of the failure that Preventive Maintenance is intended to reduce or eliminate.

 

 

Monetary Denomination:

The currency units used in the model. This entry is applied to multiple outputs of the model in order to make them more user friendly.

 

Device/Apparatus (singular):

The singular name or title of the device or system to be analyzed.

 

Device/Apparatus (plural):

The plural name or title of the device or system to be analyzed.

 

Aging Units:

The scalar quantity used to measure equipment or system aging.  This also serves as a multiplier in order to input and output units in a similar range for various types of equipment and systems analyzed.

 

Current Population:

The number of similar systems or equipment represented by this model.

 

Present Maintenance Cycle:

The current maintenance plan.  The value should be in the range of 0.5 to 20.  It should be expressed as a multiple of the Aging Units above.

 

PM Maintenance Cost:

The cost of performing the maintenance task on one system or apparatus.

 

Average Number of Customers Affected:

Used to determine customer impacts. Enter the number of residential, commercial and industrial customers affected by the failure of an average system or device.

 

 

Average Outage Impact Duration (Minutes):

The average number of minutes each customer class will be affected when a failure occurs.

 

Average Load per Customer Interrupted:

The average kW demand of commercial and industrial customers.

 

Average Impact Cost per Customer per Event:

The impact of an outage on residential customers excluding the effect of outage duration.

 

Average Impact Cost per Customer per Outage Minute:

The effect of outage duration on each customer class. This coupled with the kW loading will be used to determine commercial and industrial outage impacts.

 

Equipment Repair/Replacement Cost per Failure:

The average cost to repair a failure and return to an “AS NEW” condition.

 

No Failure Period:

The period after maintenance or installation when the probability of failure is virtually zero.

 

Wear Parameter:

The shape parameter, or beta, of the Weibull failure equation.  A beta less than 1 indicates infant mortality, with a decreasing failure rate as the system/apparatus ages. A beta of 1 indicates random failures. A beta greater than 1 indicates that failure rates increase with age.

 

Age after no failure period, when 63% would fail if no maintenance were performed:

The characteristic life of the equipment/system plus the guarantee or no failure period described above.  An estimate of how long the population can survive without maintenance before 63% have experienced 1 failure.  

 

Average Age Since Last Maintenance:

The current average age of the population under study, measured since last maintenance.  It is assumed that maintenance will return the equipment/system to an “AS NEW” condition.

 

On-line Monitoring and Continuous Diagnostics:

This section provides financial and life expectancy information for evaluating the concept of continuous diagnostics and on-line monitoring.  The basis for the analysis is to replace all routine maintenance activities with a continuous diagnostic program.  The program will prompt maintenance personnel to intervene just prior to end of life.

Model Outputs:

The outputs of the optimization model are both tabular and graphical.  The tabular output is listed just below the inputs. Graphical outputs are displayed on the next page of the workbook.

 

Customer Impacts:

Lists the estimated financial impacts on each customer class for a single outage.

 

MTTF:

Mean time to failure for the system or equipment if no maintenance were performed.
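
The paper does not say how the model computes MTTF. For a three-parameter Weibull distribution the standard closed form is the no-failure period plus the characteristic life times Gamma(1 + 1/beta), sketched below with hypothetical names.

```python
import math


def weibull_mttf(eta, beta, t0=0.0):
    """Mean time to failure of a three-parameter Weibull distribution."""
    return t0 + eta * math.gamma(1.0 + 1.0 / beta)


if __name__ == "__main__":
    # With beta = 1 (random failures) the MTTF is t0 plus the characteristic life.
    print(weibull_mttf(eta=10.0, beta=1.0, t0=2.0))
    print(weibull_mttf(eta=10.0, beta=2.0))
```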

 

Optimum Maintenance Interval based on PM & CM Cost Only:

The optimum maintenance interval predicted by the model if customer impacts are excluded.

 

Optimum Maintenance Interval based on PM, CM and Customer Costs:

The optimum maintenance interval predicted by the model including customer impacts.

 

Maintenance Costs for various PM Intervals:

A list of costs and customer impacts for selected maintenance intervals.

 

Average Number of Failures:

The expected number of failures for the various maintenance intervals.

 

On-line Monitoring and Continuous Diagnostics:

This reflects the range of capital investments one may make in On-line Monitoring and Continuous Diagnostics equipment.  The maximum value accounts for customer impacts that would result if a failure should take place along with the elimination of PM and CM costs. The minimum investment figure does not include customer impacts.

 

 

Known Problems:

The model estimates the number of failures for a discrete number of maintenance intervals.  It is assumed that the probability of a device failing multiple times in a maintenance period is low. In some cases, as the maintenance cycle is lengthened, the assumption is in error.  In these cases, the cost of maintenance and the impacts of failure appear to decrease with age.
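
The size of this error can be illustrated for the simplest case of random failures (beta of 1, no failure-free period), where the expected number of failures with repair after each failure is known exactly. The sketch below compares that count with a single-failure-per-interval estimate. It is an illustration under those restrictions, with hypothetical names and inputs, not a description of the model's internals.

```python
import math


def failures_single(population, interval, eta):
    """Expected failures if each device can fail at most once per interval."""
    return population * (1.0 - math.exp(-interval / eta))


def failures_with_renewal(population, interval, eta):
    """Expected failures when failed devices are repaired and may fail again.

    Valid for random failures (beta = 1), where failures of a repaired
    device form a Poisson process with rate 1 / eta.
    """
    return population * interval / eta


if __name__ == "__main__":
    for interval in (1.0, 10.0, 30.0):
        single = failures_single(100, interval, eta=10.0)
        renewal = failures_with_renewal(100, interval, eta=10.0)
        print(interval, round(single, 1), round(renewal, 1))
```

For intervals that are short relative to the characteristic life the two counts nearly agree. As the interval grows, the single-failure estimate levels off at the population size while the true count keeps rising, so failure cost per unit of age is understated and appears to fall.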

 

 

 

John E. Skog P.E.

Maintenance and Test Engineering Co.

360.352.9977

skogje@wln.com



[1] Cost-Benefit Analysis of Power System Reliability: Determination of Interruption Costs for the Bonneville Power Administration, EPRI EL-6791