Managing Availability for Improved Bottom-Line Results

EP Editorial Staff | April 1, 2003

The reliability block diagram is the cornerstone of the availability model because it shows how failure in a plant element affects process uptime.

Over the past several years, managers up through the CEO have come to recognize equipment uptime as a key part of any successful operating strategy. Equipment availability is one of the key performance indicators of a maintenance organization. Goals are set based on “gut feel,” or by benchmarking with similar facilities within the organization or with similar organizations within the same industry.

Both these goal-setting methods involve high levels of uncertainty that can lead to overspending for maintenance and overtaxing of maintenance resources. The uncertainty of “gut feel” speaks for itself. Benchmarking involves high levels of uncertainty due to the difficulties created by not knowing the exact guidelines each facility uses for recording unavailability.

Here is a framework for managing availability goals to help meet the financial goals of an organization. We will examine availability in detail: the three types of availability and how they relate to each other, the factors that determine availability, and recommendations for improving the setting of goals.

Availability types
The three subtypes of availability are inherent, achievable, and operational (see Fig. 1). Each subtype is defined by the maintenance activities and support assumptions it takes into account:

Inherent availability (A_i): The level of availability expected when only corrective maintenance is performed. Inherent availability is determined purely by the design of the equipment. It assumes that spare parts and manpower are 100 percent available with no delays.
Achievable availability (A_a): The level of availability expected when both corrective and preventive maintenance are performed. Achievable availability is determined by the hard design of the equipment and the facility. A_a also assumes that spare parts and manpower are 100 percent available with no delays.
Operational availability (A_o): The bottom line of availability. It is the actual level of availability realized in the day-to-day operation of the facility. It reflects plant maintenance resource levels and organizational effectiveness.
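One common way to put numbers on these three definitions is the textbook ratio of uptime to uptime plus downtime, with a different downtime term for each subtype. The sketch below assumes those standard formulas, which the article itself does not state; the function names and the hour figures are hypothetical and serve only to show how the three values nest.

```python
def inherent_availability(mtbf, mttr):
    """A_i: corrective maintenance only, no supply or administrative delays.

    mtbf = mean time between failures, mttr = mean time to repair (hours).
    """
    return mtbf / (mtbf + mttr)


def achievable_availability(mtbm, mean_active_maintenance):
    """A_a: corrective plus preventive maintenance, still no support delays.

    mtbm = mean time between maintenance actions of either kind (hours).
    """
    return mtbm / (mtbm + mean_active_maintenance)


def operational_availability(mtbm, mean_downtime):
    """A_o: same maintenance actions, but downtime includes waiting for
    parts, people, and paperwork."""
    return mtbm / (mtbm + mean_downtime)


# Hypothetical figures, in hours, chosen only for illustration.
a_i = inherent_availability(mtbf=500.0, mttr=5.0)
a_a = achievable_availability(mtbm=300.0, mean_active_maintenance=4.0)
a_o = operational_availability(mtbm=300.0, mean_downtime=10.0)

print(f"A_i = {a_i:.4f}, A_a = {a_a:.4f}, A_o = {a_o:.4f}")
```

With these sample inputs the three values fall in the order A_o < A_a < A_i, which matches the point that operational availability cannot rise above achievable availability.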

It is important to understand the distinctions among the three subtypes in order to design, measure, and manage integrated subgoals:

Achievable availability fulfills the need to distinguish availability when planned shutdowns are included.
Inherent availability fulfills the need to distinguish expected performance between planned shutdowns.
Operational availability is required to isolate the effectiveness and efficiency of maintenance operations.

These definitions and distinctions lead to crucial recognitions:

The shape and location of the achievable availability curve is determined by the plant’s hard design.
An operation is at a given point on A_a, based on whether scheduled or unscheduled maintenance strategies are selected for each failure. A goal of availability-based maintenance operations is to find the peak of the curve and operate at that level.
Operational availability is the bottom line of performance. It is the performance experienced as the plant operates at a given production level.
The vertical location of the A_o is controlled by decisions for resource levels and the organizational effectiveness of maintenance operations. By definition, its location cannot rise above A_a.

These factors have the following strategic implications:

It is crucial to know the location and shape of the achievable availability curve. Otherwise, it is not possible to determine what is reasonable and possible for operational availability and, therefore, plant production.
If the A_a curve is not known, manufacturing operations management may unknowingly attempt to achieve performance beyond that which is possible. The result is the overspending and overtaxing of maintenance resources.
Management must make strategic decisions for long-term relative positions of the two curves. As plant production increases over time, changing operating conditions will place greater stress on equipment and drive A_a down. Meanwhile, maintenance operation management will progressively move A_o upward to meet the demands of production. Eventually the two will converge to the point that additional availability can be acquired only by modifying plant design.

The conclusion from these factors is that eventually A_a must be known. Otherwise, many of the current goals to develop world-class maintenance operations are not possible. It is the organization that makes the most money—not the one with the highest availability—that wins the game.

Determining availability
Availability is a function of reliability and maintainability: in other words, how often equipment will fail and how long it takes to get the equipment back to full production capability. Reliability, maintainability, and, therefore, availability are determined by the interaction of the design, production, and maintenance functions (see accompanying section “Top-Level Factors That Affect Availability”).

The implication is that availability is largely determined by how well designers, operators, and maintainers work together.

Optimizing availability
Profitable plant availability is the result of optimizing A_i, A_a, and A_o. Because no plant can achieve availability higher than A_a, achievable availability is the first to be optimized (see Fig. 2).

All equipment fails based on its design even when operated and maintained perfectly. Every maintenance activity, whether scheduled or unscheduled, is representative of an equipment failure. Scheduled or time-based maintenance seeks to correct failures before they can affect equipment performance. Unscheduled maintenance is corrective maintenance performed as the result of breakdown or the detection of incipient failure.

Achievable availability is the result of several factors:

Plant hard design determines the shape and location of the A_a curve. Therefore, this design establishes the possible achievable availability.
Maintenance strategies determine the plant’s location on the A_a curve. Therefore, these strategies establish the actual achieved availability.
The right extreme of the A_a curve represents the hypothetical extreme of 100 percent scheduled maintenance. There are no surprises because all maintenance is performed during a scheduled maintenance period. Availability is well below optimum. This extreme can be compared to coming into the pits during every lap of a race to ensure that you have no breakdowns on the racecourse. It could be done, but you would never win the race.
Trading off scheduled maintenance for unscheduled maintenance results in a climb back up the availability curve to the left. A nearly linear increase in availability occurs until you reach the point where unscheduled maintenance due to breakdowns takes away from availability gains. Operating farther to the left places the equipment under more stress and increases organizational chaos.
Once the plant is operating to the left of the A_a peak, further reductions in scheduled maintenance become poor strategies.
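The peaked shape of the A_a curve can be reproduced with a simple age-replacement model, sketched here under stated assumptions rather than as the article's own method. The item is assumed to wear out according to a Weibull distribution, to be overhauled at a fixed scheduled interval if it survives that long, and to be repaired after a breakdown otherwise. All parameter values (ETA, BETA, PM_HOURS, CM_HOURS) are hypothetical. A short interval corresponds to the heavily scheduled right-hand end of the curve, and a long interval to the mostly unscheduled left-hand end.

```python
import math


def reliability(t, eta, beta):
    """Weibull survival function R(t) = exp(-(t/eta)**beta)."""
    return math.exp(-((t / eta) ** beta))


def mean_uptime(interval, eta, beta, steps=2000):
    """Expected running hours per cycle: integral of R(t) from 0 to the
    scheduled interval, evaluated with the trapezoid rule."""
    h = interval / steps
    total = 0.5 * (reliability(0.0, eta, beta) + reliability(interval, eta, beta))
    total += sum(reliability(i * h, eta, beta) for i in range(1, steps))
    return total * h


def availability(interval, eta, beta, pm_hours, cm_hours):
    """Long-run availability when the item is overhauled every `interval`
    hours or repaired at failure, whichever comes first."""
    up = mean_uptime(interval, eta, beta)
    survive = reliability(interval, eta, beta)
    down = pm_hours * survive + cm_hours * (1.0 - survive)
    return up / (up + down)


# Hypothetical inputs: wear-out behaviour (beta > 1) and a breakdown
# repair that takes longer than a scheduled overhaul.
ETA, BETA = 1000.0, 2.5
PM_HOURS, CM_HOURS = 4.0, 24.0

intervals = [50.0 * k for k in range(1, 61)]  # 50 h to 3,000 h
curve = [(t, availability(t, ETA, BETA, PM_HOURS, CM_HOURS)) for t in intervals]
best_interval, best_availability = max(curve, key=lambda point: point[1])

print(f"Peak availability {best_availability:.4f} at a {best_interval:.0f} h interval")
```

In this toy model availability is lowest when the item is pulled out of service very frequently, rises as the interval lengthens, and falls again once breakdowns dominate, so the best interval lies between the two extremes.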

The cost curve represents strategic decisions to invest large amounts of capital up front to increase A_a through hard design, or to spend operating dollars to increase A_a through more intensive maintenance strategies. These decisions are driven by many factors, such as the need to get a product to market quickly, the availability of capital, and the operating mentality of the company.

Availability and costs
The availability/cost curve relationship highlights the fact that availability is a proxy for revenue. Toward either extreme of the cost curve or the availability curve, a point is reached where the cost of availability exceeds the income it allows. Without availability management, a plant can operate beyond those intersections without management’s awareness; normal accounting practices and other maintenance performance indicators cannot easily reveal this practice.

The difference between achievable and operational availability is the inclusion of maintenance support. Achievable availability assumes that resources are 100 percent available and no administrative delays occur in their application. Therefore, maximum operational availability theoretically approaches achievable availability. In reality, every human endeavor has a natural upper limit of obtainable perfection that prevents A_o from reaching A_a.

The shape and location of the operational availability curve are determined by the level of maintenance operation resources and organizational effectiveness. Resources and organizational effectiveness have upper bounds above which additional spending will not yield better results. At that point, achievable availability must be increased to give A_o room to move upward. A_a can be increased by new maintenance strategies, provided that the plant is not operating at the peak of the A_a curve. Capital investment is required to move the A_a curve upward if the plant is operating on the peak.

This is important. Without availability engineering and management, it is easy to unknowingly spend beyond the point of maximum return. This may occur when plant performance falls short of management’s desired productive capacity. Management tries to achieve gains with increased stress on maintenance support. However, the operational availability curve has already been unknowingly forced against the achievable availability curve. The result is throwing good money after bad. Spending is in the loss zone to the right of the intersection of the achievable availability and cost curves.

Determining achievable availability for an existing facility
Few physical asset managers have had the luxury of being an integral part of the design phase of their physical plant. Therefore, they need to analyze the current physical plant to determine its achievable availability.

Determining achievable availability is a four-step process:
1. Build a reliability block diagram (RBD) of the plant’s critical systems. Use publicly available reliability data for failures; plant failure data skews the results because it reflects plant organizational effectiveness. Use plant data or works estimation techniques to determine mean time to repair, keeping in mind that plant repair data is skewed by organizational effectiveness in the same way.

2. Determine logistical delays created by plant hard design: to/from shops, to/from stores, accessing equipment.

3. Add in scheduled maintenance downtime for the chosen preventive maintenance strategy.

4. Perform availability simulations.

The scope of the analysis is determined by resources, time, and the desired quality of the result.
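The four steps can be illustrated with a small Monte Carlo sketch for a single block. This is a deliberately reduced example, not a substitute for a commercial availability simulator: it assumes Weibull-distributed failures, fixed repair and overhaul durations, and a fixed logistical delay per breakdown. The function name and every number are hypothetical.

```python
import random


def simulate_availability(eta, beta, repair_hours, logistic_delay_hours,
                          pm_interval_hours, pm_hours,
                          horizon_hours, runs, seed=1):
    """Estimate availability of one block by simulating run/stop cycles.

    Step 1: failure behaviour comes from Weibull parameters eta and beta.
    Step 2: each breakdown adds a design-driven logistical delay.
    Step 3: a scheduled overhaul stops the block every pm_interval_hours.
    Step 4: many simulated histories are averaged.
    """
    rng = random.Random(seed)
    results = []
    for _ in range(runs):
        remaining = horizon_hours
        uptime = 0.0
        while remaining > 0.0:
            time_to_failure = rng.weibullvariate(eta, beta)  # (scale, shape)
            if time_to_failure < pm_interval_hours:
                run = time_to_failure
                down = repair_hours + logistic_delay_hours
            else:
                run = pm_interval_hours
                down = pm_hours
            run = min(run, remaining)       # do not run past the horizon
            uptime += run
            remaining -= run
            remaining -= min(down, remaining)
        results.append(uptime / horizon_hours)
    return sum(results) / len(results)


# Hypothetical inputs, in hours.
estimate = simulate_availability(
    eta=1000.0, beta=2.5,
    repair_hours=20.0, logistic_delay_hours=4.0,
    pm_interval_hours=450.0, pm_hours=4.0,
    horizon_hours=50_000.0, runs=200,
)
print(f"Estimated achievable availability: {estimate:.4f}")
```

A fixed seed keeps the run repeatable. A fuller model would simulate every block in the RBD and combine the results according to the diagram's series and parallel structure.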

Building the RBD
The RBD is a graphical representation of the plant systems, subsystems, and components arranged in a way that reflects equipment interdependence (see Fig. 3). The RBD is the cornerstone of the availability model because it shows how failure in a plant element affects process uptime.

It is important to note the reliability implications of the systems presented in Fig. 3. Serial systems are inherently unreliable. The failure of a single element in the system results in a stoppage of the overall system. Fully redundant parallel systems are inherently reliable. The system stops only if all the redundant systems fail at the same time. Redundancy is an important tool in improving overall system reliability. See Practical Machinery Management for Process Plants, Volume 1, 3rd Edition: Improving Machinery Reliability by Heinz P. Bloch for a more complete discussion.
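The contrast between serial and redundant arrangements can be shown with the usual block-diagram arithmetic. The sketch below assumes that blocks fail independently of one another, which real plants only approximate; the function names and the 0.95 figure are illustrative.

```python
from math import prod


def series_availability(blocks):
    """All blocks must be up: multiply the individual values."""
    return prod(blocks)


def parallel_availability(blocks):
    """Fully redundant: the system is down only if every block is down."""
    return 1.0 - prod(1.0 - a for a in blocks)


pump = 0.95  # hypothetical availability of a single pump

three_in_series = series_availability([pump, pump, pump])
two_in_parallel = parallel_availability([pump, pump])

print(f"Three pumps in series:  {three_in_series:.6f}")   # 0.857375
print(f"Two pumps in parallel:  {two_in_parallel:.6f}")   # 0.997500
```

Three 95 percent blocks in series deliver less than 86 percent, while two of the same blocks in parallel deliver better than 99 percent, which is why redundancy is such an effective lever.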

All complex machines are built from the same few basic machine elements of couplings, bearings, gears, motors, belts, and so on. The RBD is refined by breaking down the top-level RBD into several RBDs that represent each top-level system (see Fig. 4).

Obtaining failure and repair data
After the RBDs are built, failure and repair data must be obtained for use in availability simulations. Obtaining this data is a time-consuming task. The desired degree of certainty dictates the level of effort required for this stage of building the model. It is important to remember that this is not an exact science. Perfection is not required. You need only be better than your toughest competition.

There are many sources for failure data. This is not an exhaustive list:

Reliability Analysis Center—Electronic parts reliability data (EPRD), non-electronic parts reliability data (NPRD). Available in print and software versions.

Paul Barringer’s Web site—Weibull data for many components plus links to other available data and reliability web sites.

Practical Machinery Management for Process Plants, Volume 1, 3rd Edition: Improving Machinery Reliability, Heinz P. Bloch, Gulf Professional Publishing, ISBN: 087201455X—Table of equipment failure data plus practical information on improving equipment and system reliability.

Plant data—Failure data depends on the robustness of the data-collection system. Using plant data skews the analysis by including plant organizational effectiveness.

Binomial and Weibull distributions typically are used to present failure data for modeling purposes. Most availability simulators accept either type of data.
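As an illustration of how Weibull failure data feeds a model, the sketch below evaluates the Weibull reliability function and mean time to failure, then draws random failure times the way a simulator might. The scale (eta) and shape (beta) values are hypothetical and are not taken from any of the sources listed above.

```python
import math
import random


def weibull_reliability(t, eta, beta):
    """Probability that the item is still running at time t."""
    return math.exp(-((t / eta) ** beta))


def weibull_mttf(eta, beta):
    """Mean time to failure: eta * Gamma(1 + 1/beta)."""
    return eta * math.gamma(1.0 + 1.0 / beta)


ETA, BETA = 1000.0, 2.0  # hypothetical scale (hours) and shape

# At t = eta, roughly 36.8 percent of items survive regardless of beta.
survival_at_eta = weibull_reliability(ETA, ETA, BETA)
mttf = weibull_mttf(ETA, BETA)

# Draw simulated failure times; random.weibullvariate takes (scale, shape).
rng = random.Random(42)
samples = [rng.weibullvariate(ETA, BETA) for _ in range(20_000)]
sample_mean = sum(samples) / len(samples)

print(f"R(eta) = {survival_at_eta:.4f}")
print(f"MTTF from formula = {mttf:.1f} h, from samples = {sample_mean:.1f} h")
```

A shape value above 1 represents wear-out, a value of 1 reduces to a constant failure rate, and a value below 1 represents early-life failures.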

Obtaining repair data is a much more difficult task. Repair data is typically not available anywhere in tabular form. Repair times are very dependent on the configuration of the equipment and the plant. Equipment with a great deal of guarding and with parts located in tight spots requires much longer repair times than equipment with little guarding and plenty of space in which to work. The two primary methods of obtaining repair time data are analyzing current plant data and using works estimation systems such as MOST to estimate times (see accompanying section “Obtaining Repair Time Data”). Each method has its own set of difficulties.

In the next installment of this article, we will discuss using the availability model to determine plant bottlenecks and increase throughput, examine the impact of the need for modeling and analysis on the maintenance and engineering organization, and offer suggestions on how to close the natural gaps between the three types of availability.

Bill Keeter is president of ARMS Reliability Engineers-USA, LLC, 8450 N. Devonshire Woods Pl., West Terre Haute, IN 47885; (812) 535-1445

THREE TYPES OF AVAILABILITY

Fig. 1. It is important to understand the distinctions among the three subtypes in order to design, measure, and manage integrated subgoals.

OPTIMAL AVAILABILITY/COST

Fig. 2. Because no plant can achieve availability higher than achievable availability, A_a, it is the first to be optimized.

RELIABILITY BLOCK DIAGRAMS

Fig. 3. The RBD is the cornerstone of the availability model because it shows how failure in a plant element affects process uptime.

BREAKING DOWN RBDs

Fig. 4. Breaking down the top-level RBD shows several RBDs that represent each top-level system.

TOP-LEVEL FACTORS THAT AFFECT AVAILABILITY

Reliability	Maintainability
Is increased as the frequency of outages is reduced. Time between failures or shutdowns is increased.	Is increased as the duration of plant, subsystem, or equipment downtime is reduced.
Factors Driven by Design
Operating environment; Equipment rated capacity; Maintenance while the system, subsystem, or item of equipment continues to function; Installed spare components within an equipment item; Redundant equipment and subsystems; Simplicity of design and presence of weak points	Accessibility to the work point; Features and design that determine the ease of maintenance; Plant ingress and egress; Work environments
Factors Driven by Maintenance Decisions
Preventive maintenance based on failure-trend data analysis; Trend diagnoses and inspection of equipment conditions to anticipate maintenance needs; Quality of maintenance tasks (including inspections); Skills applied to maintenance tasks	How maintenance tasks are detailed, developed, and presented to the maintenance technician; Quality of the system of maintenance procedures; The probability of human, material, and facility resources being available to maintenance tasks; Training programs; Management, supervision, and organizational effectiveness; Durability of handling, support, and test equipment
Factors Driven by Operations Decisions
Use of equipment relative to its rated capacity; How spares are incorporated in normal process operation; Shutdown and startup procedures	Organizational effectiveness as a factor in the troubleshooting process; Organizational effectiveness and procedures to ready equipment for maintenance and startup
(Source: Availability Engineering and Management, Richard G. Lamb, Prentice Hall; ISBN: 0133241122)

OBTAINING REPAIR TIME DATA

Method	Advantages	Limitations
Analyzing plant data	Usually does not require special training; Data is usually available in plants that have mature maintenance reliability programs	Data may be unreliable; Data is affected by organizational effectiveness
Works estimation system	Eliminates organizational effectiveness as a factor; Provides a good standard against which to judge actual achieved repair times; Provides detailed work steps and procedures	Requires training on the system used; Requires much time to analyze the equipment and break repairs into tasks
