Manage STO Risk
EP Editorial Staff | April 25, 2019
By Drew D. Troyer, CRE, Contributing Editor
A shutdown, turnaround, or outage (STO) is critical to plant reliability. A smooth one restores the plant’s reliability and enables trouble-free production until the next STO. A rough one sets the stage for a subsequent period of painful reactive maintenance, which can compromise reliability, productivity, quality, even safety and/or environmental performance.
Successful STOs require proper planning, scheduling, and execution of proposed work scopes. The challenge stems from having numerous craftspeople, often from multiple contractors, executing so many different jobs, all of which place demands on the same permitting organizations and frequently require the same special tools, e.g., high-capacity cranes.
Unfortunately, organizations tend to repeatedly make the same mistakes in STO planning, coordination, and execution. There are several reasons why, starting with frequency: Most plants don’t execute major STOs very often; three-to-five-year cycles aren’t uncommon. Between STOs, a plant may experience substantial turnover, so institutional STO knowledge thins. The solution: risk analysis and management.
Most reliability professionals are familiar with root-cause analysis (RCA) and failure-modes-and-effects analysis (FMEA). The first, RCA, is effective for post-mortems of adverse events. When something fails, we employ RCA to identify mechanisms and associated causes leading to the failure so we can control them. For less-complex events, we employ less-intensive apparent-cause analysis (ACA) techniques such as “Why-Why” or “Five-Why.” Conversely, we employ FMEA to anticipate what could go wrong, the probable consequences of the failure, the likelihood of its occurrence, and the current risk-management measures to prevent, predict, or prepare for the event.
The two approaches are bookends of the same process. The differences? FMEA is conducted pre-mortem (before the fact) using inductive reasoning. RCA is conducted post-mortem (after the fact) using deductive or inductive reasoning methods. We typically leverage these tools to manage machine reliability, but they’re equally effective in managing process reliability, e.g., plant STOs.
The final step for an STO steering committee and management team after a shutdown should be a “lessons learned” session. Such meetings are usually informal, and associated notes are often scant and somewhat subjective. A post-STO session is more beneficial when managed as a formal process-RCA event with detailed notes and cause-effect and other diagrams. As with post-mortem analyses of machine failures, some process failures during an STO will be fairly straightforward, requiring only an ACA. Others will require a full-blown RCA.
The start of the next STO event is the time to review the formally analyzed and documented lessons learned from the last STO. Rather than being approached informally, the session should include a full-blown pre-mortem STO-process FMEA that incorporates those lessons and applies inductive reasoning to anticipate what could go wrong given the scope of the upcoming event, estimate the consequences and the likelihood, and identify preemptive measures to mitigate the STO risks.
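As a rough illustration of how a pre-mortem STO-process FMEA worksheet might be ranked, the Python sketch below scores each anticipated failure by consequence and likelihood. Everything in it is an assumption made for illustration: the 1-to-10 scales, the simple severity-times-likelihood score, the `StoRisk` and `rank_risks` names, and the sample entries and ratings are hypothetical, not taken from any plant or prescribed by this column. Many FMEA worksheets also include a detection rating, which is left out here for brevity.

```python
from dataclasses import dataclass


@dataclass
class StoRisk:
    """One row of a hypothetical STO-process FMEA worksheet."""
    failure_mode: str  # what could go wrong during the STO
    severity: int      # consequence, 1 (minor) to 10 (severe); assumed scale
    likelihood: int    # chance of occurrence, 1 (rare) to 10 (frequent); assumed scale
    mitigation: str    # preemptive measure to reduce the risk

    @property
    def score(self) -> int:
        # Assumed scoring rule: consequence multiplied by likelihood.
        return self.severity * self.likelihood


def rank_risks(risks):
    """Return the risks sorted from highest to lowest score."""
    return sorted(risks, key=lambda r: r.score, reverse=True)


# Illustrative entries only; the ratings are made up for this example.
risks = [
    StoRisk("Two jobs need the same high-capacity crane", 7, 6,
            "Schedule crane lifts jointly across contractors"),
    StoRisk("Permit backlog delays job starts", 5, 8,
            "Pre-issue permits for planned work"),
    StoRisk("Contractor crew unfamiliar with the site", 6, 4,
            "Run site orientation before the STO"),
]

for risk in rank_risks(risks):
    print(f"{risk.score:>3}  {risk.failure_mode} -> {risk.mitigation}")
```

Sorting by score simply puts the items that deserve mitigation planning first; the team's judgment, informed by the documented RCA findings from the last STO, still determines the ratings themselves.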
Repeating the same mistakes, time after time, in your STOs is costly. Applying tried-and-true RCA and FMEA tools to the process improves plant reliability, reduces STO cost and duration, decreases safety and environmental risks, and enables continuous improvement. Learn more about STO issues in my feature article in next month’s Efficient Plant.