Uptime: Learning From Sinking Ships And Aircraft Mishaps
EP Editorial Staff | May 28, 2012
“Human error” is an often-cited cause or contributing factor in accidents and failures. We routinely see this with our plant equipment. Without a structured process for identifying and quantifying the underlying human factors associated with specific incidents, however, eliminating human error can be quite challenging. Let’s look into a couple of historical examples for some valuable insight.
Just the tip of the iceberg
On April 15, 1912, the new and grand RMS Titanic, at the time the world’s most technologically advanced steamship, struck an iceberg and sank, taking more than 1,500 souls down with her. We could truthfully state, “If it weren’t for the iceberg, the Titanic would have had a glorious maiden trans-Atlantic voyage.” That would be an oversimplification.
One of the technological marvels the Titanic sailed with was the Marconi wireless radio, which was operated exclusively by Marconi Company employees, solely as a service to the ship’s passengers. When icebergs were spotted in the shipping lanes by other ships in the area, their captains sent radio messages to the Titanic. Unfortunately, critical radio warnings from the SS Californian were later ignored in favor of a backlog of passenger messages. But, “If it weren’t for the iceberg…”
Fred Fleet, a surviving Titanic crewman, was in the vessel’s crow’s nest the evening of April 14. Although he didn’t have binoculars with him, Fleet was serving as a lookout for ice. It seemed that the binoculars had been stowed away in the telephone locker, to which nobody on board had a key. But, “If it weren’t for the iceberg…”
Second Officer David Blair, who had been bumped from the ship at the last minute by bosses at the White Star Line, had forgotten to hand off the locker keys to his replacement, the more senior Charles Lightoller. But, “If it weren’t for the iceberg…” When asked during the disaster investigation how much sooner the iceberg could have been spotted with binoculars, the lookout Fleet replied, “Well, enough to get out of the way.”
The Titanic disaster 100 years ago led to substantial changes in ship design and radio communications, along with other navigational breakthroughs. The most significant, yet preventable, contributing factors leading to the ship’s demise, though, could not be eliminated by design: They were the human factors. Human fallibility, at multiple levels, had gone largely unnoticed during this much-heralded, but ill-fated, voyage. Given the technology of 1912, what must happen to prevent future collisions with icebergs?
Flying on the ground
The Airbus A340-600 “super-stretch” aircraft is known for its new manufacturing techniques, state-of-the-art technologies, extended range, fuel efficiency, unrivaled interior comfort and amenities, and reduced maintenance cost. The A340-600, powered by four Rolls-Royce engines each generating 56,000 pounds of thrust for a take-off weight of over 400 tons, is the longest passenger aircraft flying today. Only 96 of these aircraft are currently in operation worldwide.
On November 15, 2007, a new Airbus A340-600 crashed during a ground engine test, at the Airbus facilities at Toulouse International Airport in France. The $250M plane—about to be delivered to Etihad Airways, in the United Arab Emirates—was a total loss after less than 13 seconds of movement.
The aircraft had just finished a static ground test, and the technician started the engines again for a full-power run to find the origin of oil leaks. The calculations had shown that “the parking brake was designed to hold under full engine power.” After about three minutes of running, though, the aircraft began to move. The technician applied the brakes, which released the parking brakes. Then, with the aircraft continuing to move in the direction of a test-pen wall, the technician turned the nose wheel, causing it to skid sideways as the plane accelerated. But, “The parking brake was designed to hold…”
The 242-ton aircraft hit the wall at 35 miles per hour, causing the forward fuselage to break off and fall forward down the other side of the wall. Engines 1 and 2 were damaged when they hit the wall. Because the throttle controls were severed, Engines 3 and 4 could not be shut down immediately. Engine 4 shut down after two-and-a-half hours—when water and fire-fighting foam snuffed it out. Engine 3 continued running for nine hours, until it finally ran out of fuel.
The cause of this A340-600 aircraft mishap was summed up as “a lack of detection and correction of violations to test procedures.” You guessed it: human factors, again. But, “The parking brake was designed to hold…”
Coming to the rescue
What can we learn from these two examples of human factors in preventable mishaps and failures? The first step in eliminating problems is to identify contributing factors—human factors being the most elusive. Alas, we often refer to this situation as “human error” (another oversimplification). Let’s see how the U.S. Department of Defense (USDOD) has identified and categorized contributing human factors.
In a May 2003 memorandum, the U.S. Secretary of Defense proclaimed, “World-class organizations do not tolerate preventable accidents. Our accident rates have increased recently, and we need to turn this situation around. I challenge all of you to reduce the number of mishaps and accident rates by at least 50% in the next two years.” This memorandum resulted in the formation of the “Department of Defense Safety and Oversight Committee” and a powerful emphasis on human factors engineering (HFE). The Committee’s statistics revealed that 80 to 90% of all mishaps were caused by human errors at some level. The result was the “Human Factors Analysis and Classification System” (HFACS).
The HFACS included the following four tiers and associated factors to help understand why a mishap occurred and how it might be prevented from happening again.
- Organizational Influences (20 factors)
- Management/Supervisory Conditions (19 factors)
- Preconditions (92 factors)
- Acts Committed by Operator (16 factors)
This original four-tier HFACS structure was later adapted for “maintenance events” and called “HFACS-Maintenance Extension” (HFACS-ME). The “maintenance” framework included these four tiers and associated factors:
- Management/Supervisory Conditions (24 factors)
- Maintainer Conditions (27 factors)
- Working Conditions (27 factors)
- Maintainer Acts (24 factors)
Note how the “Acts” of the Operators or Maintainers form the category with the fewest causal or contributing factors in the HFACS (16) and are tied for the fewest in the HFACS-ME (24, the same as Management/Supervisory Conditions). Both frameworks look beyond the person at the controls or on the tools for causes and potential solutions.
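To make the weighting of the two frameworks concrete, here is a minimal Python sketch that tallies the factor counts listed above. The counts come straight from the two lists; the function names `share_beyond_acts` and `smallest_tiers` are ours, chosen only for illustration, and are not part of the HFACS or HFACS-ME.

```python
# Factor counts per tier, copied from the two lists above.
HFACS = {
    "Organizational Influences": 20,
    "Management/Supervisory Conditions": 19,
    "Preconditions": 92,
    "Acts Committed by Operator": 16,
}
HFACS_ME = {
    "Management/Supervisory Conditions": 24,
    "Maintainer Conditions": 27,
    "Working Conditions": 27,
    "Maintainer Acts": 24,
}


def share_beyond_acts(tiers, acts_tier):
    """Return the fraction of factors that sit outside the 'acts' tier."""
    total = sum(tiers.values())
    return (total - tiers[acts_tier]) / total


def smallest_tiers(tiers):
    """Return every tier name that has the lowest factor count (ties included)."""
    low = min(tiers.values())
    return [name for name, count in tiers.items() if count == low]


if __name__ == "__main__":
    print(f"{share_beyond_acts(HFACS, 'Acts Committed by Operator'):.0%}")
    print(f"{share_beyond_acts(HFACS_ME, 'Maintainer Acts'):.0%}")
    print(smallest_tiers(HFACS))
    print(smallest_tiers(HFACS_ME))
```

By these counts, roughly 89% of the HFACS factors (131 of 147) and about 76% of the HFACS-ME factors (78 of 102) describe something other than the act of the operator or maintainer.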
Human-factors lessons from the Airbus mishap
Using the HFACS to assess and classify the contributing factors of the November 2007 Airbus incident resulted in the following report summary:
- Unsafe acts
  - Errors: The technician was fixated on applying the brakes and steering the nose wheel instead of reducing power.
  - Violations: Procedures require full power for only two engines at a time: the one leaking and one on the opposite wing. The use of wheel chocks is required for engine tests.
- Preconditions for unsafe acts
  - The Airbus technician was unaware the aircraft was moving until the on-board customer representative told him so. The technician also testified he often carried out this test, but with higher aircraft weight.
- Unsafe supervision
  - The technician admitted that some tests were conducted outside the scope of the Customer Acceptance Manual, due to the pressure of the on-board customer representative.
- Organizational influences
  - There were oversights in safety processes shown in video recordings from several days prior to the incident. Some tests were performed with wheel chocks and others without wheel chocks, even though procedures require the chocks during engine tests. The failure to detect procedural violations and take corrective action was cited as promoting the test outside of the established procedures.
Application for capital-intensive businesses and industries…
What can we learn and put into practice based on the HFACS and HFACS-ME? The USDOD’s expressed goal: “To identify and eliminate the causes of aircraft mishaps and failures.” The same could be said with regard to equipment-intensive industries: “To identify and eliminate the causes of critical equipment mishaps and failures.” In other words, it means improving equipment performance and reliability.
Human factors in equipment-reliability improvement…
We should be able to identify and classify human factors that contribute to equipment failures and mishaps in our plants and facilities. Then, given those factors, we should be able to establish countermeasures to eliminate causes of such failures and mishaps in the future. Here’s a starting point based on the HFACS-ME factors that could easily apply to industrial plants—or any equipment-intensive operations:
Organizational influences/work culture
- Resources and policy deployment
- Organizational climate
- Organizational policies, procedures and work processes
- Inadequate supervision
- Planned inappropriate operations
- Failure to correct known problem
- Supervisory violations
- Environmental factors/working conditions
- Condition of individuals involved
- Personnel factors: operators and maintainers
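One way to put such a list to work is to tag each failure investigation with the factors it cites and then count which factors recur. The Python sketch below is only an illustration under stated assumptions: the factor names are copied from the list above, but the function `tally_factors`, the record layout and the two example records are hypothetical, invented here and not drawn from any real plant or from the HFACS-ME itself.

```python
from collections import Counter

# Factor names copied from the starting-point list above.
FACTORS = [
    "Resources and policy deployment",
    "Organizational climate",
    "Organizational policies, procedures and work processes",
    "Inadequate supervision",
    "Planned inappropriate operations",
    "Failure to correct known problem",
    "Supervisory violations",
    "Environmental factors/working conditions",
    "Condition of individuals involved",
    "Personnel factors: operators and maintainers",
]


def tally_factors(investigations):
    """Count how often each listed factor is cited across investigations.

    `investigations` maps a failure ID to the list of factors cited for it.
    A factor that is not in FACTORS raises ValueError, so a catch-all
    label such as "human error" cannot slip into the tally.
    """
    counts = Counter()
    for failure_id, cited in investigations.items():
        for factor in cited:
            if factor not in FACTORS:
                raise ValueError(f"{failure_id}: unknown factor {factor!r}")
            counts[factor] += 1
    return counts


# HYPOTHETICAL records, invented for illustration only.
example = {
    "pump-17": ["Failure to correct known problem", "Inadequate supervision"],
    "conveyor-3": ["Failure to correct known problem"],
}

if __name__ == "__main__":
    for factor, count in tally_factors(example).most_common():
        print(count, factor)
```

Rejecting labels outside the list is a deliberate choice in this sketch: it forces each investigation to name a specific contributing factor rather than stopping at “human error.”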
Consider these points
First, bad decision-making can overcome the best of technologies. Second, a technology is only as good as the people using it. And third, complex systems rarely fail without warnings. Understanding the impacts of any “human” factor is key to improved performance.
Think proactively. When developing maintenance procedures, operating procedures, training programs and root-cause analysis methods, always take into account human factors that would contribute to flawless human and equipment performance in your plant. When investigating causes of equipment failures, always identify contributing human factors and evaluate the effectiveness of policies, procedures and work processes.
Making an HFACS model work in our plants and facilities has many powerful benefits. Among them is improved reliability (reduced breakdowns and equipment failures) plus improved workplace safety, product quality and on-time delivery. Stay tuned for more details on “Human Factors for Industrial Equipment-Reliability Improvement.”