A Personal History of the Art and Science of Reliability
Kathy | July 1, 2005
Lessons learned during a five-decade journey toward excellence in reliability and maintenance.
One of the advantages of having someone as old as me (75) on the podium is that I can give you a living history lesson about the art and perhaps the science of reliability and its relationship to maintenance. Because I have seen events unfold over the years, I can step back and see trends that others may not have had an opportunity to see. Perhaps my overview can be projected into the future, a future that you may shape and will certainly encounter.
My first job out of college was in a relatively small chemical plant. I was trained as a chemical engineer but very early I was made a maintenance shop supervisor. What an experience that was. Maintenance doesn’t really describe what we did; it needed adjectives. The entire job of my group was to fix machines, pipes, and anything else that broke. So, expensive breakdown maintenance is a more appropriate term for what we were instructed to do.
Since plant management was clearly focused on producing product, there was no tolerance for taking time to find out why anything broke. So after a couple of years of this silliness, I joined the U.S. Army as a commissioned officer in the health area. The contrast was like a breath of fresh air because the military provides wonderful training and gives young people a lot of responsibility. I became the sanitary engineer on a military post in Virginia. I worked out of a hospital and was also in charge of preventive medicine.
The philosophy of prevention
Well, before you get too excited about the lessons I probably learned, let me tell you what we did. We immunized babies and adults going overseas. The idea was to prevent illnesses. One of the things that I did was raise flies. That’s right, I was raising flies and sending them in the hard-shell pupal stage to 2nd Army Headquarters, where they were allowed to develop into adults and were exposed to different insecticides so that the Army could tell me which insecticide would be most effective to use during the summer months. I did a number of things like this that were truly preventive.
In the Army, I learned that training could be very effective when new things were learned and then applied in realistic circumstances. I also learned that young professionals should be assigned to areas toward which their education and natural inclinations lean. Finally, I learned that if we are creative enough, we can prevent bad things from happening.
After I was discharged into the reserves, I returned to my previous company, but to a recently built facility in Virginia. Again, despite my training as a chemical engineer, I was placed in maintenance and I soon graduated to an authority position. To me that meant I had some space to try my own ideas.
Trying new ideas
In 1957, I purchased a rather heavy electronic box used to analyze equipment vibrations. Using this box, I was able to identify equipment problems before they caused us downtime. This is certainly old hat to you, but in the 1950s it was a revelation. The early warning systems allowed us to prepare for the fix in a way that considerably reduced downtime and sometimes eliminated the need for it.
During this time, I researched other nondestructive tools as they became available. With sonic equipment, we were gauging the thickness of pipes and tubes and with infrared thermography we were identifying furnace and heater problems as well as the condition of our roofing systems. This was great fun. And since process uptime, and consequently revenues, were rising, management was supporting my efforts—although I had to constantly remind them what our contribution was. As you know, when something good is happening there are always many people standing in line to take credit for it.
Using assets wisely
In early 1960 I was sweet-talked into a temporary move to another plant where I entered a new world. This was my employer’s largest facility. It had one batch and three continuous polymer facilities and a very large operation to produce various synthetic fibers used in products such as tires, carpets, and drive and conveyor belts.
My first assignment was to build a facility to make dies to extrude the polymers. It was my job to see to the building of the facility, the technology transfer from Italy for the manufacturing facility, and the hiring and training of people to do the work. Remember, I was a trained chemical engineer and although the assignment was interesting it was strictly mechanical.
The holes in these dies were very small, some so small they could not be seen with the naked eye. And to further complicate the project, most of the holes were not round, but Y-shaped and dumbbell-shaped. And each of the holes in a die had to be exactly like the 50 to 100 other holes in that die and the holes in other similar dies.
We needed to acquire tools to make these holes but nobody sold tools that small at that time so we manufactured them on jeweler’s lathes under microscopes. Each tool took an hour to make and held up for less than 10 holes, yet we had orders to produce thousands of holes.
I used two of my assets to correct this problem. One was an engineer who had an insatiable curiosity and the other was a mechanical design genius. I developed specifications for what we needed, gave them to the engineer, and told him to find a machine that would make the tools in no more than 3 seconds. He spent a couple of months traveling throughout western Europe and finally found something close to what we needed in Switzerland.
He brought the machine back with him and my in-house mechanical design genius modified it to fit our specifications and we began making each tool in 3 seconds.
As I stated before, I learned that I would get the best results from technical resources if they were used in areas that they were most interested in.
Leveraging information technology
Soon after this I was promoted and became the head of engineering, maintenance, and utilities for that facility, where there were 20 producing cost centers. Each one had a supervisor who in turn had a cadre of production workers and a small group of maintenance workers whose job it was to quickly fix problems on the run when they were called upon. This type of maintenance service was pretty common in the fiber industry at that time. This was not a very efficient use of labor because when the equipment was running smoothly, the labor was idle.
I suggested that we develop software (remember this was the early 1970s) where production workers could request these jobs on handy workstations. The work requests would immediately go to a centralized computer that would prioritize the requests and direct a mechanic who had the needed skills to address the job. Since the computer would know where each of the maintenance craftsmen was working, this would be an efficient application of manpower.
The facility had more than 3000 employees who had to be trained to enter the needed information at the input stations. So if an operator found that yarn was breaking and wrapping on a position on one of her stations, she would enter that information into the computer.
The computer would be receiving requests from the entire facility and prioritizing them so that the jobs that would provide the greatest safety and financial return would be done first.
Since the operator’s request also identified the skill needed, the computer would select a person with the required skills and the shortest travel time to perform the task.
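As a rough illustration of the dispatching logic just described (not the plant's actual 1970s software), a minimal Python sketch might look like the following. The class names, the safety score, the hourly-loss figure, and the rule that safety outranks financial return are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class WorkRequest:
    position: str        # machine position that needs attention
    skill: str           # craft skill named in the operator's request
    safety_risk: int     # hypothetical score, higher means more urgent
    hourly_loss: float   # hypothetical dollars lost per hour of downtime


@dataclass
class Mechanic:
    name: str
    skills: set
    travel_minutes: dict  # position -> minutes from the current location
    busy: bool = False


def prioritize(requests):
    """Order requests by safety first, then by financial return (assumed rule)."""
    return sorted(requests, key=lambda r: (-r.safety_risk, -r.hourly_loss))


def assign(request, mechanics):
    """Pick the free mechanic with the needed skill and the shortest travel time."""
    candidates = [
        m for m in mechanics
        if not m.busy and request.skill in m.skills
    ]
    if not candidates:
        return None  # nobody suitable is free; the request has to wait
    chosen = min(candidates, key=lambda m: m.travel_minutes[request.position])
    chosen.busy = True
    return chosen


def dispatch(requests, mechanics):
    """Return (position, mechanic name or None) pairs in priority order."""
    plan = []
    for request in prioritize(requests):
        mechanic = assign(request, mechanics)
        plan.append((request.position, mechanic.name if mechanic else None))
    return plan
```

In this sketch a request that finds no free mechanic is simply reported as unassigned; a real system would presumably queue it until someone with the right skill became available.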
When the mechanic arrived at the position needing maintenance, he signed in to the input device. After he completed the task, an accounting of the materials used would also be entered, and the computer would automatically see that the supplies were replenished. Once the task was completed, the operator would report to the computer the time when the position was restarted and production resumed.
Now we had captured the exact time that machine or machine position was down, the time it took for the mechanic to arrive, the elapsed time of the repair, and the lost time, if any, to restart the machine or the problem position. And this was done with 1970 technology.
Boldness opens dialogue
At the time I requested the computerized process, it was a bold and perhaps seemingly outrageous move. I have found that a really good strategy for large ideas is to wrap them in a bold and outrageous package. If you have developed a reputation for materially helping to improve output and lower costs, management is reluctant to turn you down. Boldness and outrageousness open dialogue while incremental improvements lose their luster very easily, if any luster was there in the first place.
Later in my tenure at this facility, I learned that if you wanted to reduce production costs you must produce larger packages. So I recommended to my engineers that perhaps we could spin a package that was 60 lb, double the present size. We did it and then transported these larger packages and loaded or creeled them on to drawtwisting machines.
Each drawtwisting machine was producing 72 5-lb packages at the time. I wanted the biggest package we could produce. My mechanical design genius said that he believed he could design a 20-lb package. We also decided that instead of hauling the 20-lb bobbins to the next operation, we would devise a conveying system that would become the creel for the next operations.
This was a bold move. The vice president wanted to support it because of its tremendous savings and his confidence in our abilities, but at the last minute he got cold feet and purchased thousands of 10-lb bobbins because his advisers were telling him that it would be impossible to develop a 20-lb bobbin that would not crush under the forces created by the nylon yarn winding up on the flanged spools. Because of this design concern, management decided to take the bobbin design away from us and give it to the central engineering department designers, who had more experience with bobbin design. The vice president purchased the 10-lb bobbins because of the failure of their designs.
I asked my designer if he could make a 20-lb prototype bobbin that would work and not crush. His first prototype performed as we intended. This was a severe blow to central engineering management.
I believe that the central engineering bobbin designers had developed paradigms of what would work and what would not work. My designer was a mechanical expert who had not designed a bobbin before so he had no built-in restraints.
Paradigms are extremely powerful. Although they can sometimes provide order, they can also provide obstacles.
Failure analysis pays off
While all this design work was going on, I had grown my maintenance engineering staff to about 12 engineers of different disciplines. This group of fine professionals entered into the world of failure analysis among other things. First, they gathered information about methods and techniques that were being used in the aircraft industry. These techniques used probabilities to forecast failure events. But I thought, why do that when we had plenty of actual failures to study and we could generate real information? As a result, my young engineers and I began to develop methodologies of our own.
Applying all the methods of reliability engineering that we assembled and developed resulted in our plant polymer and fiber processes operating at a very high level of on-stream time. For example, our four polymer processes were on stream an average of 98 percent of the time for the 10 years that we kept records. Many people manipulate uptime figures by excluding times they feel that they were not responsible for or by other ways, but my figures always accounted for every incident of downtime and for every hour of the year.
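To make that accounting rule concrete, here is a small Python sketch of an on-stream calculation in which every downtime incident counts and the denominator is every hour of the year. The function name and the sample incident durations are hypothetical and are not taken from the plant's records.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours in a non-leap year


def on_stream_fraction(downtime_hours, total_hours=HOURS_PER_YEAR):
    """Fraction of the year on stream, with no downtime incident excluded."""
    down = sum(downtime_hours)
    if down > total_hours:
        raise ValueError("downtime exceeds the hours in the period")
    return (total_hours - down) / total_hours


# Hypothetical incidents for one process: about 175 hours down in a year
# corresponds to roughly 98 percent on-stream time.
incidents = [40.0, 100.0, 35.2]
print(f"{on_stream_fraction(incidents):.1%}")
```

Dropping any incident from the list, for instance one blamed on another department, raises the reported figure without changing what the plant actually produced.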
This was a remarkable achievement but I learned one or two important human concepts:
1. It is difficult for most people to accept large ideas and big accomplishments. It is like throwing a $10,000 bill on the ground of a busy street. People see it but no one picks it up because they cannot accept that it is real.
2. When large achievements are made, only really great managers and executives will acknowledge the originator. Some are afraid to give that much attention to someone else. Many others are reluctant to acknowledge the originator of the idea because they are afraid of alienating the support group that worked on the accomplishment.
Well, because of my track record the corporation decided to move me to their R&D operations at their central headquarters to continue the development of this new reliability engineering technology and spread it to the rest of the corporation. I refused to move on the basis that the company had three producing facilities where I was and that was a better laboratory than the pristine facilities at the home office. They bought my argument.
In 1972, the Reliability Center was established to further develop reliability concepts and to spread these concepts to the entire corporation. I directed and managed this operation. In the years that followed, we continued our development of reliability techniques and we consulted and introduced our methods into most of the company’s chemical plants in the United States.
More lessons learned
This is some of what I learned in those years:
• Challenge parochial pride to develop innovative approaches to improve performance. One way to do it is to suggest that, in lieu of their own efforts, outsiders will be brought in to help.
• Managers often lean on a confidant who usually has his own agenda. If performance is improving, the confidant may be needed as a competent sounding board for the manager. If performance has been deteriorating, it may be that the manager is getting poor advice.
• Everyone has an agenda. His agenda may or may not conflict with the goals of management. Steady improvement in facility performance is a good indicator that individual agendas support the managers’ goals. Steady deterioration in performance is a clear marker that they do not.
• Some employees like to know the rules and are quite content to follow them. Others need space and responsibility with accountability to perform. Take a lesson from the armed services and provide excellent and realistic training, responsibility with accountability, and great support. That is a formula for success, but only when it is given to the people who need space.
I worked in the United States when maintenance consisted entirely of repair-and-replace or rebuild activities. And it was very expensive. On reflection, it was probably a good thing because our inefficiencies provided employment for men and women who had returned from World War II.
As we moved into the predictive maintenance era and started to gain efficiencies, we were expanding our markets at home and abroad. As our population grew we used the evolving manpower to staff the new and expanded industries that were developed to meet the market demands.
Today, our maintenance efforts are developing toward prevention through proaction. You know I was using that word long before it made it into the dictionary.
Minor costs add up
In the 1950s, I studied all the routine jobs that our shift mechanics in the various crafts performed. I found a chain that drove a feeder that broke just about every shift and was routinely replaced. I found that very cheap sewer sampling pumps were being replaced routinely every week. I found a very large conveyor system that kept dropping material from its ore-carrying belt onto the emergency shutdown cords just about every hour and people had to be dispatched to restart the conveyor.
I found that these minor cost incidents occur in every operation. I have been in hundreds of manufacturing operations and I have seen minor mishaps like these that routinely cost a great deal of money. As we developed and honed root cause analysis, we found that these small occurrences made a major contribution to the much larger, more expensive mishaps such as equipment wrecks, fires, explosions, and major process upsets.
I also began to realize that human beings are not very good at recognizing where our largest costs emanate from. When we have a large explosion we all can appreciate that the company will be hit with a rather large cost. But if we amortize that cost over 10 years it is very likely not our largest cost. What is our largest cost over that same 10-year period? It is those small mishaps that really do not cost much when they occur. But because their cost is so small, there is no driving force to remove the cause.
What is missing is our ability to recognize frequency. A minor incident that occurs every hour or every shift or every week amounts to really big money. A minor failure that costs $100 to correct but occurs on every 8-hr shift can cost more than $100,000 each year. If we learn to do root cause analysis and eliminate the causes, we will prevent the bigger mishaps from occurring. The smaller mishaps may not be directly related to the larger ones, but their elimination reduces the noise in our systems and builds in discipline in the way we do things.
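The arithmetic behind the $100,000 figure can be checked with a short Python sketch. It assumes round-the-clock operation, 365 days a year, which the text implies but does not state; the function name is made up for the example.

```python
HOURS_PER_YEAR = 365 * 24  # assumes continuous operation all year


def annual_cost(cost_per_event, hours_between_events):
    """Yearly cost of a recurring failure, given how often it happens."""
    events_per_year = HOURS_PER_YEAR / hours_between_events
    return cost_per_event * events_per_year


# A $100 fix needed once per 8-hr shift: 3 shifts x 365 days = 1,095 events.
print(annual_cost(100, 8))  # 109500.0
```

The same $100 failure occurring once an hour, like the conveyor shutdowns mentioned earlier, would come to `annual_cost(100, 1)`, or $876,000 a year under the same assumptions.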
Focus on root cause analysis
If I project what I have seen in my lifetime, I believe that the use of root cause analysis will intensify as industry, banking, healthcare, and government see its usefulness in bettering our society. I believe that the use of root cause analysis only to satisfy compliance with laws and/or standards will eventually get the bad name it deserves.
Further out, I believe that eventually predictive maintenance will yield to true root cause analysis and be displaced by it. I probably will not see it but it will come.