Causes interact in industrial accidents and life
Before discussing failure causes, it is helpful to understand why the idea of process safety management (PSM) has taken hold in the past few decades. Applicable background information can be found in the literature,1 which shows how process reliability and equipment reliability are intertwined to the point of being inseparable. The literature explains this intertwining by referring to the Bhopal incident that took place in December 1984 (FIG. 1). With a loss of many thousands of lives, Bhopal still stands as the most tragic single industrial disaster in human history.
FIG. 1. The 1984 Bhopal disaster, an intense toxic methyl isocyanate (MIC) gas leak from a pesticide plant, killed thousands and remains the world’s worst industrial disaster.
The referenced literature describes how the Bhopal pesticide plant was operating at its inception, and goes on to mention that the importance of designing and operating with safety in mind was appreciated by all involved in the manufacturing process. Here, as in other facilities, PSM was used to exercise a sufficient level of control to avoid the accidental release of materials involved in the process. All personnel subscribed to the goal of protecting their mutual interests by safely managing the industrial process involved. Indeed, safety was the foremost commitment in mind.1 Nevertheless, the accident that occurred affected the lives of thousands.
Learning from the past
Without engaging in largely academic research, common sense and personal experience can be used to make the argument that failure causes often interact or are interrelated. Interpersonal relationships and interactions are among the constants in people’s lives as they perform their respective roles and functions in industry, government, travel, sports and many more activities. Finding the root causes for incidents and accidents and then pursuing follow-up or corrective actions are of crucial importance. The right steps must be taken if future setbacks are to be avoided.
Members of academia and industry leaders have worked diligently to define and document the right steps and the very concept of “root cause.” The author is among those who have consulted competent sources, and one organization in particular (see note a) has put forth extensive effort to arrive at the best universal definition of “root cause.”
Root cause analysis
In early industrial environments, the term “root cause” seems to have had no clear definition, yet it was used intuitively in industry. Knowledgeable industry experts explained what they called a “root cause,” but significant variation existed between these experts’ definitions. This organization began researching and developing a definition for “root cause,” and chose to define it as “the most basic cause(s) that can reasonably be identified and that management has control to fix.” It was implied that researchers and industry would endeavor to identify the problem roots (the most basic causes) in conscientious efforts to avoid the risks associated with failures of all kinds.
In the 1990s, the organization settled on a slightly modified definition: A root cause is the most basic cause (or causes) that can reasonably be identified that management has control to fix and, when fixed, will prevent (or significantly reduce the likelihood of) the problem’s recurrence.
Taking action
The above definition considers and identifies the purpose of the solution or needed remedial action, which must be considered from beginning to end. In the beginning, an existing condition was poor or risk-prone and led to an issue that must be fixed or remedied. At the end, the will to prevent a recurrence of the problem must be present. Concerted action taken to prevent recurrence means nothing short of full avoidance of more problems, which is the same as saving, preserving and protecting the fullest possible range of assets. Finally, assets are both physical and human/intellectual, and they include facilities, equipment, personnel, community goodwill and company reputations.
By studying a problem, it may be determined that several basic root causes can be reasonably identified. Management has the control to fix them and therefore prevent (or significantly reduce the likelihood of) the recurrence of the problem. It is implied that management actions will both fix the causes and maintain the as-fixed condition, regardless of what flaw-inducing root causes were associated with the original problem.
Competent and knowledgeable people must be involved every step of the journey. Competence needs to be maintained and assessed or audited. A measure of training should be organized and managed. In each of the previously mentioned definitions, the common denominators for finding solutions—the “movers and shakers” that cooperate to carry out the correction of mistakes—are the managers.
Incidents are rarely due to a single root cause, and things can become unwieldy when a single root cause results in more than one loss. Over the years, and especially as more experience is accrued in the field of root cause-related work, the trend has been to talk about root causes that affect only one loss. Methods of teaching root cause analysis have been devised in the past decades.
The compound
Still centered on the loss: the problem or incident damages, at least partially, what philosophers call “the compound.” In the real world, “compound” relates to materials and formal designs, both in real-world terms and real-world experiences.
Assume that something interrupts a process or its movement, preventing it from reaching its initially intended goals, which are usually labeled as “design goals.” With each incident, a root cause must be determined. Conversely, if no incident occurs, then a best practice has been followed and nothing has interfered with the process movement.
By spoiling raw material or disregarding a well-developed formal design that comports with science, industry risks the destruction of assets, both physical and human. Personnel fatalities are clearly tragedies of the first order, and they can result if the process movement is somehow interrupted or if the process is no longer contained. The environment in which we live can then be affected detrimentally. To claim otherwise is unsupported optimism and is typically at odds with science. Proclamations at odds with science are often accepted by the indifferent and willfully uninformed; once they morph into alternative facts, they can present an immense danger.
Re-examining the definition
Considering these points, the initial definitions were ready for re-examination. By 2005, the organization realized that the definition they had developed for a root cause in an incident investigation had negative connotations. While arguments ensued about management’s ability to fix the problem, the focus was meant to be on the fact that the problem was fixable. An improved definition that focused on improvement was needed. The organization published another version of their definition of root cause that stated, “Root cause is the absence of a best practice or the failure to apply knowledge that would have prevented the problem.”
The new definition’s focus is more positive and is meant to seek out best practices and to apply knowledge, or wisdom, to prevent problems. The root cause-finding model steers clear of looking for people to blame, or for flaws in management. With their eyes set on improvement, organizations must look for ways to perform work more reliably, an approach that concentrates on the dissemination of best practices, methods and procedures that lead to good long-term results.
These positive methods and procedures are definable and repeatable. To carry them out, personnel must adhere to fundamental ethics and common sense. Their judgment cannot be clouded by greed or subverted by claims of being imbued with superior gifts and insights unavailable to anyone else. True best practices must appeal to our noblest impulses, which are not fully aligned with pleasing someone’s ears, but rather carry the will to do what is right and to accept proven science. In this context, acceptable science refers to that which relates to the formal design or cause.
It follows that, in the pursuit of failure analysis and prevention, a scientifically minded person will make an exhaustive effort to consider every shred of applicable knowledge and wisdom. By recalling and then consistently practicing wisdom, we make applied knowledge the focal point of every failure avoidance endeavor. We can then rightfully see ourselves as contributors and value-adders. Having knowledge and acting on this knowledge are imperatives in any job position. To seek out best practices and put them in motion, the motives must be mastered in people’s minds and etched in their powers of logical reasoning before carrying out or implementing both improvement and remedial action.
A better definition?
The author asserts that this relatively recent definition sounds somewhat absolute. The words “would have prevented” could probably benefit from the addition “or could have significantly reduced the likelihood of.” This thought is an acknowledgement of real life, where it is often impossible to guarantee that a problem will be prevented from ever happening again. However, the “could have…” add-on was not advocated because the emphasis should remain as definite as possible. Statements that incorporate or convey “could have” and “might have” often lose potency. Selecting low-potency definitions for root cause is undesirable.
A focus is needed on the materials with which the industry works. These could be solid, liquid or gaseous substances. Although present regardless of what definition is involved, these substances often remain hidden or somehow unnoticed. Because one or more of these material substances is always involved, the dual meaning should not be disregarded: While these are the physical materials with which the process deals, their presence in failure events may well make them the material cause or key contributor.
The goal in determining root causes is to maintain the integrity of the process and the stability and correctness of its movement. The definition has evolved from reducing the problem’s recurrence to producing something, improving that process and then establishing best practices to ensure safety and optimum performance.
NOTES
a TapRooT
LITERATURE CITED
1 Bloch, K., Rethinking Bhopal: A definitive guide to investigating, preventing and learning from industrial disasters, Elsevier, 2016.
The Author
Grisi-Frisbie, J. - Sinergia International, Mexico
Jose Grisi-Frisbie’s academic quests led him to study engineering in Mexico City (UNAM University), and he later obtained an MS degree in business at ITESM/Monterrey. He developed his cause analysis expertise mainly in sales positions, including representing leading manufacturers of lubrication devices for mining machinery and parts in Mexico. In his present job, he manages Sinergia Inerpersonal, a sales and representations office. For the past 12 yr, he has promoted root cause analysis courses for the Knoxville, Tennessee-based TapRooT organization.