Unplanned production deferment events are a significant obstacle to operational excellence. The compound effect of corrective maintenance costs coupled with periodic loss of production causes operators to overrun budgets and miss out on market opportunities. On top of this, some events may have the potential for consequential injuries and environmental impacts.
Repairing a failed asset is critical, but preventing the failure from recurring is just as vital. Root Cause Analysis (RCA) is a valuable tool that can uncover the factors contributing to a production deferment event, or to a near-miss that could have caused a loss in production.
Why front-line investigations may not be enough
Following a failure event, the fault finding and repair of an item of equipment will reveal which component has failed, e.g., a seal or bearing on rotating equipment. However, this information does not tell the whole story. Only a more in-depth investigation can reveal all the factors that contributed to that failure.
This principle applies in everyday life, not just to industrial equipment. For example, a car mechanic can replace the brake pads of a vehicle every time they wear out. However, where the interval between replacements is repeatedly shorter than expected, the wear could be caused by one factor or by a combination of factors. These may include driver behaviour, brake disc condition, pad selection, improper installation, pad contamination, and vehicle suitability for the terrain or loading.
Front-line investigations for production machinery typically address only the immediate causes of failure, leaving other factors undiscovered, such as the environment, work procedures, human factors and even organisational structure. These factors are likely to persist after the repair and to cause repeat failures in time. Merely repairing or replacing the failed component will not reveal underlying causes.
Several causes usually contribute to a single top event, so the relationship between the event and its causes is one-to-many. Only through the application of RCA techniques can all root causes, and other related findings, be uncovered.
Unfortunately, many organisations now run so leanly that they have little or no ability to dedicate resources to investigating failures adequately. As a result, failures are addressed one by one, opportunities to learn and improve are not seized, and operational performance is consequently sub-optimal.
An overview of the RCA process
Several RCA tools exist in the market, but they all follow a similar process of asking the question "Why?" at each step of the investigation. Within five steps of asking "Why?", even the most complex failure scenarios can be resolved, as the immediate causes, underlying causes and root causes of an incident are systematically uncovered.
However, conducting a successful RCA goes further than asking "Why?" Here are the steps that describe a typical RCA process:
- Create a clear and concise incident statement to define the scope of the investigation.
- Plan the investigation to identify potential contributing factors and to provide initial focus on specific areas.
- Investigate the incident by collecting relevant data and developing a storyboard.
- Analyse the data by interrogating each cause until the root causes are uncovered.
- Report findings for communication to stakeholders.
- Make recommendations that will prevent recurrence if implemented.
- Take action and review.
If an RCA is completed by an independent specialist, such as Vysus Group, the operating company takes ownership of the implementation and the follow-up reviews.
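The analysis step, in which each cause is interrogated until the root causes are reached, can be pictured as walking a cause tree. The following is a minimal Python sketch rather than a description of any particular RCA tool. The `Cause` class and `root_causes` function are hypothetical names, and the chain of causes is invented for illustration, loosely based on the brake pad example earlier in this article.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Cause:
    """One node in a cause tree: a statement plus the answers to 'Why?'."""
    statement: str
    because: List["Cause"] = field(default_factory=list)


def root_causes(node: Cause, depth: int = 0) -> List[Tuple[int, str]]:
    """Return (number of whys asked, statement) for every cause that has
    no deeper cause recorded beneath it."""
    if not node.because:
        return [(depth, node.statement)]
    found = []
    for deeper in node.because:
        found.extend(root_causes(deeper, depth + 1))
    return found


# Illustrative cause tree only; the causes below are assumptions.
incident = Cause(
    "Brake pads wear out sooner than expected",
    [
        Cause("Pads overheat in normal use", [
            Cause("Pad grade does not suit the vehicle's loading"),
            Cause("Driver brakes late and hard"),
        ]),
        Cause("Pads wear unevenly", [
            Cause("Pads were installed improperly", [
                Cause("No check is made after fitting"),
            ]),
        ]),
    ],
)

for whys, statement in root_causes(incident):
    print(f"{whys} whys deep: {statement}")
```

The sketch shows one incident statement branching into several root causes at different depths, which is why replacing the worn component alone leaves the other branches untouched.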
The benefits of an RCA – Case Study
Vysus Group recently completed an RCA for a production-critical pump failure in an offshore application. The operator had modified three pumps to allow either a single-pump or a dual, in-series pump configuration to be used. A mechanical seal failed shortly after a pump was switched to single mode, having run for some time as the second pump in the dual-pump configuration.
The obvious front-line finding was that the seal had failed in operation. However, our RCA uncovered five significant contributory factors.
- There were no adequate documented operating procedures that covered the change between pump operating modes. In their absence, pump commissioning routines were being used that did not cover key operational sequences.
- Barrier fluid pressure had not been changed when the pump mode was changed. The barrier fluid pressure was too high (>400%) for the single pump operating conditions, contributing to the failure.
- In the design stage, there was a failure to consider high barrier fluid pressure as a threat, which led to its exclusion from control logic. The pressure inputs to the control system were available but not incorporated in the start-up sequence.
- Pipework was reportedly pulled onto the pump when the modification was done, which affected pump/motor alignment.
- There was no pump/motor alignment check after installation, which could have identified misalignment before commissioning.
Only a detailed RCA could uncover all these root causes and develop the recommendations to rectify them. By implementing the RCA actions, the operator was able to prevent recurrence of this incident and improve the reliability of the water injection system.
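To illustrate the control logic finding above, where pressure inputs were available but not used in the start-up sequence, the following Python sketch shows the general idea of a start-up permissive that checks barrier fluid pressure against the selected pump mode. It is not the operator's actual control logic. The names `PumpMode`, `BARRIER_PRESSURE_LIMITS_BAR` and `start_permitted` are hypothetical, and the pressure bands are placeholder values that do not come from the case study.

```python
from enum import Enum


class PumpMode(Enum):
    SINGLE = "single"
    DUAL_SERIES = "dual_series"


# Placeholder allowable barrier fluid pressure bands in bar (low, high).
# These numbers are assumptions for illustration only.
BARRIER_PRESSURE_LIMITS_BAR = {
    PumpMode.SINGLE: (10.0, 20.0),
    PumpMode.DUAL_SERIES: (60.0, 80.0),
}


def start_permitted(mode: PumpMode, barrier_pressure_bar: float) -> bool:
    """Allow start-up only if the measured barrier fluid pressure lies
    within the band defined for the selected operating mode."""
    low, high = BARRIER_PRESSURE_LIMITS_BAR[mode]
    return low <= barrier_pressure_bar <= high


# A pressure left at the dual-mode setting blocks a single-mode start.
print(start_permitted(PumpMode.DUAL_SERIES, 70.0))
print(start_permitted(PumpMode.SINGLE, 70.0))
```

A check of this kind would only address one of the five findings; the procedural and alignment findings need their own actions.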
The cost/benefit balance of an RCA is invariably weighted in favour of performing the analysis.