Many years ago, we were introduced to the world of continuous business improvement methodologies like: LEAN, Kaizen, SMED / TAKT, 5S, 6-Sigma / SPC / TPM / TQM, 5-Whys / RCFA / RCA, FMEA / FMECA / Risk Analysis / Pareto / PMO / MTA, Fishbone Diagrams, Kanban / PdM/CBM, Poka-Yoke, Just in Time (JIT), MRO / Critical Spares Analysis, Culture Change Management / KPIs / Balanced Scorecard, and probably several more that we have not mentioned.
Our work with all these methods was in the realm of trying to get more production from a piece of equipment or process at no additional cost, and ideally less. These methods were taught and applied in a ‘toolbox’ fashion, meaning that each methodology was just one approach among many, to be selected and applied where and when appropriate. However, this is where there was always a struggle – which method to pick and why?
Now depending on the problem, the selection of the first method was generally straightforward. For example, say we want to increase the production output of a product packaging line. Using LEAN and measuring output over a period of time is going to be critical to us, which leads us to TAKT and/or SMED analysis.
The problem was that we would always run into some other challenges to be overcome such as ‘waste’. But what kind of waste? …Overproduction? …Underproduction? …Off-quality? …Process waste? …and so on. Now, each of these wastes would drive us to the other methodologies…
- In the case of overproduction, we would look into JIT to reduce inventory and Work In Process/Progress (WIP) in the manufacturing process.
- For underproduction, we would have to find out the reasons ‘why?’ we were under-producing, which tended to be many. Is it because of delays in the production process? Or is it because of unnecessary activities of workers, assets, or materials in the production process? That would drive us to brainstorming and fishbone diagramming. We often found that people made mistakes or the machinery was not running well, and that would lead to 5-Whys/Root Cause Failure Analysis (RCFA) with an FMEA / FMECA to capture the results.
- If we had high Process waste or a high reject rate, we might look at Kaizen and TPM coupled with a 5S program. If we were making off-quality product, then we would enter the realm of 6 Sigma and TQM in an effort to eliminate defects.
The problem with the work we did for many years in continuous improvement was that there was not a fundamental guiding strategy of how these very different (yet useful) methodologies all fit together. To say it another way – All of these methodologies are just ‘Tactical’ tools in the proverbial toolbox.
Now, there’s an old expression that says: “If all you have is a hammer, everything looks to you like a nail!” So, suppose you have a screw that needs to be driven in. You could hit it like a nail, but most of us know that is not a good solution. So, what if you use the tip of the claw end of a claw hammer as a flat screwdriver?
You probably aren’t using the best tool for the job, even though it may work somewhat. So, what would cause a person to use a hammer as a screwdriver? Here are some possible answers:
- The lack of a proper screwdriver when the hammer was ‘handy’, or
- The lack of a principled understanding of how each tool in the toolbox fulfills a specific purpose and how each tool is interconnected.
What we’re getting at is this – most were never taught nor had ever seen how to ‘strategically’ apply the continuous improvement methods. The continuous improvement methods were always applied ‘tactically’ at individual problems. They were not applied ‘holistically’ in a disciplined and principled strategic plan.
We need to tackle not just a single problem, but an array of issues that are uniquely related to an entire process, system or subsystem. It was not until our introduction to the Aladon RCM2 methodology that we found a strategic model that formally orchestrates all of the continuous business improvement methods together into a comprehensive strategic framework.
In an RCM2 analysis, the RCM2 facilitator behaves like a conductor leading an orchestra through a complicated musical score. Depending on what’s required at the time, the RCM2 facilitator will focus on a particular continuous business improvement method over all the others. Then, as the analysis Team proceeds and a different need arises, the RCM2 facilitator may shift to another continuous business improvement method.
Best of all, RCM2’s usage of these continuous business improvement methods is done ‘organically’ within the RCM2 process, so that individuals not formally trained in LEAN, TQM, 5S, Kaizen, 6 Sigma, etc. will learn how to use these methods naturally as part of the RCM2 process. As long as you are being led through an RCM2 analysis by a Certified Aladon RCM2 facilitator trained by a Certified Aladon RCM2 Practitioner, no additional training is required for the analysts beyond the Aladon 3-day RCM2 Introductory course.
The remainder of this discussion will be to generally describe why and where some of these continuous improvement methods are utilized in the process of the RCM2 analysis.
Kaizen
Kaizen is a compound Japanese word made up of ‘Kai’ & ‘Zen’. Kai has many meanings, one being “to change or restore”. Zen means “good” or “better”. So, loosely translated, Kaizen means “Better Change” or “Good Restoration”. Generally, the principles of Kaizen include the following:
- The employees working closest to a problem are the subject matter experts, so they are the best equipped to solve any problems that arise.
- Act now, even if the improvement is only small. Some improvement is better than the status quo.
- A multidisciplinary team approach to problem solving will result in the best solutions.
- Management must support and empower the team to take action, and give them a clear mandate.
These Kaizen principles are inherently supported in RCM2. The RCM2 analysis is always performed by those closest to the production process. The RCM2 Group is made up of a multidiscipline team and fundamentally needs management support for activities.
However, RCM2 takes the process a few steps further. For instance, RCM2 establishes at least nine principles of reliability that must be understood before any process or system can be improved with sustainable results.
Furthermore, RCM2 takes the idea of employee empowerment and involvement to a higher level of sophistication. Kaizen efforts that we were involved in over the years always struggled to sustain the initial results. This was primarily due to the outcome of the Kaizen being based on an incorrect understanding about the nature of the equipment and how operators and maintainers interact and behave with it.
Insight about equipment nature and our behavior with it is given to the user through the RCM2 process. RCM2 provides an asset-focused context, which helps us reduce or eliminate the consequences of failure to a safe minimum.
SMED / TAKT
SMED & TAKT are time-related measures or ‘indicators’ used to define the ‘pulse’ of a company’s operations. These two measures are founded on production task completion and production cycle time.
SMED stands for Single Minute Exchange of Die and is a measure used to determine the minimum time it takes to ‘change’ a machine over to run a different product. TAKT comes from the German word ‘Takt’, the beat or tempo that an orchestra conductor’s baton keeps. So the TAKT sets out the tempo of the needed rates of production.
Both SMED & TAKT are very much alive within the body of RCM2. They are found at the first level of the RCM2 analysis – the establishment of the function performance standards. Any asset or system is purchased and put into operation to fulfill a function.
For example: The primary function of a milling machine might be: ‘To finish mill a work piece to a depth of 0.500 inch ± 0.050 inch’. A secondary function of the milling machine might be: ‘To retool the milling machine in not more than 3 minutes by a normally skilled & trained operator.’
At this point in RCM2, we would establish the minimum and maximum performance standards that are associated with SMED and TAKT measures. So SMED & TAKT analyses are natural ingredients of the RCM2 process.
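As a rough illustration of how such performance standards might be recorded and checked, here is a minimal Python sketch. The `PerformanceStandard` class and its field names are our own hypothetical choices, not part of the RCM2 method; the limits are taken from the milling machine example above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PerformanceStandard:
    """A function's performance standard as a lower and an upper bound (hypothetical structure)."""
    name: str
    lower: float  # minimum acceptable value
    upper: float  # maximum acceptable value
    unit: str

    def is_met(self, measured: float) -> bool:
        """Return True when the measured value lies within the bounds."""
        return self.lower <= measured <= self.upper


# Primary function: finish mill to a depth of 0.500 inch +/- 0.050 inch.
mill_depth = PerformanceStandard("mill depth", 0.450, 0.550, "inch")
# Secondary function: retool in not more than 3 minutes (a SMED-style standard).
retool_time = PerformanceStandard("retool time", 0.0, 3.0, "min")

print(mill_depth.is_met(0.520))  # a depth inside the tolerance band
print(retool_time.is_met(4.0))   # a four-minute changeover misses the standard
```

Any measurement that falls outside the bounds would count as a functional failure against that function statement.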
5S
The Five S’s often show up in RCM2 during the function, functional failure, failure mode or failure effect development stages in the analysis…
- Sort,
- Scrub / Shine / Sweep,
- Systematize / Set / Straighten,
- Standardize and
- Sustain.
More precisely, the lack of a workplace that is sorted, straightened, scrubbed, systematized, etc. is often identified during RCM2 failure mode or failure effect development. Function development will define the all-important performance standards for the equipment sub-system.
Also, since RCM2 is a living program, a reference task is added in the CMMS System to regroup the RCM2 analysis team at least once annually to review the asset’s analysis for any changes. This routine work order will ensure continuous improvement, sustainability of the RCM2 process and, ultimately, the reliability of the physical asset sub-system.
Finally, 5S shows up in the RCM2 default actions where our review group has identified a credible failure mode, but no PM task to address it. An example is our milling machine retool changeover taking four minutes instead of three because steps in the setup process are being missed by the operators. No preventative maintenance can address this failure mode, but an RCM2 default action calling for a standardized check sheet or standard operating procedure will.
6-Sigma / SPC / TQM / TPM
When too much rework or a high scrap rate exists, it may be warranted to consider 6-Sigma within an SPC (Statistical Process Control) context. The overarching objective would be to use these in a larger TQM / TPM (Total Quality Management / Total Productive Maintenance) program to reduce or eliminate defects and off-quality product. (See Kanban below, as it may be used to trigger N.W.A.C. signals indicating that product quality is drifting.)
RCM2 entirely satisfies 6-Sigma and SPC as it inherently supports the definition of an SPC system. (See page 27 of RCMII.) This is done by using the P-F curve as the means to communicate the Potential failure to Functional failure relationship. (More about P-F curves another time.) See below for a simple example of a number of production runs and how the normal distribution can ‘drift’ off-spec:
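To make the idea of drift concrete, the following Python sketch flags readings that fall outside three-sigma limits derived from an in-control baseline. It is a simplified check under our own assumptions, not a full control chart: the data are invented, and the function names `control_limits` and `out_of_control` are hypothetical.

```python
from statistics import mean, stdev


def control_limits(baseline):
    """Return (lower, upper) three-sigma limits from in-control baseline data."""
    centre = mean(baseline)
    sigma = stdev(baseline)
    return centre - 3 * sigma, centre + 3 * sigma


def out_of_control(values, limits):
    """Return the indices of values that fall outside the control limits."""
    lower, upper = limits
    return [i for i, v in enumerate(values) if v < lower or v > upper]


# Hypothetical mill depths (inch) measured while the process was in control.
baseline = [0.500, 0.502, 0.498, 0.501, 0.499, 0.500, 0.503, 0.497]
limits = control_limits(baseline)

# Hypothetical later runs in which the depth is drifting upward.
later_runs = [0.501, 0.504, 0.507, 0.511, 0.515]
print(out_of_control(later_runs, limits))
```

In this made-up data the last three runs breach the upper limit, which is the kind of early warning that a potential failure on the P-F curve is meant to give.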
RCM2 also supports TQM and TPM since Aladon RCM2 Practitioners train and certify RCM2 facilitators to write Operating Contexts. The Operating Context delineates many things, not the least of which includes the following:
- Company management commitment to not only the RCM2 pilot initiative, but a long-term reliability improvement organizational strategy.
- The new Culture Change mindset using MoC (Management of Change) and other change management techniques to measure and track results,
- The RCM2 Team empowered to lead and implement the proactive reliability program, (Especially the rites and rituals – see Culture Change),
- Goals to be achieved: more production/uptime, lower cost, fewer safety incidents and spills, better morale, knowledge harvesting for future trades, etc.,
- A thorough description of the physical asset sub-system: …batch or flow? …redundancy? …quality, environmental or safety standards? …shift schedules? …inventory? …labor repair time/costs? …critical spares? …market demand? …material supply? …process documentation? …etc.
5-Whys / RCFA / RCA
5-Whys and RCFA (Root Cause – Failure – Analysis) are related methods used throughout industry. 5-Whys repeatedly asks “Why?” to explore the reasons behind a defect / failure we are interested in resolving. The objective is to keep asking until we find the Root Cause of the problem. It is generally observed that you need to ask “Why?” five times to arrive at the root-cause of the problem.
However, practically speaking, what if asking “Why?” only three times sometimes arrives at the failure’s root-cause, and yet the next time it takes seven “Whys”?
Most folks know that asking “Why?” only a couple of times may lead to superficial and sometimes dangerous results. However, we have found that the main challenge of RCFA is the avoidance of ‘Analysis Paralysis’. Why? …because if you ask why enough times, you will always arrive at ‘Creation’!
We have seen 5-Whys being used prescriptively. That is, purposely and blindly asking “Why?” five times, whether or not it is warranted because they think they should. We have also known others to just ask enough times until they are satisfied the root-cause they have found will adequately resolve the problem within some reasonable conditions. The key is to know when to stop.
Aladon RCM2 facilitators are trained to know when to stop, and the same goes for RCFA. There is, however, a distinct difference between RCM2 and traditional RCFA.
In a traditional RCFA approach, one or maybe a few, likely root-causes of failure are sought for a single piece of equipment. In contrast, RCM2 is zero-based and seeks to find ALL credible failure modes (root-causes) for a process or sub-system, not just a few associated with one piece of equipment.
FMEA / FMECA / Risk Analysis / Pareto / PMO
The FMEA / FMECA (Failure Modes and Effects – Criticality – Analysis) are the means to capture the results of the RCFA / RCA. RCM2 innately includes this and extends its usefulness by also including the asset sub-system’s Functions and Functional Failures in the Information Worksheet. A genuine wealth of optimal trades’ knowledge of the asset / sub-system in a very compact ‘one-stop-shopping’ format!
Furthermore, the failure modes on the FMEA / FMECA are tied back to the original functions’ performance standards as detailed earlier in SMED and TAKT measures. This is done by identifying, prior to failure mode development, when a process or sub-system has ‘Functionally Failed’ to meet its user’s demands (performance standards).
Now, Risk Analysis is the process of using the corporate Risk Matrix (See below, Consequence vs. Probability) to determine which assets / sub-systems are critical to the organization and which are not so much. Failure of critical assets / sub-systems usually leads to very serious failure consequences like safety incidents, environmental breaches, releases to the air or nearby rivers, streams, lakes, etc., or may cost us an exorbitant amount of money to keep the plant running. The goal is to perform a reliability improvement initiative on the critical assets / sub-systems to reduce their Risk to the organization.
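A Consequence vs. Probability matrix can be sketched in a few lines of Python. Everything specific here is an assumption made for illustration: the 1 to 5 rating scales, the multiplication of the two ratings, the band cut-offs and the asset names. A real corporate Risk Matrix defines its own scales and thresholds.

```python
def risk_score(consequence: int, probability: int) -> int:
    """Multiply 1-5 consequence and probability ratings into a risk score (assumed scheme)."""
    for rating in (consequence, probability):
        if not 1 <= rating <= 5:
            raise ValueError("ratings must be between 1 and 5")
    return consequence * probability


def risk_band(score: int) -> str:
    """Map a score onto hypothetical bands; a real matrix sets its own cut-offs."""
    if score >= 15:
        return "critical"
    if score >= 6:
        return "moderate"
    return "low"


# Hypothetical sub-systems: (consequence rating, probability rating).
assets = {
    "boiler feed pump": (5, 4),
    "packaging conveyor": (3, 3),
    "office HVAC": (1, 2),
}
for name, (consequence, probability) in assets.items():
    print(name, risk_band(risk_score(consequence, probability)))
```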
In the RCM2 process, a Pareto analysis (the 80/20 rule or Asset Prioritization) is always performed on a list of the plant’s physical assets’ Performance Report. Our guideline is to select focus candidates from the top 20% of all the physical assets / sub-systems that cost us 80% of our annual spend. These so-called Bad Actors are ‘eating our lunch’, so-to-speak.
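The Pareto selection can be sketched as a ranking by annual cost followed by a cumulative cut-off. The Python below is only an illustration under our own assumptions: the asset names and dollar figures are invented, and `bad_actors` is a hypothetical helper rather than part of any RCM2 tool.

```python
def bad_actors(annual_cost_by_asset, spend_share=0.80):
    """Return the highest-cost assets that together reach `spend_share` of total spend."""
    total = sum(annual_cost_by_asset.values())
    ranked = sorted(annual_cost_by_asset.items(), key=lambda item: item[1], reverse=True)
    selected, running = [], 0.0
    for asset, cost in ranked:
        if running >= spend_share * total:
            break  # the assets chosen so far already cover the target share
        selected.append(asset)
        running += cost
    return selected


# Hypothetical annual maintenance spend per asset, in dollars.
costs = {
    "filler": 500_000, "capper": 320_000, "labeler": 30_000, "conveyor A": 40_000,
    "conveyor B": 30_000, "palletizer": 25_000, "wrapper": 20_000,
    "coder": 15_000, "air dryer": 12_000, "chiller": 8_000,
}
print(bad_actors(costs))
```

With these made-up figures, two of the ten assets account for 82% of the spend, so they would be the focus candidates.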
Commonly, RCM2 analyses are performed on the top 20% of all the assets / sub-systems because of RCM2’s thoroughness in finding all the likely Failure Modes. These are the critical physical assets in the organization that, should they fail, will likely put us out of business or severely cripple us.
The other 80%, the non-critical assets, will have a PMO or MTA (Preventive Maintenance Optimization, Maintenance Task Analysis) performed on them, which is less rigorous than RCM2. PMO or MTA evaluates an existing preventive maintenance program, evaluates its effectiveness, looks for critical omissions, synergies, opportunities and then repackages the results into a more effective program. (i.e. PM routes)
Nevertheless, it is worthwhile noting that world-class organizations choose to perform RCM2 analyses on ALL their physical assets! This way, there’s minimal risk of missing any failure modes that result in failure consequences.
Fishbone (Ishikawa) Diagrams
Developed by Dr. Kaoru Ishikawa, this is another popular technique used to identify possible causes for a problem or defect. This diagramming method groups possible causes of failure into the 6-Ms (categories) of production – Manpower, Methods, Measurement, Material, Machinery, Milieu (Environment) – or into the 4 categories of administration – Personnel, Policies, Procedures, Plant.
Fishbone diagramming is sometimes used in RCM2 when identifying likely failure modes to ensure that none of the above categories are missed. However, RCM2 offers a slightly different approach to the identification of the categories. RCM2 looks for reasonably likely (i.e. credible) failure modes, namely those that:
- Are currently being prevented by a preventative maintenance program,
- Have occurred on the same (or similar) equipment,
- Have not occurred yet but are considered real possibilities, and
- May not be likely to occur but whose consequences affect safety or the environment.
The philosophy of the RCM2 process is fundamentally different from Fishbone diagramming in another important way. In RCM2, it is not necessary to list all the intermediate failure modes (i.e. the bones of the fishbone diagram). Some consider drawing the other failure causes ‘in-between’ as insightful or interesting. However, in RCM2, we are concerned with productivity. As such, we document only the root causes that can lead to failure consequences.
In this way, documenting root-causes in RCM2 can be faster than Fishbone Diagramming.
Kanban / PdM / CBM
Kanbans are simple signals or status indicators that are typically used to start a supply chain replenishment or manufacturing process. However, Kanbans can also be used to alert operators and maintainers that an action is required based on the condition of the asset. This is the thrust of CBM – Condition-Based Maintenance / Monitoring. (See also MRO section)
PdM (Predictive Maintenance) is the technological means by which we gather the condition of the asset for CBM and Kanban reporting. PdM tools are used to identify patterns in collected data. The goal is to detect the onset of asset failure with enough advance notice to mitigate safety, environmental or economic consequences. This can include vibration analysis, infrared thermography, ultrasound, lube analysis, NDE / NDT (Non-Destructive Examination/Testing), Human Senses inspections, and so on. These may use a variety of readily available condition monitoring techniques: Dynamic, Particle, Chemical, Physical, Temperature, and Electrical. These methods are so relevant to RCM2 that John Moubray included over 100 in his world-class, best-selling textbook: “RCMII”. (See Appendix 4)
An example might be lubricant min / max fill lines associated with a sight glass on a gearbox. A simple, effective and sustainable system to implement Condition-Based Maintenance (CBM) tasks from Work IDs & Trades’ knowledge uses N.W.A.C. Define asset health indicators, each with Normal, Warning, Alarm and Critical (N.W.A.C.) states, according to the following:
N – Normal (FULL – No action)
W – Warning (¾ FULL – Record & continue to monitor fill condition)
A – Alarm (½ FULL – Schedule work order to refill at next available downturn)
C – Critical (¾ EMPTY – Contact Maintenance for immediate refilling)
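The four states above lend themselves to a simple lookup. The Python sketch below is one possible reading: it assumes the sight-glass level is expressed as a fraction from 0.0 (empty) to 1.0 (full) and that each listed level is the upper edge of its band, neither of which the N.W.A.C. list itself specifies. The names `nwac_state` and `ACTIONS` are hypothetical.

```python
# Actions paraphrased from the N.W.A.C. list for the gearbox sight glass.
ACTIONS = {
    "Normal": "No action",
    "Warning": "Record and continue to monitor fill condition",
    "Alarm": "Schedule work order to refill at next available downturn",
    "Critical": "Contact Maintenance for immediate refilling",
}


def nwac_state(fill_fraction: float) -> str:
    """Classify a fill level (0.0 empty to 1.0 full) into an N.W.A.C. state.

    Assumed bands: above 3/4 is Normal, down to 3/4 is Warning,
    down to 1/2 is Alarm, and 1/4 full (3/4 empty) or less is Critical.
    """
    if not 0.0 <= fill_fraction <= 1.0:
        raise ValueError("fill_fraction must be between 0.0 and 1.0")
    if fill_fraction > 0.75:
        return "Normal"
    if fill_fraction > 0.50:
        return "Warning"
    if fill_fraction > 0.25:
        return "Alarm"
    return "Critical"


for level in (1.0, 0.75, 0.5, 0.25):
    state = nwac_state(level)
    print(level, state, "-", ACTIONS[state])
```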
Another example might be to look for loose mounting base bolts on the gearbox and if any of the washers are seen ‘dancing’, action is taken to correct the problem. In these examples, the use of a Kanban finds its way into RCM2 at the action plan level and possibly at the default action as well.
Poka-Yoke
A Poka-Yoke is an error-proofing tool that minimizes or prevents failure consequences. Here is a common example: at a filling station, the diameter of the diesel pump nozzle is larger than the unleaded fuel filler opening in standard vehicles. This prevents someone from inadvertently fueling an unleaded gas car with diesel. Likewise, a 120 VAC electrical outlet will not accept a 240 VAC plug style.
In RCM2, Poka-Yoke shows up in Action Plans. In an action plan, an operator or maintainer may be given a tool or gauge used to check the wear of a component like a belt sheave. This is where Poka-Yoke finds its way into RCM2.
Just in Time (JIT)
Before the 1980s, typical manufacturing processes kept their production lines moving by using WIP, to ensure any problems with upstream processes did not affect the downstream activities. These temporary stockpiles of partially finished / assembled products kept the production line going while the upstream problem was being fixed by maintenance.
With JIT, WIP is eliminated to reduce inventory – a capital cost to the business. Often, finished product is no longer stored at the manufacturing site. The idea is to produce goods, which get immediately shipped to the customer, continuously, without intermediary warehouse storage.
RCM2 fully supports the JIT model because it defines the performance standards in the function statements articulated by the users in the manufacturing plant. Furthermore, during failure management strategy development, RCM2 identifies the necessary skills to maintain those performance standards using a proactive PM task that is technically feasible and worth doing.
Ultimately, this leads to much improved asset utilization and lower capital costs (since less inventory exists), which result in improved financial performance.
MRO / Critical Spares Analysis
MRO (Maintenance Repair and Operation/Overhaul) and Critical Spares Analysis are used to determine appropriate spare parts inventory levels. However, most of this work is based on historical failure rates and risk tolerance of a related failure.
RCM2 brings the concept of MRO and Critical Spares Analysis to a pinnacle. It offers the user a result that is process-driven through knowledge and logic that is wholly defensible.
To do this, we must understand how an asset fails at the failure mode level. With an asset’s Failure Modes formally documented, it then becomes possible to apply one of two major categories of maintenance strategies to the asset:
- A Condition-Based Maintenance (CBM) strategy for assets such as bearings, which fail randomly but are eligible for a proactive task for the detection (i.e. it gives warning signs) of the given Failure Mode (via the plant’s choice of proactive options such as inspections, PdM technologies, PM, etc.). The chosen proactive task results in an understanding of the state of the given Failure Mode at the point in time of the inspection.
- A No-Scheduled Maintenance (NSM) strategy for assets such as electronic devices and other complex kit like pneumatics and hydraulics, which fail randomly with little or no notice.
Each of these maintenance strategies requires a different approach to the storeroom decision of whether or not to stock a spare.
- CBM Strategies (See also the Kanban / PdM/CBM section)
In order to ensure the CBM collection frequency is correct, RCM2 uses a time horizon at the individual failure mode level called the P-F curve (see below). P is the point of Potential failure and F is the point of Functional Failure. If an asset inspection discovers the state of P, such that a corrective task is required, then the remaining time to point F (called the ‘Nett’ P-F time) must provide an adequate time horizon for the corrective task. This P-F time horizon is formally documented during an RCM2 analysis. Some simple math then determines the Nett P-F time, which is the remaining time available to the Maintenance Planner once a given P point is detected.
With the remaining time to functional failure clearly understood and available for comparison against the vendor’s spare part lead time, an informed stock vs. no-stock decision can be made. For example, a 4- to 5-day vendor lead time on the spare part required for a failure mode with a 2-year P-F interval leads to a clear no-stock decision on the spare.
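The comparison can be sketched in Python as follows. We assume the common definition of the Nett P-F time as the P-F interval minus the inspection interval, which is the worst case when the potential failure appears just after an inspection. The function names are hypothetical and the six-month inspection interval is an invented figure; the 2-year P-F interval and 5-day lead time come from the example above.

```python
def nett_pf_days(pf_interval_days: float, inspection_interval_days: float) -> float:
    """Worst-case time left before functional failure once P is detected (assumed definition)."""
    if inspection_interval_days >= pf_interval_days:
        raise ValueError("inspection interval must be shorter than the P-F interval")
    return pf_interval_days - inspection_interval_days


def should_stock(pf_interval_days, inspection_interval_days, lead_time_days):
    """Stock the spare only if the vendor cannot deliver within the Nett P-F time."""
    return lead_time_days > nett_pf_days(pf_interval_days, inspection_interval_days)


# 2-year P-F interval and 5-day worst-case vendor lead time, as in the example.
# The 182-day (six-month) inspection interval is an assumed figure.
print(should_stock(pf_interval_days=730, inspection_interval_days=182, lead_time_days=5))
```

With roughly 548 days in hand and a 5-day lead time, the sketch returns a no-stock answer, in line with the example.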
If the failure mode is random (as upwards of 80% of failure modes are) and the P-F curve is of no practical use – such as in the failure of many electronic devices – then on what technical basis can a no-stock spare decision be validated? The answer is found in the RCM2 analysis, with two additional pieces of data captured in the Failure Modes:
- The Consequence of the failure.
- The Statistical Probability of the Failure (SPF) occurring derived from MTBF data.
Once determined, the SPF and MTBF are compared against the vendor lead time, cost of downtime/repair, cost of the part including expediting, and the number of asset locations that the part will spare. Based on this information a probability cost model is calculated, which determines when (in number of years), for a population of installed identical parts, the cumulative probability of failure will equal or exceed 50%. It is this number of years to reach the 50% probability of failure of the population that is used to make a cost decision of whether stocking the part is cheaper than the downtime cost of not stocking it.
Once developed, the cost model requires only the following data inputs to provide the stock/no-stock answer:
- Unit cost to purchase
- MTBF in number of years
- Number of identical running units
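The time to reach a 50% cumulative probability of failure can be sketched in Python if we assume independent, random failures at a constant rate of 1 / MTBF per unit (an exponential model). That assumption, the function names and the sample inputs are ours. The cost comparison itself is left out because the formula behind it is not given here, so the unit cost does not appear in the sketch.

```python
import math


def years_to_half_probability(mtbf_years: float, identical_units: int) -> float:
    """Years until the chance of at least one failure in the population reaches 50%.

    Assumes independent exponential failures, so for n units
    P(t) = 1 - exp(-n * t / MTBF) and P(t) = 0.5 at t = MTBF * ln(2) / n.
    """
    if mtbf_years <= 0 or identical_units < 1:
        raise ValueError("MTBF must be positive and at least one unit must be installed")
    return mtbf_years * math.log(2) / identical_units


def cumulative_failure_probability(years, mtbf_years, identical_units):
    """Probability that at least one of the identical units has failed by `years`."""
    return 1.0 - math.exp(-identical_units * years / mtbf_years)


# Hypothetical inputs: a 20-year MTBF and 4 identical running units.
t50 = years_to_half_probability(mtbf_years=20, identical_units=4)
print(round(t50, 2), "years to a 50% chance of at least one failure")
```

More identical running units shorten the time to the 50% point, which is why the number of asset locations a part will spare matters to the decision.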
RCM2 brings process, logic, and knowledge to bear on the problem of MRO and Critical Spares Analysis, which has historically been driven by ‘emotional spares’ (spares stashed away in toolboxes and lockers ‘Just in case’, due to lack of confidence in the Storeroom, Stores personnel, the work process, CMMS, etc.), ‘gut feelings’ or because “That’s the way things are done around here!”. (See Culture Change next)
Culture Change Management / KPIs / Balanced Scorecard
Culture change management has only recently come into its own as a standalone discipline. Most of the popular beliefs around managing culture change are to:
- Simply state the new order of things,
- Put process in place to check that the new order is being adhered to, and
- Manage anyone who deviates from the plan.
Unfortunately, it is not quite that simple, because culture change deals with people, and people are very complex. A heavy-handed, top-down ‘positional authority’ approach will only find resistance to the change. Now, the resistance may be hidden from the measures, but believe this – the resistance will be there!
People need to have a ‘compelling reason’ to change or at least a good reason to give the new order a chance. RCM2 wholly satisfies this requirement by giving its participants that reason to change. What’s more, RCM2 sets up a new cultural infrastructure for the new order of things:
- Providing maintainers, operators, supervisors, etc. with a common set of personal and organizational values with a common ‘language’
- Teamwork builds morale through collegial sharing/learning work issues
- Greater safety and environmental integrity and operating performance
- Widespread ‘pride of ownership’ since the Team devised the PM tasks
- A clear view of resources needed: time, trades, spares, tools, materials…
- Real empowerment as they execute THEIR resulting PM routes/programs
- Establishment of a new set of rules guiding correct behavior (Rituals) and,
- Events/activities that reinforce the correct behavior (Rites)
Physical asset / sub-system performance and PM task completion must be measured as a general rule in business management. The objective is to identify gaps between current performance and expected / desired performance. Ultimately, we use this ‘delta indication’ as a measure of progress towards closing the gaps. Well-chosen KPIs (Key Performance Indicators) highlight what areas of the business operation (production, maintenance, technical, etc.) need action to improve business performance.
Once these KPIs are implemented, measured, analyzed & optimized, then an organization has an important opportunity to integrate them into a Balanced Scorecard format. This crucial method enables us to compare the value of a company’s Financial performance with its Customer Satisfaction performance, Learning & Growth performance and Internal Business Processes performance. Using RCM2 fully supports the Balanced Scorecard’s goal to align business activities with the company’s vision, mission and strategy.
It is important to note that a company’s cultural behaviour can be modified through the Goal Achievement Model. The model puts in place KPI measures for day to day activities that support organizational initiatives, which realize corporate goals that are aligned with the company vision / mission.
Note that RCM2 is critical from the Culture Change Management / KPIs / Balanced Scorecard perspective because it helps identify what KPIs must be constructed from Function performance standards, Kanban signals, PdM/CBM indicators, etc.
At the end of the day, the ‘tools in the toolbox analogy’ is valid, once you realize that RCM2 IS the toolbox.
Carlo Odoardi and Jay Shellogg
Principal Members, The Aladon Network
COCO NET Inc.
Tel: (905) 536-0865
Strategic Maintenance Reliability LLC
Tel: (903) 293-3539
More information on Aladon and the Aladon Network can be found at: www.thealadonnetwork.com.
Stephen J. Thomas, “Improving Maintenance & Reliability Through Cultural Change”, © 2005, Industrial Press Inc., New York, USA
Ricky Smith, Bruce Hawkins, “Lean Maintenance – Reduce Costs, Improve Quality and Increase Market Share”, © 2004, Elsevier Butterworth-Heinemann, Burlington, MA, USA
Ramesh Gulati, “Maintenance & Reliability Best Practices”, © 2009, Industrial Press Inc., New York, USA
John Moubray, “Reliability-Centered Maintenance”, 2nd Ed., © 1999, Butterworth-Heinemann, New York, USA
John D. Campbell & James Reyes-Picknell, “Uptime: Strategies for Excellence in Maintenance Management”, 3rd Ed., © 2015, CRC / Productivity Press, New York, USA