By Eric D. Tiffany, Solomon Associates
Every power generation manager is, in a sense, a risk manager who balances maintenance spending, technical performance and unavailability to maximize the financial performance of their assets. To manage this risk, companies employ a variety of management strategies. Often, however, the analysis tools that support these strategies and tactical decisions are so complex and time-consuming that managers default to their experience and biases rather than undertake meaningful, objective analyses of their program’s effectiveness. Managers need a simple, data-driven method that describes the effectiveness of their maintenance program; quantifies the risk of underperformance of major systems within the plant, the asset, groups of assets or a fleet; and allows for a more balanced approach to implementing maintenance programs.
Beyond the analyses and the numbers, managers are also in need of a credible tool that provides the basis for communication of the maintenance program’s effectiveness at the various levels of the generation organization. This communication tool must work equally well for both technical and financial managers.
Solomon developed its maintenance risk methodology to both describe maintenance effectiveness and provide a means of communication throughout the organization, allowing for the greatest potential improvement. The methodology involves a visual framework that focuses on the risk of sub-optimization of the financial and technical performance of the asset rather than merely emphasizing one aspect such as reliability.
Unreliability is considered by most as the probability that a system or component will not be able to perform its mission over a period of time. The effect of this unreliability can be measured in terms of the financial impact of the event causing the unreliability. The management of the probability (that is, frequency) and the consequences (that is, severity) is typically captured through a company’s maintenance management program.
However, do all events result in the same severity? More specifically, what affects the financial performance of the asset more, a few costly events or numerous low-cost events? Our maintenance risk methodology simultaneously addresses both frequency/unreliability and severity/cost to assess the effectiveness of the maintenance program and provide the manager with a tool to improve their decision-making capabilities in balancing the maintenance program.
To begin, risk is defined by the following equation:
Risk = Frequency x Severity
Inserting industry-accepted metrics into this equation results in the following modified version:
Risk (US $/MWh) = EUF (%) x Maintenance Index (US $/MWh)
where:
EUF = Equivalent Unavailability Factor
Maintenance Index = Solomon’s proprietary metric involving annualized maintenance costs
Equivalent unavailability was chosen as an indication of frequency because of its prevalence in the industry. Solomon’s maintenance index was selected because it involves a normalized version of a unit’s maintenance costs on a production basis. Maintenance Index, as defined by Solomon, includes non-overhaul (routine) maintenance expense and annualized overhaul, as well as special project maintenance expense.
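The risk product can be sketched in a few lines of Python. This is only an illustration: the function name and the input values are hypothetical, and the conversion of EUF from percent to a fraction (so that the result stays in US $/MWh) is an assumption rather than a documented part of the methodology.

```python
def maintenance_risk(euf_percent: float, maintenance_index: float) -> float:
    """Return the risk product in US $/MWh.

    Assumption: EUF is supplied in percent and converted to a fraction,
    so the product keeps the units of the maintenance index.
    """
    if not 0.0 <= euf_percent <= 100.0:
        raise ValueError("EUF must be between 0 and 100 percent")
    if maintenance_index < 0.0:
        raise ValueError("maintenance index cannot be negative")
    return (euf_percent / 100.0) * maintenance_index


# Hypothetical unit: 12 percent EUF and a maintenance index of 5.00 US $/MWh.
example_risk = maintenance_risk(12.0, 5.0)
print(f"Risk product: {example_risk:.2f} US $/MWh")
```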
As part of Solomon’s comparative performance analysis, major projects are annualized over the time the project benefits the unit. It is critical to annualize maintenance expenses to mitigate wild swings in costs between overhaul and non-overhaul years. In this regard, the maintenance index provides an estimate of the “true” maintenance cost of the unit rather than a cost that may be excessively high one year and low in others.
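The effect of annualizing can be illustrated with a short sketch. The Maintenance Index itself is proprietary, so the straight-line spreading used here is an assumption, and every name and dollar figure is hypothetical.

```python
def annualized_maintenance_cost(routine_cost, projects):
    """Routine cost plus each project's cost spread over its benefit period.

    projects: list of (cost_usd, benefit_years) tuples covering overhauls
    and special projects. Straight-line spreading is an assumption.
    """
    return routine_cost + sum(cost / years for cost, years in projects)


def cost_per_mwh(annual_cost_usd, annual_generation_mwh):
    """Normalize an annual cost on a production basis (US $/MWh)."""
    return annual_cost_usd / annual_generation_mwh


# Hypothetical unit: US $4 million of routine maintenance per year, one
# US $12 million overhaul that benefits the unit for 6 years, 2,000,000 MWh/yr.
routine = 4_000_000.0
projects = [(12_000_000.0, 6)]
generation = 2_000_000.0

annualized_index = cost_per_mwh(annualized_maintenance_cost(routine, projects), generation)
overhaul_year_raw = cost_per_mwh(routine + 12_000_000.0, generation)
normal_year_raw = cost_per_mwh(routine, generation)

print(annualized_index, overhaul_year_raw, normal_year_raw)
```

With these invented figures, the raw cost swings between 2.00 and 8.00 US $/MWh depending on the year, while the annualized figure stays at 3.00 US $/MWh.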
Returning to our definition of risk, when frequency and severity are plotted for varying conditions, lines of equal risk are created; every point on a given line has the same risk product (that is, frequency x severity). Figure 1 presents a conceptual rendering of constant risk.
As shown in this figure, the high-impact, low-probability (HILP) event reflects the same level of risk as the low-impact, high-probability (LIHP) event. That is, they cost the same to the generating unit.
Companies typically understand the need to analyze and prevent HILP events, but LIHP events usually do not receive the same attention. Operations and maintenance personnel often become masters at working around chronic problems. It is entirely possible that a chronic problem such as reduction in load due to bowl mill issues occurring 300 times per year in a multi-plant facility will produce the same maintenance risk as one HILP event. Both should be appropriately considered in structuring a maintenance program that considers spending and reliability.
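A small arithmetic sketch shows how a chronic problem can match a single major event. The 300 occurrences per year come from the example above; the cost per event is purely hypothetical, and frequency is expressed here as events per year rather than as EUF.

```python
def event_risk(events_per_year, cost_per_event_usd):
    """Annual risk as frequency x severity (US $ per year)."""
    return events_per_year * cost_per_event_usd


# Hypothetical costs: each load reduction costs US $10,000,
# while a single major failure costs US $3,000,000.
lihp_risk = event_risk(300, 10_000.0)
hilp_risk = event_risk(1, 3_000_000.0)

print(lihp_risk, hilp_risk, lihp_risk == hilp_risk)
```

Under these assumed costs, both events sit on the same constant risk line.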
Plant, Unit and Component Analyses
A series of constant risk curves can be generated over varying conditions to differentiate the risk between plants, units or equipment being compared.
The axes can be linear, which will yield curved constant risk lines, or log-log, which will yield straight constant risk lines.
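The reason is that Severity = Risk / Frequency, so log(Severity) = log(Risk) - log(Frequency), which is a straight line with a slope of -1 in log-log space. The sketch below checks this numerically for one constant risk line; the risk value and frequencies are hypothetical.

```python
import math


def severity_on_iso_risk(risk, frequency):
    """Severity that keeps frequency x severity equal to the given risk."""
    return risk / frequency


risk = 0.5                               # hypothetical constant risk product
frequencies = [0.02, 0.05, 0.10, 0.20]   # hypothetical frequency values
points = [(f, severity_on_iso_risk(risk, f)) for f in frequencies]

# Slope between consecutive points on linear axes (changes along the curve).
linear_slopes = [
    (s2 - s1) / (f2 - f1)
    for (f1, s1), (f2, s2) in zip(points, points[1:])
]

# Slope between the same points on log-log axes (constant at -1).
log_slopes = [
    (math.log10(s2) - math.log10(s1)) / (math.log10(f2) - math.log10(f1))
    for (f1, s1), (f2, s2) in zip(points, points[1:])
]

print(linear_slopes)
print(log_slopes)
```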
Consider Figure 2, which portrays the maintenance risk of five plants comprising seven generating units of approximately 300 MW each, with capacity factors ranging between 60 percent and 80 percent, all of which burn lignite.
Two plants in the upper right-hand corner of the graph reflect risk products that are substantially higher than the other three plants. The considerable difference between these two plants and the others suggests that a more detailed analysis is required. A reason must exist that explains (or causes) the behavior of the data such as equipment design, operating regime, maintenance program and so on. Although causality cannot be concluded, such an analysis provides an indication of problems with the plants and instructively calls attention to potential maintenance program issues that may need to be addressed.
Whereas Figure 2 involves five plants, the same methodology can be applied to the seven generating units that comprise these plants to further explore the potential causes. Additionally, the methodology can be applied at the component level to pursue detailed insight into the systems that are the major contributors to substandard performance. The maintenance risk of the components associated with Plant 2 (shown in Figure 2) is illustrated in Figure 3.
As shown in Figure 3, the boiler, turbine and coal pulverizers/hammer mills are the components carrying the highest risk in the plant. Whereas this information is useful, the boiler, turbine, and coal pulverizers/hammer mills often carry the highest risk in any plant, even in top-performing plants. Additional perspective is gained by comparing the risk carried by these components to those carried by peers, as demonstrated in Figure 4.
Plotted on a log-log scale, this figure portrays the two units that exhibit high maintenance risk in the coal mills compared to the maintenance risk in the coal mills for five peer plants. All of the plants are approximately 300 MW, have capacity factors of 63 percent to 88 percent and burn lignite. The maintenance risk in the mills associated with the two subject plants is more than 10 times that of the peer plants, suggesting that the problem is significant and must be better understood. A review of the practices to determine why there are such significant differences in similar units is needed to diagnose and address the problem.
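A peer comparison of this kind reduces to a ratio of risk products. The sketch below is illustrative only: the risk values are invented, and using the peer median as the benchmark is an assumption rather than a documented part of the methodology.

```python
from statistics import median


def risk_ratio_to_peers(subject_risk, peer_risks):
    """Subject risk divided by the median peer risk (benchmark is an assumption)."""
    return subject_risk / median(peer_risks)


# Hypothetical coal mill risk products in US $/MWh.
peer_mill_risks = [0.02, 0.03, 0.025, 0.04, 0.035]
subject_mill_risk = 0.36

ratio = risk_ratio_to_peers(subject_mill_risk, peer_mill_risks)
needs_review = ratio > 10.0  # threshold taken from the "more than 10 times" observation

print(f"{ratio:.1f}x peer median, review needed: {needs_review}")
```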
Figure 5 shows the maintenance risk of the five-plant, lignite-fired fleet used in the previous analyses over a 5-year period. If the history is examined without the use of constant risk curves, the path of maintenance risk seems somewhat erratic. However, when examined with the use of constant risk curves, the trend of maintenance risk is steadily moving across risk lines, downward and toward the left. This trend suggests that both the maintenance index and EUF are decreasing, which means that the maintenance program is effectively managing (and decreasing) the risk.
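The same point can be made numerically. In the hypothetical five-year history below (the values are invented and are not taken from the figure), neither EUF nor the maintenance index falls every year, yet the risk product does. EUF is converted from percent to a fraction, which is an assumption.

```python
def risk_product(euf_percent, maintenance_index):
    """Risk in US $/MWh, with EUF converted from percent (an assumption)."""
    return (euf_percent / 100.0) * maintenance_index


def strictly_decreasing(values):
    """True when every value is lower than the one before it."""
    return all(later < earlier for earlier, later in zip(values, values[1:]))


# Hypothetical history: (year, EUF in percent, maintenance index in US $/MWh).
history = [
    (1, 14.0, 6.0),
    (2, 15.0, 5.2),
    (3, 12.0, 6.0),
    (4, 12.5, 5.2),
    (5, 10.0, 5.5),
]

risks = [risk_product(euf, mi) for _, euf, mi in history]

print("EUF falls every year:  ", strictly_decreasing([euf for _, euf, _ in history]))
print("Index falls every year:", strictly_decreasing([mi for _, _, mi in history]))
print("Risk falls every year: ", strictly_decreasing(risks))
```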
Which Way to Go?
In some instances, moving in the direction suggested by Figure 5 is not attainable or desirable due to circumstances such as market conditions, economics and so on. It is up to the individual generator to decide what level of risk is acceptable (that is, what direction they would like to move) and what needs to be done to get there.
Once opportunities have been identified and goals set for improvement programs, several different actions are possible for managing the risk toward the desired direction, any of which can yield the same improvement. (See Figure 6.)
Severity (cost) can be reduced by moving down the vertical axis, frequency (unavailability) can be reduced by moving from right to left on the horizontal axis, or the two can be combined in any proportion; each path ends on the same constant risk line.
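As a rough sketch, the alternative paths to a target risk line can be computed directly from the risk product. The function name, the three named paths and the input values are all hypothetical; the equal-proportion path is only one of many possible combinations.

```python
import math


def paths_to_target(euf, maintenance_index, target_risk):
    """Return three (EUF, maintenance index) end points with the same risk product."""
    if euf <= 0.0 or maintenance_index <= 0.0:
        raise ValueError("EUF and maintenance index must be positive")
    current_risk = euf * maintenance_index
    if not 0.0 < target_risk <= current_risk:
        raise ValueError("target must be positive and no higher than current risk")
    scale = target_risk / current_risk
    return {
        "reduce_severity_only": (euf, maintenance_index * scale),
        "reduce_frequency_only": (euf * scale, maintenance_index),
        "reduce_both_equally": (
            euf * math.sqrt(scale),
            maintenance_index * math.sqrt(scale),
        ),
    }


# Hypothetical unit: EUF of 0.16 (as a fraction), index of 5.00 US $/MWh,
# and a target risk product of 0.45 US $/MWh.
options = paths_to_target(0.16, 5.0, 0.45)
for name, (euf, index) in options.items():
    print(f"{name}: EUF={euf:.4f}, index={index:.4f}, risk={euf * index:.4f}")
```

Every option lands on the same risk line; which one is realistic depends on what is driving the unit's cost and unavailability.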
Solomon’s experience suggests that companies seldom reduce risk by reducing severity (cost) first, primarily because there tends to be an underlying frequency (unavailability) issue that is driving the higher costs. Cutting costs would likely only exacerbate the problem. Unavailability should typically be addressed first, even at a higher near-term cost, to allow the generator the best chance of reducing costs in the long term. Nonetheless, maintenance risk provides a means to address these issues and set realistic targets for improvement.
The initial development of our maintenance risk methodology was based on the premise that capital is deployed to reduce total unavailability. That premise is sound for base-load power plants, whose spark spread allows them to participate in the market most of the time.
For intermediate or peaking plants, the methodology holds true using forced outage rate in lieu of EUF.
An additional use of maintenance risk involves capital rationing. If the generation organization is forced to reduce overall costs, is it better to cut all plants equally or to cut in proportion to each plant’s maintenance risk and how well that risk can be managed? If existing practices have produced low cost and low unavailability, then funding for those maintenance programs could likely be cut with less impact than funding for high-unavailability areas.
Maintenance risk is not complicated, yet its value is tremendous. In this regard, convincing engineers of its merits should be straightforward. As such, using maintenance risk facilitates the communication of the rationale, strategy and logistics of maintenance programs and improvements to all audiences throughout the organization. In a series of a few figures, managers can step quickly from plant to unit to major system to identify opportunities for potential improvement in the context of their peers and in support of their actions.
The use of objective data is much more convincing than just a mandate to improve.