May 2018

Special Focus: Maintenance and Reliability

Advanced steam system optimization program

Hou, A., Mita, T., TLV International Inc.

The steam system forms an integral part of the safe, reliable and profitable operation of a process plant. Steam constitutes approximately 30% of the energy used in a typical petroleum refinery.1 It is utilized throughout the plant for motive, heating and process purposes, such as in the steam turbine driver for the recycle gas compressor, the reboiler for the depropanizer column, and for stripping steam for the crude distillation unit. 

The production process and its feed and product streams are the lifeline of the plant. In comparison, the steam system often receives less attention and may even be treated as a “black box.” For instance, the recycle gas compressor, together with its steam turbine driver, is regarded as a critical asset, but the steam traps around it, which are essential to reliable operation of the turbine, are often not managed in the same manner. In addition, a reactive strategy is generally adopted for maintenance and optimization of the steam system; as a result, action is taken only after a problem becomes too severe to ignore.

However, there can be great rewards in proactively optimizing the steam system. The benefits fall into two main categories: 

  1. Energy savings
  2. Plant reliability improvement and reduced risk of production loss.

FIG. 1. Estimated and actual reduction in steam loss at one refinery (verified by flowmeter).

Energy savings. Since 2005, plantwide steam system optimization programs implemented at refineries and petrochemical plants in cooperation with a steam specialist company have produced significant energy savings.a

For example, steam losses were reduced by 37 metric tph between 2005 and 2008 at a major Japanese refining group as a result of improved steam trap survey and management practices (FIG. 1).2 The steam reduction estimated during the initial survey was compared against the actual flowmeter-measured steam reduction as the subsequent maintenance work progressed. The actual results confirmed the survey estimates and the expected opportunity for energy reduction.

Based on more than 60 surveys focusing on steam energy reduction conducted by the authors between 2005 and 2016 at petroleum refineries and petrochemical plants around the world, an average of 4.6% steam reduction potential was identified (for an average total plant steam generation rate of 640 metric tph).
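As a rough sense of scale, the two survey averages quoted above can simply be multiplied out. The sketch below is arithmetic only: the product of two averages is indicative rather than exact, and the 8,000 operating hours per year is an assumed value for illustration, not a figure from the surveys.

```python
# Average figures quoted from the surveys.
AVG_STEAM_GENERATION_TPH = 640.0  # metric tph
AVG_REDUCTION_FRACTION = 0.046    # 4.6%

# Assumption for illustration only (not from the surveys).
ASSUMED_OPERATING_HOURS_PER_YEAR = 8000


def reduction_potential_tph(generation_tph: float, fraction: float) -> float:
    """Return the steam reduction potential in metric tph."""
    return generation_tph * fraction


potential_tph = reduction_potential_tph(
    AVG_STEAM_GENERATION_TPH, AVG_REDUCTION_FRACTION
)
potential_t_per_year = potential_tph * ASSUMED_OPERATING_HOURS_PER_YEAR

print(f"Reduction potential: {potential_tph:.1f} metric tph")
print(f"Annualized (assumed hours): {potential_t_per_year:,.0f} metric t/yr")
```

At these averages, the identified potential is on the order of 29 metric tph for a typical surveyed plant.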

FIG. 2. Failure of a steam pipe joint due to water hammer.

Plant reliability improvement. Energy savings aside, the steam system can have an even greater impact on the safety and reliability aspects of plant operations. A catastrophic failure of a joint along an 18-in., 200-psig steam pipe at a large petrochemical complex in 2000 caused the disruption of steam supply to downstream plants and suspended production for 4 wk (FIG. 2).3 The subsequent investigation identified the failure mechanism as condensate-induced water hammer, a dangerous but unfortunately common phenomenon that can occur in a steam and condensate system.4

Several other instances of production disruptions, injuries and, in some cases, fatalities caused by steam system problems, typically involving water hammer, can be found in public records.5,6

Apart from water hammer, a “non-optimized” steam system can lead to plant and equipment reliability problems in other ways. Wet steam supply is known to cause internal damage to critical equipment, such as steam turbines or heat exchangers. Steam ejector vacuum system performance, vital to process stability and product specifications, can be severely impaired by poor steam quality.7 Even steam tracing lines, a commonly overlooked part of the steam system, can cause substantial accidents and production outages if not maintained and managed properly.8

To realize these reliability improvement and risk reduction benefits while maintaining energy efficiency, a comprehensive, structured and sustainable approach to steam system optimization is required. With emphasis on ensuring the competitiveness and profitability of any process plant, the key phases of such an integrated approach are presented below.


Phase 1: Optimize all condensate discharge locations. The key aspect of a healthy steam system is its ability to supply dry, high-quality steam to its users, while continuously discharging the condensate that is inevitably formed due to heat loss, without unnecessary steam leakages. The element that shoulders the main responsibility is the steam trap found at each condensate discharge location.

Steam trap failures or design issues can quickly escalate into production problems on a larger scale, such as the cases referenced in the previous section. At the very least, they can be a significant source of energy loss, as past inspection results show that, on average, approximately 6.6 metric tph of steam leakage due to trap failure can be expected from a medium-sized refinery with a trap population of 10,000.
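The leakage figure above can be normalized per trap and scaled to other trap populations. The sketch below assumes, purely for illustration, that leakage scales linearly with trap count; actual leakage depends on trap types, pressures and failure rates at the specific site.

```python
# Figures quoted from past inspection results.
REFERENCE_LEAK_TPH = 6.6       # metric tph of leakage due to trap failure
REFERENCE_TRAP_COUNT = 10_000  # trap population of a medium-sized refinery


def average_leak_per_trap_kg_h() -> float:
    """Population-average leakage per installed trap (healthy traps included)."""
    return REFERENCE_LEAK_TPH * 1000.0 / REFERENCE_TRAP_COUNT


def expected_leak_tph(trap_count: int) -> float:
    """Scale the reference leakage to another population.

    Assumption (not from the article): leakage is proportional to trap count.
    """
    return REFERENCE_LEAK_TPH * trap_count / REFERENCE_TRAP_COUNT


print(f"Average per trap: {average_leak_per_trap_kg_h():.2f} kg/h")
print(f"Hypothetical 5,000-trap plant: {expected_leak_tph(5000):.1f} metric tph")
```

The per-trap average is small, which is why leakage tends to go unnoticed until it is summed over the whole population.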

The more serious issues result from insufficient condensate discharge, such as steam trap blockage failures, operational mistakes or design inadequacies. The water hammer accident previously highlighted was caused by the isolation of a steam trap at a critical condensate discharge location. Inadequate condensate discharge from stripping steam lines was identified as the leading cause of water-induced pressure surges that led to damage of distillation tower internals.9

Steam system optimization starts with ensuring the proper design and operation of these steam traps. Regular surveys of the condensate discharge locations, combined with timely maintenance action for the failed steam traps, are the basis of Phase 1 optimization.

However, conventional steam trap surveys may not be able to fully identify the problem locations. Typical challenges include:10

  • Providing efficient and sustainable database management of a large steam trap population
  • Ensuring that diagnostic equipment is accurate and state-of-the-art
  • Ensuring that inspection personnel and methodologies are at the highest standards
  • Identifying root causes of failure
  • Selecting practical and cost-effective lifecycle solutions that improve the performance of target applications
  • Coordinating the inspection results with effective maintenance actions.

FIG. 3. Mapping trap condition, with an emphasis on locations with repeated failure. Deeper-color shades indicate higher failure frequencies.

A systematic and sustainable program that addresses the challenges has been developed and implemented at 58 plants in the refining and petrochemical industry since 2005.b The accumulated inspection records from more than 250,000 steam traps over 12 yr as part of this program have proved to be unique and valuable data sources. When applied at the individual plant level, the program enables time-based, location-specific historical analysis that can reveal sections of the plant with higher-than-usual failure rates, indicative of deeper problems that had previously remained hidden (FIG. 3).
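The kind of location-specific historical analysis described above can be sketched as a simple aggregation over inspection records. The record layout, area names, trap tags and condition labels below are all hypothetical; they are not the program's actual data model.

```python
from collections import defaultdict

# Hypothetical inspection records: (survey_year, plant_area, trap_id, condition)
records = [
    (2015, "Unit A", "T-001", "good"),
    (2016, "Unit A", "T-001", "good"),
    (2015, "Unit A", "T-002", "leaking"),
    (2016, "Unit A", "T-002", "good"),
    (2015, "Unit B", "T-101", "blocked"),
    (2016, "Unit B", "T-101", "blocked"),
    (2015, "Unit B", "T-102", "leaking"),
    (2016, "Unit B", "T-102", "good"),
]


def failure_rate_by_area(rows):
    """Fraction of inspections in each area that found a failed trap."""
    inspected = defaultdict(int)
    failed = defaultdict(int)
    for _year, area, _trap_id, condition in rows:
        inspected[area] += 1
        if condition != "good":
            failed[area] += 1
    return {area: failed[area] / inspected[area] for area in inspected}


def repeat_failures(rows, min_failures=2):
    """Trap locations found failed in at least `min_failures` surveys."""
    counts = defaultdict(int)
    for _year, _area, trap_id, condition in rows:
        if condition != "good":
            counts[trap_id] += 1
    return sorted(t for t, n in counts.items() if n >= min_failures)


rates = failure_rate_by_area(records)
repeat_offenders = repeat_failures(records)
print(rates)
print(repeat_offenders)
```

In this made-up data set, Unit B stands out with a higher failure rate, and trap T-101 fails in consecutive surveys, which would point to a root cause beyond the trap itself.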




These inspection records have also been used as the basis of an extensive set of generic failure frequencies for steam traps, covering a wide range of designs, operating pressures and applications. The large sample population, the long recording period and the accuracy of the inspection data lend statistical credibility to these failure frequencies, thereby providing a reliable and effective basis for risk-based decisionmaking.11

Phase 2: Optimize steam applications. The steam-using equipment, or steam applications, in a process plant may come in a variety of configurations, but the basic principles of steam engineering and utilization do not change (FIG. 4).

FIG. 4. Examples of steam applications.

Based on these basic engineering principles, plantwide surveys have been performed for refineries and petrochemical plants, covering all steam applications (typically 200–300 in each plant).c

During one such survey at a petroleum refinery, a vaporizer in the lubricants unit was found to be operating suboptimally. The operators had long suspected that the solvent deasphalting process was bottlenecked by the solvent recovery rate (FIG. 5).

FIG. 5. Vaporizer problem in the deasphalting process.
FIG. 6. Condensate buildup in heat exchanger tubes.

An onsite investigation identified that the total backpressure from the steam condensate return system was higher than the steam operating pressure of the vaporizer. This situation, known as a “stall” condition,12 caused the steam condensate to build up and subcool in the heat exchanger, reducing heat transfer rates (FIG. 6).
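A first-pass screen for stall compares the steam pressure in the equipment with the total backpressure it must discharge against. The sketch below is a simplified check, not the investigation method used at the refinery; the pressures, the lift height and the condensate density are assumed values, and total backpressure is taken as the return header pressure plus the static head of any vertical lift.

```python
G = 9.80665  # standard gravity, m/s^2

# Assumption: cold-water density. Hot condensate is lighter, so this
# slightly overstates the static head (a conservative choice for a screen).
ASSUMED_CONDENSATE_DENSITY = 1000.0  # kg/m^3


def static_head_bar(lift_m: float, density: float = ASSUMED_CONDENSATE_DENSITY) -> float:
    """Pressure needed to lift condensate by `lift_m` metres, in bar."""
    return density * G * lift_m / 1e5


def total_backpressure_barg(header_barg: float, lift_m: float) -> float:
    """Return header pressure plus static lift, in barg."""
    return header_barg + static_head_bar(lift_m)


def is_stalled(steam_space_barg: float, backpressure_barg: float) -> bool:
    """Stall: no positive differential pressure to push condensate out."""
    return steam_space_barg <= backpressure_barg


# Hypothetical vaporizer: 2.0 barg return header, 5 m lift to the header.
backpressure = total_backpressure_barg(header_barg=2.0, lift_m=5.0)
print(f"Total backpressure: {backpressure:.2f} barg")
print("Stalled at 2.2 barg steam:", is_stalled(2.2, backpressure))
print("Stalled at 3.5 barg steam:", is_stalled(3.5, backpressure))
```

Because a control valve throttles steam pressure at reduced load, a check of this kind would need to be run across the operating range rather than at the design point only.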

In these situations, it is common for the unit operator to resort to either operating the equipment with the condensate bypass valve open or discharging the condensate to drain to remove the backpressure. However, an open bypass cannot modulate with load changes, so it usually results in steam loss (and increased backpressure in the return line), while draining discards valuable condensate that could otherwise be recovered.

Instead, a more appropriate solution was engineered using equipment specialized for overcoming stall conditions. The production rate was increased, which resulted in an annual benefit of $600,000.

In addition to energy loss, water hammer and stall issues, such as those previously outlined, other common steam application problems identified through Phase 2 steam application surveys include:

  • Heat exchanger temperature cycling
  • Steam turbine damage
  • Sulfur pit, tank coils and steam trace heating issues
  • Flare tip damage.

In many of these cases, the main driving factor for optimization is the reduction of risk, whether from production loss, component damage, environmental impact or personnel injury. Accordingly, the decision-making process toward prioritizing and justifying the optimization action relies heavily on a risk-based approach, such as the guidelines developed by the American Petroleum Institute (API).13

To date, risk-based assessments of process plant assets have generally discounted the influence of steam system components, such as steam traps. However, the risk contribution from these components is undeniable, as seen in the examples described previously.

An original methodology has been developed for the quantitative risk assessment of steam-using equipment and steam distribution systems.d The base probability of failure (PoF) of the steam-using equipment is derived from industry-generic failure frequencies and combined with the PoF of the components associated with the equipment (e.g., steam trap PoF based on generic failure frequencies outlined in the previous section). Actual onsite conditions are accounted for using probability factors that tailor the PoF for specific steam-using equipment.14

FIG. 7. Risk assessment of a depropanizer reboiler. The 5-yr risk mitigation value was estimated at $1.2 MM.

The matrix, shown in FIG. 7, is an example of this methodology as applied to a depropanizer reboiler system in a refinery. The risk was quantified based on the conditions at the time of assessment, and the potential reduced risk was simulated based on the mitigation actions identified to be appropriate for that system.
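The comparison of current and mitigated risk can be illustrated with a simplified expected-loss calculation, where risk is the annual probability of failure multiplied by the financial consequence. This is a generic sketch, not the methodology behind FIG. 7, and every number below is hypothetical; it does not attempt to reproduce the $1.2 MM figure.

```python
def annual_risk_usd(pof_per_year: float, consequence_usd: float) -> float:
    """Expected annual loss: probability of failure times consequence."""
    return pof_per_year * consequence_usd


def risk_mitigation_value_usd(
    pof_before: float, pof_after: float, consequence_usd: float, years: int
) -> float:
    """Reduction in expected loss over `years`.

    Simplification (assumed): PoF is held constant over the period.
    """
    reduction = annual_risk_usd(pof_before, consequence_usd) - annual_risk_usd(
        pof_after, consequence_usd
    )
    return reduction * years


# Hypothetical reboiler system.
value = risk_mitigation_value_usd(
    pof_before=0.10,            # per year, as found
    pof_after=0.02,             # per year, after mitigation actions
    consequence_usd=2_000_000,  # production loss plus repair
    years=5,
)
print(f"5-yr risk mitigation value: ${value:,.0f}")
```

Setting this value against the cost of the mitigation actions gives a basic cost-benefit comparison for ranking candidate projects.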

Assessment data enables the asset owner to objectively visualize the equipment’s criticality against the other equipment to be maintained, while providing a means for cost-benefit analysis and selection of the most appropriate course of action.

Furthermore, the calculated PoF is time-dependent, so the failure risk at the time of assessment and in subsequent years can be projected, enabling proactive risk mitigation planning.
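A projection over time can be sketched with a common reliability simplification: if failures occur at a constant annual frequency f, the probability of at least one failure within t years is 1 − exp(−f·t). This exponential model is an assumption made here for illustration and is not necessarily the time dependence used in the methodology described above; the 0.05/yr frequency is hypothetical.

```python
import math


def cumulative_pof(annual_failure_frequency: float, years: float) -> float:
    """Probability of at least one failure within `years`.

    Assumes a constant failure frequency (exponential model).
    """
    return 1.0 - math.exp(-annual_failure_frequency * years)


# Hypothetical failure frequency of 0.05 per year, projected over 5 years.
projection = {year: cumulative_pof(0.05, year) for year in range(1, 6)}
for year, pof in projection.items():
    print(f"Year {year}: cumulative PoF = {pof:.3f}")
```

A table of this kind shows how long a mitigation action can be deferred before the projected risk crosses a threshold the asset owner considers acceptable.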

A risk-based approach to steam system maintenance optimization not only prioritizes the various steam assets and applications, but also enables selection of the most cost-effective maintenance actions.

FIG. 9. Example of a steam balance for a refinery.

Phase 3: Optimize the steam balance. The balancing of a plant's steam, water and electrical power is a delicate and continuous effort (FIG. 9). For instance, processes and utilities may be adjusted to meet new requirements due to product mix changes, or steam applications may be optimized, causing the steam balance to shift into a venting state (i.e., excess low-pressure steam). In some cases, a straightforward solution is to alternate between steam turbine and electric motor drivers, or to adjust letdown valves to maintain an optimal system balance.

However, these actions may not fully eliminate the venting situation, and a combination of other methods may be required to optimize the steam balance. Finding practical and efficient means to more fully utilize excess lower-pressure steam, or adjusting the pressure of a particular steam level, are examples of methods that have been used effectively to rebalance the steam systems of large process plants.

The impact of such projects on the overall steam/water/power balance can be simulated using proprietary balance models.e The practicality and accuracy of the simulation model depend heavily on the quality of its parameters, especially the steam consumption (or generation) flowrates of every steam application. This can be challenging when flowrates for most applications are unmetered.
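The core bookkeeping behind such a simulation is a mass balance on each steam header. The sketch below is a deliberately minimal, single-header illustration and not the proprietary model referred to above; the class, the stream names and all flowrates are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class SteamHeader:
    """One pressure level with named supply and demand flows in metric tph."""

    name: str
    supplies_tph: Dict[str, float]
    demands_tph: Dict[str, float]

    def surplus_tph(self) -> float:
        return sum(self.supplies_tph.values()) - sum(self.demands_tph.values())

    def vent_tph(self) -> float:
        """Excess steam that must be vented (zero if the header is short)."""
        return max(0.0, self.surplus_tph())

    def makeup_tph(self) -> float:
        """Extra supply needed, e.g. via letdown (zero if in surplus)."""
        return max(0.0, -self.surplus_tph())


lp_header = SteamHeader(
    name="LP",
    supplies_tph={"turbine exhaust": 40.0, "flash steam": 5.0, "letdown from MP": 10.0},
    demands_tph={"process heating": 30.0, "deaerator": 12.0, "steam tracing": 6.0},
)

# Hypothetical project: one turbine driver is switched to an electric motor,
# removing 8 tph of exhaust steam from the LP header.
after_switch = SteamHeader(
    name="LP",
    supplies_tph={**lp_header.supplies_tph, "turbine exhaust": 32.0},
    demands_tph=lp_header.demands_tph,
)

print(f"Venting before: {lp_header.vent_tph():.1f} tph")
print(f"Venting after: {after_switch.vent_tph():.1f} tph, "
      f"makeup needed: {after_switch.makeup_tph():.1f} tph")
```

Every flow in such a model is an input, which is why unmetered applications limit its accuracy until their consumption has been estimated in the field.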

For this reason, an effective approach is to combine the Phase 3 balance analysis with the Phase 2 survey, in which all steam applications are reviewed for optimization, as was performed at a North American refinery in 2011.15


After each phase of steam system optimization is implemented, ongoing changes to the steam system mean that it may gradually start to deviate from its optimized condition if no further action is taken.

Just as the human body can maintain its healthy state through a habit of regular health checks for early problem detection and timely treatment, a process plant can maintain its optimal state only through a sustainable program of regular steam system inspections and timely corrective actions.

Past analysis of large accidents in the hydrocarbon and chemical industries indicated that, after mechanical failure, the second leading cause of accident losses was direct operational errors.16 In other words, human and organizational factors play a significant part in the safe and reliable operation of a plant. This is also recognized in API RP 581, which defines the PoF as:

$$\mathrm{PoF} = gff \cdot D_f(t) \cdot F_{MS}$$

where:

$gff$ = the generic failure frequency of the equipment item

$D_f(t)$ = the damage factor, which accounts for the relevant damage mechanism and inspection effectiveness

$F_{MS}$ = the management systems factor.

$F_{MS}$ accounts for the quality of the organization’s management system and its influence on the overall plant integrity. In this definition, a weak management approach can increase the probability of failure by a factor of 100 over a plant with “perfect” management.
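The effect of the management systems factor can be seen by evaluating the PoF expression for the same equipment under two management conditions. In the sketch below, the generic failure frequency and damage factor are hypothetical, and the bounds of 0.1 and 10 for the management systems factor are assumed here so that their ratio matches the factor of 100 stated above.

```python
def probability_of_failure(gff: float, damage_factor: float, fms: float) -> float:
    """PoF as the product of generic failure frequency, damage factor
    and management systems factor."""
    return gff * damage_factor * fms


# Hypothetical equipment item.
GFF = 1.0e-3          # failures per year (illustrative)
DAMAGE_FACTOR = 20.0  # illustrative value at the time of assessment

# Assumed bounds for the management systems factor (ratio of 100).
FMS_BEST = 0.1
FMS_WORST = 10.0

pof_best = probability_of_failure(GFF, DAMAGE_FACTOR, FMS_BEST)
pof_worst = probability_of_failure(GFF, DAMAGE_FACTOR, FMS_WORST)

print(f"PoF, strong management: {pof_best:.4f} per year")
print(f"PoF, weak management:   {pof_worst:.4f} per year")
print(f"Ratio: {pof_worst / pof_best:.0f}x")
```

With identical equipment and damage state, the difference in PoF comes entirely from the management term, which is the quantitative argument for addressing human and organizational factors.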

Accordingly, an effective approach to steam system optimization must include improvements to the human aspects of plant operation. For example, best practices identified through field surveys can provide feedback to improve standard operating procedures. The problems observed and the implementation progress of their solutions can be tracked to ensure completion and prevent future recurrences, while being recorded for use in ongoing training and educational materials.

Technological advancements and next-generation analytics are enabling more powerful tools to assist and streamline decisionmaking throughout the organization. A key element is the increasing visualization of performance and reliability indicators made possible by advanced (or more affordable) sensors. However, simply expanding the scope of online monitoring also increases the risk of operators being overwhelmed by data overload and alarm fatigue. Instead, the true value of visualization should be captured by integrating the data with expert knowledge and system experience to create relevant outputs, such as failure prediction models.f These outputs can subsequently complement reliable, risk-based selection of the mitigation actions and ultimately allow time and resources to be redirected to more critical tasks or further optimization opportunities in the plant.

Takeaways. The accumulation of know-how in the engineering and study of steam systems for process plants has brought about the realization that problems present in the steam system can have harmful effects on production. Simultaneously, it has created advancements in the condition monitoring and timely optimization of steam-using equipment and their associated steam system components. A novel methodology has been developed for the quantitative, risk-based assessment of the steam system and related equipment. Integrating this methodology into a structured, sustainable program provides the means to ensure higher efficiency and integrity of the entire steam system asset. HP


a Refers to TLV Co. Ltd.’s Steam System Optimization Program (SSOP)
b Refers to TLV Co. Ltd.’s Best Practice of Steam Trap Management Program (BPSTM)
c Refers to TLV Co. Ltd.’s CES Survey
d Refers to TLV Co. Ltd.’s Steam System Risk Mitigation (SSRM)
e Refers to
f Refers to


  1. United States Environmental Protection Agency, “Energy efficiency improvement and cost saving opportunities for petroleum refineries,” February 2015.
  2. Hara, Y., “Jyumandai no suchimu torappu kara no jyoki roe sakugen” [Reduction in steam losses from 100,000 steam traps], Journal of Energy Conservation, April 2010.
  3. Galante, C. and S. Pointer, “Catastrophic water hammer in a steam dead leg,” Loss Prevention Bulletin, Iss. 167, October 2002.
  4. Health and Safety Executive, “Major incident investigation report, BP Grangemouth Scotland 29th May–10th June 2000,” August 2003.
  5. Kletz, T., “Imperial chemical industries petrochemicals division,” Safety newsletter number 43, August 1972.
  6. Kletz, T., “Imperial chemical industries petrochemicals division,” Safety newsletter number 57, October 1973.
  7. Lieberman, N. and R. Cardoso, “Troubleshoot operation of a steam ejector vacuum system,” Hydrocarbon Processing, February 2016.
  8. Kletz, T., What went wrong? Case histories of process plant disasters, 4th Ed., Elsevier, 1999.
  9. Kister, H. Z., “What caused tower malfunctions in the last 50 years?” Chemical Engineering Research and Design, January 2003.
  10. Walter, J. P., “Implement a sustainable steam-trap management program,” Chemical Engineering Progress, January 2014.
  11. Pittiglio, P., P. Bragatto and C. Delle Site, “Updated failure rates and risk management in process industries,” Energy Procedia, January 2014.
  12. Risko, J. R., “Steam heat exchangers are underworked and over-surfaced,” Chemical Engineering, November 2004.
  13. American Petroleum Institute (API) Recommended Practice 581, “Risk-based Inspection Methodology,” 3rd Ed., April 2016.
  14. Cane, B., “Risk-based methodology for industrial steam systems,” Inspectioneering Journal, May 2017.
  15. Chen, M. and F. Roberto, “ExxonMobil Beaumont Chemical Plant study of steam and condensate systems for the entire plant,” 34th Industrial Energy Technology Conference 2012.
  16. Krembs, J. A. and J. M. Connolly, “Analysis shows process industry accident losses rising,” Oil and Gas Journal, August 1990.
