Case Study: TRUXX Gas Conditioning Skid Reliability Study

Briana Liddell
Jun 15

Designing Data Center Grade Reliability for Fuel Gas Conditioning Systems

Client Info

Client: Algas-SDI International
Industry: Oil & Gas / Data Centers
Timeline: April–May 2025

Summary

Algas-SDI engaged Solvify to quantify what their TRUXX fuel gas conditioning platform can deliver for data center applications, across multiple single-skid and multi-skid redundancy configurations
We applied Event Tree Analysis to build a component-level reliability model covering all mechanical, electrical, and controls failure modes
The most significant finding: shared systems (LEL detection, waterbath, ESD) drive skid-level common-cause failures that internal train redundancy does not address; independent skids are the architecture that eliminates the exposure
Operational execution, specifically spare parts availability on site, staffing readiness, and preventive maintenance, drives realized uptime as much as equipment design does

Introduction

As behind-the-meter power generation has become foundational infrastructure for hyperscale data centers, the systems that condition fuel gas for those generators have moved from afterthought to critical path. When conditioned gas stops flowing, engines stop turning, and compute capacity goes with them. The fuel gas conditioning skid, which most people treat as a utility component upstream of the real equipment, turns out to be one of the most consequential single points of failure in the entire generation stack.

Algas-SDI, a manufacturer of fuel gas conditioning systems for the oil and gas and industrial power generation markets, engaged Solvify to quantify what their TRUXX line could realistically deliver for mission-critical data center applications, and what it would take to get there.

Our background in nuclear licensing and design for operation put us in an unusual position to answer that question: which configurations could credibly meet Tier III and Tier IV standards, and what would have to be true about the design, the installation, and the ongoing operations program to sustain that performance.

The Challenge

Availability questions cannot be answered by looking at components in isolation. Individual failure rates tell you very little about how a system actually behaves, because uptime depends on architecture: which components are redundant, which failures affect multiple process paths simultaneously, how reliably the system transfers load from a failed element to a standby, and how quickly a failed component can be returned to service. All of those factors interact in ways that require explicit modeling.
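To make the architecture point concrete, here is a minimal sketch of the textbook steady-state availability relations for series and parallel elements. It is not the study's model, and the MTBF and MTTR inputs are placeholders chosen for easy arithmetic, not TRUXX data.

```python
from math import prod


def availability(mtbf_h: float, mttr_h: float) -> float:
    """Steady-state availability of one repairable element."""
    return mtbf_h / (mtbf_h + mttr_h)


def series(*avail: float) -> float:
    """Every element must be up for the system to be up."""
    return prod(avail)


def parallel(*avail: float) -> float:
    """System is up if at least one independent element is up."""
    return 1.0 - prod(1.0 - a for a in avail)


# Placeholder inputs, not TRUXX data.
train = availability(mtbf_h=9_000.0, mttr_h=1_000.0)    # one process train
shared = availability(mtbf_h=19_000.0, mttr_h=1_000.0)  # one shared element

one_train = series(train, shared)
two_trains = series(parallel(train, train), shared)

print(f"one train:  {one_train:.4f}")
print(f"two trains: {two_trains:.4f}")
```

With these placeholder inputs the second train raises availability, but the result can never exceed the availability of the shared element that sits in series with both trains. The same component data gives very different answers depending on how the elements are arranged.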

Equally important was operational reality: a model built on ideal conditions is of limited value to a customer planning maintenance staffing, spare parts procurement, and response capabilities. Real-world uptime is shaped as much by how a system is operated as by how it was designed.

Objectives

The project was structured around four objectives:

Build a comprehensive, component-level reliability model of the TRUXX 3000WB covering all mechanical, electrical, and controls failure modes
Evaluate multiple system configurations and redundancy architectures against recognized data center uptime standards, with guidance on which configurations qualified on what architectural basis
Identify the operational practices necessary to sustain the modeled reliability under realistic site conditions
Develop a preventive maintenance program grounded in industry standards and manufacturer specifications

Our Approach

The methodological foundation of this work was Event Tree Analysis, a forward-looking, inductive risk assessment technique with roots in nuclear safety analysis. Rather than cataloging failures independently, Event Trees model how an initiating failure propagates through a system with multiple protective barriers: when a component fails, does the system transfer load successfully to a redundant element, and if that element is also unavailable, what path does the failure follow from there? This approach lets you calculate the probability of every outcome, including those that ultimately lead to loss of gas flow.
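The propagation logic can be sketched in a few lines. This is a simplified illustration of a sequential event tree, not the model built for the study: the barrier names, success probabilities, and initiating-event frequency are all hypothetical, and each barrier is assumed to be challenged only if every barrier before it has failed.

```python
def event_tree(initiator_per_year: float,
               barriers: list[tuple[str, float]]) -> dict[str, float]:
    """Walk a sequential event tree and return the frequency of each outcome.

    barriers: ordered (name, probability_of_success) pairs. A barrier is
    only challenged when all earlier barriers have failed.
    """
    outcomes: dict[str, float] = {}
    remaining = initiator_per_year  # frequency still propagating down the tree
    for name, p_success in barriers:
        outcomes[f"flow maintained: {name}"] = remaining * p_success
        remaining *= 1.0 - p_success
    outcomes["loss of gas flow"] = remaining
    return outcomes


# Hypothetical inputs for illustration only.
result = event_tree(
    initiator_per_year=2.0,
    barriers=[("standby train", 0.9), ("second skid", 0.8)],
)
for outcome, freq in result.items():
    print(f"{outcome:32s} {freq:.3f} per year")
```

Every branch gets a frequency, and the branch frequencies sum back to the initiating-event frequency, which is a useful sanity check on any tree of this form.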

We built a component-level model of the TRUXX 3000WB covering every element affecting uptime, working from P&IDs, wiring diagrams, and PLC logic provided by Algas-SDI, supplemented by direct collaboration with their engineering team. Failure rates combined manufacturer-specific data, Algas-SDI's field service history, and established industry reliability databases. The Event Tree structure explicitly separated train-level failures, where redundant trains can continue to supply gas, from skid-level failures caused by shared systems that affect both trains simultaneously. That distinction was essential to correctly evaluating how different types of redundancy contributed to availability, and we evaluated the model across a range of single-skid and multi-skid architectures at both 35 and 70 MMSCFD operating scales, treating uptime performance and tier eligibility as independent measures throughout.

Technical Highlights

Within a single TRUXX skid, certain components serve all process trains simultaneously. The LEL combustible gas detection system, the waterbath, and the emergency shutdown (ESD) circuit are all shared across trains: if any of them trips or fails, every train on that skid shuts down regardless of which one was carrying the load. These are common-cause failures in reliability engineering terminology, and they represent a structural characteristic of how waterbath-style conditioning skids are designed.

Our analysis surfaced and quantified their significance for the TRUXX architecture. Adding a second train within the same skid protects against train-level faults but provides no protection against a skid-level shutdown triggered by LEL detection, a waterbath event, or ESD activation. Adding a fully independent skid, with its own detector, waterbath, and ESD system, eliminates that exposure entirely, and the quantitative difference between these two paths to redundancy is considerably larger than a simple train count would suggest. It is also worth being explicit that the shared safety systems are not a design flaw: LEL detection and ESD are intentionally skid-wide and intentionally fail-safe, and reliability improvements are achieved by designing around that architecture through independent skids, not by compromising safety functions.
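The gap between the two redundancy paths can be illustrated with a small calculation. This is a sketch under stated assumptions, not the study's model: the unavailability values are placeholders, any single train or skid is assumed able to carry the full load, and separate skids are assumed to fail independently of one another.

```python
def skid_unavailability(q_train: float, q_shared: float, trains: int) -> float:
    """A skid is down if its shared systems are down OR all its trains are down."""
    all_trains_down = q_train ** trains
    return 1.0 - (1.0 - q_shared) * (1.0 - all_trains_down)


def plant_unavailability(q_skid: float, skids: int) -> float:
    """Gas flow is lost only if every independent skid is down at once."""
    return q_skid ** skids


# Placeholder values, not TRUXX data.
Q_TRAIN = 0.01    # unavailability of one process train
Q_SHARED = 0.001  # unavailability of the skid-wide shared systems

one_skid_one_train = skid_unavailability(Q_TRAIN, Q_SHARED, trains=1)
one_skid_two_trains = skid_unavailability(Q_TRAIN, Q_SHARED, trains=2)
two_skids = plant_unavailability(one_skid_one_train, skids=2)

print(f"one skid, one train:   {one_skid_one_train:.6f}")
print(f"one skid, two trains:  {one_skid_two_trains:.6f}")
print(f"two independent skids: {two_skids:.6f}")
```

However many trains are added inside one skid, its unavailability cannot drop below that of the shared systems. Two independent skids multiply the whole skid-level unavailability together, so the shared-system term is squared along with everything else.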

Additionally, the sensitivity analysis found that the model's uptime predictions were robust to variations in diagnostic time, defect rates, and preventive maintenance scheduling, but that spare parts availability drove more sensitivity than almost anything else. When logistics delays from external parts sourcing replaced the on-site availability assumption, the impact was material, confirming that the gap between theoretical and realized uptime is primarily determined by operational execution.
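One way to see why spare parts dominate is to split repair time into its parts. The sketch below assumes restoration time is the sum of diagnosis, parts logistics, and hands-on repair. All durations and the MTBF are hypothetical placeholders, not figures from the study.

```python
HOURS_PER_YEAR = 8760.0


def availability(mtbf_h: float, diagnose_h: float,
                 logistics_h: float, repair_h: float) -> float:
    """Steady-state availability with restoration time broken into phases."""
    mttr_h = diagnose_h + logistics_h + repair_h
    return mtbf_h / (mtbf_h + mttr_h)


def downtime_hours_per_year(avail: float) -> float:
    return (1.0 - avail) * HOURS_PER_YEAR


MTBF_H = 10_000.0  # placeholder value

# Spare on the shelf: no waiting on parts.
on_site = downtime_hours_per_year(availability(MTBF_H, 2.0, 0.0, 4.0))
# Same spare on site, but diagnosis takes twice as long.
slow_diagnosis = downtime_hours_per_year(availability(MTBF_H, 4.0, 0.0, 4.0))
# Part must be sourced externally: assumed 72-hour delivery.
external_part = downtime_hours_per_year(availability(MTBF_H, 2.0, 72.0, 4.0))

print(f"on-site spare:       {on_site:.1f} h/yr")
print(f"doubled diagnosis:   {slow_diagnosis:.1f} h/yr")
print(f"external parts (72h): {external_part:.1f} h/yr")
```

Under these assumptions, doubling diagnostic time shifts downtime modestly, while a multi-day wait for a part raises it by roughly an order of magnitude, because logistics delay is usually the largest term in the restoration time.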

Results

The modeled TRUXX node achieved baseline uptime performance well in excess of 99.9% under the defined operating conditions, consistent with the availability requirements of permanent data center installations. Across the full range of configurations analyzed, the results showed a clear, quantified relationship between architectural choices and availability outcomes: a single-skid configuration with no skid-level redundancy delivered uptime well below data center standards, while fully redundant multi-skid architectures with cross-connected headers achieved performance that comfortably exceeded the five-nines threshold. The most impactful design decision was consistently the move from single-skid to multi-skid operation, driven by the elimination of shared-system failure exposure rather than by component-level changes.
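For readers less used to thinking in "nines", the short conversion below turns an availability percentage into allowed downtime per year. It uses only the two thresholds named above (99.9% and five nines) and assumes a 365-day year.

```python
import math

MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, assuming a 365-day year


def downtime_minutes_per_year(availability: float) -> float:
    """Expected downtime per year implied by a steady-state availability."""
    return (1.0 - availability) * MINUTES_PER_YEAR


def nines(availability: float) -> float:
    """Number of nines, e.g. 0.999 -> 3, 0.99999 -> 5."""
    return -math.log10(1.0 - availability)


for a in (0.999, 0.99999):
    print(f"{a:.5f}: {nines(a):.1f} nines, "
          f"{downtime_minutes_per_year(a):.1f} min/yr")
```

At 99.9% the budget is about 526 minutes (roughly 8.8 hours) per year. At five nines it is a little over 5 minutes, which is why a single extended outage can consume many years of allowance.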

The analysis also established realistic operational planning baselines. The study provided the expected annual labor hours for corrective maintenance, an estimate of equipment failures within the architecture, and an annual approximation of the scheduled preventive maintenance burden. Even with that maintenance load integrated into the model, the system remains above the 99.9% threshold, provided maintenance windows are managed to take down only one train or skid at a time within a redundant architecture.
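Planning baselines of this kind can be approximated from a component list. The sketch below is illustrative only: the component names, MTBF values, and labor hours are hypothetical, and it assumes constant failure rates with repair times short enough to ignore when counting failures.

```python
HOURS_PER_YEAR = 8760.0

# (name, MTBF in hours, labor hours per corrective repair).
# All entries are hypothetical placeholders, not the TRUXX inventory.
COMPONENTS = [
    ("component_a", 87_600.0, 4.0),
    ("component_b", 43_800.0, 2.0),
    ("component_c", 17_520.0, 8.0),
]


def expected_failures_per_year(mtbf_h: float) -> float:
    """Expected failures per year under a constant failure rate."""
    return HOURS_PER_YEAR / mtbf_h


def annual_corrective_burden(
    components: list[tuple[str, float, float]],
) -> tuple[float, float]:
    """Return (expected failures per year, expected corrective labor hours)."""
    failures = sum(expected_failures_per_year(m) for _, m, _ in components)
    labor_h = sum(expected_failures_per_year(m) * h for _, m, h in components)
    return failures, labor_h


failures, labor_h = annual_corrective_burden(COMPONENTS)
print(f"expected failures: {failures:.2f} per year")
print(f"corrective labor:  {labor_h:.2f} hours per year")
```

Totals like these give a site a starting point for staffing and spares decisions. The scheduled preventive maintenance burden would be added on top from the maintenance program itself.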

Validation

The reliability model was subjected to a structured sensitivity analysis to confirm that the conclusions held under variations in the assumptions most likely to affect the results. Diagnostic time, manufacturing defect rates, and preventive maintenance scheduling were each varied across realistic ranges, and in every case the uptime predictions remained above the 99.9% threshold, with no individual sensitivity materially changing the relative ranking of configuration options.

The spare parts finding warranted specific operational guidance and was explicitly called out in the final deliverable: when logistics delays reflecting external parts sourcing replaced the on-site availability assumption, the impact was large enough to change outcomes for some configurations, directly informing the recommendations on stocking levels and staffing readiness. The component inventory and functional logic underlying the model were reviewed across multiple document revisions with direct input from Algas-SDI's engineering team, and all failure rates were cross-referenced against manufacturer datasheets, field complaint history, and multiple independent industry reliability databases, with each source documented and traceable in the study report.

Conclusion

This engagement demonstrates what becomes possible when Event Tree Analysis and nuclear-grade reliability methodology are applied to industrial fuel gas conditioning systems. Algas-SDI now has a traceable, quantified, defensible answer to what the TRUXX can deliver for data center applications and under what conditions. The broader lessons apply well beyond this project:

Operational execution drives as much of realized uptime as equipment design does.
Tier eligibility and availability percentage are independent measures that both belong in any serious specification.
Shared-system effects inherent in skid-level architecture are quantifiable and directly relevant to how redundancy should be specified.

For anyone procuring or designing fuel gas conditioning for high-availability power generation, that kind of analytical clarity is exactly what this market needs.

Solvify brought nuclear-industry rigor to a reliability question our data center customers were increasingly asking. The Event Tree methodology gave us a defensible, traceable model, exactly the kind of analysis that holds up when sophisticated buyers ask hard questions.

— Sean Guichon, Sales Director Gas Systems, Algas-SDI International
