Semiconductor manufacturing operations are highly complex. Downtime is a major challenge and can cost a multi-billion dollar fab millions in lost profitability. Even a relatively small breakdown can bring production to a standstill and push the entire schedule back. With long lead times and a 24/7 production schedule, no modern-day fab can afford breakdowns. The problem is especially acute when the equipment manufacturer is located halfway across the globe, because it may take days to resolve the issue. Add the complexity brought about by the COVID-19 pandemic, which restricts free movement, and downtime becomes an event to be avoided at all costs.

Typically, a wafer fab with a construction cost of $7 billion would need to recover roughly $4 million per day just to amortize the investment. Any breakdown that brings production to a halt, even for a minute, is highly undesirable.
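As a rough check on these figures, the arithmetic can be sketched as follows. The sketch assumes straight-line recovery with no financing cost and a fab that runs every day of the year; it simply divides the two numbers quoted above, and the function and constant names are illustrative only.

```python
# Back-of-the-envelope check of the amortization figures quoted above.
# Assumptions (not from the article): straight-line recovery, no interest,
# production running 24 hours a day, 365 days a year.

FAB_COST_USD = 7_000_000_000      # construction cost quoted in the text
DAILY_RECOVERY_USD = 4_000_000    # daily recovery figure quoted in the text


def payback_years(cost: float, per_day: float) -> float:
    """Years needed to recover `cost` at `per_day`, running every day."""
    return cost / per_day / 365


def cost_per_minute(per_day: float) -> float:
    """Daily recovery figure expressed per minute of a 24-hour day."""
    return per_day / (24 * 60)


if __name__ == "__main__":
    years = payback_years(FAB_COST_USD, DAILY_RECOVERY_USD)
    per_minute = cost_per_minute(DAILY_RECOVERY_USD)
    print(f"Implied payback period: {years:.1f} years")
    print(f"Recovery target per minute: ${per_minute:,.0f}")
```

Under these assumptions, the quoted figures imply a payback period of about 4.8 years, and each minute of lost production corresponds to roughly $2,800 of the daily recovery target.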

Demand in the semiconductor industry is currently far outstripping supply. The whiplash effect of global supply chain shortages is causing semiconductor manufacturers to employ drastic new measures to squeeze out efficiencies in production. One of the largest contract chip manufacturers in the world has resorted to tactics such as delaying certain maintenance tasks and fractionally increasing line speeds. This is not a sustainable practice in the long term, but it eases capacity constraints in the short term.

Another unexpected aspect of the shortage? Lead times for the equipment used to build the chips have doubled. So even if manufacturers could afford to add processing equipment (which can cost $100 million or more for a single machine), it wouldn’t address the capacity constraints they are facing.

Turning to Predictive Maintenance

Predictive Maintenance therefore becomes a strategic initiative in a highly complex, automated semiconductor fab. It allows process owners and maintenance personnel to detect equipment-related issues proactively, before there is a breakdown. Production and schedule adherence are protected, and unplanned stoppages are avoided.

In 2015, IEEE published a document setting forth guidelines for Predictive Maintenance (PdM) in semiconductor manufacturing. It stated that PdM capability has migrated from a proof-of-concept/fault-detection system to a fab-wide solution. According to the guidelines, a PdM solution should be portable both across instances of a tool type and across tool types. The model should be customizable to the fab’s specific needs, taking into account the equipment, existing practices and people.

What this signifies is that, with the rapid adoption of IoT sensors, advances in data collection and processing technologies, and the analytical capabilities brought forward by AI and ML, it is now possible to predict equipment failures with a degree of certainty through PdM modelling and advanced analytical algorithms. A PdM solution should also be able to scale to incorporate the inherent complexity of a modern automated wafer fab and its operations.

IoT Data Platform

From a PdM perspective, process orchestration software such as an MES IoT data platform can provide extensible and scalable predictive maintenance models. The MES can deliver high-quality, precise predictions to help avert possible failures.

Using sensors that detect even the minutest changes in equipment states and metrics, ranging from internal vibrations and acoustic anomalies to temperature variations and pressure fluctuations, the MES IoT data platform collects this data at the edge and compares findings at the backend. Deploying complex predictive analytics algorithms, it informs the relevant personnel of the predicted time of a possible breakdown. This advance notice provides enough prep time to reschedule the line, plan a repair and conduct maintenance for optimum uptime. The platform can remotely guide engineers through a complex maintenance activity without any disruption to process orchestration. This is possible because the same MES that detects a possible issue also provides alternate product routes, recipe amendments and tooling options to maintain lead times and product quality parameters.
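One simple way to picture how a stream of sensor readings becomes advance notice is to fit a trend to a drifting metric and extrapolate it to an alarm limit. The sketch below does this with an ordinary least-squares line in plain Python. It is not the algorithm used by any particular MES; the function name, the sample vibration values and the alarm limit are all hypothetical.

```python
from typing import Optional, Sequence


def hours_until_limit(hours: Sequence[float],
                      readings: Sequence[float],
                      limit: float) -> Optional[float]:
    """Fit readings = slope * hours + intercept by least squares and
    return the hours remaining (measured from the last sample) until the
    fitted line crosses `limit`. Returns None if the trend is flat or
    falling, and 0.0 if the fitted line is already past the limit."""
    n = len(hours)
    if n < 2 or n != len(readings):
        raise ValueError("need at least two paired samples")

    mean_t = sum(hours) / n
    mean_y = sum(readings) / n
    sxx = sum((t - mean_t) ** 2 for t in hours)
    if sxx == 0:
        raise ValueError("timestamps must not all be equal")
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(hours, readings))

    slope = sxy / sxx
    if slope <= 0:
        return None  # no upward drift, so no crossing is predicted

    intercept = mean_y - slope * mean_t
    crossing_time = (limit - intercept) / slope
    return max(0.0, crossing_time - hours[-1])


if __name__ == "__main__":
    # Hypothetical hourly vibration readings drifting upward.
    sample_hours = [0, 1, 2, 3, 4]
    vibration = [2.0, 2.1, 2.2, 2.3, 2.4]
    remaining = hours_until_limit(sample_hours, vibration, limit=3.0)
    print(f"Estimated hours until alarm limit: {remaining:.1f}")
```

With these made-up readings, the fitted line rises by about 0.1 units per hour and reaches the limit roughly six hours after the last sample. A production model would combine many signals and account for noise and uncertainty, but the principle of turning a trend into lead time is the same.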


McKinsey defines this new predictive maintenance as PdM 4.0. It is an asset-wide analytics system to inform trained operators on how to respond to predicted failure events. While it is understood that PdM 4.0 is highly desirable for a semiconductor fab, McKinsey also highlights the challenges which exist in achieving this advanced level of maintenance functionality.

Viewed through a semiconductor manufacturing lens, these challenges are:

1. Insufficient Data

Insufficient, inaccessible and low-quality data are a major impediment to achieving PdM 4.0-level maintenance. In a typical semiconductor fab, data unavailability may be resolved by adopting IoT sensors and harnessing data from lower-level process automation applications. To make the data accessible and high quality, it is imperative that the fab deploy a modern MES platform. A modern MES not only harnesses the data, but also makes it accessible to modelling engineers and data scientists. This ultimately leads to advanced analytic algorithms that detect patterns in raw data and convert them into actionable predictive maintenance intelligence.

2. Inadequate Technology

This refers to an absence of edge sensing and the inability of the existing IT infrastructure to support PdM 4.0 operations. Semiconductor plants that lack the requisite number of sensors on mission-critical tools can retrofit an existing brownfield fab with modern IoT sensors, provided they are guided by the right partner with industry experience in MES and automation. Getting the right data from the right equipment is where the expertise and experience of the partner come into play. Once data acquisition issues are resolved, the current IT infrastructure needs to be evaluated. Are existing point solutions and legacy systems integrated? How can an MES data platform best be deployed to integrate with essential applications while ‘rightsizing’ the irrelevant ones?

Without the right MES application as the foundation of the modern application framework, it will be extremely challenging to extract equipment data in real time, add AI and ML to trigger instant alerts, and run the complex algorithms necessary to enable PdM.

3. Asset Prioritization

Many companies lack a clear view of which assets need to be included in a PdM program. For semiconductor fabs, because of the complex nature of manufacturing, with multiple re-entrant loops and long production cycles, it is almost a given that all process equipment needs some level of monitoring to prevent sudden breakdowns. Achieving this fab-wide monitoring is impossible without a modern MES acting as an overarching, central system. The MES allows equipment, fab environment, production and tooling data to be pooled and cross-referenced, so that latent patterns can be detected and used to build a predictive maintenance model. Unless this oversight exists, achieving PdM 4.0 will be difficult for a semiconductor fab.
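Even when every tool is monitored, a fab still has to decide where to invest modelling effort first. One common, simple heuristic (an assumption here, not something prescribed by McKinsey or by any MES) is to rank tools by expected annual downtime loss, computed from pooled failure history. The tool IDs, figures and field names below are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Asset:
    tool_id: str
    failures_per_year: float       # historical unplanned failures per year
    hours_down_per_failure: float  # average repair time per failure
    cost_per_hour_usd: float       # estimated cost of one hour of downtime

    @property
    def expected_annual_loss(self) -> float:
        """Expected yearly downtime cost for this tool."""
        return (self.failures_per_year
                * self.hours_down_per_failure
                * self.cost_per_hour_usd)


def rank_assets(assets: Iterable[Asset]) -> List[Asset]:
    """Return assets ordered from highest to lowest expected annual loss."""
    return sorted(assets, key=lambda a: a.expected_annual_loss, reverse=True)


if __name__ == "__main__":
    # Hypothetical pooled records for three tools.
    fleet = [
        Asset("CMP-02", failures_per_year=3, hours_down_per_failure=6,
              cost_per_hour_usd=10_000),
        Asset("LITHO-01", failures_per_year=2, hours_down_per_failure=12,
              cost_per_hour_usd=50_000),
        Asset("ETCH-03", failures_per_year=6, hours_down_per_failure=4,
              cost_per_hour_usd=20_000),
    ]
    for asset in rank_assets(fleet):
        print(f"{asset.tool_id}: ${asset.expected_annual_loss:,.0f} per year")
```

In this made-up fleet the lithography tool ranks first despite failing least often, because each failure is long and expensive. The ranking is only as good as the pooled data behind it, which is why a central system that cross-references equipment, production and tooling records matters.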

4. Missing Capabilities

Analytical models are vital to a healthy predictive maintenance program that is extensible, scalable and self-actuating. The data science and engineering expertise required to build these PdM analytical models is often missing in a semiconductor manufacturing facility.

This is where your MES vendor can play a critical role. The ability to compare tool data against past failures and current conditions, and to predict failure from that comparison, keeps an operation optimized. Rather than focusing on acquiring data scientists, you can divert that energy to identifying and using a capable partner who can deliver a seamless and successful transition towards predictive maintenance. The Gartner Magic Quadrant for MES is one tool that helps companies choose the right MES partner.

5. Weak Change Management and Low Economic Return

The final two reasons a PdM implementation is derailed hold true for any project. If the change to a state where predictive maintenance leads to definitive action is poorly managed, chances are breakdowns will still occur. And if a PdM system that is already implemented is not well adopted and does not deliver the promised returns, the result may be further frustration and a failed implementation.

User-unfriendly designs and the high cost of creating models are cited as the main impediments. Having the right partner in place can help avoid these conditions. An experienced MES vendor already has the know-how and expertise to deploy complex analytical models. Whether the fab is a standard 200 mm or a modern 300 mm facility, an experienced MES partner would be able to deliver on most PdM use cases and would have standard models that can be modified to suit custom needs.

Another distinct advantage of having an experienced partner is better change management. Since the vendor is aware of most issues, it is easier for them to help the fab adopt the improved PdM practices.

6. Tech Partners

Finally, as one of its Golden Rules for a successful PdM 4.0 implementation, McKinsey points out that tech partners matter, and we agree. The right MES partner brings the data mining, data analysis and data modelling expertise needed to build the models that form the basis of predictive maintenance. Through their data platform, the acquisition, analysis and dissemination of information happens in real time.

Each fab’s needs are accommodated through customization, based on its inherent and unique operational structure and complexity. ‘One size fits all’ is a myth when it comes to PdM in semiconductors. An experienced partner will not only provide technical capabilities, but will also assist in training staff and incorporating contingent requirements and functionality modifications. They will provide end-to-end work management and implement effective equipment integration practices, which in turn create value through actual results and reduced downtime.