How IBM built an AI model to discover railroad defects before they’re critical

IBM Research used the Norwegian railroad authority’s trove of photos to train a new AI model to detect and track early-stage issues before they can cause problems. The aim is to clear maintenance backlogs and free up skilled workers from tedious inspections.

Walking the length of a railroad track looking for structural issues seems old-fashioned. Believe it or not, that’s a large part of how rail inspections are conducted today.

IBM Research is changing that with a visual inspection model, deployed by IBM’s Maximo and Technology Expert Labs (TEL) divisions, on rails owned and maintained by Norway’s railroad authority, Bane NOR. Its purpose is to make it easier to ensure that thousands of kilometers of rail are safe and secure.

The new AI model can accurately detect 10 different railroad defects, flagging small problems to rail inspectors and engineers before they grow into large ones. It can also help track these early-stage issues even if they don’t yet need to be repaired, letting inspectors monitor them over time and see how they evolve.

Using images of the tracks, the model will let these skilled workers spend more of their valuable time making repairs and less time walking the tracks. The model is part of a larger project to modernize Bane NOR’s maintenance systems. For IBM Maximo Civil Infrastructure, it’s part of a broader move to improve and automate inspection processes for heavily regulated industries, according to Thomas Knowles, a senior product manager for Maximo.

This project grew out of previous IBM Research work on inspecting aircraft runways for concrete defects, which relied on AI analysis of drone-based photography. In 2023, an IBM Research team conducted a proof-of-concept experiment with Bane NOR, demonstrating that they could use AI to detect tiny cracks on rail sleepers, the concrete beams that lie perpendicular beneath train rails to distribute loads and maintain the track’s proper width, or gauge.

“We saw with a few adaptations, we could already get pretty good results,” said IBM Research scientist Florian Scheidegger, who worked on the runway crack detection project and led the development of this new model. Bane NOR was impressed enough by these initial results that it signed on with IBM’s TEL and Maximo divisions to develop and deploy an improved model that targets its high-priority track defects and integrates with its asset management systems. This pre-trained model is now available to Maximo customers in the 9.1 release of Maximo Civil Infrastructure and can be deployed using Maximo Visual Inspection.

This latest project builds on years of visual inspection innovations that IBM Research has brought to civil infrastructure applications. For diverse infrastructure applications including bridges, roads, manufacturing equipment, and other areas where defects are often small and rare, domain-specific large vision models fine-tuned on customers’ proprietary datasets are giving engineers new tools to ensure the safety and efficiency of structures meant to last for generations. Now, all that experience, learning, and technology transfer has yielded a sophisticated visual inspection model for Bane NOR.

Railroads in other countries may face different conditions than chilly Norway, but they’re all looking for the same basic defects. “Any rail customer that does visual inspections — which they all do — can take advantage of the work we’ve put in here to provide this out-of-the-box pre-trained rail defect model,” said Knowles.

Minor threats

A railroad track superstructure — or “overbygning,” as it’s called in Norwegian — consists of three main parts: the rails, the sleepers, and the fasteners. The train’s wheels roll on the rails, sleepers tie the rails together, and fasteners are small clips that attach the rails to the sleepers with the help of tie plates and insulators. Inspectors walking the rails make sure none of these components have broken or otherwise degraded. One loose clip or one cracked sleeper may not be critical, but enough defects in sequence can be disastrous.

Some flaws, like tiny pitting on the surface of metal rails, can elude an inspector’s gaze. “There’s a risk of under-reporting and misreporting,” said Claire Tinker, a sustainability solution architect at TEL. That’s not for lack of trying, though. “These are skilled workers, but they’re dealing with a ton of environmental factors. In some parts of Norway, they’re getting only a few hours of daylight during the winter, and the weather can get nasty.” Additionally, many different companies use the tracks, and inspectors can only walk them when there is no train traffic, limiting the hours they have access.

Since the beginning of this year, IBM Research staff have been working with rail images from Bane NOR, as well as the company’s civil engineers and other technical staff, to learn about which specific defects a model must be able to detect. In its current form, the new model can reliably identify 10 distinct objects in the superstructure.¹

Traditionally, only the main defects identified during inspection are addressed, and there is no way to track all the other small defects that are not yet worth repairing. “Inspections are done without necessarily reusing the data from the previous one,” said Cristiano Malossi, principal research scientist and manager at IBM Research. “Instead, with the model, we can identify and keep track of every defect, giving future inspectors the possibility to go back in time and see if and how the defect evolved.”
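The idea of keeping a history for every defect can be illustrated with a small data structure. The following is only a sketch under stated assumptions, not a description of how Maximo or the IBM Research model stores results: the `severity` score, the `position_m` field, the matching tolerance, and all function names are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Observation:
    inspected_on: date
    severity: float  # hypothetical 0..1 score, not a field named in the article


@dataclass
class TrackedDefect:
    defect_class: str
    position_m: float  # hypothetical: distance along the track in metres
    history: list = field(default_factory=list)


def record_detection(registry, defect_class, position_m, inspected_on,
                     severity, tolerance_m=0.5):
    """Attach a detection to a known defect at the same spot, or open a new record."""
    for defect in registry:
        same_class = defect.defect_class == defect_class
        close_enough = abs(defect.position_m - position_m) <= tolerance_m
        if same_class and close_enough:
            defect.history.append(Observation(inspected_on, severity))
            return defect
    defect = TrackedDefect(defect_class, position_m,
                           [Observation(inspected_on, severity)])
    registry.append(defect)
    return defect


def is_worsening(defect):
    """True when the latest observation is more severe than the first one."""
    history = defect.history
    return len(history) >= 2 and history[-1].severity > history[0].severity
```

With a registry like this, a detection from a later inspection lands on the same record as the earlier one, so an inspector could look back at how a crack that was not yet worth repairing has changed.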

[Figure: Three rows of photographs show railroad superstructure components highlighted in different colors. The new visual inspection model is fine-tuned on thousands of photographs of such components. Three of the objects it can identify are rail defects (top, purple), broken sleepers (middle, orange), and missing fastener insulation (bottom, pink).]

Training on a budget

One of the challenges of developing this model, Scheidegger noted, is the relative scarcity of rail defects and, as a result, of training data. “This is data that isn’t typically included in any model, so you need to start from scratch and build it on your own,” he said. Fortunately, Bane NOR had about 600,000 images of rails, sleepers, and fasteners taken in 2023. Part of the reason for the scarcity of training images is that Norwegian rail lines are kept in relatively good condition. Even with hundreds of thousands of images, there’s a low likelihood of finding many problems.

The existing defect images also lacked annotations of their flaws, something that is needed for fine-tuning a model. With the support of Bane NOR engineers, the team created a defect catalog and annotation guidelines to run a multi-phase, quality-controlled annotation campaign. Limited data availability and annotation quality prompted Scheidegger and his team to pursue new strategies, such as separating common objects into a dedicated workflow and requesting client-provided annotations. Lessons learned in that process led to refined guidelines and a split workflow, where rail engineers identified relevant images, and the IBM Research team performed annotation on those selections.

In some cases, the combined expertise of Bane NOR and IBM Research yielded ways to get the annotated defect photographs they needed for model training. For example, some rails had gone longer intervals without major maintenance, and the rails near stations are often under greater stress due to acceleration and deceleration forces, making defects more likely in these cases. And by initially running non-annotated images through an earlier version of the model, the team obtained hints about where defects for fine-tuning would be more likely to occur. Those outputs were then annotated, aiding in fine-tuning the next generation of the model.
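The model-assisted step described above, where an earlier model hints at which unlabeled images deserve an annotator’s attention, can be sketched roughly as follows. This is an illustrative assumption about how such a selection might work, not the team’s actual pipeline: the input format (one confidence score per image), the threshold, the budget, and the function name are all hypothetical.

```python
def select_for_annotation(scores, threshold=0.3, budget=100):
    """Return the image IDs most likely to contain a defect, highest score first.

    scores: mapping of image ID -> highest defect confidence that an earlier
    model version assigned to the image (hypothetical input format).
    threshold: minimum confidence for an image to be worth a human look.
    budget: maximum number of images the annotators can handle in this round.
    """
    candidates = [
        (score, image_id)
        for image_id, score in scores.items()
        if score >= threshold
    ]
    # Sort by score only, so ties keep their original order.
    candidates.sort(key=lambda pair: pair[0], reverse=True)
    return [image_id for _, image_id in candidates[:budget]]


# Example with made-up scores: only two images clear the threshold.
example_scores = {"img_001": 0.05, "img_002": 0.82, "img_003": 0.41, "img_004": 0.29}
shortlist = select_for_annotation(example_scores, threshold=0.3, budget=2)
```

The shortlisted images would then be annotated by people and fed into the next round of fine-tuning, which is what makes a large, mostly defect-free image collection manageable.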

While there are more than 10 possible railroad defects, Tinker noted that the project limited the model in its current state to the ones where the highest-quality data was available. “Once it is deployed, we can consider adding additional defect classes as may be necessary,” she said.

Real-world deployment

For some of the defect types the model is able to detect, manual rail inspection has long been the only way to find them. But Amalie Ravnåmo, a stream leader at Bane NOR, suspects rail workers would probably prefer spending their time on more impactful labor like repairs.

“Our workers are very skilled at what they do, but making on-foot inspections is hard, and some defects are getting missed,” said Ravnåmo. “Also, valuable information is sometimes lost between the inspection and the reporting, meaning that our workers might have to revisit the site.”

To start, the model will be deployed on images collected by a track geometry cart, a specialized rail inspection vehicle that collects images, track measurements, and other data critical to assessing track health.² Currently, the images are analyzed by a third-party company, but Bane NOR aims to bring that process under its own control with the visual model in Maximo.

[Figure: An InfraNord track geometry cart sits on a track. Track geometry carts like this one collect detailed photographs of railroad superstructure components that are fed into the visual inspection model.]

“This means that we will have more data on both defects that require urgent action and defects that are in development,” said Ravnåmo. “As we continue improving our maintenance, we might be able to catch up with our maintenance backlog and work predictively, removing minor defects and renewing track before it leads to serious defects and consequences for train traffic.”

And just as rail infrastructure is important for freight and passengers, technical infrastructure is crucial for running the visual inspection model. Training, evaluation, and testing were done on IBM’s research cluster equipped with multiple high-end GPUs. To accommodate real-world use, Tinker said, performing inference on millions of new images will require bulk inference systems, including proper load balancing between different servers, as well as converting results into a format that Bane NOR maintenance staff can review in a useful way.
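To make the bulk inference requirement concrete, here is a minimal sketch of one simple way to spread a large image set across several servers: split the images into batches and hand the batches out in round-robin order. It is an assumption for illustration only. The article does not say which load-balancing scheme is used, and the server names, batch size, and function names below are hypothetical.

```python
from itertools import cycle


def make_batches(items, batch_size):
    """Split a list into consecutive batches of at most batch_size items."""
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]


def assign_round_robin(batches, servers):
    """Give each server batches in turn so every server gets a similar share."""
    plan = {server: [] for server in servers}
    for server, batch in zip(cycle(servers), batches):
        plan[server].append(batch)
    return plan


# Example: ten placeholder image names, batches of four, two hypothetical servers.
images = [f"img_{i}.jpg" for i in range(10)]
batches = make_batches(images, 4)
plan = assign_round_robin(batches, ["server-a", "server-b"])
```

Round-robin is the simplest possible policy. A production system would more likely account for how busy each server is and for failed batches, but the shape of the problem (batching, dispatching, and collecting results) stays the same.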

“What we get with IBM is the ability to see the project as a whole,” said Ravnåmo. Maximo makes it possible to integrate results from the model into Bane NOR’s maintenance program in a way that is useful to Bane NOR’s employees.

IBM recently returned the test results to Bane NOR, along with the models. Bane NOR is now working to validate the model against unseen data captured in 2024 and 2025. Once the model is cleared for production, track engineers will be able to start viewing model results within Maximo Civil Infrastructure’s inspection tool, said Tinker. If all goes well, Bane NOR and TEL will work to expand the solution to perform inspections in every context presented by Norway’s diverse rail landscape, and keep trains moving for the 81 million train rides taken in the country each year.

References

¹ These objects are: long crack in sleeper, crack in sleeper close to fastening, short crack in sleeper, fastener out of position, missing fastener, missing insulator, rail weld, rail defect, insulated rail joint.
² Image credit: G och J, Creative Commons 4.0
