‘Bankenstein’ and a cold calculation mean banking crashes will continue

Banks are operating a “spaghetti” tech structure that is creating a game of “technology Jenga” for IT managers updating their systems.

Making changes to IT systems has huge knock-on effects, with upgrades risking downtime. But banks are willing to accept a certain level of downtime – this will only change when the cost of downtime in the form of customer compensation and fines is higher than the cost of achieving 99.999% availability.

In the aftermath of a three-day outage at Barclays, bosses at the UK’s biggest banks were forced to explain to MPs why their digital systems had been unavailable for a combined 800 hours in the past two years.

Barclays CEO Vim Maru told the MPs the cause of the recent outage was a software problem in a critical module of its UK mainframe operating system, which “caused progressively severe degradation of mainframe processing performance”.

Data received from banks by MPs on the Treasury Committee revealed at least 158 banking IT failures between January 2023 and February 2025.

System changes were the major cause of problems, followed by third-party issues, bugs and hardware failures.

Chris Skinner, fintech industry expert and CEO at The Finanser, said the structure at banks, with many different players and providers, along with a need to regularly update systems, creates a Jenga-like problem.

“There is a spaghetti structure that I call Bankenstein – a whole load of pieces stitched together to work, but if one part dies, the whole system goes down. You see the same issue with airlines and others. Systems go down due to one piece not working – it’s like one massive technology Jenga,” he told Computer Weekly.

Skinner said the complex legacy environment has been supplemented with a complex multi-supplier environment, making “the myriad of dependencies far more complex”.

Information received from the banks revealed that Barclays is expected to pay out up to £12.5m to customers in compensation.

The acceptable cost of downtime

According to one senior IT professional in the banking sector, who wished to remain anonymous, it all comes down to a simple question: how much will it cost to ensure 99.999% availability, and is it cheaper just to pay compensation when things go wrong?

The IT expert, who has worked at a variety of the UK’s biggest banks, said: “It looks like accepting the risk of outages is more economic than running ultra-high availability systems. If you can recover a key system relatively quickly, and that is cheaper than running at ultra-high availability, it may make more sense commercially to accept the risk of outages and rely on swift recovery instead.”

It looks like accepting the risk of outages is more economic than running ultra-high availability systems
Banking IT expert

The banking IT expert suggested the high expense of building and running systems at the “five nines” (99.999%) uptime standard means there is a cost reduction incentive to relax this for some systems.

“Barclays expects to pay more in compensation than the other banks, but the estimate is insignificant in comparison with the annual IT budget. The banks will also have internal costs, and maybe fines in addition to compensation, but these are not mentioned in the letters,” he said.

“If the banks can estimate the potential cost of fines and compensation for outages compared to the cost of higher uptime targets, it is possible to work out the optimum commercial uptime target for each system or service,” said the expert. “My guess is this would be lower than the ‘five nines’ target.”
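The expert’s reasoning can be sketched as a simple cost comparison. The figures below are purely illustrative assumptions (no bank has published running costs per uptime tier); the sketch just shows how, for each candidate availability level, annual running cost plus expected compensation for the downtime that level still permits yields an optimum commercial target.

```python
# Hedged sketch: all figures are illustrative assumptions, not bank data.
# Total annual cost of a tier = cost of running at that availability
# plus expected compensation/fines for the downtime it still allows.

MINUTES_PER_YEAR = 365.25 * 24 * 60


def downtime_minutes(availability: float) -> float:
    """Minutes of downtime per year permitted by a given availability."""
    return MINUTES_PER_YEAR * (1 - availability)


def total_cost(availability: float, run_cost: float, comp_per_minute: float) -> float:
    """Annual running cost plus expected compensation for allowed downtime."""
    return run_cost + downtime_minutes(availability) * comp_per_minute


# Assumed tiers: running cost rises steeply with each extra nine.
tiers = {
    0.999: 5_000_000,     # three nines
    0.9999: 12_000_000,   # four nines
    0.99999: 40_000_000,  # five nines
}

comp_per_minute = 25_000  # assumed compensation/fine exposure per minute down

best = min(tiers, key=lambda a: total_cost(a, tiers[a], comp_per_minute))
for a, run_cost in tiers.items():
    print(f"{a}: total £{total_cost(a, run_cost, comp_per_minute):,.0f}")
print(f"Optimum commercial target: {best}")
```

With these made-up numbers the optimum lands at four nines, not five, which is the shape of the expert’s guess: past a point, each extra nine costs more than the outages it prevents.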

Balancing costs

Five nines availability amounts to about five minutes’ downtime per year. Four nines (99.99%) uptime would allow almost an hour of outages per year, while three nines (99.9%) would allow almost nine hours.

Each extra nine can add significant IT cost.

“I know the banks wish for 100% reliability and availability of systems, but this is never likely to happen as it would be technically impossible and economically unviable,” the banking source told Computer Weekly. “The best defence for customers is to have accounts with several banks and hope they don’t all break at the same time.”

The banks wish for 100% reliability and availability of systems, but this is never likely to happen as it would be technically impossible and economically unviable
Banking IT expert

Following the explanations from banking bosses, Treasury select committee chair Meg Hillier said: “For families and individuals living paycheck to paycheck, losing access to banking services on payday can be a terrifying experience. Even when rectified relatively quickly, it can cause real panic, which is why we wanted to get a proper understanding of why unplanned banking outages happen and how banks and building societies respond. 

“The fact there have been enough outages to fill a whole month within the last two years shows customers’ frustrations are completely valid. The reality is that this data shows even the most successful banks and building societies hit technical glitches. What’s critical is they react swiftly and ensure customers are kept informed throughout.”

Regulators have imposed major fines for the most serious outages. In 2018, while migrating to a new core banking system, TSB experienced major problems. Over a five-day period, users were locked out, experienced money disappearing, and some were even able to see other customers’ accounts.

The UK regulator fined TSB nearly £50m for its failures. The bank also paid £32.7m in redress to customers who suffered detriment. Its then CEO, Paul Pester, fell on his sword, leaving the company soon after the disaster. The UK Prudential Regulation Authority also fined TSB’s former CIO, Carlos Abarca, £81,620 for his part in the catastrophic migration.
