# Why the flight you booked 3 months in advance is already delayed

*This article was written by Angela Zutavern and Josh Sullivan and first published here. The opinion of the authors does not necessarily correspond with that of the editorial team.*

Imagine you are flying over a major city at night—say, Chicago, Paris, or Beijing—and it is completely dark below. Then imagine that someone flips on the power grid, and you see today’s web of human activity light up.

This ability to “flip the switch” to see formerly hidden detail expresses the potential of machine intelligence, which enables us to ask profound new questions of data that go beyond what’s possible with traditional analytics. Leading an organization driven by machine intelligence—a mathematical corporation—requires embracing an evolving model in which leaders and machines share the cognitive workload.

Leaders will increasingly want to let machines hold sway over numbers, and not just in spreadsheets but in hefty computations. This may seem obvious, but a concrete case shows how much more important it is becoming. The Federal Aviation Administration (FAA) wanted to predict flight delays more accurately and further in advance, by weeks or even months. That doesn’t sound feasible, but it turns out that delays stem from a cascade of effects that, in a complex system, reveal themselves well before the day of departure.

The FAA, according to technology manager Kevin Hatton, has long relied on simple models that use actual departure times to predict how busy airports will be when planes arrive. If arrival windows look too busy, traffic controllers delay plane departures. The problem? “If predictions are wrong on departures,” says Hatton, “the planned system response can be wrong.”

Departure times are not entered into the system in any sophisticated way. In a longstanding model, when a plane doesn’t leave on schedule, the computer simply adds five minutes to the departure time, and it keeps adding five minutes for every additional delay, ad infinitum. “It’s crude and doesn’t represent reality at all,” says Hatton.
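To make the crudeness concrete, here is a minimal Python sketch of such a rule. It assumes one reading of Hatton's description: whenever the current estimate has passed and the plane has still not left, the estimate is pushed back by another five minutes. The function name and the sample times are hypothetical, not taken from any FAA system.

```python
from datetime import datetime, timedelta

# Fixed increment described in the article; everything else is illustrative.
PUSH_BACK = timedelta(minutes=5)


def naive_departure_estimate(scheduled: datetime, now: datetime) -> datetime:
    """Push the departure estimate back in 5-minute steps until it is
    no longer in the past. No cause of delay is ever considered."""
    estimate = scheduled
    while estimate < now:
        estimate += PUSH_BACK
    return estimate


if __name__ == "__main__":
    scheduled = datetime(2024, 1, 1, 9, 0)
    now = datetime(2024, 1, 1, 9, 12)  # plane still at the gate
    print(naive_departure_estimate(scheduled, now))
```

Under this reading, a flight grounded by a storm and a flight waiting on one late bag get exactly the same treatment, which is why the rule says so little about real departures.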

Work at the FAA today is contributing to a larger vision, an aspiration to regulate more airspace and more airlines, and even unmanned planes. To safely manage the increasing volume of air traffic, the FAA needs to move from relying solely on predefined flight patterns to managing open airspace using satellite guidance. This new system will change the whole approach to air traffic control.

Clearly, people alone, using simple models, cannot grasp the complexity of so many planes flying in open airspace. The FAA’s work on flight delays combines faster computing, parallel processing, and cloud-based computing with advanced statistical techniques to handle an enormous amount of number crunching. To make sense of the complexity, the FAA uses what’s called a Bayesian belief network, or BBN.

Imagine a Tinkertoy structure of joints and connectors. That’s what a Bayesian belief network looks like. The factors that influence the system, in the FAA’s case, say, the weather, are represented by the joints. The level of interplay between the factors is represented by the connectors. A drawing of the network, then, shows every factor contributing to delays and their connections to every other related factor.

When data scientists run the model, they use equations to calculate the probabilities of each factor affecting every other. “The idea from chaos theory is that one small change can make a big difference in a chain of events,” says Hatton. “This looks at all these chains of events.”
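A toy version of the joints and connectors can be sketched in a few lines of Python. This is only an illustration, not the FAA's model: it assumes a three-factor network in which weather and crew problems both feed into delay, and every probability below is made up.

```python
from itertools import product

# Hypothetical probabilities, chosen only for illustration.
P_BAD_WEATHER = 0.2
P_CREW_ISSUE = 0.1
# Connector strengths: P(delayed | weather_bad, crew_issue)
P_DELAY = {
    (True, True): 0.9,
    (True, False): 0.6,
    (False, True): 0.5,
    (False, False): 0.1,
}


def joint(weather_bad: bool, crew_issue: bool, delayed: bool) -> float:
    """Probability of one complete scenario, multiplied along the network."""
    p_w = P_BAD_WEATHER if weather_bad else 1 - P_BAD_WEATHER
    p_c = P_CREW_ISSUE if crew_issue else 1 - P_CREW_ISSUE
    p_d = P_DELAY[(weather_bad, crew_issue)]
    return p_w * p_c * (p_d if delayed else 1 - p_d)


def prob_delay() -> float:
    """Overall chance of delay, summed over every weather/crew scenario."""
    return sum(joint(w, c, True) for w, c in product([True, False], repeat=2))


def prob_bad_weather_given_delay() -> float:
    """Reason backwards: given a delay, how likely was bad weather?"""
    numerator = sum(joint(True, c, True) for c in [True, False])
    return numerator / prob_delay()


if __name__ == "__main__":
    print(f"P(delay) = {prob_delay():.3f}")
    print(f"P(bad weather | delay) = {prob_bad_weather_given_delay():.3f}")
```

The sketch sums over every scenario by brute force, which works for three factors. With dozens of factors the number of scenarios explodes, which is where the computing power comes in.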

In flight delays, factors range from aircraft cleaning and baggage loading to crew problems, flight clustering, and weather. The factors might correlate directly or indirectly, a fact reflected in the calculations. In the FAA project, the team established probabilities both by machine learning, which required four different algorithms, and by expert opinion. (Yes, opinion still counts.) The web of relationships was mind-boggling. No person could grasp all the connecting influences, even if someone had a good “feel” for system-wide interactions.

“The real beauty of the BBN is that it doesn’t require any human to think what causes delay. The machine learning algorithm . . . figures out what the connections are,” says Hatton. “It can identify patterns that we may not be aware of or wouldn’t even think to look for”—so long as it has the computational power to do all that learning.

To appreciate the scope of the calculations, consider that if every factor had just two states (e.g., on/off), the number of values needed to calculate final delay probabilities would be 2 to the nth power, where n is the number of variables in the model. In this case, the model had 47 variables, for a potential set of about 140 trillion probability values.
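The arithmetic is easy to check. The short Python sketch below uses a hypothetical helper, `joint_table_size`, and simply raises the number of states to the number of variables, under the article's simplifying assumption of two states per factor.

```python
def joint_table_size(n_variables: int, states_per_variable: int = 2) -> int:
    """Number of distinct combinations when every variable has the same
    number of states (a simplifying assumption)."""
    return states_per_variable ** n_variables


if __name__ == "__main__":
    # 47 two-state variables, as in the article
    print(f"{joint_table_size(47):,}")  # 140,737,488,355,328
```

Each added two-state variable doubles the count, so even a handful of extra factors multiplies the work many times over.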

Tapping the power of the machine to crunch such numbers is imperative. And it will remain imperative for all leaders who want data-driven answers. You can no longer come up with quick rules of thumb to get better answers than machines in such complex cases. At the FAA, the team applied its computing horsepower to a data sample of 52 million flights over five years. The sample included 5.25 million rows of data. The computations were even more complicated than anticipated because the data were not clean; the Bayesian belief network was needed because it can estimate missing values amid all that complexity.
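How a network can stand in for a missing value can be shown with a deliberately tiny sketch. It assumes a two-factor network, weather leading to delay, with made-up probabilities. When a record lacks the weather field, Bayes' rule gives the probability that the weather was bad, given whether the flight was delayed. The names and numbers are hypothetical, and the FAA's actual procedure is not described in the article.

```python
from typing import Optional

P_BAD_WEATHER = 0.2                      # hypothetical prior
P_DELAY_GIVEN = {True: 0.6, False: 0.1}  # hypothetical P(delayed | weather_bad)


def posterior_bad_weather(delayed: bool) -> float:
    """P(weather_bad | delayed), computed with Bayes' rule."""
    def likelihood(weather_bad: bool) -> float:
        p = P_DELAY_GIVEN[weather_bad]
        return p if delayed else 1 - p

    bad = P_BAD_WEATHER * likelihood(True)
    good = (1 - P_BAD_WEATHER) * likelihood(False)
    return bad / (bad + good)


def fill_missing_weather(weather_bad: Optional[bool], delayed: bool) -> float:
    """Return the probability that weather was bad for one record.
    Observed values are kept; missing ones (None) are estimated."""
    if weather_bad is not None:
        return 1.0 if weather_bad else 0.0
    return posterior_bad_weather(delayed)


if __name__ == "__main__":
    records = [(True, True), (None, True), (None, False)]
    for weather_bad, delayed in records:
        print(weather_bad, delayed, round(fill_missing_weather(weather_bad, delayed), 3))
```

The estimate is a probability rather than a hard guess, so downstream calculations can carry the uncertainty forward instead of discarding the incomplete row.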

The machine keeps learning and improves more than even the most experienced leaders can, not through experience, intuition, or expert advisers but through recent advances in machine intelligence. It actually has an advantage in having no preconceived ideas about causes. Says Hatton: “When going from the causes, you’re only going to notice things you feel intuitively will cause delays,” and miss everything else, such as delays unrelated to obvious factors like aged planes or thunderstorms in Chicago.

For the flight delay problem, the Bayesian belief network bested the most advanced, yet simpler, statistical models previously in use. This held at all time horizons, both long before and just before flights. The BBN made better predictions even though it could not incorporate all the data: a comprehensive traffic flow management data set was unavailable.