Author: Christophe Briguet, Co-Founder, Head of Solutions Engineering and Pre-Sales Support
On June 1st, 2009, Air France Flight 447 disappeared in stormy weather over a remote part of the Atlantic while carrying 228 passengers and crew from Rio de Janeiro to Paris. Everyone on board tragically lost their lives. Recovering the wreckage proved a difficult task: after two years of unsuccessful search campaigns, the French civil aviation safety authority (the Bureau of Enquiry and Analysis for Civil Aviation Safety) asked the US company Metron for help.
Metron’s researchers used Bayesian inference to organize the available data: environmental information such as currents and winds, results from previous unsuccessful search efforts, data from previous plane crashes, characteristics of aircraft black boxes (flight data and cockpit voice recorders) and their associated uncertainties. From this data, they made various hypotheses about the wreckage location, particularly considering their doubts that the black box beacons had survived AF 447’s crash. After reviewing all the data, scientists and analysts produced a probability map for the location of the underwater wreckage.
In April 2011, after 9 months of data analysis and a week of exploration, the underwater wreckage of flight AF 447 was located in the southern part of the Atlantic Ocean some 14,000 feet below the surface, in what the Metron team identified as a high-probability location.
It should be noted that the wreckage was finally discovered in a location that had already been explored by previous search teams. What made the difference in this last search campaign was the analytical assessment of the highest likelihood areas and the analysts’ intuition that both black box beacons might have failed.
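The core of Metron's method, Bayesian search theory, can be sketched in a few lines: divide the search area into cells, assign each a prior probability of containing the wreckage, and after every unsuccessful search, scale the searched cell's probability by the chance the search would have missed the wreckage anyway. The grid, priors, and detection probabilities below are purely illustrative, not the actual AF 447 figures.

```python
# Each grid cell holds the prior probability that the wreckage is there
# (illustrative numbers only).
priors = {"cell_A": 0.35, "cell_B": 0.40, "cell_C": 0.25}

# Probability that a search of a cell WOULD have found the wreckage if it
# were there. If the black box beacons failed, an acoustic search has much
# lower odds of detection, as for cell_B here.
p_detect = {"cell_A": 0.9, "cell_B": 0.3, "cell_C": 0.9}

def update_after_failed_search(priors, searched, p_detect):
    """Bayes' rule: an unsuccessful search of a cell scales its probability
    by the chance the search would have missed the wreckage anyway."""
    posterior = {}
    for cell, p in priors.items():
        miss = (1 - p_detect[cell]) if cell == searched else 1.0
        posterior[cell] = p * miss
    total = sum(posterior.values())
    return {cell: p / total for cell, p in posterior.items()}

posterior = update_after_failed_search(priors, "cell_B", p_detect)
```

Note that a cell searched under a low detection probability (say, because the beacons may have failed) loses only part of its probability mass and remains plausible, which is exactly why the wreckage could still turn up in an area that earlier campaigns had already covered.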
There is much security practitioners can learn from this tragic event. Even though the search for airplane wreckage at the bottom of over 3,000 square miles of ocean is very different from finding a single threat in petabytes of data, there is still some similarity in the search strategy.
1. Gather all available information and analyze data with the big picture in mind.
It can’t be said enough: successful threat detection starts with the meticulous collection and management of heterogeneous data, from network packets to application logs, across every location and format. Make sure you’re processing all of it. Doing so not only extends your security team’s field of vision; the quality of the data directly determines the value of the analytics and, ultimately, the efficacy of the protection. Leave out a piece of the data and you may make incorrect assumptions and inadvertently pass over exactly what you’re looking for.
In the same way that AF 447’s beacons failed to activate, resulting in a long and difficult search, a single blind spot on a specific asset can delay the discovery of a threat, extending its dwell time by months and giving a malicious actor ample time to progress through the network and continue down the Cyber Kill Chain.
2. Boost the machine.
How do you get machine learning models to make the right predictions? By making sure they have been exposed to the full spectrum of situations. The initial training (read: learning) of a machine learning system is often based on biased samples drawn from known and understood situations, so the system requires feedback to improve the accuracy of its predictions over time.
The first applicable approach is called “reinforcement learning.” It allows analysts to provide feedback and update the machine learning algorithm as they observe the results (think of Amazon Alexa’s voice feedback: “Did Alexa do what you wanted? Yes/No”).
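In practice, folding analyst yes/no feedback back into an alert-scoring model can be as simple as an online, perceptron-style weight update. The sketch below is illustrative: the feature names and learning rate are hypothetical, and real systems would use a proper online-learning library rather than hand-rolled weights.

```python
# Hypothetical alert features; a real system would have many more.
FEATURES = ["failed_logins", "off_hours", "new_device"]

weights = {f: 0.0 for f in FEATURES}
LEARNING_RATE = 0.1

def score(alert):
    """Linear score: positive means the model predicts a threat."""
    return sum(weights[f] * alert[f] for f in FEATURES)

def analyst_feedback(alert, was_threat):
    """Perceptron-style update: when the model's verdict disagrees with the
    analyst's, nudge the weights toward the analyst's answer."""
    predicted_threat = score(alert) > 0
    if predicted_threat != was_threat:
        direction = 1.0 if was_threat else -1.0
        for f in FEATURES:
            weights[f] += LEARNING_RATE * direction * alert[f]

alert = {"failed_logins": 1.0, "off_hours": 1.0, "new_device": 0.0}
analyst_feedback(alert, was_threat=True)  # analyst confirms: real threat
```

After the analyst's "yes," the model now scores similar alerts positively; a "no" would push the weights the other way, which is the essence of the feedback loop described above.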
The second approach is called “continuous active learning.” It relies on the system’s capability to self-select the cases that need to be reviewed and labeled by analysts. The key here is that the machine chooses the examples that will best improve its accuracy by selecting difficult examples under its current classifier hypothesis, i.e. the areas that are “fuzzy” within its realm of understanding. In a security operations center (SOC), the system can request feedback on similar recurring signals to improve its signal-to-noise ratio.
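The selection rule behind continuous active learning is often uncertainty sampling: the system asks analysts to label the cases its current classifier is least sure about. A minimal sketch, with hypothetical alert IDs and scores:

```python
def pick_for_review(scored_alerts, budget):
    """scored_alerts: list of (alert_id, p_threat) from the current model.
    Return the `budget` alerts whose predicted probability is closest to
    0.5, the "fuzzy" region where an analyst's label helps the model most."""
    return sorted(scored_alerts, key=lambda a: abs(a[1] - 0.5))[:budget]

# Confident predictions (0.98, 0.07) are skipped; borderline ones are queued.
scored = [("a1", 0.98), ("a2", 0.52), ("a3", 0.07), ("a4", 0.45)]
queue = pick_for_review(scored, budget=2)
```

Spending the limited analyst budget on the borderline cases, rather than on alerts the model already classifies confidently, is what makes the feedback loop efficient.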
Both approaches require analysts to contribute, and both can be perceived as Sisyphean tasks. In a world where investigation time and budget are inevitably limited, the system must use analysts’ time effectively: false positives cause alert fatigue and distrust of the system. Moreover, machine learning systems have an inherent inertia that doesn’t necessarily meet analysts’ expectations of instant results. The impact of feedback may take time to become noticeable, and training the system can feel tedious compared to the immediate reward of clearing the alert queue or finding a threat, particularly when the change resulting from the feedback is not obvious to analysts.
The third approach involves scaling the feedback loop by involving end-users. The typical incident management process in a SOC has security analysts reaching out to the end-user involved in the case, or that person’s manager, to verify what happened and proceed with the investigation. So, instead of manually reaching out to employees, companies like Dropbox have developed a solution that delegates to end-users the task of providing system feedback, reducing the burden on the security team. Built on Slack’s messaging platform, the system uses a bot to reach out to the employees involved in an alert and ask them whether the event was good or bad. It then sends aggregate results back to the security team to help them sort through alerts more quickly. This interaction happens within minutes of the suspicious activity, similar to how banks and credit card companies send fraud alerts almost instantly after unusual activity occurs on a customer account.
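The delegation workflow can be sketched as a simple routing rule. The function name, parameters, and thresholds below are hypothetical illustrations, not Dropbox's actual implementation:

```python
def triage(alert_id, user_confirmed, mfa_verified):
    """Route an alert based on the end-user's own answer to the bot.

    user_confirmed: True if the user says "that was me", False if they
    deny the activity, None if they did not answer in time.
    mfa_verified: whether the user's reply was confirmed via a second factor,
    so a compromised account cannot simply vouch for itself.
    """
    if user_confirmed and mfa_verified:
        return "auto-resolve"    # user acknowledged the activity as theirs
    if user_confirmed is False:
        return "escalate"        # user denies the activity: likely compromise
    return "analyst-review"      # no answer, or reply not verified

result = triage("a-101", user_confirmed=True, mfa_verified=True)
```

Only the denied and unanswered cases reach the analyst queue, which is how delegating the first question to end-users scales the feedback loop without scaling the team.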
3. Build analyst intuition.
In the same way machines need continuous active learning to make correct predictions, analysts must continuously broaden their experience to sharpen their intuition and make sound assumptions. In the case of AF 447, Metron brought together a diverse group of experts who had been exposed to a full spectrum of experiences and who, because of this, were able to select the right parameters for their model of the wreckage’s likely whereabouts.
As human beings, we build our assumptions based on what we’ve encountered in life. Security analysts’ instincts become increasingly nuanced as they investigate more cases. For example, a senior analyst in an organization will likely know what typical behavior looks like for a group of users, so when there’s suspicious activity, the senior analyst can intuitively create a mental reasoning map based on the actual data and previous experiences. Then, they can identify multiple realistic scenarios and make a decision accordingly. With this approach, incorrect or uninterpretable predictions are not simply ignored, but used as precious learning materials to be reviewed and fed back into the system.
And let’s not forget about bias. There is bias in machine learning due to limited sets of training data. Successful analysts recognize bias in their own learning process, too. In particular, the human predisposition is to pay attention to or remember successes and forget failures, but both success and failure are extremely useful in learning and improving future outcomes. However, this is a topic that merits its own separate post.
Improving the feedback loop improves outcomes
In the final campaign to find the wreckage of AF 447, exploration commenced in the center of Metron researchers’ probability distribution map, and the search team quickly found the wreckage. Their strategy of gathering all available data, and using statistics and probability was crucial in deciding where to focus that final search effort — and it paid off. The remains of many of the flight’s passengers and crew were reunited with their loved ones and put to rest, and both black boxes were recovered, which led to the determination of the cause of the crash.
In a kind of parallel, organizations all over the world face a similar, albeit far less dire, situation: they must identify threats within massive volumes of activity and produce insights that are actionable. Internalizing feedback significantly improves both human intuition and a machine learning system’s prediction accuracy over time. Machine learning systems help security practitioners move from manually manipulating data for analysis to focusing on strategy and recommendations that make decision-making more efficient, accurate, and effective.
Read about how the research team from Metron modeled the data and steered the search team to locate the wreckage here.
Special thanks to the Institute of Mathematical Statistics for their permission in using images from the research paper in this blog post.
Read about E8 Security’s use of machine learning in Ravi Devireddy’s post, Why E8?