Author: Christophe Briguet, Co-Founder, Head of Solutions Engineering and Pre-Sales Support
In my last blog post — the second in this series of three — I presented a three-layer user and entity behavior ontology architecture. We listed high-level classes and subclasses, and provided examples of question-driven use cases that could leverage those classes. In this final post, we will apply the suggested ontology to a specific case study and discuss how security practitioners would benefit from it.
Early Detection, Rapid Containment
Let’s start with a look at epidemiology, specifically at the strategy deployed in India in 1974 to eradicate smallpox. At the time, India accounted for 90% of the world’s cases of smallpox, considered one of the worst diseases in history. In his autobiography Sometimes Brilliant, Larry Brilliant describes how, as a World Health Organization consultant, he took part in the remarkable effort to survey and contain the smallpox epidemic in India, even though eradication was considered an impossible task at the time. The main challenge was that many cases never came to the attention of the authorities, and because mass vaccination was not a viable option (imagine the cost and logistics of vaccinating 550 million people), the team had to develop innovative strategies. They deployed a network of epidemiologists in the most vulnerable areas to identify cases as early as possible and prevent outbreaks. By vaccinating the cluster of people who had regular contact with each infected individual, and by monitoring for signs of suspicious behavior (for example, by keeping track of clues such as extended absences from school or work), they were able to contain the outbreak fairly rapidly.
Now, consider cybersecurity. We are facing similar challenges, but to a lesser extent: limited people and resources, never a guarantee that an environment is completely secure, and signatures as the primary defense against known threats. Early detection and rapid containment through targeted but continuous monitoring appear to be efficient ways to prevent malware outbreaks and data breaches.
Compromised Hosts, Covert Channels, and Lateral Movement
One of the most harmful attack scenarios inside a computer network involves zero-day exploits used to compromise machines, combined with hidden or covert communication channels that transfer information between the compromised host and the attacker’s system. For a security practitioner, the objective is to identify those compromised machines before critical information is exfiltrated and before the attack spreads across the network. There are many ways to address this challenge, from relying on a library that encodes negative behavior (like a signature) to relying on a library that encodes only positive behaviors.
The approach described here consists of detecting deviant behavior by comparing the observed activity with behaviors learned from the environment, such as past and peer-group behaviors. In practice, this means continuously searching for machines that are running an unknown process (a process that has never been seen before) and communicating with an unpopular website in an atypical manner. It can also help to reduce the noise-to-signal ratio by taking into account the activity of the machine’s owner. For example, unusual data access patterns (lateral movement) that coincide with unknown processes and potential command-and-control signals would increase confidence that the behavior is malicious.
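As a rough illustration of how these signals could be combined, here is a minimal Python sketch. It is not the implementation behind this series: the names `HostObservation` and `score_host`, the weights, the rarity threshold, and the sample process and host names are all assumptions made for the example. It assumes a baseline of previously seen processes and a per-destination popularity figure (the fraction of peer machines that contact it) have already been learned from the environment.

```python
from dataclasses import dataclass


@dataclass
class HostObservation:
    host: str
    process: str                # process seen running on the host
    destination: str            # website the process talked to
    owner_unusual_access: bool  # owner shows unusual data access patterns


def score_host(obs, known_processes, destination_popularity, rare_threshold=0.01):
    """Combine three behavioral signals into a confidence score in [0, 1].

    known_processes: set of process names seen before (past behavior).
    destination_popularity: dict mapping a destination to the fraction of
        peer machines that contact it (peer-group behavior).
    The weights below are arbitrary illustration values, not tuned ones.
    """
    signals = {
        "unknown_process": obs.process not in known_processes,
        "unpopular_destination":
            destination_popularity.get(obs.destination, 0.0) < rare_threshold,
        "owner_unusual_access": obs.owner_unusual_access,
    }
    weights = {
        "unknown_process": 0.4,
        "unpopular_destination": 0.3,
        "owner_unusual_access": 0.3,
    }
    score = sum(weights[name] for name, fired in signals.items() if fired)
    return round(score, 2), signals


# Hypothetical baselines learned from the environment.
known = {"chrome.exe", "outlook.exe"}
popularity = {"intranet.example": 0.8, "rare.example": 0.001}

suspect = HostObservation("laptop836", "xyz.exe", "rare.example", True)
normal = HostObservation("laptop101", "chrome.exe", "intranet.example", False)

suspect_score, suspect_signals = score_host(suspect, known, popularity)
normal_score, _ = score_host(normal, known, popularity)
```

Returning the individual signals alongside the score keeps the result explainable: an analyst can see which behaviors contributed to the score rather than only the final number.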
But how does our ontology enable the implementation of such a use case? What would the benefits be for security practitioners or data scientists?
First, the ontology supports an extensible model that defines rules and other parameters to automatically analyze potential malicious behavior with a high level of precision. The ontology framework allows categorizing behavior in a precise way, which is a fundamental step for further developments on reasoning and detection procedures. For example, security practitioners and data scientists, who often have very different backgrounds, can describe the use case using a common language, which facilitates consensus around the objective (what they’re trying to find and why) and the type of analytics to be used (how they’re going to find it).
This example, depicted below, captures the attack scenario described earlier: the behaviors of both the user Mark Ingram and his machine, laptop836, are analyzed over time. Their properties (such as IP addresses and user IDs) are tracked and used to link the two entities together (i.e., the device laptop836 belongs to the user Mark Ingram).
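One simple way to picture this kind of entity linking is to count how often a user and a device are seen on the same IP address at about the same time. The sketch below is only an assumption about how such linking could work: the function names, the one-hour window, the user IDs `mingram` and `asmith`, the device `desktop12`, and the IP addresses are all invented for illustration.

```python
from collections import Counter


def link_users_to_devices(user_sightings, device_sightings, window=3600):
    """Count co-occurrences of a user and a device on the same IP address.

    Each sighting is a (timestamp_seconds, name, ip) tuple. Two sightings
    count as a co-occurrence when they share an IP and are at most
    `window` seconds apart.
    """
    links = Counter()
    for u_time, user, u_ip in user_sightings:
        for d_time, device, d_ip in device_sightings:
            if u_ip == d_ip and abs(u_time - d_time) <= window:
                links[(user, device)] += 1
    return links


def owner_of(device, links):
    """Return the user most often linked to `device`, or None if unseen."""
    candidates = {user: n for (user, dev), n in links.items() if dev == device}
    return max(candidates, key=candidates.get) if candidates else None


# Hypothetical sightings, e.g. from authentication and network logs.
user_sightings = [
    (1000, "mingram", "10.0.0.5"),
    (5000, "mingram", "10.0.0.5"),
    (1200, "asmith", "10.0.0.9"),
]
device_sightings = [
    (900, "laptop836", "10.0.0.5"),
    (5100, "laptop836", "10.0.0.5"),
    (1100, "desktop12", "10.0.0.9"),
]

links = link_users_to_devices(user_sightings, device_sightings)
owner = owner_of("laptop836", links)
```

Counting co-occurrences over time, rather than trusting a single match, makes the link less sensitive to reassigned IP addresses.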
Checking the Facts
Early adopters have gotten used to living in the gray area; they are comfortable with the uncertainty that comes with analyzing behavior and identifying anomalous patterns. But what if they could close the gap between the outcome of an analytic model and a definitive conclusion? In many cases, data visualization is the path to proof: it surfaces anomalous behavior by presenting behavioral data in a visual and spatial form. An ontology helps define how the data will be visualized, quantify the behavior, and build trust (all stakeholders understand the objective, as well as the process and data used to achieve it). It also allows artificial intelligence to be transparent about how and why a specific behavior has been characterized in a given way. Essentially, it does away with the “black box” approach in which so many AI products are shrouded, and gives humans a way to validate that the output is correct.
For example, a SOC analyst can grasp complex relationships, data, and connections more quickly and easily when they are presented as a related set in a chart than by memorizing each fact independently.
An ontology could be used to drive the visualization technique. In the example below, the foundational elements, such as the entities and abstracted actions, are represented as nodes and edges that connect elements together, respectively. The third layer, the meta-behavior, is represented with color. Using this layout, we can leverage advanced graph theory and topological analysis to expose unique insights from the entity activities. The graph enables security analysts to ask more meaningful questions:
- How has behavior changed over time?
- Which entities are behaving differently from others?
- Are some groups of entities behaving disproportionately? (percentage of overall population/percentage of deviant population)
- Are there other entities behaving like this?
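To make the layout and two of these questions concrete, here is a minimal Python sketch. It assumes a plain dictionary of nodes and a list of edges; the field names, the groups `finance` and `engineering`, and every entity other than Mark Ingram and laptop836 are hypothetical. The `meta` field stands in for the color layer. The ratio is computed as the share of the deviant population divided by the share of the overall population, which is one possible reading of the two percentages mentioned above.

```python
from collections import Counter

# Nodes are entities; "meta" plays the role of the color (meta-behavior) layer.
nodes = {
    "Mark Ingram": {"type": "user", "group": "finance", "meta": "deviant"},
    "laptop836": {"type": "device", "group": "finance", "meta": "deviant"},
    "user_b": {"type": "user", "group": "finance", "meta": "normal"},
    "user_c": {"type": "user", "group": "engineering", "meta": "normal"},
    "user_d": {"type": "user", "group": "engineering", "meta": "normal"},
    "laptop101": {"type": "device", "group": "engineering", "meta": "normal"},
}

# Edges are abstracted actions connecting two entities.
edges = [
    ("Mark Ingram", "logs_into", "laptop836"),
    ("user_c", "logs_into", "laptop101"),
]


def group_disproportion(nodes):
    """Per group: (share of overall, share of deviant, ratio of the two).

    A ratio above 1 means the group is over-represented among deviant
    entities relative to its size.
    """
    overall = Counter(attrs["group"] for attrs in nodes.values())
    deviant = Counter(
        attrs["group"] for attrs in nodes.values() if attrs["meta"] == "deviant"
    )
    total = sum(overall.values())
    total_deviant = sum(deviant.values())
    result = {}
    for group, size in overall.items():
        share_overall = size / total
        share_deviant = deviant[group] / total_deviant if total_deviant else 0.0
        result[group] = (share_overall, share_deviant, share_deviant / share_overall)
    return result


def entities_behaving_like(entity, nodes):
    """Other entities that carry the same meta-behavior label as `entity`."""
    label = nodes[entity]["meta"]
    return sorted(
        name for name, attrs in nodes.items()
        if attrs["meta"] == label and name != entity
    )


disproportion = group_disproportion(nodes)
similar = entities_behaving_like("Mark Ingram", nodes)
```

In a real deployment the labels would come from the analytics layer and the graph would live in a graph library or database; the point here is only that the three ontology layers map naturally onto nodes, edges, and a node attribute.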
With cybercrime out of control, bad actors operating anonymously and undetected, and massive data breaches in the news every single day, it’s impossible to completely protect every user, device, system, and application. The ontology described here helps security professionals characterize, reason about, and engineer the abstract concept of behavior. No two organizations will have the same structure, size, and topology, but the user and entity activities on the network will always share a similar set of qualitative characteristics. By sharing such qualitative knowledge openly among security professionals, we can shorten response times by effectively identifying who is infected and “inoculating” others.