Big data, without a scalable way to make sense of it, is pure chaos.
Business intelligence groups in sales and marketing have worked with big data for years to get inside the minds of their prospects and customers, learning how to better market their offerings and where the gaps are. But in cybersecurity, the massive amount of potential insight from big data is a newer issue, and we’re all still figuring out the best way to deal with it.
The rise of big data has created a need for data science, the practice of slicing data in different ways to interpret it correctly and figure out what it means, to make sense of some of the “chaos.” Converging data science with cybersecurity brings benefits, but there are also costs to consider. Here, we look at the benefits and challenges of the collision of big data and cybersecurity.
The Benefits of Leveraging Data Science within Cybersecurity
Data science is the path to truly understanding the “psychology” of an organization and that of an attacker. Accessing droves of data and expertly analyzing it can be the difference between catching an attack before it happens and performing after-the-fact forensics. The behaviors of machines and users within an organization are often where the first signs of attack occur, and it’s a matter of figuring out what those unique behaviors are, and how and where to look for them. This is one place where data science is invaluable.
Most expert security analysts and threat hunters already know what behaviors they want to know more about, but don’t always know how to find them. They’re unable to build those queries within current rule-and-threshold tools, and tools built specifically for data scientists are too complicated, requiring a data science background to operate effectively. However, when cybersecurity and data science skills are combined, these problems become non-issues: your security team is able to find and make use of those early warning signs within your massive amounts of data.
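To make the contrast with rule-and-threshold tools concrete, here is a minimal Python sketch of a behavioral baseline. It is an illustration only, not a description of any particular product: the function name `flag_unusual_activity`, the sample accounts, and the cutoff of three standard deviations are all assumptions made for the example. Instead of applying one fixed limit to everyone, it compares each account's activity today against that account's own history.

```python
from statistics import mean, stdev


def flag_unusual_activity(history, today, min_z=3.0):
    """Return accounts whose count today is far above their own baseline.

    history: dict mapping account name -> list of past daily event counts
    today:   dict mapping account name -> today's event count
    min_z:   hypothetical cutoff, in standard deviations above the baseline
    """
    flagged = {}
    for account, counts in history.items():
        if len(counts) < 2:
            continue  # not enough history to estimate a spread
        baseline = mean(counts)
        spread = stdev(counts)
        observed = today.get(account, 0)
        if spread == 0:
            # Perfectly flat history: treat any increase as notable.
            if observed > baseline:
                flagged[account] = float("inf")
            continue
        z_score = (observed - baseline) / spread
        if z_score >= min_z:
            flagged[account] = z_score
    return flagged


# Made-up sample data: daily counts of file downloads per account.
history = {
    "alice": [2, 3, 2, 4, 3, 2, 3],
    "backup_svc": [480, 510, 495, 505, 490, 500, 515],
}
today = {"alice": 40, "backup_svc": 505}

flagged = flag_unusual_activity(history, today)
print(sorted(flagged))
```

With this sample data, a fixed rule such as "alert above 100 downloads" would fire on the backup service every day and never on alice, while the per-account baseline flags only alice, whose 40 downloads are far outside her usual range.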
The Costs of Converging Data Science Technologies and Cybersecurity
The cybersecurity industry is still in the early stages of implementing scalable and cost-effective ways to store vast quantities of data. Achieving a scalable solution requires some major architectural changes to the existing security stack, and major changes are never an easy pill for those holding the purse strings to swallow. The benefit of retaining this data, however, is that it contains evidence of impending attacks and malicious intent, and can be used to monitor for the earliest possible warning signs of a potential security breach.
Unfortunately, there is already a shortage of both skilled cybersecurity professionals and data scientists, and requiring a convergence of these two disciplines means that the pool of qualified people shrinks while demand grows. This leaves organizations with more open headcount than can realistically be filled, and competing with each other for people with both sets of skills.
Get Ahead of the Chaos: Promote Data Science and Cybersecurity Within Your Organization
Nowadays, it can be extremely beneficial for security operations within an organization to utilize data science for early detection and threat hunting, helping them get ahead of the curve with regard to attackers, the growing number of attack vectors, vulnerabilities, automated attacker tools, and alerts. The next step, after weighing the costs and benefits of data science within your cybersecurity operations, is to learn how to promote and integrate those skills within your current security operations team.
Here are a few tips to help you do this:
- For larger enterprises that already have a data science team focused on growing revenues, it may make sense to dedicate a few people from this group to cybersecurity. Even if your subset of data scientists is only dedicated to cybersecurity part-time, they will help your security team find early indicators of potential attacks within your data that weren’t being considered before.
- Buy professional services and outsource data science personnel. This may be less expensive than the first option and will allow you to scale the amount of data science services you need up or down as your team matures and big data usage evolves.
- Incentivize existing cybersecurity personnel to take data science courses. This option is a great investment in your employees, and one that will ensure data science models are laser-focused on positive outcomes for your security operations practice.
- Leverage security tools that incorporate data science modeling. This is probably the most cost-effective solution, and more technologies that provide this capability are popping up in the market. This is why machine learning and artificial intelligence have become more prevalent. These technologies incorporate data science within their code to handle the learning and data modeling needed to decipher more abstract data concepts, such as deriving conclusions from incomplete or massive datasets based on calculated probabilities, using both inductive and deductive reasoning.
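As a rough illustration of what "conclusions based on calculated probabilities" can mean in practice, the Python sketch below applies Bayes' rule to update the estimated probability that an account is compromised as weak signals arrive. It is not taken from any specific tool. The function name `posterior_malicious`, the two signals, and every probability are made-up assumptions, and the signals are assumed to be independent of each other, which real data often violates.

```python
def posterior_malicious(prior, p_signal_if_malicious, p_signal_if_benign):
    """Update the probability of compromise after observing one signal.

    prior:                 probability of compromise before seeing the signal
    p_signal_if_malicious: chance of seeing the signal on a compromised account
    p_signal_if_benign:    chance of seeing the signal on a healthy account
    """
    evidence = (p_signal_if_malicious * prior
                + p_signal_if_benign * (1.0 - prior))
    if evidence == 0:
        raise ValueError("signal is impossible under both hypotheses")
    return p_signal_if_malicious * prior / evidence


# Hypothetical numbers: 1% of accounts are assumed compromised to begin with.
prior = 0.01

# Signal 1 (assumed): login from a country the user has never logged in from.
after_first = posterior_malicious(prior, 0.90, 0.05)

# Signal 2 (assumed): large outbound transfer outside business hours.
after_second = posterior_malicious(after_first, 0.80, 0.10)

print(f"after first signal:  {after_first:.2f}")
print(f"after second signal: {after_second:.2f}")
```

Under these assumed numbers, neither signal is conclusive on its own: the first raises the estimate from 1% to roughly 15%, and the second raises it to roughly 59%. The point is that incomplete evidence still supports a ranked, quantified conclusion that an analyst can act on.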
Storing data in a data lake without clear objectives on how it will be used to benefit the business is a fool’s errand. Enterprises find themselves in similar situations with every new technology they implement. For example, next-generation firewalls (NGFW) give customers the ability to create user and application policies, but many customers don’t deploy them this way. They’re still using the old IP-based policies that they claimed weren’t working, which were their reason for buying an NGFW in the first place.
Converging big data and cybersecurity is only chaos when your data doesn’t have a clear purpose and isn’t understood. The data itself doesn’t provide answers that will help a cybersecurity team. It must be used effectively by people and/or tools with specific goals in mind. Data science is great at connecting seemingly unrelated data points and deriving meaning from data sets, and when used within the context of cybersecurity, it can find hidden threats that security operations weren’t previously able to search for.