Kuhn's book The Structure of Scientific Revolutions outlined an episodic model in which periods of “normal science” were interrupted by periods of “revolutionary science”. Kuhn challenges us to consider new paradigms and to change the rules of the game, our standards and our best practices.
Artificial intelligence (AI) and machine learning (ML) deliver this new paradigm, putting the science back into security.
It’s quite clear that relying on endpoint protection solutions that only detect malware AFTER it has executed and caused damage is no longer a viable security plan. These legacy technologies are inherently reactive, and too dependent on the outdated and high-maintenance practices of manual sample analysis and signature creation.
Legacy antivirus (AV) vendors hoped post-execution analysis and solutions would give them an edge against the malware onslaught, only to find it made systems less nimble, more expensive to maintain, and highly prone to attacks by malware variants and zero-days.
Cylance: AI + ML = Pre-Execution Protection
Cylance has overcome these challenges by building the first and only AI- and ML-based pre-execution malware detection engine.
The goal of the pre-execution methodology is to analyze suspect code and determine whether a file is good or bad based purely on the information contained within the file itself, and then to repeat that at a massive, sustainable scale.
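To make "information contained within the file itself" concrete, here is a minimal sketch of static feature extraction. It is an illustration only, not Cylance's engine: the function name static_features and the four features it returns are assumptions chosen for brevity, and a production system would look at far more attributes.

```python
import math
from collections import Counter


def static_features(data: bytes) -> dict:
    """Compute a few illustrative features from raw bytes without running them.

    The feature set here is hypothetical and deliberately tiny.
    """
    total = len(data)
    counts = Counter(data)

    # Shannon entropy in bits per byte: packed or encrypted content tends
    # to score close to 8, plain text much lower.
    entropy = 0.0
    if total:
        entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())

    # Share of bytes in the printable ASCII range.
    printable = sum(1 for b in data if 32 <= b < 127)

    return {
        "size": total,
        "entropy": entropy,
        "printable_ratio": printable / total if total else 0.0,
        # Windows executables begin with the two bytes "MZ".
        "starts_with_mz": data[:2] == b"MZ",
    }


if __name__ == "__main__":
    print(static_features(b"MZ" + bytes(range(256))))
```

Nothing in this sketch executes the file; every value comes from reading its bytes, which is the property the pre-execution approach depends on.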
But how, exactly, do we do that? The answer lies within the awesome power of AI.
Over the past few decades, billions of files have been created worldwide—both malicious and non-malicious. As the file creation process has evolved, patterns have emerged out of that chaos that dictate exactly how specific types of files are constructed. There is a variability in these patterns as well as certain anomalies, but as a whole, pattern consistencies become more visible as available sample sizes increase.
The challenge lies in identifying these patterns, understanding how they manifest across millions of attributes, and recognizing what consistent patterns tell us about the nature of potentially malicious files.
Given the magnitude of the data samples involved, the tendency people have towards bias, and the massive number of computations required to process that information, humans are simply incapable of leveraging all this data to determine whether a file is malicious or not. Yet legacy AV vendors still rely heavily on human decision making in their processes.
But humans have neither the cognitive power nor the mental endurance to keep up with the overwhelming volume and sophistication of modern threats. Promising advances have been made in behavioral and vulnerability analysis, as well as identifying indicators of compromise, but these advances all suffer from the same fatal flaw: they are all based on manual human analysis.
Leveraging the Power of Machine Learning
Machines, however, do not suffer from this same bias. Machine learning and data mining go hand in hand. Machine learning focuses on prediction based on properties learned from earlier data. This is how Cylance differentiates malicious files from safe and legitimate ones. Data mining focuses on the discovery of previously unknown properties of data, so those properties can be used in future machine learning decisions.
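As a rough illustration of "prediction based on properties learned from earlier data", the sketch below trains a nearest-centroid classifier on a handful of labeled feature vectors and then labels an unseen one. Everything in it is an assumption made for the example: the two features (entropy, printable ratio), the toy numbers, and the choice of algorithm. It does not describe the models Cylance actually uses.

```python
from statistics import fmean


def train_centroids(samples):
    """Learn one average feature vector (centroid) per label.

    samples: list of (feature_vector, label) pairs.
    """
    by_label = {}
    for vec, label in samples:
        by_label.setdefault(label, []).append(vec)
    # zip(*vecs) groups the values column by column so each can be averaged.
    return {
        label: [fmean(col) for col in zip(*vecs)]
        for label, vecs in by_label.items()
    }


def predict(centroids, vec):
    """Return the label whose centroid is closest to vec (squared Euclidean)."""
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(centroids, key=lambda label: dist2(centroids[label], vec))


# Made-up training data: (entropy, printable_ratio) -> label.
TOY_SAMPLES = [
    ((4.5, 0.60), "benign"),
    ((5.0, 0.55), "benign"),
    ((7.6, 0.10), "malicious"),
    ((7.9, 0.05), "malicious"),
]

if __name__ == "__main__":
    model = train_centroids(TOY_SAMPLES)
    print(predict(model, (7.7, 0.08)))
```

The point is the shape of the workflow rather than the algorithm: properties are learned once from earlier samples, then applied to files the model has never seen.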
The ability to do this across a huge number of samples is important, because modern polymorphic malware creation is largely automated. Today, it requires very little effort for attackers to mutate a piece of malware, enabling it to elude legacy signature-based AV solutions. These ‘traditional’ solutions were adequate for protection when malware creation was a manual process, but using the same tools and techniques today is like trying to catch the water from a failing dam by hiring a thousand people to form a bucket chain.
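The fragility of signatures is easy to demonstrate in the simplest case, an exact-hash blocklist. The sketch below is a deliberately reduced model (real signature engines also use byte patterns and heuristics), and the payload bytes and helper names are placeholders invented for the example.

```python
import hashlib


def signature(data: bytes) -> str:
    """Exact-match signature: the SHA-256 digest of the whole file."""
    return hashlib.sha256(data).hexdigest()


def signature_match(data: bytes, blocklist: set) -> bool:
    """True if the file's digest is already on the blocklist."""
    return signature(data) in blocklist


# Placeholder bytes standing in for a sample an analyst has already seen.
ORIGINAL = b"EXAMPLE-PAYLOAD-v1"
KNOWN_BAD = {signature(ORIGINAL)}

# An automated mutation can be as small as one appended padding byte.
VARIANT = ORIGINAL + b"\x00"

if __name__ == "__main__":
    print(signature_match(ORIGINAL, KNOWN_BAD))
    print(signature_match(VARIANT, KNOWN_BAD))
```

The original sample is caught, while the one-byte variant produces a completely different digest and slips past until someone analyzes it and adds a new signature.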
To stop malware before it ever gets a chance to execute, applied AI uses complex algorithms that can predict whether a program is malicious based on millions of feature sets. This approach to prevention has proven extremely effective at stopping malware before it infects a system, without the need for the initial victims that legacy antivirus depends on.
Simply stated, the AI methodology has the advantage of being able to identify new malware long before it might be identified by antiquated, post-execution, signature-based approaches.
The Death of Detect and Respond Solutions
Identifying malicious applications before they get a chance to execute helps limit security management costs and system performance overhead. It can also reduce the number of samples that need post-execution monitoring and the chance that a malicious sample will succeed in making it past that final layer of defense.
Traditional “detect and respond” solutions are not only ineffective, they also generate too much data by blindly logging everything, making them better suited for incident response AFTER a breach has occurred, rather than actually preventing the breach in the first place.
To keep up with modern attackers, security technologies need to evolve alongside them – and without relying on human intervention. That’s where AI and ML have the advantage.
Since we can now distinguish good files from bad based on objective, mathematically represented risk factors, we can also teach a machine to autonomously take the appropriate actions against these files in real time, essentially automating the prevention elements of our security programs.
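Once a file's risk is expressed as a number, the response can be automated with a simple policy. The sketch below assumes a score between 0 and 1; the function name, the thresholds, and the three actions are hypothetical values picked for illustration, not settings from any product.

```python
def choose_action(score: float, block_at: float = 0.9, quarantine_at: float = 0.6) -> str:
    """Map a risk score in [0, 1] to an automated response.

    The thresholds are illustrative defaults, not recommended values.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0 and 1")
    if score >= block_at:
        return "block"        # prevent execution outright
    if score >= quarantine_at:
        return "quarantine"   # isolate the file for later review
    return "allow"


if __name__ == "__main__":
    for s in (0.95, 0.7, 0.1):
        print(s, choose_action(s))
```

Because the decision is a pure function of the score, it can run on every file in real time without waiting for a person to review it.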
For years, industries such as healthcare, insurance, and high-frequency trading have applied the principles of AI and ML to analyze enormous quantities of business data and drive autonomous decision making.
Similarly, at the core of the AI-based security methodology is a massively scalable data-processing ‘brain’ capable of applying highly tuned algorithmic models to enormous amounts of data in near real time. An AI or ML approach to computer security will fundamentally change the way we understand and control the risks posed by malicious code.
Much as Kuhn’s model predicted, the security paradigm is shifting from the outmoded strategies of “normal” practice to one of security as a science, and AI is the primary agent of that “revolutionary” change.