EVALUATION RESULTS
Summary
On January 25th, AV-TEST released the results of their “Business Windows Client” Antivirus Test, conducted on Windows 10 only. This test marked the first publicly available test of CylancePROTECT™ by an “independent” third party. While CylancePROTECT has undergone numerous private tests by independent third parties, with vastly differing results, in this summary we will go into the details of AV-TEST’s methodology and scoring, provide clarity on the test’s results, and hopefully leave you with a better understanding of Cylance and the technology we use to provide the most effective endpoint security product on the market today.
Independently verified, third-party testing is paramount to objectively measuring the security industry’s efficacy. These tests provide answers to consumers and businesses who have neither the time nor the resources to answer such questions themselves, simply due to the sheer number of products and different layers of protection. While some may question the impartiality of these tests, we found the methodology to be open and easily auditable, and we do not believe AV-TEST intentionally favored one vendor over another. The companies evaluated in this test, however, can spend as much time and money as they like to ensure the highest possible score. In the end, the only pure way to ensure objectivity and independence is for you, the consumer or business, to acquire the samples, build the test methodology, and run the test yourself.
True “Next Generation Antivirus”
Before getting into the specifics of the AV-TEST results, we should first explain that Cylance is “Next Generation Antivirus,” and our technology is vastly different from traditional antivirus. Unlike traditional antivirus, we use a pure machine learning approach to determine maliciousness. This means that we predict whether a sample is good or bad, in real time, before it executes on the system – what we call “pre-execution.” This diverges from traditional antivirus methodologies, which involve writing signatures or relying on reputational or behavioral heuristics to determine if a sample is bad, often after it is already running.

While some antivirus products on the market today claim to use multiple machine learning techniques that seem similar to Cylance’s approach at first glance, these products use machine learning in the cloud to generate new hash signatures, or in conjunction with their post-execution heuristic engines. We feel that both of these approaches fail to gain the benefits of a truly predictive engine making decisions in real time, and we believe them to be vastly inferior to Cylance’s pre-execution analysis. Traditional antivirus vendors are layering yet another Band-Aid onto the aging technology behind cloud lookups and post-execution inspection of malware behavior on a host, while Cylance uses machine learning and artificial intelligence to predict whether a file is malicious BEFORE it executes.

The reason we took this path is simple: if you allow malware to run, you no longer control the integrity of your system. Just ask Saudi Aramco, RasGas, and Sony. In the time a traditional AV behavioral engine takes to decide on the goodness or badness of a sample (assuming it gets the decision right), the sample may have already tampered with the system, exfiltrated data, or even disabled the security solution outright. At best, you then have to go to the endpoint, grab the file, and figure out what damage was done before it was detected. At worst, your machine is now compromised and you are none the wiser. Allowing malware to execute prior to detection is the core industry problem that CylancePROTECT was built to solve. If you don’t let malware run in the first place, you don’t have to deal with the fallout of running malware on your endpoints.
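To make the distinction concrete, here is a minimal sketch of what pre-execution classification looks like in principle. Everything in it is illustrative: the features (file size, byte entropy, PE magic bytes) and the threshold rule are stand-ins of our choosing, not Cylance’s actual feature set or model.

```python
import math
from collections import Counter

def byte_entropy(data: bytes) -> float:
    """Shannon entropy of a byte stream; packed/encrypted files trend high."""
    if not data:
        return 0.0
    counts = Counter(data)
    n = len(data)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

def extract_features(path: str) -> dict:
    """A few static features, computed without ever executing the file."""
    with open(path, "rb") as f:
        data = f.read()
    return {
        "size": len(data),
        "entropy": byte_entropy(data),
        "has_mz_header": data[:2] == b"MZ",  # classic PE magic bytes
    }

def score(features: dict) -> str:
    """Stand-in for a trained model: returns a verdict before execution."""
    # A real engine feeds millions of features to a trained classifier;
    # this threshold rule only illustrates the pre-execution decision point.
    if features["has_mz_header"] and features["entropy"] > 7.2:
        return "block"   # deny execution up front
    return "allow"
```

The key point the sketch captures is when the decision is made: every input is derived from the file at rest, so the verdict exists before a single instruction of the sample runs.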
Now, let’s go back to the test. AV-TEST’s evaluation consists of three individual components: Protection, Performance, and Usability. Each component has multiple tests which, combined, represent a total possible score of 18 points: six per component. In this analysis we will discuss the individual results from the different component tests and try to lend some insight into the context of each. AV-TEST’s public evaluation actually comprises two separate tests run over two consecutive calendar months; for ease of comprehension, we are going to combine both monthly tests into one group of statistics, making the data easier to digest.
Protection
AV-TEST’s Protection testing used 14,798 individual pieces of malware, privately sourced and chosen by AV-TEST alone, broken into two data sets that are scored independently:
- Protection against zero-day malware attacks
- Detection of widespread and prevalent malware discovered in the last four weeks
AV-TEST calls the first set of malware “zero-day”. The test consisted of 140 samples, each pulled from an individual malicious URL chosen by AV-TEST alone to represent a malicious “zero-day”. From the test machine, the URL is visited, and the file is downloaded and executed. Here is where it gets fun. All traditional AVs have URL blacklists embedded, which attempt to prevent users from visiting the URLs and becoming compromised, without ever knowing whether what they are downloading is actually malicious. Traditional AV just blocks the assumed-malicious URL itself. While URL blacklisting can sometimes be an effective way of preventing a malicious file from being delivered to a system through a web browser, it is purely reactive, not predictive: someone, somewhere has to eyeball a URL to determine its nature. URL blacklisting is therefore a poor defense against enterprise malware, because you generally only know about hostile domains the vendor has already seen (as the short sketch after the list below illustrates). Additionally, there are countless ways that malware can get onto a machine and execute without a browser, including:
- Using existing admin tools like PSEXEC
- USB Flash Drives
- Over network shares
- Macros
- Websites that aren’t blacklisted yet
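To illustrate why blacklisting is reactive by construction, here is a minimal sketch, with a hypothetical feed entry standing in for a real blacklist:

```python
# Illustrative only: a URL blacklist can block just what has already been
# reported, so the first victim of any new URL is unprotected by design.
KNOWN_BAD = {"http://evil.example/dropper.exe"}  # hypothetical feed entry

def blacklist_allows(url: str) -> bool:
    """True if a blacklist-only defense would let the download through."""
    return url not in KNOWN_BAD

print(blacklist_allows("http://evil.example/dropper.exe"))  # False: seen before
print(blacklist_allows("http://brand-new.example/a.exe"))   # True: zero-day URL
```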
Cylance overcomes the above limitation through a unique architecture. By controlling the execution of every bit of code at the kernel level, the malware’s point of entry is rendered moot. We analyze every file as it attempts to run, using artificial intelligence to decide whether that file more closely resembles something malicious or something good, and either block execution or let the file run.
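Conceptually, execution control reduces to a single decision point. The hook name below is hypothetical (real products implement this with kernel-level process-creation callbacks, not Python), and it reuses the illustrative `extract_features` and `score` helpers from the earlier sketch:

```python
# Conceptual sketch of execution control: intercept every execution attempt,
# score the file statically, and allow or deny before any code runs.
# `on_process_create` is a hypothetical stand-in for a kernel callback.

def on_process_create(image_path: str) -> bool:
    """Called for every execution attempt; True allows it, False blocks it."""
    verdict = score(extract_features(image_path))  # from the earlier sketch
    return verdict == "allow"
```

Because the check sits at the execution boundary rather than at the browser, the same decision fires whether the file arrived via PsExec, a USB drive, a network share, a macro, or an unlisted website.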
Out of 140 “zero-day website attack” samples, all 140 were executed on a machine with CylancePROTECT installed. On execution, Cylance allowed three samples to run; of those three, we blocked the malicious elements of one and just straight-up missed the other two. This resulted in our average score of 97.9%. However, since traditional AVs have full URL blacklisting, they get full credit for stopping the zero-day malware simply by disallowing access to the URL. While perhaps a valid test of one vector for receiving malware, it leaves many vectors untested and ultimately unprotected. Doing all zero-day testing via malicious URLs doesn’t give the audience the ability to clearly see how accurately the different AV engines identify and block a piece of “unknown” malware. This is a well-known issue with all traditional signature-based AVs: they need time to get the samples in-house and build signatures, a weakness deliberately not present in CylancePROTECT. Because Cylance does not require signatures, the product has no need for daily updates. In fact, the mathematical model at the core of Cylance’s engine used during this test was created in September of 2015, and is still being used by CylancePROTECT customers at the writing of this blog.
The second test in the Protection category consisted of a much larger data set of 14,658 samples of “widespread and prevalent malware discovered in the last four weeks”, broken into ten different sub-categories: backdoors, bots, viruses, worms, downloaders, droppers, generic, password stealers, potentially unwanted applications, and rogue applications. Of these samples, Cylance provided 100% efficacy in all but two areas. The first was potentially unwanted applications. The name alone foreshadows what these files are: potentially unwanted, but not necessarily malicious. Of these 4,626 potentially unwanted files, Cylance failed to block 109 from execution, resulting in an effective rate of 97.64%. Unfortunately, it is often difficult to determine whether potentially unwanted programs are actually malicious. After all, if they were, these files would have fallen into the other nine categories of malware. The other category in which Cylance had misses was labeled ‘generic’. Of the 10,210 AV-TEST-sourced samples that made up the ‘generic’ group, Cylance missed 24 individual samples, giving us an effective score of 99.76% in that category alone.
If you remove potentially unwanted applications from the calculations (they tend to fall into a gray area of interpretation anyway), 10,032 samples remain, of which Cylance missed a grand total of 27 (and that count includes the sample Cylance prevented infection from but which was still counted against us). Cylance achieved a 99.7% protection rate with a mathematical model built months before any of these samples were created: no blacklists necessary, no signatures necessary, just honest identification of the components of bad files, and blocking them from running in the first place. For those of you who have run Cylance, or attended one of our demos, you know this figure is in line with every single third-party and internal test ever run on the product.
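For transparency, the percentages above are easy to reproduce from the reported sample counts:

```python
# Recomputing the protection rates from the sample counts cited above.
generic_rate = (10_210 - 24) / 10_210                     # 99.76% on 'generic'
pua_rate     = (4_626 - 109) / 4_626                      # 97.64% on PUAs
non_pua_rate = (14_658 - 4_626 - 27) / (14_658 - 4_626)   # 99.7% excluding PUAs
print(f"{generic_rate:.2%}, {pua_rate:.2%}, {non_pua_rate:.2%}")
# -> 99.76%, 97.64%, 99.73%
```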
Conspicuously absent from the tests was a ‘ransomware’ category. Ransomware is a huge issue plaguing today’s enterprise and home users, representing a multi-billion-dollar criminal market. Because traditional antivirus relies on blacklists and dynamic detections, ransomware frequently skirts past it, encrypting files and locking the system faster than the product can detect and stop it. I’m sure many readers have experienced this firsthand: the sample runs, and even if the antivirus throws up a detection, it’s too late. The machine is owned, and there is no fixing it short of restoring from backup or paying the ransom. Protection from new and emerging ransomware is by far one of Cylance’s greatest strengths: keeping the malware from running in the first place, not simply detecting it as it locks up a computer and renders it unusable.
Performance
The Performance category consists of an aggregated score across five different testing components:
- Downloading files
- Loading websites
- Installing applications
- Running applications and opening specific documents
- Copying files
The best possible score for each of these components is ‘1’, meaning the presence of antivirus on the system causes no significant impact on the action.
While we could go into detail on each individual test, we will focus on the areas where Cylance did not score a perfect ‘1’ in the performance testing.
In the first such category, copying files, Cylance scored a ‘2’, one step off perfect. The reason is simple: in the configuration in which the product was tested, we scored every file as it was copied to the system, slowing down the process. This is not a standard configuration in most of our customer environments. Cylance was built to operate in execution control mode, scoring files and blocking them as they execute, whether launched by a user or by the system. Traditional antivirus relies heavily on “on-access” scanning, i.e., scanning a file as it hits the disk. Because the 14,658 samples weren’t detonated on the system, merely copied onto it, we had to activate our file watcher functionality (our version of on-access scanning). Of course, this causes delays. It takes, on average, 16 milliseconds to analyze and score a file, but across large data sets those times add up. In a real-world environment, running such functionality is simply not necessary. A piece of malware sitting on disk poses no threat to the system until it is executed, much as a gun poses no threat until the firing pin strikes the primer.
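To put the 16-millisecond figure in perspective, here is a back-of-the-envelope estimate. It assumes strictly serial, one-at-a-time scoring, which is our simplifying assumption, not an AV-TEST measurement:

```python
# Rough cost of on-access scoring the whole Protection sample set, assuming
# strictly serial scans at the average per-file time quoted above.
samples = 14_658
ms_per_file = 16
total_seconds = samples * ms_per_file / 1000
print(f"{total_seconds:.0f} seconds")  # ~235 seconds across the full set
```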
The second category in which Cylance did not score a perfect ‘1’ was installing applications. In fact, this was the worst score of our entire test: a ‘4’, with the worst possible score being a ‘5’. This test was performed by installing 11 individual applications, recording the install time on a machine without Cylance, then comparing it to the install time on a machine with Cylance installed. Across the 11 applications, Cylance was responsible for increasing the aggregate installation time from 211.4 seconds to 349.73 seconds, an additional 138.33 seconds in total. Across 10 of the 11 installers, Cylance extended the installation time by an average of 7.267 seconds per installer; the exception was VLC-2.1.5-win32.exe. For some reason, in the AV-TEST environment Cylance extended that installation from 9.04 seconds to 74.7 seconds, an increase of 65.66 seconds. In our internal testing we could not replicate such an extreme delay with the same file; the maximum delay we observed in our lab was 33 seconds. We are reviewing this phenomenon internally to understand the underlying cause and determine whether it is expected behavior. All that said, installations are not generally daily activities, and delaying the install of VLC on an enterprise computer by 33 or even 65.66 seconds seems to be the biggest issue identified by this testing, and it was responsible for dropping our score significantly in the Performance category.
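The arithmetic behind those timings is straightforward to check:

```python
# Sanity-checking the installer timings reported above.
total_delay = 349.73 - 211.4                 # 138.33 s of added install time
vlc_delay   = 74.7 - 9.04                    # 65.66 s attributable to VLC alone
others_avg  = (total_delay - vlc_delay) / 10 # ~7.267 s per remaining installer
print(f"{total_delay:.2f}s total, {vlc_delay:.2f}s VLC, {others_avg:.3f}s avg")
```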
It’s interesting that actual run-time system performance (use of system resources: CPU, memory, or disk) isn’t measured, even though we all know it is the most complained-about byproduct of traditional antivirus. Operating as an I/O watcher across the entire endpoint consumes massive resources, forcing enterprises to constantly upgrade systems in a battle to keep business applications running as their endpoint agents continue to add bloat and overhead. Cylance solves this problem by strongly favoring execution prevention rather than layers of interception built up over years. We focus on the point where potential threats become things you have to worry about, instead of trying to watch every layer of what goes in and out of the computer in an attempt to identify malicious behavior.
Usability
Usability in the AV-TEST evaluation consists of four individual tests to measure false positive performance:
- False warnings or blockages of websites
- False detections of legitimate software
- False warnings concerning certain actions carried out whilst installing and using legitimate software
- False blockages of certain actions carried out whilst installing and using legitimate software
Cylance doesn’t attempt to block websites as a means of protecting endpoints, so we scored perfectly in the first category. We also had no issues with ‘false warnings concerning certain actions carried out whilst installing and using legitimate software’. We did experience a single false positive during the ‘false blockages of certain actions carried out whilst installing and using legitimate software’ test, but what really brought down the aggregate score were the 28 ‘false detections of legitimate software as malware during a system scan’.
We’re going to refer to ‘false detections of legitimate software as malware during a system scan’ simply as ‘false positives’ for simplicity’s sake. For this test, 1,301,224 AV-TEST-sourced files were scanned with Cylance to determine the product’s false positive rate. They were broken into two different sets: ‘critical’ files, and ‘less critical’ files from major download sites. In the first category, ‘critical’ files, Cylance scored zero false positives. These are the files whose misclassification would lead to issues like a blue screen, a failure to boot, or a broken critical business process. In the ‘less critical’ sample set, however, Cylance alerted on 28 files in total. These 28 files were responsible for dropping our score in this category from a perfect ‘6’ to the awarded ‘4’. Even so, at 28 false positives out of 1,301,224 files, our effective FP rate was 0.002%: not perfect, but in an age of massive malware-assisted data breaches, typically not a deal breaker either.
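That rate falls out directly from the counts:

```python
# The false-positive rate implied by the counts above.
fp_rate = 28 / 1_301_224
print(f"{fp_rate:.5%}")  # 0.00215%, roughly the 0.002% cited
```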
Of the 28 false positives, Cylance performed full reverse engineering on every sample to identify what caused the detection, and the results were not shocking. Almost all of the files were trialware or shareware, none created by a major software manufacturer (short of a display driver from a 2005 Gateway 7310 computer). In addition to being software that is essentially non-existent in the enterprise space, the files were all specifically packed or obfuscated, almost all of them with the same types of free tools malware authors use to obfuscate the internals of malware. They included files such as Deleaker 3.0.10 (released 2012), EZ CD Audio Converter Ultimate 3.1.4 (which we couldn’t locate outside of warez sites), and Fox 3GP Video Converter 1.0. While in-depth analysis proved the files were not in fact malicious, it did shed some light on why we alerted.
First off, our product works by abstracting millions of features from a static binary before it executes. We then feed those features to an artificial intelligence engine that has been trained on what good files look like and what bad files look like. Generally, our engine did not classify these files as outright ‘unsafe’. They were classified as ‘abnormal’, meaning they share numerous features with both bad and good files; enough that in sensitive environments the customer has the ability to block them along with ‘unsafe’ files, stopping their execution at that point in time while giving the administrator an easy route to waive the detection and let them run. During this test, we configured the product to block both unsafe and abnormal files, which tends to be the policy deployed by the majority of our enterprise customers. Therein lies the key: these are not enterprise files. They are tools used by individuals to perform a specific task, written by a sole developer, and as such they have not been added to the corpus of data used to train our models. The set of features expressed in these files is extremely abnormal: multiple packers, anti-debugging features, and many other elements commonly seen in malware. If we were wearing our Enterprise Administrator hats, we would never allow one of them to run in our enterprise without express permission.
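To make the policy concrete, here is a minimal sketch of how tiered verdicts and administrator waivers interact. The names and structure are ours for illustration, not the product’s actual API:

```python
# Illustrative policy logic for tiered verdicts: 'unsafe' is always blocked;
# 'abnormal' is blocked only when policy says so; an admin can waive either.
from enum import Enum

class Verdict(Enum):
    SAFE = "safe"
    ABNORMAL = "abnormal"   # shares features with both good and bad files
    UNSAFE = "unsafe"

def should_block(verdict: Verdict, block_abnormal: bool, waived: bool) -> bool:
    """Returns True if execution should be denied under the current policy."""
    if waived:                      # admin explicitly approved this file
        return False
    if verdict is Verdict.UNSAFE:
        return True
    return verdict is Verdict.ABNORMAL and block_abnormal

# The tested configuration blocked both tiers:
assert should_block(Verdict.ABNORMAL, block_abnormal=True, waived=False)
```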
In summary, while the AV-TEST results show Cylance is not perfect by AV-TEST’s methodology, we have never said we were perfect. In fact, we’ve gone out of our way to say we are NOT perfect, and never will be; but we will forever pursue the impossible goal of perfection. There is no such thing as a 100% security product. That’s just not how security works. We don’t believe this test is necessarily representative of our effectiveness in our customers’ enterprise environments, nor does it address the problems they face on a daily basis. In the end, don’t believe us, and don’t believe our competitors: just test us in your environment and compare. We’ve found countless malware samples and attacks that our customers’ previous vendors completely missed, and every day we prevent some of the nastiest attacks in the world, protecting our customers better than any other company in the industry. Test us yourself.