The Four Steps of Cylance Artificial Intelligence and Machine Learning
Forget science fiction. Artificial intelligence (AI) is the stuff of the real world today. Your Google searches, your Netflix movie recommendations, and your Amazon-supplied selection of buying options "made just for you" all employ AI-based techniques.
Cybersecurity is no different. The application of AI pioneered by Cylance has transformed endpoint protection before our very eyes. Organizations are changing the way they operate, achieving greater security with fewer resources. Their staff were previously stretched thin, running around fixing problems and trying to catch up with never-ceasing tickets. Now they can focus on what’s important and proactively plan for better security against the most sophisticated attacks, and executives can focus on vision and execution rather than on preparing for the inevitable catastrophic breach.
But what exactly is AI? What’s the process and why does it matter in cybersecurity?
While attackers have innovated over the last few decades, the systems they seek to undermine are relatively unchanged. Attackers have connected via the Dark Web. They’ve mass produced. They’ve developed more efficient means of scaling attacks and attack frequency. Today, black hats, bad actors, malicious attackers (pick your term) don’t just employ new techniques; equally important, they perform malicious activity with greater volume, velocity, and variance.
But innovation exists on the defensive side of security too: the pioneering application of artificial intelligence to endpoint protection. The groundbreaking concept is both sophisticated and simple. And it has forever altered the way people and organizations defend themselves.
So, why now? Several forces converged. Life has gone digital. The Internet of Things, mobile devices, big data, and embedded systems have produced vast quantities of data that can be collected, collated, quantified, and parsed in myriad ways.
And while computing power has soared, its cost has plummeted. Digital life has reached the masses, and they’re all connected. Add cloud computing, open systems, and connectivity, and what you have is the digital life.
For Cylance, amassing enormous amounts of data and applying machine learning algorithms to it has yielded amazing results for its customers. From large enterprises to small businesses, Cylance AI recognizes how attackers attempt to exploit computers and predicts, prevents, and better protects against their attacks.
Collection
AI-based machine learning begins with collecting vast amounts of data from as many sources as possible. Security researchers gather hundreds of millions of files from data feeds, public and private databases, purchased and proprietary sources, surveys, and more. Recent innovations such as elastic cloud computing, ubiquitous mobile use, IoT, and embedded systems have transformed the capability and speed of data collection.
Unlike the so-called ‘machine learning’ offered by other vendors, Cylance has captured a large sample dataset and continues to grow it. And as its user community grows, the datasets continue to expand.
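To make the collection step concrete, here is a minimal sketch in Python, assuming samples arrive from several feeds as (bytes, label) pairs. The feed names, the `Sample` record, and the `collect` helper are hypothetical and only illustrate merging sources while keeping one copy of each unique file; they are not Cylance's pipeline.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    sha256: str   # content hash, used as the unique key
    source: str   # which feed supplied the file first
    label: str    # "benign" or "malicious"
    data: bytes


def collect(feeds):
    """Merge samples from several feeds, keeping one copy per unique file."""
    corpus = {}
    for source, items in feeds.items():
        for data, label in items:
            digest = hashlib.sha256(data).hexdigest()
            # The first feed to supply a file wins; later duplicates are skipped.
            corpus.setdefault(digest, Sample(digest, source, label, data))
    return list(corpus.values())


# Hypothetical feeds with one file that appears in both.
feeds = {
    "public_feed": [(b"MZ\x90\x00 hello", "benign"), (b"evil payload", "malicious")],
    "private_feed": [(b"evil payload", "malicious"), (b"another file", "benign")],
}
corpus = collect(feeds)
print(len(corpus), "unique samples")
```

Keying on a content hash means the same file seen in many feeds counts once, so the size of the dataset reflects distinct samples rather than raw download volume.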
Extraction
Cylance data scientists and engineers have honed the feature extraction process to a precise methodology, deconstructing a file into more than a million characteristics. Individual features are analyzed against millions of characteristics from other files known to be benign or malicious. The deconstructed file characteristics are then translated into vectors.
You can think of machine learning as a simple three-step process:
- To learn, it must observe
- To observe, it must know what to look for
- To know what to look for, it must have previously learned
Extraction yields the observations used to train the AI math models to autonomously learn and recognize patterns that determine if a new file is benign or malicious.
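As a rough illustration of turning a file into a vector without executing it, the sketch below derives a handful of static features from raw bytes. The specific features (size, byte entropy, printable ratio, a 16-bin byte histogram) and the name `extract_features` are assumptions chosen for brevity; the real process described above uses far more characteristics.

```python
import math
from collections import Counter
from typing import List

HIST_BINS = 16  # hypothetical choice: 16 buckets of 16 byte values each


def extract_features(data: bytes) -> List[float]:
    """Turn raw file bytes into a fixed-length numeric vector (static analysis only)."""
    size = len(data)
    if size == 0:
        return [0.0] * (3 + HIST_BINS)
    counts = Counter(data)
    # Shannon entropy in bits per byte; high values can suggest packed or encrypted content.
    entropy = sum(-(c / size) * math.log2(c / size) for c in counts.values())
    # Fraction of bytes that are printable ASCII.
    printable = sum(c for b, c in counts.items() if 32 <= b < 127) / size
    # Coarse, normalized histogram of byte values.
    hist = [0.0] * HIST_BINS
    for b, c in counts.items():
        hist[b * HIST_BINS // 256] += c / size
    return [float(size), entropy, printable] + hist


vec = extract_features(b"AAAA")
print(len(vec))  # every file maps to a vector of the same length
```

Because every file yields a vector of the same length, vectors from known benign and known malicious files can be compared directly and fed to a training step.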
Learning
After data is collected and features are extracted from each file, the millions of attributes are ready for the learning process. The attributes are converted to numerical values, in the form of vectors, which are used in model training. Dozens of models are created with measurements to ensure the accuracy of prediction, and the testing process itself helps identify ineffective models. Hundreds of millions of files are used to test and validate models. Tested, refined, and ready for action, the final models are loaded for use.
As the files and file attributes go through the learning process, the models develop an understanding of the intention of a sample, which can be used in a predictive fashion to determine the potential risk a new file may pose without having to execute the file itself.
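The train, test, and discard loop described above can be sketched as follows. This is a toy under stated assumptions: a nearest-centroid classifier stands in for the unspecified production models, the three features (entropy, printable ratio, and an uninformative noise value) and all data points are made up, and `MIN_ACCURACY` is a hypothetical cut-off.

```python
def centroid(rows):
    """Column-wise mean of a list of equal-length vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def train(samples, features):
    """Fit a nearest-centroid model using only the selected feature indices."""
    by_label = {}
    for vec, label in samples:
        by_label.setdefault(label, []).append([vec[i] for i in features])
    return {label: centroid(rows) for label, rows in by_label.items()}


def predict(model, features, vec):
    """Return the label whose centroid is closest to the vector."""
    point = [vec[i] for i in features]
    return min(model, key=lambda lbl: sum((p - q) ** 2 for p, q in zip(point, model[lbl])))


def accuracy(model, features, samples):
    hits = sum(predict(model, features, v) == y for v, y in samples)
    return hits / len(samples)


# Made-up vectors: [entropy, printable_ratio, noise]; label 1 = malicious, 0 = benign.
training = [([7.5, 0.2, 0.1], 1), ([7.8, 0.1, 0.9], 1), ([7.1, 0.3, 0.5], 1),
            ([4.0, 0.9, 0.2], 0), ([4.5, 0.8, 0.8], 0), ([3.8, 0.7, 0.4], 0)]
validation = [([7.3, 0.2, 0.3], 1), ([7.9, 0.15, 0.7], 1),
              ([4.2, 0.85, 0.6], 0), ([3.9, 0.75, 0.1], 0)]

MIN_ACCURACY = 0.9  # hypothetical threshold for keeping a model
candidates = [(0,), (1,), (2,), (0, 1), (0, 1, 2)]  # feature subsets to try

# Score every candidate on held-out data it never saw during training.
results = {f: accuracy(train(training, f), f, validation) for f in candidates}
accepted = {f: acc for f, acc in results.items() if acc >= MIN_ACCURACY}
best = max(accepted, key=accepted.get)
print(results)
print("selected feature subset:", best)
```

Scoring each candidate on held-out files is what exposes the ineffective ones: here the model built only on the noise feature falls below the threshold and is dropped before a final model is selected.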
Classification
The creation of accurate statistical models facilitates file content classification and clustering, and includes organizing files based on intent, such as benign versus malicious files. For instance, you can classify files prior to execution, using the math models to predict whether each one is intended to perform as a keylogger, potentially unwanted program (PUP), trojan, etc. In the end, machine learning relies on precise content categorization.
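A minimal sketch of pre-execution classification by intent, assuming a linear scorer per category followed by a softmax. The categories come from the text above, but the three input features and every weight in `WEIGHTS` are invented for illustration and do not reflect any real model.

```python
import math

# Hypothetical per-category weights over a 3-feature vector:
# [normalized_entropy, printable_ratio, uses_keyboard_hook]. Values are made up.
WEIGHTS = {
    "benign":    [-1.0,  2.0, -1.0],
    "keylogger": [ 0.2, -0.5,  3.0],
    "pup":       [ 0.1,  0.5,  0.0],
    "trojan":    [ 1.0, -1.5,  0.5],
}


def classify(vec):
    """Score a feature vector against each category and return (label, probabilities)."""
    scores = {label: sum(w * x for w, x in zip(ws, vec)) for label, ws in WEIGHTS.items()}
    # Softmax turns raw scores into probabilities; subtracting the max keeps exp() stable.
    top = max(scores.values())
    exps = {label: math.exp(s - top) for label, s in scores.items()}
    total = sum(exps.values())
    probs = {label: e / total for label, e in exps.items()}
    return max(probs, key=probs.get), probs


label, probs = classify([0.9, 0.1, 1.0])
print(label, round(probs[label], 3))
```

The file is never run: the prediction depends only on the extracted vector, and the probability attached to the winning category can serve as a confidence value.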
AI can uncover statistical correlations that human observation will miss entirely or misinterpret as harmless. Whereas human analysis can take minutes or longer, computational AI performs the analysis in milliseconds. The analysis is much more precise than human interpretation because of the extensive Cylance database of files and file characteristics analyzed.
In addition, each classification analysis results in a confidence score, which can be used to weigh decisions around a single file. An operator can quickly view a dashboard and decide, based on the score, whether to block, quarantine, monitor, or further analyze the file. In other words, AI works both autonomously and as part of a user-centric workflow. Users can benefit from AI and isolate outliers that need further analysis, allowing them to move from tediously responding to thousands of alerts to focusing on priority work that benefits from their skillset.
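The score-driven decision could look something like the sketch below. It assumes the confidence score is a probability between 0 and 1 that a file is malicious; the threshold values and the mapping of score bands to actions are hypothetical policy choices, not documented product behavior.

```python
from collections import Counter


def triage(malicious_confidence: float) -> str:
    """Map a confidence score to one of the four operator actions."""
    if not 0.0 <= malicious_confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    if malicious_confidence >= 0.95:   # hypothetical threshold
        return "block"
    if malicious_confidence >= 0.80:   # hypothetical threshold
        return "quarantine"
    if malicious_confidence >= 0.50:   # ambiguous band: hand to a human analyst
        return "analyze"
    return "monitor"


# Hypothetical batch of scored files.
scores = {"a.exe": 0.99, "b.dll": 0.85, "c.doc": 0.60, "d.txt": 0.10, "e.exe": 0.97}
actions = {name: triage(score) for name, score in scores.items()}
summary = Counter(actions.values())
print(actions)
print(summary)
```

With clear-cut scores handled automatically, only the files in the ambiguous band reach an analyst, which is the shift from working through every alert to concentrating on outliers.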
Why Does This Matter?
The foundational steps of AI-based machine learning each play a vital role in providing prediction, prevention, and protection that works pre-execution, before threats occur. Prevention is better than protection, and unless your prevention solution has a large enough data collection, powerful extraction tools, learning math models, and fine-tuned content or file organization, it’s falling short.