13 Deadly Sins of APT Incident Response – Part 3
When a threat actor breached your network, you rapidly implemented your incident response (IR) plan. And now — following long days, short nights, and non-existent weekends — your team is tired. And all of you are ready to complete the final phases of your remediation and response, to move forward.
How you carry out this phase of your IR plan is crucial to its overall success.
Incident response mistakes — especially when dealing with an Advanced Persistent Threat (APT) — can exacerbate the damage threat actors cause and give them extra time to carry out their attack.
In Part 1 of this series, we discussed “deadly sins” of incident response that network defenders make in the preparation stage while planning for an attack. In Part 2, we looked at incident response mistakes made during the heat of the moment with a known attack underway in your environment.
Now, in the final part of this series, we will discuss mistakes made during the remediation and recovery phases of an attack. These are the “deadly sins” you want to avoid, based on insights from the BlackBerry Incident Response Team, which collectively has more than 100 years of IR experience.
The Tenth Deadly Sin of Incident Response: Incomplete Remediation and Recovery
It is of the utmost importance to perform the containment, eradication, and remediation phases of response in one swift move. This includes removing all infected systems, resetting all affected accounts, and reinstating fresh builds of critical servers. Where possible, we recommend disconnecting the entire company network from the internet during this step to ensure no countermeasures can be taken by the threat actor during this phase.
Too many well-intentioned network defenders start taking systems offline, one by one, over an extended time period. This can alert attackers you are attempting some sort of intervention and they will actively take measures to “clean up” and stay dormant, often for months. This can also effectively block incident responders from identifying all attacker implants. Failing to perform these phases of incident response in one swift move is a mistake.
Service Accounts in Incident Response
A significant mistake we see during the remediation phase is leaving service accounts out of this phase, due to the difficulties some companies have with things like hardcoded application passwords. Threat actors know that these accounts are often ignored in a reset and can take advantage of this opportunity.
These service accounts typically impact business-critical and revenue-generating applications, so there is often considerable push-back on resetting these accounts, even after an attack. Failing to reset is risky because service accounts are one of the more valuable targets for threat actors, as they often have administrator privileges. And many service accounts are also considerably over-privileged (see our blog for advice on this topic).
Another key point is to remember your “back-up” administrator accounts. They are also highly privileged, but only used on rare occasions, so their password is seldom changed.
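A simple way to check that no service or back-up administrator account slipped through the reset is to compare each account's last password change against the start of remediation. The following is a minimal sketch, assuming you can export an account inventory with a last-password-change timestamp. The `Account` class, the `kind` labels, and the sample account names are hypothetical and used for illustration only.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Account:
    name: str
    kind: str  # hypothetical labels: "user", "service", "backup_admin"
    password_last_set: datetime


def accounts_missing_reset(accounts, remediation_start):
    """Return accounts whose password was last set before remediation began."""
    return [a for a in accounts if a.password_last_set < remediation_start]


# Illustrative data only; a real inventory would come from your directory export.
remediation_start = datetime(2024, 3, 1, 8, 0)
inventory = [
    Account("jdoe", "user", datetime(2024, 3, 1, 9, 30)),
    Account("svc-billing", "service", datetime(2019, 6, 12, 14, 0)),
    Account("breakglass-admin", "backup_admin", datetime(2021, 1, 4, 10, 0)),
]

for account in accounts_missing_reset(inventory, remediation_start):
    print(f"{account.name} ({account.kind}) still has a pre-incident password")
```

In this sample, the service account and the back-up administrator account are the ones flagged, which mirrors the pattern described above.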
Depending on the visibility of the attack, it can be good practice to reset the password of the built-in Key Distribution Center service account (KRBTGT) twice after a potential golden ticket generation. In a golden ticket attack, a threat actor forges Kerberos tickets that give them access across an organization’s Active Directory (AD) domain. Please seek advice before doing the reset, as a waiting period is required between the two resets to avoid issues with AD communication.
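The waiting period between the two resets can be reasoned about with a small calculation. This is a sketch under stated assumptions, not operational guidance: it assumes the second reset should wait until the first reset has replicated to every domain controller and until tickets issued before the first reset have had time to expire. The 10-hour value is the Active Directory default maximum user ticket lifetime and should be checked against your own domain policy. The function name and parameters are hypothetical.

```python
from datetime import datetime, timedelta

# Assumption: 10 hours is the AD default maximum lifetime for a user ticket.
# Confirm the value configured in your own domain before relying on it.
MAX_TICKET_LIFETIME = timedelta(hours=10)


def earliest_second_reset(first_reset, replication_confirmed_at,
                          ticket_lifetime=MAX_TICKET_LIFETIME):
    """Return the earliest time a second KRBTGT reset is reasonable here.

    Both conditions must hold: the first reset has replicated to all domain
    controllers, and tickets issued before it have had time to expire.
    """
    return max(replication_confirmed_at, first_reset + ticket_lifetime)


first = datetime(2024, 3, 1, 8, 0)
replicated = datetime(2024, 3, 1, 9, 15)
print(earliest_second_reset(first, replicated))
```

If replication takes longer than the ticket lifetime, the replication time becomes the limiting factor instead.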
Rebuild vs. Restore After a Cyberattack
In our experience, the Incident Response team typically does not perform hands-on recovery activities. Their focus is on creating a list of required steps, which are then planned and executed by the administrators responsible for the systems.
In too many cases, when these steps are carried out, we see businesses make crucial mistakes. These include restoring systems from an older snapshot and attempting to remove backdoors manually. The optimal solution is to rebuild systems from the ground up, which takes more time and effort, but also reduces risk and increases your confidence that the environment is once again secure.
In limited cases, you can safely restore from a backup. You must have very high confidence that you know exactly when the incident started, and you must have good visibility into all the activity the attacker carried out. If you have these things, you can restore from a point before the compromise with a reasonable amount of confidence. (Remember, you’ll also need to remediate any vulnerabilities to stop the compromise from reoccurring.)
However, this option is risky when any element of the infection is unknown, such as when the initial entry happened, or when some servers in a load-balanced group were not rebuilt. In such cases, the older snapshot could still be infected.
We recommend conferring with the IR team to decide on the optimal course of action for each affected server, to determine if a rebuild is necessary, or if a restoration from a backup is sufficient. You’ll need to determine conclusively that the backup was performed on a date before the incident.
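The restore-or-rebuild decision described above can be written down as a conservative rule. The sketch below is one possible encoding, not a substitute for the IR team's judgment. The function name, the `full_visibility` flag, and the seven-day safety margin are all assumptions made for illustration.

```python
from datetime import datetime, timedelta
from typing import Optional


def recovery_action(backup_taken: Optional[datetime],
                    earliest_compromise: Optional[datetime],
                    full_visibility: bool,
                    safety_margin: timedelta = timedelta(days=7)) -> str:
    """Return "restore" only when the backup clearly predates the incident."""
    # Any unknown (no backup date, no confirmed start of the incident, or
    # gaps in visibility into attacker activity) defaults to a rebuild.
    if backup_taken is None or earliest_compromise is None or not full_visibility:
        return "rebuild"
    # Hypothetical margin to allow for uncertainty in the incident timeline.
    if backup_taken <= earliest_compromise - safety_margin:
        return "restore"
    return "rebuild"


print(recovery_action(datetime(2024, 1, 1), datetime(2024, 2, 1), True))
print(recovery_action(datetime(2024, 1, 30), datetime(2024, 2, 1), True))
```

The rule deliberately falls back to a rebuild whenever something is uncertain, since that is the lower-risk option.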
When a system needs a rebuild, the best solution is to completely wipe the hard drive. This means overwriting the disk with random bytes or zeros, and then reinstalling from a trusted “golden image.”
However, please note that in some extreme cases reinstalling an image still will not resolve the issue. For example, one APT actor called “Equation Group” survived golden image reinstallation by reprogramming the hard drive firmware of more than a dozen popular HDD brands.
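Before reinstalling from a golden image, it may be worth confirming that the image file itself has not been altered. The sketch below assumes you recorded a SHA-256 digest of the image somewhere the attacker could not reach. The function names are hypothetical, and this check does nothing against firmware-level persistence such as the case described above.

```python
import hashlib
import hmac


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large images do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def image_is_trusted(path, expected_sha256):
    """Compare the image's digest with a known-good value recorded earlier."""
    expected = expected_sha256.strip().lower()
    return hmac.compare_digest(sha256_of_file(path), expected)
```

A call such as `image_is_trusted("golden.img", recorded_digest)` (both names are placeholders) returns `True` only when the digests match exactly.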
The Eleventh Deadly Sin of Incident Response: Believing It Is Over
When dealing with an advanced persistent threat (APT), responders often make the mistake of thinking they have the attack fully remediated before they actually do. In many ways, removing the attacker from the environment is just the start. Security staff must also think about ways to implement and improve continued monitoring, how to perform proactive threat hunting, and how to increase visibility of the security team within the environment — using experience gained from the incident.
Doing this will help protect you from the “P” in APT, which stands for “persistent” for a reason. Government-sponsored groups will continue attacking companies that have been identified as targets. These groups will often appear to go dormant for short periods of time, then suddenly find and retarget another weak spot in the victim’s defenses.
Advanced groups often start with the easiest options and move through their arsenal until they successfully breach the target network. When attackers infiltrate the environment, they will often seek out security information, including vulnerability scan results and reports, so they can use your weaknesses against you.
For example, during a recent ransomware case, we believe we accidentally disturbed an unrelated APT actor that was previously present in the environment. They quickly and easily gained access back into the environment using prior knowledge of known vulnerabilities in multiple externally accessible applications.
Evidence of any previous attack was difficult to come by, due to the old environment being ravaged by the ransomware attack. Fortunately, we were able to catch this quickly and the remediation plan already had steps in it that assisted us in preventing the easy re-compromise of the system via human-operated attacks. However, we had to switch gears quickly after identifying the presence of an APT, to rapidly design a remediation plan that could fully remove the access, and rebuild the machines (again), after remediating the application vulnerabilities.
It’s important to note that if APT actors are not able to successfully attack you directly, they may move to your partners and service providers to try different angles of attack. Your first APT attack is often just the initial indication that you are a target. Increased vigilance and improved defenses are required from this point forward, as it is not likely that this will be your last such attack. Persistence is often the key differentiator between a nation-state threat actor and financially motivated criminal actors.
The Twelfth Deadly Sin of Incident Response: Ignoring Lessons Learned
Assessing and acting upon “lessons learned” is time-consuming, and so it is often skipped by teams at all levels, including both technical and managerial staff. After a difficult and time-consuming incident, teams are exhausted. And many organizations want to return to “business as usual” as fast as possible, without dwelling on things they may have done wrong.
However, analyzing what could have gone better is a critical opportunity for improving an organization’s security posture. This analysis can help you understand what plans and job roles were missing, or where an organization should look for additional defensive measures and external vendor support.
Until “lessons learned” activities are prioritized and put into practice, the lesson itself is not actually learned. This is especially true because APT-driven security incidents are relatively rare outside of consulting organizations that spend a great deal of time working on them. However, the intelligence gained by learning from these attacks is far too valuable to waste.
Here are some resources that will help you perform “lessons learned” sessions:
The Thirteenth Deadly Sin of Incident Response: Failing to Share Indicators
The last “deadly sin” often committed during the recovery phase is failing to share your experience and valuable threat intelligence, after you remediate the cyberattack. Intelligence should never be shared externally until the incident itself is fully resolved.
In some cases, sharing indicators of compromise (IoCs) may be limited by vendor/client nondisclosure agreements (NDAs) or other legal constraints. This can make it difficult to share indicators externally. Consider to what extent you could anonymize indicators, or perhaps create YARA, Sigma, or Suricata rules. These three formats express detection rules for malicious files, log events collected in a SIEM, and network traffic, respectively. At the very least, curating and operationalizing the use of intelligence internally will make you more effective.
Curating involves doing quality assurance on the intelligence: making sure there is no client-related data, false positives, or personally identifiable information (PII) within the indicators of compromise, and that all the intelligence has the highest confidence level possible.
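The curation step can be partly automated. The following is a minimal sketch, assuming indicators are held as dictionaries with `type`, `value`, and `confidence` fields. The field names, the internal domain, and the confidence threshold are hypothetical, and a filter like this supports a human review rather than replacing it.

```python
import ipaddress

# Hypothetical values for illustration only.
INTERNAL_DOMAINS = ("corp.example.com",)
MIN_CONFIDENCE = 80


def is_shareable(indicator):
    """Reject low-confidence entries and anything pointing at internal assets."""
    value = indicator["value"].lower()
    if indicator["confidence"] < MIN_CONFIDENCE:
        return False
    # Drop hostnames and email addresses that belong to the victim's domains.
    for domain in INTERNAL_DOMAINS:
        if value == domain or value.endswith(("." + domain, "@" + domain)):
            return False
    if indicator["type"] == "ip":
        try:
            if ipaddress.ip_address(value).is_private:
                return False
        except ValueError:
            return False  # malformed address, treat as a false positive
    return True


def curate(indicators):
    """Return shareable indicators with case-insensitive duplicates removed."""
    seen = set()
    curated = []
    for indicator in indicators:
        key = (indicator["type"], indicator["value"].lower())
        if key in seen or not is_shareable(indicator):
            continue
        seen.add(key)
        curated.append(indicator)
    return curated


raw = [
    {"type": "domain", "value": "bad.example.net", "confidence": 90},
    {"type": "domain", "value": "BAD.example.net", "confidence": 90},
    {"type": "ip", "value": "10.1.2.3", "confidence": 95},
    {"type": "domain", "value": "mail.corp.example.com", "confidence": 95},
    {"type": "domain", "value": "maybe.example.org", "confidence": 40},
    {"type": "sha256", "value": "0" * 64, "confidence": 85},
]
for indicator in curate(raw):
    print(indicator["type"], indicator["value"])
```

In this sample, the duplicate, the private address, the internal hostname, and the low-confidence entry are filtered out before anything is shared.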
And operationalizing means using your intelligence set, in the best way possible, according to the tools and practices your organization has in place. This might be integrating indicator streams into a SIEM solution or writing specific rules for your EDR to catch APT behavior.
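As a rough illustration of operationalizing, the sketch below checks log lines against a curated indicator list. It is a stand-in for what a SIEM or EDR rule would do at scale, not a description of any specific product. The function name and sample data are hypothetical.

```python
def find_hits(log_lines, indicators):
    """Yield (line_number, indicator) for each indicator found in a log line."""
    needles = [indicator.lower() for indicator in indicators]
    for number, line in enumerate(log_lines, start=1):
        lowered = line.lower()
        for needle in needles:
            if needle in lowered:
                yield number, needle


# Illustrative data only.
logs = [
    "GET http://bad.example.net/payload",
    "GET http://intranet/home",
    "DNS query for BAD.example.net",
]
for line_number, indicator in find_hits(logs, ["bad.example.net"]):
    print(f"line {line_number}: matched {indicator}")
```

Plain substring matching can produce false positives, so a production rule would normally match on parsed fields such as destination host or file hash.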
Everything should depend on what the organizational need is, but in general, threat intelligence should improve existing tools and procedures.
Also, consider joining a trusted group of peers within your industry or sector — or create one if it does not already exist — to share your intelligence. Although we sometimes compete in business, we are all under attack by the same shared adversaries, and by acting together and learning from each other, we can all become safer.
The following platforms are open-source tools that help with sharing indicators:
Closing Notes on Incident Response Mistakes During Recovery
In terms of recovery, the common denominator among incidents is that most organizations believe an incident is over and done with once they can resume operations. From an IR standpoint, we’ve seen dozens of companies that did not respect the need for password resets or did not identify the full extent of a compromise prior to restoring systems from a trusted source. This failure often results in another compromise only a few months down the line.
Although ensuring continued operations is indeed the ultimate goal of the security team, we need to understand that post-incident activities are as important as dealing with the incident itself.
Cybersecurity incidents can quickly turn into a genuine game of cat and mouse, and it is best to be a very strategic cat!
Parting Thoughts: The Wages of “Sin”
The BlackBerry Incident Response Team shared its collective knowledge in this series. In all three parts, we often focused on the negative — and necessarily so. The IR mistakes any of us make can have serious consequences.
However, the opposite is also true. Avoid the “deadly sins” we discussed, and the odds of a successful response and remediation increase significantly. This will make it easier to sleep at night knowing that your efforts were a job well done.
Please share this series with colleagues and any network defenders you know. Let’s help as many organizations as possible benefit from these insights.
And check to make sure you have an incident response retainer in place. Doing this before an attack allows you to reach for help the moment you need it, to save time and help guide your next move.