Carbanak+FIN7: MITRE EDR evaluations round 3

It has been more than eight months since MITRE Engenuity purple teamed Carbanak and FIN7 in the third round of EDR evaluations. Vendors rushed to highlight their exceptional prowess, best-in-class top-level leadership, and, before I forget, industry-leading fidelity and indisputable performance something. In a separate blog post, MITRE again emphasizes that it doesn't proclaim winners or rank EDR solutions, although it admits it would if it were easy to declare one! Unlike in previous editions, MITRE includes rudimentary data analytics so you can draw your own conclusions, which saves you the trouble of preparing their JSON files for use in Splunk, Tableau, or Power BI!

Visibility and detection include Monday mornings

MITRE Engenuity includes telemetry in its definition of detection and visibility, and telemetry is another word for logs. This definition is unhelpful in production environments because, for example, 1,500 endpoints will generate 1 terabyte of telemetry in two months using Microsoft Defender for Endpoint. Although traces of adversary actions are present in this sea of noise, so are your sleepy colleagues on Monday mornings, along with the network and process events that come from just running Teams and Chrome. "100% visibility" and "highest number of detections" suddenly sound like a cybersecurity noise generator to avoid rather than something that would serve SOC analysts.
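As a back-of-the-envelope check, the figures above can be converted into a per-endpoint daily volume. This is a minimal sketch that assumes a decimal terabyte (10^12 bytes) and 60 days for two months; the function name is purely illustrative.

```python
def telemetry_per_endpoint_per_day(total_bytes, endpoints, days):
    """Average telemetry volume per endpoint per day, in bytes."""
    return total_bytes / endpoints / days


TERABYTE = 10**12  # assumption: decimal terabyte, not tebibyte

# Figures from the text: 1 TB, 1,500 endpoints, two months (assumed 60 days).
per_day = telemetry_per_endpoint_per_day(TERABYTE, endpoints=1500, days=60)
print(f"{per_day / 10**6:.1f} MB per endpoint per day")  # prints 11.1 MB per endpoint per day
```

Roughly 11 MB of logs per endpoint per day is what an analyst would have to wade through if telemetry alone counted as detection.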

A model that emphasizes data loss across the multiple steps between adversary and analyst may be better suited to disambiguating visibility and detection in practice. Such a model already exists: DeTTECT, published three years ago by Ruben Bouman and Marcus Bakker at Rabobank CDC, defines telemetry as a prerequisite for detection, not as a separate entity. The DeTTECT model helps troubleshoot detection pipelines: we could have raised an alert here, but we didn't. In that case there is no problem with visibility (the logs are there) but with detection (what we do with the logs).
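The troubleshooting question (was it a visibility problem or a detection problem?) can be sketched in a few lines. This is only an illustration of the reasoning, not DeTTECT's actual tooling or API; the `Gap` enum and `classify_gap` function are hypothetical names.

```python
from enum import Enum


class Gap(Enum):
    VISIBILITY = "visibility gap: no telemetry was collected"
    DETECTION = "detection gap: telemetry exists, but no alert was raised"
    NONE = "no gap: telemetry was collected and an alert was raised"


def classify_gap(has_telemetry: bool, raised_alert: bool) -> Gap:
    """Classify one adversary step, treating telemetry as a prerequisite for detection."""
    if not has_telemetry:
        # Without logs there is nothing to build a detection on.
        return Gap.VISIBILITY
    if not raised_alert:
        # The logs are there, but nothing was done with them.
        return Gap.DETECTION
    return Gap.NONE


print(classify_gap(has_telemetry=True, raised_alert=False).value)
```

Checking telemetry first mirrors the idea that detection depends on visibility, so a missing alert is only blamed on detection logic once the logs are known to exist.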

Caring about SOC analysts

One vendor stubbornly refuses to play the "more noise is better" game. Instead, they focus on decreasing the number of alerts and presenting prioritized ones for analyst review. The reason becomes clear when you look at the analytic alerts from the Linux part of the first scenario. Here we see the adversary executing nefarious elite hacker actions such as "ssh", "cat", and "ls", which your Linux colleagues probably also use daily. That will raise a lot of false positives. Guess who has to look at those? Guess what is going to happen with them?

End-to-end information loss

Dealing with false positives is already 99% of a SOC's core business, and supplying more alerts means either a bigger SOC or less time spent per alert. The crudest way to cope is to ignore low-priority alerts, which turns them into the new telemetry: helpful, but in need of further processing. If the EDR also classifies less-pedestrian alerts as low-priority, this may lead to significant information loss. More alerts are a local optimum that makes the EDR vendor look good. The evaluation format rewards less filtering, because filtering makes bar charts and press releases look less impressive than those of competitors.
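The trade-off between SOC size and time per alert is simple arithmetic. The sketch below uses made-up numbers (4 analysts, 480-minute shifts, 192 and 384 alerts per shift) purely for illustration; none of them come from the evaluations.

```python
import math


def minutes_per_alert(analysts: int, shift_minutes: int, alerts: int) -> float:
    """Average time available per alert if every alert gets looked at."""
    return analysts * shift_minutes / alerts


def analysts_needed(alerts: int, minutes_each: int, shift_minutes: int) -> int:
    """Analysts required to give every alert a fixed amount of attention."""
    return math.ceil(alerts * minutes_each / shift_minutes)


# Hypothetical SOC: 4 analysts working 480-minute shifts.
print(minutes_per_alert(4, 480, alerts=192))  # 10.0 minutes per alert
print(minutes_per_alert(4, 480, alerts=384))  # 5.0 minutes once alerts double
print(analysts_needed(384, minutes_each=10, shift_minutes=480))  # 8 analysts to keep 10 minutes
```

Doubling the alert volume either halves the attention each alert gets or doubles the headcount; ignoring low-priority alerts is the third option, and the one that loses information.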

Unfortunately, we don't know how many alerts these EDR solutions would trigger in a non-lab environment. The evaluations put a number in two out of the four quadrants of the matrix below. We can guesstimate the number of actual negatives that get alerted (false positives) by looking at the individual alerts. We hope this number approaches 0, but unfortunately, Living off the Land (using tools already present on systems) reduces the distance between actual positives and actual negatives. In turn, this raises the number of actual positives getting ignored and actual negatives getting alerted.

| | Alerted | Ignored |
| --- | --- | --- |
| Actual Positive | We know the number now, thanks! | Same! |
| Actual Negative | ? | ? |
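To see why the missing row matters, consider which metrics each row allows. The sketch below is illustrative only: the counts are hypothetical placeholders, not figures from the evaluations, and the function names are my own.

```python
from typing import Optional


def recall(alerted_positives: int, ignored_positives: int) -> float:
    """Share of adversary actions that were alerted on (top row only)."""
    return alerted_positives / (alerted_positives + ignored_positives)


def precision(alerted_positives: int, alerted_negatives: Optional[int]) -> Optional[float]:
    """Share of alerts that were real; needs the unknown bottom-left quadrant."""
    if alerted_negatives is None:
        return None  # cannot be computed from lab data alone
    return alerted_positives / (alerted_positives + alerted_negatives)


# Hypothetical counts for illustration only.
print(recall(alerted_positives=150, ignored_positives=24))
print(precision(alerted_positives=150, alerted_negatives=None))  # None: quadrant unknown
print(precision(alerted_positives=150, alerted_negatives=1350))  # 0.1 if benign noise dominates
```

The top row yields something like recall, which is what vendor bar charts show. Precision, the number a SOC analyst feels every day, depends on the quadrant the evaluations leave blank.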

Below is a chart of the number of alerts per step, sorted by the number of analytic detections. Do we need 28 alerts to establish that the threat actor is expanding access in step 5? Which actions require alerting, and which don't? Do we care about the missed steps in red? Looking deeper into the actual data and screenshots on the MITRE Engenuity site will tell you more about possible false positives than any chart.

**Figure 1.** Analytic detections per evaluation step in the MITRE Engenuity Carbanak+FIN7 EDR evaluation. Darker blue cells indicate more alerts for a given vendor and step; red cells indicate missed or zero analytic detections.

One vendor unlike the others

The evaluations included a SIEM vendor before, but that ended without any press release on the part of the vendor, which suggests it didn't go well. ArcSight, a SIEM vendor and new participant in round 3, decided to accept the challenge and invest in pizza and overtime for four months. They used Sysmon, Zeek, auditd, and other logs to generate telemetry for visibility and created rules in ArcSight for detection. The nice thing about this solution is that it promotes knowledge building and integrates visibility and detection into the SOC, making it less of a call center.