Authors: Pavol Sokol, Ľubomír Antoni, Ondrej Krídlo, Eva Marková, Kristína Kováčová, Stanislav Krajči
Abstract
The increasing number of cyberattacks places a rising demand on security analysts and security incident response teams. In this paper, we focus on the connections and relationships between items of digital evidence, which can help resolve cybersecurity incidents. We apply Formal Concept Analysis, a set of data analysis methods based on lattice theory. This biclustering method allows us to explore meaningful groupings of digital objects (referred to as objects) with respect to their shared attributes. Moreover, we can visualize the concept lattice and discuss its hierarchy with experts in the field.
In our paper, we describe the formal context based on digital evidence collected from the NTFS file system. We present several concept lattices built on subsets of these data and provide the association rules obtained from our tasks.
Introduction
The increasing number of cyberattacks places a growing demand on security analysts and security incident response teams. Analysts are easily overwhelmed by the many alerts from monitoring devices, so it is essential for them to quickly gain an overview of what is happening and obtain all the relevant information. They must then make the right decisions about their next steps to minimize the loss of sensitive and confidential information and to prevent repeated attacks.
Security incident handling is an essential reactive activity of organizations in information security and cybersecurity. Its goal is to identify the source of the incident, understand the attacker’s procedure, perform impact analysis, and design security measures. The incident must be resolved quickly and correctly. For this reason, a more advanced analysis is used, namely digital forensic analysis, which involves investigating all devices that can store digital data. In the digital investigation of a security incident, the analyst either confirms or refutes the forensic hypothesis.
The digital investigation aims to obtain relevant information available in the system from metadata and timelines to identify items with significant forensic value. Metadata such as file size, file path, and file name are usually used to filter and index files. Closely related to metadata is the creation and analysis of timelines. A timeline is an approach by which sets of records can be represented in a sequential chronological arrangement [1]. Timeline analysis is one of the leading forensic capabilities to investigate a cyberattack [2]. It allows security teams to more quickly identify digital evidence or events with significant forensic value and gain a global view of events that occurred before, during, and after the incident [1].
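As a minimal illustration of a timeline as a chronological arrangement of records, the following sketch sorts a handful of records by timestamp and keeps those close to an assumed incident time. The records, field names, paths, and the helper functions `build_timeline` and `around` are hypothetical and chosen only for this example; they do not come from the dataset used in this paper.

```python
from datetime import datetime, timedelta

# Hypothetical timeline records; paths, events, and times are invented for illustration.
records = [
    {"time": datetime(2021, 3, 1, 10, 15), "path": "/docs/report.docx", "event": "modified"},
    {"time": datetime(2021, 3, 1, 9, 58), "path": "/tmp/tool.exe", "event": "created"},
    {"time": datetime(2021, 2, 27, 14, 0), "path": "/docs/notes.txt", "event": "accessed"},
    {"time": datetime(2021, 3, 1, 10, 2), "path": "/tmp/tool.exe", "event": "accessed"},
]


def build_timeline(records):
    """Return the records in chronological order (earliest first)."""
    return sorted(records, key=lambda r: r["time"])


def around(timeline, moment, margin):
    """Keep only the records whose timestamp lies within `margin` of `moment`."""
    return [r for r in timeline if abs(r["time"] - moment) <= margin]


timeline = build_timeline(records)
incident = datetime(2021, 3, 1, 10, 0)  # assumed incident time, for illustration only
nearby = around(timeline, incident, timedelta(minutes=30))

for r in nearby:
    print(r["time"].isoformat(), r["event"], r["path"])
```

Restricting the view to a window around the incident is one simple way to obtain the events before, during, and after it; real forensic timelines contain far more records and record types.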
A forensic analyst can find unusual but event-related digital evidence using these timelines. A data pattern that does not closely resemble standard data behavior is called an anomaly [3]. Anomaly search is a standard part of forensic investigation. At present, manual searches prevail [4], or keyword searches are performed based on strong probabilities of occurrence [2]. These activities are time-consuming. For this reason, a more convenient approach with better detection efficiency is required [5].
This paper focuses on the effective search for important digital evidence and for the connections and relationships between items of evidence, which can help resolve cybersecurity incidents. To summarize the problems outlined above, we pose the following research questions:
- What is the relationship between attributes of digital evidence in a forensic timeline?
- How can anomalous records in a forensic timeline be identified?
To answer these questions, we will apply Formal Concept Analysis. This method of data analysis, based on lattice theory, allows us to explore the meaningful groupings of digital objects (referred to as objects) with respect to common attributes, and it provides visualization capabilities [6, 7].
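To make the notion of grouping objects by common attributes concrete, the sketch below enumerates all formal concepts of a very small formal context by brute force. The file names and attributes are hypothetical and are not the attributes used later in this paper, and the exhaustive enumeration is only suitable for toy examples, not for real forensic datasets.

```python
from itertools import combinations

# Hypothetical toy formal context: each object (file) maps to its set of attributes.
context = {
    "a.exe": {"executable", "created_in_window"},
    "b.dll": {"executable", "modified_in_window"},
    "c.txt": {"created_in_window"},
    "d.exe": {"executable", "created_in_window", "modified_in_window"},
}


def extent(attrs):
    """All objects that have every attribute in `attrs`."""
    return frozenset(o for o, a in context.items() if attrs <= a)


def intent(objs):
    """All attributes shared by every object in `objs` (all attributes if `objs` is empty)."""
    shared = set().union(*context.values())
    for o in objs:
        shared &= context[o]
    return frozenset(shared)


def formal_concepts():
    """Brute force: close every subset of objects into an (extent, intent) pair."""
    found = set()
    objects = list(context)
    for size in range(len(objects) + 1):
        for subset in combinations(objects, size):
            b = intent(subset)   # attributes common to the subset
            a = extent(b)        # all objects sharing those attributes
            found.add((a, b))
    return found


concepts = formal_concepts()
# Print the concepts from the largest extent to the smallest.
for a, b in sorted(concepts, key=lambda c: (-len(c[0]), sorted(c[1]))):
    print(sorted(a), sorted(b))
```

Each printed pair is a formal concept: a maximal set of objects together with the maximal set of attributes they all share. Ordering these pairs by inclusion of their extents yields the concept lattice.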
The paper is structured as follows. After this introduction, Section 2 presents related work. Section 3 briefly describes the use case, outlines the preprocessing of the dataset, and describes the attributes. Section 4 presents the concept lattice of digital evidence. Section 5 discusses association rules of digital evidence. Finally, Section 6 concludes the paper and discusses the challenges for future research. Identifying possible attributes and finding relationships between them is an important research question in this area. An equally important aspect is identifying the digital evidence relevant to the case. For this purpose, we analyzed the digital evidence using Formal Concept Analysis.