Authors: Patrik Pekarčík, Andrej Gajdoš, Pavol Sokol
Abstract
As the number of devices and users connected to the Internet grows, so does the number of security threats and incidents, and reactive measures alone are no longer sufficient. For this reason, the emphasis is shifting to preventive measures. Forecasting an increase or decrease in the number of security attacks or incidents in an organization's network can substantially support prevention. In this paper, we focus on network security situation forecasting based on time series analysis. The main objective is to determine the effect of seasonality and of the sliding window on network security situation forecasting, and to identify criteria for choosing suitable time series. Our evaluation shows that seasonality does not play an important role in the time series analysis. Moreover, time series analysis methods that use sliding windows have comparable forecasting results. The combination of ARIMA and exponential smoothing (ETS) methods, which achieved the best results in our evaluation, proves to be a suitable candidate for a real-time forecasting model.
Introduction
The number of cyber threats and attacks targeted towards all varieties of devices increases daily. The main topics in the field of cybersecurity are the detection of security incidents and the response to them. Security threats cannot be completely eliminated. Therefore, the current trend is to move from reactive to proactive activities [4]. The main goal is to prevent or mitigate security incidents before they cause harm to the organization.
Methods of predictive analysis play a significant role in predicting specific security incidents, predicting the next steps of the attacker or in predicting the security situation of the organization [11]. In this regard, we recognize three main approaches to predictive methods in cybersecurity:
- attack projection: predicting the next move of an adversary in a running attack by projecting the series of actions the attacker performs [27];
- attack prediction: predicting what types of attacks are going to happen, where, and when [1];
- security situation forecast: forecasting the number of attacks or vulnerabilities in the network of the organization [18].
In this paper, we focus on network security situation forecasting. It builds on the general definition of situational awareness: “Perception of the elements in the environment within a volume of time and space, the comprehension of their meaning and the projection of their status in near future” [8]. Accordingly, network security situation forecasting comprises monitoring cyber systems, understanding the cybersecurity situation (represented by modeling cyber threats or relating security alerts), and predicting changes in the cybersecurity situation [11].
Several important issues need to be addressed in this approach. The main problems include the space and time requirements of the predictive methods, the choice of prediction window, and the criteria for selecting suitable time series.
To summarize the problems outlined above, we aim to answer the following research questions:
- the effect of seasonality on network security situation forecasting,
- the effect of the sliding window on network security situation forecasting,
- time series selection criteria suitable for network security situation forecasting.
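To make the terms seasonality and sliding window concrete, the following minimal sketch slides a fixed-length training window over a series and compares a naive forecast (repeat the last value) with a seasonal naive forecast (repeat the value one period earlier). The function names, the window length of 14 days, the period of 7 days, and the alert counts are all illustrative assumptions. The data are synthetic and deliberately periodic, so the outcome says nothing about the results reported in this paper.

```python
from statistics import mean


def sliding_windows(series, window, horizon=1):
    """Yield (training window, future values) pairs, moving one step at a time."""
    for start in range(len(series) - window - horizon + 1):
        train = series[start:start + window]
        future = series[start + window:start + window + horizon]
        yield train, future


def naive_forecast(train):
    """Forecast the next value as the last observed value (ignores seasonality)."""
    return train[-1]


def seasonal_naive_forecast(train, period):
    """Forecast the next value as the value observed one seasonal period earlier."""
    if len(train) < period:
        raise ValueError("training window must cover at least one period")
    return train[-period]


def mean_absolute_error(series, window, forecaster):
    """Average one-step-ahead absolute error over all sliding windows."""
    errors = [
        abs(forecaster(train) - future[0])
        for train, future in sliding_windows(series, window)
    ]
    return mean(errors)


# Hypothetical daily alert counts with an artificial weekly pattern (assumption).
daily_alerts = [120, 135, 128, 140, 132, 60, 55] * 4

WINDOW = 14  # assumed sliding-window length in days
PERIOD = 7   # assumed seasonal period in days

mae_naive = mean_absolute_error(daily_alerts, WINDOW, naive_forecast)
mae_seasonal = mean_absolute_error(
    daily_alerts, WINDOW, lambda train: seasonal_naive_forecast(train, PERIOD)
)
print(f"naive MAE: {mae_naive:.2f}, seasonal naive MAE: {mae_seasonal:.2f}")
```

On real alert data the gap between the two forecasts may be small or absent; comparing them under the same sliding window is one simple way to gauge how much a seasonal component contributes.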
To answer the questions, we use predictive methods based on time series. Time series models “attempt to make use of the time-dependent structure present in a set of observations” [6]. The appropriate forecasting methods depend largely on what type of data is available. We have the choice of either qualitative forecasting methods (in cases when available data are not relevant to the forecasts) or quantitative forecasting methods. For the research in this paper, we have data available from the Warden system [17], and we have therefore chosen quantitative forecasting methods, which describe the network security situation at a point in time [18].
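As a minimal illustration of a quantitative method that exploits the time-dependent structure of observations, the sketch below implements simple exponential smoothing, the simplest member of the exponential smoothing family. It is not the model evaluated in this paper; the function name, the smoothing parameter value, and the alert counts are assumptions chosen for illustration.

```python
def simple_exponential_smoothing(series, alpha):
    """Return the one-step-ahead forecast from simple exponential smoothing.

    The level starts at the first observation and is updated as
    level = alpha * value + (1 - alpha) * level, so recent observations
    receive exponentially larger weights than older ones.
    """
    if not series:
        raise ValueError("series must not be empty")
    if not 0 < alpha <= 1:
        raise ValueError("alpha must lie in (0, 1]")
    level = series[0]
    for value in series[1:]:
        level = alpha * value + (1 - alpha) * level
    return level


# Hypothetical weekly alert counts (assumption, not data from Warden).
weekly_alerts = [10, 20, 30]

forecast = simple_exponential_smoothing(weekly_alerts, alpha=0.5)
print(forecast)  # 22.5
```

With `alpha` close to 1 the forecast follows the latest observation closely; smaller values average over a longer history.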
This paper builds on the results of our previous research [21]. In previous papers, we focused on the quantitative analysis of the total number of incidents and did not pay attention to different categories of alerts. In research paper [10], we addressed similar issues but worked only with weekly data. This paper clarifies the issues examined and uses annual data as source data, which, in addition to the quantitative component (number), also carry a qualitative component (alert category, network protocol, or network port).
This paper is organized into five sections. Section 2 reviews published research related to predictions in cybersecurity based on time series. Section 3 describes the research methodology and outlines the dataset and methods used for the analysis. Section 4 presents the results of the analysis for the research questions and discusses the knowledge gained from it. The last section contains conclusions.