
Lessons learned from correlation of honeypots’ data and spatial data

Article

Authors: Pavol Sokol, Veronika Kopčová

Abstract

Honeypots and honeynets are unconventional security tools used to study the techniques, methods, tools, and goals of attackers. Analysis of the data collected by these tools is important for network security. In this paper, we focus on spatial data: information about the locations and shapes of geographic features and the relationships between them, usually stored as coordinates and topology. We discuss specific spatial data related to countries and analyse them in relation to the number of attempted attacks collected by honeypots. We analyse the relationship between the spatial data and the number of attempted attacks, as well as the properties of the countries from which attackers attack. We found that there is a relationship between the spatial data related to countries and the number of attempted attacks. The number of attacks is also related to the active population that uses the Internet and to a country’s level of infrastructure and service provision.

Introduction

Cyberspace offers new opportunities, but it is also a source of new threats for both individuals and organizations. Therefore, network security has become an increasingly important part of modern society. Traditionally, information security is primarily defensive and uses conventional tools to protect information (e.g. firewalls). To protect information effectively, it is necessary to collect and investigate as much information about the attacker community as possible. From this point of view, a honeypot is a very useful tool. It can be defined as “a computing resource, whose value is in being attacked” [1]. Lance Spitzner defines a honeypot as “an information system resource whose value lies in unauthorized or illicit use of that resource” [2].

The most widespread classification of honeypots is based on the level of interaction, which distinguishes two types: low-interaction and high-interaction honeypots. A low-interaction honeypot emulates the characteristics of network services or of a particular operating system (e.g., Dionaea [3]). A high-interaction honeypot is a complete operating system with all services and sensors, which is used to get more information about attacks and attackers [4] (e.g., HonSSH [5]).

A special type of high-interaction honeypot is the honeynet. A honeynet can also be described as “a virtual environment, consisting of multiple honeypots, designed to deceive an intruder into thinking that he or she has located a network of computing devices of targeting value” [6]. A honeynet consists of four parts: (i) data control, (ii) data capture, (iii) data collection, and (iv) data analysis [1], [6].

The collection and analysis of data captured by honeypots and honeynets is the main purpose of these tools. Learning new information about attacks, attackers, and their tools helps to protect the network services and computer networks of organizations. Each honeypot collects the IP addresses of attackers. Several useful pieces of information can be obtained from an IP address, for example the name of the Internet provider, the location of the computer or server, and its time zone. The geographic coordinates associated with the attacker’s IP address allow subsidiary data to be extracted, such as country, region, city, and time zone.
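The step from attacker IP addresses to per-country attack counts can be sketched as follows. This is a minimal illustration, not the pipeline used in this paper: the prefix table, the placeholder country codes (`XA`, `XB`, `ZZ`), the sample addresses, and the helper names `country_of` and `attacks_per_country` are all assumptions made for the example. A real analysis would replace the table with a geolocation database.

```python
import ipaddress
from collections import Counter

# Hypothetical prefix-to-country table (assumption for illustration only).
# The prefixes are reserved documentation ranges and the codes are
# user-assigned placeholders, not real geolocation results.
PREFIX_TO_COUNTRY = {
    ipaddress.ip_network("192.0.2.0/24"): "XA",
    ipaddress.ip_network("198.51.100.0/24"): "XB",
}


def country_of(ip: str) -> str:
    """Return the placeholder country code for an IP, or 'ZZ' if unknown."""
    addr = ipaddress.ip_address(ip)
    for network, country in PREFIX_TO_COUNTRY.items():
        if addr in network:
            return country
    return "ZZ"


def attacks_per_country(source_ips):
    """Count attempted attacks per country from an iterable of source IPs."""
    return Counter(country_of(ip) for ip in source_ips)


# Illustrative input: source addresses as a honeypot log might record them.
sample_ips = ["192.0.2.10", "192.0.2.77", "198.51.100.5", "203.0.113.9"]
counts = attacks_per_country(sample_ips)
```

Addresses that match no known prefix are grouped under `ZZ` rather than dropped, so the total number of recorded attempts is preserved.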

The data mentioned above can be referred to as spatial data, because their location within geographical space can be extracted from the IP address. This makes it possible to locate individuals or devices anywhere in the world [7]. Spatial data is also known as geospatial data, spatial information, or geographic information. From a research perspective, the geographical location of attackers may be useful for identifying attacks. This paper continues our analysis of data collected from honeypots and honeynets. In [8], we focused on time-oriented data and discussed the relationship between time and the data captured by honeypots. The main aim of this paper, in contrast, is to obtain information about attackers by analysing spatially oriented data. This paper is interdisciplinary and combines geoinformatics, information security, and mathematical statistics.

We address the following two research questions in this paper:

  • What is the relationship between the spatial data and the number of attempted attacks?

  • From which countries do the attackers attack?
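One common way to quantify a relationship such as the one in the first question is a correlation coefficient between a per-country attribute and the per-country number of attempted attacks. The sketch below computes Pearson's coefficient. The choice of Pearson's coefficient is an assumption, since this section does not name the statistic used in the paper, and the input values are synthetic placeholders, not measurements.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Synthetic placeholder values (NOT data from the paper): one entry per
# hypothetical country, e.g. Internet users and attempted attacks.
internet_users = [1.0, 2.0, 3.0, 4.0]
attack_counts = [10, 20, 30, 40]
r = pearson(internet_users, attack_counts)
```

A value of `r` close to 1 or -1 indicates a strong linear relationship, while a value near 0 indicates none. A rank-based coefficient may be preferable when the per-country counts are heavily skewed.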

 

This paper is organized into eight sections. Section II reviews published research related to lessons learned from the analysis of honeypot and honeynet data. Section III outlines the dataset and methods used in the experiment. Section IV presents the results of the spatial analysis. Sections V, VI and VII focus on specific spatial data from various aspects, such as Internet users, population, and economic aspects. The last section contains conclusions and our suggestions for future research.