Peer-to-peer botnets: exploring behavioural characteristics and machine/deep learning-based detection

Emerging technologies on the Internet are moving toward decentralisation. Botnets have always been one of the biggest threats to Internet security, and botmasters have adopted the robust concept of decentralisation to develop and improve peer-to-peer botnet tactics. This makes the botnets more sophisticated and evasive, although bots under the same botnet have symmetrical behaviour, which is what makes them detectable. However, the literature indicates that the last decade has lacked research that explores new behavioural characteristics that could be used to identify peer-to-peer botnets. For these reasons, in this study, we propose two new methods to detect peer-to-peer botnets: first, we explored a new set of behavioural characteristics based on network traffic flow analyses that allow network administrators to more easily recognise a botnet's presence, and second, we developed a new anomaly detection approach by adopting machine-learning and deep-learning techniques that have not yet been leveraged to detect peer-to-peer botnets using only the five-tuple static indicators as selected features. The experimental analyses revealed new and important behavioural characteristics that can be used to identify peer-to-peer botnets, whereas the experimental results for the detection approach showed a high detection accuracy of 99.99% with no false alarms.


Introduction
The term "bot" refers to a compromised machine under the command of a botmaster, whereas the term "botnet" refers to a network of such compromised machines [1]. Typically, bots are exploited to perform various attacks, such as stealing data, launching distributed denial of service (DDoS) attacks, phishing, and spam [2]. Recently, botnets have led to huge threats to Internet infrastructure security in different scenarios. Therefore, managing and improving network security have become more challenging, especially since the attackers are also improving their tactics and capabilities to avoid the existing countermeasures against them. In the last decade, botmasters developed their tactics well by benefiting from several robust concepts, such as decentralisation [3]. The concept of decentralisation has been used to solve many of the biggest problems related to the Internet's network infrastructure, such as the single point of failure problem. However, it also brought new challenges when illegal intruders utilised the same strong points against the original purposes of those points. For example, peer-to-peer (P2P) botnets have been observed to adopt the P2P architecture, and these botnets are characterised by dispersion and distribution [4,5]. Figure 1 shows the difference between the P2P botnets on the left side and the centralised botnets on the right side.
In addition, P2P botnets have no independent botnet mainframe, which eliminates the vulnerabilities that weaken other architectures [6]. Furthermore, P2P botnets are more resilient and stealthier than other types of botnets, which is another reason why they are very difficult to defeat or detect [7].
However, there are still several security countermeasures for botnets, and each countermeasure thwarts the botnets differently. For example, botnet monitoring provides information about most bots using monitoring mechanisms, such as honeypots, crawlers, and sensors [4]. These mechanisms support a deeper understanding and analysis of bot behaviour, which in turn helps to identify the botnets' characteristics and behaviours in the network. Another effective security countermeasure is the intrusion detection and prevention system (IDPS); the purpose of these systems is to monitor network traffic to detect unauthorised access and take measures to prevent it [8]. There are two main types of intrusion detection systems (IDSs): anomaly based and signature based. The first type detects abnormal traffic based on deviations from the normal network traffic. The second type defines certain misbehaviours or signatures and then detects them once they happen. In terms of the location model, there are two main types of IDSs: host-based IDSs reside in the host, and network-based IDSs reside across the whole network [9,10]. However, they are not foolproof and may not catch all botnet instances, especially because botnets are constantly evolving, which makes it challenging for even effective IDPSs to keep up with new botnet tactics.
Although IDPSs are valuable security countermeasures, they are not a panacea, and they should be supplemented with other security practices to effectively mitigate cyberthreats such as botnets. This work aims to fill this gap by exploring more behavioural characteristics of one of the most serious modern threats, namely P2P botnets. This paper proposes a new method of network traffic analysis that helps to identify new behavioural indicators of P2P botnets in the network. This method groups the behavioural characteristics into two categories: (i) flow-based characteristics, which use two measures, packets per flow (PPF) and bytes per packet (BPP), as indicators, and (ii) deviation from standard behaviour, which measures the behavioural deviation in the transport layer and application layer as another indicator. In practice, these artefacts can be used as indicators of compromise (IOCs) that network administrators can leverage to secure their networks from such threats.
Furthermore, this paper also proposes a novel approach to detect P2P botnets using machine learning (ML) and deep learning (DL) techniques. The proposed approach utilises only the static indicators (five-tuple), namely the source and destination IP addresses, source and destination port numbers, and protocol identifier number, as selected features.
In summary, this paper presents two security countermeasures for P2P botnets. First, we explore new behavioural characteristics/dynamic indicators (also known as IOCs) for P2P botnets to enable network administrators to distinguish the P2P traffic crossing the network boundaries via a network traffic analysis. Second, we utilise the static indicators (five-tuple) to detect P2P botnets using ML/DL techniques that have not yet been leveraged. For evaluation purposes, we utilise a recently published dataset by Kabla et al. [11] that contains a P2P botnet scenario. To summarise, our contributions in this paper are as follows:
• Proposing a new method based on analysing the network traffic flow and deviation from the standard
The paper is organised as follows. Section 2 conducts a comprehensive review of the related works and summarises the state of the art of ML/DL-based solutions. Section 3 defines the dataset used in this work. Section 4 lists the implementation prerequisites of this work. Section 5 describes the newly proposed method of exploring a new set of behavioural characteristics of P2P botnets. Section 6 presents the proposed ML/DL-based approach to detect P2P botnets. Finally, Sect. 7 concludes this work and outlines directions for future work.

Related works
The connection between compromised machines and the command and control (C&C) servers is an inevitable operation needed to call commands and updates. Consequently, some indicators always lead to the recognition of the botnets in a network [12]. In this section, we comprehensively review related work meant to identify botnet behaviour by analysing network traffic. Furthermore, this section also summarises the related works that have proposed IDSs to specifically detect P2P botnets, in a table at the end of this section (Table 1). Most of the effective IDSs proposed by related works are based on ML/DL techniques. In brief, ML and DL are subfields of artificial intelligence (AI), which can be defined as the capability of machines to learn and imitate intelligent human behaviour [13][14][15]. In addition, ML and DL techniques have shown promise as effective and efficient mechanisms for detecting anomalous behaviour [15,16].
Lee et al. [17] used the degree of periodic repeatability to distinguish between malicious HTTP bots and benign nodes. The authors considered the repeatability standard deviation in the detection of HTTP botnets as the degree of periodic repeatability. The results showed that the flows from benign nodes and HTTP bots were distinguishable. However, this paper only dealt with a sample of malicious HTTP botnets, with the only feature vector being the degree of periodic repeatability, i.e. the authors only looked for malicious HTTP botnets by monitoring the relations between the HTTP servers and bots.
Strayer et al. [18] examined flow characteristics, such as the packet timing, burst duration, and bandwidth, and then considered various indicators as evidence of the existence of botnet command and control. The authors started by eliminating the traffic that was unlikely to represent the activity of a botnet. They then classified the remaining traffic into groups that were likely to represent botnet activities. Furthermore, the authors correlated the likely traffic to determine the common communication patterns used by the botnet activities. Ultimately, the authors showed that the evidence for botnets could be extracted from traffic traces. However, they only practically evaluated their work with IRC commands.
W. Lu et al. [19] presented a classification approach for the detection of botnets. The authors evaluated the proposed framework using the web and the IRC community; the evaluation results showed a high detection rate with a low false alarm rate. In addition, the authors formalised the botnet behaviour using the average standard deviation for the byte frequency (over 256 ASCII characters in the traffic payload). Then, they provided a botnet strategy, whereby a higher average deviation value represented a higher likelihood that the traffic was generated by human beings. This indication strategy is important when using unsupervised learning (e.g. clustering) to detect botnets. However, this approach requires a large number of bots in the network, and, intuitively, it is inefficient when there is a small-scale botnet.
Venkatesh et al. [20] proposed a method to detect HTTP-based botnets using the behaviour of bots in the network. The authors discovered that most web-based botnets' communications exploit TCP connections. The behaviours of the TCP connections were extracted as selected features to detect HTTP-based botnets using ML techniques, such as neural networks. This method demonstrated the capability to detect HTTP-based botnets with a high detection rate and low false alarm rate. However, the authors only evaluated the proposed method by using the Zeus and SpyEye bots, and both these bots are similar in their behaviour in network traffic.
G. Gu et al. [21] proposed a detection system based on the protocol and structure used by botnets. This system exploits the properties of botnets, as bots of each botnet utilise the same C&C communications, i.e. they have similar malicious behaviours.
Wang et al. [22] presented an approach for detecting web-based C&C bots by identifying their network behaviour in a supervised network. Modelling the essential network behaviour showed that the approach could be used to detect web-based C&C bots with a low false-positive rate. The authors noticed that the bots under the same botnet had similar connections when carrying out C&C communication. They therefore aimed to extract the common network behaviours used by web-based bots in order to automate the detection model. However, the authors considered neither group activities nor the payload information.
Eslahi et al. [23] proposed low-access-rate and high-access-rate filters; these filters reduced the false-positive rate in HTTP-based botnet detection. The high-access-rate filter was proposed based on the fact that botnets do not generate bulk data. Therefore, this filter was designed to remove any traffic that generates a high rate of requests. Later, those high-rate requests are labelled as automatic software rather than bot communications. The low-access-rate filter ignores the traffic that appears to be low, as bots are created to perform faster than humans, as well as to undertake larger tasks, i.e. bots do not generate brief traffic.
Jang et al. [24] studied how detection methods can be evaded; analysing the evasion techniques was intended to contribute to detecting botnets.

Table 1 (fragment): reference, technique, ML/DL type, and remarks
• The proposed approach could not detect the botnets that have irregularities in their traffic flow, such as Storm, because the method was built based on the similarity of botnet traffic
[34] Bayesian networks, Naive Bayes, J48 (ML)
• Proposed a methodology to detect P2P botnets using ML techniques and achieved a high detection rate
• Research was only conducted for the LAN environment
[35] Decision tree (ML)
• Proposed a P2P detecting system involving the identification of malicious fast-flux networks
• The system is based on low time to live (TTL); when the TTL reaches zero, packets are discarded. This leads to loss of some of the network information
[12] Neural network (DL)
• Based on a multilayer NN, the proposed method achieved a high detection rate of 99%
[6] K-nearest, REP tree, SVM (ML)
• Proposed a new feature extraction method using the graphic symmetry concept to detect P2P botnets
[36] Decision tree (ML)
• Proposed an approach based on an ML classifier to detect P2P botnets at the node level
• Storage overheads and major computational resources were required to process the constant flows at the node without even feature engineering
[32] MultiBoostAB, DecisionStump (ML)
• Detection of parasite P2P botnets using machine-learning classifiers
• The authors used the same dataset [31], which is small and limited in terms of the traffic type
AlAwadi et al. [25] proposed a multi-phase IRC botnet behaviour detection model. The authors used the C&C response messages and the malicious behaviours of IRC bots to identify botnets in the network environment.
Rostami et al. [26] provided an overview of the features and parameters utilised to detect HTTP botnets in order to propose a set of characteristics for the HTTP protocol that could be used to analyse and detect botnets. The authors presented various HTTP protocol attributes, such as GET, POST, and the user agent, in order to facilitate better understanding and classification of HTTP packets.
In sum, botnets quickly upgrade their functionalities and improve their methods to evade detection techniques. Consequently, the periodic tasks with C&C servers and the packet size can change, which can defeat current botnet detection systems based on these features. Therefore, studying other attributes based on traffic analyses might help to develop new indicators that can facilitate botnet detection by network administrators.

Dataset definition
For many reasons, such as privacy considerations, obtaining a real network dataset is difficult. Most existing datasets are therefore simulation-based. We were not concerned about whether the dataset used here was a real network or a simulation-based one, but we were concerned about the method of construction. Thorough and adequate dataset construction is important since new IDSs should be evaluated before deployment in real networks using a robust dataset. Issues in the datasets may even be reflected in the final evaluation [40].
We comprehensively studied the existing datasets, and each one was found to have its limitations: some were small-size datasets, some were unknown-source datasets, and some were datasets that were no longer reachable. Table 2 summarises the information about the existing datasets that contain P2P botnet traffic flows.
The issue with most of the existing datasets is that they are incomplete. For detection purposes, the dataset must contain attack traffic mixed with background traffic in order to allow the trained model to learn more about both normal and abnormal behaviour. For example, the CTU-13 dataset is the most widely used compared to others (for instance, Xing et al., 2022) because it is a reliable and well-constructed dataset. However, after we experimentally analysed this dataset, we found that no benign traffic was recorded from noninfected machines, i.e. once we blocked the IP addresses of the botmaster and the infected machines, no traffic was left. Only one dataset contained traffic from both P2P botnets and benign nodes, namely the one published by Kabla et al. [11].
Another important point is that most of the datasets are provided as CSV files, and we counted this as a limitation since CSV files only reflect a limited image of network traffic. In addition, flow-based behavioural indicators and deviation-from-standard behavioural indicators cannot be derived from CSV files, whereas PCAP files give complete network information, allowing better understanding when devising new IOCs.

Table 2 Existing datasets that contain P2P botnet traffic flows (dataset: description)
• P2P botnet dataset - PeerAmbush [11]: The latest published dataset that contains a P2P botnet scenario. This dataset is well constructed, and it was used in research on detecting P2P botnets using a deep learning technique [11]
• DCNDS [41]: Project dataset including a P2P botnet scenario. This dataset does not contain background flows, and no PCAP file is provided
• CTU-13 [42]: Includes 13 scenarios with different botnet samples, such as the P2P botnet. Many protocols are considered, such as ICMP, TCP, and DNS. However, this dataset does not contain background flows
• VHS-22 [43]: A CSV file that contains mixed flows of botnets from other datasets, such as ISOT, CICIDS, CTU-13, and MTA, with legitimate traffic
• MTA-KDD-19 [44]: Malware Traffic Analysis Knowledge Dataset. However, only a small CSV file is provided
• Trend Micro [45]: CTF Wildcard botnet dataset 400. It contains only the following features: timestamp, source, destination, port, and bytes. It is provided as a CSV file
• P2P-BDS [46]: Based on the article "Peer-2-Peer botnet detection system", but it is no longer reachable
• ISOR [47]: Based on [47], but it is no longer reachable
• ISOT [48]: Botnet dataset that only contains traffic passed from/to DNS

Given the above reasons, we selected the P2P botnet dataset (PeerAmbush) [11] to evaluate the two proposed methods. The selected dataset is available for other researchers at Kaggle, namely: P2P botnet dataset - PeerAmbush. Figure 2 shows the dataset construction process of the selected dataset [11].
The selected dataset was completed by including the traffic flows of the botmaster, bots/infected machines, and noninfected machines. The selected dataset is then used for two purposes: to explore a new set of behavioural characteristics for a P2P botnet and to train a detection model using the static indicators. Table 3 describes the selected dataset.

Implementation prerequisites
The implementation prerequisites include the programming languages and software tools needed to experimentally implement the proposed methods as follows: (i) identify the behavioural characteristics of P2P botnets by analysing the network traffic flow, and (ii) detect the P2P botnets using ML/DL techniques. Tables 4 and 5 list the hardware and software specifications used, respectively.

Behavioural characteristics of peer-to-peer botnets
Typically, the main part of a botnet is the C&C channel. When we analysed network traffic, the behavioural indicators of C&C were also analysed. There may be some common features among the bots in network traffic, such as when botmasters are directly or indirectly informed about botnet detection or analysis activities. In addition, botmasters are required to periodically update the bots, which forces them to find a means of communication that, in the end, will be evidence of their presence. This kind of bot activity makes them recognisable and detectable. However, large-scale networks with extensive Internet bandwidth and administrative restrictions make it harder to monitor the whole network and accurately detect intrusions. Thus, this paper presents a new set of behavioural characteristics that can be used as IOCs to recognise the presence of P2P botnets in a network environment.
Unlike packet-based analysis, the behaviour level is related to higher-level features that are extracted from the traffic flow in order to help the network administrator recognise P2P botnets. In this study, we categorised the behavioural characteristics into flow-based characteristics and deviations from the standard behaviour of the network protocols. Notably, the experimental findings indicated deviations from standard behaviour in the transport layer (UDP) and the application layer (HTTP). Figure 3 summarises the categorisation of behavioural characteristics in this paper.
In other words, we relied on behaviour analysis and recognition using the standard protocol behaviours (i.e. the dynamic indicators) and disregarded the port-based analysis undertaken by some researchers because it would produce high false-positive rates. The reason for these high false-identification rates is that thousands of network applications no longer use the registered TCP/UDP ports [49].
On the other hand, despite each botnet implementing its own C&C mechanism, such mechanisms exhibit distinguishable behaviours that can be captured by analysing the network behavioural indicators, allowing the network administrator to recognise anomalies easily. Furthermore, partially matching behaviours occur regularly in the lifetimes of botnets, which is another factor that makes it possible to capture them. For example, the botmaster may distribute scripts that automatically execute when certain events happen, such as new bots joining the botnet.

Flow-based behavioural characteristics
This category involved classifying distinctive network traffic behaviours as indicators of anomalies or benign node traffic. The analysis was based on the flow; a flow is a set of packets that belong to the same instance of communication with an application at the source and destination hosts. One of the most common ways of identifying a particular UDP or transmission control protocol (TCP) flow is by using the five-tuple features: source IP address, destination IP address, source port number, destination port number, and protocol identifier number [50]. The items in the five-tuple were used as static indicators to detect P2P botnets using ML/DL techniques (Sect. 6) in order to show how indicative these static indicators are in the detection of botnets. Notably, no related work has yet leveraged the five-tuple for detection purposes.
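The five-tuple flow definition above can be illustrated with a minimal Python sketch. It is not the tooling used in this study: the record layout (a dict per packet), the names `FiveTuple` and `group_into_flows`, and all addresses, ports, and lengths are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from typing import NamedTuple


class FiveTuple(NamedTuple):
    """Static indicators that identify a unidirectional flow."""
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    protocol: int  # IANA protocol number, e.g. 6 = TCP, 17 = UDP


def group_into_flows(packets):
    """Group packet records into flows keyed by their five-tuple."""
    flows = defaultdict(list)
    for pkt in packets:
        key = FiveTuple(pkt["src_ip"], pkt["dst_ip"],
                        pkt["src_port"], pkt["dst_port"], pkt["protocol"])
        flows[key].append(pkt)
    return dict(flows)


# Hypothetical packet records (addresses, ports, and lengths are made up).
packets = [
    {"src_ip": "10.0.0.5", "dst_ip": "10.0.0.9", "src_port": 40000,
     "dst_port": 8080, "protocol": 17, "length": 120},
    {"src_ip": "10.0.0.5", "dst_ip": "10.0.0.9", "src_port": 40000,
     "dst_port": 8080, "protocol": 17, "length": 98},
    {"src_ip": "10.0.0.7", "dst_ip": "10.0.0.9", "src_port": 51000,
     "dst_port": 80, "protocol": 6, "length": 540},
]
flows = group_into_flows(packets)
```

In this sketch the two UDP packets share one five-tuple and therefore form a single flow, while the TCP packet forms a second flow. Treating the two directions of a conversation as one flow would require normalising the key, which is deliberately left out here.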
However, to uniquely identify a flow, we must define it as something altogether different. Moreover, this analysis can work with encrypted traffic because it does not rely on the packet payload.
Flow-based indicators fall into two types: static indicators, which do not change over the flow's lifetime, and dynamic indicators, which change as the flow progresses through time. As is known, the immutable information in the IP and TCP/UDP headers is a significant source of statistical indicators (Sect. 6 describes P2P botnet detection using static indicators). The static indicators include the five-tuple values (as mentioned above).
Likewise, some dynamic indicators, such as the packet size values, may also be derived from the payload information and packet header. In contrast, the packet arrival and departure times represent dynamic indicators, but they are outside the packet. Further dynamic indicators can be derived, such as burst times, periodic throughput samples, and bytes per burst.

Fig. 3 The behavioural characteristics
In our experimental analysis, we depend on two new and important indicators to distinguish the behavioural characteristics: packets per flow (PPF) and bytes per packet (BPP).

Packets per flow (PPF)
The PPF refers to how many packets uniquely represent a single flow. The PPF revealed that the greatest numbers of packets were transmitted (Tx packets) and received (Rx packets) by the botmaster IP, followed by the infected machines, as shown in the screenshot in Fig. 4.

Bytes per packet (BPP)
In the same way, the BPP revealed that the volume of data (Tx bytes, Rx bytes) sent to/from the botmaster was the greatest, followed by that to/from the infected machines, as shown in the screenshot in Fig. 4. The IP addresses of the botmaster and the bots are listed below the screenshot in Fig. 4.
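As a rough illustration of the two flow-based indicators, the following Python sketch computes PPF and BPP per flow and ranks hosts by their combined Tx/Rx packet counts. It assumes flows are already available as a mapping from a five-tuple to a list of packet lengths in bytes; the function names and all values are hypothetical and do not come from the selected dataset.

```python
from collections import Counter


def flow_metrics(flows):
    """Return {five_tuple: (PPF, BPP)} for a mapping of flows to packet lengths."""
    metrics = {}
    for key, lengths in flows.items():
        ppf = len(lengths)                           # packets per flow
        bpp = sum(lengths) / ppf if ppf else 0.0     # mean bytes per packet
        metrics[key] = (ppf, bpp)
    return metrics


def packets_per_host(flows):
    """Count transmitted (Tx) and received (Rx) packets for every IP address."""
    tx, rx = Counter(), Counter()
    for (src_ip, dst_ip, *_), lengths in flows.items():
        tx[src_ip] += len(lengths)
        rx[dst_ip] += len(lengths)
    return tx, rx


# Hypothetical flows: (src_ip, dst_ip, src_port, dst_port, protocol) -> lengths
flows = {
    ("10.0.0.1", "10.0.0.2", 5000, 6000, 17): [100, 200, 300],
    ("10.0.0.2", "10.0.0.1", 6000, 5000, 17): [50],
    ("10.0.0.3", "10.0.0.1", 4000, 80, 6): [400, 600],
}
metrics = flow_metrics(flows)
tx, rx = packets_per_host(flows)

# Rank hosts by total packets sent and received, busiest first.
ranking = sorted(set(tx) | set(rx), key=lambda ip: tx[ip] + rx[ip], reverse=True)
```

A host that consistently tops such a ranking would be a candidate for closer inspection, in the same spirit as the endpoint statistics referred to for Fig. 4; any threshold for flagging a host is left open here.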

Deviation from standard behavioural indicators of the protocols
The analysis of deviations from standard behavioural indicators is also known as protocol-based analysis. This analysis is based directly on the packet's payload. This analysis has a low false-positive rate compared to other analyses; thus, we worked with two different analysis directions in order to avoid a limited indication reading. However, there are two drawbacks to this method of analysis: it poses a possible threat to privacy, and it is computationally intensive.
In the analysis of deviations from standard behavioural indicators, the experimental findings showed deviations in two network layers: the transport layer and the application layer. The deviations were in two protocols: UDP and HTTP. Figure 5 shows the positions of the deviations from the standard behavioural indicators in the network layers.

Transport layer -UDP
For the CTU-13 botnet dataset, we realised that the botnet utilised the UDP protocol as the main carrier channel to infect computers. Compared to other protocols, UDP accomplishes this process in a simple fashion: it sends packets directly to a target computer without establishing a connection first, indicating the order of said packets, or checking whether they have arrived as intended, unlike the TCP protocol, which completely relies on a handshaking-style connection. Many of the security mechanisms in other protocols allow computers to drop suspicious requests, whereas with UDP no acknowledgement is required.
For example, we compare UDP connections to TCP handshaking in Fig. 6 to show the ease with which botnets can use UDP as a carrier channel.
The comparison offers valuable insight and provides a better understanding that can be used with indicators to recognise deviations from protocol standard behaviours. Our experimental analyses showed that the UDP protocol was more often leveraged by the P2P botnets than TCP, as shown in the screenshot in Fig. 7. The IP addresses of the botmaster and the bots are listed below the screenshot in Fig. 7.
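One simple way to quantify how strongly a host leans on UDP is the share of UDP packets among its UDP and TCP packets. The Python sketch below assumes packets are available as (source IP, protocol number) pairs; the function name and the sample traffic are hypothetical, and no particular cut-off value is implied by this study.

```python
from collections import defaultdict

UDP, TCP = 17, 6  # IANA protocol numbers


def udp_share_by_host(packets):
    """Return {src_ip: fraction of that host's TCP/UDP packets that are UDP}."""
    counts = defaultdict(lambda: {"udp": 0, "tcp": 0})
    for src_ip, protocol in packets:
        if protocol == UDP:
            counts[src_ip]["udp"] += 1
        elif protocol == TCP:
            counts[src_ip]["tcp"] += 1
        # Other protocols are ignored in this sketch.
    return {host: c["udp"] / (c["udp"] + c["tcp"]) for host, c in counts.items()}


# Hypothetical traffic: one UDP-heavy host and one TCP-only host.
packets = [("10.0.0.1", UDP)] * 3 + [("10.0.0.1", TCP)] + [("10.0.0.2", TCP)] * 4
shares = udp_share_by_host(packets)
```

A high UDP share on its own is not proof of infection, since many benign applications use UDP; it is only one indicator to be read alongside the others.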

Application layer -HTTP
Regarding the HTTP protocol and why it is preferable for exploitation by botnets, botmasters of P2P botnets might publish the commands on a certain website to update the bots. This process continues regularly at intervals predefined by the botmasters.
In recent years, HTTP has become the dominant protocol among the various protocols for Internet services as it provides a set of rules for the management of the data exchange between servers and browsers. Analysing HTTP traffic has thus become a common method in current HTTP-based botnet detection studies [17,20,23]. With the HTTP protocol, bots hide their communication flows within the normal HTTP flows, making them stealthy and difficult to detect. Monitoring and inspecting HTTP packets can reveal valuable information that can help network administrators analyse botnets' behaviour better and, ultimately, detect their presence in the network. In our experimental analyses, we identified several HTTP characteristics that were very helpful in distinguishing the bot traffic from the rest of the web network traffic. The screenshot in Fig. 8 clearly shows that the greatest numbers of packets were transmitted (Tx packets) and received (Rx packets) by the botmaster IP, followed by the infected machines, and there was a noticeable difference in

Detecting peer-to-peer botnets using the five-tuple static indicators
The rapid growth of network bandwidth is one of the most significant challenges for botnet detection systems. Thus, one of the critical assessment criteria for IDS researchers is the processing capability of IDSs. Well-known IDSs, such as Bro and Snort, nowadays consume large amounts of resources when they process a large amount of payload data over a high-speed network [51].
The direction of current research shows the effectiveness of data mining and the adaptation of ML/DL techniques for detecting botnets [11,51,52]. For many reasons, such as the growing sizes of payload information streaming on the network and increasing network speeds, solutions that rely on learning-based techniques are preferable because these techniques can automate the processing of huge amounts of data. ML/DL technique-based solutions can save resources and time for systems, reduce the solution complexity, and make the process smoother. Moreover, data mining and ML/DL techniques are easy to apply to network flow information. In addition, the evaluation metrics are convenient indicators for the detection of botnets.
Given the above reasons, we experimentally examined two ML and DL techniques (NBTree and MLP) that have not previously been evaluated for the detection of P2P botnets using only the five-tuple features (previously mentioned in Sect. 5.1), i.e. the static indicators comprising the source IP address, destination IP address, source port number, destination port number, and protocol identifier number. The NBTree technique is a decision tree-based attribute-weighting technique with an adaptive Naïve Bayesian Tree [53]. The algorithm's pseudo-code and an analysis of NBTree can be found in [54]. The multilayer perceptron (MLP), in turn, is a deep neural network. Unlike other classification techniques, such as support vector machines or the Naive Bayes classifier, the MLP classifier relies on an underlying neural network to perform the task of classification [11]. The algorithm's pseudo-code can be found in [55].
The proposed approach consists of three major stages: data preparation, feature selection, and ML/DL-based detection. Figure 9 shows the road map for the proposed approach.

Fig. 9 The road map for the proposed approach

Data preparation
The data preparation process entails the preparation of the selected dataset for the next stages through various steps that make it readable by the ML and DL algorithms. The first step after selecting the dataset was data labelling because we adopted supervised ML/DL techniques in the third stage. We labelled the dataset with multiple classes: botmaster, bot, and normal records. Data cleaning was then necessary to remove incorrectly formatted, incomplete, or corrupted data, because merging multiple datasets (as in the construction of the selected dataset [11]) creates opportunities for data to be mislabelled or duplicated. Next, we converted the dataset into numerical data to make it readable by the algorithms. Finally, we scaled the numerical data to fit within a specific range, such as 0-1 or 0-100. We scaled the dataset because algorithms used in the third stage, such as the ML algorithm [56], are based on measuring how far apart the data points are. The prepared dataset represented the input for the next stages.
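The conversion and scaling steps can be sketched as follows. This is only one plausible encoding, assumed for illustration: IP addresses are mapped to their integer value with Python's standard `ipaddress` module, and every column is min-max scaled to the range 0-1. The function names and the sample rows are hypothetical, and the source does not state which encoding was actually used.

```python
import ipaddress


def encode_row(row):
    """Convert one five-tuple (IPs as strings) into a list of floats."""
    src_ip, dst_ip, src_port, dst_port, protocol = row
    return [
        float(int(ipaddress.ip_address(src_ip))),
        float(int(ipaddress.ip_address(dst_ip))),
        float(src_port),
        float(dst_port),
        float(protocol),
    ]


def min_max_scale(rows):
    """Scale every column to [0, 1]; constant columns are mapped to 0.0."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


# Hypothetical five-tuple records.
raw = [
    ("10.0.0.1", "10.0.0.3", 1000, 80, 6),
    ("10.0.0.2", "10.0.0.3", 2000, 80, 17),
    ("10.0.0.3", "10.0.0.3", 3000, 80, 6),
]
encoded = [encode_row(r) for r in raw]
scaled = min_max_scale(encoded)
```

Mapping an IP address to a single integer imposes an ordering that may not be meaningful for every classifier; alternatives such as per-octet features are equally plausible and are not ruled out by the text.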

Feature selection
As discussed previously in Sect. 5, we considered the static indicators (i.e. the five-tuple features comprising the source and destination IP addresses, source and destination port numbers, and protocol identifier number) as selected features, in addition to the class, for the detection of the P2P botnets.

Machine and deep learning-based detection
The behaviour of P2P botnets is distinguishable from benign behaviour in a network. The P2P botnet detection issue could be modelled as a multi-class classification task, thanks to our previous labelling of the dataset into botmaster, bot, and benign flows. In order to detect the P2P botnet, we used only the five-tuple features, as previously mentioned (Sect. 6.2). Accordingly, we adopted ML and DL techniques that have yet to be leveraged to detect P2P botnets. Day by day, the relationship between cybersecurity and AI applications, such as ML/DL techniques, becomes stronger [57]. This interplay between cybersecurity and AI applications, such as ML, reflects the effectiveness of these solutions in defeating cyber threats [13,52]. Although there are still some risks from AI in some fields (as discussed by Radanliev et al. [58]), it is efficient and effective in anomaly detection and worth investigating.
We used two different testing approaches: cross-validation and percentage splitting.The cross-validation testing approach splits the dataset into folds.For example, if there are 10-folds, 9 of them may be specified for training and evaluation purposes and only 1 for testing purposes.Percentage splitting splits the dataset into two different sets: the first comprises 80% of the original dataset and is for training purposes, while the other 20% of the original dataset is for testing purposes [59,60].
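The two testing approaches can be sketched at the level of record indices. This is a simplified illustration under stated assumptions: the function names are hypothetical, folds are contiguous, and no shuffling or stratification is applied, whereas a practical tool may do both.

```python
def k_fold_indices(n_samples: int, k: int = 10):
    """Yield (train_indices, test_indices) for each of k contiguous folds."""
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


def percentage_split(n_samples: int, train_fraction: float = 0.8):
    """Return (train_indices, test_indices) for a single train/test split."""
    cut = round(n_samples * train_fraction)
    return list(range(cut)), list(range(cut, n_samples))


folds = list(k_fold_indices(100, k=10))
train_idx, test_idx = percentage_split(100)
```

With 10 folds, every record is tested exactly once and trained on nine times, whereas the 80/20 split tests each held-out record once and never trains on it.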

Parameter settings
This section presents the parameter settings of the ML and DL classifiers used in this work. Two algorithms are used as classifiers: NBTree as the ML classifier and MLP as the DL classifier. As mentioned above, two testing approaches are used in this stage: cross-validation and percentage splitting. The parameter settings for NBTree are as follows. For the cross-validation testing approach, the batch size was 100, and the number of decimal places used for numeric output in the model was 2. The number of folds used to assess the performance and generalisation ability of NBTree was 10 in this experiment. For percentage splitting, the batch size and the number of decimal places were the same as those used in cross-validation. In this testing approach, the dataset was sliced into 10 folds. This ensures that the classifier is trained on the majority of the dataset while still retaining a portion for independent testing, which helps to assess its generalisation to unseen data.
The parameter settings for MLP are as follows. For the cross-validation testing approach, the number of training instances used in one iteration (the batch size) is 100. In addition, there are 10 hidden layers in our proposed MLP. We set the learning rate for updating the node weights to 0.3 and the momentum applied to weight updates to 0.2. The number of folds used to assess the performance and generalisation ability of MLP was 10 in this experiment. For the percentage split, the batch size is also 100, and there are likewise 10 hidden layers. The learning rate and momentum are the same as those set for the cross-validation testing approach. In this testing approach, the dataset was divided into 80% for training and 20% for testing, as performed by [11,56].
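To make the roles of the learning rate (0.3) and momentum (0.2) concrete, the sketch below applies one weight update under the conventional gradient-descent-with-momentum rule. The paper does not state the exact update rule of its MLP implementation, so this rule, the function name `momentum_step`, and the sample gradient values are assumptions for illustration only.

```python
def momentum_step(weight, gradient, previous_delta,
                  learning_rate=0.3, momentum=0.2):
    """Apply one gradient-descent update with momentum to a single weight.

    delta_t = -learning_rate * gradient + momentum * delta_{t-1}
    """
    delta = -learning_rate * gradient + momentum * previous_delta
    return weight + delta, delta


# Two consecutive updates on a hypothetical weight with invented gradients.
w1, d1 = momentum_step(weight=1.0, gradient=0.5, previous_delta=0.0)
w2, d2 = momentum_step(weight=w1, gradient=0.5, previous_delta=d1)
```

The second step moves slightly further than the first because 20% of the previous change is carried over, which is the smoothing effect the momentum term is meant to provide.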

Evaluation metrics
Many evaluation metrics can be used to evaluate the performance of applied techniques, such as the false-positive rate (FPR) and true-positive rate (TPR). In this study, we evaluated our proposed approach using the following key metrics: accuracy, recall, precision, FPR, TPR, and F-score. Table 6 describes the evaluation metrics and the equations used to calculate them [13].
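The listed metrics can be computed from confusion-matrix counts as sketched below. This is an illustrative helper rather than the authors' code: it assumes binary counts (attack as the positive class, e.g. computed per class in the multi-class setting), uses the standard definition of accuracy, and the counts in the example are invented.

```python
def safe_ratio(numerator: float, denominator: float) -> float:
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def evaluation_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute the six metrics from true/false positive/negative counts."""
    precision = safe_ratio(tp, tp + fp)
    recall = safe_ratio(tp, tp + fn)
    return {
        "accuracy": safe_ratio(tp + tn, tp + fp + tn + fn),
        "recall": recall,
        "precision": precision,
        "fpr": safe_ratio(fp, fp + tn),
        "tpr": recall,  # TPR uses the same formula as recall
        "f_score": safe_ratio(2 * precision * recall, precision + recall),
    }


# Hypothetical counts: 90 attacks caught, 5 missed, 10 false alarms, 95 clean.
metrics = evaluation_metrics(tp=90, fp=10, tn=95, fn=5)
```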

Experimental results
In this section, we compare the experimental results for our proposed approach to existing related work (see Table 1 in Sect. 2). As mentioned above, we applied ML and DL techniques to detect the P2P botnet: NBTree as an ML classifier and MLP as a DL classifier. Both classifiers surpassed the results of related work in terms of accuracy, recall, precision, FPR, TPR, F-score, and even the time taken to build a model. NBTree achieved a detection accuracy of 99.99%, higher than that of the related works that adopted other ML techniques in their detection stages. NBTree also showed higher scores in terms of recall, precision, TPR, and F-score compared to the related works. The experimental results showed its effectiveness in recognising P2P botnets with a short model-building time of 53.68 s in cross-validation and 0.46 s in percentage split. Last but not least, this technique produced an FPR of zero, which means that no normal behaviour was flagged as an attack. In other words, it can distinguish the behaviours of P2P botnets from normal behaviours without raising false alarms.
Meanwhile, MLP as a DL technique achieved a detection accuracy of 99.86%, which is also higher than all detection scores in the related works. Moreover, MLP achieved higher scores in terms of recall, precision, TPR, and F-score compared to the related works. However, under cross-validation this technique took longer to build a model than NBTree: the time taken to build a model using MLP was 269.43 s in cross-validation and 0.37 s in percentage split. According to [52], it is reasonable for DL techniques to take longer to train than ML techniques under exactly the same experimental circumstances.
In general, the proposed approach using NBTree and MLP achieved higher detection accuracy than the related works while using only the static indicators (the five-tuple). The five-tuple represents five features, which is the fewest features used among the IDSs that have been proposed to detect P2P botnets. Initially, there were 30 features in the dataset, and after our analyses, we selected only 5 features to detect the P2P botnet. In relative terms, we used only about 16.7% of the original features (5 of 30) and still achieved very high detection accuracy. In terms of feature count, this is a reduction of around 83%, with a corresponding saving in the time and resources normally consumed.
Achieving the highest detection accuracy using the fewest features is advantageous for several reasons: (i) simplicity, since a smaller set of features makes the operation easier to understand and interpret and leads to faster training [56]; (ii) efficiency, since using fewer features may reduce the computational resources required to train the model, which makes the detection process more efficient [56]; and (iii) cost reduction, since collecting and preprocessing data for feature extraction can be resource-intensive, and using fewer features may reduce the associated cost. Taken together, the proposed approach showed its effectiveness and efficiency compared to the existing detection systems, as discussed above.
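As a quick check of the proportions quoted above, the arithmetic below uses only the stated counts of 30 original features and 5 selected ones. Reading the feature reduction as a saving in time and resources assumes that cost scales roughly with the number of features, which the paper does not quantify.

```latex
\[
\frac{5}{30} \approx 0.167 \quad (\text{about } 16.7\% \text{ of the features retained}),
\qquad
1 - \frac{5}{30} = \frac{25}{30} \approx 0.833 \quad (\text{about } 83.3\% \text{ removed}).
\]
```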
Table 7 tabulates the experimental results for the proposed approach using two different testing approaches to evaluate NBTree and MLP as classifiers to detect P2P botnets.
It was challenging to conduct a fair comparison between the existing IDSs that have been developed to detect botnets and our proposed approach, for several reasons: (i) each approach/solution has been evaluated in a different environment [61], (ii) many different bot binaries are employed in the different experiments [61], and (iii) it is not trivial to obtain and execute the code for each solution [25]. Therefore, we undertook a traffic analysis to explore a new set of IOCs and then compared the performance of our detection approach to that found in the related work using the standard evaluation metrics. Table 8 compares the proposed approach to the related works in terms of accuracy, recall, precision, FPR, TPR, and F-score using ML techniques. Note that the comparison is restricted to related works that proposed detection models/approaches/solutions specifically for P2P botnets using either ML or DL techniques. Table 8 shows that the proposed approach outperforms the ML-based related works in terms of the standard evaluation metrics, especially the detection accuracy. Table 9, in turn, compares the proposed approach to the related works in terms of accuracy, recall, precision, FPR, TPR, and F-score using DL techniques.

R | The ratio of correctly classified attack incidents to the number of real attacks | $R = \frac{TP}{TP+FN}$
Once again, Table 9 shows that the proposed approach outperforms the DL-based related works in terms of the standard evaluation metrics. Figures 10 and 11 show the detection accuracy of the proposed approach compared to the related works based on ML and DL techniques, respectively. The experimental results showed that the five-tuple features (static indicators) are enough to accurately detect P2P botnets using NBTree or MLP. In the abovementioned comparison, the advantage in detection accuracy might appear slight, but considering the number of features used, the proposed approach outperforms the related works. Taken together, the proposed approach achieved the highest detection accuracy compared to the related works while using the fewest features (the five-tuple static indicators). This performance reflects the effectiveness and efficiency of the proposed approach in detecting P2P botnets, showing that this approach is promising enough to depend and build on in future work.

Fig. 10 The detection accuracy of the proposed approach (NBTree) compared to ML-based related works

Conclusion and future work
In this paper, we proposed two methods to detect P2P botnets. First, we analysed the traffic flow to develop a new set of behavioural characteristics as IOCs (or signs) of P2P botnets in two directions: flow-based indicators and indicators of deviations from standard protocol behaviour. Second, we proposed a new approach to detect P2P botnets using only static indicators (the five-tuple), with two ML/DL techniques as classifiers.
The experimental results showed that these two methods are efficient security countermeasures for recognising and detecting P2P botnets, and that they can serve as a solid foundation for future research. Potential extensions of this study include the integration of dynamic analysis, i.e. incorporating dynamic indicator analysis techniques alongside the static indicators to create a hybrid detection approach. A further extension is enhanced feature engineering, i.e. investigating more sophisticated feature selection or feature ranking techniques to identify the most relevant indicators for ML/DL techniques. Finally, we encourage future researchers to develop real-time detection and response, in other words, to optimise detection systems/approaches/models/solutions for real-time operation so that they allow an immediate response to emerging threats.

Fig. 1
Fig. 1 Centralised botnet vs. P2P botnets

Table 1 (continued)
Reference | Technique | ML/DL | Contributions
... | ... | ... | • ... of a beneficial botnet as an anti-botnet measure • Use of the beneficial botnet to detect P2P communication by malware
[27] | Deep neural network | DL | • Proposing a new deep neural network-based approach to detect the P2P botnet using the minimum number of features compared to other related works
[28] | Random forest, KNN, Naïve Bayes, SVM, decision tree | ML | • Proposing a Hadoop-based P2P botnet detection system to detect P2P botnets in a local area network (LAN) • This paper introduced some indicators of compromise, such as the count of unique destination hosts connected, the total amount of data transferred from the source host, the average TTL value of the packets transferred from the source host, and the count of unique destination ports connected
[29] | SVM, K-means, decision tree, logistic regression | ML | • The authors experimentally examined several feature selection algorithms to identify the most significant set of features • The authors applied four machine learning algorithms to detect the P2P botnet
[30] | ResNet convolutional neural network | DL | • Proposing a ResNet CNN-based model to detect the P2P botnet by extracting important features from the traffic data. The idea of ResNet is to integrate local connection and weight sharing in order to solve the problem of gradient explosion

Fig. 2
Fig. 2 Data construction process of the selected dataset

Fig. 5
Fig. 5 Positions of deviations from the standard behaviours of protocols in the network layers

Table 6
Metric | Description | Equation
P | The percentage of attack incidents correctly classified relative to the total number of incidents classified as attacks | $P = \frac{TP}{TP+FP}$
FPR | The relative weakness of the proposed approach, i.e. the proportion of normal traffic misclassified as attacks | $FPR = \frac{FP}{TN+FP}$
TPR | The percentage of attack traffic that is correctly classified as attack traffic (same formula as recall, R) | $TPR = \frac{TP}{TP+FN}$
FS | A combined measure of precision and recall | $F\text{-score} = 2 \times \frac{Precision \times Recall}{Precision + Recall}$

Table 1
Summary of related works

Table 3
Description of the selected dataset

Table 4
Hardware specifications

Table 5
Software specifications

Table 7
The evaluation metrics of the proposed approach using two testing approaches: cross-validation and percentage splitting

Table 8
Comparison between the proposed approach and the related works in terms of accuracy, FPR, precision, recall, and F-score using ML techniques

Table 9
Comparison between the proposed approach and the related works in terms of accuracy, FPR, precision, recall, and F-score using DL techniques