Discipline: Computer Sciences and Information Management
Subcategory: Computer Science & Information Systems
Session: 4
Room: Virginia B
Cyanea Van Trieu Do - University of Texas at San Antonio
Co-Author(s): Richard Garcia, University of Texas at San Antonio, TX.
Cyber-attack events are mostly surreptitious and challenging to detect. Most hackers use malicious logic to exploit vulnerabilities in cyber networks and gain unauthorized access to computer systems. To design preventive measures and minimize damage from cyber-attacks, it is necessary to find attack patterns and predict attackers’ behavior. The hypothesis is that the time series of cyber-attacks on a sub-network A may correlate with that on a sub-network B, which provides a basis for predicting the behavior of cyber-attacks. The goal is to build a computational model, with features derived from Granger Causality analysis, to compute and predict the number of cyber-attacks on each sub-network as well as their patterns in IPv4 address space. In this presentation, we describe a computational forecasting method to predict cyber-attack data and their behavior over a period of time. A commonly used machine learning model, Long Short-Term Memory (LSTM), is implemented with input features from the Granger Causality analysis in our previous study. The raw data were collected by a honeypot over a period of one and a half years. The number of attacks on each subnet in IP space was recorded daily. Granger Causality analysis is applied to analyze the causal connections between subnets in the time series dataset. From this analysis, three important features (modularity, degree range, and target-source) are identified and then fed into the LSTM to determine whether the model is more efficient than an LSTM model without the three features at predicting dynamic cyber-attack data. As a result, the LSTM model with Granger Causality analysis features shows a significant improvement in predicting the number of attacks in each sub-network, as well as the patterns in the next time steps, compared to the original LSTM without Granger Causality analysis features.
The conclusion is that the ability to predict the next patterns in IP address space can help reduce the number of attacks by applying preventive measures ahead of time, which in turn improves network security. The future direction is to develop an efficient computational framework so that the prediction can be carried out in real time.
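The Granger Causality step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the lag order, the function names (`lagged_design`, `granger_f_stat`), and the synthetic Poisson series standing in for two subnets' daily attack counts are all assumptions made for illustration. The sketch compares a regression of subnet B's counts on its own past against one that also includes subnet A's past, and reports the resulting F-statistic.

```python
import numpy as np

def lagged_design(y, x, lag, include_x):
    """Build the regression matrix for predicting y[t] from past values."""
    n = len(y)
    cols = [np.ones(n - lag)]                                    # intercept
    cols += [y[lag - k : n - k] for k in range(1, lag + 1)]      # y[t-k]
    if include_x:
        cols += [x[lag - k : n - k] for k in range(1, lag + 1)]  # x[t-k]
    return np.column_stack(cols)

def granger_f_stat(x, y, lag=2):
    """F-statistic for 'x Granger-causes y' at the given lag order."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    target = y[lag:]
    rss = []
    for include_x in (False, True):          # restricted model, then full model
        design = lagged_design(y, x, lag, include_x)
        coef, *_ = np.linalg.lstsq(design, target, rcond=None)
        resid = target - design @ coef
        rss.append(float(resid @ resid))
    rss_restricted, rss_full = rss
    dof = len(target) - (2 * lag + 1)        # observations minus full-model parameters
    return ((rss_restricted - rss_full) / lag) / (rss_full / dof)

# Synthetic stand-in data (assumption): subnet B follows subnet A by one day.
rng = np.random.default_rng(0)
a = rng.poisson(20, size=300).astype(float)
b = np.empty(300)
b[0] = 20.0
b[1:] = 0.8 * a[:-1] + rng.normal(0, 1, size=299)

f_ab = granger_f_stat(a, b)   # A -> B: expected to be large
f_ba = granger_f_stat(b, a)   # B -> A: expected to be small
print(f"F(A->B) = {f_ab:.1f}, F(B->A) = {f_ba:.2f}")
```

A p-value would follow from the F distribution with `lag` and `dof` degrees of freedom. Running the test for every ordered pair of subnets and keeping the significant pairs would yield a directed graph of causal connections of the kind the abstract analyzes.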
Funder Acknowledgement(s): NSF/HRD #1736209
Faculty Advisor: Yusheng Feng, yusheng.feng@utsa.edu
Role: My part began with analyzing causal connections between subnets in the time series dataset. The resulting connections were visualized and analyzed with the Gephi visualization tool to identify three features: degree range, modularity, and target-source. These features were then input into the Long Short-Term Memory model to train it and predict the number of attacks in each sub-network in the next time steps.
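The hand-off from graph features to the LSTM can be sketched as follows. This is a hedged illustration, not the workflow used in the study: the edge list, the subnet labels, the window length, and the helper names (`out_in_degree`, `make_windows`) are hypothetical, and simple in-degree and out-degree counts stand in for the Gephi-derived features (degree range, modularity, target-source). The sketch attaches a subnet's graph features to each time step of a sliding window, producing the (samples, time steps, features) layout commonly expected by LSTM layers.

```python
import numpy as np

def out_in_degree(edges, subnets):
    """Count outgoing (source) and incoming (target) causal links per subnet."""
    out_deg = {s: 0 for s in subnets}
    in_deg = {s: 0 for s in subnets}
    for source, target in edges:
        out_deg[source] += 1
        in_deg[target] += 1
    return out_deg, in_deg

def make_windows(counts, static_features, window):
    """Turn one subnet's daily counts into (samples, window, features) inputs."""
    counts = np.asarray(counts, dtype=float)
    static = np.asarray(static_features, dtype=float)
    inputs, targets = [], []
    for start in range(len(counts) - window):
        past = counts[start : start + window, None]   # shape (window, 1)
        tiled = np.tile(static, (window, 1))          # shape (window, n_static)
        inputs.append(np.hstack([past, tiled]))
        targets.append(counts[start + window])        # next-day count to predict
    return np.stack(inputs), np.array(targets)

# Hypothetical causal graph: each pair is (source subnet, target subnet).
edges = [("A", "B"), ("A", "C"), ("C", "B")]
out_deg, in_deg = out_in_degree(edges, ["A", "B", "C"])

# Hypothetical daily attack counts for subnet B.
counts_b = [12, 15, 11, 18, 20, 17, 22, 19]
X, y = make_windows(counts_b, [out_deg["B"], in_deg["B"]], window=3)
print(X.shape, y.shape)
```

Dropping the static columns from `X` gives the baseline input without Granger-derived features, so the two model variants can be trained on otherwise identical windows.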