Discipline: Computer Sciences and Information Management
Subcategory: Civil/Mechanical/Manufacturing Engineering
Session: 2
Room: Exhibit Hall A
Ciera Oliver-Ratcliff - Livingstone College, Department of Mathematics
Co-Author(s): Timquan Stover, Livingstone College, Salisbury, North Carolina
Residents living near military installations have complained about noise from the military auxiliary in the area, reporting blast noises periodically throughout the day. The purpose of this research is to analyze acoustic data using tree-based methods to determine whether a sound from a military auxiliary is a blast or a non-blast. Tree-based methods are popular algorithms that give predictive models high accuracy, stability, and ease of interpretation. They include bagging, random forest, and boosting. Bagging, also known as bootstrap aggregation, generates multiple versions of a predictor from bootstrap samples and combines them; this reduces the variance and hence increases the prediction accuracy of a statistical learning method. Random forest improves accuracy by reducing the correlation between trees while maintaining their strength. Boosting improves predictive performance by combining weak learners into a strong learner. We used the R statistical language to investigate the prediction accuracy of the three tree-based algorithms on acoustic data. The bagging algorithm provides the out-of-bag (OOB) error, which is a valid estimate of the test error for the bagged model. The random forest algorithm uses the Gini index for feature selection and to measure variable importance. For boosting we used the AdaBoost algorithm, which addresses the shortcomings of earlier weak predictors by reweighting the training data and taking a weighted average of the weak predictors. Our objective was to find the highest prediction accuracy using tree-based algorithms with acoustic data. Our data set included 12,640 observations from military installations and 38 predictors. We first implemented the bagging algorithm. Its OOB estimate on the training data set gave a prediction accuracy of 97.64%, and its accuracy on the testing data set was 97.21%.
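The bagging and out-of-bag (OOB) idea described above can be illustrated with a minimal sketch. It is written in Python rather than the R used in the study, it uses synthetic two-feature data instead of the acoustic data set, and it bags one-split decision stumps rather than full trees. The function names (`fit_stump`, `predict_stump`, `bagging_oob`) and all parameter values are assumptions made for illustration only.

```python
import random
from collections import Counter

def fit_stump(X, y):
    """Fit a one-split tree: return (feature, threshold, left_label, right_label)."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            left = [yi for row, yi in zip(X, y) if row[j] <= t]
            right = [yi for row, yi in zip(X, y) if row[j] > t]
            l_lab = Counter(left).most_common(1)[0][0]
            r_lab = Counter(right).most_common(1)[0][0] if right else l_lab
            errors = sum(yi != l_lab for yi in left) + sum(yi != r_lab for yi in right)
            if best is None or errors < best[0]:
                best = (errors, j, t, l_lab, r_lab)
    return best[1:]

def predict_stump(stump, row):
    j, t, l_lab, r_lab = stump
    return l_lab if row[j] <= t else r_lab

def bagging_oob(X, y, n_trees=25, seed=0):
    """Bag stumps on bootstrap samples and estimate error from out-of-bag votes."""
    rng = random.Random(seed)
    n = len(X)
    votes = [Counter() for _ in range(n)]
    stumps = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # bootstrap sample (with replacement)
        in_bag = set(idx)
        stump = fit_stump([X[i] for i in idx], [y[i] for i in idx])
        stumps.append(stump)
        for i in range(n):
            if i not in in_bag:  # only stumps that never saw observation i vote on it
                votes[i][predict_stump(stump, X[i])] += 1
    scored = [(v.most_common(1)[0][0], y[i]) for i, v in enumerate(votes) if v]
    oob_error = sum(pred != truth for pred, truth in scored) / len(scored)
    return stumps, oob_error

# Synthetic stand-in data: feature 0 separates the classes, feature 1 is noise.
data_rng = random.Random(42)
X = [[data_rng.gauss(2.0 if i % 2 else -2.0, 1.0), data_rng.gauss(0.0, 1.0)]
     for i in range(200)]
y = ["blast" if i % 2 else "non-blast" for i in range(200)]

stumps, oob_error = bagging_oob(X, y)
print(f"OOB error estimate: {oob_error:.3f}")
```

Because each observation is scored only by predictors whose bootstrap sample excluded it, the OOB error behaves like a test-set estimate without a separate hold-out split.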
The random forest fitted on the training data set used 500 trees and considered 6 features at each node. It achieved an accuracy of 97.75% on the training data and 98% on the testing data. For boosting, we report the relative importance of the variables rather than the prediction accuracy. Boosting identified aw, a8.1, aj, a5, and a3 as the variables of greatest relative importance. In conclusion, among the three tree-based methods (bagging, random forest, and boosting), random forest achieved the highest prediction accuracy in determining whether a noise was a blast or not.
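As a rough illustration of two random forest ingredients mentioned in the abstract, the Gini index and the random choice of 6 candidate features out of 38 at each node, the following Python sketch may help. It is not the authors' R code; the helper names (`gini`, `gini_decrease`, `candidate_features`) and the toy labels are hypothetical.

```python
import random

def gini(labels):
    """Gini index of a node: 1 minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

def gini_decrease(parent, left, right):
    """Impurity decrease from splitting `parent` into `left` and `right`."""
    n = len(parent)
    return gini(parent) - (len(left) / n) * gini(left) - (len(right) / n) * gini(right)

def candidate_features(n_features, mtry, rng):
    """Pick the random subset of features considered at one node."""
    return sorted(rng.sample(range(n_features), mtry))

# Toy node with an even class mix (illustrative labels only).
node = ["blast"] * 5 + ["non-blast"] * 5
node_gini = gini(node)                              # 0.5 for a 50/50 node
perfect = gini_decrease(node, node[:5], node[5:])   # 0.5: both children are pure

# 6 candidate features drawn from 38 predictors, matching the numbers in the abstract.
features = candidate_features(38, 6, random.Random(0))
print(node_gini, perfect, features)
```

Summing the Gini decrease contributed by each feature over all splits in all trees is one common way a variable importance ranking is obtained; restricting each node to a random feature subset is what reduces the correlation between trees.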
Funder Acknowledgement(s): National Science Foundation Support Group at Livingstone College
Faculty Advisor: Nicole Allen, nallen@livingstone.edu
Role: For this research, I was responsible for learning the machine learning concepts involved. I gained a better understanding of tree-based methods by reading An Introduction to Statistical Learning and identified each of the categories of tree-based methods (bagging, random forest, and boosting). I used the bagging, random forest, and boosting algorithms to determine whether the acoustic data represented a blast or a non-blast. Over 12,000 observations were obtained from the military installations, although I used only 38 predictors in this research. I then coded the analysis of the data set in the R statistical language. After obtaining the results, I was able to determine which of the three tree-based methods had the highest prediction accuracy.