Discipline: Computer Sciences and Information Management
Subcategory: Computer Science & Information Systems
Kofi Forson - Delaware State University
Co-Author(s): Karla Miletti, John Lidell, and Tomasz G. Smolinski, Delaware State University, Dover, DE
Over the years, various methods have been applied to classify proteins analyzed with laser-induced breakdown spectroscopy (LIBS). These proteins include Bovine Serum Albumin, Osteopontin, Leptin, and Insulin-like Growth Factor II. Classifying these particular proteins can aid in the detection of diseases such as ovarian cancer. We postulate that classificatory decomposition (CD) implemented on the Graphics Processing Unit (GPU) architecture is a fast, effective method for classifying LIBS data into the four protein types. CD combines multi-objective evolutionary algorithms and rough sets: it decomposes the spectra into a small set of additive components through multi-objective optimization and also tests the classificatory aptitude of that decomposition. The goals of CD are a low reconstruction error and high classification accuracy while using few components. To reach these goals, CD relies on Pareto optimality, reducts, and the elitist non-dominated vector evaluated genetic algorithm (end-VEGA). Importantly, several aspects of CD can be parallelized to run faster on a GPU-equipped computer. We propose to parallelize the CD algorithm with OpenACC, a standardized set of high-level compiler directives (pragmas in C/C++) that enable C/C++ and Fortran programmers to use massively parallel coprocessors. Pragmas are statements that give the compiler additional information during the program’s compilation. OpenACC provides a rich directive language for annotating data location, data transfer, and loop or code-block parallelism. In this project, we add pragma statements to the parts of the program that contain for-loops whose iterations do not depend on one another, in order to reduce the program's running time.
Further research could entail applying GPU programming to other components of the program, including data manipulation.
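As a rough illustration of the kind of annotation described above, the following C sketch applies an OpenACC pragma to a loop nest with independent iterations. It is not taken from the project's code: the function name `reconstruction_error`, the row-major array layout, and the choice of mean squared error as the reconstruction measure are all assumptions made for this example.

```c
#include <stddef.h>

/*
 * Hypothetical helper (an assumption, not the original program's code):
 * mean squared error between measured spectra and their reconstruction
 * as weighted sums of additive components.
 *
 *   spectra:    n_spectra x n_points, row-major
 *   weights:    n_spectra x n_comp,   row-major
 *   components: n_comp    x n_points, row-major
 */
double reconstruction_error(const double *spectra, const double *weights,
                            const double *components,
                            int n_spectra, int n_comp, int n_points)
{
    double total = 0.0;

    /* Every (spectrum, point) pair can be processed independently, so the
     * two outer loops are collapsed and distributed across the device.
     * The reduction clause combines the per-iteration squared errors, and
     * copyin moves the read-only inputs to the device. */
    #pragma acc parallel loop collapse(2) reduction(+:total) \
        copyin(spectra[0:n_spectra * n_points], \
               weights[0:n_spectra * n_comp], \
               components[0:n_comp * n_points])
    for (int s = 0; s < n_spectra; ++s) {
        for (int p = 0; p < n_points; ++p) {
            double approx = 0.0;

            /* Short accumulation over components; kept sequential. */
            #pragma acc loop seq
            for (int c = 0; c < n_comp; ++c) {
                approx += weights[s * n_comp + c] * components[c * n_points + p];
            }

            double diff = spectra[s * n_points + p] - approx;
            total += diff * diff;
        }
    }

    return total / ((double)n_spectra * (double)n_points);
}
```

A compiler without OpenACC support ignores the pragmas and runs the loops serially, so the same source can serve as a reference when checking results. With an OpenACC-capable compiler (for example, GCC with `-fopenacc` or the NVIDIA HPC compiler with `-acc`), the loop nest can be offloaded to the GPU.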
Funder Acknowledgement(s): This study was supported by NSF CREST 1242067, NSF EPSCoR 0814251, NIH NCRR 5P20RR016472-12, and NIGMS 8P20GM103446-12.
Faculty Advisor: Tomasz G. Smolinski, kofif7kofi@gmail.com
Role: I participated in the following aspects of this research: Exporting the program; Profiling the program; Parallelizing the program; Compiling the program; Data Analysis; Poster/Abstract Creation.