Discipline: Computer Sciences & Information Management
Subcategory: STEM Research
- Prairie View A&M University
Co-Author(s): William Lim, Travon Johnson, De'Ahna Johnson
The ever-increasing size of data sets and the growing complexity of data analytics algorithms are stretching the memory and computational limits of conventional von Neumann-based computers. It is time to explore other architectures, such as data flow systems, to overcome these limitations. Our research explores the feasibility of Fresh Breeze, a data flow architecture, for big data analytics.
In our research, we looked into the issues of porting deep neural network programs to funJava, a functional subset of Java. We use a minimal two-layer network to explore these issues and to identify opportunities for extending the funJava language and improving the funJava compiler so that both better support the development of deep neural networks. We reported preliminary results of this work in a paper at ParCo2017. Using synthetic data for our development work, we found that with a Fresh Breeze implementation the performance of the computational bottleneck within a layer, the matrix multiplication, scales linearly as the number of cores increases. Two properties of Fresh Breeze make this possible: its ability to decompose a computation into parallel data-driven tasks, and its load balancing, which ensures that tasks that are ready for execution are quickly and evenly distributed to all available cores. We made similar observations in performance studies of linear algebra computations and of multi-program computations, where more than one funJava program runs at the same time on the Fresh Breeze system. Fresh Breeze has no knowledge of what a task does, nor does it need any. The tasks can come from the same computation (such as a matrix multiplication), from different computations (say, one from machine learning and another from computational biology, running at the same time), or from the different stages, iterations, or layers of a complex computation.
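To make the decomposition idea concrete, the following is a minimal sketch in plain Java, not funJava and not the Fresh Breeze runtime. It assumes that each row of the product can be treated as an independent task that only reads its inputs and returns a fresh array, and it uses a standard Java parallel stream as a stand-in for a scheduler that distributes ready tasks across cores. The class and method names (`ParallelMatMul`, `rowTimesMatrix`, `multiply`) are hypothetical and chosen for illustration only.

```java
import java.util.Arrays;
import java.util.stream.IntStream;

// Illustrative sketch only: plain Java written in a side-effect-free style.
// It is not funJava and does not use any Fresh Breeze API.
final class ParallelMatMul {

    // One independent task: compute a single row of the product.
    // It reads `row` and `b` and returns a new array; nothing shared is written.
    static double[] rowTimesMatrix(double[] row, double[][] b) {
        int cols = b[0].length;
        return IntStream.range(0, cols)
                .mapToDouble(j -> IntStream.range(0, row.length)
                        .mapToDouble(k -> row[k] * b[k][j])
                        .sum())
                .toArray();
    }

    // The product is the collection of all row tasks. Because the tasks do not
    // depend on one another, the runtime may run them on any available core.
    static double[][] multiply(double[][] a, double[][] b) {
        return IntStream.range(0, a.length)
                .parallel()
                .mapToObj(i -> rowTimesMatrix(a[i], b))
                .toArray(double[][]::new);
    }

    public static void main(String[] args) {
        double[][] a = {{1, 2}, {3, 4}};
        double[][] b = {{5, 6}, {7, 8}};
        System.out.println(Arrays.deepToString(multiply(a, b)));
    }
}
```

Because each row task is a pure function of its inputs, the scheduler needs no knowledge of what the task computes, which mirrors the property described above. How Fresh Breeze actually partitions and schedules a matrix multiplication is not specified here and may differ from this row-wise split.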
The research also provides opportunities to introduce PVAMU (an HBCU) students to deep learning and data flow computers and to train them in both. One student project is looking into the issues of implementing 2-D convolutions in funJava. Another project is exploring the performance of funJava versions of four machine learning algorithms: K-Nearest Neighbors, naive Bayes & Bayesian Belief Networks, Logistic Regression, and Linear Regression. A summer student project was undertaken to develop funJava versions of support libraries, such as StrictMath and JAMA.
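As an illustration of the kind of kernel the 2-D convolution project deals with, here is a minimal sketch in plain Java written without mutable shared state. It is not the students' funJava code. It assumes a "valid" convolution (no padding, stride 1) and, following the usual deep learning convention, does not flip the kernel. The names `Conv2D`, `windowSum`, and `convolveValid` are hypothetical.

```java
import java.util.stream.IntStream;

// Illustrative sketch only: plain Java in a functional style, not funJava.
final class Conv2D {

    // One output element: the weighted sum of the image window whose
    // top-left corner is at (r, c). Reads its inputs and writes nothing shared.
    static double windowSum(double[][] img, double[][] k, int r, int c) {
        return IntStream.range(0, k.length)
                .mapToDouble(i -> IntStream.range(0, k[0].length)
                        .mapToDouble(j -> img[r + i][c + j] * k[i][j])
                        .sum())
                .sum();
    }

    // "Valid" 2-D convolution without kernel flip (assumption: no padding, stride 1).
    // Every output element is independent, so each could be a separate task.
    static double[][] convolveValid(double[][] img, double[][] k) {
        int outRows = img.length - k.length + 1;
        int outCols = img[0].length - k[0].length + 1;
        return IntStream.range(0, outRows)
                .mapToObj(r -> IntStream.range(0, outCols)
                        .mapToDouble(c -> windowSum(img, k, r, c))
                        .toArray())
                .toArray(double[][]::new);
    }
}
```

For a 3x3 image and a 2x2 kernel, `convolveValid` returns a 2x2 result. Each output element depends only on its own window, so the computation decomposes into independent tasks in the same way as the matrix multiplication.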
Funder Acknowledgement(s): NSF and DoD
Faculty Advisor: None Listed