Discipline: Technology and Engineering
Subcategory: Computer Engineering
Tilaye Alemayehu - University of the District of Columbia
Co-Author(s): Nian Zhang and Sasan Haghani, University of the District of Columbia, Washington, DC
The weighted extreme learning machine (ELM), an effective and efficient machine learning technique, has attracted considerable attention in healthcare, biomedical engineering, and cancer detection and diagnosis. ELM is essentially a least-squares-based learning algorithm that learns to recognize complex patterns within rich, massive data and to make intelligent data-driven decisions. Classification becomes very difficult when the data are unbounded in size and imbalanced in nature. Minority samples are those that occur rarely but are extremely important and incur an overwhelming cost if they are not properly classified. In the last few years, the imbalanced learning problem has become one of the most important problems in data mining and has drawn significant interest from academia, industry, and government funding agencies. The fundamental issue is that imbalanced data can significantly compromise the performance of most standard learning algorithms. An effective imbalanced learning system for highly overlapped, imbalanced classes involving rare diseases could save billions of dollars and human lives. To work with data that have an imbalanced class distribution, we propose a weighted ELM algorithm that adds a weight matrix to the ELM to strengthen the impact of the minority class while weakening the impact of the majority class. First, we analyzed the effect of imbalanced data distribution on classification performance using the unweighted ELM with non-kernel or kernel hidden nodes. Then we demonstrated the effect of adding the weight to the ELM in both binary and multiclass classification problems. Specifically, we interpreted the difference between the unweighted and weighted ELM classifiers in terms of the movement of the separating boundary between the minority class and the majority class.
A weighting scheme is developed to determine the weight matrix, which sets the re-balance ratio between the minority class and the majority class and thus how far the boundary is pushed towards the majority class. The Gaussian kernel is used as the feature mapping function. The experimental results showed that with the weighted ELM the separating boundary moves towards the majority class. As a result, more minority data are correctly classified, at the cost of slightly more majority data being misclassified. When the classes are balanced, the performance difference between the unweighted ELM and the weighted ELM is slight. This demonstrates that the proposed weighted ELM is applicable not only to imbalanced datasets but also to balanced ones.
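The idea of a weight matrix combined with a Gaussian kernel can be illustrated with a minimal sketch. The abstract does not specify the weighting scheme or any parameter values, so everything below is an assumption made for illustration: each sample is weighted by the inverse of its class size, the output weights follow the commonly used closed form alpha = (I/C + W Omega)^(-1) W T, and the names `gaussian_kernel`, `fit_kernel_elm`, `predict`, `C`, and `gamma` are hypothetical.

```python
import numpy as np

def gaussian_kernel(A, B, gamma=1.0):
    """Gaussian kernel matrix exp(-gamma * ||a - b||^2) for all row pairs."""
    sq = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def fit_kernel_elm(X, y, C=10.0, gamma=1.0, weighted=True):
    """Fit a kernel ELM; C and gamma are illustrative values, not the authors'."""
    classes, inv, counts = np.unique(y, return_inverse=True, return_counts=True)
    n = len(y)
    # One-vs-all targets: +1 in the column of the true class, -1 elsewhere.
    T = np.where(inv[:, None] == np.arange(len(classes))[None, :], 1.0, -1.0)
    # Assumed weighting scheme: w_i = 1 / (size of the class of sample i),
    # so minority samples receive larger weights. Unweighted ELM uses w_i = 1.
    w = 1.0 / counts[inv] if weighted else np.ones(n)
    omega = gaussian_kernel(X, X, gamma)
    # Solve (I/C + W * Omega) alpha = W * T, with W = diag(w).
    A = np.eye(n) / C + w[:, None] * omega
    alpha = np.linalg.solve(A, w[:, None] * T)
    return {"X": X, "alpha": alpha, "classes": classes, "gamma": gamma}

def predict(model, X_new):
    """Assign each row of X_new to the class with the largest output."""
    K = gaussian_kernel(X_new, model["X"], model["gamma"])
    return model["classes"][np.argmax(K @ model["alpha"], axis=1)]
```

Calling `fit_kernel_elm(X, y, weighted=False)` gives the unweighted baseline, so the two classifiers can be compared on the same data by inspecting where their predictions differ near the class boundary.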
Funder Acknowledgement(s): This work was supported in part by the U.S. National Science Foundation under Grants HRD #1505509, HRD #1435947 and HRD #0928444, and a USGS Grant under DC WRRI Grant #2015DC174B.
Faculty Advisor: Nian Zhang