Please use this identifier to cite or link to this item:
http://172.22.28.37:8080/xmlui/handle/1/425
Title: | Enhanced SMOTE Algorithm for Classification of Big Data Using Random Forest |
Authors: | Bhagat, Reshma C |
Keywords: | Data Mining; Machine Learning; Map-Reduce Framework; Over-sampling |
Issue Date: | 2015 |
Publisher: | Rajarambapu Institute of Technology, Rajaramnagar |
Abstract: | In the era of big data, applications that generate tremendous amounts of data have become a main focus of attention, owing to the sharp increase in data generation and storage over the last few years. This scenario is challenging for data mining techniques that are not adapted to the new space and time requirements. In many real-world applications, classification of imbalanced datasets is a central concern. Most classification methods focus on the two-class imbalance problem, so it is necessary to address the multi-class imbalance problem, which exists in real-world domains. In the proposed work, a methodology for classification of multi-class imbalanced data is introduced. This methodology consists of two steps. In the first step, binarization techniques (one-vs-all, OVA, and one-vs-one, OVO) are used to decompose the original dataset into subsets of binary classes. In the second step, the SMOTE algorithm is applied to each imbalanced binary subset to obtain balanced data. Finally, a Random Forest (RF) classifier is used to perform the classification. Specifically, the oversampling technique is adapted to big data using MapReduce so that it can handle datasets as large as needed. An experimental study is carried out on 8 experimental cases to evaluate the performance of the proposed method. For the experimental analysis, datasets from the UCI repository with different imbalance ratios (IR) are used. The proposed system is implemented on the Apache Hadoop and Apache Spark platforms. The results obtained show that the proposed method outperforms existing methods. |
Description: | Under the Guidance of Prof. S. S. Patil |
URI: | http://localhost:8080/xmlui/handle/1/425 |
Appears in Collections: | M.Tech Computer Science & Engineering |
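The two steps named in the abstract (OVA binarization, then SMOTE on each binary subset) can be illustrated with a minimal single-machine sketch. This is not the thesis implementation: the function names `ova_subsets`, `smote`, and `balance_binary` are hypothetical, the code shows basic SMOTE interpolation rather than the enhanced variant, and the MapReduce distribution and the Random Forest classifier are left out.

```python
import math
import random
from collections import Counter


def ova_subsets(X, y):
    """One-vs-all binarization: for each class, relabel as 1 (class) or 0 (rest)."""
    return {c: [1 if label == c else 0 for label in y] for c in sorted(set(y))}


def smote(minority, n_new, k=5, rng=None):
    """Basic SMOTE: build n_new synthetic points, each interpolated between a
    minority sample and one of its k nearest minority neighbours."""
    rng = rng or random.Random(0)
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest neighbours of base among the other minority samples
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        nb = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to nb
        synthetic.append([b + gap * (n - b) for b, n in zip(base, nb)])
    return synthetic


def balance_binary(X, y_bin, k=5, rng=None):
    """Oversample the smaller class of a binary subset until both classes match.
    Returns new lists; the inputs are not modified."""
    counts = Counter(y_bin)
    minority_label = min(counts, key=counts.get)
    deficit = max(counts.values()) - counts[minority_label]
    minority = [x for x, label in zip(X, y_bin) if label == minority_label]
    new_points = smote(minority, deficit, k, rng) if deficit else []
    return X + new_points, y_bin + [minority_label] * len(new_points)


if __name__ == "__main__":
    # Toy three-class data (made up for illustration only).
    X = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1], [0.3, 0.3], [0.1, 0.4], [0.4, 0.2],
         [5.0, 5.0], [5.2, 5.1], [5.1, 4.8],
         [9.0, 0.0], [9.1, 0.3]]
    y = ["a"] * 6 + ["b"] * 3 + ["c"] * 2
    rng = random.Random(42)
    for c, y_bin in ova_subsets(X, y).items():
        Xb, yb = balance_binary(X, y_bin, k=3, rng=rng)
        print(c, dict(Counter(yb)))
```

Each balanced binary subset would then be passed to a classifier such as Random Forest. In a MapReduce setting, the per-subset oversampling is a natural candidate for the map phase, though how the thesis partitions the work is not stated in this record.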
Files in This Item:
File | Description | Size | Format |
---|---|---|---|
Enhanced SMOTE Algorithm for Classification of Big Data Using Random Forest.pdf (Restricted Access) | | 373.39 kB | Adobe PDF |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.