Analysis of K-Means and K-Medoid Clustering Algorithm on Various Datasets in Data Mining

Jaskaranjit Kaur, Gurpreet Kaur

Abstract


With the development of information technology and computer science, large volumes of data have become part of everyday life. Data mining technology has therefore become significant as a means of helping people analyze data and extract useful information. Clustering is one of the most widely used data mining methods and can be used to describe and analyze data. It groups data objects (data points) into disjoint clusters. The basic idea is that data objects in the same cluster should be similar to each other, while data objects in different clusters should differ from each other. This research article analyzes the performance of the K-Means and K-Medoid algorithms on two different datasets. The algorithms are applied to a cars dataset and a banking dataset, and the results are compared on the basis of the within-cluster sum of squared errors.
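The comparison criterion named above can be illustrated with a minimal sketch. It assumes Euclidean distance and uses a small hand-made dataset with fixed cluster labels; the data, the function names, and the labels are all hypothetical and are not taken from the article, which reports results obtained with the WEKA tool. The sketch only shows how the within-cluster sum of squared errors changes when a cluster is represented by its mean (as in K-Means) or by its medoid (as in K-Medoid).

```python
from math import dist


def within_cluster_sse(points, labels, centers):
    """Sum of squared Euclidean distances from each point to its cluster's center."""
    return sum(dist(p, centers[k]) ** 2 for p, k in zip(points, labels))


def mean_center(cluster):
    """Centroid (coordinate-wise mean), the representative used by K-Means."""
    n = len(cluster)
    return tuple(sum(coord) / n for coord in zip(*cluster))


def medoid_center(cluster):
    """Medoid: the actual data point with the smallest total distance to the rest."""
    return min(cluster, key=lambda m: sum(dist(m, p) for p in cluster))


# Toy data (hypothetical, not from the article): two clusters, labels fixed by hand.
points = [(1.0, 1.0), (2.0, 1.0), (9.0, 1.0), (20.0, 5.0), (22.0, 5.0)]
labels = [0, 0, 0, 1, 1]

# Group the points by cluster label.
clusters = {}
for p, k in zip(points, labels):
    clusters.setdefault(k, []).append(p)

centroids = {k: mean_center(c) for k, c in clusters.items()}
medoids = {k: medoid_center(c) for k, c in clusters.items()}

sse_mean = within_cluster_sse(points, labels, centroids)
sse_medoid = within_cluster_sse(points, labels, medoids)
print(f"SSE with centroids: {sse_mean:.1f}")
print(f"SSE with medoids:   {sse_medoid:.1f}")
```

For a fixed assignment of points to clusters, the mean minimizes the squared error, so the medoid-based value can never be lower. In a real comparison the two algorithms may also assign points differently, so the values reported by a tool need not follow this toy pattern.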


Keywords


K-Means, K-Medoid, WEKA tool



