A dataset of unlabelled data needs to have the number of clu…

Written by Anonymous on April 22, 2026 in Uncategorized with no comments.

Questions

A dataset of unlabelled data needs an estimate of the number of clusters that best reflects the underlying characteristics of the data. The dataset consists of 282 data instances, each with 13 attributes. The dataset has been cleaned, has no missing values, and has no obvious outliers. The k-means algorithm was used to test cluster counts from 2 to 12, with the results in the table below. The starting centroids were randomly chosen by the algorithm.

# clusters   # iterations   SSE
2            6              260.73
3            15             227.12
4            10             183.49
5            11             173.30
6            10             150.62
7            12             144.15
8            12             125.74
9            15             121.08
10           15             117.53
11           12             115.09
12           8              113.79

Based on the information in the table, what do you think is the best number of clusters, and why? Is there a way to validate the result and, if so, how might it be done?
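One common way to read a table like this is the elbow heuristic: look at how much the SSE falls each time a cluster is added, and see where the gains level off. The sketch below is only an illustration of that arithmetic, not a definitive answer to the question. The SSE values are copied from the table in the question; the names `sse`, `sse_drops`, and `drops` are hypothetical and chosen for this example.

```python
# SSE values copied from the table in the question, keyed by number of clusters.
sse = {
    2: 260.73, 3: 227.12, 4: 183.49, 5: 173.30, 6: 150.62, 7: 144.15,
    8: 125.74, 9: 121.08, 10: 117.53, 11: 115.09, 12: 113.79,
}


def sse_drops(sse_by_k):
    """Return {k: SSE(k-1) - SSE(k)} for every k that has a predecessor.

    Hypothetical helper for this illustration: a large value means that
    adding the k-th cluster reduced the within-cluster error a lot.
    """
    ks = sorted(sse_by_k)
    return {
        k: round(sse_by_k[prev] - sse_by_k[k], 2)
        for prev, k in zip(ks, ks[1:])
    }


drops = sse_drops(sse)
for k, drop in drops.items():
    print(f"k={k:2d}  SSE reduction vs k-1: {drop:6.2f}")
```

The reductions are uneven rather than smoothly shrinking, which is plausible when each cluster count is run once from random starting centroids. From k=9 onward every additional cluster lowers the SSE by less than 5, so that region is one place a reader might look for an elbow. Repeating each run with several random starts and comparing an internal index such as the silhouette coefficient is one common way to check whichever value is chosen.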
