[Readings]
[K-Means] (50pt)
Implement the K-Means algorithm as described in Chapter 7.4.1 of the textbook. You may use any programming language of your choice, as long as your program can be compiled and run on CS3.
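As a rough orientation only, the usual assign-then-update loop of K-Means can be sketched as below. This is a minimal sketch and not the textbook's exact formulation: the class name `KMeansSketch`, the method names, the `maxIterations` and `seed` parameters, the choice of k distinct random records as initial centroids, and the handling of empty clusters are all assumptions made for illustration. Check Chapter 7.4.1 for the initialization and stopping rule you are expected to use.

```java
import java.util.Arrays;
import java.util.Random;

class KMeansSketch {
    /** Squared Euclidean distance between two points of equal dimension. */
    static double squaredDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int j = 0; j < a.length; j++) {
            double d = a[j] - b[j];
            sum += d * d;
        }
        return sum;
    }

    /** Index of the centroid closest to the point (first one wins on ties). */
    static int nearest(double[] point, double[][] centroids) {
        int best = 0;
        double bestDist = Double.POSITIVE_INFINITY;
        for (int c = 0; c < centroids.length; c++) {
            double dist = squaredDistance(point, centroids[c]);
            if (dist < bestDist) {
                bestDist = dist;
                best = c;
            }
        }
        return best;
    }

    /** Runs K-Means and returns the cluster index of each record. Requires k <= data.length. */
    static int[] cluster(double[][] data, int k, int maxIterations, long seed) {
        int n = data.length;
        int dim = data[0].length;
        Random rng = new Random(seed);

        // Assumption: initial centroids are copies of k distinct, randomly chosen records.
        int[] order = new int[n];
        for (int i = 0; i < n; i++) order[i] = i;
        double[][] centroids = new double[k][];
        for (int c = 0; c < k; c++) {
            int swap = c + rng.nextInt(n - c);
            int tmp = order[c];
            order[c] = order[swap];
            order[swap] = tmp;
            centroids[c] = data[order[c]].clone();
        }

        int[] assignment = new int[n];
        Arrays.fill(assignment, -1);
        for (int iter = 0; iter < maxIterations; iter++) {
            // Assignment step: attach every record to its nearest centroid.
            boolean changed = false;
            for (int i = 0; i < n; i++) {
                int c = nearest(data[i], centroids);
                if (c != assignment[i]) {
                    assignment[i] = c;
                    changed = true;
                }
            }
            if (!changed) break; // no record moved, so the clustering is stable

            // Update step: move each centroid to the mean of its records.
            double[][] sums = new double[k][dim];
            int[] counts = new int[k];
            for (int i = 0; i < n; i++) {
                counts[assignment[i]]++;
                for (int j = 0; j < dim; j++) sums[assignment[i]][j] += data[i][j];
            }
            for (int c = 0; c < k; c++) {
                if (counts[c] == 0) continue; // assumption: an empty cluster keeps its old centroid
                for (int j = 0; j < dim; j++) centroids[c][j] = sums[c][j] / counts[c];
            }
        }
        return assignment;
    }
}
```

Fixing the seed makes runs repeatable, which helps when comparing results during development. Different seeds can give different clusterings.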
Use your K-Means implementation to cluster the Forest CoverType dataset, using only the ten quantitative attributes. Your program should take the input data file as a command line parameter, as shown below (I'm using Java for examples, but as stated earlier, you may use other programming languages):

    java KMeans <dataset>
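Reading the file named on the command line and keeping only the quantitative attributes might look like the sketch below. It assumes the file is comma-separated with the ten quantitative attributes in the first ten columns and the class label in the last column; verify this against the dataset's documentation before relying on it. The class name `CoverTypeLoader` and its method names are hypothetical, and the final print statement is there only to make the sketch observable.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

class CoverTypeLoader {
    // Assumption: the quantitative attributes occupy the first ten columns.
    static final int NUM_QUANTITATIVE = 10;

    /** Extracts the first ten comma-separated fields of a record as doubles. */
    static double[] parseFeatures(String line) {
        String[] fields = line.split(",");
        double[] features = new double[NUM_QUANTITATIVE];
        for (int j = 0; j < NUM_QUANTITATIVE; j++) {
            features[j] = Double.parseDouble(fields[j].trim());
        }
        return features;
    }

    /** Extracts the class label, assumed to be the last field of the record. */
    static int parseLabel(String line) {
        String[] fields = line.split(",");
        return Integer.parseInt(fields[fields.length - 1].trim());
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("Usage: java CoverTypeLoader <dataset>");
            System.exit(1);
        }
        List<double[]> features = new ArrayList<>();
        List<Integer> labels = new ArrayList<>();
        try (BufferedReader in = Files.newBufferedReader(Paths.get(args[0]))) {
            String line;
            while ((line = in.readLine()) != null) {
                if (line.trim().isEmpty()) continue; // skip blank lines
                features.add(parseFeatures(line));
                labels.add(parseLabel(line));
            }
        }
        // For the sketch only; the submitted program should print just the final percentage.
        System.out.println(features.size() + " records loaded");
    }
}
```

The labels are kept separately because they are needed only for evaluating the clusters, not for the clustering itself.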
Since there are seven forest cover types, your program should produce seven clusters. We label each cluster with the class label of the majority class in the cluster, and if a record has the same label as the label of the cluster it belongs to, we consider the record "correctly clustered". Your program should output to the console the percentage of the correctly clustered records, e.g. 50%. Please do not output any debugging information in the submitted version.
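The evaluation described above can be sketched as follows: count the class labels inside each cluster, take the largest count as the number of correctly clustered records for that cluster, and sum over clusters. The class name `ClusterAccuracy` and the method `percentCorrect` are hypothetical, and the sketch assumes cluster indices run from 0 to k - 1.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ClusterAccuracy {
    /**
     * Percentage of records whose class label equals the majority label
     * of the cluster they were assigned to.
     */
    static double percentCorrect(int[] assignment, int[] labels, int k) {
        // counts.get(c) maps a class label to its number of records in cluster c.
        List<Map<Integer, Integer>> counts = new ArrayList<>();
        for (int c = 0; c < k; c++) counts.add(new HashMap<>());
        for (int i = 0; i < assignment.length; i++) {
            counts.get(assignment[i]).merge(labels[i], 1, Integer::sum);
        }

        // The records carrying the majority label are exactly the "correct" ones,
        // so each cluster contributes its largest label count.
        long correct = 0;
        for (Map<Integer, Integer> labelCounts : counts) {
            int majority = 0;
            for (int count : labelCounts.values()) majority = Math.max(majority, count);
            correct += majority;
        }
        return 100.0 * correct / assignment.length;
    }
}
```

If two labels tie for the majority in a cluster, either choice yields the same count, so the percentage is unaffected.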
Note that