Theory

K-means clustering: Anomaly Detection

Assume that we have the prices of five different cryptocurrencies, none of which can be less than $0 or greater than $20,000. The prices were recorded and stored every two hours over a day. However, some values were corrupted when they were stored. Use the K-means clustering method to group the prices into 6 clusters and detect the corrupted prices (anomaly detection).

Cryptocurrency prices

$7845, $778, $942, $143, $0.75, $7956, $810, $976, $146, $0.76, $8215, $825, $1002, $152, $0.78, $8542, $847, $1038, $157, $0.78, $8150, $100587, $807, $1015, $150, $0.72, $8386, $884, $101964, $1085, $138, $0.82, $8219, $827, $995, $158, $0.82, $7500, $745, $948, $135, $0.67, $9257, $901, $120967, $1154, $148, $0.72, $8553, $811, $1218, $175, $0.84

Calculation and Results

Apply the K-means clustering algorithm to identify the clusters.
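For reference, the standard K-means formulation can be written as below. This is a sketch of the textbook algorithm, which the exercise is assumed to follow. The symbols are notation introduced here, not taken from the exercise: x_i is a price, C_j is cluster j, mu_j is its centroid, and k is the number of clusters (6 in this exercise).

```latex
% Objective: total squared distance of every price to its cluster centroid
J = \sum_{j=1}^{k} \sum_{x_i \in C_j} (x_i - \mu_j)^2

% Assignment step: each price joins the cluster with the nearest centroid
C_j = \{\, x_i : |x_i - \mu_j| \le |x_i - \mu_l| \ \text{for all } l = 1, \dots, k \,\}

% Update step: each centroid moves to the mean of the prices assigned to it
\mu_j = \frac{1}{|C_j|} \sum_{x_i \in C_j} x_i
```

The two steps alternate until the assignments no longer change.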

Press the 'Step manually' button to perform the clustering one iteration at a time. Press 'Converge' to compute the final clusters directly (all iterations are performed automatically). The result is shown in the plot.
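The two modes can be illustrated with a minimal pure-Python sketch. It is not the code behind the buttons: the function names `step` and `converge`, the choice of the first 6 prices as initial centroids, and the rule "a cluster is anomalous if its centroid falls outside the $0 to $20,000 range" are all assumptions made for this illustration.

```python
# Prices from the list above, in stored order (dollar signs dropped).
PRICES = [
    7845, 778, 942, 143, 0.75, 7956, 810, 976, 146, 0.76,
    8215, 825, 1002, 152, 0.78, 8542, 847, 1038, 157, 0.78,
    8150, 100587, 807, 1015, 150, 0.72, 8386, 884, 101964, 1085, 138, 0.82,
    8219, 827, 995, 158, 0.82, 7500, 745, 948, 135, 0.67,
    9257, 901, 120967, 1154, 148, 0.72, 8553, 811, 1218, 175, 0.84,
]

def step(points, centroids):
    """One K-means iteration: assign each point, then recompute centroids."""
    labels = [min(range(len(centroids)), key=lambda j: abs(p - centroids[j]))
              for p in points]
    new_centroids = []
    for j, old in enumerate(centroids):
        members = [p for p, lab in zip(points, labels) if lab == j]
        # An empty cluster keeps its previous centroid (assumption of this sketch).
        new_centroids.append(sum(members) / len(members) if members else old)
    return new_centroids, labels

def converge(points, centroids, max_iter=100):
    """Repeat step() until the centroids stop moving (or max_iter is reached)."""
    labels = []
    for _ in range(max_iter):
        new_centroids, labels = step(points, centroids)
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, labels

K = 6
LOW, HIGH = 0, 20000  # valid price range from the problem statement

# Assumed initialization: the first K prices in the list.
centroids, labels = converge(PRICES, PRICES[:K])

# Assumed anomaly rule: flag every cluster whose centroid is out of range.
outside = {j for j, c in enumerate(centroids) if not LOW <= c <= HIGH}
corrupted = [p for p, lab in zip(PRICES, labels) if lab in outside]
print(corrupted)  # → [100587, 101964, 120967]
```

With this initialization, five clusters settle on the five price levels and the sixth collects the three out-of-range values. A different initialization may converge to a different grouping, so the result should be checked against the plot.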


How should the centroids be placed initially?

Use random, unique data points
Use the first n data points (where n is the number of clusters)
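The two options above might be implemented roughly as follows. This is a sketch only: the function names `init_random_unique` and `init_first_n` and the `seed` parameter are hypothetical, and the sample holds just the first 20 prices from the list to keep the example short.

```python
import random

def init_random_unique(points, k, seed=None):
    """Pick k distinct values at random from the data as initial centroids."""
    distinct = list(dict.fromkeys(points))  # drop duplicates, keep order
    if k > len(distinct):
        raise ValueError("not enough distinct data points for k centroids")
    return random.Random(seed).sample(distinct, k)

def init_first_n(points, k):
    """Use the first k data points, in stored order, as initial centroids."""
    return list(points[:k])

# First 20 prices from the list above (note that 0.78 appears twice).
sample = [7845, 778, 942, 143, 0.75, 7956, 810, 976, 146, 0.76,
          8215, 825, 1002, 152, 0.78, 8542, 847, 1038, 157, 0.78]

print(init_first_n(sample, 6))        # → [7845, 778, 942, 143, 0.75, 7956]
print(init_random_unique(sample, 6))  # six distinct prices, varies per run
```

Removing duplicates before sampling matters here because repeated prices would otherwise allow two centroids to start at the same value, leaving one of them with an empty cluster.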

Results and Plot information