Jye Sawtell-Rickson
1 min read · Aug 29, 2019

Nicely written article, but not technically sound.

k-means shouldn’t be used for one-dimensional data; it is a multivariate technique, and it’s neither efficient nor optimal for this case. Especially when you’re talking about introductory techniques, why not just bin the data?
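To make the binning suggestion concrete, here is a minimal sketch, assuming equal-frequency (quantile) bins and three segments. The function name `quantile_bins` and the sample scores are made up for illustration and do not come from the article.

```python
from bisect import bisect_right
from statistics import quantiles


def quantile_bins(values, n_bins=3):
    """Assign each value to one of n_bins equal-frequency bins (0 = lowest)."""
    # quantiles() returns the n_bins - 1 cut points that separate the bins.
    cuts = quantiles(values, n=n_bins, method="inclusive")
    # bisect_right counts how many cut points are <= v, which is the bin index.
    return [bisect_right(cuts, v) for v in values]


# Hypothetical one-dimensional scores, for illustration only.
scores = [1, 2, 3, 10, 11, 12, 50, 55, 60]
labels = quantile_bins(scores)
```

With these sample scores, `labels` is `[0, 0, 0, 1, 1, 1, 2, 2, 2]`. Equal-width bins would work just as well as a first pass; quantiles are used here only so that each segment ends up with a similar number of members.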

If more accuracy is required, a technique like Jenks Natural Breaks Optimisation can be used. It picks the break points that minimise the variance within each class.
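For the natural-breaks idea, the following is a rough sketch of the dynamic-programming formulation (often called Fisher-Jenks), which finds the exact optimum for sorted one-dimensional data. The function name `jenks_breaks` and the sample values are assumptions for illustration; this simple version takes O(k·n²) time, so a dedicated library would be the better choice for large inputs.

```python
def jenks_breaks(values, n_classes):
    """Return the upper bound of each class, minimising within-class squared deviation."""
    data = sorted(values)
    n = len(data)
    if not 1 <= n_classes <= n:
        raise ValueError("n_classes must be between 1 and len(values)")

    # Prefix sums let us get the squared deviation of any slice in O(1).
    s1, s2 = [0.0], [0.0]
    for x in data:
        s1.append(s1[-1] + x)
        s2.append(s2[-1] + x * x)

    def ssd(i, j):
        """Sum of squared deviations from the mean for data[i:j]."""
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total * total / (j - i)

    inf = float("inf")
    # cost[k][j]: lowest total deviation when data[:j] is split into k classes.
    cost = [[inf] * (n + 1) for _ in range(n_classes + 1)]
    split = [[0] * (n + 1) for _ in range(n_classes + 1)]
    cost[0][0] = 0.0
    for k in range(1, n_classes + 1):
        for j in range(k, n + 1):
            for i in range(k - 1, j):
                c = cost[k - 1][i] + ssd(i, j)
                if c < cost[k][j]:
                    cost[k][j] = c
                    split[k][j] = i  # the last class is data[i:j]

    # Walk the split table backwards to recover the class boundaries.
    breaks = []
    j = n
    for k in range(n_classes, 0, -1):
        breaks.append(data[j - 1])
        j = split[k][j]
    return breaks[::-1]


# Hypothetical one-dimensional scores, for illustration only.
scores = [1, 2, 3, 10, 11, 12, 50, 55, 60]
breaks = jenks_breaks(scores, 3)
```

For these sample scores, `breaks` is `[3, 12, 60]`: each value is the largest member of its class, so anything up to 3 falls in the first class, anything up to 12 in the second, and the rest in the third.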

Alternatively, run k-means once across the three dimensions you specified, with three clusters! Then you don’t need the adjustment at the end to merge the categories.
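A single multivariate run could look roughly like the sketch below. It assumes each customer is a tuple of three numeric features (the sample rows are invented placeholders, not the article's data), standardises the columns so that no single feature dominates the distances, and uses a plain Lloyd's algorithm with deterministic farthest-point seeding in place of the random initialisation a library implementation would normally use.

```python
from math import dist
from statistics import mean, pstdev


def standardise(rows):
    """Z-score each column so that all features share a comparable scale."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # guard against constant columns
    return [tuple((x - m) / s for x, m, s in zip(r, mus, sds)) for r in rows]


def kmeans(points, k=3, max_iter=100):
    """Lloyd's algorithm; seeding is farthest-point (an assumption for determinism)."""
    centres = [points[0]]
    while len(centres) < k:
        # Next seed: the point farthest from its nearest existing centre.
        centres.append(max(points, key=lambda p: min(dist(p, c) for c in centres)))
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centre.
        new_labels = [min(range(k), key=lambda i: dist(p, centres[i])) for p in points]
        if new_labels == labels:
            break  # converged: no point changed cluster
        labels = new_labels
        # Update step: move each centre to the mean of its members.
        for i in range(k):
            members = [p for p, lab in zip(points, labels) if lab == i]
            if members:
                centres[i] = tuple(mean(col) for col in zip(*members))
    return labels


# Hypothetical customers with three features each, for illustration only.
customers = [
    (5, 1, 20), (7, 2, 25), (6, 1, 30),
    (30, 5, 200), (28, 6, 220), (32, 5, 210),
    (90, 20, 900), (95, 22, 950), (88, 19, 920),
]
segments = kmeans(standardise(customers), k=3)
```

Each customer gets exactly one segment label from the joint clustering, so there are no per-dimension cluster labels left to combine afterwards.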
