The Categories We Uncover When Clustering Apps According to Their Features
Wouldn’t it be great if categories in App Stores were dynamic and calculated automatically, using smart techniques that would analyse all aspects of current apps and devise an evolving categorisation that would not only be meaningful and easy to use, but would also give all apps a fair chance of being discovered?
Well, that’s not here yet. But in an effort to automate the categorisation of apps using their descriptors, I have employed a clustering algorithm to see how far I can get.
To cluster in the simplest way possible, I used the K-Means clustering technique with k set to 19 (the number of categories in App World at the time) and hoped that every feature would magically fall in with its buddies from its original category.
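As a rough illustration of this step, here is a minimal sketch that clusters short feature phrases with K-Means. It is not the code behind the plot: it swaps in scikit-learn's KMeans and a TF-IDF vectoriser instead of NLTK's clustering package, the feature phrases are made up, and k is 3 rather than 19 to suit the toy data. The helper name cluster_features is hypothetical.

```python
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical feature phrases; the real app descriptors are not shown in the post.
features = [
    "share photos with friends",
    "upload photos to albums",
    "track daily running distance",
    "log running and cycling workouts",
    "check live football scores",
    "follow football match results",
]


def cluster_features(texts, k, seed=0):
    """Turn each text into a TF-IDF vector and assign it to one of k clusters."""
    vectors = TfidfVectorizer().fit_transform(texts)
    model = KMeans(n_clusters=k, n_init=10, random_state=seed)
    return model.fit_predict(vectors)


# k=3 for the toy data; the post used k=19, one cluster per store category.
labels = cluster_features(features, k=3)
for label, text in zip(labels, features):
    print(label, text)
```

Each printed line pairs a cluster index with a feature, which is the information the colours in the plot encode.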
With a cocktail of NLTK’s clustering package, scikit-learn, matplotlib and mpld3, I produced the spectacle you see below. In the plot, I used scikit-learn’s TSNE to project the points into two dimensions, so that they land in something akin to clusters.
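For readers unfamiliar with TSNE, the sketch below shows the general shape of that projection step. The input is random stand-in data rather than real feature vectors, and the perplexity, init and seed values are assumptions chosen only so the example runs quickly.

```python
import numpy as np
from sklearn.manifold import TSNE

# Stand-in data: 60 random 50-dimensional vectors in place of real feature vectors.
rng = np.random.default_rng(0)
vectors = rng.normal(size=(60, 50))

# Project to 2-D for plotting; perplexity must be smaller than the number of samples.
tsne = TSNE(n_components=2, perplexity=15, init="random", random_state=0)
points_2d = tsne.fit_transform(vectors)

print(points_2d.shape)
```

Each row of points_2d is an (x, y) position that can be handed to a scatter plot.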
The plot is interactive. Every point is a feature, the shape is the category from which this feature was extracted and the colour is the feature’s newfound cluster.
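The shape-and-colour encoding can be sketched with plain matplotlib as follows. The rows, category names and marker choices are invented for illustration, and the interactive tooltips that mpld3 provides are left out.

```python
import matplotlib

matplotlib.use("Agg")  # render off-screen so the sketch runs without a display
import matplotlib.pyplot as plt

# Hypothetical rows: (x, y, original category, assigned cluster).
rows = [
    (0.1, 0.2, "Games", 0),
    (0.3, 0.1, "Games", 1),
    (0.8, 0.9, "Sports", 1),
    (0.7, 0.6, "Sports", 0),
    (0.4, 0.8, "Photo", 2),
]

markers = {"Games": "o", "Sports": "s", "Photo": "^"}  # shape = original category
cmap = plt.get_cmap("tab20")  # colour = newly assigned cluster

fig, ax = plt.subplots()
# One scatter call per category, because a scatter call takes a single marker shape.
for category, marker in markers.items():
    subset = [r for r in rows if r[2] == category]
    ax.scatter(
        [r[0] for r in subset],
        [r[1] for r in subset],
        marker=marker,
        color=[cmap(r[3]) for r in subset],
        label=category,
    )
ax.legend(title="Original category")
```

A feature whose colour matches most of the points sharing its shape has landed in a cluster dominated by its original category.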
Hover over a point to read its feature. Click the magnifying glass to zoom in on a cluster, the arrows to pan the view around, and the home button to reset the plot.