November 21, 2017

The Categories We Uncover When Clustering Apps According to Their Features

Afnan AlSubaihin

Wouldn’t it be great if categories in App Stores were dynamic and calculated automatically, using smart techniques that would analyse all the aspects of current apps and devise an evolving categorisation that would not only be meaningful and easy to use, but would also give all apps a fair chance to be discovered?

Well, that’s not here yet. But in an effort to automate the categorisation of apps using their descriptors, I have employed a clustering algorithm to see how far I can get.

The features come from apps in the BlackBerry App World as of 2011; the data was extracted and made available by the UCLAppA team (Technical Report).

To cluster in the simplest way possible, I used the K-Means clustering technique with k set to 19 (the number of categories in App World at the time) and hoped that every feature would magically fall in with its buddies in their original categories.
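As a rough illustration of the K-Means step, here is a minimal sketch that uses only the Python standard library. It is not the code behind this post, which used NLTK’s clustering package on the real App World features. The feature phrases, the bag-of-words vectors and the deterministic farthest-point seeding are all assumptions made for the example, and k is 3 rather than 19 because the toy data only covers three themes.

```python
import math
from collections import Counter


def vectorise(texts):
    """Turn each text into a unit-length bag-of-words vector."""
    vocab = sorted({tok for t in texts for tok in t.lower().split()})
    index = {tok: i for i, tok in enumerate(vocab)}
    vectors = []
    for t in texts:
        vec = [0.0] * len(vocab)
        for tok, n in Counter(t.lower().split()).items():
            vec[index[tok]] = float(n)
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        vectors.append([x / norm for x in vec])
    return vectors


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, max_iter=100):
    """Plain Lloyd's algorithm; returns one cluster label per vector."""
    # Assumption: deterministic farthest-point seeding, so the example is
    # reproducible. Real runs often use random or k-means++ seeding.
    centroids = [vectors[0]]
    while len(centroids) < k:
        centroids.append(
            max(vectors, key=lambda v: min(sq_dist(v, c) for c in centroids))
        )
    labels = None
    for _ in range(max_iter):
        # Assignment step: each vector joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: sq_dist(v, centroids[j])) for v in vectors
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: each centroid moves to the mean of its members.
        for j in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical feature phrases, standing in for the extracted app features.
features = [
    "send email message",
    "read email inbox",
    "email message inbox",
    "play music playlist",
    "stream music radio",
    "music playlist radio",
    "gps map navigation",
    "map location gps",
]
labels = kmeans(vectorise(features), k=3)  # the post used k=19
for label, feature in zip(labels, features):
    print(label, feature)
```

On data this tidy, the phrases that share vocabulary end up in the same cluster. Real app features are far noisier, which is why hoping for a clean match with the store categories is optimistic.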

With a cocktail of NLTK’s clustering package, scikit-learn, matplotlib and mpld3, I have achieved the spectacle you see below. In the plot, I have used scikit-learn’s TSNE to lay out the points in two dimensions, so that they render in something akin to clusters.

The plot is interactive. Every point is a feature, the shape is the category from which the feature was extracted, and the colour is the feature’s newfound cluster.
Hover over a point to read its feature. Click the magnifying glass to zoom in on a certain cluster, the arrows to pan the view around, and the home button to reset the plot.

Click here for a full-page view of the clusters.