Objective: analyze a large-scale dataset of fashion images to discover visually consistent style clusters.
- Dataset: StreetStyle-27K.
- Code: demo here
New dataset: StreetStyle-27K
- Photos (100 million): retrieved from Instagram via the API, filtered by location and time.
- People (14.5 million): two detection algorithms are run to normalize the body position in each image.
- Clothing annotations (27K): collected on Amazon Mechanical Turk with quality control, for about $4,000 for the whole dataset.
Architecture:
A standard GoogLeNet, but isotonic regression is used to correct the bias in the predicted attribute probabilities.
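A minimal sketch of what the calibration step could look like with scikit-learn, assuming we have uncalibrated scores from one attribute head and the matching ground-truth labels on a held-out set (all names and numbers below are illustrative, not from the paper):

```python
# Calibrate classifier scores with isotonic regression.
import numpy as np
from sklearn.isotonic import IsotonicRegression

# raw_scores: uncalibrated probabilities from the attribute classifier
# labels:     ground-truth 0/1 annotations for the same validation images
raw_scores = np.array([0.10, 0.35, 0.40, 0.60, 0.85, 0.90])
labels     = np.array([0,    0,    1,    1,    1,    1])

# Fit a monotonic mapping from raw score -> calibrated probability.
calibrator = IsotonicRegression(out_of_bounds="clip")
calibrator.fit(raw_scores, labels)

# Apply the mapping to new, unseen scores.
print(calibrator.predict(np.array([0.2, 0.5, 0.95])))
```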
Unsupervised clustering:
They proceed as follows (see the sketch after this list):
- Compute the feature embeddings for a subset of the overall dataset, sampled to be representative across location and time.
- Apply L2 normalization.
- Use PCA to keep the components explaining 90% of the variance (165 components here).
- Cluster the reduced embeddings with a GMM of 400 mixture components, each component representing a style cluster.
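A minimal sketch of this pipeline with scikit-learn, assuming `embeddings` is an (N, D) array of CNN features for the sampled subset; the stand-in data, array sizes, and the diagonal covariance choice are assumptions, only the 90% variance threshold and the 400 components come from the notes above:

```python
import numpy as np
from sklearn.preprocessing import normalize
from sklearn.decomposition import PCA
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
embeddings = rng.normal(size=(5000, 256))  # stand-in for real CNN features

# 1. L2-normalize each embedding.
X = normalize(embeddings, norm="l2")

# 2. PCA keeping enough components to explain 90% of the variance
#    (165 components at this threshold in the paper).
pca = PCA(n_components=0.90)
X_reduced = pca.fit_transform(X)

# 3. Fit a GMM; each of the 400 mixture components is a style cluster.
gmm = GaussianMixture(n_components=400, covariance_type="diag", random_state=0)
cluster_ids = gmm.fit_predict(X_reduced)

print(X_reduced.shape, cluster_ids[:10])
```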
Fashion clusters are then computed per city or for larger geographic entities.

Results:
Pretty standard techniques, but patched together to produce interesting visualizations.