How does t-distributed Stochastic Neighbor Embedding (t-SNE) work at a high level?

t-SNE, or t-distributed Stochastic Neighbor Embedding, is a nonlinear approach to dimension reduction that is best suited for visualizing high-dimensional data in two (or occasionally three) dimensions.

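At a high level, it projects the original dataset of p dimensions down into 2 dimensions while preserving local structure: observations that are close neighbors in the higher-dimensional space should remain close neighbors after projection. In the original space, the pairwise similarities between observations are modeled with Gaussian distributions. In the lower-dimensional space, they are modeled using the Student’s t-distribution with one degree of freedom, which gives t-SNE that part of its name. The low-dimensional coordinates are then adjusted iteratively to minimize the mismatch (measured by the Kullback-Leibler divergence) between the two sets of similarities.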
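To make the Student’s t part concrete, here is a minimal sketch of how the low-dimensional similarities can be computed for a handful of already-embedded points. The function name `student_t_similarities` and the toy `embedding` are illustrative assumptions, not part of any library. The sketch covers only this one piece of the method, not the full optimization.

```python
import math

def student_t_similarities(points):
    """Pairwise similarities q[i][j] for low-dimensional points.

    Uses a Student's t kernel with one degree of freedom,
    (1 + squared distance)^-1, normalized over all pairs i != j.
    """
    n = len(points)
    weights = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:  # a point is not its own neighbor
                d2 = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
                weights[i][j] = 1.0 / (1.0 + d2)
    total = sum(sum(row) for row in weights)
    return [[w / total for w in row] for row in weights]

# Hypothetical 2-D embedding: two nearby points and one far away.
embedding = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0)]
q = student_t_similarities(embedding)

# The nearby pair gets a much larger similarity than the distant pair,
# but the heavy tail keeps the distant pair's similarity well above
# what a Gaussian kernel exp(-d^2) would give at the same distance.
print(q[0][1], q[0][2], math.exp(-25.0))
```

In practice one would typically rely on an existing implementation, such as `sklearn.manifold.TSNE` in scikit-learn, rather than coding the method by hand.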
