How does t-SNE compare to PCA?

PCA is a linear dimensionality reduction technique that finds the directions of greatest variance and so models the global structure of the data, while t-SNE is a non-linear technique designed to preserve the local neighborhood structure of high-dimensional data.
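The difference between the two objectives can be written down compactly. The notation here is an assumption introduced for illustration and does not appear in the original post: x_i are the high-dimensional points, y_i their low-dimensional embeddings, Sigma the sample covariance of the centered data, and sigma_i a per-point bandwidth. PCA picks the unit direction that maximizes projected variance, while t-SNE, as usually formulated, minimizes the KL divergence between neighbor similarities in the two spaces.

```latex
% PCA: first principal direction maximizes projected variance (global, linear)
w_1 = \arg\max_{\lVert w \rVert = 1} \; w^\top \Sigma \, w

% t-SNE: Gaussian neighbor similarities in the original space
p_{j|i} = \frac{\exp\!\left(-\lVert x_i - x_j \rVert^2 / 2\sigma_i^2\right)}
               {\sum_{k \neq i} \exp\!\left(-\lVert x_i - x_k \rVert^2 / 2\sigma_i^2\right)},
\qquad
p_{ij} = \frac{p_{j|i} + p_{i|j}}{2n}

% Student-t similarities in the embedding
q_{ij} = \frac{\left(1 + \lVert y_i - y_j \rVert^2\right)^{-1}}
              {\sum_{k \neq l} \left(1 + \lVert y_k - y_l \rVert^2\right)^{-1}}

% Cost minimized over the embedding coordinates y_i (local, non-linear)
C = \mathrm{KL}(P \,\Vert\, Q) = \sum_{i \neq j} p_{ij} \log \frac{p_{ij}}{q_{ij}}
```

Because p_{ij} is close to zero for distant pairs, the t-SNE cost is dominated by near neighbors, which is where the emphasis on local structure comes from.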

t-SNE is better suited to handle outliers. PCA projects outliers onto the axes that capture the largest proportion of overall variability, so a few extreme points can influence those axes, whereas t-SNE is more likely to place outliers in a neighborhood of their own, apart from regions of higher density.
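The PCA half of this claim can be illustrated with a small sketch in pure Python. It uses synthetic 2-D data and a closed-form expression for the leading eigenvector of a 2x2 covariance matrix; the function name `first_principal_axis` and the data are hypothetical and chosen only for this example.

```python
import math
import random


def first_principal_axis(points):
    """Return the unit vector of the first principal component of 2-D points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    # Entries of the 2x2 sample covariance matrix.
    sxx = sum((x - mean_x) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - mean_y) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points) / (n - 1)
    # Angle of the leading eigenvector of a symmetric 2x2 matrix.
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    return math.cos(theta), math.sin(theta)


# Synthetic data (an assumption for illustration): spread mostly along x.
rng = random.Random(0)
clean = [(rng.gauss(0.0, 3.0), rng.gauss(0.0, 0.5)) for _ in range(200)]
# The same data plus a single extreme point far away along y.
with_outlier = clean + [(0.0, 100.0)]

axis_clean = first_principal_axis(clean)
axis_outlier = first_principal_axis(with_outlier)

# Angle between the two axes, ignoring the arbitrary sign of each axis.
cos_angle = abs(axis_clean[0] * axis_outlier[0] + axis_clean[1] * axis_outlier[1])
angle_deg = math.degrees(math.acos(min(1.0, cos_angle)))

print("first axis without outlier:", axis_clean)
print("first axis with outlier:   ", axis_outlier)
print("rotation caused by one point (degrees):", angle_deg)
```

In this toy setup a single point out of 201 is enough to turn the first principal axis from roughly horizontal to roughly vertical, because its squared deviation dominates the variance along y.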

t-SNE is a more recent technique and is often preferred over PCA for data exploration and visualization. It does require tuning hyperparameters such as perplexity and learning rate, whereas PCA requires little tuning beyond choosing the number of components, which can be done after fitting.
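To make the perplexity hyperparameter more concrete, here is a minimal sketch of how a target perplexity is commonly turned into a per-point Gaussian bandwidth: the bandwidth is searched until the neighbor distribution around a point has the requested perplexity, which can be read as an effective number of neighbors. The helper names, the search bounds, and the example distances are assumptions for illustration; this is not the code of any particular library.

```python
import math


def conditional_probs(sq_dists, sigma):
    """Gaussian neighbor probabilities from one point to its neighbors."""
    # Shift by the smallest distance so the largest weight is exactly 1.
    d_min = min(sq_dists)
    weights = [math.exp(-(d - d_min) / (2.0 * sigma ** 2)) for d in sq_dists]
    total = sum(weights)
    return [w / total for w in weights]


def perplexity(probs):
    """Perplexity = 2 ** (Shannon entropy in bits)."""
    entropy = -sum(p * math.log2(p) for p in probs if p > 0.0)
    return 2.0 ** entropy


def sigma_for_perplexity(sq_dists, target, lo=1e-3, hi=1e3, iters=60):
    """Bisect on sigma (geometrically) until the perplexity matches target."""
    for _ in range(iters):
        mid = math.sqrt(lo * hi)
        # Perplexity grows with sigma, so shrink the bracket accordingly.
        if perplexity(conditional_probs(sq_dists, mid)) > target:
            hi = mid
        else:
            lo = mid
    return math.sqrt(lo * hi)


# Hypothetical squared distances from one point to its eight neighbors.
sq_dists = [0.5, 1.0, 1.5, 2.0, 8.0, 9.0, 10.0, 50.0]

sigma_small = sigma_for_perplexity(sq_dists, target=2.0)
sigma_large = sigma_for_perplexity(sq_dists, target=5.0)

print("sigma for perplexity 2:", sigma_small)
print("sigma for perplexity 5:", sigma_large)
```

A larger perplexity leads to a wider bandwidth, so each point pays attention to more neighbors. This is one reason the resulting plot can change noticeably with this setting, and why trying several values is usually worthwhile.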
