What are some options for dealing with outliers?

Updated: March 26, 2023

Outliers are observed data points, so the first step is to understand what produced them. If the cause is a problem in the data generation process, such as measurement error, it needs to be addressed before modeling the data. If the sample size is sufficiently large and nothing systematic explains the outliers, it may be possible to remove them from the dataset; however, this should be done with caution. If the sample size is smaller, another option is to cap the outliers at more reasonable values (often called winsorizing), so that they do not sit so far from the bulk of the data that they have an undue effect on the regression line. Finally, quantile regression, which models a quantile such as the median rather than the mean of the response, is more robust to outliers than ordinary least-squares linear regression and might be of interest depending on the application.
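
The removal and capping options above can be sketched in a few lines of Python with NumPy. This is only an illustration under stated assumptions: the sample values are made up, the helper name `iqr_bounds` is hypothetical, and the 1.5 × IQR fence is one common convention for flagging outliers, not a rule taken from this post.

```python
import numpy as np

def iqr_bounds(values, k=1.5):
    """Return (lower, upper) fences placed k * IQR beyond the quartiles."""
    q1, q3 = np.percentile(values, [25, 75])
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

# Hypothetical sample: nine typical values and one extreme value.
data = np.array([10.0, 12.0, 11.0, 13.0, 12.0, 11.0, 14.0, 13.0, 12.0, 95.0])

lower, upper = iqr_bounds(data)
is_outlier = (data < lower) | (data > upper)

# Option 1: drop the flagged points (reasonable only with enough data
# and no systematic cause behind the outliers).
removed = data[~is_outlier]

# Option 2: cap the flagged points at the fences instead of dropping them,
# which keeps the sample size unchanged.
capped = np.clip(data, lower, upper)

# Intuition for modeling the median rather than the mean:
# the extreme value shifts the mean noticeably but not the median.
print("with outlier:   ", np.mean(data), np.median(data))
print("outlier removed:", np.mean(removed), np.median(removed))
```

Capping keeps every observation, which matters when the sample is small; removal is simpler but discards information. The last two lines hint at why a median-based model such as quantile regression is less sensitive to extreme values than one based on the mean.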

Author

AIML.com
