
Supervised Learning Interview Questions
Regression:
- Explain the concept of Linear Regression
- What are the assumptions in a Linear Regression model?
- How are coefficients of linear regression estimated?
- How is variability measured in Linear Regression?
- What are the key evaluation criteria for a Linear Regression model?
Classification:
- What is classification, and what are the different types of classification?
- How do you evaluate the performance of a classification model?
- What is a ROC curve?
- How do you handle imbalanced datasets in classification tasks?
- Explain the difference between Gini, Entropy, and Information Gain
Logistic Regression
- What is Logistic Regression? Describe how a logistic regression model is fit to data
- What are the major assumptions of logistic regression?
- What are the advantages and disadvantages of logistic regression?
- What is the relationship between the log odds (logit) and probability?
Ensemble Learning (Decision Trees, Bagging, Random Forest, Boosting)
- What is a Decision Tree? Explain the concept and working of a Decision Tree model
- What is Bagging? How do you perform bagging and what are its advantages?
- Explain the concept and working of the Random Forest model
- What is Gradient Boosting (GBM)? Describe how the Gradient Boosting algorithm works
- What is XGBoost? How does it improve upon standard GBM?
- How is Gradient Boosting different from Random Forest?
- What is the difference between AdaBoost and Gradient Boosting?
- Distinguish between a Weak Learner and a Strong Learner
- What are the key hyperparameters in a Random Forest model?
- GBM vs Random Forest: which algorithm should be used when?
- What is the difference between Decision Trees, Bagging and Random Forest?
- What are the advantages and disadvantages of a Decision Tree model?
- What are the advantages and disadvantages of Random Forest?
- What are the advantages and disadvantages of a GBM model?
- How does pruning a tree work?
Support Vector Machine (SVM)
- What is the basic idea of Support Vector Machine (SVM) and Maximum Margin?
- What hyper-parameters are typically tuned in SVM?
- What are the pros/cons of using an SVM model?
- What are common choices to use for kernels in SVM?
- Describe the hinge loss function used in SVM
- What is the kernel trick in SVM?
Other key questions
- What is a Generalized Linear Model (GLM)?
- Briefly discuss other models that fall within the scope of GLM
- What is the difference between a generative and a discriminative model?
- What is a Naive Bayes classifier? Explain how Naive Bayes works
- What are the Pros/Cons of Naive Bayes?
- How does discriminant analysis work at a high level?
- What is the difference between classification and regression in supervised learning?
- What is the difference between Feature Engineering and Feature Selection?
- What is Feature Scaling? What are the different feature scaling techniques?
- What is Feature Standardization and why is it needed?
- What is overfitting, and how can it be prevented in supervised learning?
- What is underfitting and how can it be prevented?
- What does L1 regularization (Lasso) mean?
- What does L2 regularization (Ridge) mean?
Relevant articles: