Machine Learning Interview Questions

Q. What is Multi-class Classification?
Q. How would you evaluate a classification model?
Q. What are some of the most common feature engineering techniques?
Q. What are the advantages and disadvantages of a Decision Tree model?
Q. What are the advantages and disadvantages of Random Forest?
Q. What are the advantages and disadvantages of a GBM model?
Q. How is Gradient Boosting different from Random Forest?
Q. GBM vs Random Forest: which algorithm should be used when?
Q. What are the best ways to safeguard against overfitting a GBM?
Q. Briefly discuss other models that fall within the scope of GLM.
Q. How would you evaluate a Classification model using ROC/AUC?
Q. How would you address an imbalanced classification problem?
Q. How does Naive Bayes work?
Q. How does SVM adjust for classes that cannot be linearly separated?
Q. Describe the hinge loss function used in SVM.
Q. How does hinge loss differ from logistic loss?
Q. Explain how SVM can be used in regression problems.
Q. What is the difference between QDA and Gaussian Mixture Models (GMM)?
Q. What is the purpose of feature selection, and what are some common approaches?
Q. How are model hyper-parameters generally selected?
Q. What are the different types of Gradient Descent?
Q. What is the error / loss function in logistic regression?
Q. What differentiates Linear Discriminant Analysis (LDA) from Quadratic Discriminant Analysis (QDA)?
Q. What is a closed form solution, and what are the advantages of a problem having such a solution? Which algorithms have a closed form solution?
Q. What are some common evaluation metrics in clustering?
Q. What are some common distance metrics that can be used in clustering?
Q. What loss function does K-Means seek to minimize?
Q. How does K-Means++ work?
Q. What are some options for clustering on categorical data? What if the dataset contains a combination of numeric and categorical features?
Q. What is Expectation-Maximization (EM)?
Q. What is a Gaussian Mixture Model (GMM)?
Q. How does the initial choice of centroids affect the K-Means algorithm?
Q. How does DBSCAN Clustering work, and in what cases is it useful?
Q. What is Spectral Clustering?
Q. What is the difference between parametric and non-parametric models?
Q. What are the pros and cons of parametric vs. non-parametric models?
Q. What is the difference between Feature Engineering and Feature Selection?
Q. How are categorical features or qualitative predictors represented in a machine learning model?
Q. What is Laplace Smoothing? What is Additive Smoothing? Why do we need smoothing in IDF?
Q. What is the Bag-of-Words model? Explain using an example.
Q. What is an N-gram language model? Explain how it works in detail.
Q. What are the assumptions of linear regression?
Q. How are continuous features incorporated into Naive Bayes?
Q. What are common choices to use for kernels in SVM?
Q. Discuss dummy encoding in the context of feature engineering.
Q. Discuss text feature extraction in the context of feature engineering.
Q. What is Hierarchical Clustering?
Q. Explain the difference between Entropy, Gini, and Information Gain
Q. How does a decision tree create splits from continuous features?
Q. How does pruning a tree work?
Q. What is CART?
Q. What are the key hyperparameters for a Random Forest model?
Q. Why is Random Forest a non-linear model? Why does it result in non-linear decision boundaries?
Q. What is the difference between Decision Trees, Bagging and Random Forest?
Q. What are the key hyperparameters for a GBM model?
Q. What are the options for reporting feature importance from a decision-tree based model?
Q. What is the difference between AdaBoost and Gradient Boosting?
Q. What is XGBoost? How does it improve upon standard GBM?
Q. How is variability measured in Linear Regression?
Q. What is multicollinearity and how can that be identified?
Q. What is the global F-test?
Q. What is R-squared and adjusted R-squared?
Q. What are the various measures of error (MSE, RMSE, MAE)?
Q. What are information criteria (AIC, BIC)?
Q. What are some of the problems with stepwise selection approaches?
Q. Suppose there are a large number of predictors ‘p’. What is the best approach to find out if any of the p predictors are helpful in predicting the response ‘y’?
Q. What are the most common transformations when the target variable is not normally distributed?
Q. Doesn’t polynomial regression violate the multicollinearity assumption for Linear Regression?
Q. Why does multicollinearity result in poor estimates of coefficients in linear regression?
Q. What is the difference between Regression and ANOVA?
Q. What is the difference between outliers, high leverage points, and high influence points?
Q. What is an outlier?
Q. What is a high leverage point?
Q. What is a high influence point?
Q. What are potential problems encountered in Linear Regression?
Q. What is non-negative least squares, and when is it used?
Q. What problems would arise from using a regular linear regression to model a binary outcome?
Q. Why are the log odds used in the link function instead of just the regular odds?
Q. What is the relationship between the log odds and probability?
Q. How are the coefficients in a logistic regression interpreted?
Q. Why are coefficients estimated through Maximum Likelihood (MLE) instead of Least Squares?
Q. What is the equivalent of the overall F test in logistic regression?
Q. What are the advantages and disadvantages of logistic regression?
Q. How does GLM adjust to the case of count data?
Q. What is the cost function used in Poisson Regression?
Q. What is overdispersion in Poisson Regression, and what are alternate specifications for when it is present?
Q. What about cases where a significant number of observations have a count of 0 (in the context of Poisson Regression)?
Q. What is Gamma Regression?
Q. What is Beta regression?
Q. What is Tweedie Regression?
Q. What is Accuracy?
Q. What is Misclassification Rate?
Q. What is Recall?
Q. What is Precision?
Q. What is F1 Score?
Q. What is Specificity?
Q. What is False Positive Rate (FPR)?
Q. How do you determine the threshold/decision rule for a classification model?
Q. Understanding Probability Outputs in Classification Algorithms
Q. What do you mean by calibration quality? How can calibration quality be detected from the output of an algorithm?

