Effective XGBoost by Matt Harrison - My Personal Take

This repository contains my personal notes, examples, and code implementations based on the book "Effective XGBoost" by Matt Harrison. Its purpose is to document my journey through the book: watching a master at his craft and gleaning insights I can apply when building models for my own projects.

Book summary

This book is a comprehensive guide to building machine learning models with the XGBoost library. It covers the basics of decision trees, a fundamental component of the XGBoost model, and discusses the tradeoffs involved in creating a predictive model. It also presents best practices for using XGBoost and shows how related libraries can improve your model. Examples and exercises help readers practice with XGBoost and understand its features.

The book is organized into chapters that cover topics such as data preprocessing, hyperparameter tuning, model evaluation, and model interpretation. It also discusses advanced topics such as feature interactions, SHAP values, and model deployment. Throughout the book, the author provides practical examples and code snippets to help readers understand how to use XGBoost effectively.
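
Since the summary names decision trees as a fundamental component of XGBoost, here is a minimal sketch of the simplest possible tree, a one-split "stump", in plain Python. It is not taken from the book: the function names (`gini`, `fit_stump`), the choice of Gini impurity as the split criterion, and the toy data are all assumptions made for illustration.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means perfectly pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def fit_stump(xs, ys):
    """Find the threshold t for the rule 'x < t' with the lowest weighted Gini.

    Returns a (threshold, weighted_impurity) tuple; threshold is None when
    the feature has fewer than two distinct values.
    """
    best = (None, float("inf"))
    values = sorted(set(xs))
    # Candidate thresholds are midpoints between consecutive distinct values.
    for lo, hi in zip(values, values[1:]):
        t = (lo + hi) / 2
        left = [y for x, y in zip(xs, ys) if x < t]
        right = [y for x, y in zip(xs, ys) if x >= t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(ys)
        if score < best[1]:
            best = (t, score)
    return best


# Toy, made-up data: one numeric feature and a binary label.
xs = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
ys = [0, 0, 0, 1, 1, 1]

threshold, impurity = fit_stump(xs, ys)
print(threshold, impurity)  # prints: 6.5 0.0
```

A boosted model combines many such trees, each deeper than a stump and fitted to the errors of the ones before it. In practice you would rely on the XGBoost library rather than a hand-rolled splitter like this one.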

Table of Contents

  1. Introduction
  2. Datasets
  3. Exploratory Data Analysis
  4. Tree Creation
  5. Stumps on Real Data
  6. Model Complexity & Hyperparameters
  7. Tree Hyperparameters
  8. Random Forest
  9. XGBoost
  10. Early Stopping
  11. XGBoost Hyperparameters
  12. Hyperopt
  13. Step-wise Tuning with Hyperopt
  14. Do you have enough data?
  15. Model Evaluation
  16. Training For Different Metrics
  17. Model Interpretation
  18. xgbfir (Feature Interactions Reshaped)
  19. Exploring SHAP
  20. Better Models with ICE, Partial Dependence, Monotonic Constraints, and Calibration
  21. Serving Models with MLFlow
  22. Conclusion