8. trees

Tree-based methods partition the predictor space into regions and predict, within each region, by majority vote (classification) or average outcome (regression) of the training observations that fall there.

Published

October 28, 2024

Artwork by @allison_horst.
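To make the description at the top of the page concrete, here is a minimal sketch in Python of prediction from a partition of a single predictor. It is not the CART fitting algorithm: the cut points are fixed by hand rather than chosen by greedy splitting, and the names `fit_regions`, `predict`, and `cut_points`, along with the toy data, are hypothetical and used only for illustration. The sketch also assumes every region contains at least one training observation.

```python
from bisect import bisect_right
from collections import Counter, defaultdict
from statistics import mean


def fit_regions(x_train, y_train, cut_points):
    """Group training outcomes by the region their predictor value falls in.

    cut_points must be sorted; k cut points define k + 1 regions.
    """
    regions = defaultdict(list)
    for x, y in zip(x_train, y_train):
        regions[bisect_right(cut_points, x)].append(y)
    return dict(regions)


def predict(x_new, regions, cut_points, task):
    """Predict from the training outcomes in the region containing x_new."""
    outcomes = regions[bisect_right(cut_points, x_new)]
    if task == "regression":
        return mean(outcomes)  # average outcome in the region
    return Counter(outcomes).most_common(1)[0][0]  # majority vote


# Hand-picked partition (an assumption): x <= 2.5, 2.5 < x <= 6.0, x > 6.0
cut_points = [2.5, 6.0]
x_train = [1.0, 2.0, 3.0, 4.0, 5.0, 7.0, 8.0]
y_class = ["a", "a", "b", "b", "a", "c", "c"]
y_num = [1.0, 3.0, 10.0, 12.0, 14.0, 20.0, 30.0]

class_regions = fit_regions(x_train, y_class, cut_points)
num_regions = fit_regions(x_train, y_num, cut_points)

label = predict(4.5, class_regions, cut_points, "classification")
value = predict(4.5, num_regions, cut_points, "regression")
```

A new observation at `x = 4.5` lands in the middle region, so the classification prediction is the most common label there and the regression prediction is the mean of the three numeric outcomes there. A fitted tree does the same lookup, with the regions defined by its sequence of splits.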

Agenda

October 28, 2024

  1. Decision Trees
  2. Example

October 30, 2024

  1. Bagging
  2. Example

Readings

Reflection questions

  • What does CART stand for?

  • How does CART make predictions on test data?

  • Can CART be used for both classification and regression or only one of the two tasks?

  • Can you use categorical / character predictors with CART?

  • How is tree depth chosen?

  • What does it mean for CART to be high variance?

  • What are the advantages of the CART algorithm?

  • What are the disadvantages of the CART algorithm?

Ethics considerations

  • What type of feature engineering is required for CART?

  • If the model produces near perfect predictions on the test data, what are some potential concerns about putting that model into production?

Slides

Additional Resources

With the help of the Rand Corp., the city tried to measure fire response times, identify redundancies in service, and close or re-allocate fire stations accordingly. What resulted, though, was a perfect storm of bad data: The methodology was flawed, the analysis was rife with biases, and the results were interpreted in a way that stacked the deck against poorer neighborhoods. The slower response times allowed smaller fires to rage uncontrolled in the city’s most vulnerable communities.
