
Decision trees - Regression tree building

Dr. D’Agostino McGowan

1 / 7

The tree building process

  • Divide the predictor space (the set of possible values for $X_1, X_2, \dots, X_p$) into $J$ distinct, non-overlapping regions, $R_1, R_2, \dots, R_J$
  • For every observation that falls into the region $R_j$, we make the same prediction: the mean response value for the training observations in $R_j$ (see the sketch below)

2 / 7
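To make the prediction rule concrete, here is a minimal Python sketch. The toy region labels and responses are hypothetical; in practice the regions come from the tree-building process described on the following slides.

```python
import numpy as np

# Hypothetical toy data: region labels for 8 training observations.
regions = np.array([1, 1, 2, 2, 2, 3, 3, 3])
y = np.array([4.0, 6.0, 10.0, 12.0, 11.0, 1.0, 2.0, 3.0])

# The prediction for any observation falling in region R_j is the mean
# response of the training observations in R_j.
region_means = {int(j): float(y[regions == j].mean())
                for j in np.unique(regions)}
print(region_means)  # {1: 5.0, 2: 11.0, 3: 2.0}
```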

The tree building process

  • The regions could have any shape, but we choose to divide the predictor space into high-dimensional boxes for simplicity and ease of interpretation
  • The goal is to find boxes, $R_1, \dots, R_J$, that minimize the RSS, given by

$$\sum_{j=1}^{J} \sum_{i \in R_j} \left(y_i - \hat{y}_{R_j}\right)^2$$

where $\hat{y}_{R_j}$ is the mean response for the training observations within the $j$th box (see the short sketch below).

3 / 7
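A short sketch of the objective itself, reusing the hypothetical toy data above; the rss function is an illustrative helper, not part of any particular library.

```python
import numpy as np

def rss(y, regions):
    """Sum over boxes j of the squared deviations of each training
    response from the mean response within its box."""
    total = 0.0
    for j in np.unique(regions):
        y_j = y[regions == j]
        total += ((y_j - y_j.mean()) ** 2).sum()
    return total

regions = np.array([1, 1, 2, 2, 2, 3, 3, 3])
y = np.array([4.0, 6.0, 10.0, 12.0, 11.0, 1.0, 2.0, 3.0])
print(rss(y, regions))  # 6.0 (2.0 from each of the three boxes)
```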

The tree building process

  • It is often computationally infeasible to consider every possible partition of the feature space into $J$ boxes
  • Therefore, we take a top-down, greedy approach known as recursive binary splitting
  • It is top-down because it begins at the top of the tree and then successively splits the predictor space, two branches at a time
  • It is greedy because the best split is made at each particular step, rather than looking ahead and picking a split that might lead to a better tree in a future step (a single greedy step is sketched below)

4 / 7
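A minimal sketch of one greedy step, assuming a NumPy feature matrix X and response vector y; best_split is a hypothetical helper that scans every predictor and candidate cutpoint and keeps the pair with the lowest total RSS.

```python
import numpy as np

def best_split(X, y):
    """Hypothetical helper: scan every predictor j and cutpoint s and
    return the (j, s) pair whose split {X_j < s} / {X_j >= s} yields
    the smallest total RSS. No look-ahead -- this is the greedy part."""
    best_j, best_s, best_rss = None, None, np.inf
    for j in range(X.shape[1]):
        for s in np.unique(X[:, j]):
            left, right = y[X[:, j] < s], y[X[:, j] >= s]
            if len(left) == 0 or len(right) == 0:
                continue  # a split must leave observations on both sides
            split_rss = (((left - left.mean()) ** 2).sum()
                         + ((right - right.mean()) ** 2).sum())
            if split_rss < best_rss:
                best_j, best_s, best_rss = j, s, split_rss
    return best_j, best_s, best_rss
```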

The tree building process

  • First select the predictor $X_j$ and the cutpoint $s$ such that splitting the predictor space into $\{X \mid X_j < s\}$ and $\{X \mid X_j \geq s\}$ leads to the greatest possible reduction in RSS
  • We repeat this process, looking for the best predictor and cutpoint to split the data within each of the resulting regions
  • Now, instead of splitting the entire predictor space, we split one of the two previously identified regions, giving us three regions
  • Then we look to split one of these three regions so as to minimize the RSS
  • This process continues until some stopping criterion is met 🛑 e.g., we could stop once we have created a fixed number of regions, or we could keep going until no region contains more than 5 observations (a full recursive sketch follows this list)

5 / 7
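Putting the steps together, a sketch of the full recursive procedure, assuming the hypothetical best_split above; it stops once a region contains no more than min_obs observations.

```python
def grow_tree(X, y, min_obs=5):
    """Recursive binary splitting: greedily split the current region
    with best_split (defined above), then recurse on each half until
    no region contains more than min_obs observations."""
    if len(y) <= min_obs:
        return {"predict": y.mean()}       # leaf: mean response in region
    j, s, _ = best_split(X, y)
    if j is None:                          # no valid split exists
        return {"predict": y.mean()}
    keep = X[:, j] < s
    return {"var": j, "cut": s,
            "left": grow_tree(X[keep], y[keep], min_obs),
            "right": grow_tree(X[~keep], y[~keep], min_obs)}
```

Calling grow_tree(X, y) on a NumPy feature matrix returns a nested dictionary whose leaves hold the mean response of their region, matching the prediction rule from the second slide.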

Draw a partition

Draw an example of a partition of a two-dimensional feature space that could result from recursive binary splitting with six regions. Label your figure with the regions, $R_1, \dots, R_6$, as well as the cutpoints $t_1, t_2, \dots$. Draw a decision tree corresponding to this partition.

6 / 7
