class: right, top, my-title, title-slide

# Support Vector Machines

### Jo Hardin

### November 9 & 11, 2021

---

# Agenda 11/09/21

1. linearly separable
2. dot products
3. support vector formulation

---

## `tidymodels` syntax

1. partition the data
2. build a recipe
3. select a model
4. create a workflow
5. fit the model
6. validate the model

---

## Support Vector Machines

> SVMs create both linear and non-linear decision boundaries. They are remarkably efficient because of the **kernel trick**, which lets the computation be carried out as if in a high-dimensional space without ever explicitly transforming the data.

---

## Deriving SVM formulation

`\(\rightarrow\)` see class notes for all technical details

* Work through the mathematics of the optimization to find the widest linear boundary in a space where the two groups are completely separable.
* Note from the derivation: both the optimization and the application are based on dot products.
* Transform the data to a higher-dimensional space so that the points are linearly separable. Perform SVM in that space.
* Recognize that "performing SVM in the higher-dimensional space" is exactly equivalent to using a kernel in the original dimension.
* Allow points to cross the boundary using soft margins.

---

# Agenda 11/11/21

1. not linearly separable (SVM)
2. kernels (SVM)
3. support vector formulation

---

******

**Algorithm**: Support Vector Machine

******

1. Using cross validation, find values of `\(C, \gamma, d, r\)`, etc. (and the kernel function!)
2. Using Lagrange multipliers (read: the computer), solve for `\(\alpha_i\)` and `\(b\)`.
3. Classify an unknown observation (`\({\bf u}\)`) as "positive" if

`$$\sum \alpha_i y_i \phi({\bf x}_i) \cdot \phi({\bf u}) + b = \sum \alpha_i y_i K({\bf x}_i, {\bf u}) + b \geq 0$$`

******

---

## SVM example w defaults

.panelset[
.panel[.panel-name[recipe]

```r
penguin_svm_recipe <-
  recipe(sex ~ bill_length_mm + bill_depth_mm + flipper_length_mm +
           body_mass_g, data = penguin_train) %>%
  step_normalize(all_predictors())

summary(penguin_svm_recipe)
```

```
## # A tibble: 5 × 4
##   variable          type    role      source  
##   <chr>             <chr>   <chr>     <chr>   
## 1 bill_length_mm    numeric predictor original
## 2 bill_depth_mm     numeric predictor original
## 3 flipper_length_mm numeric predictor original
## 4 body_mass_g       numeric predictor original
## 5 sex               nominal outcome   original
```
]

.panel[.panel-name[model]

```r
penguin_svm_lin <- svm_linear() %>%
  set_engine("LiblineaR") %>%
  set_mode("classification")

penguin_svm_lin
```

```
## Linear Support Vector Machine Specification (classification)
## 
## Computational engine: LiblineaR
```
]

.panel[.panel-name[workflow]

```r
penguin_svm_lin_wflow <- workflow() %>%
  add_model(penguin_svm_lin) %>%
  add_recipe(penguin_svm_recipe)

penguin_svm_lin_wflow
```

```
## ══ Workflow ════════════════════════════════════════════════════════════════════
## Preprocessor: Recipe
## Model: svm_linear()
## 
## ── Preprocessor ────────────────────────────────────────────────────────────────
## 1 Recipe Step
## 
## • step_normalize()
## 
## ── Model ───────────────────────────────────────────────────────────────────────
## Linear Support Vector Machine Specification (classification)
## 
## Computational engine: LiblineaR
```
]

.panel[.panel-name[fit]

.pull-left[

```r
penguin_svm_lin_fit <- penguin_svm_lin_wflow %>%
  fit(data = penguin_train)

penguin_svm_lin_fit
```
]

.pull-right[

```
## ══ Workflow [trained] ══════════════════════════════════════════════════════════
## Preprocessor: Recipe
## Model: svm_linear()
## 
## ── Preprocessor ────────────────────────────────────────────────────────────────
## 1 Recipe Step
## 
## • step_normalize()
## 
## ── Model ───────────────────────────────────────────────────────────────────────
## $TypeDetail
## [1] "L2-regularized L2-loss support vector classification dual (L2R_L2LOSS_SVC_DUAL)"
## 
## $Type
## [1] 1
## 
## $W
##      bill_length_mm bill_depth_mm flipper_length_mm body_mass_g     Bias
## [1,]        0.24891        1.0802          -0.22564      1.3284 0.069927
## 
## $Bias
## [1] 1
## 
## $ClassNames
## [1] male   female
## Levels: female male
## 
## $NbClass
## [1] 2
## 
## attr(,"class")
## [1] "LiblineaR"
```
]
]
]

---

#### Fit again

```
## ══ Workflow [trained] ══════════════════════════════════════════════════════════
## Preprocessor: Recipe
## Model: svm_linear()
## 
## ── Preprocessor ────────────────────────────────────────────────────────────────
## 1 Recipe Step
## 
## • step_normalize()
## 
## ── Model ───────────────────────────────────────────────────────────────────────
## $TypeDetail
## [1] "L2-regularized L2-loss support vector classification dual (L2R_L2LOSS_SVC_DUAL)"
## 
## $Type
## [1] 1
## 
## $W
##      bill_length_mm bill_depth_mm flipper_length_mm body_mass_g     Bias
## [1,]        0.24891        1.0802          -0.22564      1.3284 0.069927
## 
## $Bias
## [1] 1
## 
## $ClassNames
## [1] male   female
## Levels: female male
## 
## $NbClass
## [1] 2
## 
## attr(,"class")
## [1] "LiblineaR"
```

---

## SVM example w CV tuning (RBF kernel)

.panelset[
.panel[.panel-name[recipe]

```r
penguin_svm_recipe <-
  recipe(sex ~ bill_length_mm + bill_depth_mm + flipper_length_mm +
           body_mass_g, data = penguin_train) %>%
  step_normalize(all_predictors())

summary(penguin_svm_recipe)
```

```
## # A tibble: 5 × 4
##   variable          type    role      source  
##   <chr>             <chr>   <chr>     <chr>   
## 1 bill_length_mm    numeric predictor original
## 2 bill_depth_mm     numeric predictor original
## 3 flipper_length_mm numeric predictor original
## 4 body_mass_g       numeric predictor original
## 5 sex               nominal outcome   original
```
]

.panel[.panel-name[model]

```r
penguin_svm_rbf <- svm_rbf(cost = tune(), rbf_sigma = tune()) %>%
  set_engine("kernlab") %>%
  set_mode("classification")

penguin_svm_rbf
```

```
## Radial Basis Function Support Vector Machine Specification (classification)
## 
## Main Arguments:
##   cost = tune()
##   rbf_sigma = tune()
## 
## Computational engine: kernlab
```
]

.panel[.panel-name[workflow]

```r
penguin_svm_rbf_wflow <- workflow() %>%
  add_model(penguin_svm_rbf) %>%
  add_recipe(penguin_svm_recipe)

penguin_svm_rbf_wflow
```

```
## ══ Workflow ════════════════════════════════════════════════════════════════════
## Preprocessor: Recipe
## Model: svm_rbf()
## 
## ── Preprocessor ────────────────────────────────────────────────────────────────
## 1 Recipe Step
## 
## • step_normalize()
## 
## ── Model ───────────────────────────────────────────────────────────────────────
## Radial Basis Function Support Vector Machine Specification (classification)
## 
## Main Arguments:
##   cost = tune()
##   rbf_sigma = tune()
## 
## Computational engine: kernlab
```
]

.panel[.panel-name[CV]

```r
set.seed(234)
penguin_folds <- vfold_cv(penguin_train, v = 4)
```
]

.panel[.panel-name[param]

```r
# the tuned parameters also have default values you can use
penguin_grid <- grid_regular(cost(), rbf_sigma(), levels = 8)

penguin_grid
```

```
## # A tibble: 64 × 2
##          cost  rbf_sigma
##         <dbl>      <dbl>
##  1 0.00097656 1     e-10
##  2 0.0043128  1     e-10
##  3 0.019047   1     e-10
##  4 0.084119   1     e-10
##  5 0.37150    1     e-10
##  6 1.6407     1     e-10
##  7 7.2458     1     e-10
##  8 32         1     e-10
##  9 0.00097656 2.6827e- 9
## 10 0.0043128  2.6827e- 9
## # … with 54 more rows
```
]

.panel[.panel-name[tune]

```r
# this takes a few minutes
penguin_svm_rbf_tune <- penguin_svm_rbf_wflow %>%
  tune_grid(resamples = penguin_folds,
            grid = penguin_grid)

penguin_svm_rbf_tune
```

```
## # Tuning results
## # 4-fold cross-validation
## # A tibble: 4 × 4
##   splits           id    .metrics           .notes          
##   <list>           <chr> <list>             <list>          
## 1 <split [186/63]> Fold1 <tibble [128 × 6]> <tibble [0 × 1]>
## 2 <split [187/62]> Fold2 <tibble [128 × 6]> <tibble [0 × 1]>
## 3 <split [187/62]> Fold3 <tibble [128 × 6]> <tibble [0 × 1]>
## 4 <split [187/62]> Fold4 <tibble [128 × 6]> <tibble [0 × 1]>
```
]
]

---

## SVM model output

```r
penguin_svm_rbf_tune %>%
  collect_metrics() %>%
  filter(.metric == "accuracy") %>%
  ggplot() +
  geom_line(aes(color = as.factor(cost), y = mean, x = rbf_sigma)) +
  geom_point(aes(color = as.factor(cost), y = mean, x = rbf_sigma)) +
  labs(color = "Cost") +
  scale_x_continuous(trans = 'log10')
```

![](2021-11-09-svm_files/figure-html/unnamed-chunk-16-1.png)<!-- -->

---

## SVM model output - take two

```r
penguin_svm_rbf_tune %>%
  collect_metrics() %>%
  filter(.metric == "accuracy") %>%
  ggplot() +
  geom_line(aes(color = as.factor(rbf_sigma), y = mean, x = cost)) +
  geom_point(aes(color = as.factor(rbf_sigma), y = mean, x = cost)) +
  labs(color = "RBF sigma") +
  scale_x_continuous(trans = 'log10')
```

![](2021-11-09-svm_files/figure-html/unnamed-chunk-17-1.png)<!-- -->

---

## SVM Final model

```r
penguin_svm_rbf_best <- finalize_model(
  penguin_svm_rbf,
  select_best(penguin_svm_rbf_tune, "accuracy"))

penguin_svm_rbf_best
```

```
## Radial Basis Function Support Vector Machine Specification (classification)
## 
## Main Arguments:
##   cost = 0.371498572284237
##   rbf_sigma = 1
## 
## Computational engine: kernlab
```

```r
penguin_svm_rbf_final <- workflow() %>%
  add_model(penguin_svm_rbf_best) %>%
  add_recipe(penguin_svm_recipe) %>%
  fit(data = penguin_train)
```

---

## SVM Final model

```r
penguin_svm_rbf_final
```

```
## ══ Workflow [trained] ══════════════════════════════════════════════════════════
## Preprocessor: Recipe
## Model: svm_rbf()
## 
## ── Preprocessor ────────────────────────────────────────────────────────────────
## 1 Recipe Step
## 
## • step_normalize()
## 
## ── Model ───────────────────────────────────────────────────────────────────────
## Support Vector Machine object of class "ksvm" 
## 
## SV type: C-svc  (classification) 
##  parameter : cost C = 0.371498572284237 
## 
## Gaussian Radial Basis kernel function. 
##  Hyperparameter : sigma =  1 
## 
## Number of Support Vectors : 137 
## 
## Objective Function Value : -31.8 
## Training error : 0.052209 
## Probability model included.
```

---

## Test predictions

```r
penguin_svm_rbf_final %>%
  predict(new_data = penguin_test) %>%
  cbind(penguin_test) %>%
  select(sex, .pred_class) %>%
  table()
```

```
##         .pred_class
## sex      female male
##   female     39    5
##   male        4   36
```

```r
penguin_svm_rbf_final %>%
  predict(new_data = penguin_test) %>%
  cbind(penguin_test) %>%
  conf_mat(sex, .pred_class)
```

```
##           Truth
## Prediction female male
##     female     39    4
##     male        5   36
```

---

## Other measures

```r
# https://yardstick.tidymodels.org/articles/metric-types.html
class_metrics <- metric_set(accuracy, sensitivity, specificity, f_meas)

penguin_svm_rbf_final %>%
  predict(new_data = penguin_test) %>%
  cbind(penguin_test) %>%
  class_metrics(truth = sex, estimate = .pred_class)
```

```
## # A tibble: 4 × 3
##   .metric  .estimator .estimate
##   <chr>    <chr>          <dbl>
## 1 accuracy binary       0.89286
## 2 sens     binary       0.88636
## 3 spec     binary       0.9    
## 4 f_meas   binary       0.89655
```

---

## Bias-Variance Tradeoff

<div class="figure" style="text-align: center">
<img src="../images/varbias.png" alt="Test and training error as a function of model complexity. Note that the error goes down monotonically only for the training data. Be careful not to overfit!! image credit: ISLR" width="90%" />
<p class="caption">Test and training error as a function of model complexity. Note that the error goes down monotonically only for the training data. Be careful not to overfit!! image credit: ISLR</p>
</div>

---

## Reflecting on Model Building

<div class="figure">
<img src="../images/modelbuild1.png" alt="Image credit: https://www.tmwr.org/" width="2176" />
<p class="caption">Image credit: https://www.tmwr.org/</p>
</div>

---

## Reflecting on Model Building

<div class="figure">
<img src="../images/modelbuild2.png" alt="Image credit: https://www.tmwr.org/" width="2067" />
<p class="caption">Image credit: https://www.tmwr.org/</p>
</div>

---

## Reflecting on Model Building

<div class="figure" style="text-align: center">
<img src="../images/modelbuild3.png" alt="Image credit: https://www.tmwr.org/" width="70%" />
<p class="caption">Image credit: https://www.tmwr.org/</p>
</div>
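
---

## Aside: the kernel trick in one computation

The claim that "performing SVM in the higher-dimensional space" is exactly equivalent to using a kernel in the original dimension can be checked by hand. A small worked example (not part of the class derivation) in `\(p = 2\)` dimensions uses the feature map

`$$\phi(x_1, x_2) = (x_1^2, \sqrt{2}\, x_1 x_2, x_2^2)$$`

Then for any `\({\bf x}\)` and `\({\bf u}\)`,

`$$\phi({\bf x}) \cdot \phi({\bf u}) = x_1^2 u_1^2 + 2 x_1 x_2 u_1 u_2 + x_2^2 u_2^2 = (x_1 u_1 + x_2 u_2)^2 = ({\bf x} \cdot {\bf u})^2$$`

so the degree-2 polynomial kernel `\(K({\bf x}, {\bf u}) = ({\bf x} \cdot {\bf u})^2\)` returns the dot product in the 3-dimensional space while working entirely in the original 2 dimensions, without ever computing `\(\phi\)`.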
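
---

## Aside: the decision rule as code

Step 3 of the algorithm, `\(\sum \alpha_i y_i K({\bf x}_i, {\bf u}) + b \geq 0\)`, is just a weighted sum of kernel evaluations against the support vectors. A minimal R sketch with made-up support vectors, multipliers, and intercept (in practice these come out of the Lagrange-multiplier optimization):

```r
# RBF kernel, matching kernlab's parameterization: exp(-sigma * |x - u|^2)
rbf_kernel <- function(x, u, sigma = 1) exp(-sigma * sum((x - u)^2))

sv    <- list(c(1, 2), c(2, 1), c(-1, -1))  # support vectors x_i (toy values)
y     <- c(1, 1, -1)                        # class labels y_i
alpha <- c(0.5, 0.3, 0.8)                   # Lagrange multipliers alpha_i (toy values)
b     <- 0.1                                # intercept (toy value)

# decision function: sum_i alpha_i * y_i * K(x_i, u) + b
decision <- function(u) sum(alpha * y * sapply(sv, rbf_kernel, u = u)) + b

# classify "positive" when the decision function is nonnegative
decision(c(1.5, 1.5)) >= 0
#> [1] TRUE
```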