---
title: "CSC 171 - Lab 8"
author: "Student Name"
output: word_document
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
### Instructions
In this lab, you will begin to get oriented with R and work with some data.
#### How to **complete** this assignment.
* Attempt each exercise in order.
* In each code chunk, if you see "# INSERT CODE HERE", replace that comment with code that produces the intended output.
* If my instructions say to "Run the code below...", then you do not need to add any code to the chunk.
* Many exercises also require you to type some text below the code chunk that interprets the output and answers the questions.
* Please follow the Davidson Honor Code and rules from the course syllabus regarding seeking help with this assignment.
#### How to **submit** this assignment.
* When you are finished, click the "Knit" button at the top of this panel. If there are no errors, a Word file should pop up after a few seconds.
* Take a look at the resulting Word file. Make sure everything looks correct, your name is listed at the top, and there is no 'junk' code or output.
* Save the Word file (to your local computer and/or to a cloud location) as: **Lab 8 "Insert Your Name"**.
* Use [this link](https://forms.gle/4SgT2hx9aY2XXEkk7) to upload your Word file to my Google Drive folder. **Do not** upload the original .Rmd version.
* This assignment is **due Thursday, August 4, 2022, no later than 9:30 am Eastern**. Points will be deducted for late submissions.
* TIP: Start early so that you can troubleshoot any issues with knitting to Word.
#### Grading Rubric
There are 6 possible points on this assignment.
**Baseline (C level work)**
- Your .Rmd file knits to Word without errors.
- You answer questions correctly but do not use complete sentences.
- There are typos and 'junk code' throughout the document.
- You do not put much thought or effort into the Reflection answers.
**Average (B level work)**
- You use complete sentences to answer questions.
- You attempt every exercise/question.
**Advanced (A level work)**
- Your code is simple and concise.
- Unnecessary messages from R are hidden from the knitted Word document.
- Your document is typo-free.
- At the discretion of the instructor, you give exceptionally thoughtful or insightful responses.
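
One way to keep package start-up messages out of the knitted Word file is to set chunk options in the setup chunk. The following is a minimal sketch, assuming you want the same behavior for every chunk; hiding warnings as well is an optional choice, not a requirement of this lab.

```r
# Assumption: these options are set globally, e.g. inside the setup chunk.
knitr::opts_chunk$set(
  echo    = TRUE,   # keep showing code, as in the original setup chunk
  message = FALSE,  # hide messages such as package start-up notes
  warning = FALSE   # optional: also hide warnings in the knitted output
)
```

The same options can be applied to a single chunk instead by writing them in that chunk's header, for example `{r, message=FALSE, warning=FALSE}`.
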
---
#### **Exercise 1. (6 points)**
We have seen that we can fit an SVM with a non-linear kernel in order to perform classification using a non-linear decision boundary. We will now see that we can also obtain a non-linear decision boundary by performing logistic regression using non-linear transformations of the features.
(A) Generate a data set with n = 500 and p = 2, such that the observations belong to two classes with a quadratic decision boundary between them. For instance, you can do this as follows:
- `x1 <- runif(500) - 0.5`
- `x2 <- runif(500) - 0.5`
- `y <- 1 * (x1^2 - x2^2 > 0)`
(B) Plot the observations, colored according to their class labels. Your plot should display $X_1$ on the x-axis, and $X_2$ on the y-axis.
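
Parts (A) and (B) could be approached along the following lines. This is only a sketch: the seed, the data frame name `dat`, and the color choices are assumptions made for illustration and are not part of the lab text.

```r
set.seed(1)  # assumption: fixed seed so the example is reproducible

# Data generation as given in part (A)
x1 <- runif(500) - 0.5
x2 <- runif(500) - 0.5
y  <- 1 * (x1^2 - x2^2 > 0)

# `dat` and `class_cols` are hypothetical names used only in this sketch
dat <- data.frame(x1 = x1, x2 = x2, y = factor(y))
class_cols <- c("red", "blue")  # level "0" = red, level "1" = blue

# X1 on the x-axis, X2 on the y-axis, colored by true class label
plot(dat$x1, dat$x2,
     col = class_cols[dat$y],  # a factor indexes by its integer codes
     pch = 19, cex = 0.6,
     xlab = "X1", ylab = "X2",
     main = "True class labels")
legend("topright", legend = levels(dat$y),
       col = class_cols, pch = 19, title = "y")
```

Storing `y` as a factor is a design choice here: `glm()` and classification functions then treat it as a two-class response rather than a number.
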
(C) Fit a logistic regression model to the data, using $X_1$ and $X_2$ as predictors.
(D) Apply this model to the training data in order to obtain a predicted class label for each training observation. Plot the observations, colored according to the predicted class labels. The decision boundary should be linear.
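
For parts (C) and (D), a possible starting point is sketched below. It regenerates the data so that it runs on its own; the seed, the object names (`dat`, `fit_lin`, `prob_lin`, `pred_lin`), and the 0.5 probability cutoff are assumptions for illustration.

```r
set.seed(1)  # assumption: fixed seed for reproducibility
x1 <- runif(500) - 0.5
x2 <- runif(500) - 0.5
y  <- 1 * (x1^2 - x2^2 > 0)
dat <- data.frame(x1 = x1, x2 = x2, y = factor(y))

# Logistic regression with X1 and X2 entering linearly
fit_lin <- glm(y ~ x1 + x2, data = dat, family = binomial)

# Predicted probability of class "1" for each training observation
prob_lin <- predict(fit_lin, newdata = dat, type = "response")

# Assumption: classify as 1 when the predicted probability exceeds 0.5
pred_lin <- ifelse(prob_lin > 0.5, 1, 0)

# Observations colored by predicted class label
plot(dat$x1, dat$x2,
     col = ifelse(pred_lin == 1, "blue", "red"),
     pch = 19, cex = 0.6,
     xlab = "X1", ylab = "X2",
     main = "Logistic regression, linear terms")

mean(pred_lin == y)  # training accuracy
```

Because the model is linear in `x1` and `x2`, the set of points with predicted probability 0.5 is a straight line, which is why the boundary between the predicted classes is linear.
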
(E) Now fit a logistic regression model to the data using non-linear functions of $X_1$ and $X_2$ as predictors (e.g., $X_1^2$, $X_1 \times X_2$, $\log(X_2)$, and so forth).
(F) Apply this model to the training data in order to obtain a predicted class label for each training observation. Plot the observations, colored according to the predicted class labels. The decision boundary should be obviously non-linear. If it is not, then repeat (A)-(E) until you come up with an example in which the boundary between the predicted class labels is obviously non-linear.
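
Parts (E) and (F) might look something like the sketch below. The particular terms (squares and an interaction), the seed, the object names, and the 0.5 cutoff are all assumptions; other non-linear transformations are equally valid. Because the classes here are exactly separated by a quadratic rule, `glm()` may warn that fitted probabilities of 0 or 1 occurred or that the algorithm did not converge; the predicted labels can still be plotted.

```r
set.seed(1)  # assumption: fixed seed for reproducibility
x1 <- runif(500) - 0.5
x2 <- runif(500) - 0.5
y  <- 1 * (x1^2 - x2^2 > 0)
dat <- data.frame(x1 = x1, x2 = x2, y = factor(y))

# Logistic regression with squared and interaction terms.
# I() makes ^ and * act as arithmetic inside the formula.
fit_nl <- glm(y ~ x1 + x2 + I(x1^2) + I(x2^2) + I(x1 * x2),
              data = dat, family = binomial)

# Assumption: classify as 1 when the predicted probability exceeds 0.5
prob_nl <- predict(fit_nl, newdata = dat, type = "response")
pred_nl <- ifelse(prob_nl > 0.5, 1, 0)

# Observations colored by predicted class label
plot(dat$x1, dat$x2,
     col = ifelse(pred_nl == 1, "blue", "red"),
     pch = 19, cex = 0.6,
     xlab = "X1", ylab = "X2",
     main = "Logistic regression, non-linear terms")

mean(pred_nl == y)  # training accuracy
```

The model is still linear in its coefficients; the curvature in the boundary comes entirely from the transformed predictors.
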
(G) Fit a support vector classifier to the data with $X_1$ and $X_2$ as predictors. Obtain a class prediction for each training observation. Plot the observations, colored according to the predicted class labels.
(H) Fit an SVM using a non-linear kernel to the data. Obtain a class prediction for each training observation. Plot the observations, colored according to the predicted class labels.
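
For parts (G) and (H), one option is the `svm()` function from the `e1071` package. The sketch below assumes that package is installed (it skips the fits otherwise); the choice of a radial kernel and the `cost` and `gamma` values are illustrative assumptions, not tuned settings.

```r
set.seed(1)  # assumption: fixed seed for reproducibility
x1 <- runif(500) - 0.5
x2 <- runif(500) - 0.5
y  <- 1 * (x1^2 - x2^2 > 0)
dat <- data.frame(x1 = x1, x2 = x2, y = factor(y))
class_cols <- c("red", "blue")  # level "0" = red, level "1" = blue

if (requireNamespace("e1071", quietly = TRUE)) {
  # (G) Support vector classifier: linear kernel (cost value is an assumption)
  svm_lin <- e1071::svm(y ~ x1 + x2, data = dat,
                        kernel = "linear", cost = 0.1)
  pred_svm_lin <- predict(svm_lin, dat)

  # (H) SVM with a non-linear (radial) kernel; gamma and cost are assumptions
  svm_rad <- e1071::svm(y ~ x1 + x2, data = dat,
                        kernel = "radial", gamma = 1, cost = 1)
  pred_svm_rad <- predict(svm_rad, dat)

  # Side-by-side plots of the predicted class labels
  old_par <- par(mfrow = c(1, 2))
  plot(dat$x1, dat$x2, col = class_cols[pred_svm_lin],
       pch = 19, cex = 0.6, xlab = "X1", ylab = "X2",
       main = "SVC (linear kernel)")
  plot(dat$x1, dat$x2, col = class_cols[pred_svm_rad],
       pch = 19, cex = 0.6, xlab = "X1", ylab = "X2",
       main = "SVM (radial kernel)")
  par(old_par)  # restore the previous plotting layout
} else {
  message("Package 'e1071' is not installed; skipping the SVM sketch.")
}
```

Comparing these two plots with the two logistic regression plots is a natural basis for the comments requested in part (I).
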
(I) Comment on your results.
```{r}
# INSERT CODE HERE
```
**ANSWER:**