Instructions

What should I bring to tutorial on January 12?

  • R output (e.g., plots) for Question 2. You can either bring a hardcopy or bring your laptop with the output.

First steps to answering these questions.

file_url <- "https://raw.githubusercontent.com/ntaback/UofT_STA130/master/week1/Week1PracticeProblems-student.Rmd"
download.file(url = file_url , destfile = "Week1PracticeProblems-student.Rmd")

Look for the file “Week1PracticeProblems-student.Rmd” under the Files tab then click on it to open.

  • Change the subtitle to “Week 1 Practice Problems Solutions” and change the author to your name and student number.

  • Type your answers below each question. Remember that R code chunks can be inserted directly into the notebook by choosing Insert R from the Insert menu (see Using R Markdown for Class Assignments). In addition this R Markdown cheatsheet, and reference are great resources as you get started with R Markdown.

  • If you are NOT working on https://rstudio.chass.utoronto.ca/ then you will need to install several libraries such as tidyverse, mdsr and mosaic to complete the questions.

Practice Problems

Question 1

Exercise 3.1 in the textbook uses data that come with R. The dataset is in the mosaic package, which you must first load with the command library(mosaic). The name of the dataframe is Galton.

  1. Construct the plots that you are asked to construct in Exercise 3.1.

  2. Name three additional plots that would be interesting to examine.

Question 2

Bring your output for this question to tutorial on Friday January 12 (either a hardcopy or on your laptop). For this question, we will use the data in Exercise 3.4 in the texbook. You can read more about the data and the variables here: https://rdrr.io/cran/mosaicData/man/Marriage.html.

  1. Construct at least two plots that each show the distribution of one categorical variable.

  2. Construct at least two plots that each show the distribution of one quantitative variable.

  3. Construct a plot that shows the relationship two between variables. What can you say about the relationship?

  4. Can you construct a plot using three variables? four variables? If you can, construct them!

Question 3

For this exercise, you will load data from an external source. You can read about the data here: http://sta220.utstat.utoronto.ca/data/the-skeleton-data/.

The data are in a plain text file with spaces between columns here: http://stats.onlinelearning.utoronto.ca/wp-content/uploaded/Data/SkeletonDatacomplete.txt. The following code will load the data into a tibble (the tidyverse version of a data frame).

  1. Read the data into R using the following code.
library(tidyverse)
data_url <- "http://stats.onlinelearning.utoronto.ca/wp-content/uploaded/Data/SkeletonDatacomplete.txt"
skeleton_data <- read_table(data_url)

Inspect the data to make sure it is read in completely. You can compare by going directly to the data_url.

  1. Construct at least four interesting graphs with the data, including: a graph of one categorical variable, a graph of one quantitative variable, a graph with at least two variables, a graph with at least three variables.

  2. Describe what you learned about the data from your graphs.

R Markdown source