HD 5514 Research Methods (Fall 2019)

In-Class Activity: Correlation

Load Data

We are going to use the cars data set, which is included in R by default. We will load and print the cars data.
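As a minimal sketch of this step, the data set can be loaded and inspected as follows. The calls to str() and head() are optional additions for a quicker overview; they are not required by the activity.

```r
# cars ships with base R (package "datasets"), so nothing needs to be installed
data(cars)

str(cars)    # structure: number of observations and the type of each column
head(cars)   # first six rows
print(cars)  # the full data set
```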

Visualize your data using a scatter plot

Find the data columns named speed (numeric, speed in mph) and dist (numeric, stopping distance in ft) in the cars data set. Create a scatter plot to visualize the relationship between the two variables.
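One possible way to draw the scatter plot uses the base R function plot(). The axis labels, title, and point symbol below are illustrative choices, not part of the activity.

```r
# Scatter plot of stopping distance against speed
plot(cars$speed, cars$dist,
     xlab = "Speed (mph)",
     ylab = "Stopping distance (ft)",
     main = "Stopping distance vs. speed",
     pch  = 19)  # filled circles
```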

Calculate correlation

Calculate the correlation between the two variables speed and dist using the cor function. Test the association for statistical significance using the cor.test function. The default is Pearson's correlation, but for nonparametric tests you can use Spearman's rank correlation (rho) with method="spearman" or Kendall's tau with method="kendall".
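The calls described above might look like the sketch below. The object names are hypothetical, and exact = FALSE is an added assumption: the cars data contain tied values, so an exact p-value cannot be computed for the rank-based tests and R would otherwise issue a warning.

```r
# Pearson's correlation coefficient (the default method)
r_pearson <- cor(cars$speed, cars$dist)
r_pearson

# Significance test for Pearson's correlation
test_pearson <- cor.test(cars$speed, cars$dist)
test_pearson

# Nonparametric alternatives; exact = FALSE requests the asymptotic p-value
test_spearman <- cor.test(cars$speed, cars$dist, method = "spearman", exact = FALSE)
test_kendall  <- cor.test(cars$speed, cars$dist, method = "kendall",  exact = FALSE)
test_spearman
test_kendall
```

The estimate and p-value can be pulled out of any of these results with, for example, test_pearson$estimate and test_pearson$p.value.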

In-Class Activity: Simple Linear Regression

Load Data

We are going to use the cars data set, which is included in R by default. We will load and print the cars data.

Visualize your data using a scatter plot

Find the data columns named speed (numeric, speed in mph) and dist (numeric, stopping distance in ft) in the cars data set. Create a scatter plot to visualize the linear relationship between the two variables.

Visualize your data using a boxplot and check for outliers

Use the boxplot function to create a boxplot for each variable in the cars data set: speed (numeric, speed in mph) and dist (numeric, stopping distance in ft). Check whether there are any outliers based on the 1.5 x interquartile range (1.5 * IQR) rule.
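A sketch of this check is shown below. Besides drawing the boxplots, it uses boxplot.stats() to list the points that the boxplot flags; this helper is an addition to the activity. Note that boxplot.stats() measures the box with hinges, which can differ slightly from the quartiles returned by quantile().

```r
# Draw the two boxplots side by side
old_par <- par(mfrow = c(1, 2))
boxplot(cars$speed, main = "Speed (mph)")
boxplot(cars$dist,  main = "Stopping distance (ft)")
par(old_par)  # restore the previous plot layout

# Values lying more than 1.5 * IQR beyond the box (default coef = 1.5)
speed_outliers <- boxplot.stats(cars$speed)$out
dist_outliers  <- boxplot.stats(cars$dist)$out
speed_outliers
dist_outliers
```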

Check the normality of the response variable using a histogram

Create a histogram with the function hist(). You can change the number of bins using the breaks= argument.
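For illustration, the histogram of the response variable dist could be drawn as below. The value breaks = 20 is an arbitrary example; R treats a single number given to breaks as a suggestion and may pick a nearby "pretty" number of bins.

```r
# Histogram with the default number of bins
hist(cars$dist,
     xlab = "Stopping distance (ft)",
     main = "Histogram of dist")

# Same histogram with a finer binning (breaks is only a suggestion)
h <- hist(cars$dist, breaks = 20,
          xlab = "Stopping distance (ft)",
          main = "Histogram of dist, breaks = 20")
```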

Check the normality of the response variable using a density plot

Create a density plot by passing the output of the function density() to plot().
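A minimal sketch: density() only computes the kernel density estimate, so its result is handed to plot() to draw the curve. The object name dens and the labels are illustrative.

```r
# Kernel density estimate of the response variable
dens <- density(cars$dist)

# Draw the estimated density curve
plot(dens,
     xlab = "Stopping distance (ft)",
     main = "Density of dist")
```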

Check the normality of the response variable using a QQ plot

Create a QQ plot with the function qqnorm(). A quantile-quantile plot (QQ plot) compares the quantiles of a given sample with those of the normal distribution; points that fall close to a straight line suggest approximate normality.
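As a sketch, the QQ plot for dist can be produced as follows. The reference line from qqline() is an optional addition that makes departures from normality easier to see.

```r
# Sample quantiles of dist against theoretical normal quantiles
qq <- qqnorm(cars$dist, main = "Normal QQ plot of dist")

# Reference line through the first and third quartiles
qqline(cars$dist)
```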

Fit a simple linear regression model

Fit a linear model with the function lm(). Regress the outcome variable dist on the explanatory variable speed.
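The model fit might be written as in this sketch. The object name fit is hypothetical, and overlaying the fitted line on the scatter plot is an optional extra.

```r
# Regress stopping distance on speed
fit <- lm(dist ~ speed, data = cars)

# Estimated intercept and slope
coef(fit)

# Optional: scatter plot with the fitted regression line
plot(dist ~ speed, data = cars,
     xlab = "Speed (mph)", ylab = "Stopping distance (ft)")
abline(fit)
```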

Produce a summary of the model fitting

Produce a summary of the results of the model fitting using the summary function.

Checking for statistical significance

Check the coefficient estimates and their p-values, the F-statistic, and R-squared in the summary output.
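The sketch below shows one way to print the summary and to extract the quantities of interest individually. The object names are hypothetical; the overall p-value is recomputed with pf() because the summary object stores the F-statistic and its degrees of freedom but not the p-value itself.

```r
fit <- lm(dist ~ speed, data = cars)
s <- summary(fit)
s  # full printed summary

# Coefficient table: estimate, standard error, t value, p-value
s$coefficients

# Proportion of variance in dist explained by speed
s$r.squared

# Overall F test: statistic, numerator and denominator degrees of freedom
f <- s$fstatistic
p_overall <- pf(f[["value"]], f[["numdf"]], f[["dendf"]], lower.tail = FALSE)
p_overall
```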

In-Class Activity: Multiple Linear Regression

Load Data

We are going to use the mtcars data set, which is included in R by default. We will load and print the mtcars data.

Fit a multiple linear regression model

Fit a linear model with the function lm(). Regress the outcome variable mpg (Miles/(US) gallon) on the explanatory variables cyl (Number of cylinders), wt (Weight (1000 lbs)), and vs (Engine (0 = V-shaped, 1 = straight)).
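A minimal sketch of this model, with a hypothetical object name. In mtcars, cyl and vs are stored as numbers, so this formula treats them as numeric predictors; wrapping them in factor() would be a different modeling choice that the activity does not ask for.

```r
# mtcars also ships with base R
data(mtcars)

# Regress mpg on number of cylinders, weight, and engine shape
fit_multi <- lm(mpg ~ cyl + wt + vs, data = mtcars)

# Intercept plus one estimated coefficient per explanatory variable
coef(fit_multi)
```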

Produce a summary of the model fitting

Produce a summary of the results of the model fitting using the summary function.
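For the multiple regression, the summary can be read in the same way as for the simple model. The sketch below also extracts the adjusted R-squared, which is an addition here: it penalizes the number of predictors and is therefore often reported alongside R-squared for models with several explanatory variables.

```r
fit_multi <- lm(mpg ~ cyl + wt + vs, data = mtcars)
s_multi <- summary(fit_multi)
s_multi  # full printed summary

# One row per term: estimate, standard error, t value, p-value
s_multi$coefficients

# R-squared and adjusted R-squared
s_multi$r.squared
s_multi$adj.r.squared
```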

Assignment 9 (Week 13)

Read Data

We will use the built-in data set Prestige, which is contained in the R package car. We will load and print the Prestige data.
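A hedged sketch of the loading step: in recent versions of car, the example data sets live in the companion package carData, which is installed together with car. The code below assumes that layout and checks that carData is available before using it.

```r
# Prestige is provided by carData, which car depends on
if (requireNamespace("carData", quietly = TRUE)) {
  data("Prestige", package = "carData")
  str(Prestige)    # variables and their types
  print(Prestige)  # the full data set
} else {
  message("Install the package first: install.packages(\"car\")")
}
```

Once the package is installed, help("Prestige", package = "carData") opens the documentation for the data set.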

Use the help function to learn about variables

Use the help function if you want to learn more about the Prestige data set and its variables.

Check three data frame columns (income, education, prestige)

First, find the data column named income in the Prestige data set.

Next, find another data column, named education, in the Prestige data set.

Lastly, find the data column named prestige in the Prestige data set.

Visualize your variable and check for outliers

Use the boxplot function to create a boxplot for each variable in the Prestige data set: income (Average income of incumbents, dollars, in 1971), education (Average education of occupational incumbents, years, in 1971), and prestige (Pineo-Porter prestige score for occupation, from a social survey conducted in the mid-1960s). Check whether there are any outliers based on the 1.5 x interquartile range (1.5 * IQR) rule.
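To make the 1.5 * IQR rule explicit, the sketch below defines a small helper, iqr_outliers(), which is hypothetical and not part of any package. It uses the quartiles from quantile(), so its fences can differ slightly from the hinge-based ones that boxplot() draws. The Prestige part assumes the carData package is installed.

```r
# Return the values of x lying more than 1.5 * IQR outside the quartiles
iqr_outliers <- function(x) {
  x <- x[!is.na(x)]
  q <- unname(quantile(x, c(0.25, 0.75)))
  fence <- 1.5 * (q[2] - q[1])
  x[x < q[1] - fence | x > q[2] + fence]
}

if (requireNamespace("carData", quietly = TRUE)) {
  data("Prestige", package = "carData")
  vars <- c("income", "education", "prestige")

  # One boxplot per variable
  old_par <- par(mfrow = c(1, 3))
  for (v in vars) boxplot(Prestige[[v]], main = v)
  par(old_par)

  # Flagged values for each variable
  print(lapply(Prestige[vars], iqr_outliers))
}
```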

Check the normality of a potential response variable using a histogram

Create a histogram with the function hist(). You can change the number of bins using the breaks= argument.

Check the normality of a potential response variable using a density plot

Create a density plot by passing the output of the function density() to plot().

Check the normality of a potential response variable using a QQ plot

Create a QQ plot with the function qqnorm(). A quantile-quantile plot (QQ plot) compares the quantiles of a given sample with those of the normal distribution; points that fall close to a straight line suggest approximate normality.

Check the normality of the response variable using a histogram

Create a histogram with the function hist(). You can change the number of bins using the breaks= argument.

Check the normality of the response variable using a density plot

Create a density plot by passing the output of the function density() to plot().

Check the normality of the response variable using a QQ plot

Create a QQ plot with the function qqnorm(). A quantile-quantile plot (QQ plot) compares the quantiles of a given sample with those of the normal distribution; points that fall close to a straight line suggest approximate normality.

Fit a simple linear regression model

Fit a linear model with the function lm(). Regress the outcome variable prestige on the explanatory variable education.

Produce a summary of the model fitting

Produce a summary of the results of the model fitting using the summary function.
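The fitting and summary steps for the assignment follow the same pattern as the in-class activity. The sketch below uses a hypothetical object name and assumes the carData package is installed.

```r
if (requireNamespace("carData", quietly = TRUE)) {
  data("Prestige", package = "carData")

  # Regress the prestige score on years of education
  fit_prestige <- lm(prestige ~ education, data = Prestige)

  # Coefficients, F-statistic, and R-squared
  print(summary(fit_prestige))
}
```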

Koeun Choi

November 20, 2019