HD 5514 Research Methods (Fall 2019)
Analysis of Two-Way Tables (W11)
In-Class Activity
Load Data
We are going to use the mtcars data set, which is included in R by default. We will load and print the mtcars data.
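A minimal sketch of the loading step. Because mtcars ships with base R (in the datasets package), calling data() is optional, but it makes the step explicit.

```r
# Load the built-in mtcars data set into the workspace
data(mtcars)
```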
Print Data
We can view the content of the mtcars data set.
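One way to view the contents is to type the object name. As an alternative, head() shows only the first six rows, which is easier to read for larger data sets.

```r
# Print the full data set
mtcars

# Print only the first six rows
head(mtcars)
```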
Check Data
Check the number of rows and columns. (Learn more about Data Types and Objects in R: https://msu.edu/~lixue/geo866/lab02/data_type.html)
# Display the internal structure of an R object.
str(mtcars)
# Check the dimension of an object
dim(mtcars)
# Number of rows (observations)
nrow(mtcars)
# Number of columns (variables)
ncol(mtcars)
If you want to learn more about the mtcars data set, you can bring up its help file. Help files are available for both functions and data sets.
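Either of the following calls should open the help page, which describes each variable in the data set.

```r
# Open the help page for the mtcars data set
?mtcars

# Equivalent call using the help() function
help("mtcars")
```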
Form a Contingency Table
We will learn how to conduct a chi-squared test on the gear (number of forward gears) and cyl (number of cylinders) columns in the mtcars data set.
First, let’s form the contingency table. The table() function returns a contingency table of the counts at each combination of factor levels.
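A short sketch of this step. The object name gear_cyl is only an illustrative choice and does not come from the original handout.

```r
# Cross-tabulate forward gears (rows) by cylinders (columns)
gear_cyl <- table(mtcars$gear, mtcars$cyl)
gear_cyl
```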
Conduct Chi-Squared Test
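Now, we will conduct the chi-squared test using the chisq.test() function. We also set correct=FALSE to turn off Yates’ continuity correction. (R applies this correction only to 2-by-2 tables, so the setting has no effect on the 3-by-3 table here.)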
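The test could be run as sketched below. The names gear_cyl and result are illustrative. R may print a warning that the chi-squared approximation may be incorrect, because some expected counts in this table are small.

```r
# Build the contingency table of gears by cylinders
gear_cyl <- table(mtcars$gear, mtcars$cyl)

# Chi-squared test of independence without Yates' continuity correction
result <- chisq.test(gear_cyl, correct = FALSE)
result
```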
Get Expected Counts
To get a table of expected counts, extract the expected component of the object returned by chisq.test().
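A minimal sketch, again using the illustrative name gear_cyl for the contingency table:

```r
# Build the contingency table of gears by cylinders
gear_cyl <- table(mtcars$gear, mtcars$cyl)

# Expected counts under the null hypothesis of independence
expected <- chisq.test(gear_cyl, correct = FALSE)$expected
expected
```

Cells with expected counts below 5 are the ones that trigger the warning from chisq.test().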
Conduct Fisher’s Exact Test
The chi-squared test is used to test the association between two categorical variables when the expected cell counts are large. Fisher’s exact test is used when the sample size is small (or when some expected cell counts are below 5).
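Fisher’s exact test is available in base R as fisher.test(), which also accepts tables larger than 2-by-2. A sketch for the same gear by cyl table (the object names are illustrative):

```r
# Build the contingency table of gears by cylinders
gear_cyl <- table(mtcars$gear, mtcars$cyl)

# Fisher's exact test of independence
fisher_result <- fisher.test(gear_cyl)
fisher_result
```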
Assignment 7 (Week 11)
Read Data
We will use the built-in data set survey in the MASS package.
# Install the MASS package if you haven't (remove # from the code below)
#install.packages("MASS")
# Load the MASS Package
library(MASS)
We will load and print the survey data.
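A sketch of this step, assuming the MASS package is installed:

```r
# Load the MASS package, which contains the survey data set
library(MASS)

# Load the survey data into the workspace
data(survey)

# Print the first six rows (type survey to print all rows)
head(survey)
```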
Check Data
Check the number of rows and columns.
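The same checks used for mtcars apply here. This sketch assumes the MASS package is installed.

```r
library(MASS)

# Display the internal structure of the survey data
str(survey)

# Dimensions, number of rows, and number of columns
dim(survey)
nrow(survey)
ncol(survey)
```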
Form a Contingency Table
Now, we will conduct a chi-squared test on the Smoke (how much the student smokes) and Exer (how often the student exercises) columns in the survey data set.
First, let’s form the contingency table.
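A short sketch of this step. The name smoke_exer is illustrative. Note that table() drops missing values by default, so the counts may sum to fewer than the number of rows in survey.

```r
library(MASS)

# Cross-tabulate smoking habit (rows) by exercise frequency (columns)
smoke_exer <- table(survey$Smoke, survey$Exer)
smoke_exer
```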
Conduct Chi-Squared Test
Next, we will conduct the chi-squared test using the chisq.test() function. We also set correct=FALSE to turn off Yates’ continuity correction.
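The call could look like the sketch below (the object names are illustrative). R is expected to print a warning along with the test result.

```r
library(MASS)

# Build the contingency table of smoking habit by exercise frequency
smoke_exer <- table(survey$Smoke, survey$Exer)

# Chi-squared test of independence without Yates' continuity correction
survey_result <- chisq.test(smoke_exer, correct = FALSE)
survey_result
```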
Combine columns
The chi-squared test above produces a warning ("Chi-squared approximation may be incorrect") because some expected cell counts in the contingency table are small. We can combine the second and third columns to avoid the warning.
First, we will save the contingency table as tbl.
# Save the contingency table as tbl
tbl <- table(survey$Smoke, survey$Exer)
tbl
# We can apply the chisq.test function to the contingency table tbl
chisq.test(tbl)
Next, combine the second and third columns of tbl. Save it in a new table named ctbl.
# Combine the second (None) and third (Some) columns; keep Freq as is
ctbl <- cbind(tbl[, "Freq"], tbl[, "None"] + tbl[, "Some"])
ctbl
We can now apply the chisq.test function to the contingency table ctbl.
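A self-contained sketch of this final step. It rebuilds tbl and ctbl so that it can be run on its own, and the name combined_result is illustrative.

```r
library(MASS)

# Rebuild the contingency table and combine the None and Some columns
tbl <- table(survey$Smoke, survey$Exer)
ctbl <- cbind(tbl[, "Freq"], tbl[, "None"] + tbl[, "Some"])

# Chi-squared test on the combined table
combined_result <- chisq.test(ctbl)
combined_result
```

With the two columns combined, the table has four rows and two columns, so the test has 3 degrees of freedom.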