Journal articles and books often present information in contingency tables. This document briefly considers how to store the information from a contingency table as a data frame. Consider the contingency table mat, stored as a matrix.
mat <- matrix(data = c(56, 35, 61, 43, 54, 61, 21, 42, 8, 19), nrow = 2)
dimnames(mat) <- list(gender = c("girls", "boys"), ability = c("hopeless", "belowavg", "average", "aboveavg", "superior"))
mat
ability
gender hopeless belowavg average aboveavg superior
girls 56 61 54 21 8
boys 35 43 61 42 19
class(mat)
[1] "matrix"
One approach to creating a data frame from a contingency table is to use the expand.dft function from the vcdExtra package. First, convert the matrix to a table.
matT <- as.table(mat)
matT
ability
gender hopeless belowavg average aboveavg superior
girls 56 61 54 21 8
boys 35 43 61 42 19
class(matT)
[1] "table"
Next, convert the table to a data frame of frequencies with as.data.frame.
matDF <- as.data.frame(matT)
matDF
gender ability Freq
1 girls hopeless 56
2 boys hopeless 35
3 girls belowavg 61
4 boys belowavg 43
5 girls average 54
6 boys average 61
7 girls aboveavg 21
8 boys aboveavg 42
9 girls superior 8
10 boys superior 19
class(matDF)
[1] "data.frame"
Finally, expand the frequency data frame to one row per observation with the expand.dft function.
DF <- vcdExtra::expand.dft(matDF)
head(DF)
gender ability
1 girls hopeless
2 girls hopeless
3 girls hopeless
4 girls hopeless
5 girls hopeless
6 girls hopeless
class(DF)
[1] "data.frame"
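If vcdExtra is not installed, the same expansion can be sketched in base R by repeating each row of the frequency data frame Freq times. The following is a minimal sketch and not how expand.dft itself is implemented; the names idx and DF2 are hypothetical, and the columns here stay factors with their original level order.

```r
# Rebuild the frequency data frame from the example (same counts as above)
mat <- matrix(c(56, 35, 61, 43, 54, 61, 21, 42, 8, 19), nrow = 2,
              dimnames = list(gender = c("girls", "boys"),
                              ability = c("hopeless", "belowavg", "average",
                                          "aboveavg", "superior")))
matDF <- as.data.frame(as.table(mat))

# Repeat each row index Freq times, then keep only the classifying columns
idx <- rep(seq_len(nrow(matDF)), times = matDF$Freq)   # hypothetical name
DF2 <- matDF[idx, c("gender", "ability")]              # hypothetical name
rownames(DF2) <- NULL   # discard the duplicated row names

nrow(DF2)               # one row per child, so this equals sum(mat)
```

Because every cell count becomes that many identical rows, tabulating DF2 by gender and ability recovers the original counts.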
Consider creating a contingency table from the data frame DF.
CT <- xtabs(~gender + ability, data = DF)
CT
ability
gender aboveavg average belowavg hopeless superior
boys 42 61 43 35 19
girls 21 54 61 56 8
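Note that the rows and columns of CT are sorted alphabetically, not in the order of the original table. One possible way to restore the original order is to convert both columns to factors with explicit levels before calling xtabs. The sketch below rebuilds a stand-in for DF in base R, assuming its columns are stored as character vectors; CT2 is a hypothetical name.

```r
# Original table, used here only as the source of counts and level order
mat <- matrix(c(56, 35, 61, 43, 54, 61, 21, 42, 8, 19), nrow = 2,
              dimnames = list(gender = c("girls", "boys"),
                              ability = c("hopeless", "belowavg", "average",
                                          "aboveavg", "superior")))
matDF <- as.data.frame(as.table(mat))

# Stand-in for DF: one row per child, character columns (an assumption)
DF <- data.frame(
  gender  = rep(as.character(matDF$gender),  times = matDF$Freq),
  ability = rep(as.character(matDF$ability), times = matDF$Freq),
  stringsAsFactors = FALSE
)

# Set the factor levels explicitly so xtabs does not sort alphabetically
DF$gender  <- factor(DF$gender,  levels = rownames(mat))
DF$ability <- factor(DF$ability, levels = colnames(mat))

CT2 <- xtabs(~ gender + ability, data = DF)   # hypothetical name
CT2
```

Reordering rows or columns does not change the chi-squared statistic, so this step only affects how the table is displayed.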
chisq.test(CT)
Pearson's Chi-squared test
data: CT
X-squared = 19.869, df = 4, p-value = 0.00053
A permutation test gives an alternative p-value: shuffle the gender labels, recompute the chi-squared statistic for each shuffled table, and count how often it is at least as large as the observed statistic.
set.seed(2)
N <- 10^4 - 1 # Change this for slower computers
result <- numeric(N)
for (i in 1:N) {
T2 <- xtabs(~ sample(gender) + ability, data = DF)
result[i] <- chisq.test(T2)$statistic
}
obs <- chisq.test(xtabs(~ gender + ability, data = DF))$statistic
obs
X-squared
19.86911
pvalue <- (sum(result >= obs) + 1)/(N + 1)
pvalue
[1] 5e-04
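As a cross-check on the loop above, chisq.test can also produce a Monte Carlo p-value directly through its simulate.p.value and B arguments. This is a sketch of an alternative, not the method used above: it samples random tables with the observed margins instead of shuffling gender row by row, so the p-value should be of similar size but need not match exactly. The name sim is hypothetical.

```r
# Same counts as the example table
mat <- matrix(c(56, 35, 61, 43, 54, 61, 21, 42, 8, 19), nrow = 2,
              dimnames = list(gender = c("girls", "boys"),
                              ability = c("hopeless", "belowavg", "average",
                                          "aboveavg", "superior")))

set.seed(2)
# B is the number of simulated tables; 10^4 - 1 mirrors N in the loop above
sim <- chisq.test(as.table(mat), simulate.p.value = TRUE, B = 10^4 - 1)

sim$statistic   # observed chi-squared statistic, unaffected by the simulation
sim$p.value     # Monte Carlo p-value
```

This version avoids rebuilding the table with xtabs on every iteration, which makes it a convenient option when the explicit loop is slow.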