Creating Data Frames from Contingency Tables

Journal articles and books often present information in contingency tables. This document briefly shows how to store the information from a contingency table as a data frame with one row per observation. Consider the contingency table mat, stored as a matrix.

mat <- matrix(data = c(56, 35, 61, 43, 54, 61, 21, 42, 8, 19), nrow = 2)
dimnames(mat) <- list(gender = c("girls", "boys"), ability = c("hopeless", "belowavg", "average", "aboveavg", "superior"))
mat

       ability
gender  hopeless belowavg average aboveavg superior
  girls       56       61      54       21        8
  boys        35       43      61       42       19

class(mat)

[1] "matrix"

One approach to creating a data frame from a contingency table is to use the expand.dft function from the vcdExtra package.

If the data are stored in a matrix, first convert the matrix to class table using as.table.

matT <- as.table(mat)
matT

       ability
gender  hopeless belowavg average aboveavg superior
  girls       56       61      54       21        8
  boys        35       43      61       42       19

class(matT)

[1] "table"

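Convert the table object to a data frame using as.data.frame. The result is in frequency form: one row per cell of the table, with the cell counts in the column Freq.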

matDF <- as.data.frame(matT)
matDF

   gender  ability Freq
1   girls hopeless   56
2    boys hopeless   35
3   girls belowavg   61
4    boys belowavg   43
5   girls  average   54
6    boys  average   61
7   girls aboveavg   21
8    boys aboveavg   42
9   girls superior    8
10   boys superior   19

class(matDF)

[1] "data.frame"

Convert the frequency-form data frame to a case-form data frame (one row per observation) using the expand.dft function.

DF <- vcdExtra::expand.dft(matDF)
head(DF)

  gender  ability
1  girls hopeless
2  girls hopeless
3  girls hopeless
4  girls hopeless
5  girls hopeless
6  girls hopeless

class(DF)

[1] "data.frame"
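For readers who prefer not to depend on vcdExtra, the same expansion can be sketched in base R by repeating each row of the frequency-form data frame Freq times. This is an illustration of the idea, not the implementation of expand.dft; the names freq_df, idx, and case_df are chosen here for the example and do not appear elsewhere in this document.

```r
# Rebuild the frequency-form data frame from the counts used in the text.
mat <- matrix(c(56, 35, 61, 43, 54, 61, 21, 42, 8, 19), nrow = 2)
dimnames(mat) <- list(gender = c("girls", "boys"),
                      ability = c("hopeless", "belowavg", "average",
                                  "aboveavg", "superior"))
freq_df <- as.data.frame(as.table(mat))

# Repeat each row index Freq times, then keep only the classifying columns.
idx <- rep(seq_len(nrow(freq_df)), times = freq_df$Freq)
case_df <- freq_df[idx, c("gender", "ability")]
rownames(case_df) <- NULL  # replace the repeated row names with 1, 2, ...

nrow(case_df)  # one row per observation: 400, the sum of the counts
head(case_df)
```

Because as.data.frame returns gender and ability as factors, case_df keeps the level order of the original table (girls before boys, hopeless through superior).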

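To check the result, create a contingency table from the data frame DF. The counts match those in mat, although the rows and columns now appear in alphabetical order.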

CT <- xtabs(~gender + ability, data = DF)
CT

       ability
gender  aboveavg average belowavg hopeless superior
  boys        42      61       43       35       19
  girls       21      54       61       56        8
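If the original row and column order matters for presentation, one option is to set the factor levels explicitly before tabulating. The sketch below is self-contained and builds its own case-form data frame (case_df, a name used only in this example) with character columns, so that it starts from the same alphabetical default seen in CT.

```r
# Counts from the text, with the intended row and column order.
mat <- matrix(c(56, 35, 61, 43, 54, 61, 21, 42, 8, 19), nrow = 2)
dimnames(mat) <- list(gender = c("girls", "boys"),
                      ability = c("hopeless", "belowavg", "average",
                                  "aboveavg", "superior"))

# Case-form data frame with character columns (tabulated alphabetically by default).
freq_df <- as.data.frame(as.table(mat), stringsAsFactors = FALSE)
case_df <- freq_df[rep(seq_len(nrow(freq_df)), times = freq_df$Freq),
                   c("gender", "ability")]

# Impose the original ordering through explicit factor levels.
case_df$gender  <- factor(case_df$gender,  levels = rownames(mat))
case_df$ability <- factor(case_df$ability, levels = colnames(mat))

CT2 <- xtabs(~ gender + ability, data = case_df)
CT2
```

Reordering the levels changes only the layout of the table; the chi-squared statistic does not depend on the order of rows or columns.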

chisq.test(CT)


    Pearson's Chi-squared test

data:  CT
X-squared = 19.869, df = 4, p-value = 0.00053
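To make the reported statistic less of a black box, the following sketch recomputes it from the definition: the expected count in each cell is the row total times the column total divided by the grand total, and the statistic sums (observed - expected)^2 / expected over all cells. The variable names are chosen for this example only.

```r
# Observed counts from the text.
observed <- matrix(c(56, 35, 61, 43, 54, 61, 21, 42, 8, 19), nrow = 2)
dimnames(observed) <- list(gender = c("girls", "boys"),
                           ability = c("hopeless", "belowavg", "average",
                                       "aboveavg", "superior"))

# Expected counts under independence: (row total * column total) / grand total.
n <- sum(observed)
expected <- outer(rowSums(observed), colSums(observed)) / n

# Pearson chi-squared statistic, degrees of freedom, and upper-tail p-value.
stat <- sum((observed - expected)^2 / expected)
df <- (nrow(observed) - 1) * (ncol(observed) - 1)
p_value <- pchisq(stat, df = df, lower.tail = FALSE)

c(statistic = stat, df = df, p.value = p_value)
```

The statistic and degrees of freedom agree with the chisq.test output above (19.869 on 4 degrees of freedom).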

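Randomization Test

If gender and ability are independent, the gender labels are exchangeable. Permuting the labels many times and recomputing the chi-squared statistic each time approximates the distribution of the statistic under the null hypothesis.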

set.seed(2)
N <- 10^4 - 1 # number of permutations; reduce this on slower computers
result <- numeric(N)
for (i in 1:N) {
  # Permute the gender labels and recompute the chi-squared statistic
  T2 <- xtabs(~ sample(gender) + ability, data = DF)
  result[i] <- chisq.test(T2)$statistic
}
obs <- chisq.test(xtabs(~ gender + ability, data = DF))$statistic
obs

X-squared 
 19.86911

pvalue <- (sum(result >= obs) + 1)/(N + 1)
pvalue

[1] 5e-04
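As a cross-check on the loop above, chisq.test can produce a Monte Carlo p-value itself through its simulate.p.value and B arguments. It samples tables with the observed row and column totals held fixed, which is the same constraint that permuting the gender labels imposes. The sketch below works directly from the counts; the seed and the object name mc are choices made for this example, and the simulated p-value will not match the permutation result digit for digit.

```r
# Counts from the text.
mat <- matrix(c(56, 35, 61, 43, 54, 61, 21, 42, 8, 19), nrow = 2)
dimnames(mat) <- list(gender = c("girls", "boys"),
                      ability = c("hopeless", "belowavg", "average",
                                  "aboveavg", "superior"))

set.seed(2)
# B is the number of simulated tables; 10^4 - 1 mirrors N in the loop above.
mc <- chisq.test(mat, simulate.p.value = TRUE, B = 10^4 - 1)

mc$statistic  # same observed statistic as before
mc$p.value    # Monte Carlo p-value
```

This route is usually much faster than an explicit loop over xtabs, at the cost of hiding the permutation step that the loop makes visible.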

Alan Arnholt

Last Updated: 2016-03-21
