Can you spot the difference between a character string and a number? Here’s a test: Which of these are character strings and which are numbers? 1, “1”, “one”.
x <- 1
y <- "1"
z <- "one"
typeof(1)
[1] "double"
typeof(x)
[1] "double"
typeof("1")
[1] "character"
typeof(y)
[1] "character"
typeof("one")
[1] "character"
typeof(z)
[1] "character"
Create an atomic vector that stores just the face names of the cards in a royal flush (ace through ten, all of one suit), for example, the ace of spades, king of spades, queen of spades, jack of spades, and ten of spades. The face name of the ace of spades would be “ace”, and “spades” is the suit. Which type of vector will you use to save the names?
FaceNames <- c("ace", "king", "queen", "jack", "ten")
FaceNames
[1] "ace" "king" "queen" "jack" "ten"
typeof(FaceNames)
[1] "character"
Create the following matrix, which stores the name and suit of every card in a royal flush.
[,1] [,2]
[1,] "ace" "spades"
[2,] "king" "spades"
[3,] "queen" "spades"
[4,] "jack" "spades"
[5,] "ten" "spades"
hand1 <- c("ace", "king", "queen", "jack", "ten", rep("spades", 5))
matrix(hand1, ncol = 2)
[,1] [,2]
[1,] "ace" "spades"
[2,] "king" "spades"
[3,] "queen" "spades"
[4,] "jack" "spades"
[5,] "ten" "spades"
matrix(hand1, nrow = 5)
[,1] [,2]
[1,] "ace" "spades"
[2,] "king" "spades"
[3,] "queen" "spades"
[4,] "jack" "spades"
[5,] "ten" "spades"
dim(hand1) <- c(5, 2)
hand1
[,1] [,2]
[1,] "ace" "spades"
[2,] "king" "spades"
[3,] "queen" "spades"
[4,] "jack" "spades"
[5,] "ten" "spades"
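Note: R matrices are column major, so R fills a matrix down the first column before moving on to the second. That is why the five face names land in column 1 and the five suits in column 2.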
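To make the column-major fill concrete, here is a minimal sketch that builds the same six values into a matrix twice, once with the default fill and once with byrow = TRUE. The names m_col and m_row are illustrative and do not appear elsewhere in this document.

```r
# Default fill: values run down column 1, then column 2, then column 3
m_col <- matrix(1:6, nrow = 2)
m_col

# byrow = TRUE fills across row 1 first, then row 2
m_row <- matrix(1:6, nrow = 2, byrow = TRUE)
m_row

# The same position holds a different value under each fill order
m_col[1, 2]  # 3
m_row[1, 2]  # 2
```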
now <- Sys.time()
now
[1] "2016-02-15 14:47:38 EST"
typeof(now)
[1] "double"
class(now)
[1] "POSIXct" "POSIXt"
SEC <- unclass(now)
SEC
[1] 1455565658
The number stored in SEC represents the number of seconds that have passed between 12:00 AM January 1st 1970, in the Coordinated Universal Time (UTC) zone, and the time stored in now.
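As a check on this interpretation, the following sketch converts the raw count back into a date-time. It assumes the value printed for SEC above; the name restored is hypothetical.

```r
# Seconds since 1970-01-01 00:00:00 UTC, copied from the SEC output
SEC <- 1455565658

# Rebuild a date-time from the count, stating the origin and time zone explicitly
restored <- as.POSIXct(SEC, origin = "1970-01-01", tz = "UTC")

# Print in UTC; 19:47:38 UTC is the same instant as 14:47:38 EST
format(restored, "%Y-%m-%d %H:%M:%S", tz = "UTC")
# "2016-02-15 19:47:38"
```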
gender <- factor(c("male", "female", "female", "male"))
typeof(gender)
[1] "integer"
attributes(gender)
$levels
[1] "female" "male"
$class
[1] "factor"
unclass(gender)
[1] 2 1 1 2
attr(,"levels")
[1] "female" "male"
Many card games assign a numerical value to each card. For example, in blackjack, each face card is worth 10 points, each number card is worth between 2 and 10 points, and each ace is worth 1 or 11 points, depending on the final score. Make a virtual card by combining “ace”, “heart”, and 1 into a vector. What type of atomic vector will result? Check if you are right. Answer: character, because an atomic vector holds a single type and R coerces the number 1 to the string "1".
card <- c("ace", "heart", 1)
typeof(card)
[1] "character"
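The result above follows from R's coercion rules for atomic vectors: when types are mixed, c() converts everything to the most flexible type present. A short sketch of the ordering logical, integer, double, character (the object name mixed is illustrative):

```r
# Each call mixes two types; the more flexible type wins
typeof(c(TRUE, 1L))     # "integer"
typeof(c(1L, 2.5))      # "double"
typeof(c(2.5, "ace"))   # "character"

# The number 1 is stored as the string "1" once it shares a vector with strings
mixed <- c("ace", "heart", 1)
mixed[3]                # "1"
```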
Use a list to store a single playing card, like the ace of hearts, which has a point value of one. The list should save the face of the card, the suit, and the point value in separate elements.
card <- list(face = "ace", suit = "hearts", value = 1)
card
$face
[1] "ace"
$suit
[1] "hearts"
$value
[1] 1
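# stringsAsFactors = TRUE is set explicitly: it was the default before R 4.0.0 and produces the factors shown by str() below
df <- data.frame(face = c("ace", "two", "six"), suit = c("clubs", "clubs", "clubs"), value = c(1, 2, 3), stringsAsFactors = TRUE)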
df
face suit value
1 ace clubs 1
2 two clubs 2
3 six clubs 3
typeof(df)
[1] "list"
class(df)
[1] "data.frame"
str(df)
'data.frame': 3 obs. of 3 variables:
$ face : Factor w/ 3 levels "ace","six","two": 1 3 2
$ suit : Factor w/ 1 level "clubs": 1 1 1
$ value: num 1 2 3
df2 <- data.frame(face = c("ace", "two", "six"), suit = c("clubs", "clubs", "clubs"), value = c(1, 2, 3), stringsAsFactors = FALSE)
df2
face suit value
1 ace clubs 1
2 two clubs 2
3 six clubs 3
typeof(df2)
[1] "list"
class(df2)
[1] "data.frame"
str(df2)
'data.frame': 3 obs. of 3 variables:
$ face : chr "ace" "two" "six"
$ suit : chr "clubs" "clubs" "clubs"
$ value: num 1 2 3
Creating a deck of cards with less typing
Face <- c("king", "queen","jack", "ten","nine","eight","seven","six","five","four","three","two","ace")
Suit <- c("spades","clubs", "diamonds", "hearts")
Value <- 13:1
deck <- data.frame(face = rep(Face, 4), suit = rep(Suit, each = 13), value = rep(Value, 4), stringsAsFactors = FALSE)
library(DT)
datatable(deck)
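The deck construction relies on two different behaviours of rep(): repeating the whole vector (the default, times) and repeating each element in place (each). The sketch below shows the difference on a toy example; faces, suits, and mini are hypothetical names used only here.

```r
# times repeats the whole vector; each repeats every element in place
rep(c("a", "b"), times = 3)  # "a" "b" "a" "b" "a" "b"
rep(c("a", "b"), each = 3)   # "a" "a" "a" "b" "b" "b"

# Pairing one of each style yields every face-suit combination exactly once
faces <- c("king", "queen", "ace")
suits <- c("spades", "hearts")
mini <- data.frame(face = rep(faces, times = length(suits)),
                   suit = rep(suits, each = length(faces)),
                   stringsAsFactors = FALSE)
nrow(mini)             # 6
any(duplicated(mini))  # FALSE
```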
readr::read_csv() and repmis::source_data()
Note: older versions of R could not read from https (Hypertext Transfer Protocol Secure) web sites with read.csv(); the two functions named above can.
site <- "https://gist.githubusercontent.com/garrettgman/9629323/raw/ee5dfc039fd581cb467cc69c226ea2524913c3d8/deck.csv"
deck2 <- readr::read_csv(site)
head(deck2)
face suit value
1 king spades 13
2 queen spades 12
3 jack spades 11
4 ten spades 10
5 nine spades 9
6 eight spades 8
deck1 <- repmis::source_data(url = site, sep = ",", header = TRUE)
Downloading data from: https://gist.githubusercontent.com/garrettgman/9629323/raw/ee5dfc039fd581cb467cc69c226ea2524913c3d8/deck.csv
SHA-1 hash of the downloaded data file is:
a1cdb425b6cd2b030f9538257b7c2a61c6e6c8b1
datatable(deck1)
write.csv(deck1, file = "cards.csv", row.names = FALSE)
We download the deck.csv file from the GitHub URL stored in the character string site and save it as DFcards.csv in the same directory as the current document.
download.file(url = site, destfile = "./DFcards.csv", method = "curl")
list.files()
[1] "cards.csv" "Chapter7.Rmd" "Chapters1and2.html"
[4] "Chapters1and2.Rmd" "Chapters3and4.Rmd" "Chapters5and6.Rmd"
[7] "DFcards.csv" "PackageBuilding.html" "PackageBuilding.Rmd"
[10] "PackagesUsed.bib" "PNG"
head(deck)
face suit value
1 king spades 13
2 queen spades 12
3 jack spades 11
4 ten spades 10
5 nine spades 9
6 eight spades 8
deck[1, 1]
[1] "king"
deck[1, 1:3]
face suit value
1 king spades 13
deck[1:2, 1] # returns a single column
[1] "king" "queen"
deck[1:2, 1, drop = FALSE] # returns a data frame
face
1 king
2 queen
deck[-(2:52), 1:3] # everything except rows 2-52 for cols 1-3
face suit value
1 king spades 13
You can use a blank space to tell R to extract every value in a dimension.
deck[1, ] # same as deck[1, 1:3]
face suit value
1 king spades 13
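Besides positive integers, negative integers, and blanks, the same bracket notation accepts names and logical vectors. A minimal sketch on a small stand-in data frame (cards is a hypothetical three-row example, not the full deck):

```r
cards <- data.frame(face = c("king", "queen", "ace"),
                    suit = c("spades", "spades", "hearts"),
                    value = c(13, 12, 1),
                    stringsAsFactors = FALSE)

# Column by name, all rows (blank row index)
cards[ , "value"]                    # 13 12 1

# Rows by a logical test, all columns (blank column index)
spades <- cards[cards$suit == "spades", ]
nrow(spades)                         # 2

# Logical row index combined with a column name
cards[c(TRUE, FALSE, TRUE), "face"]  # "king" "ace"
```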
set.seed(4)
sims <- 10000
xbar <- numeric(sims)
for(i in 1:sims){
xbar[i] <- mean(runif(50, 0, 10))
}
mean(xbar)
[1] 4.997238
sd(xbar)
[1] 0.4113463
library(ggplot2)
DF <- data.frame(xbar = xbar)
ggplot(data = DF, aes(x = xbar)) +
geom_density(fill = "lightblue") +
stat_function(fun = dnorm, args = list(mean = 5, sd = ((10)/sqrt(12))/sqrt(50)), color = "red") +
theme_bw()
What percent of the values in xbar are between \(5 - (10/\sqrt{12})/\sqrt{50} = 4.5917517\) and \(5 + (10/\sqrt{12})/\sqrt{50} = 5.4082483\)?
LOG <- xbar >= (5 - ((10)/sqrt(12))/sqrt(50)) & xbar <= (5 + ((10)/sqrt(12))/sqrt(50))
head(LOG, n = 10)
[1] FALSE TRUE TRUE TRUE TRUE TRUE TRUE FALSE TRUE TRUE
head(which(LOG), n = 10)
[1] 2 3 4 5 6 7 9 10 11 13
length(xbar[LOG])
[1] 6808
mean(LOG)
[1] 0.6808
# Compare to Normal
pnorm(1) - pnorm(-1)
[1] 0.6826895
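The interval endpoints used above are one standard error on either side of the theoretical mean of the sample mean. The sketch below recomputes them from the parameters of the uniform distribution, using the standard results that a Uniform(a, b) variable has mean (a + b)/2 and standard deviation (b - a)/sqrt(12). The names a, b, n, mu, sigma, and se are introduced here for illustration.

```r
a <- 0    # lower limit passed to runif()
b <- 10   # upper limit passed to runif()
n <- 50   # sample size behind each mean

mu <- (a + b) / 2             # mean of Uniform(0, 10): 5
sigma <- (b - a) / sqrt(12)   # standard deviation of Uniform(0, 10)
se <- sigma / sqrt(n)         # standard error of the mean of n draws

se                            # about 0.4082483, close to sd(xbar)
c(lower = mu - se, upper = mu + se)
```

The simulated sd(xbar) of 0.4113463 and the 68.08% coverage are both close to these theoretical values, which is what the central limit theorem leads one to expect for n = 50.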
Use the preceding ideas to write a shuffle function. shuffle() should take a data frame and return a shuffled copy of the data frame.
shuffle <- function(cards){
index <- sample(dim(cards)[1], size = dim(cards)[1], replace = FALSE)
cards[index, ]
}
deck2 <- shuffle(cards = deck)
deck2[1:5, ]
face suit value
43 ten hearts 10
38 two diamonds 2
24 three clubs 3
10 four spades 4
23 four clubs 4
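One way to convince yourself that shuffle() only reorders rows, without dropping or duplicating any, is to compare the row names before and after. This is a sketch on a six-card stand-in deck; small_deck and shuffled are hypothetical names, and the seed is arbitrary.

```r
# Same definition as above, repeated so the example runs on its own
shuffle <- function(cards){
  index <- sample(dim(cards)[1], size = dim(cards)[1], replace = FALSE)
  cards[index, ]
}

small_deck <- data.frame(face = rep(c("king", "queen", "jack"), times = 2),
                         suit = rep(c("spades", "hearts"), each = 3),
                         stringsAsFactors = FALSE)

set.seed(1)  # arbitrary seed, only for reproducibility
shuffled <- shuffle(small_deck)

# Same number of rows, and every original row appears exactly once
nrow(shuffled) == nrow(small_deck)
setequal(rownames(shuffled), rownames(small_deck))
anyDuplicated(rownames(shuffled)) == 0
```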
deck$value
[1] 13 12 11 10 9 8 7 6 5 4 3 2 1 13 12 11 10 9 8 7 6 5 4
[24] 3 2 1 13 12 11 10 9 8 7 6 5 4 3 2 1 13 12 11 10 9 8 7
[47] 6 5 4 3 2 1
mean(deck$value)
[1] 7
LST <- list(numbers = c(1, 2, 3, 4, 5), logical = c(TRUE, FALSE, TRUE), strings = c("dog", "cat", "horse", "car"))
LST
$numbers
[1] 1 2 3 4 5
$logical
[1] TRUE FALSE TRUE
$strings
[1] "dog" "cat" "horse" "car"
Subsetting the first element:
LST[1] # a list
$numbers
[1] 1 2 3 4 5
LST[[1]] # values inside element
[1] 1 2 3 4 5
LST$numbers # values inside element
[1] 1 2 3 4 5
LST[['numbers']] # values inside element
[1] 1 2 3 4 5
If you subset a list with single-bracket notation, R will return a smaller list. If you subset a list with double-bracket notation, R will return the values stored inside the selected element.
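The distinction matters as soon as you pass the result to a function that expects a vector. A minimal sketch, using a shortened copy of the list and a hypothetical result variable:

```r
LST <- list(numbers = c(1, 2, 3, 4, 5), strings = c("dog", "cat"))

class(LST[1])     # "list": single brackets keep the list wrapper
class(LST[[1]])   # "numeric": double brackets return the contents

sum(LST[[1]])     # 15

# sum() cannot add up a list, so the single-bracket version raises an error;
# tryCatch() captures it here so the script keeps running
result <- tryCatch(sum(LST[1]), error = function(e) "error")
result            # "error"
```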