# Matrices, Data frames, the function-function, and control flow ... flow control, lists?

18 Feb 2022

The code we produced in class this week and last week can be downloaded and reviewed here:

This and last week’s problem set and required R script and data can be found here:

Below is a recap of the highlights from the last 2 weeks.

### Exporting R objects

In previous meetings we discussed the `load()` and `save()` functions. Now consider the `read.csv()` and `write.csv()` functions. `write.csv()` allows us to export R objects (i.e., vectors, matrices, and data frames) to a plain text format that can be read by MS Excel, and `read.csv()` reads such files back into R.

```
# write.csv() saves R objects as comma-separated value files
# read.csv() reads data from comma-separated value files into R
write.csv(x, file = "x.csv")
x <- read.csv(file = "x.csv")
```
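A full round trip through a CSV file can be sketched as follows. The data frame `scores` and the temporary file path are made up for illustration. By default, `write.csv()` also writes the row names as an extra first column; setting `row.names = FALSE` suppresses that, so the file that is read back has the same columns as the original.

```r
# hypothetical example data (not from class)
scores <- data.frame(id = 1:3, grade = c(90.5, 82, 77.25))

# write to a temporary file so nothing in the working directory is overwritten
csv_path <- tempfile(fileext = ".csv")
write.csv(scores, file = csv_path, row.names = FALSE)

# read the file back in and compare it with the original
scores_back <- read.csv(file = csv_path)
all.equal(scores, scores_back)
```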

### Matrices

We barely scratched the surface with matrices. We discussed three functions to generate matrices: `rbind()`, `cbind()`, and `matrix()`.

```
x <- 1:10
y <- 10:1
# binds the vectors x and y together (as rows)
# returns a 2 by 10 matrix
rbind(x, y)
# binds the vectors x and y together (as columns)
# returns a 10 by 2 matrix
cbind(x, y)
# returns a 5 by 5 matrix containing 0's
matrix(data = 0, nrow = 5, ncol = 5)
# returns a 4 by 10 matrix containing a
# sequence of integers from 1 to 40, filled by row
matrix(data = 1:40, nrow = 4, ncol = 10, byrow = TRUE)
```

Indexing matrices works similarly to indexing vectors via the `[]` notation. The difference is that matrices are two-dimensional objects, so we need to specify not just one location (in one dimension) but two, like so: `some_object[row-location, column-location]`.

```
x <- matrix(data = 1:40, nrow = 4, ncol = 10, byrow = TRUE)
# returns the value stored in row 2, column 3 of matrix x
x[2,3]
# returns a sub-matrix with 2 rows and two columns
x[2:3,4:5]
# returns all of row 3
x[3, ]
# returns all of column 4
x[ ,4]
```
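To see how the row and column positions map onto the stored values, it can help to check the dimensions first. The sketch below reuses the same matrix; `dim()`, `nrow()`, and `ncol()` are base R functions that were not part of the recap above.

```r
x <- matrix(data = 1:40, nrow = 4, ncol = 10, byrow = TRUE)

dim(x)    # number of rows and columns: 4 10
nrow(x)   # 4
ncol(x)   # 10

# because the matrix was filled by row, row 2 holds the values 11 to 20
x[2, 3]   # 13
x[, 4]    # 4 14 24 34
```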

### Data frames

We began discussing data frames (i.e., 2D heterogeneous objects). These objects can most easily be constructed from atomic vectors via the `data.frame()` function.

```
# 5 vectors of different types
x <- letters[1:5]
y <- 1:5
z <- c(TRUE, TRUE, NA, FALSE, FALSE)
pies <- rep(x = pi, times = 5)
v <- as.factor(c("yes", "yes", "no", "yes", "no"))
dat <- data.frame(x, y, z, pies, v)
# named vectors stored in data.frames can be extracted via the $
dat$pies # extracts the pies vector from the data frame
dat$x[1:2] # extracts the first and second element of x stored in dat
# code below creates new named vector (called random)
# inside data frame dat
# the vector contains 5 random normal variates
dat$random <- rnorm(n = 5, mean = 0, sd = 1)
```

More on indexing by location, both for standalone vectors and for vectors stored in data frames.

```
x <- c(1, 9, 8, 1)
y <- c(2, 0, 2, 0)
dat <- data.frame(x, y)
x[2] # returns 9
dat$x[2] # returns 9
dat$y[1+2] # returns 2
dat$y[36/9] # returns 0
```

New functions to explore data frames and to edit or create vectors included:

- `summary()`: basic summary statistics of an object
- `names()`: names of the vectors stored in a data frame

```
names(dat) # evaluates to x and y
names(dat)[1] <- "XXX"
# the line above changes the name of the first
# vector stored in dat to XXX
names(dat) # returns the new names
```

- `head()` and `tail()`: get the first and last few rows of a data frame, respectively
- `ifelse()`: basic if-else construct
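As a quick illustration of the exploration functions listed here, the sketch below applies them to a small made-up data frame (the object and column names are hypothetical).

```r
# hypothetical data frame with 10 rows
dat <- data.frame(id = 1:10, squared = (1:10)^2)

summary(dat)       # min, quartiles, mean, and max for each column
head(dat)          # first 6 rows (the default)
tail(dat, n = 3)   # last 3 rows
names(dat)         # "id" "squared"
```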

The `ifelse()` function can be very useful to edit or create vectors.

```
x <- c(1, 9, 8, 1)
ifelse(test = x > 1, yes = 2, no = x)
# the line above will replace all elements
# in x that are greater than 1 with 2s
# and replace those smaller or equal to 1
# with themselves (i.e., leave them as is)
# returns: [1] 1 2 2 1
```
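The same idea can be used to create a new vector inside a data frame. This is a minimal sketch with made-up scores and a made-up cutoff of 60.

```r
# hypothetical exam scores
dat <- data.frame(score = c(55, 72, 90, 48))

# label each score; the test is evaluated element by element
dat$result <- ifelse(test = dat$score >= 60, yes = "pass", no = "fail")
dat$result   # "fail" "pass" "pass" "fail"
```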

### The Control-Flow Construct `if () {} else {}`

We introduced more flexible flow control via `if` and `else`, which allow us to instruct R to do different things (i.e., follow different instructions, evaluate different code) based on whether some logical condition holds or not.

```
# Simple example (condition is TRUE)
if (1 + 1 == 2) {
  print("True")
} else {
  print("False")
}
# Simple example (condition is FALSE)
if (1 + 1 == 3) {
  print("True")
} else {
  print("False")
}
# another example getting R to do a bunch of things
if (0.5 > 1) {
  x <- rnorm(n = 10, mean = 0, sd = 1)
  x[1]
  sum(x)
} else {
  y <- runif(n = 10, min = 0, max = 1)
  y[1]
  sum(y)
}
```
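One way to keep `if () {} else {}` and `ifelse()` apart: the former makes a single decision based on one `TRUE` or `FALSE`, while the latter makes one decision per element of a vector. The sketch below uses made-up labels and the base R function `any()`, which was not covered in class, to reduce a vector of conditions to a single value.

```r
x <- c(1, 9, 8, 1)

# if () {} else {} makes ONE decision, so the condition must be a single value
if (any(x > 5)) {
  verdict <- "at least one element is greater than 5"
} else {
  verdict <- "no element is greater than 5"
}
verdict

# ifelse() makes one decision per element
sizes <- ifelse(test = x > 5, yes = "big", no = "small")
sizes   # "small" "big" "big" "small"
```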

Another big concept introduced involved the creation of custom functions via the `function()` function.

```
# A function to compute the standard deviation of an arbitrary
# numeric input vector that gives the user the option to choose between
# the sample and population variants (the default returns the
# standard deviation of a sample)
my_sd <- function(input, population = FALSE) {
  n <- length(x = input)
  x_bar <- mean(x = input)
  if (population == FALSE) {
    # sample variant: divide by n - 1
    output <- sqrt(x = sum(x = (input - x_bar)^2) / (n - 1))
    return(output)
  } else {
    # population variant: divide by n
    output <- sqrt(x = sum(x = (input - x_bar)^2) / n)
    return(output)
  }
}
# Try it out and compare to canned sd() function
x <- rnorm(n = 10, mean = 0, sd = 1)
x
sd(x = x)
my_sd(input = x)
my_sd(input = x, population = TRUE)
```
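The two variants differ only in the denominator, so the population version can also be recovered from the built-in `sd()` by rescaling. A minimal check with made-up numbers, written out without a custom function so that each step is visible:

```r
x <- c(2, 4, 4, 4, 5, 5, 7, 9)   # made-up values with mean 5
n <- length(x)

sample_sd     <- sqrt(sum((x - mean(x))^2) / (n - 1))
population_sd <- sqrt(sum((x - mean(x))^2) / n)

all.equal(sample_sd, sd(x))                           # TRUE
all.equal(population_sd, sd(x) * sqrt((n - 1) / n))   # TRUE
population_sd                                         # 2
```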

Below is the code for a function to convert temperature in degrees Fahrenheit to degrees Celsius (and vice versa). Note that this function is different from the one we created in class. Unlike the version in class, which returned a vector of converted temperatures, this function does not return a vector but just “cat-calls” the conversion to the screen via the `cat()` function. The `cat()` function concatenates its arguments (numbers, character strings, or longer vectors) and prints the result to the screen. (The `\n` at the end of the last string creates a forced line break so that the prompt appears on a new line.)

```
convert_temp <- function(input, C.to.F = TRUE) {
  # compare against the logical TRUE, not the character string "TRUE"
  if (C.to.F == TRUE) {
    output <- input * 9/5 + 32
    return(
      cat(input, "degrees Celsius is equal to", output,
          "degrees Fahrenheit.\n")
    )
  } else {
    output <- (input - 32) * 5/9
    return(
      cat(input, "degrees Fahrenheit is equal to", output,
          "degrees Celsius.\n")
    )
  }
}
convert_temp(input = 0, C.to.F = TRUE)
convert_temp(input = 32, C.to.F = FALSE)
```
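For comparison, a variant that returns the converted temperatures as a vector might look like the sketch below. This is not the code written in class; the name `convert_temp_vec` is made up, and the function simply reuses the two formulas from above.

```r
# hypothetical variant; not the code written in class
convert_temp_vec <- function(input, C.to.F = TRUE) {
  if (C.to.F) {
    input * 9/5 + 32       # Celsius to Fahrenheit
  } else {
    (input - 32) * 5/9     # Fahrenheit to Celsius
  }
}

convert_temp_vec(input = c(0, 100))                    # 32 212
convert_temp_vec(input = c(32, 212), C.to.F = FALSE)   # 0 100
```

Because the result is returned rather than printed, it can be stored in an object or added to a data frame for later use.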

### Lists

As a last thing, we briefly discussed lists. Lists are one-dimensional heterogeneous storage containers. Whereas data frames are two-dimensional storage containers for vectors, lists can store vectors, matrices, data frames, and more.

```
# two vectors
x <- 1:10
y <- x^2
# one data-frame
df <- data.frame(x, y)
# one matrix
my_matrix <- matrix(data = 1:12, nrow = 3, ncol = 4)
# all of the above and more added to a list
my_list <- list(x, y, df, my_matrix, letters, pi, 9001)
# We have a list of length 7. Seven things have been put into
# our list. Individual elements in the list can be accessed or
# extracted using the double-square-bracket notation [[item]]
my_list[[7]] # returns the seventh item in the list which is
# 9001
my_list[[2]] # returns the second item in the list which is
# the vector y
my_list[[3]]$y[10] # returns the 10th element of the vector y
# stored in our data-frame df, stored in the
# third spot in our list
my_list[[4]][1:2, 3] # what about this?
```