I take notes here as I learn R.

Fragmented Content

This post may appear fragmented or disjointed, because I only noted down things I think I need to look at a second time. It is therefore primarily meant for personal consumption.

  • Modulo is %% instead of %

  • Like Python, / in R also returns a float.

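  • Assignment: name <- value. For example: x <- 42, a_maker <- function() { a <<- 1 }. = also assigns at the top level, but inside a function call it names an argument instead of creating a variable.

  • Global assignment: use <<- instead. Usually used inside functions to create or modify a variable in an enclosing environment (typically the global one), so that it persists after the function returns.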
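    A minimal sketch of the two assignment operators, reusing the a_maker example from the note above. The local variable b is a hypothetical addition, included only for contrast.

    ```r
    x <- 42                      # ordinary assignment

    a_maker <- function() {
      a <<- 1                    # assigns in an enclosing environment (here the global one)
      b <- 2                     # local: gone once the function returns
      invisible(NULL)
    }
    a_maker()

    exists("a", envir = globalenv(), inherits = FALSE)   # TRUE
    exists("b", envir = globalenv(), inherits = FALSE)   # FALSE
    ```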

  • "floats" are called "numerics", booleans are called "logicals". All text values are called "characters".

  • R is a vectorized language; stand-alone numbers or characters are implicitly length-one vectors.

  • Logical values are TRUE and FALSE. Numbers are numerics by default; the L suffix specifies an integer, as in 42L.

  • Single and double quotes can be used interchangeably but double quotes are preferred (and character constants are printed using double quotes). Single quotes are normally only used to delimit character constants containing double quotes.

  • Evaluating an assignment in interactive mode does not print its value. To print it, type the name of the variable again or wrap the expression in parentheses, like (x <- 5 + 3).

  • Check the data type with class() or str().
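    The basic types and operators noted so far can be checked in a few lines. This is only a quick sketch; the literal values are arbitrary.

    ```r
    class(42)        # "numeric"
    class(42L)       # "integer"
    class(TRUE)      # "logical"
    class("text")    # "character"

    5 %% 2           # 1   (modulo)
    5 / 2            # 2.5 (division returns a numeric)

    (x <- 5 + 3)     # the parentheses make the assignment print: [1] 8
    str(x)           #  num 8
    ```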

  • Vectors (atomic vectors) are defined with c() like c(1, 2, 3), c("a", "b", "c").

  • Vectors holding a sequence of consecutive integers can be specified using :, like 1:20.

  • Evenly spaced numeric sequences can be built with seq(), such as seq(0, 1, .1) (the last parameter is the step) or seq(0, 1, length.out = 5) (the last parameter is the count).

  • Repetitive vectors can be made with rep(), like rep(c("a", "b"), 5) ("a" "b" "a" "b" "a" "b" "a" "b" "a" "b") and rep(c("a", "b"), each = 5) ("a" "a" "a" "a" "a" "b" "b" "b" "b" "b").

  • length() returns the length of a vector. Standalone numbers and characters have length 1. logical(0), integer(0), numeric(0) and character(0) are length-zero vectors.
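    A short sketch of the vector constructors above, using smaller inputs than the notes so the results fit in a comment. The variable names are only illustrative.

    ```r
    ints  <- 1:5                           # 1 2 3 4 5 (integers)
    steps <- seq(0, 1, 0.25)               # 0.00 0.25 0.50 0.75 1.00 (by step)
    count <- seq(0, 1, length.out = 5)     # the same five values, by count
    reps  <- rep(c("a", "b"), times = 2)   # "a" "b" "a" "b"
    each  <- rep(c("a", "b"), each = 2)    # "a" "a" "b" "b"

    length(ints)           # 5
    length("hello")        # 1: a single string is a length-one character vector
    length(character(0))   # 0
    ```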

  • Named vectors: c(first = 1, second = 2, third = 3) or

    x <- 1:3
    names(x) <- c("a", "b", "c")
  • Matrix: matrices are populated by column by default.

    matrix(1:6, nrow = 2, ncol = 3)
    #      [,1] [,2] [,3]
    # [1,]    1    3    5
    # [2,]    2    4    6
    A <- diag(1:3) # diagonal matrix
    rownames(A) <- c("a", "b", "c")
    colnames(A) <- c("A", "B", "C")
    A
    #   A B C
    # a 1 0 0
    # b 0 2 0
    # c 0 0 3
    diag(3) # identity matrix
    #      [,1] [,2] [,3]
    # [1,]    1    0    0
    # [2,]    0    1    0
    # [3,]    0    0    1
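
    To make the column-by-column default visible, the same call can be compared with a row-wise fill. The byrow argument is not mentioned in the notes above; it is a standard argument of matrix().

    ```r
    m_col <- matrix(1:6, nrow = 2, ncol = 3)                 # default: filled by column
    m_row <- matrix(1:6, nrow = 2, ncol = 3, byrow = TRUE)   # filled by row

    m_col[1, ]   # 1 3 5
    m_row[1, ]   # 1 2 3
    dim(m_row)   # 2 3
    ```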
  • Array:

    array(1:(1*2*3), dim = c(1, 2, 3))
    # , , 1
    #
    #      [,1] [,2]
    # [1,]    1    2
    #
    # , , 2
    #
    #      [,1] [,2]
    # [1,]    3    4
    #
    # , , 3
    #
    #      [,1] [,2]
    # [1,]    5    6
  • Vectors and matrices must have elements of a single type. If they do not, values are coerced to the most general type present (logical < integer < double < character).
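    The coercion order can be observed by mixing types inside c(). A small sketch:

    ```r
    class(c(TRUE, 2L))       # "integer":   logical is promoted to integer
    class(c(2L, 3.5))        # "numeric":   integer is promoted to double
    class(c(3.5, "a"))       # "character": everything becomes text
    c(TRUE, 2L, 3.5, "a")    # "TRUE" "2" "3.5" "a"
    ```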

  • Lists are vectors whose elements (1) can be any kind of object and (2) don't have to be the same kind of object: list(TRUE, 1:3, rnorm(3), data.frame(a = 1:2, b = 3:4)). Lists can be given names in the same way vectors and data frames can.

  • A data frame is like a matrix, but its columns can have different types (thus allowing spreadsheet-like data). Under the hood, data frames are lists whose elements are vectors of the same length.
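    A minimal sketch of a named list. The list item and its element names are made up for illustration, and element access with $ and [[ ]] is standard R that the notes above do not cover.

    ```r
    item <- list(flag = TRUE, ids = 1:3, label = "x")

    names(item)       # "flag" "ids" "label"
    length(item)      # 3
    item$ids          # 1 2 3
    item[["label"]]   # "x"
    ```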

    df <- data.frame(ints = 1:2, chars = c("a","b")) # keep the data frame in a variable
    df
    #   ints chars
    # 1    1     a
    # 2    2     b
    names(df) <- c("num", "index") # rename the columns
    df
    #   num index
    # 1   1     a
    # 2   2     b
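
    The claim that a data frame is a list of equal-length vectors can be checked directly. A sketch, with stringsAsFactors = FALSE set explicitly so that the character column behaves the same on older R versions:

    ```r
    df <- data.frame(ints = 1:2, chars = c("a", "b"), stringsAsFactors = FALSE)

    is.list(df)       # TRUE
    length(df)        # 2: the number of columns, i.e. list elements
    nrow(df)          # 2
    class(df$ints)    # "integer"
    class(df$chars)   # "character"
    ```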
  • Install packages: install.packages("devtools"), install.packages(c("devtools", "dplyr", "tidyr")), and devtools::install_github("name/repo") for packages hosted on GitHub.

  • Import packages: library(mpoly) or require(mpoly).

  • Specify the package from which a function is used: mpoly::permutations(3). Do not use a dot.
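    A small sketch of the :: syntax, using the stats package because it ships with R. The requireNamespace() check is an addition not covered in the notes; it reports whether a package is installed without attaching it.

    ```r
    med <- stats::median(c(1, 3, 5))   # 3, called without library(stats)

    # TRUE if mpoly is installed, FALSE otherwise
    has_mpoly <- requireNamespace("mpoly", quietly = TRUE)
    ```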

  • Help: ?lm, help(lm), help.search("generalized linear models"), sos::findFn("elastic net").

  • ls(): lists all the variables in the global environment.

  • rm(): removes variables from the environment. To remove all of them, rm(list = ls()).

  • rnorm() generates pseudo-random numbers from the normal distribution.

  • identical(x, y) tests whether two objects are exactly equal and returns a single TRUE or FALSE.
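    A short sketch of how identical() differs from the element-wise == operator. The comparison with all.equal() is an addition beyond the notes; it compares numbers within a tolerance.

    ```r
    x <- c(1, 2, 3)
    y <- c(1, 2, 3)

    identical(x, y)                     # TRUE: one answer for the whole object
    x == y                              # TRUE TRUE TRUE: one answer per element
    identical(1L, 1)                    # FALSE: integer vs. double
    identical(0.1 + 0.2, 0.3)           # FALSE: floating-point rounding
    isTRUE(all.equal(0.1 + 0.2, 0.3))   # TRUE: equal within tolerance
    ```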

https://static1.squarespace.com/static/574311933c44d81acd102b0c/t/5752fd7f8a65e246000c4318/1465056644610/R-Crash-pdf.pdf