Advanced R
The quote below sets a high bar, which is reason enough to read this book carefully.
Wickham (2014) Be comfortable reading and understanding the majority of R code. You’ll recognise common idioms (even if you wouldn’t use them yourself) and be able to critique others’ code.
After working through it, you should be ready to write an R package.
Data structures
Vectors
Wickham (2014) Atomic vectors are usually created with c():
x <- c(a = 1, b = 2)
is.vector(x)
y <- as.vector(x)
typeof(y)
length(y)
attributes(y)
is.atomic(y) || is.list(y)
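# || is the scalar logical OR operator
# This part was not obvious to me at first: the combined check is TRUE
# when y is either an atomic vector or a list, i.e. a vector in the broad sense.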
Wickham (2014) NB: is.vector() does not test if an object is a vector. Instead it returns TRUE only if the object is a vector with no attributes apart from names.
In the example above, as.vector() strips the names from x, so is.null(attributes(y)) is TRUE.
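A minimal sketch of the note above. The attribute name "note" is made up purely for illustration: adding any attribute other than names makes is.vector() return FALSE, while the is.atomic() || is.list() check still succeeds.

```r
x <- c(a = 1, b = 2)
only_names <- is.vector(x)            # names are the one attribute is.vector() tolerates

attr(x, "note") <- "extra"            # "note" is a hypothetical attribute name
with_extra <- is.vector(x)            # the extra attribute changes the answer
still_vector <- is.atomic(x) || is.list(x)

y <- as.vector(x)                     # drops all attributes of an atomic vector
no_attrs <- is.null(attributes(y))

print(c(only_names = only_names, with_extra = with_extra,
        still_vector = still_vector, no_attrs = no_attrs))
```

Note the use of is.null() rather than `== NULL`: comparing anything with NULL yields a zero-length logical, which is not usable as a test.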
Atomic vectors
dbl_var <- c(1, 2.5, 4.5)
# With the L suffix, you get an integer rather than a double
int_var <- c(1L, 6L, 10L)
Atomic vectors are always flat, even if you nest c()’s:
c(1, c(2, c(3, 4)))
c(1, 2, 3, 4)
NA
NA will always be coerced to the correct type if used inside c(), or you can create NAs of a specific type with NA_real_ (a double vector), NA_integer_ and NA_character_.
NA_real_
NA_integer_
NA_character_
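The coercion described above can be checked with typeof(). This is only a small sketch; the variable name na_types is chosen here for illustration.

```r
# A bare NA is logical; inside c() it takes on the type of its neighbours.
na_types <- c(
  bare     = typeof(NA),
  with_int = typeof(c(NA, 1L)),
  with_dbl = typeof(c(NA, 2.5)),
  with_chr = typeof(c(NA, "a"))
)
print(na_types)

# The typed constants carry their type on their own.
typed <- c(typeof(NA_real_), typeof(NA_integer_), typeof(NA_character_))
print(typed)
```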
Types and tests
is.atomic() is the more general test: it is TRUE whenever one of the specific tests is.character(), is.double(), is.integer() or is.logical() is TRUE.
int_var <- c(1L, 6L, 10L)
is.integer(int_var)
is.character(int_var)
is.atomic(int_var)
Lists
x <- list(1:3, "a", c(TRUE, FALSE, TRUE), c(2.3, 5.9))
str(x)
Lists are sometimes called recursive vectors, because a list can contain other lists.
x <- list(list(list(list())))
str(x)
is.recursive(x)
x <- list(list(1, 2), c(3, 4))
y <- c(list(1, 2), c(3, 4))
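c() does not nest. Given a mix of lists and atomic vectors, it coerces the vectors to lists and combines everything into one flat list, so y has four elements while x has two.
unlist() turns a list into an atomic vector, as if its elements had been combined with c().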
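To make the difference between list(), c() and unlist() concrete, here is a short sketch that reuses the x and y defined above. The name flat is introduced only for this example.

```r
x <- list(list(1, 2), c(3, 4))   # nested: a list and a double vector
y <- c(list(1, 2), c(3, 4))      # flattened one level into a single list

n_x <- length(x)                 # number of top-level elements in x
n_y <- length(y)                 # number of top-level elements in y
str(x)
str(y)

flat <- unlist(x)                # collapses everything into one atomic vector
print(flat)
print(typeof(flat))
```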
Lists are used to build up many of the more complicated data structures in R. For example, both data frames (described in Section 2.4) and linear models objects (as produced by lm()) are lists:
library(magrittr)  # provides the %>% pipe
mtcars %>% is.list()
lm(mpg ~ wt, data = mtcars) %>% is.list()
Attributes
The book explains this very well.
Wickham (2014) Attributes can be thought of as a named list (with unique names). Attributes can be accessed individually with attr() or all at once (as a list) with attributes().
y <- 1:10
attr(y, "my_attribute") <- "This is a vector"
attr(y, "my_attribute")
str(attributes(y))
str(y)
Here my_attribute plays a role similar to a column name, and "This is a vector" is similar to a note stored under that name.
Factors
Factors are built on top of integer vectors using two attributes: the class(), “factor”, which makes them behave differently from regular integer vectors, and the levels(), which defines the set of allowed values.
So underneath, factors are integer vectors.
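A small sketch of what that means in practice. The factor f and its levels are invented for illustration.

```r
# "c" is an allowed level even though it does not occur in the data.
f <- factor(c("a", "b", "a"), levels = c("a", "b", "c"))

print(typeof(f))        # storage type underneath the factor
print(class(f))         # the class attribute that changes its behaviour
print(levels(f))        # the set of allowed values
codes <- as.integer(f)  # the integer codes that index into levels(f)
print(codes)
print(attributes(f))    # levels and class are the two attributes
```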
Matrices and arrays
Adding a dim() attribute to an atomic vector allows it to behave like a multi-dimensional array.
v <- 1:6
v
# setting dim() turns the vector into a 2 x 3 matrix
dim(v) <- c(2, 3)
v
The matrix() and array() functions can be used instead.
a <- matrix(1:6,nrow = 2,ncol = 3)
a
b <- array(1:12,c(2,3,2))
b
length() generalises to nrow() and ncol() for matrices, and dim() for arrays.
length(a)
nrow(a)
ncol(a)
rownames(a)
rownames(a) <- c("A","B")
rownames(a)
colnames(a)
colnames(a) <- c("a","b","c")
colnames(a)
length(b)
dim(b)
dimnames(b)
dimnames(b) <- list(c("one", "two"), c("a", "b", "c"), c("A", "B"))
dimnames(b)
b
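As a rough check that a matrix really is an atomic vector plus a dim() attribute, the sketch below builds one both ways and then removes the attribute again. The names m, same_as_matrix and is_plain are used only here.

```r
m <- 1:6
dim(m) <- c(2, 3)                       # attach the dim attribute by hand
same_as_matrix <- identical(m, matrix(1:6, nrow = 2, ncol = 3))
print(attributes(m))                    # dim is the only attribute

dim(m) <- NULL                          # remove it again
is_plain <- is.null(attributes(m))      # back to a plain integer vector
print(m)
```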
pp. 27
Data frames
As in pandas, a data frame in R has both row names and column names. For a data frame, colnames() and names() return the same thing.
rownames(mtcars)
colnames(mtcars);names(mtcars)
nrow(mtcars)
ncol(mtcars);length(mtcars)
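length() reports how many elements the underlying list has, i.e. the number of columns.
A data frame is a list whose elements are equal-length vectors (typically atomic vectors rather than lists).
If in doubt, check with typeof().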
typeof(mtcars)
is.data.frame(mtcars)
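The claim that a data frame is a list of equal-length vectors can be checked column by column. This is a sketch on the built-in mtcars data; the helper names col_types, col_lengths and same_length are chosen for this example.

```r
col_types   <- vapply(mtcars, typeof, character(1))  # storage type of each column
col_lengths <- vapply(mtcars, length, integer(1))    # length of each column
same_length <- all(col_lengths == nrow(mtcars))      # every column has nrow() entries

print(col_types)
print(same_length)
print(length(mtcars) == ncol(mtcars))  # list length equals number of columns
```

vapply() is used instead of sapply() so that the result type is fixed in advance.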
pp. 28