- 1
safely()
函数 - 2 possibly函数解释
- 3
pwalk
批量操作 - 4
rerun
: resample - 5
map
替代mutate
- 6
keep
替代filter
- 7
map_if
替代mutate_if
- 8
map_at
- 9
transpose
函数 - 10 Set names of list elements
- 11
reduce
反馈最后一个值 - 12
accumulate
和accumulate_right
- 13 相关系数\(\rho\)和p value分析
- 14
partial
提前预设参数,进行函数封装 - 15
some
和every
- 16 Use list, UDF in map*
- 17 反馈 NULL 的解决情况
- 参考文献
本文于2020-10-10更新。 如发现问题或者有建议,欢迎提交 Issue
o # map
map
函数相当于for循环,但是不需要注意for(){}
的缩进,而且代码写起来,简单些。
mtcars %>%
group_by(cyl) %>%
nest() %>%
mutate(p = map(data,
function(x){
p <- x %>% ggplot(aes(x = mpg, y = disp)) +
geom_point() +
labs(
x = "Miles/(US) gallon",
y = "Displacement (cu.in.)",
subtitle = paste("cyl =",cyl),
title = "The scatter diagram for Miles/(US) gallon by Displacement (cu.in.)",
caption = "Data source: mtcars"
) +
theme_minimal()
print(p)}))
## # A tibble: 3 x 3
## # Groups: cyl [3]
## cyl data p
## <dbl> <list> <list>
## 1 6 <tibble [7 x 10]> <gg>
## 2 4 <tibble [11 x 10]> <gg>
## 3 8 <tibble [14 x 10]> <gg>
purrr::map
可以批量跑函数,但是会出现报错,这个时候possibly
和safely
可以给报错的打个标签,继续跑
rdata.lu Blog
。
Wickham et al. (n.d.) 在第四和第五部分主要解释了这两个函数。
1 safely()
函数
# Create safe_readLines() by passing readLines() to safely()
safe_readLines <- safely(readLines)
# Call safe_readLines() on "http://example.org"
safe_readLines("http://example.org")
# Call safe_readLines() on "http://asdfasdasdkfjlda"
safe_readLines("http://asdfasdasdkfjlda")
H. Wickham (2018a) 给出了safely()
函数的例子。
显然第二个例子不是一个网页,因此会报错,因此两个例子的区别在于,
- 成功了,因此
$result
非空,$error
空 - 失败了,因此
$result
空,$error
非空
1.1 map 和 safely
urls <- list(
example = "http://example.org",
rproj = "http://www.r-project.org",
asdf = "http://asdfasdasdkfjlda"
)
# Define safe_readLines()
safe_readLines <- safely(readLines)
# Use the safe_readLines() function with map(): html
html <- map(urls, safe_readLines)
# Call str() on html
str(html)
# Extract the result from one of the successful elements
map(html, "result")
# Extract the error from the element that was unsuccessful
map(html, "error")
2 possibly函数解释
possibly_some_function = possibly(some_function, otherwise = "something wrong here")
是safely的简单版本。
3 pwalk
批量操作
看了一下 Wickham and Grolemund (2017) 的书,真是不错啊,不小心重新看一下, 如获至宝。 例如,
Wickham and Grolemund (2017, 336)提到批量保存ggplot图片的方式。
library(ggplot2)
plots <- mtcars %>%
split(.$cyl) %>%
map(~ggplot(., aes(mpg, wt)) + geom_point())
paths <- stringr::str_c(names(plots), ".pdf")
pwalk(list(paths, plots), ggsave, path = getwd())
- 使用了
split
和map
函数,批量使用函数ggplot
。 str_c
用于图片命名,功效类似于paste
。- 最后最强的是
pwalk
,给予一个list,然后使用ggsave
。
4 rerun
: resample
library(purrr)
10 %>% rerun(rnorm(5))
## [[1]]
## [1] -0.45299858 -1.40542814 -0.04279963 0.72608261 -1.41528489
##
## [[2]]
## [1] -0.4615507 1.6824498 -1.1289021 -1.2268022 -0.4856850
##
## [[3]]
## [1] -0.3918068 -0.3901946 0.1806043 -1.6189512 -1.8983127
##
## [[4]]
## [1] 0.7468726 0.6292661 0.3700957 -0.4464054 -0.0136263
##
## [[5]]
## [1] 0.3604068 1.3956616 -0.2069600 -0.9912203 -0.5801156
##
## [[6]]
## [1] 0.1537699 -0.1276541 -0.2858236 -0.1230204 0.3429197
##
## [[7]]
## [1] -0.3390390 0.6448945 -1.1980548 2.4970258 1.2497283
##
## [[8]]
## [1] 0.13972860 1.78194093 0.05602954 0.57862112 1.21168028
##
## [[9]]
## [1] -0.8268868 -0.7651689 1.0638391 0.2099143 -0.4746792
##
## [[10]]
## [1] -0.2862176 -0.9219227 0.5225702 1.4487096 1.2010788
10 %>%
rerun(x = rnorm(5),y= rnorm(5)) %>%
map_dbl(~ cor(.x$x, .x$y))
## [1] -0.15088408 -0.79703083 0.72777903 -0.07017608 -0.75215386 0.64780472
## [7] 0.43406793 -0.26921301 -0.57959740 0.33200593
5 map
替代mutate
FAY (2017) 提供的例子。
library(purrr)
airquality %>% glimpse()
## Rows: 153
## Columns: 6
## $ Ozone <int> 41, 36, 12, 18, NA, 28, 23, 19, 8, NA, 7, 16, 11, 14, 18, 1...
## $ Solar.R <int> 190, 118, 149, 313, NA, NA, 299, 99, 19, 194, NA, 256, 290,...
## $ Wind <dbl> 7.4, 8.0, 12.6, 11.5, 14.3, 14.9, 8.6, 13.8, 20.1, 8.6, 6.9...
## $ Temp <int> 67, 72, 74, 62, 56, 66, 65, 59, 61, 69, 74, 69, 66, 68, 58,...
## $ Month <int> 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5,...
## $ Day <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, ...
shapiro.test
: Performs the Shapiro-Wilk test of normality.
airquality %>% map(shapiro.test)
## $Ozone
##
## Shapiro-Wilk normality test
##
## data: .x[[i]]
## W = 0.87867, p-value = 2.79e-08
##
##
## $Solar.R
##
## Shapiro-Wilk normality test
##
## data: .x[[i]]
## W = 0.94183, p-value = 9.492e-06
##
##
## $Wind
##
## Shapiro-Wilk normality test
##
## data: .x[[i]]
## W = 0.98575, p-value = 0.1178
##
##
## $Temp
##
## Shapiro-Wilk normality test
##
## data: .x[[i]]
## W = 0.97617, p-value = 0.009319
##
##
## $Month
##
## Shapiro-Wilk normality test
##
## data: .x[[i]]
## W = 0.88804, p-value = 2.258e-09
##
##
## $Day
##
## Shapiro-Wilk normality test
##
## data: .x[[i]]
## W = 0.95313, p-value = 5.048e-05
6 keep
替代filter
airquality %>%
map(shapiro.test)%>%
keep(~ .x$p.value > 0.05)
## $Wind
##
## Shapiro-Wilk normality test
##
## data: .x[[i]]
## W = 0.98575, p-value = 0.1178
FAY (2017) 提供的例子。
keep
业务上非常方便,直接选择shapiro.test
反馈的结果.$...
之一作为筛选条件,不需要建立data.frame
再使用filter
函数。
7 map_if
替代mutate_if
map_if(iris, is.numeric, shapiro.test)
## $Sepal.Length
##
## Shapiro-Wilk normality test
##
## data: .x[[i]]
## W = 0.97609, p-value = 0.01018
##
##
## $Sepal.Width
##
## Shapiro-Wilk normality test
##
## data: .x[[i]]
## W = 0.98492, p-value = 0.1012
##
##
## $Petal.Length
##
## Shapiro-Wilk normality test
##
## data: .x[[i]]
## W = 0.87627, p-value = 7.412e-10
##
##
## $Petal.Width
##
## Shapiro-Wilk normality test
##
## data: .x[[i]]
## W = 0.90183, p-value = 1.68e-08
##
##
## $Species
## [1] setosa setosa setosa setosa setosa setosa
## [7] setosa setosa setosa setosa setosa setosa
## [13] setosa setosa setosa setosa setosa setosa
## [19] setosa setosa setosa setosa setosa setosa
## [25] setosa setosa setosa setosa setosa setosa
## [31] setosa setosa setosa setosa setosa setosa
## [37] setosa setosa setosa setosa setosa setosa
## [43] setosa setosa setosa setosa setosa setosa
## [49] setosa setosa versicolor versicolor versicolor versicolor
## [55] versicolor versicolor versicolor versicolor versicolor versicolor
## [61] versicolor versicolor versicolor versicolor versicolor versicolor
## [67] versicolor versicolor versicolor versicolor versicolor versicolor
## [73] versicolor versicolor versicolor versicolor versicolor versicolor
## [79] versicolor versicolor versicolor versicolor versicolor versicolor
## [85] versicolor versicolor versicolor versicolor versicolor versicolor
## [91] versicolor versicolor versicolor versicolor versicolor versicolor
## [97] versicolor versicolor versicolor versicolor virginica virginica
## [103] virginica virginica virginica virginica virginica virginica
## [109] virginica virginica virginica virginica virginica virginica
## [115] virginica virginica virginica virginica virginica virginica
## [121] virginica virginica virginica virginica virginica virginica
## [127] virginica virginica virginica virginica virginica virginica
## [133] virginica virginica virginica virginica virginica virginica
## [139] virginica virginica virginica virginica virginica virginica
## [145] virginica virginica virginica virginica virginica virginica
## Levels: setosa versicolor virginica
map_if
allows you to map only on numeric variables in yourdata.frame
因此map
功效远大于mutate
,
map
将每个columns当成一个element。
FAY (2017) 提供的例子。
numbers <- list(11, 12, 13, 14)
is_even <- function(x){
!as.logical(x %% 2)
}
map_if(numbers, is_even, sqrt)
## [[1]]
## [1] 11
##
## [[2]]
## [1] 3.464102
##
## [[3]]
## [1] 13
##
## [[4]]
## [1] 3.741657
map_if
函数的例子来自 Rodrigues (2017)。
8 map_at
map_at(numbers, c(1,3), sqrt)
## [[1]]
## [1] 3.316625
##
## [[2]]
## [1] 12
##
## [[3]]
## [1] 3.605551
##
## [[4]]
## [1] 14
map_at
函数的例子来自 Rodrigues (2017)。
9 transpose
函数
这里主要介绍transpose
函数(H. Wickham 2018d)。
例如
nested_list <- list(
x1 = list(a = 1, b = 2),
x2 = list(a = 3, b = 4)
)
当我们想要提取x1
中的a
时,我们需要写nested_list[[x1]][[a]]
,按着正序写的。
但是当我们对listnested_list
使用transpose
函数后,可以倒序写,transpose(nested_list)[[a]][[x1]]
。
因此当我们的需要提取一个list中,非常inside的数据时,并且,都是统一命名,例如a
,这个方法会非常管用。
# Define save_readLines() and html
safe_readLines <- safely(readLines)
html <- map(urls, safe_readLines)
# Examine the structure of transpose(html)
str(transpose(html))
# Extract the results: res
res <- transpose(html)[["result"]]
# Extract the errors: errs
errs <- transpose(html)[["error"]]
这里可以看到,以result
和error
的list顺序排在了前面,不是原来的example
、rproj
和asdf
。
# Initialize some objects
safe_readLines <- safely(readLines)
html <- map(urls, safe_readLines)
res <- transpose(html)[["result"]]
errs <- transpose(html)[["error"]]
# Create a logical vector is_ok
is_ok <- map_lgl(errs, is_null)
# Extract the successful results
res[is_ok]
# Extract the input from the unsuccessful results
urls[!is_ok]
最后一步就提取出来那一个元素出了问题urls[!is_ok]
。
(H. Wickham 2018c)
9.1 transpose
和modify_depth
numbers_with_error <- list(1, 2, 3, "spam", 4)
numbers_with_error
## [[1]]
## [1] 1
##
## [[2]]
## [1] 2
##
## [[3]]
## [1] 3
##
## [[4]]
## [1] "spam"
##
## [[5]]
## [1] 4
safe_sqrt <- safely(sqrt, otherwise = NA_real_)
safe_result_list <- numbers_with_error %>% map(safe_sqrt)
safe_result_list
## [[1]]
## [[1]]$result
## [1] 1
##
## [[1]]$error
## NULL
##
##
## [[2]]
## [[2]]$result
## [1] 1.414214
##
## [[2]]$error
## NULL
##
##
## [[3]]
## [[3]]$result
## [1] 1.732051
##
## [[3]]$error
## NULL
##
##
## [[4]]
## [[4]]$result
## [1] NA
##
## [[4]]$error
## <simpleError in .Primitive("sqrt")(x): 数学函数中用了非数值参数>
##
##
## [[5]]
## [[5]]$result
## [1] 2
##
## [[5]]$error
## NULL
transposed_list <- transpose(safe_result_list)
transposed_list
## $result
## $result[[1]]
## [1] 1
##
## $result[[2]]
## [1] 1.414214
##
## $result[[3]]
## [1] 1.732051
##
## $result[[4]]
## [1] NA
##
## $result[[5]]
## [1] 2
##
##
## $error
## $error[[1]]
## NULL
##
## $error[[2]]
## NULL
##
## $error[[3]]
## NULL
##
## $error[[4]]
## <simpleError in .Primitive("sqrt")(x): 数学函数中用了非数值参数>
##
## $error[[5]]
## NULL
transposed_list %>%
modify_depth(2, is_null)
## $result
## $result[[1]]
## [1] FALSE
##
## $result[[2]]
## [1] FALSE
##
## $result[[3]]
## [1] FALSE
##
## $result[[4]]
## [1] FALSE
##
## $result[[5]]
## [1] FALSE
##
##
## $error
## $error[[1]]
## [1] TRUE
##
## $error[[2]]
## [1] TRUE
##
## $error[[3]]
## [1] TRUE
##
## $error[[4]]
## [1] FALSE
##
## $error[[5]]
## [1] TRUE
modify_depth
可以直接使用函数作用于对应的层级 Rodrigues (2017)。
9.2 map
的替换
mylist <-
list(
list01= list(
var01 = 1:3
,var02 = 1:5
,var03 = 1:7
)
,list02= list(
var01 = 1:9
,var02 = 1:11
,var03 = 1:13
)
)
str(mylist)
## List of 2
## $ list01:List of 3
## ..$ var01: int [1:3] 1 2 3
## ..$ var02: int [1:5] 1 2 3 4 5
## ..$ var03: int [1:7] 1 2 3 4 5 6 7
## $ list02:List of 3
## ..$ var01: int [1:9] 1 2 3 4 5 6 7 8 9
## ..$ var02: int [1:11] 1 2 3 4 5 6 7 8 9 10 ...
## ..$ var03: int [1:13] 1 2 3 4 5 6 7 8 9 10 ...
library(purrr)
mylist %>%
transpose() %>%
.$var02 %>%
unlist %>%
table()
## .
## 1 2 3 4 5 6 7 8 9 10 11
## 2 2 2 2 2 1 1 1 1 1 1
# or TBT8's answer
# https://stackoverflow.com/a/53213468/8625228
mylist %>%
map(~ .[['var02']] == 1) %>%
unlist %>%
sum
## [1] 2
10 Set names of list elements
name_element <- c("sqrt()", "ok?")
transposed_list %>% set_names(name_element)
## $`sqrt()`
## $`sqrt()`[[1]]
## [1] 1
##
## $`sqrt()`[[2]]
## [1] 1.414214
##
## $`sqrt()`[[3]]
## [1] 1.732051
##
## $`sqrt()`[[4]]
## [1] NA
##
## $`sqrt()`[[5]]
## [1] 2
##
##
## $`ok?`
## $`ok?`[[1]]
## NULL
##
## $`ok?`[[2]]
## NULL
##
## $`ok?`[[3]]
## NULL
##
## $`ok?`[[4]]
## <simpleError in .Primitive("sqrt")(x): 数学函数中用了非数值参数>
##
## $`ok?`[[5]]
## NULL
set_names
函数来自purrr
包,详见
复制一个data.frame
。
11 reduce
反馈最后一个值
numbers
## [[1]]
## [1] 11
##
## [[2]]
## [1] 12
##
## [[3]]
## [1] 13
##
## [[4]]
## [1] 14
reduce(numbers, `*`)
## [1] 24024
还可以批量完成计算。
reduce
函数的例子来自 Rodrigues (2017)。
set.seed(123)
mat1 <- matrix(rnorm(10), nrow = 2)
mat2 <- matrix(rnorm(10), nrow = 2)
mat3 <- matrix(rnorm(10), nrow = 2)
list_mat <- list(mat1, mat2, mat3)
reduce(list_mat, `+`)
## [,1] [,2] [,3] [,4] [,5]
## [1,] -0.40421756 0.9334753 -1.051593 1.796554 -1.1236339
## [2,] -0.08833858 -0.5477001 1.815285 -3.078305 0.3353615
想象从每一个矩阵的对应位置\(A_{i,j}\)取出一个值,作为一个list,一起完成+
的计算。
reduce
函数的例子来自 Rodrigues (2017)。
11.1 reduce 多表合并
df1 <- as.data.frame(mat1)
df2 <- as.data.frame(mat2)
df3 <- as.data.frame(mat3)
list_df <- list(df1, df2, df3)
reduce(list_df, dplyr::full_join)
## V1 V2 V3 V4 V5
## 1 -0.5604756 1.55870831 0.1292877 0.4609162 -0.6868529
## 2 -0.2301775 0.07050839 1.7150650 -1.2650612 -0.4456620
## 3 1.2240818 0.40077145 -0.5558411 0.4978505 0.7013559
## 4 0.3598138 0.11068272 1.7869131 -1.9666172 -0.4727914
## 5 -1.0678237 -1.02600445 -0.6250393 0.8377870 -1.1381369
## 6 -0.2179749 -0.72889123 -1.6866933 0.1533731 1.2538149
reduce
函数的例子来自 Rodrigues (2017)。
因此可以多表read
进行合并。
as.list(list.files()) %>% map(function(x) read.csv(x)) %>% reduce(rbind)
12 accumulate
和accumulate_right
accumulate(numbers, `*`)
## [1] 11 132 1716 24024
accumulate_right(numbers, `*`)
## [1] 24024 2184 182 14
accumulate
是按左向右滚动accumulate_right
是按右向左滚动 例子来自 Rodrigues (2017)。
1:3 %>% accumulate(`+`)
## [1] 1 3 6
1:10 %>% accumulate_right(`*`)
## [1] 3628800 3628800 1814400 604800 151200 30240 5040 720 90
## [10] 10
# From Haskell's scanl documentation
1:10 %>% accumulate(max, .init = 5)
## [1] 5 5 5 5 5 5 6 7 8 9 10
# Understanding the arguments .x and .y when .f
# is a lambda function
# .x is the accumulating value
1:10 %>% accumulate(~ .x)
## [1] 1 1 1 1 1 1 1 1 1 1
1:10 %>% accumulate(~ 1)
## [1] 1 1 1 1 1 1 1 1 1 1
# .y is element in the list
1:10 %>% accumulate(~ .y)
## [1] 1 2 3 4 5 6 7 8 9 10
library(dplyr)
library(ggplot2)
set.seed(123)
rerun(5, rnorm(100)) %>%
set_names(paste0("sim", 1:5)) %>%
map(~ accumulate(., ~ .05 + .x + .y)) %>%
map_dfr(~ data_frame(value = .x, step = 1:100), .id = "simulation") %>%
ggplot(aes(x = step, y = value)) +
geom_line(aes(color = simulation)) +
ggtitle("Simulations of a random walk with drift")
rerun
参考 4。
13 相关系数\(\rho\)和p value分析
FAY (2017) 提供的例子。
使用函数tidy_comb_all
(Fay 2018)进行。
library(tidystringdist)
comb <- tidy_comb_all(names(airquality))
comb
## # A tibble: 15 x 2
## V1 V2
## * <chr> <chr>
## 1 Ozone Solar.R
## 2 Ozone Wind
## 3 Ozone Temp
## 4 Ozone Month
## 5 Ozone Day
## 6 Solar.R Wind
## 7 Solar.R Temp
## 8 Solar.R Month
## 9 Solar.R Day
## 10 Wind Temp
## 11 Wind Month
## 12 Wind Day
## 13 Temp Month
## 14 Temp Day
## 15 Month Day
bulk_cor <-
comb %>%
pmap(~ cor.test(airquality[[.x]], airquality[[.y]])) %>%
map_df(broom::tidy) %>%
bind_cols(comb, .)
bulk_cor
## # A tibble: 15 x 10
## V1 V2 estimate statistic p.value parameter conf.low conf.high method
## <chr> <chr> <dbl> <dbl> <dbl> <int> <dbl> <dbl> <chr>
## 1 Ozone Sola~ 0.348 3.88 1.79e- 4 109 0.173 0.502 Pears~
## 2 Ozone Wind -0.602 -8.04 9.27e-13 114 -0.706 -0.471 Pears~
## 3 Ozone Temp 0.698 10.4 2.93e-18 114 0.591 0.781 Pears~
## 4 Ozone Month 0.165 1.78 7.76e- 2 114 -0.0183 0.337 Pears~
## 5 Ozone Day -0.0132 -0.141 8.88e- 1 114 -0.195 0.169 Pears~
## 6 Sola~ Wind -0.0568 -0.683 4.96e- 1 144 -0.217 0.107 Pears~
## 7 Sola~ Temp 0.276 3.44 7.52e- 4 144 0.119 0.419 Pears~
## 8 Sola~ Month -0.0753 -0.906 3.66e- 1 144 -0.235 0.0882 Pears~
## 9 Sola~ Day -0.150 -1.82 7.02e- 2 144 -0.305 0.0125 Pears~
## 10 Wind Temp -0.458 -6.33 2.64e- 9 151 -0.575 -0.323 Pears~
## 11 Wind Month -0.178 -2.23 2.75e- 2 151 -0.328 -0.0202 Pears~
## 12 Wind Day 0.0272 0.334 7.39e- 1 151 -0.132 0.185 Pears~
## 13 Temp Month 0.421 5.70 6.03e- 8 151 0.281 0.543 Pears~
## 14 Temp Day -0.131 -1.62 1.08e- 1 151 -0.283 0.0287 Pears~
## 15 Month Day -0.00796 -0.0978 9.22e- 1 151 -0.166 0.151 Pears~
## # ... with 1 more variable: alternative <chr>
[[.x]]
这种写法非常优秀,达到了哈希匹配的方式。
14 partial
提前预设参数,进行函数封装
FAY (2017) 提供的例子。
titanic <- read_csv("../../../picbackup/titanic.csv")
set.seed(20)
train <- rerun(20, sample_frac(titanic, size = 0.8))
validation <- map(train, ~ anti_join(titanic, .x))
rerun
: 实现bootstrap的功能。anti_join
: 实现train and test split,并且不用建立index,一行代码完成。
map_int(validation, nrow) %>% every(~ .x == 262)
检验每个测试集的样本量一样。
library(rpart)
rpart_pimped <- partial(rpart, formula = survived ~ sex, method = "class")
res <- map(train, ~ rpart_pimped(data = .x))
prediction <- map2(validation, res, ~ predict(.y, .x, type = "class"))
w_prediction <- map2(validation, prediction, ~ mutate(.x, prediction = .y))
library(caret)
conf_mats <- map(w_prediction, ~ confusionMatrix(.x$prediction, .x$survived))
Error:
dataand
referenceshould be factors with the same levels.
这里有报错。
应该是\(y\)和\(\hat y\)长度不一样。
map_dbl(conf_mats, ~ .x$byClass["Sensitivity"]) %>% every(~ .x > 0.8)
15 some
和 every
mtcars %>%
some(is.infinite)
## [1] FALSE
- 这使得infinity的检验可以产生类似于
anyNA
的效果。 some()
: 列表中的某些元素是否满足要求? 返回结果TRUE/FALSEevery()
: 列表中的每个元素是否满足要求?(邬书豪 2017)
16 Use list, UDF in map*
my_summarise_f
函数参考 Rodrigues (2018)。
my_summarise_f <- function(dataset, cols, funcs){
dataset %>%
summarise_at(vars(!!!cols), funs(!!!funcs))
}
mtcars %>%
my_summarise_f(quos(mpg, drat, hp), quos(mean, sd, max))
## mpg_mean drat_mean hp_mean mpg_sd drat_sd hp_sd mpg_max drat_max
## 1 20.09062 3.596563 146.6875 6.026948 0.5346787 68.56287 33.9 4.93
## hp_max
## 1 335
data_list = list(mtcars, iris)
cols_mtcars = quos(mpg, drat, hp)
cols_iris = quos(Sepal.Length, Sepal.Width)
cols_list = list(cols_mtcars, cols_iris)
funcs_mtcars = quos(mean, sd, max)
funcs_iris = quos(median, min)
funcs_list = list(funcs_mtcars, funcs_iris)
map2(data_list,
cols_list,
my_summarise_f, funcs = quos(mean, sd, max))
## [[1]]
## mpg_mean drat_mean hp_mean mpg_sd drat_sd hp_sd mpg_max drat_max
## 1 20.09062 3.596563 146.6875 6.026948 0.5346787 68.56287 33.9 4.93
## hp_max
## 1 335
##
## [[2]]
## Sepal.Length_mean Sepal.Width_mean Sepal.Length_sd Sepal.Width_sd
## 1 5.843333 3.057333 0.8280661 0.4358663
## Sepal.Length_max Sepal.Width_max
## 1 7.9 4.4
pmap(
list(
dataset = data_list,
cols = cols_list,
funcs = funcs_list
),
my_summarise_f)
## [[1]]
## mpg_mean drat_mean hp_mean mpg_sd drat_sd hp_sd mpg_max drat_max
## 1 20.09062 3.596563 146.6875 6.026948 0.5346787 68.56287 33.9 4.93
## hp_max
## 1 335
##
## [[2]]
## Sepal.Length_median Sepal.Width_median Sepal.Length_min Sepal.Width_min
## 1 5.8 3 4.3 2
参考文献
FAY, Colin. 2017. “A Crazy Little Thing Called purrr - Part 6 : Doing Statistics.” 2017. https://colinfay.me/purrr-statistics/.
Fay, Colin. 2018. Tidystringdist: String Distance Calculation with Tidy Data Principles. https://CRAN.R-project.org/package=tidystringdist.
Rodrigues, Bruno. 2017. “Lesser Known Purrr Tricks.” 2017. http://www.brodrigues.co/blog/2017-03-24-lesser_known_purrr/.
———. 2018. “Mapping a List of Functions to a List of Datasets with a List of Columns as Arguments.” 2018. http://www.brodrigues.co/blog/2018-01-19-mapping_functions_with_any_cols/.
Wickham, Hadley. 2018a. “Writing Functions in R: Creating a Safe Function.” 2018. https://campus.datacamp.com/courses/writing-functions-in-r/advanced-inputs-and-outputs?ex=2.
———. 2018b. “Writing Functions in R: Using Map Safely.” 2018. https://campus.datacamp.com/courses/writing-functions-in-r/advanced-inputs-and-outputs?ex=3.
———. 2018c. “Writing Functions in R: Working with Errors and Results.” 2018. https://campus.datacamp.com/courses/writing-functions-in-r/advanced-inputs-and-outputs?ex=5.
———. 2018d. “Writing Functions in R: Working with Safe Output.” 2018. https://campus.datacamp.com/courses/writing-functions-in-r/advanced-inputs-and-outputs?ex=4.
Wickham, Hadley, and Garrett Grolemund. 2017. R for Data Science: Import, Tidy, Transform, Visualize, and Model Data. O’Reilly Media, Inc.
Wickham, Hadley, Charlotte Wickham, Nick Carchedi, and Tom Jeon. n.d. Writing Functions in R. DataCamp. https://www.dreatacamp.com/courses/writing-functions-in-r.
邬书豪. 2017. “R for Data Science之purrr包(上).” 2017. https://zhuanlan.zhihu.com/p/32293221.