14 min read

purrr包使用技巧

本文于2020-10-10更新。 如发现问题或者有建议,欢迎提交 Issue

o # map

map函数相当于for循环,但是不需要注意for(){}的缩进,而且代码写起来,简单些。

mtcars %>%
  group_by(cyl) %>%
  nest() %>%
  mutate(p = map(data,
                 function(x){
                   p <- x %>% ggplot(aes(x = mpg, y = disp)) +
                     geom_point() +
                     labs(
                       x = "Miles/(US) gallon",
                       y = "Displacement (cu.in.)",
                       subtitle = paste("cyl =",cyl),
                       title = "The scatter diagram for Miles/(US) gallon by Displacement (cu.in.)",
                       caption = "Data source: mtcars"
                     ) +
                     theme_minimal()
                   print(p)}))

## # A tibble: 3 x 3
## # Groups:   cyl [3]
##     cyl data               p     
##   <dbl> <list>             <list>
## 1     6 <tibble [7 x 10]>  <gg>  
## 2     4 <tibble [11 x 10]> <gg>  
## 3     8 <tibble [14 x 10]> <gg>

purrr::map可以批量跑函数,但是会出现报错,这个时候possiblysafely可以给报错的打个标签,继续跑 rdata.lu BlogWickham et al. (n.d.) 在第四和第五部分主要解释了这两个函数。

1 safely()函数

# Create safe_readLines() by passing readLines() to safely()
safe_readLines <- safely(readLines)

# Call safe_readLines() on "http://example.org"
safe_readLines("http://example.org")

# Call safe_readLines() on "http://asdfasdasdkfjlda"
safe_readLines("http://asdfasdasdkfjlda")

H. Wickham (2018a) 给出了safely()函数的例子。 显然第二个例子不是一个网页,因此会报错,因此两个例子的区别在于,

  • 成功了,因此$result非空,$error
  • 失败了,因此$result空,$error非空

1.1 map 和 safely

urls <- list(
  example = "http://example.org",
  rproj = "http://www.r-project.org",
  asdf = "http://asdfasdasdkfjlda"
)
# Define safe_readLines()
safe_readLines <- safely(readLines)

# Use the safe_readLines() function with map(): html
html <- map(urls, safe_readLines)

# Call str() on html
str(html)

# Extract the result from one of the successful elements
map(html, "result")

# Extract the error from the element that was unsuccessful
map(html, "error")

这里提供了提取resulterror的方式 (H. Wickham 2018b)

2 possibly函数解释

possibly_some_function = possibly(some_function, otherwise = "something wrong here")是safely的简单版本。

3 pwalk批量操作

看了一下 Wickham and Grolemund (2017) 的书,真是不错啊,不小心重新看一下, 如获至宝。 例如,

Wickham and Grolemund (2017, 336)提到批量保存ggplot图片的方式。

library(ggplot2)
    plots <- mtcars %>%
      split(.$cyl) %>%
      map(~ggplot(., aes(mpg, wt)) + geom_point())
    paths <- stringr::str_c(names(plots), ".pdf")
pwalk(list(paths, plots), ggsave, path = getwd())
  • 使用了splitmap函数,批量使用函数ggplot
  • str_c用于图片命名,功效类似于paste
  • 最后最强的是pwalk,给予一个list,然后使用ggsave

4 rerun: resample

library(purrr)
10 %>% rerun(rnorm(5))
## [[1]]
## [1] -0.45299858 -1.40542814 -0.04279963  0.72608261 -1.41528489
## 
## [[2]]
## [1] -0.4615507  1.6824498 -1.1289021 -1.2268022 -0.4856850
## 
## [[3]]
## [1] -0.3918068 -0.3901946  0.1806043 -1.6189512 -1.8983127
## 
## [[4]]
## [1]  0.7468726  0.6292661  0.3700957 -0.4464054 -0.0136263
## 
## [[5]]
## [1]  0.3604068  1.3956616 -0.2069600 -0.9912203 -0.5801156
## 
## [[6]]
## [1]  0.1537699 -0.1276541 -0.2858236 -0.1230204  0.3429197
## 
## [[7]]
## [1] -0.3390390  0.6448945 -1.1980548  2.4970258  1.2497283
## 
## [[8]]
## [1] 0.13972860 1.78194093 0.05602954 0.57862112 1.21168028
## 
## [[9]]
## [1] -0.8268868 -0.7651689  1.0638391  0.2099143 -0.4746792
## 
## [[10]]
## [1] -0.2862176 -0.9219227  0.5225702  1.4487096  1.2010788
10 %>%
  rerun(x = rnorm(5),y= rnorm(5)) %>%
  map_dbl(~ cor(.x$x, .x$y))
##  [1] -0.15088408 -0.79703083  0.72777903 -0.07017608 -0.75215386  0.64780472
##  [7]  0.43406793 -0.26921301 -0.57959740  0.33200593

5 map替代mutate

FAY (2017) 提供的例子。

library(purrr)
airquality %>% glimpse()
## Rows: 153
## Columns: 6
## $ Ozone   <int> 41, 36, 12, 18, NA, 28, 23, 19, 8, NA, 7, 16, 11, 14, 18, 1...
## $ Solar.R <int> 190, 118, 149, 313, NA, NA, 299, 99, 19, 194, NA, 256, 290,...
## $ Wind    <dbl> 7.4, 8.0, 12.6, 11.5, 14.3, 14.9, 8.6, 13.8, 20.1, 8.6, 6.9...
## $ Temp    <int> 67, 72, 74, 62, 56, 66, 65, 59, 61, 69, 74, 69, 66, 68, 58,...
## $ Month   <int> 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5,...
## $ Day     <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, ...

shapiro.test: Performs the Shapiro-Wilk test of normality.

airquality %>% map(shapiro.test)
## $Ozone
## 
##  Shapiro-Wilk normality test
## 
## data:  .x[[i]]
## W = 0.87867, p-value = 2.79e-08
## 
## 
## $Solar.R
## 
##  Shapiro-Wilk normality test
## 
## data:  .x[[i]]
## W = 0.94183, p-value = 9.492e-06
## 
## 
## $Wind
## 
##  Shapiro-Wilk normality test
## 
## data:  .x[[i]]
## W = 0.98575, p-value = 0.1178
## 
## 
## $Temp
## 
##  Shapiro-Wilk normality test
## 
## data:  .x[[i]]
## W = 0.97617, p-value = 0.009319
## 
## 
## $Month
## 
##  Shapiro-Wilk normality test
## 
## data:  .x[[i]]
## W = 0.88804, p-value = 2.258e-09
## 
## 
## $Day
## 
##  Shapiro-Wilk normality test
## 
## data:  .x[[i]]
## W = 0.95313, p-value = 5.048e-05

6 keep替代filter

airquality %>%
  map(shapiro.test)%>%
  keep(~ .x$p.value > 0.05)
## $Wind
## 
##  Shapiro-Wilk normality test
## 
## data:  .x[[i]]
## W = 0.98575, p-value = 0.1178

FAY (2017) 提供的例子。 keep业务上非常方便,直接选择shapiro.test反馈的结果.$...之一作为筛选条件,不需要建立data.frame再使用filter函数。

7 map_if替代mutate_if

map_if(iris, is.numeric, shapiro.test)
## $Sepal.Length
## 
##  Shapiro-Wilk normality test
## 
## data:  .x[[i]]
## W = 0.97609, p-value = 0.01018
## 
## 
## $Sepal.Width
## 
##  Shapiro-Wilk normality test
## 
## data:  .x[[i]]
## W = 0.98492, p-value = 0.1012
## 
## 
## $Petal.Length
## 
##  Shapiro-Wilk normality test
## 
## data:  .x[[i]]
## W = 0.87627, p-value = 7.412e-10
## 
## 
## $Petal.Width
## 
##  Shapiro-Wilk normality test
## 
## data:  .x[[i]]
## W = 0.90183, p-value = 1.68e-08
## 
## 
## $Species
##   [1] setosa     setosa     setosa     setosa     setosa     setosa    
##   [7] setosa     setosa     setosa     setosa     setosa     setosa    
##  [13] setosa     setosa     setosa     setosa     setosa     setosa    
##  [19] setosa     setosa     setosa     setosa     setosa     setosa    
##  [25] setosa     setosa     setosa     setosa     setosa     setosa    
##  [31] setosa     setosa     setosa     setosa     setosa     setosa    
##  [37] setosa     setosa     setosa     setosa     setosa     setosa    
##  [43] setosa     setosa     setosa     setosa     setosa     setosa    
##  [49] setosa     setosa     versicolor versicolor versicolor versicolor
##  [55] versicolor versicolor versicolor versicolor versicolor versicolor
##  [61] versicolor versicolor versicolor versicolor versicolor versicolor
##  [67] versicolor versicolor versicolor versicolor versicolor versicolor
##  [73] versicolor versicolor versicolor versicolor versicolor versicolor
##  [79] versicolor versicolor versicolor versicolor versicolor versicolor
##  [85] versicolor versicolor versicolor versicolor versicolor versicolor
##  [91] versicolor versicolor versicolor versicolor versicolor versicolor
##  [97] versicolor versicolor versicolor versicolor virginica  virginica 
## [103] virginica  virginica  virginica  virginica  virginica  virginica 
## [109] virginica  virginica  virginica  virginica  virginica  virginica 
## [115] virginica  virginica  virginica  virginica  virginica  virginica 
## [121] virginica  virginica  virginica  virginica  virginica  virginica 
## [127] virginica  virginica  virginica  virginica  virginica  virginica 
## [133] virginica  virginica  virginica  virginica  virginica  virginica 
## [139] virginica  virginica  virginica  virginica  virginica  virginica 
## [145] virginica  virginica  virginica  virginica  virginica  virginica 
## Levels: setosa versicolor virginica

map_if allows you to map only on numeric variables in your data.frame

因此map功效远大于mutatemap将每个columns当成一个element。 FAY (2017) 提供的例子。

numbers <- list(11, 12, 13, 14)
is_even <- function(x){
  !as.logical(x %% 2)
}
map_if(numbers, is_even, sqrt)
## [[1]]
## [1] 11
## 
## [[2]]
## [1] 3.464102
## 
## [[3]]
## [1] 13
## 
## [[4]]
## [1] 3.741657

map_if函数的例子来自 Rodrigues (2017)

8 map_at

map_at(numbers, c(1,3), sqrt)
## [[1]]
## [1] 3.316625
## 
## [[2]]
## [1] 12
## 
## [[3]]
## [1] 3.605551
## 
## [[4]]
## [1] 14

map_at函数的例子来自 Rodrigues (2017)

9 transpose函数

这里主要介绍transpose函数(H. Wickham 2018d)

例如

nested_list <- list(
   x1 = list(a = 1, b = 2),
   x2 = list(a = 3, b = 4)
)

当我们想要提取x1中的a时,我们需要写nested_list[[x1]][[a]],按着正序写的。 但是当我们对listnested_list使用transpose函数后,可以倒序写,transpose(nested_list)[[a]][[x1]]。 因此当我们的需要提取一个list中,非常inside的数据时,并且,都是统一命名,例如a,这个方法会非常管用。

# Define save_readLines() and html
safe_readLines <- safely(readLines)
html <- map(urls, safe_readLines)

# Examine the structure of transpose(html)
str(transpose(html))

# Extract the results: res
res <- transpose(html)[["result"]]

# Extract the errors: errs
errs <- transpose(html)[["error"]]

这里可以看到,以resulterror的list顺序排在了前面,不是原来的examplerprojasdf

# Initialize some objects
safe_readLines <- safely(readLines)
html <- map(urls, safe_readLines)
res <- transpose(html)[["result"]]
errs <- transpose(html)[["error"]]

# Create a logical vector is_ok
is_ok <- map_lgl(errs, is_null)

# Extract the successful results
res[is_ok]

# Extract the input from the unsuccessful results
urls[!is_ok]

最后一步就提取出来那一个元素出了问题urls[!is_ok](H. Wickham 2018c)

9.1 transposemodify_depth

numbers_with_error <- list(1, 2, 3, "spam", 4)
numbers_with_error
## [[1]]
## [1] 1
## 
## [[2]]
## [1] 2
## 
## [[3]]
## [1] 3
## 
## [[4]]
## [1] "spam"
## 
## [[5]]
## [1] 4
safe_sqrt <- safely(sqrt, otherwise = NA_real_)
safe_result_list <- numbers_with_error %>% map(safe_sqrt)
safe_result_list
## [[1]]
## [[1]]$result
## [1] 1
## 
## [[1]]$error
## NULL
## 
## 
## [[2]]
## [[2]]$result
## [1] 1.414214
## 
## [[2]]$error
## NULL
## 
## 
## [[3]]
## [[3]]$result
## [1] 1.732051
## 
## [[3]]$error
## NULL
## 
## 
## [[4]]
## [[4]]$result
## [1] NA
## 
## [[4]]$error
## <simpleError in .Primitive("sqrt")(x): 数学函数中用了非数值参数>
## 
## 
## [[5]]
## [[5]]$result
## [1] 2
## 
## [[5]]$error
## NULL
transposed_list <- transpose(safe_result_list)
transposed_list
## $result
## $result[[1]]
## [1] 1
## 
## $result[[2]]
## [1] 1.414214
## 
## $result[[3]]
## [1] 1.732051
## 
## $result[[4]]
## [1] NA
## 
## $result[[5]]
## [1] 2
## 
## 
## $error
## $error[[1]]
## NULL
## 
## $error[[2]]
## NULL
## 
## $error[[3]]
## NULL
## 
## $error[[4]]
## <simpleError in .Primitive("sqrt")(x): 数学函数中用了非数值参数>
## 
## $error[[5]]
## NULL
transposed_list %>%
    modify_depth(2, is_null)
## $result
## $result[[1]]
## [1] FALSE
## 
## $result[[2]]
## [1] FALSE
## 
## $result[[3]]
## [1] FALSE
## 
## $result[[4]]
## [1] FALSE
## 
## $result[[5]]
## [1] FALSE
## 
## 
## $error
## $error[[1]]
## [1] TRUE
## 
## $error[[2]]
## [1] TRUE
## 
## $error[[3]]
## [1] TRUE
## 
## $error[[4]]
## [1] FALSE
## 
## $error[[5]]
## [1] TRUE

modify_depth可以直接使用函数作用于对应的层级 Rodrigues (2017)

9.2 map的替换

mylist <- 
    list(
        list01= list(
            var01 = 1:3
            ,var02 = 1:5
            ,var03 = 1:7
        )
        ,list02= list(
            var01 = 1:9
            ,var02 = 1:11
            ,var03 = 1:13
        )
        
    )
str(mylist)
## List of 2
##  $ list01:List of 3
##   ..$ var01: int [1:3] 1 2 3
##   ..$ var02: int [1:5] 1 2 3 4 5
##   ..$ var03: int [1:7] 1 2 3 4 5 6 7
##  $ list02:List of 3
##   ..$ var01: int [1:9] 1 2 3 4 5 6 7 8 9
##   ..$ var02: int [1:11] 1 2 3 4 5 6 7 8 9 10 ...
##   ..$ var03: int [1:13] 1 2 3 4 5 6 7 8 9 10 ...
library(purrr)
mylist %>% 
    transpose() %>% 
    .$var02 %>% 
    unlist %>% 
    table()
## .
##  1  2  3  4  5  6  7  8  9 10 11 
##  2  2  2  2  2  1  1  1  1  1  1
# or TBT8's answer
# https://stackoverflow.com/a/53213468/8625228

mylist %>% 
    map(~ .[['var02']] == 1) %>% 
    unlist %>% 
    sum
## [1] 2

10 Set names of list elements

name_element <- c("sqrt()", "ok?")
transposed_list %>% set_names(name_element)
## $`sqrt()`
## $`sqrt()`[[1]]
## [1] 1
## 
## $`sqrt()`[[2]]
## [1] 1.414214
## 
## $`sqrt()`[[3]]
## [1] 1.732051
## 
## $`sqrt()`[[4]]
## [1] NA
## 
## $`sqrt()`[[5]]
## [1] 2
## 
## 
## $`ok?`
## $`ok?`[[1]]
## NULL
## 
## $`ok?`[[2]]
## NULL
## 
## $`ok?`[[3]]
## NULL
## 
## $`ok?`[[4]]
## <simpleError in .Primitive("sqrt")(x): 数学函数中用了非数值参数>
## 
## $`ok?`[[5]]
## NULL

set_names函数来自purrr包,详见 复制一个data.frame

11 reduce反馈最后一个值

numbers
## [[1]]
## [1] 11
## 
## [[2]]
## [1] 12
## 
## [[3]]
## [1] 13
## 
## [[4]]
## [1] 14
reduce(numbers, `*`)
## [1] 24024

还可以批量完成计算。 reduce函数的例子来自 Rodrigues (2017)

set.seed(123)
mat1 <- matrix(rnorm(10), nrow = 2)
mat2 <- matrix(rnorm(10), nrow = 2)
mat3 <- matrix(rnorm(10), nrow = 2)
list_mat <- list(mat1, mat2, mat3)
reduce(list_mat, `+`)
##             [,1]       [,2]      [,3]      [,4]       [,5]
## [1,] -0.40421756  0.9334753 -1.051593  1.796554 -1.1236339
## [2,] -0.08833858 -0.5477001  1.815285 -3.078305  0.3353615

想象从每一个矩阵的对应位置\(A_{i,j}\)取出一个值,作为一个list,一起完成+的计算。 reduce函数的例子来自 Rodrigues (2017)

11.1 reduce 多表合并

df1 <- as.data.frame(mat1)
df2 <- as.data.frame(mat2)
df3 <- as.data.frame(mat3)
list_df <- list(df1, df2, df3)
reduce(list_df, dplyr::full_join)
##           V1          V2         V3         V4         V5
## 1 -0.5604756  1.55870831  0.1292877  0.4609162 -0.6868529
## 2 -0.2301775  0.07050839  1.7150650 -1.2650612 -0.4456620
## 3  1.2240818  0.40077145 -0.5558411  0.4978505  0.7013559
## 4  0.3598138  0.11068272  1.7869131 -1.9666172 -0.4727914
## 5 -1.0678237 -1.02600445 -0.6250393  0.8377870 -1.1381369
## 6 -0.2179749 -0.72889123 -1.6866933  0.1533731  1.2538149

reduce函数的例子来自 Rodrigues (2017)

因此可以多表read进行合并。

as.list(list.files()) %>% map(function(x) read.csv(x)) %>% reduce(rbind)

12 accumulateaccumulate_right

accumulate(numbers, `*`)
## [1]    11   132  1716 24024
accumulate_right(numbers, `*`)
## [1] 24024  2184   182    14
  • accumulate是按左向右滚动
  • accumulate_right是按右向左滚动 例子来自 Rodrigues (2017)
1:3 %>% accumulate(`+`)
## [1] 1 3 6
1:10 %>% accumulate_right(`*`)
##  [1] 3628800 3628800 1814400  604800  151200   30240    5040     720      90
## [10]      10
# From Haskell's scanl documentation
1:10 %>% accumulate(max, .init = 5)
##  [1]  5  5  5  5  5  5  6  7  8  9 10
# Understanding the arguments .x and .y when .f
# is a lambda function
# .x is the accumulating value
1:10 %>% accumulate(~ .x)
##  [1] 1 1 1 1 1 1 1 1 1 1
1:10 %>% accumulate(~ 1)
##  [1] 1 1 1 1 1 1 1 1 1 1
# .y is element in the list
1:10 %>% accumulate(~ .y)
##  [1]  1  2  3  4  5  6  7  8  9 10
library(dplyr)
library(ggplot2)

set.seed(123)
rerun(5, rnorm(100)) %>%
  set_names(paste0("sim", 1:5)) %>%
  map(~ accumulate(., ~ .05 + .x + .y)) %>%
  map_dfr(~ data_frame(value = .x, step = 1:100), .id = "simulation") %>%
  ggplot(aes(x = step, y = value)) +
    geom_line(aes(color = simulation)) +
    ggtitle("Simulations of a random walk with drift")

rerun参考 4

13 相关系数\(\rho\)和p value分析

FAY (2017) 提供的例子。 使用函数tidy_comb_all(Fay 2018)进行。

library(tidystringdist)
comb <- tidy_comb_all(names(airquality))
comb
## # A tibble: 15 x 2
##    V1      V2     
##  * <chr>   <chr>  
##  1 Ozone   Solar.R
##  2 Ozone   Wind   
##  3 Ozone   Temp   
##  4 Ozone   Month  
##  5 Ozone   Day    
##  6 Solar.R Wind   
##  7 Solar.R Temp   
##  8 Solar.R Month  
##  9 Solar.R Day    
## 10 Wind    Temp   
## 11 Wind    Month  
## 12 Wind    Day    
## 13 Temp    Month  
## 14 Temp    Day    
## 15 Month   Day
bulk_cor <-
  comb %>%
  pmap(~ cor.test(airquality[[.x]], airquality[[.y]])) %>%
  map_df(broom::tidy) %>%
  bind_cols(comb, .)
bulk_cor
## # A tibble: 15 x 10
##    V1    V2    estimate statistic  p.value parameter conf.low conf.high method
##    <chr> <chr>    <dbl>     <dbl>    <dbl>     <int>    <dbl>     <dbl> <chr> 
##  1 Ozone Sola~  0.348      3.88   1.79e- 4       109   0.173     0.502  Pears~
##  2 Ozone Wind  -0.602     -8.04   9.27e-13       114  -0.706    -0.471  Pears~
##  3 Ozone Temp   0.698     10.4    2.93e-18       114   0.591     0.781  Pears~
##  4 Ozone Month  0.165      1.78   7.76e- 2       114  -0.0183    0.337  Pears~
##  5 Ozone Day   -0.0132    -0.141  8.88e- 1       114  -0.195     0.169  Pears~
##  6 Sola~ Wind  -0.0568    -0.683  4.96e- 1       144  -0.217     0.107  Pears~
##  7 Sola~ Temp   0.276      3.44   7.52e- 4       144   0.119     0.419  Pears~
##  8 Sola~ Month -0.0753    -0.906  3.66e- 1       144  -0.235     0.0882 Pears~
##  9 Sola~ Day   -0.150     -1.82   7.02e- 2       144  -0.305     0.0125 Pears~
## 10 Wind  Temp  -0.458     -6.33   2.64e- 9       151  -0.575    -0.323  Pears~
## 11 Wind  Month -0.178     -2.23   2.75e- 2       151  -0.328    -0.0202 Pears~
## 12 Wind  Day    0.0272     0.334  7.39e- 1       151  -0.132     0.185  Pears~
## 13 Temp  Month  0.421      5.70   6.03e- 8       151   0.281     0.543  Pears~
## 14 Temp  Day   -0.131     -1.62   1.08e- 1       151  -0.283     0.0287 Pears~
## 15 Month Day   -0.00796   -0.0978 9.22e- 1       151  -0.166     0.151  Pears~
## # ... with 1 more variable: alternative <chr>

[[.x]] 这种写法非常优秀,达到了哈希匹配的方式。

14 partial提前预设参数,进行函数封装

FAY (2017) 提供的例子。

titanic <- read_csv("../../../picbackup/titanic.csv")
set.seed(20)
train <- rerun(20, sample_frac(titanic, size = 0.8))
validation <- map(train, ~ anti_join(titanic, .x))
  • rerun: 实现bootstrap的功能。
  • anti_join: 实现train and test split,并且不用建立index,一行代码完成。
map_int(validation, nrow) %>% every(~ .x == 262)

检验每个测试集的样本量一样。

library(rpart)
rpart_pimped <- partial(rpart, formula = survived ~ sex, method = "class")
res <- map(train, ~ rpart_pimped(data = .x))
prediction <- map2(validation, res, ~ predict(.y, .x, type = "class"))
w_prediction <- map2(validation, prediction, ~ mutate(.x, prediction = .y))
library(caret)
conf_mats <- map(w_prediction, ~ confusionMatrix(.x$prediction, .x$survived))

Error:dataandreferenceshould be factors with the same levels.这里有报错。 应该是\(y\)\(\hat y\)长度不一样。

map_dbl(conf_mats, ~ .x$byClass["Sensitivity"]) %>% every(~ .x > 0.8)

15 someevery

mtcars %>%
    some(is.infinite)
## [1] FALSE
  • 这使得infinity的检验可以产生类似于anyNA的效果。
  • some(): 列表中的某些元素是否满足要求? 返回结果TRUE/FALSE
  • every(): 列表中的每个元素是否满足要求?(邬书豪 2017)

16 Use list, UDF in map*

my_summarise_f函数参考 Rodrigues (2018)

my_summarise_f <- function(dataset, cols, funcs){
  dataset %>%
    summarise_at(vars(!!!cols), funs(!!!funcs))
}
mtcars %>%
  my_summarise_f(quos(mpg, drat, hp), quos(mean, sd, max))
##   mpg_mean drat_mean  hp_mean   mpg_sd   drat_sd    hp_sd mpg_max drat_max
## 1 20.09062  3.596563 146.6875 6.026948 0.5346787 68.56287    33.9     4.93
##   hp_max
## 1    335
data_list = list(mtcars, iris)

    cols_mtcars = quos(mpg, drat, hp)
    cols_iris = quos(Sepal.Length, Sepal.Width)

cols_list = list(cols_mtcars, cols_iris)

    funcs_mtcars = quos(mean, sd, max)
    funcs_iris = quos(median, min)

funcs_list = list(funcs_mtcars, funcs_iris)

map2(data_list,
     cols_list,
     my_summarise_f, funcs = quos(mean, sd, max))
## [[1]]
##   mpg_mean drat_mean  hp_mean   mpg_sd   drat_sd    hp_sd mpg_max drat_max
## 1 20.09062  3.596563 146.6875 6.026948 0.5346787 68.56287    33.9     4.93
##   hp_max
## 1    335
## 
## [[2]]
##   Sepal.Length_mean Sepal.Width_mean Sepal.Length_sd Sepal.Width_sd
## 1          5.843333         3.057333       0.8280661      0.4358663
##   Sepal.Length_max Sepal.Width_max
## 1              7.9             4.4
pmap(
  list(
    dataset = data_list,
    cols = cols_list,
    funcs = funcs_list
  ),
  my_summarise_f)
## [[1]]
##   mpg_mean drat_mean  hp_mean   mpg_sd   drat_sd    hp_sd mpg_max drat_max
## 1 20.09062  3.596563 146.6875 6.026948 0.5346787 68.56287    33.9     4.93
##   hp_max
## 1    335
## 
## [[2]]
##   Sepal.Length_median Sepal.Width_median Sepal.Length_min Sepal.Width_min
## 1                 5.8                  3              4.3               2

17 反馈 NULL 的解决情况

map 反馈 NULL 的情况有些讨论Github Issue 231,参考Stack Overflow使用 if else 完成。

具体参考github的代码。

参考文献

FAY, Colin. 2017. “A Crazy Little Thing Called purrr - Part 6 : Doing Statistics.” 2017. https://colinfay.me/purrr-statistics/.

Fay, Colin. 2018. Tidystringdist: String Distance Calculation with Tidy Data Principles. https://CRAN.R-project.org/package=tidystringdist.

Rodrigues, Bruno. 2017. “Lesser Known Purrr Tricks.” 2017. http://www.brodrigues.co/blog/2017-03-24-lesser_known_purrr/.

———. 2018. “Mapping a List of Functions to a List of Datasets with a List of Columns as Arguments.” 2018. http://www.brodrigues.co/blog/2018-01-19-mapping_functions_with_any_cols/.

Wickham, Hadley. 2018a. “Writing Functions in R: Creating a Safe Function.” 2018. https://campus.datacamp.com/courses/writing-functions-in-r/advanced-inputs-and-outputs?ex=2.

———. 2018b. “Writing Functions in R: Using Map Safely.” 2018. https://campus.datacamp.com/courses/writing-functions-in-r/advanced-inputs-and-outputs?ex=3.

———. 2018c. “Writing Functions in R: Working with Errors and Results.” 2018. https://campus.datacamp.com/courses/writing-functions-in-r/advanced-inputs-and-outputs?ex=5.

———. 2018d. “Writing Functions in R: Working with Safe Output.” 2018. https://campus.datacamp.com/courses/writing-functions-in-r/advanced-inputs-and-outputs?ex=4.

Wickham, Hadley, and Garrett Grolemund. 2017. R for Data Science: Import, Tidy, Transform, Visualize, and Model Data. O’Reilly Media, Inc.

Wickham, Hadley, Charlotte Wickham, Nick Carchedi, and Tom Jeon. n.d. Writing Functions in R. DataCamp. https://www.dreatacamp.com/courses/writing-functions-in-r.

邬书豪. 2017. “R for Data Science之purrr包(上).” 2017. https://zhuanlan.zhihu.com/p/32293221.