Tips for using rlist

This post was updated on 2020-10-10. If you find a problem or have a suggestion, feel free to submit an Issue.

The rlist package, developed by Kun Ren, is designed for working with non-tabular data; the formats you will meet most often are JSON and YAML.

1 Choosing a pipe operator

%>% also works with rlist functions. However, in some cases, the operator may impose conflicting interpretation on . symbol to cause unexpected error. (Ren 2016)

In those cases I would consider using pipeR's %>>% operator instead.
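As a minimal sketch of what that looks like, the snippet below pipes a small list through two rlist functions with %>>%. The `people` data here is a toy list made up for illustration, not the sample file used later in this post.

```r
library(rlist)
library(pipeR)

# Hypothetical toy data, made up for this sketch
people <- list(
  list(Name = "Ken", Age = 24),
  list(Name = "James", Age = 25)
)

# %>>% passes the left-hand side as the first argument of the call on the right,
# so rlist is free to resolve Age and Name inside each list element
older_names <- people %>>%
  list.filter(Age >= 25) %>>%
  list.mapv(Name)

print(older_names)
```

Because %>>% pipes to the first argument by default, the expressions handed to rlist are left untouched, which is the situation the quote above is concerned with.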

2 JSON format

In the coming tutorial pages, we will mainly use JSON data to demonstrate the features and examples of rlist package.

  1. [] creates an unnamed node array.
  2. {} creates a named node list.
  3. "key" : value creates a key-value pair where value can be a number, a string, a [] array, or a {} list.

rlist package imports jsonlite package to read/write JSON data.
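To make the three rules concrete, here is a small sketch that parses a JSON string directly with jsonlite. The JSON text is a made-up fragment in the spirit of the sample data, and `simplifyVector = FALSE` is my own choice so that every array stays a list; rlist's own loaders may simplify differently.

```r
library(jsonlite)

# Hypothetical JSON text: an unnamed array [] holding two named lists {}
json_text <- '[
  {"Name": "Ken", "Age": 24, "Interests": ["reading", "music"]},
  {"Name": "James", "Age": 25, "Interests": ["sports", "music"]}
]'

# simplifyVector = FALSE keeps arrays as R lists instead of
# collapsing them into atomic vectors or a data frame
people <- fromJSON(json_text, simplifyVector = FALSE)

# Preview the nested structure
str(people)
```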

3 YAML format

rlist also imports yaml package to read/write YAML data.

- Name: Ken
  Age: 24
  Interests:
  - reading
  - music
  - movies
  Expertise:
    R: 2
    CSharp: 4
    Python: 3
- Name: James
  Age: 25
  Interests:
  - sports
  - music
  Expertise:
    R: 3
    Java: 2
    Cpp: 5
- Name: Penny
  Age: 24
  Interests:
  - movies
  - reading
  Expertise:
    R: 1
    Cpp: 4
    Python: 2

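The data comes from Ren (2016): https://renkun-ken.github.io/rlist-tutorial/data/sample.yaml. YAML is also commonly seen in the header declaration at the top of R Markdown documents.

On a blog, this can be used to extract the YAML data.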
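A rough sketch of reading such YAML into a nested R list, calling the yaml package directly. The string below is a shortened copy of the first record above; using `yaml.load()` on an inline string rather than a file is my assumption, made so the example runs without any file on disk.

```r
library(yaml)

# A shortened copy of the first record from the sample data
yaml_text <- "
- Name: Ken
  Age: 24
  Interests:
  - reading
  - music
  Expertise:
    R: 2
    CSharp: 4
"

# yaml.load() parses a YAML string into nested R lists
people <- yaml.load(yaml_text)

# Preview the nested structure
str(people)
```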

4 Example

library(tidyverse)
## Warning: package 'tidyverse' was built under R version 3.6.3
## -- Attaching packages --------------------------------------------------------------------------- tidyverse 1.3.0 --
## √ ggplot2 3.3.2     √ purrr   0.3.4
## √ tibble  3.0.3     √ dplyr   1.0.2
## √ tidyr   1.1.2     √ stringr 1.4.0
## √ readr   1.3.1     √ forcats 0.5.0
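## Warning: package 'ggplot2' was built under R version 3.6.3
## Warning: package 'tibble' was built under R version 3.6.3
## Warning: package 'tidyr' was built under R version 3.6.3
## Warning: package 'readr' was built under R version 3.6.3
## Warning: package 'purrr' was built under R version 3.6.3
## Warning: package 'dplyr' was built under R version 3.6.3
## Warning: package 'stringr' was built under R version 3.6.3
## Warning: package 'forcats' was built under R version 3.6.3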
## -- Conflicts ------------------------------------------------------------------------------ tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()
library(rlist)
## Warning: package 'rlist' was built under R version 3.6.3
people <- list.load("data/sample.json")
str(people)
## List of 3
##  $ :List of 4
##   ..$ Name     : chr "Ken"
##   ..$ Age      : int 24
##   ..$ Interests: chr [1:3] "reading" "music" "movies"
##   ..$ Expertise:List of 3
##   .. ..$ R     : int 2
##   .. ..$ CSharp: int 4
##   .. ..$ Python: int 3
##  $ :List of 4
##   ..$ Name     : chr "James"
##   ..$ Age      : int 25
##   ..$ Interests: chr [1:2] "sports" "music"
##   ..$ Expertise:List of 3
##   .. ..$ R   : int 3
##   .. ..$ Java: int 2
##   .. ..$ Cpp : int 5
##  $ :List of 4
##   ..$ Name     : chr "Penny"
##   ..$ Age      : int 24
##   ..$ Interests: chr [1:2] "movies" "reading"
##   ..$ Expertise:List of 3
##   .. ..$ R     : int 1
##   .. ..$ Cpp   : int 4
##   .. ..$ Python: int 2

NOTE: str() previews the structure of an object. We may use this function more often to avoid verbose representation of list objects.

str() gives a nice, compact display.

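# Build a small summary list per person, then format it as one sentence
people %>% 
    map(
        ~ list(
            name = .$Name,
            age = .$Age,
            # collapse the min and max expertise scores into a string such as "2-4"
            range = .$Expertise %>% 
                as.numeric() %>% 
                range() %>% 
                as.character() %>% 
                str_flatten("-")
        )
    ) %>% 
    map(~ glue::glue("{.$name}: {.$age} and has {.$range} year experience"))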
## [[1]]
## Ken: 24 and has 2-4 year experience
## 
## [[2]]
## James: 25 and has 2-5 year experience
## 
## [[3]]
## Penny: 24 and has 1-4 year experience

This could be used as an exercise in a tutorial.

5 filter

people %>% 
    list.filter(Age >= 25 & "music" %in% Interests) %>% 
    map(~.$Name)
## [[1]]
## [1] "James"

I have not yet found a good alternative to this.

  1. list.find
  2. list.findi
  3. list.first
  4. list.last
  5. list.take
  6. list.skip
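For orientation, here is a short sketch of how the six functions above differ. The `people` list is toy data made up for this example; the comments describe what each call returns as I understand the rlist documentation.

```r
library(rlist)

# Hypothetical toy data, made up for this sketch
people <- list(
  list(Name = "Ken",   Age = 24),
  list(Name = "James", Age = 25),
  list(Name = "Penny", Age = 24)
)

# A list holding at most n elements that satisfy the condition
found <- list.find(people, Age == 24, n = 1)

# The integer positions of at most n matching elements
found_at <- list.findi(people, Age == 24, n = 2)

# The first / last matching element itself (not wrapped in a list)
first_24 <- list.first(people, Age == 24)
last_24  <- list.last(people, Age == 24)

# Purely positional: keep the first 2 elements / drop the first 2 elements
head_2 <- list.take(people, 2)
tail_1 <- list.skip(people, 2)

str(found_at)
```

Unlike list.filter, list.find and list.findi take an `n` argument that caps how many matches are returned.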

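There are too many of these, and none is an obvious choice.

What I want is an alternative to list.filter that returns only some of the fields, because returning everything takes a long time when the JSON is large.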
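One possible way to get there, as a sketch only: filter first, then keep just the fields of interest with rlist's list.select. The `people` list is toy data made up for this example, and whether this is actually faster on a large JSON file is an assumption I have not measured.

```r
library(rlist)

# Hypothetical toy data, made up for this sketch
people <- list(
  list(Name = "Ken",   Age = 24, Interests = c("reading", "music", "movies")),
  list(Name = "James", Age = 25, Interests = c("sports", "music"))
)

# Step 1: keep only the elements that satisfy the condition
matched <- list.filter(people, Age >= 25 & "music" %in% Interests)

# Step 2: keep only the Name and Age fields of each remaining element
slim <- list.select(matched, Name, Age)

str(slim)
```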

Ren, Kun. 2016. “rlist Tutorial.” https://renkun-ken.github.io/rlist-tutorial/index.html.