This post was last updated on 2020-10-10. If you spot a problem or have a suggestion, feel free to open an Issue.
The rlist package, developed by Kun Ren, is designed for working with non-tabular data. The most common input formats are JSON and YAML.
1 Choosing a pipe operator
%>% also works with rlist functions. However, in some cases, the operator may impose a conflicting interpretation on the dot (.) symbol and cause unexpected errors (Ren 2016).
In those cases, consider using the %>>% operator from pipeR instead.
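As a minimal sketch of what piping rlist calls with pipeR can look like, the snippet below filters a small list and extracts one field. The inline people list is a trimmed stand-in for the sample data used later in this post, and it assumes both rlist and pipeR are installed.

```r
library(rlist)
library(pipeR)

# Inline stand-in for the sample data (assumption: trimmed to two fields).
people <- list(
  list(Name = "Ken", Age = 24),
  list(Name = "James", Age = 25),
  list(Name = "Penny", Age = 24)
)

# %>>% passes the left-hand side as the first argument of each rlist call.
older <- people %>>%
  list.filter(Age >= 25) %>>%
  list.mapv(Name)

print(older)
```

Because %>>% only pipes to the first argument by default, the field names inside list.filter and list.mapv are left for rlist to interpret.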
2 JSON format
In the coming tutorial pages, we will mainly use JSON data to demonstrate the features and examples of rlist package.
[] creates an unnamed node array. {} creates a named node list. "key" : value creates a key-value pair where value can be a number, a string, a [] array, or a {} list.
The rlist package imports the jsonlite package to read/write JSON data.
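To make the bracket rules concrete, here is a small sketch that parses an inline JSON string with jsonlite directly. The JSON text is a hypothetical one-record excerpt shaped like the sample data, and the simplifyDataFrame = FALSE argument is a choice made here so that the result stays a list of records.

```r
library(jsonlite)

# Hypothetical one-record JSON document, written inline so the example is self-contained.
json_text <- '[
  {
    "Name": "Ken",
    "Age": 24,
    "Interests": ["reading", "music", "movies"],
    "Expertise": {"R": 2, "CSharp": 4, "Python": 3}
  }
]'

# simplifyDataFrame = FALSE keeps the outer [] array as an unnamed list of
# records instead of collapsing it into a data frame.
records <- fromJSON(json_text, simplifyDataFrame = FALSE)

# The outer [] has no names; each {} becomes a named list.
str(records)
```

The outer array comes back without names, while the two {} objects (the record and its Expertise field) come back as named lists.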
3 YAML format
rlist also imports the yaml package to read/write YAML data.
- Name: Ken
  Age: 24
  Interests:
  - reading
  - music
  - movies
  Expertise:
    R: 2
    CSharp: 4
    Python: 3
- Name: James
  Age: 25
  Interests:
  - sports
  - music
  Expertise:
    R: 3
    Java: 2
    Cpp: 5
- Name: Penny
  Age: 24
  Interests:
  - movies
  - reading
  Expertise:
    R: 1
    Cpp: 4
    Python: 2
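The data come from Ren (2016): https://renkun-ken.github.io/rlist-tutorial/data/sample.yaml. YAML is also commonly seen as the front-matter declaration at the top of RMarkdown documents.
For a blog, this means the same tooling can be used to extract the YAML metadata of a post.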
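One way to pull YAML metadata out of a post could look like the sketch below. It calls the yaml package directly rather than going through rlist, and the post content, including the title and tags fields, is entirely made up for illustration.

```r
library(yaml)

# Hypothetical post content; the title and tags are invented for this example.
post <- c(
  "---",
  "title: rlist notes",
  "tags:",
  "- R",
  "- rlist",
  "---",
  "",
  "Body of the post."
)

# The front matter sits between the first two '---' delimiter lines.
delims <- which(post == "---")
front <- post[(delims[1] + 1):(delims[2] - 1)]

# Parse the front-matter text into a named R list.
meta <- yaml.load(paste(front, collapse = "\n"))

str(meta)
```

In a real blog the lines would come from readLines() on the post file; the delimiter handling here assumes the file starts with a well-formed front-matter block.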
4 Example
library(tidyverse)
## Warning: package 'tidyverse' was built under R version 3.6.3
## -- Attaching packages --------------------------------------------------------------------------- tidyverse 1.3.0 --
## √ ggplot2 3.3.2 √ purrr 0.3.4
## √ tibble 3.0.3 √ dplyr 1.0.2
## √ tidyr 1.1.2 √ stringr 1.4.0
## √ readr 1.3.1 √ forcats 0.5.0
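## Warning: package 'ggplot2' was built under R version 3.6.3
## Warning: package 'tibble' was built under R version 3.6.3
## Warning: package 'tidyr' was built under R version 3.6.3
## Warning: package 'readr' was built under R version 3.6.3
## Warning: package 'purrr' was built under R version 3.6.3
## Warning: package 'dplyr' was built under R version 3.6.3
## Warning: package 'stringr' was built under R version 3.6.3
## Warning: package 'forcats' was built under R version 3.6.3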
## -- Conflicts ------------------------------------------------------------------------------ tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
library(rlist)
## Warning: package 'rlist' was built under R version 3.6.3
people <- list.load("data/sample.json")
str(people)
## List of 3
## $ :List of 4
## ..$ Name : chr "Ken"
## ..$ Age : int 24
## ..$ Interests: chr [1:3] "reading" "music" "movies"
## ..$ Expertise:List of 3
## .. ..$ R : int 2
## .. ..$ CSharp: int 4
## .. ..$ Python: int 3
## $ :List of 4
## ..$ Name : chr "James"
## ..$ Age : int 25
## ..$ Interests: chr [1:2] "sports" "music"
## ..$ Expertise:List of 3
## .. ..$ R : int 3
## .. ..$ Java: int 2
## .. ..$ Cpp : int 5
## $ :List of 4
## ..$ Name : chr "Penny"
## ..$ Age : int 24
## ..$ Interests: chr [1:2] "movies" "reading"
## ..$ Expertise:List of 3
## .. ..$ R : int 1
## .. ..$ Cpp : int 4
## .. ..$ Python: int 2
NOTE: str() previews the structure of an object. We may use this function more often to avoid verbose representation of list objects.
The display it produces is pleasantly compact.
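people %>%
  map(
    ~ list(
      name = .$Name,
      age = .$Age,
      # collapse the lowest and highest Expertise values into a string such as "2-4"
      range = .$Expertise %>%
        as.numeric() %>%
        range() %>%
        as.character() %>%
        str_flatten("-")
    )
  ) %>%
  # build one summary sentence per person
  map(~ glue::glue("{.$name}: {.$age} and has {.$range} year experience"))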
## [[1]]
## Ken: 24 and has 2-4 year experience
##
## [[2]]
## James: 25 and has 2-5 year experience
##
## [[3]]
## Penny: 24 and has 1-4 year experience
This could serve as an exercise in a tutorial.
5 filter
people %>%
  list.filter(Age >= 25 & "music" %in% Interests) %>%
  map(~ .$Name)
## [[1]]
## [1] "James"
I have not yet found a good alternative to this.
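One possible candidate, offered here as a suggestion rather than something the original post settles on, is keep() from purrr. The sketch below applies the same condition to an inline stand-in for the sample data (the Expertise field is left out for brevity).

```r
library(purrr)

# Inline stand-in for the sample data (assumption: Expertise omitted).
people <- list(
  list(Name = "Ken", Age = 24, Interests = c("reading", "music", "movies")),
  list(Name = "James", Age = 25, Interests = c("sports", "music")),
  list(Name = "Penny", Age = 24, Interests = c("movies", "reading"))
)

# keep() retains the elements for which the predicate returns TRUE.
matches <- keep(people, ~ .x$Age >= 25 && "music" %in% .x$Interests)

# Pull the Name field out of every remaining record.
matched_names <- map_chr(matches, "Name")

print(matched_names)
```

The trade-off is that every field has to be reached through .x$, whereas list.filter lets the condition refer to Age and Interests directly.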
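list.find, list.findi, list.first, list.last, list.take and list.skip are alternatives to list.filter. There are so many of them that none stands out as the obvious choice. What they have in common is that they return only part of the result, which helps when the JSON data is large and a full filter takes a long time to return.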
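The following sketch shows how these functions differ on a small list. The people list is again an inline stand-in for the sample data, with the Expertise field omitted.

```r
library(rlist)

# Inline stand-in for the sample data (assumption: Expertise omitted).
people <- list(
  list(Name = "Ken", Age = 24, Interests = c("reading", "music", "movies")),
  list(Name = "James", Age = 25, Interests = c("sports", "music")),
  list(Name = "Penny", Age = 24, Interests = c("movies", "reading"))
)

# list.find returns at most n matching elements, wrapped in a list.
found <- list.find(people, Age == 24, 1)

# list.findi returns the positions of at most n matching elements.
idx <- list.findi(people, Age == 24, 2)

# list.first and list.last return a single matching element directly.
first_music <- list.first(people, "music" %in% Interests)
last_music <- list.last(people, "music" %in% Interests)

# list.take keeps the first n elements; list.skip drops them.
head_two <- list.take(people, 2)
rest <- list.skip(people, 1)

str(found)
print(idx)
```

list.find and list.findi take an upper bound n on the number of matches they return, which is what makes them attractive on large inputs.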
Ren, Kun. 2016. “Rlist Tutorial.” 2016. https://renkun-ken.github.io/rlist-tutorial/index.html.