This post was last updated on 2026-03-28. If you spot a problem or have a suggestion, feel free to open an Issue.
library(visdat)
library(tidyverse)
library(naniar)
Location of missing values
`vis_dat()` and `vis_miss()` from visdat plot where the missing values sit in a data frame.
- The drawback is that ggplot2 functions cannot be layered on top. (GitHub)
Distribution of missing values
One approach to visualising missing data comes from ggobi and manet, where we replace “NA” values with values 10% lower than the minimum value in that variable. [@Tierney2018]
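One way to display missing values is to substitute a value 10% below the variable's minimum, so that the missing observations still appear in plots such as scatterplots.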
ggplot(airquality,
       aes(x = Solar.R,
           y = Ozone)) +
  geom_miss_point() +
  labs(
    caption = "Jiaxiang Li - jiaxiangli.netlify.com"
  )
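The substitution described above can be sketched in base R. `shift_missing()` is a hypothetical helper written for this illustration, and reading "10% lower" as 10% of the variable's range below the minimum is an assumption; naniar's `geom_miss_point()` may place the points differently in detail.

```r
# Hypothetical helper: move NAs to a fixed position below the observed minimum
# so they remain visible on a numeric axis.
# Assumption: "10% lower" means 10% of the observed range below the minimum.
shift_missing <- function(x, prop_below = 0.1) {
  rng <- range(x, na.rm = TRUE)
  shifted <- rng[1] - prop_below * diff(rng)
  ifelse(is.na(x), shifted, x)
}

# airquality ships with base R
ozone_shifted <- shift_missing(airquality$Ozone)
solar_shifted <- shift_missing(airquality$Solar.R)

# Every formerly missing Ozone value now sits below the observed minimum
sum(ozone_shifted < min(airquality$Ozone, na.rm = TRUE))
```

Plotting `solar_shifted` against `ozone_shifted` would then show the missing observations as a band along the left and bottom margins instead of dropping them.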
Number of missing values
`gg_miss_var()` plots the number of missing values in each variable.
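As a rough cross-check that needs no extra packages, the per-variable counts behind such a plot can be computed in base R. This is only a sketch on the built-in `airquality` data; the column names `variable`, `n_miss` and `pct_miss` are chosen here to resemble naniar's summaries and are not taken from its output.

```r
# Count missing values per column and express them as a percentage of rows
n_miss <- colSums(is.na(airquality))
pct_miss <- 100 * n_miss / nrow(airquality)

var_summary <- data.frame(
  variable = names(n_miss),
  n_miss = as.vector(n_miss),
  pct_miss = as.vector(pct_miss)
)

# Most-missing variables first
var_summary <- var_summary[order(-var_summary$n_miss), ]
var_summary
```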
Missing values per case
`miss_case_summary()` reports missingness row by row: `case` is the row index, `n_miss` is the number of missing values in that row, and `pct_miss` is the percentage of values missing in that row.
`miss_case_table()` then aggregates the cases by `n_miss_in_case`, the number of missing values in a case.
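The two summaries above can be approximated in base R. This sketch uses the built-in `airquality` data and borrows the column names `case`, `n_miss`, `pct_miss` and `n_miss_in_case` from the text; `n_cases` is a name picked for this example, and the real naniar output may be laid out differently.

```r
# Per-row summary: how many values are missing in each case
n_miss <- rowSums(is.na(airquality))

case_summary <- data.frame(
  case = seq_len(nrow(airquality)),
  n_miss = as.vector(n_miss),
  pct_miss = 100 * as.vector(n_miss) / ncol(airquality)
)
case_summary <- case_summary[order(-case_summary$n_miss), ]

# Aggregate: how many cases have 0, 1, 2, ... missing values
case_table <- as.data.frame(
  table(n_miss_in_case = n_miss),
  responseName = "n_cases"
)

head(case_summary)
case_table
```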
Predicting missingness per case
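`miss_case_summary` tells us how much is missing in each case. We can then model that missingness proportion (or count) to see which variables are significant predictors of it.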
library(rpart.plot)
airquality %>%
  add_prop_miss() %>%
  rpart(prop_miss_all ~ ., data = .) %>%
  prp(type = 4, extra = 101, prefix = "Prop. Miss = ")
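To make the modelling step more concrete, here is a minimal sketch that builds the response by hand instead of calling `add_prop_miss()`. It assumes `prop_miss_all` is the share of missing values in each row and that the recommended package rpart is installed; plotting is left out.

```r
library(rpart)  # recommended package, bundled with most R installations

aq <- airquality
# Assumption: prop_miss_all is the proportion of missing values per row,
# computed on the original columns only
aq$prop_miss_all <- rowMeans(is.na(airquality))

# Regression tree of row-level missingness on all original variables;
# rpart copes with NAs in the predictors through surrogate splits
fit <- rpart(prop_miss_all ~ ., data = aq)
print(fit)
```

Variables that appear in the splits are the ones most associated with a row having missing values.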
Missing values per variable
`miss_var_summary()` and `miss_var_table()` are the per-variable counterparts of the case summaries. They also work on grouped data:
pedestrian %>%
  group_by(month) %>%
  miss_var_summary() %>%
  filter(variable == "hourly_counts")
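A grouped per-variable summary can be sketched in base R as well. Because the `pedestrian` data comes with naniar, this example substitutes the built-in `airquality` data and summarises the missingness of `Ozone` by `Month`; the output column names are chosen for this illustration.

```r
# Missing Ozone readings per month, as counts and as a percentage of that month's rows
n_miss <- tapply(is.na(airquality$Ozone), airquality$Month, sum)
n_rows <- tapply(airquality$Ozone, airquality$Month, length)

ozone_by_month <- data.frame(
  Month = as.integer(names(n_miss)),
  n_miss = as.vector(n_miss),
  pct_miss = 100 * as.vector(n_miss / n_rows)
)
ozone_by_month
```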
Imputing missing values
library(simputation)
ocean_imp <- oceanbuoys %>%
  bind_shadow() %>%  # keep a record of which values were missing
  impute_lm(air_temp_c ~ wind_ew + wind_ns) %>%
  impute_lm(humidity ~ wind_ew + wind_ns) %>%
  impute_lm(sea_temp_c ~ wind_ew + wind_ns) %>%
  add_label_shadow() %>%
  paged_table()  # from rmarkdown
`add_label_shadow()` adds an `any_missing` column that labels each row according to whether it had any missing value.
[@Tierney2018Imputed]
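The idea of the pipeline above (record what was missing, then impute with a linear model) can be sketched without naniar or simputation. This example uses the built-in `airquality` data; the label values "Missing" and "Not Missing" and the choice of `Temp` and `Wind` as predictors are assumptions made for the illustration.

```r
aq <- airquality

# Record missingness before imputing, otherwise the information is lost
aq$any_missing <- ifelse(complete.cases(airquality), "Not Missing", "Missing")

# Fit on the rows where Ozone is observed (lm drops NA rows by default)
fit <- lm(Ozone ~ Temp + Wind, data = aq)

# Fill only the missing Ozone values with model predictions
needs_fill <- is.na(aq$Ozone)
aq$Ozone[needs_fill] <- predict(fit, newdata = aq[needs_fill, ])

table(aq$any_missing)
```

Keeping the label column makes it possible to colour imputed and observed points differently in later plots.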
library(ggplot2)
ggplot(ocean_imp,
       aes(x = air_temp_c,
           y = humidity,
           color = any_missing)) +
  geom_point() +
  scale_color_brewer(palette = "Dark2") +
  theme(legend.position = "bottom") +
  labs(
    caption = "Jiaxiang Li - jiaxiangli.netlify.com"
  )
ggplot(ocean_imp,
       aes(x = air_temp_c,
           fill = any_missing)) +
  geom_density(alpha = 0.3) +
  scale_fill_brewer(palette = "Dark2") +
  theme(legend.position = "bottom") +
  labs(
    caption = "Jiaxiang Li - jiaxiangli.netlify.com"
  )
ggplot(ocean_imp,
       aes(x = humidity,
           fill = any_missing)) +
  geom_density(alpha = 0.3) +
  scale_fill_brewer(palette = "Dark2") +
  theme(legend.position = "bottom") +
  labs(
    caption = "Jiaxiang Li - jiaxiangli.netlify.com"
  )
Cases
library(data.table)
library(tidyverse)
library(visdat)
library(naniar)
# data <- fread('ldfilter_cbind.txt')
data <- fread(here::here('../tutoring2/pansiyu/analysis/NA_inlm/ldfilter_cbind.txt'))
For how the missing values are handled, see the posts "naniar tips" and "Displaying missing values".
vis_miss(data) +
  theme(text = element_text(size = 10),
        axis.text.x = element_text(angle = 90, hjust = 1)) +
  labs(
    caption = "Jiaxiang Li - jiaxiangli.netlify.com"
  )
Joint missingness is low: `gg_miss_upset()`, which plots the combinations of variables that are missing together, shows little overlap.
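The quantity behind a joint-missingness plot is simply a count of missingness patterns. As a hedged sketch, the following tabulates which combinations of variables are missing together, using the built-in `airquality` data as a stand-in for the case data (which is not available here); the label "(complete)" is chosen for this example.

```r
na_mat <- is.na(airquality)

# For each row, name the variables that are missing in it
pattern <- apply(na_mat, 1, function(r) {
  paste(colnames(na_mat)[r], collapse = " & ")
})
pattern[pattern == ""] <- "(complete)"

# Frequency of each missingness pattern, most common first
patterns <- sort(table(pattern), decreasing = TRUE)
patterns
```

If most rows with gaps fall into single-variable patterns, joint missingness is low.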
Per-case missingness is low (checked with `miss_case_summary()` and `miss_case_table()`).
- Even the worst case has only 5 missing values.
Per-variable missingness is low.
gg_miss_var(data) +
  theme(
    text = element_text(size = 8)) +
  labs(
    caption = "Jiaxiang Li - jiaxiangli.netlify.com"
  )
`miss_var_summary()` and `miss_var_table()` report the same per-variable missingness as tables.
library(rpart.plot)
data %>%
  add_prop_miss() %>%
  rpart(prop_miss_all ~ ., data = .) %>%
  prp(type = 4, extra = 101, prefix = "Prop. Miss = ")
- The plot shows the main variables associated with missingness.
Since missingness is not severe, we go ahead and fit an `lm`.
library(broom)
data %>%
  # turn every predictor into a factor and give NA its own level, 'No_infos'
  mutate_at(vars(-MPB),
            ~ fct_explicit_na(factor(.), 'No_infos')) %>%
  lm(MPB ~ ., data = .) %>%
  tidy() %>%
  # interactive table with export buttons
  DT::datatable(
    rownames = FALSE,
    extensions = 'Buttons', options = list(
      dom = 'Bfrtip',
      buttons = c('copy', 'csv', 'excel', 'pdf', 'print')
    )
  )
- Click the button for a format to download the table in that format.
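The regression above keeps rows with gaps by turning each missing value into its own factor level. A base-R sketch of the same idea follows, using the built-in `airquality` data because the case data is not available here. Binning `Solar.R` with `cut()` is an assumption made for this example (the original code converts each predictor with `factor()` directly); the level name `No_infos` is taken from the code above.

```r
aq <- airquality

# Turn a numeric predictor with NAs into a factor (illustrative bins)
aq$Solar_band <- cut(aq$Solar.R, breaks = c(0, 100, 200, 400))

# Give NA its own explicit level so lm() does not drop those rows
aq$Solar_band <- addNA(aq$Solar_band)
levels(aq$Solar_band)[is.na(levels(aq$Solar_band))] <- "No_infos"

fit <- lm(Temp ~ Solar_band, data = aq)

# All rows are used, including those where Solar.R was missing
nobs(fit)
coef(fit)
```

The coefficient on the `No_infos` level then indicates whether rows with missing information differ systematically in the response.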