This analysis uses the oil data from the caret package.
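Suppose that, in a data set's Class variable, the level with the fewest observations is A and the level with the most observations is B.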
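Then, under random sampling:
downSample reduces every level to the same number of observations as A.
upSample increases every level to the same number of observations as B.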
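The behaviour described above can be checked on the oil data with a minimal sketch. It assumes that data(oil) loads a predictor set fattyAcids and a class factor oilType, and that both functions return the class in a column called Class (the default yname). The seed is arbitrary, and the level names in oilType need not match the placeholder names A and B used above.

```r
library(caret)

# Assumption: data(oil) provides fattyAcids (predictors) and oilType (class factor)
data(oil)
table(oilType)  # class counts before any sampling

set.seed(1)  # both functions sample at random, so fix the seed

# Every level is cut down to the size of the smallest level
down <- downSample(x = fattyAcids, y = oilType)
table(down$Class)

# Every level is resampled with replacement up to the size of the largest level
up <- upSample(x = fattyAcids, y = oilType)
table(up$Class)
```

Comparing the three tables shows whether each level ends up at the minimum count (downSample) or the maximum count (upSample).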
downSample will randomly sample a data set so that all classes have the same frequency as the minority class.
# Perform logistic regression with upsampling and no resampling
# (`training` is the training set prepared earlier; it is not defined in this note)
library(caret)

vote_glm <- train(turnout16_2016 ~ ., method = "glm", family = "binomial",
                  data = training,
                  trControl = trainControl(method = "none",
                                           sampling = "up"))
In addition, the sampling method can be set directly through the trControl argument of caret::train.
(Silge 2018)
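As a rough illustration of setting sampling through trControl, the sketch below fits a logistic regression on a small simulated data set. The data frame toy, its columns x1, x2 and y, and the choice of 5-fold cross-validation with sampling = "down" are all assumptions made for this example; none of them come from the course.

```r
library(caret)

set.seed(42)

# Hypothetical imbalanced two-class data: "yes" is the rare class
n  <- 500
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- factor(ifelse(x1 + rnorm(n) > 1.5, "yes", "no"), levels = c("no", "yes"))
toy <- data.frame(x1 = x1, x2 = x2, y = y)
table(toy$y)

# Down-sampling is requested in trainControl, so train() applies it
# to the training portion of each cross-validation fold
ctrl <- trainControl(method = "cv", number = 5, sampling = "down")

fit <- train(y ~ ., data = toy,
             method = "glm", family = "binomial",
             trControl = ctrl)
fit
```

Switching sampling to "up" would request up-sampling instead, with the rest of the call unchanged.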
References
Silge, Julia. 2018. “Supervised Learning in R: Case Studies.” 2018. https://campus.datacamp.com/courses/supervised-learning-in-r-case-studies/get-out-the-vote?ex=9.