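There are two ways to drop unused factor levels in a subset of a data frame: the first is to call the factor function on the column, and the second is to apply it to every column with lapply.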
> # stringsAsFactors = TRUE is required in R 4.0.0 and later, where strings are no longer converted to factors by default
> df <- data.frame(alphabets = letters[1:10], numbers = 1:10, stringsAsFactors = TRUE)
> levels(df$alphabets)
[1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j"
> subdf <- subset(df, numbers <= 6)
> subdf
  alphabets numbers
1         a       1
2         b       2
3         c       3
4         d       4
5         e       5
6         f       6
> levels(subdf$alphabets)
[1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j"
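Although we created a subset, the factor variable alphabets still reports all 10 levels, because subsetting keeps the original level set even when some levels no longer occur in the data. If we want to drop the unused levels, we can do so in either of the following ways.
Using the factor function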
> subdf$alphabets <- factor(subdf$alphabets)
> levels(subdf$alphabets)
[1] "a" "b" "c" "d" "e" "f"
Using lapply
> subdf[] <- lapply(subdf, function(x) if (is.factor(x)) factor(x) else x)
> levels(subdf$alphabets)
[1] "a" "b" "c" "d" "e" "f"
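Beyond the two methods described here, base R also ships a droplevels function that should give the same result in a single call. The following is a minimal sketch rather than part of the original walkthrough; it rebuilds the example data and assumes stringsAsFactors = TRUE so that alphabets is a factor.

```r
# Rebuild the example data; stringsAsFactors = TRUE makes alphabets a factor
df <- data.frame(alphabets = letters[1:10], numbers = 1:10,
                 stringsAsFactors = TRUE)
subdf <- subset(df, numbers <= 6)

# Called on a data frame, droplevels() drops unused levels
# from every factor column and leaves other columns unchanged
subdf <- droplevels(subdf)

print(levels(subdf$alphabets))
```

droplevels also accepts a single factor, so subdf$alphabets <- droplevels(subdf$alphabets) would be an option when only one column needs cleaning.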