Using by in base R:
Set up some test data, including a case where the requested row is out of range (Seattle has only two rows):
test <- read.table(text="name city
John Atlanta
Josh Atlanta
Matt Atlanta
Bob Boston
Kate Boston
Lily Boston
Matt Boston
Bob Seattle
Kate Seattle",header=TRUE)
Get the 3rd row for each city:
do.call(rbind,by(test,test$city,function(x) x[3,]))
Result:
        name    city
Atlanta Matt Atlanta
Boston  Lily  Boston
Seattle <NA>    <NA>
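The <NA> row for Seattle appears because indexing a data frame with a row number past its end returns a single row of missing values rather than an error or an empty result. A minimal sketch of that behavior, using a hypothetical two-row data frame named seattle (an illustration only, not part of the test data above):

```r
# Hypothetical two-row group, mirroring the Seattle rows in the test data
seattle <- data.frame(name = c("Bob", "Kate"),
                      city = c("Seattle", "Seattle"),
                      stringsAsFactors = FALSE)

# Row 3 does not exist, so R returns one row filled with NA
third <- seattle[3, ]

nrow(third)        # one row comes back, not zero
all(is.na(third))  # every column in that row is missing
```

This is why the city column is also lost for Seattle in the result above: the whole row, including the grouping column, is NA.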
To get what you want, with the city name kept even when a city has fewer than n rows, here is a small function:
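nthrow <- function(dset,splitvar,n) {
  # take the nth row of each group; groups with fewer than n rows give an all-NA row
  result <- do.call(rbind,by(dset,dset[splitvar],function(x) x[n,]))
  # by() names each row after its group, so use the row names to fill in the missing group values
  missing <- is.na(result[,splitvar])
  result[,splitvar][missing] <- row.names(result)[missing]
  # drop the group names as row names so rows are numbered 1, 2, 3, ...
  row.names(result) <- NULL
  return(result)
}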
Call it like this:
nthrow(test,"city",3)
Result:
  name    city
1 Matt Atlanta
2 Lily  Boston
3 <NA> Seattle