The last two statements return error messages:
dat[,`:=`(paste0(date_col, "_year", sep="") = substr(noquote(get(date_col))[!is.na(noquote(get(date_col)))],1,4))][]
Error: unexpected '=' in "dat[,`:=`(paste0(date_col, "_year", sep="") ="
dat[,`:=`(noquote(paste0(date_col, "_year", sep="")) = substr(noquote(get(date_col))[!is.na(noquote(get(date_col)))],1,4))][]
Error: unexpected '=' in "dat[,`:=`(noquote(paste0(date_col, "_year", sep="")) ="
The correct syntax for calling the `:=`() function is:
dat[, `:=`(paste0(date_col, "_year", sep = ""),
substr(noquote(get(date_col))[!is.na(noquote(get(date_col)))], 1, 4))][]
dat[, `:=`(noquote(paste0(date_col, "_year", sep = "")),
substr(noquote(get(date_col))[!is.na(noquote(get(date_col)))], 1, 4))][]
that is, replace = with a comma (,).
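The difference between the two call forms can be sketched as follows. This is a minimal illustration, assuming the data.table package is installed; the table `dt` and the variable `base` are made up for the example and are not part of the question.

```r
library(data.table)

# Hypothetical toy data, not the table from the question
dt <- data.table(x = 1:3)
base <- "x"

# Functional form with a computed name: the name expression is the first
# positional argument and the value is the second (comma, not '=')
dt[, `:=`(paste0(base, "_doubled"), x * 2L)]

# Equivalent infix form
dt[, paste0(base, "_tripled") := x * 3L]

# The 'name = value' form only accepts a literal column name on the left,
# because R argument names cannot be function calls
dt[, `:=`(x_plus_one = x + 1L)]

names(dt)
```

The parse error in the question arises before data.table sees the call at all: R itself rejects a function call such as paste0(...) in the position of an argument name.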
However, both the assignment syntax and the right-hand side are far too complicated.
The order_date column is already of class Date:
str(dat)
Classes ‘data.table’ and 'data.frame': 5 obs. of 3 variables:
$ one : int 1 2 3 4 5
$ two : int 1 2 3 4 5
$ order_date: Date, format: "2015-01-01" "2015-02-01" ...
- attr(*, ".internal.selfref")=<externalptr>
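To extract the year, the year() function can be used (from either the data.table package or the lubridate package, whichever was loaded last), so there is no need to convert back to character and extract the year string: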
date_col = "order_date"
dat[, paste0(date_col, "_year") := lapply(.SD, year), .SDcols = date_col][]
one two order_date order_date_year
1: 1 1 2015-01-01 2015
2: 2 2 2015-02-01 2015
3: 3 3 2015-03-01 2015
4: 4 4 2015-04-01 2015
5: 5 5 2015-05-01 2015
Alternatively,
dat[, paste0(date_col, "_year") := year(get(date_col))][]
dat[, `:=`(paste0(date_col, "_year"), year(get(date_col)))][]
work as well.
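For completeness, here is a self-contained sketch that can be run from a fresh session. It assumes the data.table package is installed; the construction of `dat` is a reconstruction from the str(dat) output above rather than the asker's original code, and the helper variable `new_col` is introduced only for this example.

```r
library(data.table)

# Reconstructed sample data (assumption: monthly dates, as in the printed output)
dat <- data.table(
  one = 1:5,
  two = 1:5,
  order_date = seq(as.Date("2015-01-01"), by = "month", length.out = 5)
)

date_col <- "order_date"
new_col <- paste0(date_col, "_year")  # hypothetical helper variable

# Wrapping the variable in parentheses tells data.table to use its value
# as the column name; data.table::year() is called explicitly so the result
# does not depend on whether lubridate was loaded afterwards
dat[, (new_col) := data.table::year(get(date_col))]

dat[]
```

Unlike the substr() approach combined with the !is.na() subset, year() keeps one result per row, returning NA for a missing date, so the new column always has the same length as the table.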