问题中没有足够的数据来运行 4 年,并且因变量的值丢失,因此这里是一个使用w
3 个月(而不是 4 年)和一组简化的统计数据,可以通过更改输入和数据进行调整reg
.
请注意,yearmon 类将仅由年和月组成的日期存储为年 + 分数,其中分数 = 0, 1/12, ..., 11/12 表示 Jan、Feb、...、Dec,因此 w 的间隔长度月份为 w/12。
library(zoo)
# inputs
set.seed(123)
ndata <- data.frame(date = as.Date("2000-01-01") + 0:365,
z = rnorm(366))
A <- sqrt(0:365)
B <- (0:365)^0.25
w <- 3 # number of trailing months to regress over
depvars <- c("A", "B")
indep <- c("date", "z")
reg <- function(ym_, depvar, indep, data, w, ym) {
ok <- ym > ym_ - w/12 & ym <= ym_
fo <- reformulate(indep, depvar)
fm <- lm(fo, data, subset = ok)
co <- coef(fm)
n <- nobs(fm)
c(co, n = n)
}
ym <- as.yearmon(ndata$date)
ym_u <- tail(unique(ym), -(w-1))
L <- Map(function(depvar) {
data.frame(yearmon = ym_u,
t(sapply(ym_u, reg,
depvar = depvar, indep = indep, data = ndata, w = w, ym = ym)),
check.names = FALSE)
}, depvars)
L
给出以下数据框列表,其中yearmon是执行回归的w个月期间最后一个月的年份和月份,n是该期间的天数。
$A
yearmon (Intercept) date z n
1 Mar 2000 -931.0836 0.08520186 -3.783475e-02 91
2 Apr 2000 -645.7504 0.05930666 5.638294e-03 90
3 May 2000 -536.6141 0.04942836 3.528984e-03 92
4 Jun 2000 -468.3192 0.04326379 -6.769498e-03 91
5 Jul 2000 -420.6956 0.03897671 -7.307754e-05 92
6 Aug 2000 -384.5289 0.03573000 1.343427e-03 92
7 Sep 2000 -356.8805 0.03325475 -1.272157e-03 92
8 Oct 2000 -333.4633 0.03116400 1.980825e-03 92
9 Nov 2000 -314.3980 0.02946651 2.223839e-04 91
10 Dec 2000 -298.0596 0.02801567 -2.949753e-04 92
$B
yearmon (Intercept) date z n
1 Mar 2000 -206.66238 0.019006840 -7.802128e-03 91
2 Apr 2000 -110.66468 0.010294703 1.301456e-03 90
3 May 2000 -83.11581 0.007801199 8.920903e-04 92
4 Jun 2000 -67.34099 0.006377318 -1.520903e-03 91
5 Jul 2000 -57.03138 0.005449255 -1.435477e-05 92
6 Aug 2000 -49.58352 0.004780660 2.702669e-04 92
7 Sep 2000 -44.11908 0.004291454 -2.438281e-04 92
8 Oct 2000 -39.65054 0.003892493 3.683646e-04 92
9 Nov 2000 -36.12215 0.003578342 4.162776e-05 91
10 Dec 2000 -33.18009 0.003317091 -5.103712e-05 92
或者如果首选数据框,则:
dplyr::bind_rows(L, .id = "depvar")
giving:
depvar yearmon (Intercept) date z n
1 A Mar 2000 -931.08360 0.085201863 -3.783475e-02 91
2 A Apr 2000 -645.75036 0.059306657 5.638294e-03 90
3 A May 2000 -536.61413 0.049428357 3.528984e-03 92
4 A Jun 2000 -468.31918 0.043263786 -6.769498e-03 91
5 A Jul 2000 -420.69558 0.038976709 -7.307754e-05 92
6 A Aug 2000 -384.52887 0.035729997 1.343427e-03 92
7 A Sep 2000 -356.88052 0.033254748 -1.272157e-03 92
8 A Oct 2000 -333.46329 0.031163998 1.980825e-03 92
9 A Nov 2000 -314.39800 0.029466506 2.223839e-04 91
10 A Dec 2000 -298.05960 0.028015670 -2.949753e-04 92
11 B Mar 2000 -206.66238 0.019006840 -7.802128e-03 91
12 B Apr 2000 -110.66468 0.010294703 1.301456e-03 90
13 B May 2000 -83.11581 0.007801199 8.920903e-04 92
14 B Jun 2000 -67.34099 0.006377318 -1.520903e-03 91
15 B Jul 2000 -57.03138 0.005449255 -1.435477e-05 92
16 B Aug 2000 -49.58352 0.004780660 2.702669e-04 92
17 B Sep 2000 -44.11908 0.004291454 -2.438281e-04 92
18 B Oct 2000 -39.65054 0.003892493 3.683646e-04 92
19 B Nov 2000 -36.12215 0.003578342 4.162776e-05 91
20 B Dec 2000 -33.18009 0.003317091 -5.103712e-05 92
Note
我不清楚问题中统计计算的意图。我确实在第8页的顶部找到了公式这个文件 https://www.erawa.com.au/cproot/11652/2/Energy%20Networks%20Association%20-%20Draft%20Rate%20of%20Return%20Guidelines%20-%20Report%208%20-%20Vasicek%20Adjustment.pdf但它似乎与问题中提到的有所不同。无论如何,至少问题中的代码似乎需要对某些未平方的项目进行平方,并注意coef(fm)
, sigma(fm)
and diag(vcov(fm))
是系数、残差标准误差和系数标准误差的平方。