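My question involves faceting. In the example code below, I look at some faceted scatterplots, then try to overlay information (in this case, mean lines) on a per-facet basis.
The tl;dr version is that my attempts fail. Either my added mean lines are computed across all the data (ignoring the facet variable), or I try to write a formula and R throws an error, followed by incisive and particularly disparaging comments about my mother.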
library(ggplot2)
# Let's pretend we're exploring the relationship between a car's weight and its
# horsepower, using some sample data
p <- ggplot()
p <- p + geom_point(aes(x = wt, y = hp), data = mtcars)
print(p)
# Hmm. A quick check of the data reveals that car weights can differ wildly, by almost
# a thousand pounds.
head(mtcars)
# Does the difference matter? It might, especially if most 8-cylinder cars are heavy,
# and most 4-cylinder cars are light. ColorBrewer to the rescue!
p <- p + aes(color = factor(cyl))
p <- p + scale_color_brewer(palette = "Set1")
print(p)
# At this point, what would be great is if we could more strongly visually separate
# the cars out by their engine blocks.
p <- p + facet_grid(~ cyl)
print(p)
# Ah! Now we can see (given the fixed scales) that the 4-cylinder cars flock to the
# left on weight measures, while the 8-cylinder cars flock right. But you know what
# would be REALLY awesome? If we could visually compare the means of the car groups.
p.with.means <- p + geom_hline(
  aes(yintercept = mean(hp)),
  data = mtcars
)
print(p.with.means)
# Wait, that's not right. That's not right at all. The green (8-cylinder) cars are all above the
# average for their group. Are they somehow made in an auto plant in Lake Wobegon, MN? Obviously,
# I meant to draw mean lines factored by GROUP. Except also obviously, since the code below will
# print an error, I don't know how.
p.with.non.lake.wobegon.means <- p + geom_hline(
  aes(yintercept = mean(hp) ~ cyl),
  data = mtcars
)
print(p.with.non.lake.wobegon.means)
There must be some simple solution I'm missing.
You mean something like this:
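library(plyr)  # provides ddply(), .() and summarise()
# One row per cyl value, with the group mean of hp in column mn
rs <- ddply(mtcars, .(cyl), summarise, mn = mean(hp))
p + geom_hline(data = rs, aes(yintercept = mn))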
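Or perhaps within the ggplot call using one of the stat_* functions, but I'd have to go back and tinker a bit. In general, though, when I'm adding summaries to a faceted plot, I calculate the summaries separately and then add them with their own geom.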
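If you'd rather not load plyr, the same separate-summary step can be sketched in base R. This is only a minimal sketch of the idea above: the names group_means and p_means are mine, and aggregate() stands in for ddply().

```r
library(ggplot2)

# Per-cylinder mean horsepower, computed outside of ggplot2 with base R.
# The result has one row per cyl value, with columns `cyl` and `hp`.
group_means <- aggregate(hp ~ cyl, data = mtcars, FUN = mean)

# Rebuild the faceted scatterplot from the question.
p <- ggplot() +
  geom_point(aes(x = wt, y = hp, colour = factor(cyl)), data = mtcars) +
  facet_grid(~ cyl)

# Because `group_means` also carries the faceting variable `cyl`,
# each facet only receives the row (and thus the line) that belongs to it.
p_means <- p + geom_hline(aes(yintercept = hp), data = group_means)
print(p_means)
```

The key point is that the summary data frame keeps the cyl column, which is what lets facet_grid() send each mean to the right panel.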
EDIT
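Just a few expanded notes on your original attempt. Generally, it's a good idea to put the aes() calls that should persist throughout the plot inside ggplot(), and then specify a different data set or different aesthetics only in those geoms that differ from the "base" plot. Then you don't need to keep specifying data = ... in every geom.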
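As a rough illustration of that layout, here is how the plot could be restructured. The object names p2 and group_means and the column name mean_hp are made up for this sketch; they are not from your code.

```r
library(ggplot2)

# Summary data for the overlay; `mean_hp` is a column name invented here.
group_means <- aggregate(hp ~ cyl, data = mtcars, FUN = mean)
names(group_means)[names(group_means) == "hp"] <- "mean_hp"

# Data and shared aesthetics go into ggplot() once ...
p2 <- ggplot(mtcars, aes(x = wt, y = hp, colour = factor(cyl))) +
  geom_point() +        # uses the data and aesthetics given to ggplot()
  facet_grid(~ cyl) +
  # ... and only the layer that differs names its own data and mapping.
  geom_hline(data = group_means, aes(yintercept = mean_hp))
print(p2)
```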
Finally, I came up with a somewhat clever use of geom_smooth to do something similar to what you're asking for:
p <- ggplot(data = mtcars, aes(x = wt, y = hp, colour = factor(cyl))) +
  facet_grid(~ cyl) +
  geom_point() +
  geom_smooth(se = FALSE, method = "lm", formula = y ~ 1, colour = "black")
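The horizontal line (i.e. a constant regression equation) will only extend to the limits of the data in each facet, but it skips the separate data summary step.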
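In case it isn't obvious why formula = y ~ 1 draws the mean: an intercept-only linear model fitted by least squares has the sample mean as its single coefficient. A quick check in base R, fitting the model per cylinder group by hand (the name check is just for this sketch):

```r
# For each cylinder group, fit an intercept-only model (the same model
# form as formula = y ~ 1) and put its coefficient next to the plain mean.
check <- sapply(split(mtcars$hp, mtcars$cyl), function(hp) {
  fit <- lm(hp ~ 1)
  c(intercept = unname(coef(fit)[1]), mean = mean(hp))
})

# Rows are "intercept" and "mean", columns are the cyl groups;
# the two rows should agree up to floating-point error.
print(check)
```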