Add vline to geom_density and shade confidence interval of the mean in R

2023-12-22

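After reading different posts, I found out how to add a vline for the mean to density plots, as shown here: http://www.cookbook-r.com/Graphs/Plotting_distributions_(ggplot2)/. Using the data provided in the above link:

1) How can one add 95% confidence intervals around the mean using geom_ribbon? The CIs can be computed as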

#computation of the standard error of the mean
sem<-sd(x)/sqrt(length(x))
#95% confidence intervals of the mean
c(mean(x)-2*sem,mean(x)+2*sem)
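The factor 2 in the snippet above is a rounded stand-in for the normal quantile 1.96. As a minimal sketch, assuming a t-based interval is acceptable instead, the multiplier can be taken from qt(). The helper name mean_ci is hypothetical and not part of the original post. It also drops missing values before counting n, which matters when sd() is called with na.rm = TRUE but length() still counts the NAs.

```r
# Hypothetical helper: t-based confidence interval of the mean.
mean_ci <- function(x, level = 0.95) {
  x <- x[!is.na(x)]                            # drop missing values first
  n <- length(x)                               # n now counts only observed values
  sem <- sd(x) / sqrt(n)                       # standard error of the mean
  mult <- qt(1 - (1 - level) / 2, df = n - 1)  # t quantile instead of the fixed 2
  c(lower = mean(x) - mult * sem,
    mean  = mean(x),
    upper = mean(x) + mult * sem)
}

set.seed(1234)
x <- rnorm(200)
mean_ci(x)
```

For samples of a few hundred values the t quantile is close to 2, so the two approaches give nearly the same limits.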

2) How can one limit the vline to the region under the curve? As the picture below shows, the vline is drawn beyond the curve.

Sample data very close to my real problem can be found at https://www.dropbox.com/s/bvvfdpgekbjyjh0/test.csv?dl=0

UPDATE

Using the real data from the link above, I tried the following, based on @beetroot's answer.

# Find the mean of each group
dat=me
library(dplyr)
library(plyr)
cdat <- ddply(data,.(direction,cond), summarise, rating.mean=mean(rating,na.rm=T))# summarize by direction and cond
cdat

#ggplot
p=ggplot(data,aes(x = rating)) + 
  geom_density(aes(colour = cond),size=1.3,adjust=4)+
  facet_grid(.~direction, scales="free")+
  xlab(NULL) + ylab("Density")
p=p+coord_cartesian(xlim = c(0, 130))+scale_color_manual(name="",values=c("blue","#00BA38","#F8766D"))+
  scale_fill_manual(values=c("blue", "#00BA38", "#F8766D"))+
  theme(legend.title = element_text(colour="black", size=15, face="plain"))+
  theme(legend.text = element_text(colour="black", size = 15, face = "plain"))+
  theme(title = red.bold.italic.text, axis.title = red.bold.italic.text)+
  theme(strip.text.x = element_text(size=20, color="black",face="plain"))+ # facet labels
  ggtitle("SAMPLE A") +theme(plot.title = element_text(size = 20, face = "bold"))+
    theme(axis.text = blue.bold.italic.16.text)+ theme(legend.position = "none")+
  geom_vline(data=cdat, aes(xintercept=rating.mean, color=cond),linetype="dotted",size=1)
p

## implementing @beetroot's code to restrict lines under the curve and shade CIs around the mean
# I will use ddply for mean and CIs
cdat <- ddply(data,.(direction,cond), summarise, rating.mean=mean(rating,na.rm=T),
              sem = sd(rating,na.rm=T)/sqrt(length(rating)),
              ci.low = mean(rating,na.rm=T) - 2*sem,
              ci.upp = mean(rating,na.rm=T) + 2*sem)# summarize by direction and variable


#In order to limit the lines to the outline of the curves you first need to find out which y values
#of the curves correspond to the means, e.g. by accessing the density values with ggplot_build and 
#using approx:

   cdat.dens <- ggplot_build(ggplot(data, aes(x=rating, colour=cond)) +
                              facet_grid(.~direction, scales="free")+
                              geom_density(aes(colour = cond),size=1.3,adjust=4))$data[[1]] %>%
  mutate(cond = ifelse(group==1, "A",
                       ifelse(group==2, "B","C"))) %>%
  left_join(cdat) %>%
  select(y, x, cond, rating.mean, sem, ci.low, ci.upp) %>%
  group_by(cond) %>%
  mutate(dens.mean = approx(x, y, xout = rating.mean)[[2]],
         dens.cilow = approx(x, y, xout = ci.low)[[2]],
         dens.ciupp = approx(x, y, xout = ci.upp)[[2]]) %>%
  select(-y, -x) %>%
  slice(1)

 cdat.dens

#---
 #You can then combine everything with various geom_segments:

   ggplot(data, aes(x=rating, colour=cond)) +
   geom_density(data = data, aes(x = rating, colour = cond),size=1.3,adjust=4) +facet_grid(.~direction, scales="free")+
   geom_segment(data = cdat.dens, aes(x = rating.mean, xend = rating.mean, y = 0, yend = dens.mean, colour = cond),
                linetype = "dashed", size = 1) +
   geom_segment(data = cdat.dens, aes(x = ci.low, xend = ci.low, y = 0, yend = dens.cilow, colour = cond),
                linetype = "dotted", size = 1) +
   geom_segment(data = cdat.dens, aes(x = ci.upp, xend = ci.upp, y = 0, yend = dens.ciupp, colour = cond),
                linetype = "dotted", size = 1)

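This gives:

You will notice that the means and CIs do not line up as they do in the original plot. What am I doing wrong, @beetroot?

With the data from the link, you can calculate the mean, se and ci like this (I suggest using dplyr, the successor of plyr):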

set.seed(1234)
dat <- data.frame(cond = factor(rep(c("A","B"), each=200)), 
                  rating = c(rnorm(200),rnorm(200, mean=.8)))

library(ggplot2)
library(dplyr)
cdat <- dat %>%
  group_by(cond) %>%
  summarise(rating.mean = mean(rating),
            sem = sd(rating)/sqrt(length(rating)),
            ci.low = mean(rating) - 2*sem,
            ci.upp = mean(rating) + 2*sem)

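In order to limit the lines to the outline of the curves, you first need to find out which y values of the curves correspond to the means, e.g. by accessing the density values with ggplot_build and using approx: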

cdat.dens <- ggplot_build(ggplot(dat, aes(x=rating, colour=cond)) + geom_density())$data[[1]] %>%
  mutate(cond = ifelse(group == 1, "A", "B")) %>%
  left_join(cdat) %>%
  select(y, x, cond, rating.mean, sem, ci.low, ci.upp) %>%
  group_by(cond) %>%
  mutate(dens.mean = approx(x, y, xout = rating.mean)[[2]],
         dens.cilow = approx(x, y, xout = ci.low)[[2]],
         dens.ciupp = approx(x, y, xout = ci.upp)[[2]]) %>%
  select(-y, -x) %>%
  slice(1)

> cdat.dens
Source: local data frame [2 x 8]
Groups: cond [2]

   cond rating.mean        sem     ci.low     ci.upp dens.mean dens.cilow dens.ciupp
  <chr>       <dbl>      <dbl>      <dbl>      <dbl>     <dbl>      <dbl>      <dbl>
1     A -0.05775928 0.07217200 -0.2021033 0.08658471 0.3865929   0.403623  0.3643583
2     B  0.87324927 0.07120697  0.7308353 1.01566320 0.3979347   0.381683  0.4096153
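The interpolation step can also be seen in isolation. The following is a minimal sketch in base R rather than ggplot2, assuming a single group and using stats::density() as a stand-in for the values that ggplot_build returns. The grid it produces is not guaranteed to match the one ggplot2 computes, so the heights may differ slightly from dens.mean above.

```r
set.seed(1234)
rating <- rnorm(200)

d <- density(rating)   # kernel density estimate on a regular grid
m <- mean(rating)

# Linear interpolation of the curve height at the mean
h <- approx(d$x, d$y, xout = m)$y

# Draw the curve and a segment that stops at the curve instead of running to the top
plot(d, main = "Density with mean segment")
segments(x0 = m, y0 = 0, x1 = m, y1 = h, lty = "dashed")
```

Because approx() only interpolates between grid points, the segment ends on the drawn curve rather than above or below it.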

You can then combine everything with various geom_segments:

ggplot() +
  geom_density(data = dat, aes(x = rating, colour = cond)) +
  geom_segment(data = cdat.dens, aes(x = rating.mean, xend = rating.mean, y = 0, yend = dens.mean, colour = cond),
             linetype = "dashed", size = 1) +
  geom_segment(data = cdat.dens, aes(x = ci.low, xend = ci.low, y = 0, yend = dens.cilow, colour = cond),
             linetype = "dotted", size = 1) +
  geom_segment(data = cdat.dens, aes(x = ci.upp, xend = ci.upp, y = 0, yend = dens.ciupp, colour = cond),
               linetype = "dotted", size = 1)

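As Axeman pointed out, you can create a polygon based on the ribbon area, as explained in this answer: https://stackoverflow.com/questions/12429333/how-to-shade-a-region-under-a-curve-using-ggplot2.

So for your data you can subset the density values and add the additional rows like this: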

ribbon <- ggplot_build(ggplot(dat, aes(x=rating, colour=cond)) + geom_density())$data[[1]] %>%
  mutate(cond = ifelse(group == 1, "A", "B")) %>%
  left_join(cdat.dens) %>%
  group_by(cond) %>%
  filter(x >= ci.low & x <= ci.upp) %>%
  select(cond, x, y)

ribbon <- rbind(data.frame(cond = c("A", "B"), x = c(-0.2021033, 0.7308353), y = c(0, 0)), 
                as.data.frame(ribbon), 
                data.frame(cond = c("A", "B"), x = c(0.08658471, 1.01566320), y = c(0, 0)))

And add geom_polygon to the plot:

ggplot() +
  geom_polygon(data = ribbon, aes(x = x, y = y, fill = cond), alpha = .5) +
  geom_density(data = dat, aes(x = rating, colour = cond)) +
  geom_segment(data = cdat.dens, aes(x = rating.mean, xend = rating.mean, y = 0, yend = dens.mean, colour = cond),
             linetype = "dashed", size = 1) +
  geom_segment(data = cdat.dens, aes(x = ci.low, xend = ci.low, y = 0, yend = dens.cilow, colour = cond),
             linetype = "dotted", size = 1) +
  geom_segment(data = cdat.dens, aes(x = ci.upp, xend = ci.upp, y = 0, yend = dens.ciupp, colour = cond),
               linetype = "dotted", size = 1)
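The ribbon code above types the interval limits in by hand. As a rough sketch of the same polygon idea with the limits computed instead of hard-coded, the following uses base R graphics and stats::density() for a single group. This is an illustration under those assumptions, not the answer's exact ggplot2 pipeline.

```r
set.seed(1234)
rating <- rnorm(200)

sem <- sd(rating) / sqrt(length(rating))
ci  <- mean(rating) + c(-2, 2) * sem          # lower and upper limit

d <- density(rating)
inside <- d$x >= ci[1] & d$x <= ci[2]         # grid points within the interval

# Polygon vertices: start on the baseline at the lower limit, rise to the
# curve, follow the curve, then drop back to the baseline at the upper limit.
poly <- data.frame(
  x = c(ci[1], ci[1], d$x[inside], ci[2], ci[2]),
  y = c(0, approx(d$x, d$y, xout = ci[1])$y, d$y[inside],
        approx(d$x, d$y, xout = ci[2])$y, 0)
)

plot(d, main = "Density with shaded interval")
polygon(poly$x, poly$y, col = "grey80", border = NA)
lines(d)                                      # redraw the curve on top of the fill
```

Adding the interpolated heights at both limits keeps the shaded edge vertical, which closes the small gap that appears when the polygon starts at the nearest grid point instead.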

Here is the adapted code for your real data. Incorporating two grouping variables instead of one is a bit tricky:

cdat <- dat %>%
  group_by(direction, cond) %>%
  summarise(rating.mean = mean(rating, na.rm = TRUE),
            sem = sd(rating, na.rm = TRUE)/sqrt(length(rating)),
            ci.low = mean(rating, na.rm = TRUE) - 2*sem,
            ci.upp = mean(rating, na.rm = TRUE) + 2*sem)

cdat.dens <- ggplot_build(ggplot(dat, aes(x=rating, colour=interaction(direction, cond))) + geom_density())$data[[1]] %>%
  mutate(cond = ifelse((group == 1 | group == 2 | group == 3 | group == 4), "A",
                        ifelse((group == 5 | group == 6 | group == 7 | group == 8), "B", "C")),
         direction = ifelse((group == 1 | group == 5 | group == 9), "EAST",
                            ifelse((group == 2 | group == 6 | group == 10), "NORTH",
                                   ifelse((group == 3 | group == 7 | group == 11), "SOUTH", "WEST")))) %>%
  left_join(cdat) %>%
  select(y, x, cond, direction, rating.mean, sem, ci.low, ci.upp) %>%
  group_by(cond, direction) %>%
  mutate(dens.mean = approx(x, y, xout = rating.mean)[[2]],
         dens.cilow = approx(x, y, xout = ci.low)[[2]],
         dens.ciupp = approx(x, y, xout = ci.upp)[[2]]) %>%
  select(-y, -x) %>%
  slice(1)

ggplot() +
  geom_density(data = dat, aes(x = rating, colour = cond)) +
  geom_segment(data = cdat.dens, aes(x = rating.mean, xend = rating.mean, y = 0, yend = dens.mean, colour = cond),
               linetype = "dashed", size = 1) +
  geom_segment(data = cdat.dens, aes(x = ci.low, xend = ci.low, y = 0, yend = dens.cilow, colour = cond),
               linetype = "dotted", size = 1) +
  geom_segment(data = cdat.dens, aes(x = ci.upp, xend = ci.upp, y = 0, yend = dens.ciupp, colour = cond),
               linetype = "dotted", size = 1) +
  facet_wrap(~direction)
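The nested ifelse calls above encode the group numbering by hand. A lookup table built from the levels of interaction() is one way to avoid that. The sketch below uses made-up toy data with the column names from the post, and it assumes that the group numbers follow the order of the interaction levels with every direction/cond combination present, and that no level name contains a dot.

```r
# Toy stand-in for the real data: only the two grouping columns are needed here
dat <- expand.grid(direction = c("EAST", "NORTH", "SOUTH", "WEST"),
                   cond = c("A", "B", "C"),
                   stringsAsFactors = TRUE)

# Levels run EAST.A, NORTH.A, SOUTH.A, WEST.A, EAST.B, ... (first factor varies fastest)
grp <- interaction(dat$direction, dat$cond)

# Lookup table: integer group index -> the two original factors
lookup <- data.frame(group = seq_along(levels(grp)),
                     label = levels(grp),
                     stringsAsFactors = FALSE)
parts <- do.call(rbind, strsplit(lookup$label, ".", fixed = TRUE))
lookup$direction <- parts[, 1]
lookup$cond <- parts[, 2]
lookup
```

Joining such a table on group, for example with left_join(lookup, by = "group"), would then replace the mutate() step with the nested ifelse calls. If a combination is missing from the data, the numbering may shift, so the table should be checked against the plot before relying on it.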
