Using sed to insert lines below a range of text

2024-05-11

I have a block of text in which some sections are clearly delimited by a four-space indent:

PERCHANCE he for whom this bell tolls may be so ill, as that he knows not it
tolls for him; and perchance I may think myself so much better than I am, as
that they who are about me, and see my state, may have caused it to toll for me,
and I know not that. 

    The church is Catholic, universal, so are all her actions; all that she does
    belongs to all. When she baptizes a child, that action concerns me; for that
    child is thereby connected to that body which is my head too, and ingrafted into
    that body whereof I am a member.

And when she buries a man, that action concerns me: all mankind is of one
author, and is one volume; when one man dies, one chapter is not torn out of the
book, but translated into a better language; and every chapter must be so
translated; God employs several translators; some pieces are translated by age,
some by sickness, some by war, some by justice; but God's hand is in every
translation, and his hand shall bind up all our scattered leaves again for that
library where every book shall lie open to one another.

    As therefore the bell that rings to a sermon calls not upon the preacher only,
    but upon the congregation to come, so this bell calls us all; but how much more
    me, who am brought so near the door by this sickness.

There was a contention as far as a suit (in which both piety and dignity,
religion and estimation, were mingled), which of the religious orders should
ring to prayers first in the morning; and it was determined, that they should
ring first that rose earliest.

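I would like each indented block to be immediately preceded by START QUOTE and immediately followed by END QUOTE. I have been playing with sed for fifteen minutes and still cannot get it quite right. This is my best effort so far: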

#!/usr/bin/sed -Ef
/^$/ {
N
    /\n    / {
    P
    s/^\n//
    i\
    START QUOTE
    }
}

/^    / {
N
    /\n$/ {
    s/\n$/&END QUOTE/
    G
    }
}

Running ./parse.sed <script.txt, I get the following output:

PERCHANCE he for whom this bell tolls may be so ill, as that he knows not it
tolls for him; and perchance I may think myself so much better than I am, as
that they who are about me, and see my state, may have caused it to toll for me,
and I know not that. 

START QUOTE
    The church is Catholic, universal, so are all her actions; all that she does
    belongs to all. When she baptizes a child, that action concerns me; for that
    child is thereby connected to that body which is my head too, and ingrafted into
    that body whereof I am a member.

And when she buries a man, that action concerns me: all mankind is of one
author, and is one volume; when one man dies, one chapter is not torn out of the
book, but translated into a better language; and every chapter must be so
translated; God employs several translators; some pieces are translated by age,
some by sickness, some by war, some by justice; but God's hand is in every
translation, and his hand shall bind up all our scattered leaves again for that
library where every book shall lie open to one another.

START QUOTE
    As therefore the bell that rings to a sermon calls not upon the preacher only,
    but upon the congregation to come, so this bell calls us all; but how much more
    me, who am brought so near the door by this sickness.
END QUOTE

There was a contention as far as a suit (in which both piety and dignity,
religion and estimation, were mingled), which of the religious orders should
ring to prayers first in the morning; and it was determined, that they should
ring first that rose earliest.

Note the missing END QUOTE on the first quoted block. I think the problem lies in the second command in the script:

/^    / {
N
    /\n$/ {
    s/\n$/&END QUOTE/
    G
    }
}

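It only finds the boundary at the end of a block correctly if the current line is the last line of the quoted block. Sometimes it is off by one: the boundary is split across two separate N commands and so is never recognized. Any pointers on the right way to do this in sed?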


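Using sed

When looking for the end of a quote, the original script reads lines in pairs. Consequently, it only finds the end of the quote if the quote contains an odd number of lines. The solution is to read the whole quote at once and then append END QUOTE to the end: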

#!/usr/bin/sed -Ef
/^$/ {
N
    /\n    / {
    P
    s/^\n//
    i\
    START QUOTE
    }
}

/^    / {
    :a;N;/\n$/!ba
    s/$/END QUOTE\n/
}

The key change here is :a;N;/\n$/!ba, which keeps reading lines until it finds an empty line.

[The above was tested under GNU sed. BSD (OSX) sed often differs slightly.]
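
To see the pairing problem in isolation, here is a minimal sketch that runs only the END QUOTE half of each script on a small made-up input. The `sample` text and the variable names `paired` and `looped` are illustrative and not part of the original scripts, and the `\n` in the replacement assumes GNU sed.

```shell
#!/bin/sh
# Hypothetical sample: one quote with an even number of lines (two).
sample='intro

    quote line 1
    quote line 2

outro'

# END QUOTE rule from the question: N joins lines in pairs, so the blank
# line after an even-length quote never shares a pair with its last line.
paired=$(printf '%s\n' "$sample" | sed -E '/^    / {
    N
    /\n$/ {
        s/\n$/&END QUOTE/
        G
    }
}')

# END QUOTE rule from the answer: keep appending lines until the pattern
# space ends in an empty line, then add the marker once (GNU sed assumed).
looped=$(printf '%s\n' "$sample" | sed -E '/^    / {
    :a
    N
    /\n$/!ba
    s/$/END QUOTE\n/
}')

printf '%s\n' '--- paired ---' "$paired" '--- looped ---' "$looped"
```

With this two-line quote, the paired version should pass the text through unchanged, while the looped version adds END QUOTE directly after the last quoted line. Note that the loop relies on a blank line following every quote; a quote that runs to the end of the file is a case this sketch does not handle.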

Using awk

sed can do anything, but logically complex tasks are often easier in awk. For your problem, try:

awk '/^    / && q{print;next} q{print "END QUOTE"; q=0} /^    /{print "START QUOTE"; q=1} 1' file

With your input, for example:

$ awk '/^    / && q{print;next} q{print "END QUOTE"; q=0} /^    /{print "START QUOTE"; q=1} 1' file
PERCHANCE he for whom this bell tolls may be so ill, as that he knows not it
tolls for him; and perchance I may think myself so much better than I am, as
that they who are about me, and see my state, may have caused it to toll for me,
and I know not that. 

START QUOTE
    The church is Catholic, universal, so are all her actions; all that she does
    belongs to all. When she baptizes a child, that action concerns me; for that
    child is thereby connected to that body which is my head too, and ingrafted into
    that body whereof I am a member.
END QUOTE

And when she buries a man, that action concerns me: all mankind is of one
author, and is one volume; when one man dies, one chapter is not torn out of the
book, but translated into a better language; and every chapter must be so
translated; God employs several translators; some pieces are translated by age,
some by sickness, some by war, some by justice; but God's hand is in every
translation, and his hand shall bind up all our scattered leaves again for that
library where every book shall lie open to one another.

START QUOTE
    As therefore the bell that rings to a sermon calls not upon the preacher only,
    but upon the congregation to come, so this bell calls us all; but how much more
    me, who am brought so near the door by this sickness.
END QUOTE

There was a contention as far as a suit (in which both piety and dignity,
religion and estimation, were mingled), which of the religious orders should
ring to prayers first in the morning; and it was determined, that they should
ring first that rose earliest.

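How it works

The script uses a single variable, q, which is 1 while we are inside a quote and zero otherwise.

  • /^    / && q{print;next}

    If q is true and the line starts with 4 spaces, print the line, skip the remaining commands, and jump to the next line.

  • q{print "END QUOTE"; q=0}

    If we get here while q is true, the line does not start with 4 spaces. That means the quote has just ended, so we print END QUOTE and reset q to false (0).

  • /^    /{print "START QUOTE"; q=1}

    If we reach a line that starts with 4 spaces, a quote has just begun. We print START QUOTE and set q to true (1).

  • 1

    This is awk's cryptic shorthand for printing the current line.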
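
The same four rules can be laid out one per line, which may be easier to maintain than the one-liner. The sketch below wraps them in a shell function; the name `wrap_quotes` is hypothetical, and the END rule is an addition, not part of the original answer. It assumes that a quote still open when the input ends should also be closed.

```shell
#!/bin/sh
# wrap_quotes: hypothetical helper name; reads text on stdin.
wrap_quotes() {
    awk '
        # Still inside a quote: print the line and move on.
        /^    / && q { print; next }
        # First non-indented line after a quote: close the quote.
        q { print "END QUOTE"; q = 0 }
        # First indented line of a block: open a quote.
        /^    / { print "START QUOTE"; q = 1 }
        # Print the current line.
        1
        # Assumption: a quote still open at end of input gets closed too.
        END { if (q) print "END QUOTE" }
    '
}

# Demo: the quote is the last thing in the input, with no trailing blank line.
printf 'intro\n\n    quoted\n' | wrap_quotes
```

Without the END rule, the one-liner would leave that final quote unterminated, because END QUOTE is only printed when a later non-indented line arrives.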
