Tsinghua NLP Group Recommends: 77 Must-Read Machine Reading Comprehension Papers

2023-11-13

https://mp.weixin.qq.com/s/2VhgEieBwXymAv2qxO3MPw

 

【Overview】Machine Reading Comprehension (MRC) is the task of having a machine read a text and then answer questions about what it has read. Reading comprehension is an important frontier of natural language processing and artificial intelligence: it is key to raising the level of machine intelligence and to giving machines the ability to acquire knowledge continuously, and in recent years it has drawn broad attention from both academia and industry. The Tsinghua NLP group recently open-sourced a must-read list of machine reading comprehension papers on GitHub. It is packed with substance; work through it and you will be one step closer to the NLP masters.
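
To make the task concrete before diving into the list: given a passage and a question, an extractive MRC system must return a piece of the passage as the answer. Below is a deliberately naive, self-contained sketch of that input/output contract, a lexical-overlap baseline that is not any paper's method; the example text, function names, and scoring heuristic are all illustrative. The neural readers in the list replace this heuristic with learned attention over the passage.

```python
import re

STOPWORDS = {"the", "a", "an", "and", "in", "of", "on", "to", "was", "is"}

def _norm(word: str) -> str:
    # Lowercase and strip punctuation so that "SQuAD," matches "squad".
    return re.sub(r"\W+", "", word.lower())

def answer_token(context: str, question: str, window: int = 4) -> str:
    """Toy MRC baseline: pick the context token whose neighborhood
    overlaps the question the most. Real systems predict full spans."""
    q_words = {_norm(w) for w in question.split()} - {""}
    raw = context.split()
    tokens = [_norm(w) for w in raw]
    best_score, best_token = -1, ""
    for i, tok in enumerate(tokens):
        # Answers rarely repeat question words, so skip those tokens
        # (and stopwords); score the rest by nearby question-word overlap.
        if not tok or tok in q_words or tok in STOPWORDS:
            continue
        nearby = set(tokens[max(0, i - window):i + window + 1])
        score = len(q_words & nearby)
        if score > best_score:
            best_score, best_token = score, raw[i]
    return best_token

context = ("SQuAD was released in 2016 and contains over 100,000 "
           "question-answer pairs collected from Wikipedia articles.")
print(answer_token(context, "When was SQuAD released?"))  # -> 2016
```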

 

GitHub | https://github.com/thunlp/RCPapers

Authors | Yankai Lin, Deming Ye, and Haozhe Ji

Compiled by | huaiwen

【Model Architectures】

  1. Memory networks. Jason Weston, Sumit Chopra, and Antoine Bordes. arXiv preprint arXiv:1410.3916 (2014).

  2. Teaching Machines to Read and Comprehend. Karl Moritz Hermann, Tomas Kocisky, Edward Grefenstette, Lasse Espeholt, Will Kay, Mustafa Suleyman, and Phil Blunsom. NIPS 2015.

  3. Text Understanding with the Attention Sum Reader Network. Rudolf Kadlec, Martin Schmid, Ondrej Bajgar, and Jan Kleindienst. ACL 2016.

  4. A Thorough Examination of the CNN/Daily Mail Reading Comprehension Task. Danqi Chen, Jason Bolton, and Christopher D. Manning. ACL 2016.

  5. Long Short-Term Memory-Networks for Machine Reading. Jianpeng Cheng, Li Dong, and Mirella Lapata. EMNLP 2016.

  6. Key-value Memory Networks for Directly Reading Documents. Alexander Miller, Adam Fisch, Jesse Dodge, Amir-Hossein Karimi, Antoine Bordes, and Jason Weston. EMNLP 2016.

  7. Modeling Human Reading with Neural Attention. Michael Hahn and Frank Keller. EMNLP 2016.

  8. Learning Recurrent Span Representations for Extractive Question Answering. Kenton Lee, Shimi Salant, Tom Kwiatkowski, Ankur Parikh, Dipanjan Das, and Jonathan Berant. arXiv preprint arXiv:1611.01436 (2016).

  9. Multi-Perspective Context Matching for Machine Comprehension. Zhiguo Wang, Haitao Mi, Wael Hamza, and Radu Florian. arXiv preprint arXiv:1612.04211 (2016).

  10. Natural Language Comprehension with the EpiReader. Adam Trischler, Zheng Ye, Xingdi Yuan, and Kaheer Suleman. EMNLP 2016.

  11. Iterative Alternating Neural Attention for Machine Reading. Alessandro Sordoni, Philip Bachman, Adam Trischler, and Yoshua Bengio. arXiv preprint arXiv:1606.02245 (2016).

  12. Bidirectional Attention Flow for Machine Comprehension. Minjoon Seo, Aniruddha Kembhavi, Ali Farhadi, and Hannaneh Hajishirzi. ICLR 2017.

  13. Machine Comprehension Using Match-LSTM and Answer Pointer. Shuohang Wang and Jing Jiang. arXiv preprint arXiv:1608.07905 (2016).

  14. Gated Self-matching Networks for Reading Comprehension and Question Answering. Wenhui Wang, Nan Yang, Furu Wei, Baobao Chang, and Ming Zhou. ACL 2017.

  15. Attention-over-attention Neural Networks for Reading Comprehension. Yiming Cui, Zhipeng Chen, Si Wei, Shijin Wang, Ting Liu, and Guoping Hu. ACL 2017.

  16. Gated-attention Readers for Text Comprehension. Bhuwan Dhingra, Hanxiao Liu, Zhilin Yang, William W. Cohen, and Ruslan Salakhutdinov. ACL 2017.

  17. A Constituent-Centric Neural Architecture for Reading Comprehension. Pengtao Xie and Eric Xing. ACL 2017.

  18. Structural Embedding of Syntactic Trees for Machine Comprehension. Rui Liu, Junjie Hu, Wei Wei, Zi Yang, and Eric Nyberg. EMNLP 2017.

  19. Accurate Supervised and Semi-Supervised Machine Reading for Long Documents. Izzeddin Gur, Daniel Hewlett, Alexandre Lacoste, and Llion Jones. EMNLP 2017.

  20. MEMEN: Multi-layer Embedding with Memory Networks for Machine Comprehension. Boyuan Pan, Hao Li, Zhou Zhao, Bin Cao, Deng Cai, and Xiaofei He. arXiv preprint arXiv:1707.09098 (2017).

  21. Dynamic Coattention Networks For Question Answering. Caiming Xiong, Victor Zhong, and Richard Socher. ICLR 2017.

  22. R-NET: Machine Reading Comprehension with Self-matching Networks. Natural Language Computing Group, Microsoft Research Asia.

  23. ReasoNet: Learning to Stop Reading in Machine Comprehension. Yelong Shen, Po-Sen Huang, Jianfeng Gao, and Weizhu Chen. KDD 2017.

  24. FusionNet: Fusing via Fully-Aware Attention with Application to Machine Comprehension. Hsin-Yuan Huang, Chenguang Zhu, Yelong Shen, and Weizhu Chen. ICLR 2018.

  25. Making Neural QA as Simple as Possible but not Simpler. Dirk Weissenborn, Georg Wiese, and Laura Seiffe. CoNLL 2017.

  26. Efficient and Robust Question Answering from Minimal Context over Documents. Sewon Min, Victor Zhong, Richard Socher, and Caiming Xiong. ACL 2018.

  27. Simple and Effective Multi-Paragraph Reading Comprehension. Christopher Clark and Matt Gardner. ACL 2018.

  28. Neural Speed Reading via Skim-RNN. Minjoon Seo, Sewon Min, Ali Farhadi, and Hannaneh Hajishirzi. ICLR 2018.

  29. Hierarchical Attention Flow for Multiple-Choice Reading Comprehension. Haichao Zhu, Furu Wei, Bing Qin, and Ting Liu. AAAI 2018.

  30. Towards Reading Comprehension for Long Documents. Yuanxing Zhang, Yangbin Zhang, Kaigui Bian, and Xiaoming Li. IJCAI 2018.

  31. Joint Training of Candidate Extraction and Answer Selection for Reading Comprehension. Zhen Wang, Jiachen Liu, Xinyan Xiao, Yajuan Lyu, and Tian Wu. ACL 2018.

  32. Multi-Passage Machine Reading Comprehension with Cross-Passage Answer Verification. Yizhong Wang, Kai Liu, Jing Liu, Wei He, Yajuan Lyu, Hua Wu, Sujian Li, and Haifeng Wang. ACL 2018.

  33. Reinforced Mnemonic Reader for Machine Reading Comprehension. Minghao Hu, Yuxing Peng, Zhen Huang, Xipeng Qiu, Furu Wei, and Ming Zhou. IJCAI 2018.

  34. Stochastic Answer Networks for Machine Reading Comprehension. Xiaodong Liu, Yelong Shen, Kevin Duh, and Jianfeng Gao. ACL 2018.

  35. Multi-Granularity Hierarchical Attention Fusion Networks for Reading Comprehension and Question Answering. Wei Wang, Ming Yan, and Chen Wu. ACL 2018.

  36. A Multi-Stage Memory Augmented Neural Network for Machine Reading Comprehension. Seunghak Yu, Sathish Indurthi, Seohyun Back, and Haejun Lee. ACL 2018 workshop.

  37. S-NET: From Answer Extraction to Answer Generation for Machine Reading Comprehension. Chuanqi Tan, Furu Wei, Nan Yang, Bowen Du, Weifeng Lv, and Ming Zhou. AAAI 2018.

  38. Ask the Right Questions: Active Question Reformulation with Reinforcement Learning. Christian Buck, Jannis Bulian, Massimiliano Ciaramita, Wojciech Gajewski, Andrea Gesmundo, Neil Houlsby, and Wei Wang. ICLR 2018.

  39. QANet: Combining Local Convolution with Global Self-Attention for Reading Comprehension. Adams Wei Yu, David Dohan, Minh-Thang Luong, Rui Zhao, Kai Chen, Mohammad Norouzi, and Quoc V. Le. ICLR 2018.

  40. Read + Verify: Machine Reading Comprehension with Unanswerable Questions. Minghao Hu, Furu Wei, Yuxing Peng, Zhen Huang, Nan Yang, and Ming Zhou. NAACL 2018.
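
A common thread through the span-extraction models above (e.g., the Match-LSTM answer pointer, BiDAF, R-NET): after attention-based encoding, the output layer produces two distributions over passage positions, one for the answer start and one for the end, and inference picks the span maximizing their product. Here is a minimal NumPy sketch of that output layer; the random matrices stand in for trained encoders, and the bilinear scoring follows the style of Chen et al.'s attentive readers rather than any single paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

# Stand-ins for the outputs of trained encoders:
#   P: passage token representations, shape (n_tokens, d)
#   q: a single question summary vector, shape (d,)
n_tokens, d = 30, 64
P = rng.standard_normal((n_tokens, d))
q = rng.standard_normal(d)

# Bilinear start/end scorers: score(i) = p_i^T W q, one matrix each.
W_start = 0.1 * rng.standard_normal((d, d))
W_end = 0.1 * rng.standard_normal((d, d))

p_start = softmax(P @ W_start @ q)  # P(answer starts at position i)
p_end = softmax(P @ W_end @ q)      # P(answer ends at position j)

# Decode the best span (i <= j, bounded length) by maximizing
# p_start[i] * p_end[j], the standard inference rule for these models.
best, span = -1.0, (0, 0)
for i in range(n_tokens):
    for j in range(i, min(i + 15, n_tokens)):
        if p_start[i] * p_end[j] > best:
            best, span = p_start[i] * p_end[j], (i, j)
print("predicted span:", span)
```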

 

【Leveraging External Knowledge】

  1. Leveraging Knowledge Bases in LSTMs for Improving Machine Reading. Bishan Yang and Tom Mitchell. ACL 2017.

  2. Learned in Translation: Contextualized Word Vectors. Bryan McCann, James Bradbury, Caiming Xiong, and Richard Socher. arXiv preprint arXiv:1708.00107 (2017).

  3. Knowledgeable Reader: Enhancing Cloze-Style Reading Comprehension with External Commonsense Knowledge. Todor Mihaylov and Anette Frank. ACL 2018.

  4. A Comparative Study of Word Embeddings for Reading Comprehension. Bhuwan Dhingra, Hanxiao Liu, Ruslan Salakhutdinov, and William W. Cohen. arXiv preprint arXiv:1703.00993 (2017).

  5. Deep contextualized word representations. Matthew E. Peters, Mark Neumann, Mohit Iyyer, Matt Gardner, Christopher Clark, Kenton Lee, and Luke Zettlemoyer. NAACL 2018.

  6. Improving Language Understanding by Generative Pre-Training. Alec Radford, Karthik Narasimhan, Tim Salimans, and Ilya Sutskever. OpenAI.

  7. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. arXiv preprint arXiv:1810.04805 (2018).
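
The pre-training line above (CoVe, ELMo, GPT, BERT) is what most practitioners now build on. As a hedged practical note, a SQuAD-fine-tuned BERT-style checkpoint can be applied in a few lines with the HuggingFace transformers library; the library and the specific model name are assumptions for illustration, not part of the original list.

```python
# Requires: pip install transformers torch
# The checkpoint name below is one publicly available SQuAD-fine-tuned
# model; any similar checkpoint works.
from transformers import pipeline

qa = pipeline("question-answering",
              model="distilbert-base-cased-distilled-squad")

result = qa(
    question="What does BERT pre-train?",
    context=("BERT pre-trains deep bidirectional Transformer "
             "representations on unlabeled text, which can then be "
             "fine-tuned for downstream tasks such as question answering."),
)
print(result["answer"], round(result["score"], 3))
```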

 

【Exploration】

  1. Adversarial Examples for Evaluating Reading Comprehension Systems. Robin Jia, and Percy Liang. EMNLP 2017.

  2. Did the Model Understand the Question? Pramod Kaushik Mudrakarta, Ankur Taly, Mukund Sundararajan, and Kedar Dhamdhere. ACL 2018.
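
Jia and Liang's paper above shows that appending a single distracting sentence to a SQuAD paragraph, one that mimics the question's surface form without changing the true answer, sharply degrades strong readers. A toy illustration of the idea follows; the hand-written distractor here is far cruder than the paper's ADDSENT procedure, and the example text is illustrative.

```python
def add_distractor(context: str, distractor: str) -> str:
    # In the spirit of Jia & Liang (2017): append a sentence that is
    # lexically similar to the question but semantically irrelevant,
    # leaving the original answer unchanged.
    return context.rstrip() + " " + distractor

context = ("Peyton Manning became the first quarterback to win a Super "
           "Bowl with two different teams.")
adversarial = add_distractor(
    context,
    "Jeff Dean became the first quarterback to win a Chess Bowl "
    "with two different teams.")
print(adversarial)  # a lexical-overlap reader may now answer "Jeff Dean"
```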

 

【Open-Domain Question Answering】

  1. Reading Wikipedia to Answer Open-Domain Questions. Danqi Chen, Adam Fisch, Jason Weston, and Antoine Bordes. ACL 2017.

  2. R^3: Reinforced Reader-Ranker for Open-Domain Question Answering. Shuohang Wang, Mo Yu, Xiaoxiao Guo, Zhiguo Wang, Tim Klinger, Wei Zhang, Shiyu Chang, Gerald Tesauro, Bowen Zhou, and Jing Jiang. AAAI 2018.

  3. Evidence Aggregation for Answer Re-Ranking in Open-Domain Question Answering. Shuohang Wang, Mo Yu, Jing Jiang, Wei Zhang, Xiaoxiao Guo, Shiyu Chang, Zhiguo Wang, Tim Klinger, Gerald Tesauro, and Murray Campbell. ICLR 2018.

  4. Denoising Distantly Supervised Open-Domain Question Answering. Yankai Lin, Haozhe Ji, Zhiyuan Liu, and Maosong Sun. ACL 2018.
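
These open-domain systems share the retriever-reader pipeline introduced by DrQA (paper 1 above): first retrieve candidate documents, then run a reading comprehension model over each and aggregate the answers. Below is a minimal sketch of the retrieval half using scikit-learn's TF-IDF; the tiny corpus and the choice of scikit-learn are assumptions for illustration (DrQA's own retriever uses hashed-bigram TF-IDF over Wikipedia).

```python
# Requires: pip install scikit-learn
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

docs = [
    "SQuAD is a reading comprehension dataset built from Wikipedia.",
    "The Eiffel Tower is located in Paris and opened in 1889.",
    "TriviaQA pairs trivia questions with distantly supervised evidence.",
]

vectorizer = TfidfVectorizer(ngram_range=(1, 2))
doc_vecs = vectorizer.fit_transform(docs)

def retrieve(question: str, k: int = 2):
    # Rank documents by cosine similarity to the question. The top-k
    # documents would then go to a neural reader for span extraction
    # (the second half of the pipeline, omitted here).
    q_vec = vectorizer.transform([question])
    scores = cosine_similarity(q_vec, doc_vecs)[0]
    return sorted(zip(scores, docs), reverse=True)[:k]

for score, doc in retrieve("Where is the Eiffel Tower?"):
    print(f"{score:.3f}  {doc}")
```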

 

【Datasets】

  1. (SQuAD 1.0) SQuAD: 100,000+ Questions for Machine Comprehension of Text. Pranav Rajpurkar, Jian Zhang, Konstantin Lopyrev, and Percy Liang. EMNLP 2016.

  2. (SQuAD 2.0) Know What You Don't Know: Unanswerable Questions for SQuAD. Pranav Rajpurkar, Robin Jia, and Percy Liang. ACL 2018.

  3. (MS MARCO) MS MARCO: A Human Generated MAchine Reading COmprehension Dataset. Tri Nguyen, Mir Rosenberg, Xia Song, Jianfeng Gao, Saurabh Tiwary, Rangan Majumder, and Li Deng. arXiv preprint arXiv:1611.09268 (2016).

  4. (Quasar) Quasar: Datasets for Question Answering by Search and Reading. Bhuwan Dhingra, Kathryn Mazaitis, and William W. Cohen. arXiv preprint arXiv:1707.03904 (2017).

  5. (TriviaQA) TriviaQA: A Large Scale Distantly Supervised Challenge Dataset for Reading Comprehension. Mandar Joshi, Eunsol Choi, Daniel S. Weld, and Luke Zettlemoyer. arXiv preprint arXiv:1705.03551 (2017).

  6. (SearchQA) SearchQA: A New Q&A Dataset Augmented with Context from a Search Engine. Matthew Dunn, Levent Sagun, Mike Higgins, V. Ugur Guney, Volkan Cirik, and Kyunghyun Cho. arXiv preprint arXiv:1704.05179 (2017).

  7. (QuAC) QuAC: Question Answering in Context. Eunsol Choi, He He, Mohit Iyyer, Mark Yatskar, Wen-tau Yih, Yejin Choi, Percy Liang, and Luke Zettlemoyer. arXiv preprint arXiv:1808.07036 (2018).

  8. (CoQA) CoQA: A Conversational Question Answering Challenge. Siva Reddy, Danqi Chen, and Christopher D. Manning. arXiv preprint arXiv:1808.07042 (2018).

  9. (MCTest) MCTest: A Challenge Dataset for the Open-Domain Machine Comprehension of Text. Matthew Richardson, Christopher J.C. Burges, and Erin Renshaw. EMNLP 2013.

  10. (CNN/Daily Mail) Teaching Machines to Read and Comprehend. Karl Moritz Hermann, Tomas Kocisky, Edward Grefenstette, Lasse Espeholt, Will Kay, Mustafa Suleyman, and Phil Blunsom. NIPS 2015.

  11. (CBT) The Goldilocks Principle: Reading Children's Books with Explicit Memory Representations. Felix Hill, Antoine Bordes, Sumit Chopra, and Jason Weston. arXiv preprint arXiv:1511.02301 (2015).

  12. (bAbi) Towards AI-Complete Question Answering: A Set of Prerequisite Toy Tasks. Jason Weston, Antoine Bordes, Sumit Chopra, Alexander M. Rush, Bart van Merriënboer, Armand Joulin, and Tomas Mikolov. arXiv preprint arXiv:1502.05698 (2015).

  13. (LAMBADA) The LAMBADA Dataset: Word Prediction Requiring a Broad Discourse Context. Denis Paperno, Germán Kruszewski, Angeliki Lazaridou, Quan Ngoc Pham, Raffaella Bernardi, Sandro Pezzelle, Marco Baroni, Gemma Boleda, and Raquel Fernández. ACL 2016.

  14. (SCT) LSDSem 2017 Shared Task: The Story Cloze Test. Nasrin Mostafazadeh, Michael Roth, Annie Louis, Nathanael Chambers, and James F. Allen. ACL 2017 workshop.

  15. (Who did What) Who did What: A Large-Scale Person-Centered Cloze Dataset. Takeshi Onishi, Hai Wang, Mohit Bansal, Kevin Gimpel, and David McAllester. EMNLP 2016.

  16. (NewsQA) NewsQA: A Machine Comprehension Dataset. Adam Trischler, Tong Wang, Xingdi Yuan, Justin Harris, Alessandro Sordoni, Philip Bachman, and Kaheer Suleman. arXiv preprint arXiv:1611.09830 (2016).

  17. (RACE) RACE: Large-scale ReAding Comprehension Dataset From Examinations. Guokun Lai, Qizhe Xie, Hanxiao Liu, Yiming Yang, and Eduard Hovy. EMNLP 2017.

  18. (ARC) Think you have Solved Question Answering? Try ARC, the AI2 Reasoning Challenge. Peter Clark, Isaac Cowhey, Oren Etzioni, Tushar Khot, Ashish Sabharwal, Carissa Schoenick, and Oyvind Tafjord. arXiv preprint arXiv:1803.05457 (2018).

  19. (MCScript) MCScript: A Novel Dataset for Assessing Machine Comprehension Using Script Knowledge. Simon Ostermann, Ashutosh Modi, Michael Roth, Stefan Thater, and Manfred Pinkal. arXiv preprint arXiv:1803.05223 (2018).

  20. (NarrativeQA) The NarrativeQA Reading Comprehension Challenge. Tomáš Kočiský, Jonathan Schwarz, Phil Blunsom, Chris Dyer, Karl Moritz Hermann, Gábor Melis, and Edward Grefenstette. TACL 2018.

  21. (DuoRC) DuoRC: Towards Complex Language Understanding with Paraphrased Reading Comprehension. Amrita Saha, Rahul Aralikatte, Mitesh M. Khapra, and Karthik Sankaranarayanan. ACL 2018.

  22. (CLOTH) Large-scale Cloze Test Dataset Created by Teachers. Qizhe Xie, Guokun Lai, Zihang Dai, and Eduard Hovy. EMNLP 2018.

  23. (DuReader) DuReader: a Chinese Machine Reading Comprehension Dataset from Real-world Applications. Wei He, Kai Liu, Yajuan Lyu, Shiqi Zhao, Xinyan Xiao, Yuan Liu, Yizhong Wang, Hua Wu, Qiaoqiao She, Xuan Liu, Tian Wu, and Haifeng Wang. ACL 2018 Workshop.

  24. (CliCR) CliCR: a Dataset of Clinical Case Reports for Machine Reading Comprehension. Simon Suster and Walter Daelemans. NAACL 2018.
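
A practical note on the list above: most extractive datasets follow the JSON layout SQuAD introduced (items 1–2): articles contain paragraphs, each paragraph carries a context string and a list of question-answer pairs, and each answer is given as text plus a character offset into the context. A minimal loading sketch follows; the field names are the published SQuAD v1.1 schema, and the file path is a placeholder for the official training file. SQuAD 2.0 additionally marks unanswerable questions with an is_impossible flag and an empty answers list.

```python
import json

# "train-v1.1.json" is a placeholder path for the official SQuAD file.
with open("train-v1.1.json", encoding="utf-8") as f:
    squad = json.load(f)

examples = []
for article in squad["data"]:
    for paragraph in article["paragraphs"]:
        context = paragraph["context"]
        for qa in paragraph["qas"]:
            # Each answer carries its text and the character offset of
            # the span within the context string.
            answer = qa["answers"][0]
            examples.append({
                "id": qa["id"],
                "question": qa["question"],
                "context": context,
                "answer_text": answer["text"],
                "answer_start": answer["answer_start"],
            })

print(len(examples), "examples loaded")
```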

 

-END-
