Does Google Scholar have an API available that we can use in our research applications?

2024-03-20

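I'm working on a research publication and collaboration project that includes a literature search feature. Google Scholar seems like it would work, since it's an open source tool, but when I researched Google Scholar I couldn't find any information about it having an API.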

Is there an API for Google Scholar?

There is no official Google Scholar API https://academia.stackexchange.com/questions/34970/how-to-get-permission-from-google-to-use-google-scholar-data-if-needed/34973#34973.

There are third-party solutions, such as the free scholarly https://github.com/scholarly-python-package/scholarly Python package, which supports profile https://scholarly.readthedocs.io/en/stable/quickstart.html#search-by-keyword-and-return-a-generator-of-author-objects, author https://scholarly.readthedocs.io/en/stable/quickstart.html#search-for-an-author-by-name-and-return-a-generator-of-author-objects, cite https://scholarly.readthedocs.io/en/stable/quickstart.html#citedby and organic https://github.com/scholarly-python-package/scholarly/blob/9269ff36ad2314e6cc0c5b499efc3b79b844707e/scholarly/_scholarly.py#L24 results (search_pubs https://scholarly.readthedocs.io/en/stable/quickstart.html?highlight=organic#search-pubs seems to be the method for getting organic results, although the method name confused me).

Note that when using scholarly, Google may block your IP if you keep sending requests without any rate limiting (as @RadioControlled mentioned https://stackoverflow.com/questions/62938110/does-google-scholar-have-an-api-available-that-we-can-use-in-our-research-applic/71236400?noredirect=1#comment131414631_71236400). Use it wisely.
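One way to apply rate limiting on the client side is to pause between items pulled from a result generator. The following is a minimal sketch, not part of scholarly: the helper name `throttled` and the 2 to 5 second delay range are assumptions chosen for illustration, and no particular delay is known to be safe against blocking.

```python
import random
import time


def throttled(items, min_delay=2.0, max_delay=5.0, sleep=time.sleep):
    """Yield each item, then pause a random interval before fetching the next.

    `throttled` is a hypothetical helper; the default delays are arbitrary
    assumptions, not limits documented by Google or by scholarly.
    """
    for item in items:
        yield item
        # The pause runs before the underlying iterator is asked for the
        # next item, so consecutive fetches are always separated by a delay.
        sleep(random.uniform(min_delay, max_delay))


# Possible usage with a lazy result generator (requires network access):
#   from scholarly import scholarly
#   for author in throttled(scholarly.search_keyword("biology")):
#       print(author["name"])
```

Because the wrapper is a generator itself, breaking out of the loop early stops both the fetching and the sleeping.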

Additionally, there is the scrape-google-scholar-py https://github.com/dimitryzub/scrape-google-scholar-py module, which lets you extract data from almost all Google Scholar pages.

Alternatively, there is the Google Scholar API https://serpapi.com/google-scholar-api from SerpApi, a paid API with a free plan. It supports organic https://serpapi.com/google-scholar-organic-results, cite https://serpapi.com/google-scholar-cite-api, profile https://serpapi.com/google-scholar-profiles-api and author https://serpapi.com/google-scholar-author-api results. Blocks are bypassed on SerpApi's backend, so your own IP won't get blocked, and SerpApi handles the legal part of scraping.

Example code to parse profile results with scholarly, using the search_keyword https://scholarly.readthedocs.io/en/stable/quickstart.html#search-by-keyword-and-return-a-generator-of-author-objects method:

import json
from scholarly import scholarly

# will paginate to the next page by default
authors = scholarly.search_keyword("biology")

for author in authors:
    print(json.dumps(author, indent=2))

# part of the output:

'''
{
  "container_type": "Author",
  "filled": [],
  "source": "SEARCH_AUTHOR_SNIPPETS",
  "scholar_id": "LXVfPc8AAAAJ",
  "url_picture": "https://scholar.google.com/citations?view_op=medium_photo&user=LXVfPc8AAAAJ",
  "name": "Eric Lander",
  "affiliation": "Broad Institute",
  "email_domain": "",
  "interests": [
    "Biology",
    "Genomics",
    "Genetics",
    "Bioinformatics",
    "Mathematics"
  ],
  "citedby": 552013
}
... other author results
'''
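Since the search above returns a generator that keeps paginating by default, it can help to cap how many results are consumed. A minimal sketch, assuming a hypothetical helper named `take` (it is not part of scholarly):

```python
from itertools import islice


def take(results, limit):
    """Return at most `limit` items from an iterator as a list.

    islice stops pulling from `results` once `limit` items were read,
    so a lazy generator is not asked for further pages.
    """
    return list(islice(results, limit))


# Possible usage (requires network access):
#   from scholarly import scholarly
#   first_ten = take(scholarly.search_keyword("biology"), 10)
```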

Example usage of scrape-google-scholar-py https://github.com/dimitryzub/scrape-google-scholar-py#example-usage-custom-backend:

from google_scholar_py import CustomGoogleScholarProfiles
import json

parser = CustomGoogleScholarProfiles()
data = parser.scrape_google_scholar_profiles(
    query='blizzard',
    pagination=False,
    save_to_csv=False,
    save_to_json=False
)
print(json.dumps(data, indent=2))

Outputs:

[
  {
    "name": "Adam Lobel",
    "link": "https://scholar.google.com/citations?hl=en&user=_xwYD2sAAAAJ",
    "affiliations": "Blizzard Entertainment",
    "interests": [
      "Gaming",
      "Emotion regulation"
    ],
    "email": "Verified email at AdamLobel.com",
    "cited_by_count": 3593
  }, # other results...
]

Example code to parse profile results using the Google Scholar Profiles API https://serpapi.com/google-scholar-profiles-api from SerpApi:

import json
from serpapi import GoogleScholarSearch

# search parameters
params = {
    "api_key": "Your SerpApi API key",
    "engine": "google_scholar_profiles",
    "hl": "en",                            # language
    "mauthors": "biology"                  # search query
}

search = GoogleScholarSearch(params)
results = search.get_dict()

# only first page results
for result in results["profiles"]:
    print(json.dumps(result, indent=2))

# part of the output:
'''
{
  "name": "Masatoshi Nei",
  "link": "https://scholar.google.com/citations?hl=en&user=VxOmZDgAAAAJ",
  "serpapi_link": "https://serpapi.com/search.json?author_id=VxOmZDgAAAAJ&engine=google_scholar_author&hl=en",
  "author_id": "VxOmZDgAAAAJ",
  "affiliations": "Laura Carnell Professor of Biology, Temple University",
  "email": "Verified email at temple.edu",
  "cited_by": 384074,
  "interests": [
    {
      "title": "Evolution",
      "serpapi_link": "https://serpapi.com/search.json?engine=google_scholar_profiles&hl=en&mauthors=label%3Aevolution",
      "link": "https://scholar.google.com/citations?hl=en&view_op=search_authors&mauthors=label:evolution"
    },
    {
      "title": "Evolutionary biology",
      "serpapi_link": "https://serpapi.com/search.json?engine=google_scholar_profiles&hl=en&mauthors=label%3Aevolutionary_biology",
      "link": "https://scholar.google.com/citations?hl=en&view_op=search_authors&mauthors=label:evolutionary_biology"
    },
    {
      "title": "Molecular evolution",
      "serpapi_link": "https://serpapi.com/search.json?engine=google_scholar_profiles&hl=en&mauthors=label%3Amolecular_evolution",
      "link": "https://scholar.google.com/citations?hl=en&view_op=search_authors&mauthors=label:molecular_evolution"
    },
    {
      "title": "Population genetics",
      "serpapi_link": "https://serpapi.com/search.json?engine=google_scholar_profiles&hl=en&mauthors=label%3Apopulation_genetics",
      "link": "https://scholar.google.com/citations?hl=en&view_op=search_authors&mauthors=label:population_genetics"
    },
    {
      "title": "Phylogenetics",
      "serpapi_link": "https://serpapi.com/search.json?engine=google_scholar_profiles&hl=en&mauthors=label%3Aphylogenetics",
      "link": "https://scholar.google.com/citations?hl=en&view_op=search_authors&mauthors=label:phylogenetics"
    }
  ],
  "thumbnail": "https://scholar.googleusercontent.com/citations?view_op=small_photo&user=VxOmZDgAAAAJ&citpid=3"
}
... other results
'''
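If the profile results need to go into a spreadsheet, the nested `interests` list has to be flattened first. The sketch below uses only the keys visible in the sample output above; the function name `profiles_to_csv` and the `; ` separator are assumptions for illustration.

```python
import csv
import io

FIELDS = ["name", "author_id", "affiliations", "cited_by", "interests"]


def profiles_to_csv(profiles):
    """Convert a list of profile dicts into CSV text (hypothetical helper)."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    for profile in profiles:
        # Keep only the interest titles and join them into a single cell.
        interests = profile.get("interests") or []
        titles = "; ".join(item.get("title", "") for item in interests)
        writer.writerow({
            "name": profile.get("name", ""),
            "author_id": profile.get("author_id", ""),
            "affiliations": profile.get("affiliations", ""),
            "cited_by": profile.get("cited_by", ""),
            "interests": titles,
        })
    return buffer.getvalue()


# Possible usage with the `results` dict from the example above:
#   print(profiles_to_csv(results["profiles"]))
```

Missing keys fall back to empty cells, so profiles without listed interests or citation counts still produce a row.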

I also wrote a dedicated blog post at SerpApi, Scrape historic Google Scholar results using Python https://serpapi.com/blog/scrape-historic-google-scholar-results-using-python/, which shows how to scrape historic 2017-2021 Organic and Cite Google Scholar results to CSV and SQLite.

If you're not a Python fan, there is also a post on how to Scrape Google Scholar in R https://dimitryzub.medium.com/scrape-google-scholar-in-r-d521cfe0e8d.

Disclaimer: I work for SerpApi.
