With Git 2.10 (Q3 2016), you can learn more about where those broken links come from.
git fsck --name-objects
See commit 90cf590 https://github.com/git/git/commit/90cf590f53f2939a47ca7b397e270e8228699829, commit 1cd772c https://github.com/git/git/commit/1cd772cc4124e43b14231dcaeae8a5dddf4ffdb9, commit 7b35efd https://github.com/git/git/commit/7b35efd734e501f9e4692768a8b6aea818c0c93e, commit 993a21b https://github.com/git/git/commit/993a21b0a05bf2e2063c58e5722c29f5747e39d4 (17 Jul 2016) by Johannes Schindelin (dscho) https://github.com/dscho.
(Merged by Junio C Hamano -- gitster -- https://github.com/gitster in commit 9db3979 https://github.com/git/git/commit/9db397978416f9d562a94e55db86c7a45210a05c, 25 Jul 2016)
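fsck: optionally show more helpful info for broken links
When reporting broken links between commits/trees/blobs, it would be quite helpful at times if the user would be told how the object is supposed to be reachable.
With the new --name-objects option, git-fsck will try to do exactly that: name the objects in a way that shows how they are reachable.
For example, when some reflog got corrupted and a blob is missing that should not be, the user might want to remove the corresponding reflog entry.
This option helps them find that entry: git fsck --name-objects will now report something like this: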
broken link from tree b5eb6ff... (refs/stash@{<date>}~37:)
to blob ec5cf80...
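If those broken links do not come from the local repository but from a remote one, fetching those pack objects can resolve the issue https://stackoverflow.com/a/26228383/6309.
See also "How to recover Git objects damaged by hard disk failure? https://stackoverflow.com/a/22694491/6309".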
With Git 2.31 (Q1 2021), fix "git fsck --name-objects https://github.com/git/git/blob/9e634a91c8b6f57508aa91bd7306194d6ef6c14a/Documentation/git-fsck.txt#L94"(man https://git-scm.com/docs/git-fsck#Documentation/git-fsck.txt---name-objects) which apparently has not been used by anybody who is motivated enough to report breakage.
See commit e89f893 https://github.com/git/git/commit/e89f89361cd7b706858eb22a6cf3d59d31a00acf, commit 8c891ee https://github.com/git/git/commit/8c891eed3a89ff945b7957cdf62037b2e2b6eca7 (10 Feb 2021) by Johannes Schindelin (dscho) https://github.com/dscho.
(Merged by Junio C Hamano -- gitster -- https://github.com/gitster in commit 9e634a9 https://github.com/git/git/commit/9e634a91c8b6f57508aa91bd7306194d6ef6c14a, 17 Feb 2021)
fsck --name-objects https://github.com/git/git/commit/e89f89361cd7b706858eb22a6cf3d59d31a00acf: be more careful parsing generation numbers
Signed-off-by: Johannes Schindelin
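In 7b35efd https://github.com/git/git/commit/7b35efd734e501f9e4692768a8b6aea818c0c93e (fsck_walk(): optionally name objects on the go, 2016-07-17, Git v2.10.0-rc0 -- merge https://github.com/git/git/commit/9db397978416f9d562a94e55db86c7a45210a05c listed in batch #7 https://github.com/git/git/commit/8c6d1f9807c67532e7fb545a944b064faff0f70b), the fsck machinery learned to optionally name the objects, so that it is easier to see what part of the repository is in bad shape, say, when objects are missing.
To save on complexity, this machinery uses a parser to determine the name of a parent given a commit's name: any ~<n> suffix is parsed, and the parent's name is formed from the prefix together with ~<n+1>.
However, this parser has a bug: if it finds a suffix <n> that is not ~<n>, it will mistake the empty string for the prefix and <n> for the generation number.
In other words, it will generate a name of the form ~<bogus-number>.
Let's fix this.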
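To make the bug concrete, here is a minimal Python sketch of the naming rule described above. It is not Git's C code: the function names first_parent_name and buggy_first_parent_name are hypothetical, and naming the parent of a commit without a ~<n> suffix as <name>~1 is an assumption made for this illustration.

```python
import re

# Matches "<prefix>~<n>" where <n> is one or more digits at the very end.
_GENERATION_SUFFIX = re.compile(r"^(?P<prefix>.*)~(?P<n>[0-9]+)$", re.DOTALL)


def first_parent_name(name: str) -> str:
    """Careful variant: only a real "~<n>" suffix counts as a generation."""
    match = _GENERATION_SUFFIX.match(name)
    if match:
        return f"{match['prefix']}~{int(match['n']) + 1}"
    # Assumption for this sketch: no "~<n>" suffix means generation 0.
    return f"{name}~1"


def buggy_first_parent_name(name: str) -> str:
    """Re-creation of the described bug: trailing digits are taken as the
    generation number even when no "~" precedes them."""
    digits = len(name) - len(name.rstrip("0123456789"))
    if digits == 0:
        return f"{name}~1"
    number = int(name[-digits:])
    head = name[:-digits]
    # Bug: without a "~" before the digits, the prefix is lost entirely.
    prefix = head[:-1] if head.endswith("~") else ""
    return f"{prefix}~{number + 1}"


if __name__ == "__main__":
    for candidate in ("main~37", "refs/heads/v2"):
        print(candidate, "->", first_parent_name(candidate),
              "| buggy:", buggy_first_parent_name(candidate))
```

With a name such as refs/heads/v2, the buggy variant drops the prefix and yields a name of the form ~<bogus-number>, while the careful variant leaves the ref name intact.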
With Git 2.40 (Q1 2023), "git hash-object https://github.com/git/git/blob/abf2bb895b429e9fefc478d9c230bf74622be620/Documentation/git-hash-object.txt"(man https://git-scm.com/docs/git-hash-object) now checks that the resulting object is well formed with the same code as "git fsck".
See commit 8e43090 https://github.com/git/git/commit/8e4309038f0a72aef950f87fe187af824fc8efc0 (19 Jan 2023), and commit 69bbbe4 https://github.com/git/git/commit/69bbbe484ba10bd88efb9ae3f6a58fcc687df69e, commit 35ff327 https://github.com/git/git/commit/35ff327e2da2e9fa9820643d2e44f3b30530d06c, commit 34959d8 https://github.com/git/git/commit/34959d80db602b7d6893c9e2dfa81d78fd16f702, commit ad5dfea https://github.com/git/git/commit/ad5dfeac040c16057a23f341408d229656e42ab4, commit 61cc4be https://github.com/git/git/commit/61cc4be7ec21f0217abacc396287ca12c68e923d, commit 6e26460 https://github.com/git/git/commit/6e2646075c456f2bd3dfe6afd7d72316174b02ed (18 Jan 2023) by Jeff King (peff) https://github.com/peff.
(Merged by Junio C Hamano -- gitster -- https://github.com/gitster in commit abf2bb8 https://github.com/git/git/commit/abf2bb895b429e9fefc478d9c230bf74622be620, 30 Jan 2023)
hash-object https://github.com/git/git/commit/69bbbe484ba10bd88efb9ae3f6a58fcc687df69e: use fsck for object checks
Signed-off-by: Jeff King
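Since c879daa https://github.com/git/git/commit/c879daa23729547fb28aa7e8783c5e4e619a9e7c ("Make hash-object more robust against malformed objects", 2011-02-05, Git v1.7.5-rc0 -- merge https://github.com/git/git/commit/fc7ae9c156775cc9679c0bcc7156abb7dba1bd3a), we've done some rudimentary checks against objects we're about to write by running them through our usual parsers for trees, commits, and tags.
These parsers catch some problems, but they are not nearly as careful as the fsck functions (which makes sense; the parsers are designed to be fast and forgiving, bailing only when the input is unintelligible).
We are better off doing the more thorough fsck checks when writing objects.
Doing so at write time is much better than writing garbage only to find out later (after building more history atop it!) that fsck complains about it, or that hosts with transfer.fsckObjects reject it.
This is obviously going to be a user-visible behavior change, and the test changes earlier in this series show the scope of the impact.
But I'd argue that this is OK: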
- the documentation for hash-object is already vague about which checks we might do, saying that --literally will allow "any garbage[...] which might not otherwise pass standard object parsing or git-fsck https://github.com/git/git/blob/69bbbe484ba10bd88efb9ae3f6a58fcc687df69e/Documentation/git-fsck.txt(man https://git-scm.com/docs/git-fsck) checks".
So we are already covered under the documented behavior.
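- users don't generally run hash-object anyway.
There are a lot of spots in the tests that needed to be updated, because creating garbage objects is something that Git's tests do disproportionately often.
- it's hard to imagine anyone thinking the new behavior is worse.
Any object we reject would be a potential problem down the road for the user.
And if they really want to create garbage, --literally is already the escape hatch they need.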
Note that the change here is actually in index_mem(), which handles the HASH_FORMAT_CHECK flag passed by hash-object.
That flag is also used by "git-replace --edit https://github.com/git/git/blob/69bbbe484ba10bd88efb9ae3f6a58fcc687df69e/Documentation/git-replace.txt"(man https://git-scm.com/docs/git-replace) to sanity-check the result.
Covering that with more thorough checks likewise seems like a good thing.
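Besides being more thorough, there are a few other bonuses:
- we get rid of some questionable stack allocations of object structs.
These don't seem to currently cause any problems in practice, but they subtly violate some of the assumptions made by the rest of the code (e.g., the "struct commit" we put on the stack and zero-initialize will not have a proper index from alloc_commit_index()).
- likewise, those parsed object structs are the source of some small memory leaks
- the resulting messages are much better.
For example: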
[before]
$ echo 'tree 123' | git hash-object -t commit --stdin
error: bogus commit object 0000000000000000000000000000000000000000
fatal: corrupt commit
[after]
$ echo 'tree 123' | git.compile hash-object -t commit --stdin
error: object fails fsck: badTreeSha1: invalid 'tree' line format - bad sha1
fatal: refusing to create malformed object
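As a rough illustration of the kind of strict check that rejects 'tree 123' above, here is a small Python sketch. It is not Git's fsck code: the helper check_commit_tree_line is hypothetical, it only looks at the first header line, and it assumes a SHA-1 repository where an object name is 40 lowercase hex digits. The badTreeSha1 text is copied from the output above; the other message wording is made up for this example.

```python
import re
from typing import Optional

# Assumption: SHA-1 object names, written as 40 lowercase hex digits.
_SHA1_HEX = re.compile(rb"[0-9a-f]{40}")


def check_commit_tree_line(payload: bytes) -> Optional[str]:
    """Return a problem description, or None if the first header looks sane."""
    first_line = payload.split(b"\n", 1)[0]
    if not first_line.startswith(b"tree "):
        # Wording here is ours, not Git's.
        return "missing 'tree' header"
    if not _SHA1_HEX.fullmatch(first_line[len(b"tree "):]):
        return "badTreeSha1: invalid 'tree' line format - bad sha1"
    return None


if __name__ == "__main__":
    # Same input as the hash-object example: 'tree 123' plus a newline.
    print(check_commit_tree_line(b"tree 123\n"))
```

A real commit object needs more than a well-formed tree line (author and committer headers, for instance), so this only sketches the single check named in the error message.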