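I have wished for this feature several times before, and since it came up again here, I decided to google a bit and found Perl's Algorithm::Diff.
You can feed it a hashing function (they call it a "key generation function") which "should return a string that uniquely identifies a given element"; the algorithm then uses that string for its comparisons instead of the actual content you feed it.
Basically, all you need to do is add a sub that performs some regexp magic to filter the unwanted parts out of your strings, and pass a reference to that sub as an extra argument in the call to diff() (see my CHANGE 1 and CHANGE 2 comments in the code snippet below).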
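To make the idea of a key generation function concrete, here is a minimal sketch that uses only core Perl. The name strip_timestamp is hypothetical (it is not part of Algorithm::Diff), and the sketch assumes every timestamp has the form HH:MM at the start of the line.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper, not part of Algorithm::Diff: returns the comparison
# key for a line by dropping a leading HH:MM timestamp and the whitespace
# that follows it. Lines without such a prefix are returned unchanged.
sub strip_timestamp {
    my ($line) = @_;
    (my $key = $line) =~ s/^\d{2}:\d{2}\s+//;
    return $key;
}

my $key_a = strip_timestamp('12:15 one two three');
my $key_b = strip_timestamp('10:01 one two three');

# Both lines reduce to the same key, so a diff that compares keys
# instead of raw lines treats them as equal.
print "key a: $key_a\n";
print "key b: $key_b\n";
print $key_a eq $key_b ? "same key\n" : "different keys\n";
```

The function works on a copy of its argument, so the original lines stay intact and can still be printed in full in the diff output.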
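If you need normal (or unified) diff output, check the more elaborate diffnew.pl example that ships with the module and make the necessary changes in that file. For demonstration purposes, I will use the simple diff.pl that also ships with it, since it is short and I can post it here in full.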
mydiff.pl
#!/usr/bin/perl
# based on diff.pl that ships with Algorithm::Diff
# demonstrates the use of a key generation function
# the original diff.pl is:
# Copyright 1998 M-J. Dominus. ([email protected])
# This program is free software; you can redistribute it and/or modify it
# under the same terms as Perl itself.
use Algorithm::Diff qw(diff);
die("Usage: $0 file1 file2") unless @ARGV == 2;
my ($file1, $file2) = @ARGV;
-f $file1 or die("$file1: not a regular file");
-f $file2 or die("$file2: not a regular file");
-T $file1 or die("$file1: binary file");
-T $file2 or die("$file2: binary file");
open (F1, $file1) or die("Couldn't open $file1: $!");
open (F2, $file2) or die("Couldn't open $file2: $!");
chomp(@f1 = <F1>);
close F1;
chomp(@f2 = <F2>);
close F2;
# CHANGE 1
# $diffs = diff(\@f1, \@f2);
$diffs = diff(\@f1, \@f2, \&keyfunc);
exit 0 unless @$diffs;
foreach $chunk (@$diffs)
{
    foreach $line (@$chunk)
    {
        my ($sign, $lineno, $text) = @$line;
        printf "%4d$sign %s\n", $lineno+1, $text;
    }
}
exit 1;
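# CHANGE 2 {
# Key generation function: strips a leading HH:MM timestamp so that
# lines differing only in their timestamp compare as equal.
sub keyfunc
{
    my $key = shift;
    $key =~ s/^\d{2}:\d{2}\s+//;
    return $key;
}
# }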
one.txt
12:15 one two three
13:21 three four five
two.txt
10:01 one two three
14:38 seven six eight
Sample run
$ ./mydiff.pl one.txt two.txt
2- 13:21 three four five
2+ 14:38 seven six eight
Sample run 2
Here is one with normal diff output, based on diffnew.pl:
$ ./my_diffnew.pl one.txt two.txt
2c2
< 13:21 three four five
---
> 14:38 seven six eight
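As you can see, the first line of both files is ignored: the two lines differ only in their timestamp, and the key generation function strips the timestamp before the comparison.
Voilà, you just rolled your own content-aware diff!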
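To check the effect without touching any files, the following sketch runs diff() on the contents of one.txt and two.txt held in memory, once without and once with the key function. It assumes Algorithm::Diff is installed; count_changes is a hypothetical helper added only for this illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Algorithm::Diff qw(diff);

# Contents of one.txt and two.txt from above, held in memory.
my @old = ('12:15 one two three', '13:21 three four five');
my @new = ('10:01 one two three', '14:38 seven six eight');

# Same idea as keyfunc in mydiff.pl: compare lines without the
# leading HH:MM timestamp.
sub keyfunc {
    my ($line) = @_;
    (my $key = $line) =~ s/^\d{2}:\d{2}\s+//;
    return $key;
}

# Hypothetical helper: counts the individual +/- entries across
# all hunks returned by diff().
sub count_changes {
    my ($hunks) = @_;
    my $n = 0;
    $n += @$_ for @$hunks;
    return $n;
}

# diff() returns an array reference when called in scalar context.
my $plain = count_changes(scalar diff(\@old, \@new));
my $keyed = count_changes(scalar diff(\@old, \@new, \&keyfunc));

print "changes without key function: $plain\n";
print "changes with key function:    $keyed\n";
```

With these inputs it should report 4 changes without the key function (both lines removed and both added) and 2 with it (only the second line removed and added), since the first lines then share the same key.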