首先,您需要决定要使用的重复标准是什么。例如,我想检查同一组织名称、公司名称和邮政编码中的电话号码是否重复。我可以做这个:
第一个测试表
CREATE TABLE TEST
(
id int,
phonenumber int,
orgno int,
companyname varchar(100),
postalcode varchar(4)
)
然后测试数据
insert into TEST
values (1,4533660,1,'COMPANY 1',1234)
insert into TEST
values (2,4533660,1,'COMPANY 1',1234)
insert into TEST
values (3,954189547,1,'COMPANY 2',4444)
insert into TEST
values (4,954189547,1,'COMPANY 2',4444)
insert into TEST
values (5,3652591,1,'COMPANY 3',4444)
insert into TEST
values (6,4201580,1,'COMPANY 4',4444)
insert into TEST
values (7,3337788,1,'COMPANY 5',4444)
最后,使用选定的条件查询重复项。
Select phonenumber,
orgno,
companyname,
postalcode,
COUNT(*)
from test
group by phonenumber,
orgno,
companyname,
postalcode
HAVING COUNT(*) > 1
通过此查询,您可以轻松找到重复项,最重要的是检查重复项的条件。
Edit
如果要删除除 1 之外的所有重复行,可以这样做:
DELETE A
FROM TEST A
INNER JOIN
(
SELECT ROW_NUMBER()OVER(PARTITION BY a.phonenumber,
a.orgno,
a.companyname,
a.postalcode ORDER BY a.id)AS POS,
a.phonenumber,
a.orgno,
a.companyname,
a.postalcode,
a.id
FROM TEST A
JOIN
(
SELECT
phonenumber,
orgno,
companyname,
postalcode,
COUNT(*) AS CONTADOR
FROM test
GROUP BY phonenumber,
orgno,
companyname,
postalcode
HAVING COUNT(*) > 1
) TB
ON A.companyname = TB.companyname
AND A.orgno = TB.orgno
AND A.phonenumber = TB.phonenumber
AND A.postalcode = TB.postalcode
) TB_2
ON
a.id = tb_2.id
and tb_2.pos > 1
使用提供的数据结构编辑脚本。
DELETE FROM TEST B
JOIN
(
SELECT ROW_NUMBER()OVER(PARTITION BY
a.org_nr,
a.phone1,
a.phone2,
a.name,
a.marketing_name
ORDER BY
a.org_nr,
a.phone1,
a.phone2,
a.name,
a.marketing_name
)AS POS,
a.address_id,
a.org_nr,
a.phone1,
a.phone2,
a.name,
a.marketing_name
FROM TEST A
JOIN
(
SELECT
org_nr,
phone1,
phone2,
name,
marketing_name
FROM TEST
GROUP BY
org_nr,
phone1,
phone2,
name,
marketing_name
HAVING COUNT(*) > 1) TB
ON a.org_nr = tb.org_nr
and a.phone1 = tb.phone1
and a.phone2 = tb.phone2
and a.name = tb.name
and a.marketing_name = tb.marketing_name
) TB_2
ON
b.address_id = tb_2.address_id
and tb_2.pos > 1