I have a DataFrame:
Prop_ID Unit_ID Prop_Usage Unit_Usage
1 1 RESIDENTIAL RESIDENTIAL
1 2 RESIDENTIAL COMMERCIAL
1 3 RESIDENTIAL INDUSTRIAL
1 4 RESIDENTIAL RESIDENTIAL
2 1 COMMERCIAL RESIDENTIAL
2 2 COMMERCIAL COMMERCIAL
2 3 COMMERCIAL COMMERCIAL
3 1 INDUSTRIAL INDUSTRIAL
3 2 INDUSTRIAL COMMERCIAL
4 1 RESIDENTIAL - COMMERCIAL RESIDENTIAL
4 2 RESIDENTIAL - COMMERCIAL COMMERCIAL
4 3 RESIDENTIAL - COMMERCIAL INDUSTRIAL
5 1 COMMERCIAL / RESIDENTIAL RESIDENTIAL
5 2 COMMERCIAL / RESIDENTIAL COMMERCIAL
5 3 COMMERCIAL / RESIDENTIAL INDUSTRIAL
5 4 COMMERCIAL / RESIDENTIAL COMMERCIAL
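A property can have more than one unit, so a unit is a sub-category of a property. I want to keep only the rows where Prop_Usage does not match Unit_Usage. When the Prop_Usage column holds a combined category such as RESIDENTIAL - COMMERCIAL, Unit_Usage may be either RESIDENTIAL or COMMERCIAL and still counts as a match. The same applies to COMMERCIAL / RESIDENTIAL.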
Expected output:
Prop_ID Unit_ID Prop_Usage Unit_Usage
1 2 RESIDENTIAL COMMERCIAL
1 3 RESIDENTIAL INDUSTRIAL
2 1 COMMERCIAL RESIDENTIAL
3 2 INDUSTRIAL COMMERCIAL
4 3 RESIDENTIAL - COMMERCIAL INDUSTRIAL
5 3 COMMERCIAL / RESIDENTIAL INDUSTRIAL
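To make the example reproducible, the sample frame above can be built roughly as follows. This is only a sketch: the values are copied from the table in the question, and the integer dtype of the two ID columns is an assumption.

```python
import pandas as pd

# Sample rows copied from the table in the question:
# (Prop_ID, Unit_ID, Prop_Usage, Unit_Usage)
rows = [
    (1, 1, "RESIDENTIAL", "RESIDENTIAL"),
    (1, 2, "RESIDENTIAL", "COMMERCIAL"),
    (1, 3, "RESIDENTIAL", "INDUSTRIAL"),
    (1, 4, "RESIDENTIAL", "RESIDENTIAL"),
    (2, 1, "COMMERCIAL", "RESIDENTIAL"),
    (2, 2, "COMMERCIAL", "COMMERCIAL"),
    (2, 3, "COMMERCIAL", "COMMERCIAL"),
    (3, 1, "INDUSTRIAL", "INDUSTRIAL"),
    (3, 2, "INDUSTRIAL", "COMMERCIAL"),
    (4, 1, "RESIDENTIAL - COMMERCIAL", "RESIDENTIAL"),
    (4, 2, "RESIDENTIAL - COMMERCIAL", "COMMERCIAL"),
    (4, 3, "RESIDENTIAL - COMMERCIAL", "INDUSTRIAL"),
    (5, 1, "COMMERCIAL / RESIDENTIAL", "RESIDENTIAL"),
    (5, 2, "COMMERCIAL / RESIDENTIAL", "COMMERCIAL"),
    (5, 3, "COMMERCIAL / RESIDENTIAL", "INDUSTRIAL"),
    (5, 4, "COMMERCIAL / RESIDENTIAL", "COMMERCIAL"),
]

df = pd.DataFrame(rows, columns=["Prop_ID", "Unit_ID", "Prop_Usage", "Unit_Usage"])
print(df)
```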
Use the in operator inside DataFrame.apply:
df = df[~df.apply(lambda x: x['Unit_Usage'] in x['Prop_Usage'], axis=1)]
Or use zip in a list comprehension:
df = df[[a not in b for a, b in zip(df['Unit_Usage'], df['Prop_Usage'])]]
print(df)
Prop_ID Unit_ID Prop_Usage Unit_Usage
1 1 2 RESIDENTIAL COMMERCIAL
2 1 3 RESIDENTIAL INDUSTRIAL
4 2 1 COMMERCIAL RESIDENTIAL
8 3 2 INDUSTRIAL COMMERCIAL
11 4 3 RESIDENTIAL - COMMERCIAL INDUSTRIAL
14 5 3 COMMERCIAL / RESIDENTIAL INDUSTRIAL
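One caveat with both approaches: in performs a substring test on strings, so a unit usage would also "match" a property usage that merely contains it as part of a longer word. If that matters for your data, a stricter variant is to split Prop_Usage into its individual categories and test for exact membership. The sketch below assumes the categories are separated by - or / with whitespace on both sides, as in the sample data. The helper name usage_mismatch and the label AGRI-INDUSTRIAL are hypothetical and used only for illustration.

```python
import re

import pandas as pd


def usage_mismatch(df: pd.DataFrame) -> pd.Series:
    """Return a boolean mask: True where Unit_Usage is not one of the
    categories listed in Prop_Usage (exact match, not substring)."""
    flags = []
    for unit, prop in zip(df["Unit_Usage"], df["Prop_Usage"]):
        # Assumed separators: "-" or "/" surrounded by whitespace.
        categories = re.split(r"\s+[-/]\s+", prop.strip())
        flags.append(unit not in categories)
    return pd.Series(flags, index=df.index)


# Small demo; "AGRI-INDUSTRIAL" is a made-up label, not from the question.
demo = pd.DataFrame({
    "Prop_Usage": ["RESIDENTIAL - COMMERCIAL", "COMMERCIAL / RESIDENTIAL", "AGRI-INDUSTRIAL"],
    "Unit_Usage": ["INDUSTRIAL", "COMMERCIAL", "INDUSTRIAL"],
})
mask = usage_mismatch(demo)
print(demo[mask])
```

With the plain substring test, the third demo row would be treated as a match because "INDUSTRIAL" occurs inside "AGRI-INDUSTRIAL". The token-based check reports it as a mismatch. On the sample data in the question, both checks select the same rows.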