我想对具有日期时间索引的数据框执行联接/合并/追加操作。
假设我有df1
我想添加df2
到它。df2
可以有更少或更多的列以及重叠的索引。对于索引匹配的所有行,如果df2
具有相同的列df1
,我想要的值df1
被那些来自df2
.
我怎样才能获得想要的结果?
怎么样:df2.combine_first(df1) https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.combine_first.html?
In [33]: df2
Out[33]:
A B C D
2000-01-03 0.638998 1.277361 0.193649 0.345063
2000-01-04 -0.816756 -1.711666 -1.155077 -0.678726
2000-01-05 0.435507 -0.025162 -1.112890 0.324111
2000-01-06 -0.210756 -1.027164 0.036664 0.884715
2000-01-07 -0.821631 -0.700394 -0.706505 1.193341
2000-01-10 1.015447 -0.909930 0.027548 0.258471
2000-01-11 -0.497239 -0.979071 -0.461560 0.447598
In [34]: df1
Out[34]:
A B C
2000-01-03 2.288863 0.188175 -0.040928
2000-01-04 0.159107 -0.666861 -0.551628
2000-01-05 -0.356838 -0.231036 -1.211446
2000-01-06 -0.866475 1.113018 -0.001483
2000-01-07 0.303269 0.021034 0.471715
2000-01-10 1.149815 0.686696 -1.230991
2000-01-11 -1.296118 -0.172950 -0.603887
2000-01-12 -1.034574 -0.523238 0.626968
2000-01-13 -0.193280 1.857499 -0.046383
2000-01-14 -1.043492 -0.820525 0.868685
In [35]: df2.comb
df2.combine df2.combineAdd df2.combine_first df2.combineMult
In [35]: df2.combine_first(df1)
Out[35]:
A B C D
2000-01-03 0.638998 1.277361 0.193649 0.345063
2000-01-04 -0.816756 -1.711666 -1.155077 -0.678726
2000-01-05 0.435507 -0.025162 -1.112890 0.324111
2000-01-06 -0.210756 -1.027164 0.036664 0.884715
2000-01-07 -0.821631 -0.700394 -0.706505 1.193341
2000-01-10 1.015447 -0.909930 0.027548 0.258471
2000-01-11 -0.497239 -0.979071 -0.461560 0.447598
2000-01-12 -1.034574 -0.523238 0.626968 NaN
2000-01-13 -0.193280 1.857499 -0.046383 NaN
2000-01-14 -1.043492 -0.820525 0.868685 NaN
请注意,它的值来自df1
对于不重叠的索引df2
。如果这不能完全满足您的要求,我愿意改进此功能/为其添加选项。
本文内容由网友自发贡献,版权归原作者所有,本站不承担相应法律责任。如您发现有涉嫌抄袭侵权的内容,请联系:hwhale#tublm.com(使用前将#替换为@)