I have this dataframe as an example:
import pandas as pd

# Create the example dataframe: one row per (ISO, Product, Billed Week, Created Week)
df = pd.DataFrame(
    [['DE', 'Table', 201705, 201705, 1000], ['DE', 'Table', 201705, 201704, 1000],
     ['DE', 'Table', 201705, 201702, 1000], ['DE', 'Table', 201705, 201701, 1000],
     ['AT', 'Table', 201708, 201708, 1000], ['AT', 'Table', 201708, 201706, 1000],
     ['AT', 'Table', 201708, 201705, 1000], ['AT', 'Table', 201708, 201704, 1000]],
    columns=['ISO', 'Product', 'Billed Week', 'Created Week', 'Billings'])
print(df)
  ISO Product  Billed Week  Created Week  Billings
0  DE   Table       201705        201705      1000
1  DE   Table       201705        201704      1000
2  DE   Table       201705        201702      1000
3  DE   Table       201705        201701      1000
4  AT   Table       201708        201708      1000
5  AT   Table       201708        201706      1000
6  AT   Table       201708        201705      1000
7  AT   Table       201708        201704      1000
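What I need to do is fill in some missing data with 0 Billings for each groupby ['ISO', 'Product'] wherever there is a break in the sequence, i.e. no billings were created in a given week, so that row is missing. The range needs to be based on the maximum Billed Week and the minimum Created Week. In other words, the combinations should be complete, with no gaps in the sequence of weeks.
So for the example above, the missing records that I need to programmatically append to the database look like this: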
  ISO Product  Billed Week  Created Week  Billings
0  DE   Table       201705        201703         0
1  AT   Table       201708        201707         0
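
To make the requirement concrete, here is a minimal sketch of one way the missing rows could be derived. It is not a definitive solution and it rests on two assumptions that the example does not confirm: all weeks in a group fall within the same year (so consecutive integers such as 201701..201705 are consecutive weeks), and the Billed Week of a filled-in row is the maximum Billed Week of its group. The names `rows`, `present` and `missing` are just illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    [['DE', 'Table', 201705, 201705, 1000], ['DE', 'Table', 201705, 201704, 1000],
     ['DE', 'Table', 201705, 201702, 1000], ['DE', 'Table', 201705, 201701, 1000],
     ['AT', 'Table', 201708, 201708, 1000], ['AT', 'Table', 201708, 201706, 1000],
     ['AT', 'Table', 201708, 201705, 1000], ['AT', 'Table', 201708, 201704, 1000]],
    columns=['ISO', 'Product', 'Billed Week', 'Created Week', 'Billings'])

rows = []
# sort=False keeps the groups in their order of first appearance (DE, then AT)
for (iso, product), group in df.groupby(['ISO', 'Product'], sort=False):
    first = int(group['Created Week'].min())   # start of the expected range
    last = int(group['Billed Week'].max())     # end of the expected range
    present = set(group['Created Week'].tolist())
    # Assumption: weeks stay within one year, so a plain integer range is gap-free
    for week in range(first, last + 1):
        if week not in present:
            rows.append({'ISO': iso, 'Product': product,
                         'Billed Week': last, 'Created Week': week,
                         'Billings': 0})

# The rows to append, with the same column order as the original dataframe
missing = pd.DataFrame(rows, columns=df.columns)
print(missing)
```

If the weeks can cross a year boundary (for example 201652 followed by 201701), the integer range would generate week numbers that do not exist, so the weeks would first have to be converted to real calendar periods. The complete dataframe could then be built with `pd.concat([df, missing], ignore_index=True)`.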