How to convert a nested dictionary to a pandas DataFrame

2024-02-08

I am trying to convert a dictionary that contains other dictionaries into a DataFrame, for example:

data = {
    'id': 3241234,
    'data': {
        'name': 'carol',
        'lastname': 'netflik',
        'office': {
            'num': 3543,
            'department': 'trigy'
        }
    }
}

I tried using:

pd.DataFrame.from_dict(data)

But the resulting DataFrame looks like this:

               id                                  data
lastname  3241234                               netflik
name      3241234                                 carol
office    3241234  {'num': 3543, 'department': 'trigy'}

Any ideas?


Load the JSON/dict:

  • Use pandas.json_normalize https://pandas.pydata.org/docs/reference/api/pandas.json_normalize.html to expand the nested dict into flat columns.
import pandas as pd

data = {'id': 3241234, 'data': {'name': 'carol', 'lastname': 'netflik', 'office': {'num': 3543, 'department': 'trigy'}}}

df = pd.json_normalize(data)

# display(df)
        id data.name data.lastname  data.office.num data.office.department
0  3241234     carol       netflik             3543                  trigy
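
The dotted column names above come from the default separator. As a minimal sketch (the variable names `flat` and `shallow` are only illustrative), `pd.json_normalize` also accepts `sep` to change the separator and `max_level` to limit how deep the flattening goes:

```python
import pandas as pd

data = {'id': 3241234, 'data': {'name': 'carol', 'lastname': 'netflik', 'office': {'num': 3543, 'department': 'trigy'}}}

# use an underscore instead of the default '.' between nesting levels
flat = pd.json_normalize(data, sep='_')

# flatten only one level; the innermost 'office' dict is left as a dict
shallow = pd.json_normalize(data, max_level=1)

print(flat.columns.tolist())
print(shallow.columns.tolist())
```

Underscore names can be handy if you want attribute-style access such as `flat.data_name`, which does not work with dots in the column name.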

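If the DataFrame has a column of dicts

  • See also this answer https://stackoverflow.com/questions/63311361, to this SO question: Split / Explode a column of dictionaries into separate columns with pandas https://stackoverflow.com/questions/38231591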
# dataframe with column of dicts
df = pd.DataFrame({'col2': [1, 2, 3], 'col': [data, data, data]})

# display(df)
   col2                                                                                                                col
0     1  {'id': 3241234, 'data': {'name': 'carol', 'lastname': 'netflik', 'office': {'num': 3543, 'department': 'trigy'}}}
1     2  {'id': 3241234, 'data': {'name': 'carol', 'lastname': 'netflik', 'office': {'num': 3543, 'department': 'trigy'}}}
2     3  {'id': 3241234, 'data': {'name': 'carol', 'lastname': 'netflik', 'office': {'num': 3543, 'department': 'trigy'}}}

# normalize the column of dicts
normalized = pd.json_normalize(df['col'])

# join the normalized column to df
df = df.join(normalized).drop(columns=['col'])

# display(df)
   col2       id data.name data.lastname  data.office.num data.office.department
0     1  3241234     carol       netflik             3543                  trigy
1     2  3241234     carol       netflik             3543                  trigy
2     3  3241234     carol       netflik             3543                  trigy
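
One thing to watch for: `.join` aligns rows on the index, and a frame built by `pd.json_normalize` from a plain list gets a fresh 0..n-1 index. The example above works because `df` also has the default index. The following is a rough sketch of one way to handle a DataFrame with a different index; the index values `[10, 20, 30]` and the name `result` are made up for illustration:

```python
import pandas as pd

data = {'id': 3241234, 'data': {'name': 'carol', 'lastname': 'netflik', 'office': {'num': 3543, 'department': 'trigy'}}}

# hypothetical frame whose index is not 0, 1, 2
df = pd.DataFrame({'col2': [1, 2, 3], 'col': [data, data, data]}, index=[10, 20, 30])

# normalize a plain list of dicts; the result has a fresh 0..n-1 index
normalized = pd.json_normalize(df['col'].tolist())

# copy the original index over so the join lines up row by row
normalized.index = df.index

result = df.drop(columns=['col']).join(normalized)
```

Without the index assignment, the join would find no matching labels and the normalized columns would come back as NaN.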

If the DataFrame has a column of lists with dicts

  • The dicts need to be extracted from the lists with .explode
data = [{'id': 3241234, 'data': {'name': 'carol', 'lastname': 'netflik', 'office': {'num': 3543, 'department': 'trigy'}}}]

df = pd.DataFrame({'col2': [1, 2, 3], 'col': [data, data, data]})

# display(df)
   col2                                                                                                                  col
0     1  [{'id': 3241234, 'data': {'name': 'carol', 'lastname': 'netflik', 'office': {'num': 3543, 'department': 'trigy'}}}]
1     2  [{'id': 3241234, 'data': {'name': 'carol', 'lastname': 'netflik', 'office': {'num': 3543, 'department': 'trigy'}}}]
2     3  [{'id': 3241234, 'data': {'name': 'carol', 'lastname': 'netflik', 'office': {'num': 3543, 'department': 'trigy'}}}]

# explode the lists
df = df.explode('col', ignore_index=True)

# remove and normalize the column of dicts
normalized = pd.json_normalize(df.pop('col'))

# join the normalized column to df
df = df.join(normalized)
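
The lists above each hold a single dict. As a small sketch with made-up records (the names `rows` and `result` are only illustrative), the same explode-then-normalize steps also handle lists of different lengths, with the other columns repeated for each exploded dict:

```python
import pandas as pd

# hypothetical data: the first row holds two dicts, the second row holds one
rows = [
    [{'id': 1, 'data': {'name': 'carol'}}, {'id': 2, 'data': {'name': 'dave'}}],
    [{'id': 3, 'data': {'name': 'erin'}}],
]
df = pd.DataFrame({'col2': [1, 2], 'col': rows})

# one row per dict; ignore_index=True resets the index to 0..n-1
df = df.explode('col', ignore_index=True)

# remove the column of dicts and flatten it
normalized = pd.json_normalize(df.pop('col').tolist())

# both frames share the default index, so the join aligns row by row
result = df.join(normalized)
print(result)
```

Resetting the index in `.explode` matters here, because otherwise the exploded rows would keep duplicate index labels and the join would no longer line up one to one.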