How to save a pandas DataFrame table as a png

2023-12-19

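I have constructed a pandas DataFrame of results. This DataFrame acts as a table. It has MultiIndexed columns, and each row represents a name, i.e. index=['name1','name2',...] when creating the DataFrame. I would like to display this table and save it as a png (or any graphic format). At the moment, the closest I can get is converting it to html, but I would like a png. Similar questions appear to have been asked before, such as "How to save the Pandas dataframe/series data as a figure?" https://stackoverflow.com/questions/19726663/how-to-save-the-pandas-dataframe-series-data-as-a-figure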

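However, the accepted solution converts the DataFrame into a line plot (not a table), and the other solution relies on PySide, which I would like to stay away from because I cannot install it on Linux. I would like this code to be easily portable. I really expected creating a png table to be easy with Python. All help is appreciated.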

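Pandas allows you to plot tables using matplotlib (details here: http://pandas.pydata.org/pandas-docs/stable/visualization.html#plotting-tables). Usually this plots the table directly onto a plot (with axes and everything), which is not what you want. However, these can be removed first: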

import matplotlib.pyplot as plt
import pandas as pd
from pandas.tools.plotting import table # EDIT: see deprecation warnings below

ax = plt.subplot(111, frame_on=False) # no visible frame
ax.xaxis.set_visible(False)  # hide the x axis
ax.yaxis.set_visible(False)  # hide the y axis

table(ax, df)  # where df is your data frame

plt.savefig('mytable.png')

The output might not be the prettiest, but you can find additional arguments for the table() function here: http://matplotlib.org/api/axes_api.html#matplotlib.axes.Axes.table. Also thanks to this post http://matplotlib.1069221.n5.nabble.com/Draw-only-table-without-XY-Axis-td19546.html for info on how to remove axes in matplotlib.
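For reference, the steps above can be put together into one script that runs on its own. This is only a minimal sketch: the sample data, the figure size, the dpi and the use of the non-interactive Agg backend are my own assumptions, and it uses the newer import path pandas.plotting rather than the older one.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, so no display is needed
import matplotlib.pyplot as plt
import pandas as pd
from pandas.plotting import table  # newer import path

# Hypothetical sample data standing in for your own results frame
df = pd.DataFrame(
    {"score": [0.91, 0.85], "runs": [10, 12]},
    index=["name1", "name2"],
)

fig, ax = plt.subplots(figsize=(4, 1.5))  # illustrative size, tune to your table
ax.axis("off")  # hide the frame and both axes in one call

table(ax, df, loc="center")  # extra keyword arguments go to matplotlib's table

# bbox_inches="tight" trims most of the empty canvas around the table
fig.savefig("mytable.png", dpi=200, bbox_inches="tight")
plt.close(fig)
```

Calling ax.axis("off") is a shorthand for the three separate calls used above; either approach should give the same picture.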

EDIT:

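Here is an (admittedly quite hacky) way to simulate multi-indexes when plotting using the method above. If you have a multi-index data frame called df that looks like: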

first  second
bar    one       1.991802
       two       0.403415
baz    one      -1.024986
       two      -0.522366
foo    one       0.350297
       two      -0.444106
qux    one      -0.472536
       two       0.999393
dtype: float64

First reset the indexes so they become normal columns:

df = df.reset_index() 
df
    first second       0
0   bar    one  1.991802
1   bar    two  0.403415
2   baz    one -1.024986
3   baz    two -0.522366
4   foo    one  0.350297
5   foo    two -0.444106
6   qux    one -0.472536
7   qux    two  0.999393

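Remove all duplicates from the higher-order multi-index columns by setting them to an empty string (in my example I only have duplicate indexes in "first"):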

df.ix[df.duplicated('first') , 'first'] = '' # see deprecation warnings below
df
  first second         0
0   bar    one  1.991802
1          two  0.403415
2   baz    one -1.024986
3          two -0.522366
4   foo    one  0.350297
5          two -0.444106
6   qux    one -0.472536
7          two  0.999393

Change the column names over your "indexes" to the empty string:

new_cols = df.columns.values
new_cols[:2] = '',''  # since my index columns are the two left-most on the table
df.columns = new_cols

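Now call the table function, but set all the row labels in the table to the empty string (this makes sure the actual indexes of your plot are not displayed):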

table(ax, df, rowLabels=['']*df.shape[0], loc='center')

Et voila:

Your not-so-pretty but totally functional multi-indexed table.

EDIT: DEPRECATION WARNINGS

As pointed out in the comments, the import statement for table:

from pandas.tools.plotting import table

is now deprecated in newer versions of pandas in favour of:

from pandas.plotting import table
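Since the question asks for portable code, one possible way to support both old and new pandas installations is to try the newer import first and fall back to the older one. This is only a sketch of that idea; whether you need the fallback at all depends on which pandas versions you have to support.

```python
try:
    # Newer pandas versions expose table here
    from pandas.plotting import table
except ImportError:
    # Fallback for older pandas versions that only have the deprecated path
    from pandas.tools.plotting import table
```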

EDIT: DEPRECATION WARNINGS 2

The ix indexer has now been fully deprecated http://pandas-docs.github.io/pandas-docs-travis/user_guide/indexing.html#ix-indexer-is-deprecated so we should use the loc indexer instead. Replace:

df.ix[df.duplicated('first') , 'first'] = ''

with

df.loc[df.duplicated('first') , 'first'] = ''

