Pandas/Google BigQuery: Schema mismatch makes the upload fail

2024-05-18

The schema of my Google BigQuery table looks like this:

price_datetime : DATETIME,
symbol         : STRING,
bid_open       : FLOAT,
bid_high       : FLOAT,
bid_low        : FLOAT,
bid_close      : FLOAT,
ask_open       : FLOAT,
ask_high       : FLOAT,
ask_low        : FLOAT,
ask_close      : FLOAT

When I run pandas.read_gbq on that table, I get a DataFrame with the following column dtypes:

price_datetime     object
symbol             object
bid_open          float64
bid_high          float64
bid_low           float64
bid_close         float64
ask_open          float64
ask_high          float64
ask_low           float64
ask_close         float64
dtype: object

Now I want to use to_gbq, so I convert my local DataFrame (which I just built) from these dtypes:

price_datetime    datetime64[ns]
symbol                    object
bid_open                 float64
bid_high                 float64
bid_low                  float64
bid_close                float64
ask_open                 float64
ask_high                 float64
ask_low                  float64
ask_close                float64
dtype: object

to these dtypes:

price_datetime     object
symbol             object
bid_open          float64
bid_high          float64
bid_low           float64
bid_close         float64
ask_open          float64
ask_high          float64
ask_low           float64
ask_close         float64
dtype: object

by doing:

df['price_datetime'] = df['price_datetime'].astype(object)

Now I (think I) am ready to use to_gbq, so I do:

import pandas
pandas.io.gbq.to_gbq(df, <table_name>, <project_name>, if_exists='append')

But I get this error:

---------------------------------------------------------------------------
InvalidSchema                             Traceback (most recent call last)
<ipython-input-15-d5a3f86ad382> in <module>()
      1 a = time.time()
----> 2 pandas.io.gbq.to_gbq(df, <table_name>, <project_name>, if_exists='append')
      3 b = time.time()
      4 
      5 print(b-a)

C:\Users\me\AppData\Local\Continuum\Anaconda3\lib\site-packages\pandas\io\gbq.py in to_gbq(dataframe, destination_table, project_id, chunksize, verbose, reauth, if_exists, private_key)
    825         elif if_exists == 'append':
    826             if not connector.verify_schema(dataset_id, table_id, table_schema):
--> 827                 raise InvalidSchema("Please verify that the structure and "
    828                                     "data types in the DataFrame match the "
    829                                     "schema of the destination table.")

InvalidSchema: Please verify that the structure and data types in the DataFrame match the schema of the destination table.

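I had to do two things to solve my problem. First, I deleted my table and re-uploaded it with the column as type TIMESTAMP instead of DATETIME. This ensures that the schema matches when a pandas.DataFrame with a column of dtype datetime64[ns] is uploaded with to_gbq, which converts datetime64[ns] to the TIMESTAMP type rather than the DATETIME type (for now; see https://github.com/pydata/pandas-gbq/issues/69).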
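To see which column trips the schema check, it can help to compare the BigQuery types that would be inferred from the DataFrame against the table schema. The snippet below is only a rough sketch: the KIND_TO_BQ mapping approximates what the old pandas.io.gbq code did with NumPy dtype kinds, and the helper names inferred_bq_schema and schema_diff are hypothetical, not part of pandas.

```python
import pandas as pd

# Assumed approximation of how older pandas mapped NumPy dtype kinds
# to BigQuery column types. Note that there is no entry for DATETIME.
KIND_TO_BQ = {
    "i": "INTEGER",
    "b": "BOOLEAN",
    "f": "FLOAT",
    "O": "STRING",
    "S": "STRING",
    "U": "STRING",
    "M": "TIMESTAMP",
}


def inferred_bq_schema(df):
    """Return {column name: BigQuery type inferred from the dtype kind}."""
    return {
        name: KIND_TO_BQ.get(dtype.kind, "STRING")
        for name, dtype in df.dtypes.items()
    }


def schema_diff(df, table_schema):
    """Return {column: (inferred type, table type)} for mismatched columns."""
    inferred = inferred_bq_schema(df)
    return {
        col: (inferred.get(col), wanted)
        for col, wanted in table_schema.items()
        if inferred.get(col) != wanted
    }


# Hypothetical one-row frame with a subset of the columns from the question.
sample = pd.DataFrame({
    "price_datetime": pd.to_datetime(["2017-01-02 00:00:00"]),
    "symbol": ["EURUSD"],
    "bid_open": [1.0520],
})
table_schema = {
    "price_datetime": "DATETIME",
    "symbol": "STRING",
    "bid_open": "FLOAT",
}
print(schema_diff(sample, table_schema))
```

Under this mapping, casting the column to object does not help either: it would then be inferred as STRING, which still differs from DATETIME.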

The second thing I did was upgrade from pandas 0.19 to pandas 0.20. Together, these two changes solved my schema mismatch problem.
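Once the table column is TIMESTAMP, the DataFrame column should stay a datetime64 column rather than being cast to object. If the values start out as strings, a minimal sketch of the conversion could look like this (the sample rows are made up for illustration):

```python
import pandas as pd

# Hypothetical sample rows standing in for the real price data.
df = pd.DataFrame({
    "price_datetime": ["2017-01-02 00:00:00", "2017-01-02 00:01:00"],
    "symbol": ["EURUSD", "EURUSD"],
    "bid_open": [1.0520, 1.0523],
})

# Parse the strings into a datetime64 column instead of casting to object,
# so that the column is uploaded as TIMESTAMP.
df["price_datetime"] = pd.to_datetime(df["price_datetime"])

print(df.dtypes)
```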
