这是 Bigquery 的一个令人讨厌的、未记录的副作用时间旅行。时间旅行使其使用不安全ALTER TABLE
bigquery 中的语句。
问题演示:
create table apu.time_travel_problem
( id int64
, name string
);
select column_name, data_type
FROM apu.INFORMATION_SCHEMA.COLUMNS
WHERE table_name = 'time_travel_problem';
column_name |
data_type |
id |
INT64 |
name |
STRING |
到目前为止这一切都是正常的,但经过一段时间后ALTER TABLE
一切都变得奇怪:
alter table apu.time_travel_problem drop column name;
select column_name, data_type
FROM apu.INFORMATION_SCHEMA.COLUMNS
WHERE table_name = 'time_travel_problem';
column_name |
data_type |
id |
INT64 |
name |
STRING |
我们刚刚删除的专栏仍然存在!
现在试试这个:
alter table apu.time_travel_problem add column name string;
Column `name` was recently deleted in the table `time_travel_problem`. Deleted column name is reserved for up to the time travel duration, use a different column name instead.
解决方案:
不使用ALTER TABLE
在大查询中。反而DROP
and reCREATE
使用临时表。
这是一个jinja我使用的模板:
/* {{TABLE}} */
ALTER TABLE {{DATASET}}.{{TABLE}} RENAME TO {{TABLE}}_migration;
ALTER TABLE {{DATASET}}.{{TABLE}}_migration
SET OPTIONS (expiration_timestamp = timestamp_add(CURRENT_TIMESTAMP(), INTERVAL 8 HOUR))l
CREATE TABLE {{DATASET}}.{{TABLE}}
(
{{COLUMN_DDL}}
);
INSERT INTO {{DATASET}}.{{TABLE}}
(
{{COLUMN_LIST}}
)
SELECT
{{COLUMN_LIST}}
FROM {{DATASET}}.{{TABLE}}_migration;