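I am migrating hundreds of millions of tweets in the format {'id_str': , 'created_at': , 'text': } from text files into MongoDB using pymongo. I create one collection per user to store his or her tweets. The insert method I use is insert_many(), and I frequently run into a BulkWriteError: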
Traceback (most recent call last):
  File "pipeline.py", line 105, in <module>
    timeline_db, meta_db, negative_db, log_col, dir_path)
  File "/media/haitao/Storage/twitter_pipeline/migrate_old.py", line 134, in migrate_dir
    timeline_db[user_id].insert_many(utility.temporal_sort(statuses))
  File "/home/haitao/anaconda3/envs/py27/lib/python2.7/site-packages/pymongo/collection.py", line 711, in insert_many
    blk.execute(self.write_concern.document)
  File "/home/haitao/anaconda3/envs/py27/lib/python2.7/site-packages/pymongo/bulk.py", line 493, in execute
    return self.execute_command(sock_info, generator, write_concern)
  File "/home/haitao/anaconda3/envs/py27/lib/python2.7/site-packages/pymongo/bulk.py", line 331, in execute_command
    raise BulkWriteError(full_result)
pymongo.errors.BulkWriteError: batch op errors occurred
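The last line of the traceback only says that batch errors occurred, not which write failed or why. A BulkWriteError carries a `details` dict whose "writeErrors" list holds the server's error code and message for each failed operation, so one thing that could be checked is what it actually contains. Below is a minimal sketch; `summarize_write_errors` is a hypothetical helper name, not part of pymongo, and it is written as a pure function so it can be tried without a running server.

```python
from collections import Counter

# MongoDB server error code reported for "E11000 duplicate key error".
DUPLICATE_KEY_CODE = 11000


def summarize_write_errors(details):
    """Count write errors by server error code and list (index, code, errmsg).

    `details` is expected to have the shape of BulkWriteError.details:
    a dict whose "writeErrors" entry is a list of dicts with the keys
    "index", "code" and "errmsg".
    """
    errors = details.get("writeErrors", [])
    counts = Counter(err.get("code") for err in errors)
    messages = [
        (err.get("index"), err.get("code"), err.get("errmsg")) for err in errors
    ]
    return counts, messages


# Intended use around the failing call (needs pymongo and a running server):
#
#     from pymongo.errors import BulkWriteError
#     try:
#         timeline_db[user_id].insert_many(utility.temporal_sort(statuses))
#     except BulkWriteError as exc:
#         counts, messages = summarize_write_errors(exc.details)
#         print(counts.get(DUPLICATE_KEY_CODE, 0), "duplicate-key errors")
#         print(messages[:5])
```

With the default `ordered=True`, insert_many stops at the first failed document, so the list usually has a single entry; passing `ordered=False` lets the server attempt every document and report all failures at once.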
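This error seems to occur when there are duplicate keys, but that should not be the case here. Is there anything else I can check to resolve this problem?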
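One way to test the assumption that a batch has no duplicates, as a rough sketch: scan the list before inserting it. Two cases seem worth covering. The same `id_str` may appear more than once (this only matters if `id_str` is used as `_id` or has a unique index, which is not stated here and is an assumption). The same dict object may also appear twice in the list: insert_many adds an `_id` to each dict in place when it has none, so a repeated object, or a batch retried after a partial failure, would carry an `_id` that is already stored. `find_batch_duplicates` is a hypothetical helper name.

```python
from collections import Counter


def find_batch_duplicates(statuses, key="id_str"):
    """Return (values of `key` seen more than once, indexes of repeated dict objects)."""
    key_counts = Counter(status.get(key) for status in statuses)
    repeated_keys = [value for value, n in key_counts.items() if n > 1]

    seen_objects = set()
    repeated_object_indexes = []
    for index, status in enumerate(statuses):
        # id() identifies the object itself, not its contents.
        if id(status) in seen_objects:
            repeated_object_indexes.append(index)
        seen_objects.add(id(status))
    return repeated_keys, repeated_object_indexes


# Intended use before the insert:
#
#     repeated_keys, repeated_objects = find_batch_duplicates(statuses)
#     if repeated_keys or repeated_objects:
#         print(user_id, repeated_keys[:5], repeated_objects[:5])
```

If both lists come back empty for a failing batch, the duplicate is more likely between the batch and documents already in the collection, for example from an earlier run over the same file.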
Thanks in advance!