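I wrote a script that reads an Excel document and checks whether the first cell of each row contains "UPDATED". If it does, the script writes the whole row to another Excel document, into a sheet with the same tab name.
My Excel document has 23 sheets with 1000 rows each, and the script currently takes more than 15 minutes to finish. Is there a way to speed this up?
I am considering multithreading or multiprocessing, but I don't know which one is better.
UPDATE: the 15-minute run time was caused by read-only mode. When I removed it, the program ran in only 2 seconds.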
import openpyxl
import os
from datetime import datetime

titles = ["Column1", "Column2", "Column3", "Column4", "Column5", "Column6", "Column7", "Column8", "Column9", "Column10", "Column11", "Column12", "Column13", "Column14", "Column15", "Column16"]


def main():
    # os.path.join avoids "\n" in "\newFile.xlsx" being parsed as a newline escape
    oldFilePath = os.path.join(os.getcwd(), "oldFile.xlsx")
    newFilePath = os.path.join(os.getcwd(), "newFile.xlsx")

    wb = openpyxl.load_workbook(filename=oldFilePath, read_only=True)
    wb2 = openpyxl.Workbook()
    sheets = wb.sheetnames
    sheets2 = wb2.sheetnames

    # removes all sheets in newFile.xlsx
    for sheet in sheets2:
        temp = wb2[sheet]
        wb2.remove(temp)

    for tab in sheets:
        print("Sheet: " + str(tab))
        rowCounter = 2  # row 1 of each output sheet holds the titles
        sheet = wb[tab]
        for row in range(sheet.max_row):
            # openpyxl returns None for an empty cell, so check for both None and ""
            if sheet.cell(row=row + 1, column=1).value in (None, ""):  # if cell is empty stop reading
                break
            elif sheet.cell(row=row + 1, column=1).value == "UPDATED":
                # create the output sheet the first time this tab has a match
                if tab not in sheets2:
                    sheet2 = wb2.create_sheet(title=tab)
                    sheet2.append(titles)
                for x in range(1, 17):
                    sheet2.cell(row=rowCounter, column=x).value = sheet.cell(row=row + 1, column=x).value
                rowCounter += 1
                sheets2 = wb2.sheetnames
    wb2.save(filename=newFilePath)


if __name__ == "__main__":
    startTime = datetime.now()
    main()
    print("Script finished in: " + str(datetime.now() - startTime))
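
Regarding the update above: as far as I understand it, a worksheet opened with read_only=True is read lazily, so every sheet.cell(...) lookup may have to scan the sheet again from the top, which would explain why the per-cell loop is so slow in that mode. Below is a minimal sketch of the same filtering done by streaming each sheet once with iter_rows instead. The names updated_rows and copy_updated are hypothetical helpers, not part of the script above, and I have not measured this against the real file.

```python
MARKER = "UPDATED"  # value the script above looks for in column 1


def updated_rows(rows):
    """Yield rows whose first cell equals MARKER.

    `rows` is any iterable of tuples of cell values. Iteration stops at the
    first row whose first cell is empty, like the "stop reading" rule above.
    """
    for row in rows:
        first = row[0] if row else None
        if first is None or first == "":
            break
        if first == MARKER:
            yield row


def copy_updated(old_path, new_path, titles):
    """Hypothetical helper: copy matching rows into same-named sheets."""
    # Imported here so updated_rows() stays usable without openpyxl installed.
    import openpyxl

    wb = openpyxl.load_workbook(old_path, read_only=True)
    out = openpyxl.Workbook(write_only=True)
    for ws in wb.worksheets:
        # iter_rows walks the sheet once instead of doing one lookup per cell.
        rows = ws.iter_rows(max_col=len(titles), values_only=True)
        matches = list(updated_rows(rows))
        if matches:
            target = out.create_sheet(title=ws.title)
            target.append(titles)
            for row in matches:
                target.append(row)
    wb.close()
    out.save(new_path)
```

With the names from the script above, this would be called as copy_updated(oldFilePath, newFilePath, titles). If a single pass per sheet is already fast enough, multithreading or multiprocessing may not be needed at all.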