站点图标 AI技术聚合

python 将多个df写入xlsx文件的ExcelWriter

关于pandas.ExcelWriter用法的介绍,本文参考了python内置文档 Help on class ExcelWriter in module pandas.io.excel._base。通过查看ExcelWriter的py文件,可以看到该类的定义框架如下所示:

class ExcelWriter(builtins.object):
    def ExcelWriter(
        path: 'FilePathOrBuffer | ExcelWriter', 
        engine=None, 
        date_format=None, 
        datetime_format=None, 
        mode: 'str' = 'w', 
        storage_options: 'StorageOptions' = None, 
        if_sheet_exists: 'str | None' = None, 
        engine_kwargs: 'dict | None' = None, 
        **kwargs):

以下首先对ExcelWriter的帮助文档进行整合:

ExcelWriter是一个用于将DataFrame对象写入Excel工作表的类。默认的处理是对xls文件使用xlwt方法,对xlsx文件使用openpyxl方法,对ods文件使用odf方法。可以参见DataFrame.to_excel的文档查看这三种方法的典型用法。ExcelWriter应当视为一个上下文管理器(context manager),否则,call close()函数进行保存并关闭任何已打开的文件的处理(handles)

Class for writing DataFrame objects into excel sheets. Default is to use xlwt for xls, openpyxl for xlsx, odf for ods. See DataFrame.to_excel for typical usage.

参数:

path : str or typing.BinaryIO
    Path to xls or xlsx or ods file.

engine : str (optional)
    Engine to use for writing.
    If None, defaults to ``io.excel.<extension>.writer``. 

date_format : str, default None
    Format string for dates written into Excel files.
    (e.g. 'YYYY-MM-DD')

datetime_format : str, default None
    Format string for datetime objects written into Excel files.
    (e.g. 'YYYY-MM-DD HH:MM:SS')

mode : {'w', 'a'}, default 'w'
    File mode to use (write or append).
    Append does not work with fsspec URLs.

storage_options : dict, optional
    Extra options that make sense for a particular storage connection, 
    e.g. host, port, username, password, etc.,
    if using a URL that will be parsed by ``fsspec``,
    e.g., starting "s3://", "gcs://".

if_sheet_exists : {'error', 'new', 'replace'}, default 'error'
    How to behave when trying to write to a sheet that already exists 
    (append mode only).
    * error: raise a ValueError.
    * new: Create a new sheet, with a name determined by the engine.
    * replace: Delete the contents of the sheet before writing to it.

engine_kwargs : dict, optional
    Keyword arguments to be passed into the engine.

**kwargs : dict, optional
    Keyword arguments to be passed into the engine.
    deprecated:: 1.3.0 Use engine_kwargs instead.

NOTE:

  1. parameter ‘engine’ can only be passed as a keyword argument
  2. deprecated:: 1.2.0 As the `xlwt <https://pypi.org/project/xlwt/>`__ package is no longer maintained, the “xlwt“ engine will be removed in a future version of pandas.
  3. For compatibility with CSV writers, ExcelWriter serializes lists and dicts to strings before writing. 这一点使得ExcelWriter的效率很差

例子:

Default usage:

>>> df = pd.DataFrame([["ABC", "XYZ"]], columns=["Foo", "Bar"])
>>> with ExcelWriter("path_to_file.xlsx") as writer:
...     df.to_excel(writer)

To write to separate sheets in a single file:

>>> df1 = pd.DataFrame([["AAA", "BBB"]], columns=["Spam", "Egg"])
>>> df2 = pd.DataFrame([["ABC", "XYZ"]], columns=["Foo", "Bar"])
>>> with ExcelWriter("path_to_file.xlsx") as writer:
...     df1.to_excel(writer, sheet_name="Sheet1")
...     df2.to_excel(writer, sheet_name="Sheet2")

You can set the date format or datetime format:

>>> from datetime import date, datetime
>>> df = pd.DataFrame(
...     [
...         [date(2014, 1, 31), date(1999, 9, 24)],
...         [datetime(1998, 5, 26, 23, 33, 4), datetime(2014, 2, 28, 13, 5, 13)],
...     ],
...     index=["Date", "Datetime"],
...     columns=["X", "Y"],
... )
>>> with ExcelWriter(
...     "path_to_file.xlsx",
...     date_format="YYYY-MM-DD",
...     datetime_format="YYYY-MM-DD HH:MM:SS"
... ) as writer:
...     df.to_excel(writer)

You can also append to an existing Excel file:

>>> with ExcelWriter("path_to_file.xlsx", mode="a", engine="openpyxl") as writer:
...     df.to_excel(writer, sheet_name="Sheet3")

You can store Excel file in RAM:

>>> import io
>>> df = pd.DataFrame([["ABC", "XYZ"]], columns=["Foo", "Bar"])
>>> buffer = io.BytesIO()
>>> with pd.ExcelWriter(buffer) as writer:
...     df.to_excel(writer)

You can pack Excel file into zip archive:

>>> import zipfile
>>> df = pd.DataFrame([["ABC", "XYZ"]], columns=["Foo", "Bar"])
>>> with zipfile.ZipFile("path_to_file.zip", "w") as zf:
...     with zf.open("filename.xlsx", "w") as buffer:
...         with pd.ExcelWriter(buffer) as writer:
...             df.to_excel(writer)

实例用法

实践中常常使用path,engine,mode作为应用ExcelWriter的主要参数。其中再次强调,mode: ‘w’ for write , ‘a’ for append.default ‘w’: write

使用上下文管理器这样就不用担心忘记excel_writer.close()

import pandas as pd
from openpyxl import load_workbook
import os

if os.path.exists(output_file):
    with pd.ExcelWriter(output_file, engine='openpyxl') as excel_writer:
        book = load_workbook(excel_writer.path)
        excel_writer.book = book
        result_df.to_excel(excel_writer, sheet_name=sheet_name)
    else:
        result_df.to_excel(output_file, sheet_name=sheet_name)

总结

  1. path:xls或xlsx文件的路径;
  2. engine:可选参数,默认值为”xlwt”,对于xls文件该参数值为”xlwt”有效,对于xlsx文件该参数值为”openpyxl”有效;
  3. date_format:格式字符串,用于写入Excel文件的日期(例如“ YYYY-MM-DD”);
  4. datetime_format:写入Excel文件的日期时间对象的格式字符串。 (例如“ YYYY-MM-DD HH:MM:SS”)
  5. mode:可选参数,{‘w’, ‘a’}, 默认为 ‘w’,即擦除目标文件原内容,将dataframe覆盖式导出至目标Excel文件。当为’a’时,将dataframe追加导入至目标Excel文件,目标文件中原内容保留。此处需注意,如果为’a’,且导出的为xlsx文件,需将该参数修改为”openpyxl”。

文章出处登录后可见!

已经登录?立即刷新
退出移动版