Python pandas库|任凭弱水三千,我只取一瓢饮(7)

Python pandas库|任凭弱水三千,我只取一瓢饮(7)

上一篇链接:

Python pandas库|任凭弱水三千,我只取一瓢饮(6)_Hann Yang的博客-CSDN博客

to_系列函数:22个 (12~22)

Function12

to_numpy(self, dtype: 'NpDtype | None' = None, copy: 'bool' = False, na_value=<no_default>) -> 'np.ndarray'

Help on function to_numpy in module pandas.core.frame:

to_numpy(self, dtype: 'NpDtype | None' = None, copy: 'bool' = False, na_value=<no_default>) -> 'np.ndarray'
    Convert the DataFrame to a NumPy array.
    
    By default, the dtype of the returned array will be the common NumPy
    dtype of all types in the DataFrame. For example, if the dtypes are
    “float16“ and “float32“, the results dtype will be “float32“.
    This may require copying data and coercing values, which may be
    expensive.
    
    Parameters
    ———-
    dtype : str or numpy.dtype, optional
        The dtype to pass to :meth:`numpy.asarray`.
    copy : bool, default False
        Whether to ensure that the returned value is not a view on
        another array. Note that “copy=False“ does not *ensure* that
        “to_numpy()“ is no-copy. Rather, “copy=True“ ensure that
        a copy is made, even if not strictly necessary.
    na_value : Any, optional
        The value to use for missing values. The default value depends
        on `dtype` and the dtypes of the DataFrame columns.
    
        .. versionadded:: 1.1.0
    
    Returns
    ——-
    numpy.ndarray
    
    See Also
    ——–
    Series.to_numpy : Similar method for Series.
    
    Examples
    ——–
    >>> pd.DataFrame({"A": [1, 2], "B": [3, 4]}).to_numpy()
    array([[1, 3],
           [2, 4]])
    
    With heterogeneous data, the lowest common type will have to
    be used.
    
    >>> df = pd.DataFrame({"A": [1, 2], "B": [3.0, 4.5]})
    >>> df.to_numpy()
    array([[1. , 3. ],
           [2. , 4.5]])
    
    For a mix of numeric and non-numeric types, the output array will
    have object dtype.
    
    >>> df['C'] = pd.date_range('2000', periods=2)
    >>> df.to_numpy()
    array([[1, 3.0, Timestamp('2000-01-01 00:00:00')],
           [2, 4.5, Timestamp('2000-01-02 00:00:00')]], dtype=object)

Function13

to_parquet(self, path: 'FilePathOrBuffer | None' = None, engine: 'str' = 'auto', compression: 'str | None' = 'snappy', index: 'bool | None' = None, partition_cols: 'list[str] | None' = None, storage_options: 'StorageOptions' = None, **kwargs) -> 'bytes | None'

Help on function to_parquet in module pandas.core.frame:

to_parquet(self, path: 'FilePathOrBuffer | None' = None, engine: 'str' = 'auto', compression: 'str | None' = 'snappy', index: 'bool | None' = None, partition_cols: 'list[str] | None' = None, storage_options: 'StorageOptions' = None, **kwargs) -> 'bytes | None'
    Write a DataFrame to the binary parquet format.
    
    This function writes the dataframe as a `parquet file
    <https://parquet.apache.org/>`_. You can choose different parquet
    backends, and have the option of compression. See
    :ref:`the user guide <io.parquet>` for more details.
    
    Parameters
    ———-
    path : str or file-like object, default None
        If a string, it will be used as Root Directory path
        when writing a partitioned dataset. By file-like object,
        we refer to objects with a write() method, such as a file handle
        (e.g. via builtin open function) or io.BytesIO. The engine
        fastparquet does not accept file-like objects. If path is None,
        a bytes object is returned.
    
        .. versionchanged:: 1.2.0
    
        Previously this was "fname"
    
    engine : {'auto', 'pyarrow', 'fastparquet'}, default 'auto'
        Parquet library to use. If 'auto', then the option
        “io.parquet.engine“ is used. The default “io.parquet.engine“
        behavior is to try 'pyarrow', falling back to 'fastparquet' if
        'pyarrow' is unavailable.
    compression : {'snappy', 'gzip', 'brotli', None}, default 'snappy'
        Name of the compression to use. Use “None“ for no compression.
    index : bool, default None
        If “True“, include the dataframe's index(es) in the file output.
        If “False“, they will not be written to the file.
        If “None“, similar to “True“ the dataframe's index(es)
        will be saved. However, instead of being saved as values,
        the RangeIndex will be stored as a range in the metadata so it
        doesn't require much space and is faster. Other indexes will
        be included as columns in the file output.
    partition_cols : list, optional, default None
        Column names by which to partition the dataset.
        Columns are partitioned in the order they are given.
        Must be None if path is not a string.
    storage_options : dict, optional
        Extra options that make sense for a particular storage connection, e.g.
        host, port, username, password, etc. For HTTP(S) URLs the key-value pairs
        are forwarded to “urllib“ as header options. For other URLs (e.g.
        starting with "s3://", and "gcs://") the key-value pairs are forwarded to
        “fsspec“. Please see “fsspec“ and “urllib“ for more details.
    
        .. versionadded:: 1.2.0
    
    **kwargs
        Additional arguments passed to the parquet library. See
        :ref:`pandas io <io.parquet>` for more details.
    
    Returns
    ——-
    bytes if no path argument is provided else None
    
    See Also
    ——–
    read_parquet : Read a parquet file.
    DataFrame.to_csv : Write a csv file.
    DataFrame.to_sql : Write to a sql table.
    DataFrame.to_hdf : Write to hdf.
    
    Notes
    —–
    This function requires either the `fastparquet
    <https://pypi.org/project/fastparquet>`_ or `pyarrow
    <https://arrow.apache.org/docs/python/>`_ library.
    
    Examples
    ——–
    >>> df = pd.DataFrame(data={'col1': [1, 2], 'col2': [3, 4]})
    >>> df.to_parquet('df.parquet.gzip',
    …               compression='gzip')  # doctest: +SKIP
    >>> pd.read_parquet('df.parquet.gzip')  # doctest: +SKIP
       col1  col2
    0     1     3
    1     2     4
    
    If you want to get a buffer to the parquet content you can use a io.BytesIO
    object, as long as you don't use partition_cols, which creates multiple files.
    
    >>> import io
    >>> f = io.BytesIO()
    >>> df.to_parquet(f)
    >>> f.seek(0)
    0
    >>> content = f.read()

Function14

to_period(self, freq: 'Frequency | None' = None, axis: 'Axis' = 0, copy: 'bool' = True) -> 'DataFrame'

Help on function to_period in module pandas.core.frame:

to_period(self, freq: 'Frequency | None' = None, axis: 'Axis' = 0, copy: 'bool' = True) -> 'DataFrame'
    Convert DataFrame from DatetimeIndex to PeriodIndex.
    
    Convert DataFrame from DatetimeIndex to PeriodIndex with desired
    frequency (inferred from index if not passed).
    
    Parameters
    ———-
    freq : str, default
        Frequency of the PeriodIndex.
    axis : {0 or 'index', 1 or 'columns'}, default 0
        The axis to convert (the index by default).
    copy : bool, default True
        If False then underlying input data is not copied.
    
    Returns
    ——-
    DataFrame with PeriodIndex

Function15

to_pickle(self, path, compression: 'CompressionOptions' = 'infer', protocol: 'int' = 5, storage_options: 'StorageOptions' = None) -> 'None'

Help on function to_pickle in module pandas.core.generic:

to_pickle(self, path, compression: 'CompressionOptions' = 'infer', protocol: 'int' = 5, storage_options: 'StorageOptions' = None) -> 'None'
    Pickle (serialize) object to file.
    
    Parameters
    ———-
    path : str
        File path where the pickled object will be stored.
    compression : {'infer', 'gzip', 'bz2', 'zip', 'xz', None},         default 'infer'
        A string representing the compression to use in the output file. By
        default, infers from the file extension in specified path.
        Compression mode may be any of the following possible
        values: {¡®infer¡¯, ¡®gzip¡¯, ¡®bz2¡¯, ¡®zip¡¯, ¡®xz¡¯, None}. If compression
        mode is ¡®infer¡¯ and path_or_buf is path-like, then detect
        compression mode from the following extensions:
        ¡®.gz¡¯, ¡®.bz2¡¯, ¡®.zip¡¯ or ¡®.xz¡¯. (otherwise no compression).
        If dict given and mode is ¡®zip¡¯ or inferred as ¡®zip¡¯, other entries
        passed as additional compression options.
    protocol : int
        Int which indicates which protocol should be used by the pickler,
        default HIGHEST_PROTOCOL (see [1]_ paragraph 12.1.2). The possible
        values are 0, 1, 2, 3, 4, 5. A negative value for the protocol
        parameter is equivalent to setting its value to HIGHEST_PROTOCOL.
    
        .. [1] https://docs.python.org/3/library/pickle.html.
    
    storage_options : dict, optional
        Extra options that make sense for a particular storage connection, e.g.
        host, port, username, password, etc. For HTTP(S) URLs the key-value pairs
        are forwarded to “urllib“ as header options. For other URLs (e.g.
        starting with "s3://", and "gcs://") the key-value pairs are forwarded to
        “fsspec“. Please see “fsspec“ and “urllib“ for more details.
    
        .. versionadded:: 1.2.0
    
    See Also
    ——–
    read_pickle : Load pickled pandas object (or any object) from file.
    DataFrame.to_hdf : Write DataFrame to an HDF5 file.
    DataFrame.to_sql : Write DataFrame to a SQL database.
    DataFrame.to_parquet : Write a DataFrame to the binary parquet format.
    
    Examples
    ——–
    >>> original_df = pd.DataFrame({"foo": range(5), "bar": range(5, 10)})
    >>> original_df
       foo  bar
    0    0    5
    1    1    6
    2    2    7
    3    3    8
    4    4    9
    >>> original_df.to_pickle("./dummy.pkl")
    
    >>> unpickled_df = pd.read_pickle("./dummy.pkl")
    >>> unpickled_df
       foo  bar
    0    0    5
    1    1    6
    2    2    7
    3    3    8
    4    4    9
    
    >>> import os
    >>> os.remove("./dummy.pkl")

Function16

to_records(self, index=True, column_dtypes=None, index_dtypes=None) -> 'np.recarray'

Help on function to_records in module pandas.core.frame:

to_records(self, index=True, column_dtypes=None, index_dtypes=None) -> 'np.recarray'
    Convert DataFrame to a NumPy record array.
    
    Index will be included as the first field of the record array if
    requested.
    
    Parameters
    ———-
    index : bool, default True
        Include index in resulting record array, stored in 'index'
        field or using the index label, if set.
    column_dtypes : str, type, dict, default None
        If a string or type, the data type to store all columns. If
        a dictionary, a mapping of column names and indices (zero-indexed)
        to specific data types.
    index_dtypes : str, type, dict, default None
        If a string or type, the data type to store all index levels. If
        a dictionary, a mapping of index level names and indices
        (zero-indexed) to specific data types.
    
        This mapping is applied only if `index=True`.
    
    Returns
    ——-
    numpy.recarray
        NumPy ndarray with the DataFrame labels as fields and each row
        of the DataFrame as entries.
    
    See Also
    ——–
    DataFrame.from_records: Convert structured or record ndarray
        to DataFrame.
    numpy.recarray: An ndarray that allows field access using
        attributes, analogous to typed columns in a
        spreadsheet.
    
    Examples
    ——–
    >>> df = pd.DataFrame({'A': [1, 2], 'B': [0.5, 0.75]},
    …                   index=['a', 'b'])
    >>> df
       A     B
    a  1  0.50
    b  2  0.75
    >>> df.to_records()
    rec.array([('a', 1, 0.5 ), ('b', 2, 0.75)],
              dtype=[('index', 'O'), ('A', '<i8'), ('B', '<f8')])
    
    If the DataFrame index has no label then the recarray field name
    is set to 'index'. If the index has a label then this is used as the
    field name:
    
    >>> df.index = df.index.rename("I")
    >>> df.to_records()
    rec.array([('a', 1, 0.5 ), ('b', 2, 0.75)],
              dtype=[('I', 'O'), ('A', '<i8'), ('B', '<f8')])
    
    The index can be excluded from the record array:
    
    >>> df.to_records(index=False)
    rec.array([(1, 0.5 ), (2, 0.75)],
              dtype=[('A', '<i8'), ('B', '<f8')])
    
    Data types can be specified for the columns:
    
    >>> df.to_records(column_dtypes={"A": "int32"})
    rec.array([('a', 1, 0.5 ), ('b', 2, 0.75)],
              dtype=[('I', 'O'), ('A', '<i4'), ('B', '<f8')])
    
    As well as for the index:
    
    >>> df.to_records(index_dtypes="<S2")
    rec.array([(b'a', 1, 0.5 ), (b'b', 2, 0.75)],
              dtype=[('I', 'S2'), ('A', '<i8'), ('B', '<f8')])
    
    >>> index_dtypes = f"<S{df.index.str.len().max()}"
    >>> df.to_records(index_dtypes=index_dtypes)
    rec.array([(b'a', 1, 0.5 ), (b'b', 2, 0.75)],
              dtype=[('I', 'S1'), ('A', '<i8'), ('B', '<f8')])

Function17

to_sql(self, name: 'str', con, schema=None, if_exists: 'str' = 'fail', index: 'bool_t' = True, index_label=None, chunksize=None, dtype: 'DtypeArg | None' = None, method=None) -> 'None'

Help on function to_sql in module pandas.core.generic:

to_sql(self, name: 'str', con, schema=None, if_exists: 'str' = 'fail', index: 'bool_t' = True, index_label=None, chunksize=None, dtype: 'DtypeArg | None' = None, method=None) -> 'None'
    Write records stored in a DataFrame to a SQL database.
    
    Databases supported by SQLAlchemy [1]_ are supported. Tables can be
    newly created, appended to, or overwritten.
    
    Parameters
    ———-
    name : str
        Name of SQL table.
    con : sqlalchemy.engine.(Engine or Connection) or sqlite3.Connection
        Using SQLAlchemy makes it possible to use any DB supported by that
        library. Legacy support is provided for sqlite3.Connection objects. The user
        is responsible for engine disposal and connection closure for the SQLAlchemy
        connectable See `here                 <https://docs.sqlalchemy.org/en/13/core/connections.html>`_.
    
    schema : str, optional
        Specify the schema (if database flavor supports this). If None, use
        default schema.
    if_exists : {'fail', 'replace', 'append'}, default 'fail'
        How to behave if the table already exists.
    
        * fail: Raise a ValueError.
        * replace: Drop the table before inserting new values.
        * append: Insert new values to the existing table.
    
    index : bool, default True
        Write DataFrame index as a column. Uses `index_label` as the column
        name in the table.
    index_label : str or sequence, default None
        Column label for index column(s). If None is given (default) and
        `index` is True, then the index names are used.
        A sequence should be given if the DataFrame uses MultiIndex.
    chunksize : int, optional
        Specify the number of rows in each batch to be written at a time.
        By default, all rows will be written at once.
    dtype : dict or scalar, optional
        Specifying the datatype for columns. If a dictionary is used, the
        keys should be the column names and the values should be the
        SQLAlchemy types or strings for the sqlite3 legacy mode. If a
        scalar is provided, it will be applied to all columns.
    method : {None, 'multi', callable}, optional
        Controls the SQL insertion clause used:
    
        * None : Uses standard SQL “INSERT“ clause (one per row).
        * 'multi': Pass multiple values in a single “INSERT“ clause.
        * callable with signature “(pd_table, conn, keys, data_iter)“.
    
        Details and a sample callable implementation can be found in the
        section :ref:`insert method <io.sql.method>`.
    
    Raises
    ——
    ValueError
        When the table already exists and `if_exists` is 'fail' (the
        default).
    
    See Also
    ——–
    read_sql : Read a DataFrame from a table.
    
    Notes
    —–
    Timezone aware datetime columns will be written as
    “Timestamp with timezone“ type with SQLAlchemy if supported by the
    database. Otherwise, the datetimes will be stored as timezone unaware
    timestamps local to the original timezone.
    
    References
    ———-
    .. [1] https://docs.sqlalchemy.org
    .. [2] https://www.python.org/dev/peps/pep-0249/
    
    Examples
    ——–
    Create an in-memory SQLite database.
    
    >>> from sqlalchemy import create_engine
    >>> engine = create_engine('sqlite://', echo=False)
    
    Create a table from scratch with 3 rows.
    
    >>> df = pd.DataFrame({'name' : ['User 1', 'User 2', 'User 3']})
    >>> df
         name
    0  User 1
    1  User 2
    2  User 3
    
    >>> df.to_sql('users', con=engine)
    >>> engine.execute("SELECT * FROM users").fetchall()
    [(0, 'User 1'), (1, 'User 2'), (2, 'User 3')]
    
    An `sqlalchemy.engine.Connection` can also be passed to `con`:
    
    >>> with engine.begin() as connection:
    …     df1 = pd.DataFrame({'name' : ['User 4', 'User 5']})
    …     df1.to_sql('users', con=connection, if_exists='append')
    
    This is allowed to support operations that require that the same
    DBAPI connection is used for the entire operation.
    
    >>> df2 = pd.DataFrame({'name' : ['User 6', 'User 7']})
    >>> df2.to_sql('users', con=engine, if_exists='append')
    >>> engine.execute("SELECT * FROM users").fetchall()
    [(0, 'User 1'), (1, 'User 2'), (2, 'User 3'),
     (0, 'User 4'), (1, 'User 5'), (0, 'User 6'),
     (1, 'User 7')]
    
    Overwrite the table with just “df2“.
    
    >>> df2.to_sql('users', con=engine, if_exists='replace',
    …            index_label='id')
    >>> engine.execute("SELECT * FROM users").fetchall()
    [(0, 'User 6'), (1, 'User 7')]
    
    Specify the dtype (especially useful for integers with missing values).
    Notice that while pandas is forced to store the data as floating point,
    the database supports nullable integers. When fetching the data with
    Python, we get back integer scalars.
    
    >>> df = pd.DataFrame({"A": [1, None, 2]})
    >>> df
         A
    0  1.0
    1  NaN
    2  2.0
    
    >>> from sqlalchemy.types import Integer
    >>> df.to_sql('integers', con=engine, index=False,
    …           dtype={"A": Integer()})
    
    >>> engine.execute("SELECT * FROM integers").fetchall()
    [(1,), (None,), (2,)]

Function18

to_stata(self, path: 'FilePathOrBuffer', convert_dates: 'dict[Hashable, str] | None' = None, write_index: 'bool' = True, byteorder: 'str | None' = None, time_stamp: 'datetime.datetime | None' = None, data_label: 'str | None' = None, variable_labels: 'dict[Hashable, str] | None' = None, version: 'int | None' = 114, convert_strl: 'Sequence[Hashable] | None' = None, compression: 'CompressionOptions' = 'infer', storage_options: 'StorageOptions' = None) -> 'None'

Help on function to_stata in module pandas.core.frame:

to_stata(self, path: 'FilePathOrBuffer', convert_dates: 'dict[Hashable, str] | None' = None, write_index: 'bool' = True, byteorder: 'str | None' = None, time_stamp: 'datetime.datetime | None' = None, data_label: 'str | None' = None, variable_labels: 'dict[Hashable, str] | None' = None, version: 'int | None' = 114, convert_strl: 'Sequence[Hashable] | None' = None, compression: 'CompressionOptions' = 'infer', storage_options: 'StorageOptions' = None) -> 'None'
    Export DataFrame object to Stata dta format.
    
    Writes the DataFrame to a Stata dataset file.
    "dta" files contain a Stata dataset.
    
    Parameters
    ———-
    path : str, buffer or path object
        String, path object (pathlib.Path or py._path.local.LocalPath) or
        object implementing a binary write() function. If using a buffer
        then the buffer will not be automatically closed after the file
        data has been written.
    
        .. versionchanged:: 1.0.0
    
        Previously this was "fname"
    
    convert_dates : dict
        Dictionary mapping columns containing datetime types to stata
        internal format to use when writing the dates. Options are 'tc',
        'td', 'tm', 'tw', 'th', 'tq', 'ty'. Column can be either an integer
        or a name. Datetime columns that do not have a conversion type
        specified will be converted to 'tc'. Raises NotImplementedError if
        a datetime column has timezone information.
    write_index : bool
        Write the index to Stata dataset.
    byteorder : str
        Can be ">", "<", "little", or "big". default is `sys.byteorder`.
    time_stamp : datetime
        A datetime to use as file creation date.  Default is the current
        time.
    data_label : str, optional
        A label for the data set.  Must be 80 characters or smaller.
    variable_labels : dict
        Dictionary containing columns as keys and variable labels as
        values. Each label must be 80 characters or smaller.
    version : {114, 117, 118, 119, None}, default 114
        Version to use in the output dta file. Set to None to let pandas
        decide between 118 or 119 formats depending on the number of
        columns in the frame. Version 114 can be read by Stata 10 and
        later. Version 117 can be read by Stata 13 or later. Version 118
        is supported in Stata 14 and later. Version 119 is supported in
        Stata 15 and later. Version 114 limits string variables to 244
        characters or fewer while versions 117 and later allow strings
        with lengths up to 2,000,000 characters. Versions 118 and 119
        support Unicode characters, and version 119 supports more than
        32,767 variables.
    
        Version 119 should usually only be used when the number of
        variables exceeds the capacity of dta format 118. Exporting
        smaller datasets in format 119 may have unintended consequences,
        and, as of November 2020, Stata SE cannot read version 119 files.
    
        .. versionchanged:: 1.0.0
    
            Added support for formats 118 and 119.
    
    convert_strl : list, optional
        List of column names to convert to string columns to Stata StrL
        format. Only available if version is 117.  Storing strings in the
        StrL format can produce smaller dta files if strings have more than
        8 characters and values are repeated.
    compression : str or dict, default 'infer'
        For on-the-fly compression of the output dta. If string, specifies
        compression mode. If dict, value at key 'method' specifies
        compression mode. Compression mode must be one of {'infer', 'gzip',
        'bz2', 'zip', 'xz', None}. If compression mode is 'infer' and
        `fname` is path-like, then detect compression from the following
        extensions: '.gz', '.bz2', '.zip', or '.xz' (otherwise no
        compression). If dict and compression mode is one of {'zip',
        'gzip', 'bz2'}, or inferred as one of the above, other entries
        passed as additional compression options.
    
        .. versionadded:: 1.1.0
    
    storage_options : dict, optional
        Extra options that make sense for a particular storage connection, e.g.
        host, port, username, password, etc. For HTTP(S) URLs the key-value pairs
        are forwarded to “urllib“ as header options. For other URLs (e.g.
        starting with "s3://", and "gcs://") the key-value pairs are forwarded to
        “fsspec“. Please see “fsspec“ and “urllib“ for more details.
    
        .. versionadded:: 1.2.0
    
    Raises
    ——
    NotImplementedError
        * If datetimes contain timezone information
        * Column dtype is not representable in Stata
    ValueError
        * Columns listed in convert_dates are neither datetime64[ns]
          or datetime.datetime
        * Column listed in convert_dates is not in DataFrame
        * Categorical label contains more than 32,000 characters
    
    See Also
    ——–
    read_stata : Import Stata data files.
    io.stata.StataWriter : Low-level writer for Stata data files.
    io.stata.StataWriter117 : Low-level writer for version 117 files.
    
    Examples
    ——–
    >>> df = pd.DataFrame({'animal': ['falcon', 'parrot', 'falcon',
    …                               'parrot'],
    …                    'speed': [350, 18, 361, 15]})
    >>> df.to_stata('animals.dta')  # doctest: +SKIP

Function19

to_string(self, buf: 'FilePathOrBuffer[str] | None' = None, columns: 'Sequence[str] | None' = None, col_space: 'int | None' = None, header: 'bool | Sequence[str]' = True, index: 'bool' = True, na_rep: 'str' = 'NaN', formatters: 'fmt.FormattersType | None' = None, float_format: 'fmt.FloatFormatType | None' = None, sparsify: 'bool | None' = None, index_names: 'bool' = True, justify: 'str | None' = None, max_rows: 'int | None' = None, min_rows: 'int | None' = None, max_cols: 'int | None' = None, show_dimensions: 'bool' = False, decimal: 'str' = '.', line_width: 'int | None' = None, max_colwidth: 'int | None' = None, encoding: 'str | None' = None) -> 'str | None'

Help on function to_string in module pandas.core.frame:

to_string(self, buf: 'FilePathOrBuffer[str] | None' = None, columns: 'Sequence[str] | None' = None, col_space: 'int | None' = None, header: 'bool | Sequence[str]' = True, index: 'bool' = True, na_rep: 'str' = 'NaN', formatters: 'fmt.FormattersType | None' = None, float_format: 'fmt.FloatFormatType | None' = None, sparsify: 'bool | None' = None, index_names: 'bool' = True, justify: 'str | None' = None, max_rows: 'int | None' = None, min_rows: 'int | None' = None, max_cols: 'int | None' = None, show_dimensions: 'bool' = False, decimal: 'str' = '.', line_width: 'int | None' = None, max_colwidth: 'int | None' = None, encoding: 'str | None' = None) -> 'str | None'
    Render a DataFrame to a console-friendly tabular output.
    
    Parameters
    ———-
    buf : str, Path or StringIO-like, optional, default None
        Buffer to write to. If None, the output is returned as a string.
    columns : sequence, optional, default None
        The subset of columns to write. Writes all columns by default.
    col_space : int, list or dict of int, optional
        The minimum width of each column.
    header : bool or sequence, optional
        Write out the column names. If a list of strings is given, it is assumed to be aliases for the column names.
    index : bool, optional, default True
        Whether to print index (row) labels.
    na_rep : str, optional, default 'NaN'
        String representation of “NaN“ to use.
    formatters : list, tuple or dict of one-param. functions, optional
        Formatter functions to apply to columns' elements by position or
        name.
        The result of each function must be a unicode string.
        List/tuple must be of length equal to the number of columns.
    float_format : one-parameter function, optional, default None
        Formatter function to apply to columns' elements if they are
        floats. This function must return a unicode string and will be
        applied only to the non-“NaN“ elements, with “NaN“ being
        handled by “na_rep“.
    
        .. versionchanged:: 1.2.0
    
    sparsify : bool, optional, default True
        Set to False for a DataFrame with a hierarchical index to print
        every multiindex key at each row.
    index_names : bool, optional, default True
        Prints the names of the indexes.
    justify : str, default None
        How to justify the column labels. If None uses the option from
        the print configuration (controlled by set_option), 'right' out
        of the box. Valid values are
    
        * left
        * right
        * center
        * justify
        * justify-all
        * start
        * end
        * inherit
        * match-parent
        * initial
        * unset.
    max_rows : int, optional
        Maximum number of rows to display in the console.
    min_rows : int, optional
        The number of rows to display in the console in a truncated repr
        (when number of rows is above `max_rows`).
    max_cols : int, optional
        Maximum number of columns to display in the console.
    show_dimensions : bool, default False
        Display DataFrame dimensions (number of rows by number of columns).
    decimal : str, default '.'
        Character recognized as decimal separator, e.g. ',' in Europe.
    
    line_width : int, optional
        Width to wrap a line in characters.
    max_colwidth : int, optional
        Max width to truncate each column in characters. By default, no limit.
    
        .. versionadded:: 1.0.0
    encoding : str, default "utf-8"
        Set character encoding.
    
        .. versionadded:: 1.0
    
    Returns
    ——-
    str or None
        If buf is None, returns the result as a string. Otherwise returns
        None.
    
    See Also
    ——–
    to_html : Convert DataFrame to HTML.
    
    Examples
    ——–
    >>> d = {'col1': [1, 2, 3], 'col2': [4, 5, 6]}
    >>> df = pd.DataFrame(d)
    >>> print(df.to_string())
       col1  col2
    0     1     4
    1     2     5
    2     3     6

Function20

to_timestamp(self, freq: 'Frequency | None' = None, how: 'str' = 'start', axis: 'Axis' = 0, copy: 'bool' = True) -> 'DataFrame'

Help on function to_timestamp in module pandas.core.frame:

to_timestamp(self, freq: 'Frequency | None' = None, how: 'str' = 'start', axis: 'Axis' = 0, copy: 'bool' = True) -> 'DataFrame'
    Cast to DatetimeIndex of timestamps, at *beginning* of period.
    
    Parameters
    ———-
    freq : str, default frequency of PeriodIndex
        Desired frequency.
    how : {'s', 'e', 'start', 'end'}
        Convention for converting period to timestamp; start of period
        vs. end.
    axis : {0 or 'index', 1 or 'columns'}, default 0
        The axis to convert (the index by default).
    copy : bool, default True
        If False then underlying input data is not copied.
    
    Returns
    ——-
    DataFrame with DatetimeIndex

Function21

to_xarray(self)

Help on function to_xarray in module pandas.core.generic:

to_xarray(self)
    Return an xarray object from the pandas object.
    
    Returns
    ——-
    xarray.DataArray or xarray.Dataset
        Data in the pandas structure converted to Dataset if the object is
        a DataFrame, or a DataArray if the object is a Series.
    
    See Also
    ——–
    DataFrame.to_hdf : Write DataFrame to an HDF5 file.
    DataFrame.to_parquet : Write a DataFrame to the binary parquet format.
    
    Notes
    —–
    See the `xarray docs <https://xarray.pydata.org/en/stable/>`__
    
    Examples
    ——–
    >>> df = pd.DataFrame([('falcon', 'bird', 389.0, 2),
    …                    ('parrot', 'bird', 24.0, 2),
    …                    ('lion', 'mammal', 80.5, 4),
    …                    ('monkey', 'mammal', np.nan, 4)],
    …                   columns=['name', 'class', 'max_speed',
    …                            'num_legs'])
    >>> df
         name   class  max_speed  num_legs
    0  falcon    bird      389.0         2
    1  parrot    bird       24.0         2
    2    lion  mammal       80.5         4
    3  monkey  mammal        NaN         4
    
    >>> df.to_xarray()
    <xarray.Dataset>
    Dimensions:    (index: 4)
    Coordinates:
      * index      (index) int64 0 1 2 3
    Data variables:
        name       (index) object 'falcon' 'parrot' 'lion' 'monkey'
        class      (index) object 'bird' 'bird' 'mammal' 'mammal'
        max_speed  (index) float64 389.0 24.0 80.5 nan
        num_legs   (index) int64 2 2 4 4
    
    >>> df['max_speed'].to_xarray()
    <xarray.DataArray 'max_speed' (index: 4)>
    array([389. ,  24. ,  80.5,   nan])
    Coordinates:
      * index    (index) int64 0 1 2 3
    
    >>> dates = pd.to_datetime(['2018-01-01', '2018-01-01',
    …                         '2018-01-02', '2018-01-02'])
    >>> df_multiindex = pd.DataFrame({'date': dates,
    …                               'animal': ['falcon', 'parrot',
    …                                          'falcon', 'parrot'],
    …                               'speed': [350, 18, 361, 15]})
    >>> df_multiindex = df_multiindex.set_index(['date', 'animal'])
    
    >>> df_multiindex
                       speed
    date       animal
    2018-01-01 falcon    350
               parrot     18
    2018-01-02 falcon    361
               parrot     15
    
    >>> df_multiindex.to_xarray()
    <xarray.Dataset>
    Dimensions:  (animal: 2, date: 2)
    Coordinates:
      * date     (date) datetime64[ns] 2018-01-01 2018-01-02
      * animal   (animal) object 'falcon' 'parrot'
    Data variables:
        speed    (date, animal) int64 350 18 361 15

Function22

to_xml(self, path_or_buffer: 'FilePathOrBuffer | None' = None, index: 'bool' = True, root_name: 'str | None' = 'data', row_name: 'str | None' = 'row', na_rep: 'str | None' = None, attr_cols: 'str | list[str] | None' = None, elem_cols: 'str | list[str] | None' = None, namespaces: 'dict[str | None, str] | None' = None, prefix: 'str | None' = None, encoding: 'str' = 'utf-8', xml_declaration: 'bool | None' = True, pretty_print: 'bool | None' = True, parser: 'str | None' = 'lxml', stylesheet: 'FilePathOrBuffer | None' = None, compression: 'CompressionOptions' = 'infer', storage_options: 'StorageOptions' = None) -> 'str | None'

Help on function to_xml in module pandas.core.frame:

to_xml(self, path_or_buffer: 'FilePathOrBuffer | None' = None, index: 'bool' = True, root_name: 'str | None' = 'data', row_name: 'str | None' = 'row', na_rep: 'str | None' = None, attr_cols: 'str | list[str] | None' = None, elem_cols: 'str | list[str] | None' = None, namespaces: 'dict[str | None, str] | None' = None, prefix: 'str | None' = None, encoding: 'str' = 'utf-8', xml_declaration: 'bool | None' = True, pretty_print: 'bool | None' = True, parser: 'str | None' = 'lxml', stylesheet: 'FilePathOrBuffer | None' = None, compression: 'CompressionOptions' = 'infer', storage_options: 'StorageOptions' = None) -> 'str | None'
    Render a DataFrame to an XML document.
    
    .. versionadded:: 1.3.0
    
    Parameters
    ———-
    path_or_buffer : str, path object or file-like object, optional
        File to write output to. If None, the output is returned as a
        string.
    index : bool, default True
        Whether to include index in XML document.
    root_name : str, default 'data'
        The name of root element in XML document.
    row_name : str, default 'row'
        The name of row element in XML document.
    na_rep : str, optional
        Missing data representation.
    attr_cols : list-like, optional
        List of columns to write as attributes in row element.
        Hierarchical columns will be flattened with underscore
        delimiting the different levels.
    elem_cols : list-like, optional
        List of columns to write as children in row element. By default,
        all columns output as children of row element. Hierarchical
        columns will be flattened with underscore delimiting the
        different levels.
    namespaces : dict, optional
        All namespaces to be defined in root element. Keys of dict
        should be prefix names and values of dict corresponding URIs.
        Default namespaces should be given empty string key. For
        example, ::
    
            namespaces = {"": "https://example.com"}
    
    prefix : str, optional
        Namespace prefix to be used for every element and/or attribute
        in document. This should be one of the keys in “namespaces“
        dict.
    encoding : str, default 'utf-8'
        Encoding of the resulting document.
    xml_declaration : bool, default True
        Whether to include the XML declaration at start of document.
    pretty_print : bool, default True
        Whether output should be pretty printed with indentation and
        line breaks.
    parser : {'lxml','etree'}, default 'lxml'
        Parser module to use for building of tree. Only 'lxml' and
        'etree' are supported. With 'lxml', the ability to use XSLT
        stylesheet is supported.
    stylesheet : str, path object or file-like object, optional
        A URL, file-like object, or a raw string containing an XSLT
        script used to transform the raw XML output. Script should use
        layout of elements and attributes from original output. This
        argument requires “lxml“ to be installed. Only XSLT 1.0
        scripts and not later versions is currently supported.
    compression : {'infer', 'gzip', 'bz2', 'zip', 'xz', None}, default 'infer'
        For on-the-fly decompression of on-disk data. If 'infer', then use
        gzip, bz2, zip or xz if path_or_buffer is a string ending in
        '.gz', '.bz2', '.zip', or 'xz', respectively, and no decompression
        otherwise. If using 'zip', the ZIP file must contain only one data
        file to be read in. Set to None for no decompression.
    storage_options : dict, optional
        Extra options that make sense for a particular storage connection, e.g.
        host, port, username, password, etc. For HTTP(S) URLs the key-value pairs
        are forwarded to “urllib“ as header options. For other URLs (e.g.
        starting with "s3://", and "gcs://") the key-value pairs are forwarded to
        “fsspec“. Please see “fsspec“ and “urllib“ for more details.
    
    Returns
    ——-
    None or str
        If “io“ is None, returns the resulting XML format as a
        string. Otherwise returns None.
    
    See Also
    ——–
    to_json : Convert the pandas object to a JSON string.
    to_html : Convert DataFrame to a html.
    
    Examples
    ——–
    >>> df = pd.DataFrame({'shape': ['square', 'circle', 'triangle'],
    …                    'degrees': [360, 360, 180],
    …                    'sides': [4, np.nan, 3]})
    
    >>> df.to_xml()  # doctest: +SKIP
    <?xml version='1.0' encoding='utf-8'?>
    <data>
      <row>
        <index>0</index>
        <shape>square</shape>
        <degrees>360</degrees>
        <sides>4.0</sides>
      </row>
      <row>
        <index>1</index>
        <shape>circle</shape>
        <degrees>360</degrees>
        <sides/>
      </row>
      <row>
        <index>2</index>
        <shape>triangle</shape>
        <degrees>180</degrees>
        <sides>3.0</sides>
      </row>
    </data>
    
    >>> df.to_xml(attr_cols=[
    …           'index', 'shape', 'degrees', 'sides'
    …           ])  # doctest: +SKIP
    <?xml version='1.0' encoding='utf-8'?>
    <data>
      <row index="0" shape="square" degrees="360" sides="4.0"/>
      <row index="1" shape="circle" degrees="360"/>
      <row index="2" shape="triangle" degrees="180" sides="3.0"/>
    </data>
    
    >>> df.to_xml(namespaces={"doc": "https://example.com"},
    …           prefix="doc")  # doctest: +SKIP
    <?xml version='1.0' encoding='utf-8'?>
    <doc:data xmlns:doc="https://example.com">
      <doc:row>
        <doc:index>0</doc:index>
        <doc:shape>square</doc:shape>
        <doc:degrees>360</doc:degrees>
        <doc:sides>4.0</doc:sides>
      </doc:row>
      <doc:row>
        <doc:index>1</doc:index>
        <doc:shape>circle</doc:shape>
        <doc:degrees>360</doc:degrees>
        <doc:sides/>
      </doc:row>
      <doc:row>
        <doc:index>2</doc:index>
        <doc:shape>triangle</doc:shape>
        <doc:degrees>180</doc:degrees>
        <doc:sides>3.0</doc:sides>
      </doc:row>
    </doc:data>

df.to_excel

与pd.read_excel()对应,找这个pd.DataFrame.to_excel()扩展学习:

 参数表及对应默认值:

to_excel(
	excel_writer,
	sheet_name: 'str' = 'Sheet1',
	na_rep: 'str' = '',
	float_format: 'Optional[str]' = None,
	columns=None,
	header=True,
	index=True,
	index_label=None,
	startrow=0,
	startcol=0,
	engine=None,
	merge_cells=True,
	encoding=None,
	inf_rep='inf',
	verbose=True,
	freeze_panes=None,
	storage_options: 'StorageOptions' = None)

注意点:

要将单个对象写入Excel的.xlsx文件,只需指定目标文件名。但要写入多个工作表,需要创建一个具有目标文件名的“ExcelWriter”对象,并在文件中指定要写入的工作表。

通过指定唯一的“sheet_name”,可以写入多个工作表。

将所有数据写入文件后,需要保存更改。

请注意,使用已存在的文件名创建“ExcelWriter”对象将导致删除现有文件的内容。

参数详解:

01. excel_writer

excel_writer:path like、file like或ExcelWriter对象文件路径或现有ExcelWriter。

02. sheet_name: 'str' = 'Sheet1'

sheet_name:str,默认为“Sheet1”

将包含DataFrame的工作表的名称。

03. na_rep: 'str' = ''

na_rep:str,默认“”

缺少数据表示。

04. float_format: 'Optional[str]' = None

float_format:str,可选

设置浮点数的字符串格式。例如

“float_format=“%.2f”“将设置0.1234到0.12的格式。

05. columns=None

str的序列或列表,可选

要写入的列。

06. header=True

header:bool或str列表,默认为True

写出列名。如果给定了字符串列表,则假定为列名的别名。

07. index=True

index:bool,默认为True

写入行名称(索引)。

08. index_label=None

index_label:str或sequence,可选

索引列的列标签(如果需要)。如果未指定,并且“header”和“index”为True,则使用索引名称。如果DataFrame使用MultiIndex,则应给出序列。

09. startrow=0

startrow:int,默认值0

要转储数据帧的左上单元格行。

10. startcol=0

startcol:int,默认值0

要转储数据帧的左上角单元格列。

11. engine=None

引擎:str,可选

要使用的写入引擎“openpyxl”或“xlsxwriter”。您也可以通过选项`io.exel.xlsx.writer“、`io.excel.xls.writer“和`io.exex.xlsm.writer`设置此选项。

..已弃用::1.2.0

作为`xlwt<https://pypi.org/project/xlwt/>`__包不再维护,“xlwt”引擎将在未来版本的panda中删除。

merge_cells=True

merge_cells:bool,默认为True

将多索引和分层行写入合并单元格。

12. encoding=None

编码:str,可选

生成的excel文件的编码。只有xlwt才需要,其他编写器本机支持unicode。

13. inf_rep='inf'

inf_rep:str,默认为“inf”

无限表示(Excel中没有无限的原生表示)。

14. verbose=True

verbose:bool,默认为True

在错误日志中显示更多信息。

15. freeze_panes=None

冷冻路径:整数元组(长度2),可选

指定要冻结的最底行和最右列。

16. storage_options: 'StorageOptions' = None

storage_options:dict,可选

对特定存储连接有意义的额外选项,例如。

主机、端口、用户名、密码等,如果使用将由`fsspec“解析的URL,例如,开始“s3://”、“gcs://”。如果使用非fsspec URL提供此参数,将引发错误。

请参阅fsspec和后端存储实现文档,以获取一组允许的键和值。

   待续……

文章出处登录后可见!

已经登录?立即刷新

共计人评分,平均

到目前为止还没有投票!成为第一位评论此文章。

(0)
心中带点小风骚的头像心中带点小风骚普通用户
上一篇 2023年3月5日 下午1:00
下一篇 2023年3月5日 下午1:01

相关推荐