fill row with previous row from another column

Problem description:

I have a dataframe grouped by Customer-Equipment, Date and Closing Date. Here is an example:

Customer-Equipment       Date        Closing Date
Customer1 – Equipment A  2023-01-01  2023-01-05
Customer1 – Equipment A  2023-01-02  NaN
Customer1 – Equipment A  2023-01-03  NaN
Customer1 – Equipment A  2023-01-04  NaN
Customer1 – Equipment A  2023-01-05  NaN
Customer1 – Equipment A  2023-01-06  NaN
Customer2 – Equipment H  2023-01-01  2023-01-02
Customer2 – Equipment H  2023-01-02  NaN
Customer2 – Equipment H  2023-01-03  NaN

Within each Customer-Equipment group, I need to fill in the Closing Date on every row up to and including the row where Date equals the closing date. Rows after that should stay empty. The expected result would be:

Customer-Equipment       Date        Closing Date
Customer1 – Equipment A  2023-01-01  2023-01-05
Customer1 – Equipment A  2023-01-02  2023-01-05
Customer1 – Equipment A  2023-01-03  2023-01-05
Customer1 – Equipment A  2023-01-04  2023-01-05
Customer1 – Equipment A  2023-01-05  2023-01-05
Customer1 – Equipment A  2023-01-06  NaN
Customer2 – Equipment H  2023-01-01  2023-01-02
Customer2 – Equipment H  2023-01-02  2023-01-02
Customer2 – Equipment H  2023-01-03  NaN

I have tried code like this:

df['test'] = df.groupby('Customer-Equipment').apply(
lambda x: x['Closing date'] if x['date'] <= x.at[row.index -1 ,'closing date'] else pd.NaT).fillna(method = 'ffill').reset_index(drop=True)

How could this be done in python?

Solution 1 [best answer] [1]

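If your dates are in increasing order within each group, you can forward-fill the closing date per group with groupby.ffill, then mask with where so the value is kept only on rows where it is on or after Date: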

s = df.groupby('Customer-Equipment')['Closing Date'].ffill()
df['Closing Date'] = s.where(s.ge(df['Date']))

Output:

        Customer-Equipment        Date Closing Date
0  Customer1 - Equipment A  2023-01-01   2023-01-05
1  Customer1 - Equipment A  2023-01-02   2023-01-05
2  Customer1 - Equipment A  2023-01-03   2023-01-05
3  Customer1 - Equipment A  2023-01-04   2023-01-05
4  Customer1 - Equipment A  2023-01-05   2023-01-05
5  Customer1 - Equipment A  2023-01-06          NaN
6  Customer2 - Equipment H  2023-01-01   2023-01-02
7  Customer2 - Equipment H  2023-01-02   2023-01-02
8  Customer2 - Equipment H  2023-01-03          NaN
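
For anyone who wants to run the approach above end to end, here is a minimal sketch. The sample frame is rebuilt from the question's table, and it assumes both date columns are parsed as datetimes (the question does not say how they are stored), so the unfilled rows show up as NaT rather than NaN. The `dates_sorted` check is an addition of this sketch, used only to confirm the "increasing order" precondition.

```python
import pandas as pd

# Sample data mirroring the question (column names follow the answer's code).
df = pd.DataFrame({
    "Customer-Equipment": ["Customer1 - Equipment A"] * 6 + ["Customer2 - Equipment H"] * 3,
    "Date": pd.to_datetime([
        "2023-01-01", "2023-01-02", "2023-01-03", "2023-01-04", "2023-01-05", "2023-01-06",
        "2023-01-01", "2023-01-02", "2023-01-03",
    ]),
    # Only the first row of each group carries a closing date; None becomes NaT.
    "Closing Date": pd.to_datetime(["2023-01-05"] + [None] * 5 + ["2023-01-02"] + [None] * 2),
})

# Precondition check (added for this sketch): dates increase within each group.
dates_sorted = (
    df.groupby("Customer-Equipment")["Date"]
    .apply(lambda s: s.is_monotonic_increasing)
    .all()
)

# Step 1: forward-fill the closing date within each group.
filled = df.groupby("Customer-Equipment")["Closing Date"].ffill()

# Step 2: keep the filled value only where it is on or after the row's Date.
df["Closing Date"] = filled.where(filled.ge(df["Date"]))

print(df)
```

Comparisons against NaT evaluate to False, so rows that never received a closing date stay empty after the `where` step.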

Solution 2 [2]

Try it like this:

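import pandas as pd


# Sample data from the question: only the first row of each group has a closing date
data = {
    'Customer-Equipment': ['Customer1 - Equipment A'] * 6 + ['Customer2 - Equipment H'] * 3,
    'Date': pd.to_datetime(['2023-01-01', '2023-01-02', '2023-01-03', '2023-01-04', '2023-01-05', '2023-01-06',
                            '2023-01-01', '2023-01-02', '2023-01-03']),
    'Closing Date': [pd.to_datetime('2023-01-05')] + [pd.NaT] * 5 + [pd.to_datetime('2023-01-02'), pd.NaT, pd.NaT]
}

df = pd.DataFrame(data)

def fill_closing_dates(group):
    # Work on a copy so the group passed in by pandas is not modified in place
    group = group.copy()
    # The closing date is taken from the first row of the group
    closing_date = group['Closing Date'].iloc[0]
    # Keep it only on rows dated on or before the closing date
    group['Closing Date'] = group['Date'].apply(lambda x: closing_date if x <= closing_date else pd.NaT)
    return group

# group_keys=False keeps the original row index instead of adding the group key as an index level
df = df.groupby('Customer-Equipment', group_keys=False).apply(fill_closing_dates)

print(df)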
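
The same idea can also be written without a per-group Python function. The following is a sketch of a variant, not the answerer's code: it broadcasts each group's closing date with `transform("first")` and masks it with `where`. It assumes datetime columns and the same sample data as the question. Note that `"first"` picks the first non-null value in each group, which matches `iloc[0]` only when the closing date sits on the first row, as it does here.

```python
import pandas as pd

# Sample data mirroring the question's table.
df = pd.DataFrame({
    "Customer-Equipment": ["Customer1 - Equipment A"] * 6 + ["Customer2 - Equipment H"] * 3,
    "Date": pd.to_datetime([
        "2023-01-01", "2023-01-02", "2023-01-03", "2023-01-04", "2023-01-05", "2023-01-06",
        "2023-01-01", "2023-01-02", "2023-01-03",
    ]),
    "Closing Date": pd.to_datetime(["2023-01-05"] + [None] * 5 + ["2023-01-02"] + [None] * 2),
})

# Broadcast each group's first non-null closing date to every row of that group.
first_close = df.groupby("Customer-Equipment")["Closing Date"].transform("first")

# Keep the closing date only on rows dated on or before it; other rows become NaT.
df["Closing Date"] = first_close.where(df["Date"] <= first_close)

print(df)
```

Because this variant does not rely on forward-filling, it does not depend on the rows being sorted by Date.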

References:

Copyright Notice: This article follows StackOverflow’s copyright notice requirements and is licensed under CC BY-SA 3.0.

Article Source: StackOverflow

[1] mozway

[2] Mahboob Nur
