Create columns from one column that contains lists

xiaoxingxing nlp 205

Original title: Create columns from 1 column that contain lists

I have a pandas DataFrame that looks like this:

text         categories
First text   ['Tech', 'Business']
Second text  ['Women', 'India', 'Tech']

I want to convert it into:

text         Business  India  Tech  Women
First text   1         0      1     0
Second text  0         1      1     1

I have 200 unique categories, so doing this by hand is not an option. Can anyone help?

Original link: https://stackoverflow.com//questions/71448722/create-columns-from-1-column-that-contain-lists

Replies

  • Inde7 commented:
    import ast
    
    import pandas as pd
    
    # Create a df similar to yours
    
    data = [['First text', "['Tech', 'Business']"], ['Second text', "['Women', 'India', 'Tech']"]]
    df = pd.DataFrame(data, columns=['text', 'categories'])
    
    # Parse the string representations into real lists
    # (ast.literal_eval only accepts literals, so it is safer than eval)
    
    df["categories"] = df["categories"].apply(ast.literal_eval)
    
    # New df with one zero-filled column per category found in the data,
    # so the 200 category names never have to be typed out by hand
    
    categories = sorted({c for cats in df["categories"] for c in cats})
    new = df[['text']].copy()
    new[categories] = 0
    
    # Loop over the rows and set 1 for every category the row contains
    
    for count, value in enumerate(df["categories"]):
        for j in value:
            new.iloc[count, new.columns.get_loc(j)] = 1
    
    print(new.head())
    

    Not the prettiest code, but it should work.

    2 years ago, 0 comments
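
Since the question mentions 200 unique categories, a loop-free variant may be easier to maintain. The following is only a sketch, not part of the answer above. It assumes the lists are stored as strings (as in the answer's sample data) and that no category name contains the `|` character, which is used here as a temporary separator. It relies on the pandas string methods `Series.str.join` and `Series.str.get_dummies`.

```python
import ast

import pandas as pd

# Same toy data as in the question; the lists are stored as strings here.
df = pd.DataFrame(
    {
        "text": ["First text", "Second text"],
        "categories": ["['Tech', 'Business']", "['Women', 'India', 'Tech']"],
    }
)

# Parse each string into a real list without executing arbitrary code.
parsed = df["categories"].apply(ast.literal_eval)

# Join each list into a string such as "Tech|Business", then let pandas
# build one indicator column per distinct label.
# Assumption: no category name contains "|".
indicators = parsed.str.join("|").str.get_dummies(sep="|")

# Put the original text column next to the indicator columns.
result = pd.concat([df[["text"]], indicators], axis=1)
print(result)
```

If the column already holds real lists rather than strings, the `ast.literal_eval` step can be dropped. The indicator columns come out in alphabetical order, which matches the layout asked for in the question.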