How to get a list of the two words before or after a keyword in Python?

xiaoxingxing 2 years ago nlp 547

Original title: How to get a list of two words before or after a keyword in python?

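I have collected this data, and I am trying to find out whether a keyword appears in each row and, if it does, which two words come before and after it.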

data = pd.read_csv('jobs.csv')

print(data)

Job	Description
Engineer	the job requires x,y,z…..
Driver	this job need a high-school and Communication skills

The data has about 10k rows.

For example, for the keyword "Communication", can I find the words before and after it, so that the result looks like this?

Job	Description	after	before
Engineer	the job requires x,y,z	NA	NA
Driver	this job need a high-school and Communication skills	skills	high-school, and

NA, because the keyword is not present in that row.

I tried pandas and regular expressions, but nothing worked for me :/

I would really appreciate your help.

Original link: https://stackoverflow.com//questions/71676195/how-to-get-a-list-of-two-words-before-or-after-a-keyword-in-python


Stef commented

You can use Series.map to derive one column from another by applying a function to each element.

If an element is a list of words, you can use list.index to find the position of the keyword, then a list slice such as sentence[i-2:i] to get the two words before that index.
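One detail of that slice is worth a small worked example: when the keyword is the second word, i-2 is negative and Python counts it from the end of the list, so the slice comes back empty. The sketch below shows the effect on a plain list and one possible guard; the helper name words_before and the sample sentence are made up for illustration.

```python
def words_before(words, keyword, n=2):
    """Return up to n words preceding the first occurrence of keyword."""
    i = words.index(keyword)          # raises ValueError if keyword is absent
    return words[max(i - n, 0):i]     # clamp so the start never goes negative

# Hypothetical sentence in which the keyword is the second word.
words = "good Communication skills are needed".split()
i = words.index("Communication")      # i == 1

naive = words[i - 2:i]                # words[-1:1] is empty, so "good" is lost
clamped = words_before(words, "Communication")
```

Here naive is an empty list while clamped holds the single preceding word. No guard is needed on the other side, because a slice that runs past the end of a list just returns fewer items.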

import pandas as pd

data = pd.DataFrame({
    'Job': ['Engineer', 'Driver'],
    'Description': ['the job requires x,y,z', 'this job need a high-school and Communication skills']
})

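def get_two_words_before(sentence, word):
    # Split on whitespace and look for an exact, case-sensitive match.
    words = sentence.split()
    if word in words:
        i = words.index(word)
        # Clamp the start at 0: a negative start would wrap around and
        # return [] when the keyword is the second word.
        return words[max(i - 2, 0):i]
    return []

def get_two_words_after(sentence, word):
    words = sentence.split()
    if word in words:
        i = words.index(word)
        # Slicing past the end is safe; it just returns fewer words.
        return words[i + 1:i + 3]
    return []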

data['before'] = data['Description'].map(lambda x: get_two_words_before(x, 'Communication'))

data['after'] = data['Description'].map(lambda x: get_two_words_after(x, 'Communication'))

print(data)

Output:

        Job             Description              before     after
0  Engineer  the job requires x,y,z                  []        []
1    Driver  this job need a hig...  [high-school, and]  [skills]
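The output above uses empty lists where the question asked for NA, and the match is exact, so "communication" or "Communication," would be missed. A minimal sketch of one way to get closer to the requested table follows. It is a variant under stated assumptions, not part of the original answer: the function name context_words, the tokenisation rule, and the use of None for the missing case are all choices made for illustration.

```python
import re

def context_words(text, keyword, n=2):
    """Return (before, after): up to n words on each side of keyword.

    Matching is case-insensitive and ignores surrounding punctuation.
    Both values are None when the keyword does not occur.
    """
    # Assumed tokenisation: a word is a run of letters, digits, underscores
    # or hyphens, so "high-school" stays whole and "x,y,z" splits in three.
    tokens = re.findall(r"[\w-]+", text)
    lowered = [t.lower() for t in tokens]
    key = keyword.lower()
    if key not in lowered:
        return None, None
    i = lowered.index(key)                # first occurrence only
    before = tokens[max(i - n, 0):i]      # clamp to avoid a negative start
    after = tokens[i + 1:i + 1 + n]
    return ", ".join(before), ", ".join(after)

before, after = context_words(
    "this job need a high-school and Communication skills", "communication"
)
missing = context_words("the job requires x,y,z", "Communication")
```

With pandas, the same function could be applied per row through Series.map, for example data['Description'].map(lambda s: context_words(s, 'Communication')[0]) for the before column; pandas treats the None values as missing.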

2 years ago 0 comments
