How to exclude sentences from spaCy results if they contain a token with a specific dep_?

扎眼的阳光 · 2 years ago · nlp · 486 views

Original title: How to exclude sentences from Spacy results if it contains a token with a specific dep_?

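I want to apply a negative filter to spaCy results. Specifically, I want the sentences whose dependency parse contains 'pobj' but not 'dobj'. However, because a sentence with 'dobj' will often also contain 'pobj' (but not the other way round), spaCy also lists the sentences that have 'dobj'.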

For example:

"He pushed the book off the shelf":

He nsubj
pushed ROOT
the det
book dobj
off prep
the det
shelf pobj

"The book fell off the table":

The det
book nsubj
fell ROOT
off prep
the det
table pobj

In both sentences, prep is the direct head of pobj, so the following:

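import spacy
nlp = spacy.load('en_core_web_sm')  # assumes the small English model is installed
# Note the space after the first period, so the two sentences are split apart.
doc = nlp('He pushed the book off the shelf. The book fell off the table.')
for t in doc:
    if t.dep_ == 'pobj':
        print(t.sent)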

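This returns both sentences. How can I filter correctly so that sentences containing both 'dobj' and 'pobj' are not listed, and only sentences that contain 'pobj' without 'dobj' are listed?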

Original link: https://stackoverflow.com//questions/71683715/how-to-exclude-sentences-from-spacy-results-if-it-contains-a-token-with-a-specif


Fatih Bozdağ commented:

After many attempts, I found the following solution:

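pattern2_sents = []  # sentences that contain a 'dobj'
pattern4_sents = []  # sentences that contain a 'pobj' but no 'dobj'
for sent in doc.sents:
    # Structural test: an ADP 'prep' whose head is a VERB that has an 'nsubj' child.
    has_verb_prep = any(
        a.dep_ == "prep" and a.pos_ == "ADP" and a.head.pos_ == "VERB"
        and any(b.dep_ == "nsubj" for b in a.head.children)
        for a in sent
    )
    if not has_verb_prep:
        continue
    deps = {c.dep_ for c in sent}  # every dependency label in this sentence
    if "dobj" in deps:
        pattern2_sents.append(sent)
    elif "pobj" in deps:
        pattern4_sents.append(sent)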

However, I am not sure why a simple iteration with if token.dep_ != 'dobj' did not work in the original question.
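A likely explanation: the check t.dep_ != 'dobj' is evaluated one token at a time, and most tokens in any sentence are not 'dobj', so the test passes somewhere in every sentence. The condition has to be asked of the sentence as a whole. Below is a minimal sketch of that idea, not the original poster's code. The function name sentences_with_only and its parameters are hypothetical, and demo() assumes spaCy and the en_core_web_sm model are installed.

```python
def sentences_with_only(doc, required="pobj", excluded="dobj"):
    """Return sentences that contain `required` but no `excluded` label.

    `doc` is expected to behave like a spaCy Doc: it exposes `.sents`,
    and every token in a sentence exposes `.dep_`.
    """
    kept = []
    for sent in doc.sents:
        # Collect every dependency label in the sentence before deciding,
        # instead of testing tokens one at a time.
        labels = {token.dep_ for token in sent}
        if required in labels and excluded not in labels:
            kept.append(sent)
    return kept


def demo():
    # Assumption: spaCy and the en_core_web_sm model are installed.
    import spacy

    nlp = spacy.load("en_core_web_sm")
    doc = nlp("He pushed the book off the shelf. The book fell off the table.")
    for sent in sentences_with_only(doc):
        print(sent.text)
```

Calling demo() prints the sentences that pass the filter. Which sentences those are depends on the parse the loaded model produces, so the labels are worth checking against the listings in the question.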

2 years ago, 0 comments
