Python - NLTK NER word extraction
I have checked the previous related threads, but they did not solve my issue. I have written code to get the named entities (NER) from a text.
    import nltk

    text = "Stallone Jason's film Rocky was inducted into the National Film Registry as well as having its film props placed in the Smithsonian Museum."
    tokenized = nltk.word_tokenize(text)
    tagged = nltk.pos_tag(tokenized)
    namedent = nltk.ne_chunk(tagged, binary=True)
    print(namedent)
    namedent = nltk.ne_chunk(tagged, binary=False)
which gives this sort of result:
    (S (NE Stallone/NNP) Jason/NN 's/POS film/NN (NE Rocky/NNP) was/VBD inducted/VBN into/IN the/DT (NE National/NNP Film/NNP Registry/NNP) as/IN well/RB as/IN having/VBG its/PRP$ film/NN props/NNS placed/VBN in/IN the/DT (NE Smithsonian/NNP Museum/NNP) ./.)
while I expect only the NEs as a result, like:
    Stallone Rocky National Film Registry Smithsonian Museum
How can I accomplish this?
Update
    result = ' '.join([y[0] for y in x.leaves()]) for x in namedent.subtrees() if x.node == "NE"
    print(result)
This gives a syntax error. What is the right way to write this?
Update 2
    text = "Stallone Jason's film Rocky was inducted into the National Film Registry as well as having its film props placed in the Smithsonian Museum."
    tokenized = nltk.word_tokenize(text)
    tagged = nltk.pos_tag(tokenized)
    namedent = nltk.ne_chunk(tagged, binary=True)
    print(namedent)
    np = [' '.join([y[0] for y in x.leaves()]) for x in namedent.subtrees() if x.node == "NE"]
    print(np)
Error:
    np = [' '.join([y[0] for y in x.leaves()]) for x in namedent.subtrees() if x.node == "NE"]
      File "/usr/local/lib/python2.7/dist-packages/nltk/tree.py", line 198, in _get_node
        raise NotImplementedError("Use label() to access a node label.")
    NotImplementedError: Use label() to access a node label.
So I tried
    np = [' '.join([y[0] for y in x.leaves()]) for x in namedent.subtrees() if x.label() == "NE"]
which gives an empty result.
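One way to narrow down an empty result is to look at which labels the tree actually contains. The following is only a minimal sketch, assuming NLTK 3 (where label() replaces .node). The tree is built by hand as a stand-in for the ne_chunk output, so it is an assumption for illustration and runs without any NLTK models:

```python
from nltk import Tree

# Hand-built stand-in for a chunk tree (an assumption for illustration,
# not real ne_chunk output).
namedent = Tree('S', [
    Tree('NE', [('Stallone', 'NNP')]),
    ('Jason', 'NN'),
    Tree('NE', [('Rocky', 'NNP')]),
])

# Collect every distinct label in the tree, including the root 'S'.
labels = sorted(set(t.label() for t in namedent.subtrees()))
print(labels)  # ['NE', 'S']
```

If the string used in the filter (exact spelling and case) is not in this list, the comprehension matches nothing and returns an empty list.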
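The namedent returned is actually a Tree object, which is a subclass of list. You can then parse it as follows (on NLTK versions before 3.0, use x.node instead of x.label()):
    [' '.join([y[0] for y in x.leaves()]) for x in namedent.subtrees() if x.label() == "NE"]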
Output:
    ['Stallone', 'Rocky', 'National Film Registry', 'Smithsonian Museum']
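The extraction above can be tried without downloading any NLTK models by building the tree by hand. This is a rough sketch rather than the exact chunker output: the tree is a shortened manual copy of the one printed in the question, and extract_entities is a hypothetical helper name introduced here for illustration.

```python
from nltk import Tree

# Shortened, hand-built copy of the binary=True tree from the question
# (built manually so that no NLTK models are needed).
namedent = Tree('S', [
    Tree('NE', [('Stallone', 'NNP')]),
    ('Jason', 'NN'), ("'s", 'POS'), ('film', 'NN'),
    Tree('NE', [('Rocky', 'NNP')]),
    ('was', 'VBD'), ('inducted', 'VBN'), ('into', 'IN'), ('the', 'DT'),
    Tree('NE', [('National', 'NNP'), ('Film', 'NNP'), ('Registry', 'NNP')]),
    ('in', 'IN'), ('the', 'DT'),
    Tree('NE', [('Smithsonian', 'NNP'), ('Museum', 'NNP')]),
    ('.', '.'),
])


def extract_entities(tree, label='NE'):
    """Join the words of every subtree whose label equals `label`."""
    return [' '.join(word for word, tag in sub.leaves())
            for sub in tree.subtrees()
            if sub.label() == label]


entities = extract_entities(namedent)
print(entities)
# ['Stallone', 'Rocky', 'National Film Registry', 'Smithsonian Museum']
```

Each leaf is a (word, tag) pair, so the inner generator keeps only the word and the join rebuilds multi-word entities.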
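The binary flag, when set to True, indicates only whether a subtree is an NE or not, which is what we need above. When set to False, it gives more information, such as whether the NE is an organization, a person, etc. For some reason, the results with the flag on and off don't seem to agree with one another.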
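With binary=False the subtrees carry type labels such as PERSON or ORGANIZATION instead of NE, so filtering on "NE" would find nothing. A small sketch of keeping the type alongside the text follows. The tree is hand-built, and the labels assigned to each phrase are assumed for illustration, not taken from real chunker output:

```python
from nltk import Tree

# Hand-built example with typed labels. Which label the chunker really
# assigns to each phrase is an assumption here, not actual output.
typed = Tree('S', [
    Tree('PERSON', [('Stallone', 'NNP')]),
    ('visited', 'VBD'), ('the', 'DT'),
    Tree('ORGANIZATION', [('Smithsonian', 'NNP'), ('Museum', 'NNP')]),
])

# Every subtree except the root sentence 'S' is an entity chunk;
# keep its type together with the joined words.
pairs = [(sub.label(), ' '.join(word for word, tag in sub.leaves()))
         for sub in typed.subtrees()
         if sub.label() != 'S']
print(pairs)
# [('PERSON', 'Stallone'), ('ORGANIZATION', 'Smithsonian Museum')]
```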
python regex nlp nltk