json.loads() ValueError: Extra data in Python
I'm trying to read individual values from a JSON feed. Here is an illustration of the feed data:
{ "sendtoken": "token1", "bytes_transferred": 0, "num_retries": 0, "timestamp": 1414395374, "queue_time": 975, "message": "internalerror", "id": "mailerx", "m0": { "binding_group": "domain.com", "recipient_domain": "hotmail.com", "recipient_local": "destination", "sender_domain": "domain.com", "binding": "mail.domain.com", "message_id": "c1/34-54876-d36fa645", "api_credential": "creds", "sender_local": "localstring" }, "rejecting_ip": "145.5.5.5", "type": "alpha", "message_stage": 3 } { "sendtoken": "token2", "bytes_transferred": 0, "num_retries": 0, "timestamp": 1414397568, "queue_time": 538, "message": "internal error, "id": "mailerx", "m0": { "binding_group": "domain.com", "recipient_domain": "hotmail.com", "recipient_local": "destination", "sender_domain": "domain.com", "binding": "mail.domain.com", "message_id": "c1/34-54876-d36fa645", "api_credential": "creds", "sender_local": "localstring" }, "rejecting_ip": "145.5.5.5", "type": "alpha", "message_stage": 3 }
I can't share the actual URL, but the above are the first 2 of 150 results displayed if I run print results before the json.loads() line.
My code:
    import urllib2
    import json

    # url is the feed URL (not shown here)
    results = urllib2.urlopen(url).read()
    jsondata = json.loads(results)

    for row in jsondata:
        print row['sendtoken']
        print row['recipient_domain']
I'd like the output to look like
token1 hotmail.com
for each entry.
I'm getting this error:
ValueError: Extra data: line 2 column 1 - line 133 column 1 (char 583 - 77680)
I'm far from a Python expert, and this is my first time working with JSON. I've spent quite a bit of time looking on Google and Stack Overflow, but I can't find a solution that works for my specific data format.
The problem is that your data don't form a single JSON object, so you can't decode them with json.loads.
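The failure can be reproduced on a tiny input. This is a minimal sketch, assuming Python 3 (where json.JSONDecodeError is a subclass of ValueError); the two-object string is a made-up stand-in, not the real feed:

```python
import json

# Hypothetical stand-in for the feed: two JSON objects back to back.
text = '{"sendtoken": "token1"} {"sendtoken": "token2"}'

try:
    json.loads(text)
    error_message = None
except ValueError as exc:  # json.JSONDecodeError subclasses ValueError
    # json.loads decodes the first object, then complains about the rest.
    error_message = str(exc)

print(error_message)
```

The message begins with "Extra data", the same complaint as in the question: the decoder read one complete object and then found more text after it.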
First, this appears to be a sequence of JSON objects separated by spaces. Since you won't tell us where the data come from, this is just an educated guess; hopefully whatever documentation or coworker or whatever told you about this URL also told you what the format is. Let's assume my educated guess is correct.
The easiest way to parse a stream of JSON objects in Python is to use the raw_decode method. Something like this:*
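    import json

    def parse_json_stream(stream):
        decoder = json.JSONDecoder()
        while stream:
            # raw_decode returns the decoded object and the index where it ended
            obj, idx = decoder.raw_decode(stream)
            yield obj
            # drop the consumed text and any whitespace before the next object
            stream = stream[idx:].lstrip()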
However, there's an error in the second JSON object in the stream. Look at this part:
… "message": "internal error, "id": "mailerx", …
There's a missing " after "internal error. If you fix that, the function above will iterate over the two JSON objects.
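To get the output the question asks for, something like the following sketch should work. It assumes Python 3 (so print is a function) and uses a shortened, hypothetical stand-in for the feed with the quote already fixed. Note that in the sample data recipient_domain sits inside the nested "m0" object, so it has to be read as row["m0"]["recipient_domain"] rather than row["recipient_domain"]:

```python
import json

def parse_json_stream(stream):
    """Yield each JSON object from a string of whitespace-separated objects."""
    decoder = json.JSONDecoder()
    while stream:
        obj, idx = decoder.raw_decode(stream)
        yield obj
        stream = stream[idx:].lstrip()

# Shortened, hypothetical stand-in for the feed, with the missing quote fixed.
feed = (
    '{"sendtoken": "token1", "message": "internalerror", '
    '"m0": {"recipient_domain": "hotmail.com"}} '
    '{"sendtoken": "token2", "message": "internal error", '
    '"m0": {"recipient_domain": "hotmail.com"}}'
)

# recipient_domain is nested under "m0" in the sample data.
rows = [(row["sendtoken"], row["m0"]["recipient_domain"])
        for row in parse_json_stream(feed)]

for token, domain in rows:
    print(token, domain)
```

With the real feed, the string returned by the URL read would take the place of feed.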
Hopefully that error was caused by you trying to manually "copy and paste" the data by rewriting it. If it's in your original source data, you've got a much bigger problem; you probably need to write a "broken JSON" parser from scratch that can heuristically guess at what the data were intended to be. Or, of course, get whoever's generating the source to generate it properly.
* In general, it's more efficient to use the second argument to raw_decode to pass a start index, instead of slicing off a copy of the remainder each time. But raw_decode can't handle leading whitespace. It's a little easier to just slice and strip than to write code that skips over whitespace from the given index, but if the memory and performance costs of those copies matter, you should write the more complicated code.
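One possible shape for that more complicated version is sketched below. It is only an illustration: the name iter_json_objects and the whitespace regex are my own choices, not part of the json module. It passes a start index to raw_decode and uses a compiled pattern to step over whitespace between values, so the input string is never copied:

```python
import json
import re

# Matches a (possibly empty) run of JSON whitespace characters.
_WHITESPACE = re.compile(r"[ \t\n\r]*")

def iter_json_objects(text):
    """Yield each JSON value in text without slicing off the remainder."""
    decoder = json.JSONDecoder()
    # raw_decode cannot start on whitespace, so skip any leading run first.
    pos = _WHITESPACE.match(text, 0).end()
    while pos < len(text):
        # The second argument is the index at which decoding starts;
        # the returned end index is relative to the whole string.
        obj, end = decoder.raw_decode(text, pos)
        yield obj
        # Step over whitespace so the next call starts on a value.
        pos = _WHITESPACE.match(text, end).end()

sample = ' {"a": 1}\n{"b": [2, 3]}   {"c": null} '
objects = list(iter_json_objects(sample))
print(objects)
```

Malformed input still raises ValueError from raw_decode, exactly as in the slicing version; only the copying is avoided.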