python 2.7 - How to store data crawled in Scrapy in a specific order? -
I have crawled info from a web page and I want it stored in the specific order of the fields declared in my Item class, saved to a CSV file. The problem is that the CSV file does not keep that order: Scrapy scrapes the field info but writes it to the CSV file in a different order than the one declared in the Item class. I am a newbie in Python. Can anyone tell me how to fix this?
For example, my Item class is:

    class DmozItem(Item):
        title = Field()
        link = Field()
        desc = Field()
But when the info is stored in the CSV file, it stores desc first, then link, then title:

    {"desc": [], "link": ["/computers/programming/"], "title": ["programming"]}
The reason the order of the info in the CSV file is not the declared order is that the item stores its info as a dict, and the order of keys in a dict here ends up alphabetical rather than matching the declaration. The logic that exports items to a CSV file is implemented in

    scrapy/contrib/exporter/__init__.py
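The effect described above can be reproduced with a plain dict, without Scrapy at all. This is a minimal sketch: the item is represented as an ordinary dict, and an explicit field list (an assumption standing in for the Item class declaration order) is used to restore the intended ordering:

```python
# A scraped item represented as a plain dict (stand-in for a Scrapy Item).
item = {"desc": [], "link": ["/computers/programming/"], "title": ["programming"]}

# An exporter that sorts keys emits them alphabetically, not in declaration order:
print(sorted(item))  # ['desc', 'link', 'title']

# An explicit field list (hypothetical, mirroring the Item class declaration)
# restores the intended order:
field_order = ["title", "link", "desc"]
rows = [(name, item[name]) for name in field_order]
print(rows)  # [('title', ['programming']), ('link', ['/computers/programming/']), ('desc', [])]
```

The same idea, an explicit ordered list of field names, is what the override below relies on.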
You can override the _get_serialized_fields method of BaseItemExporter to yield the key-value pairs in declaration order. Here is an illustration:
    field_iter = ['title', 'link', 'desc']
    for field_name in field_iter:
        if field_name in item:
            field = item.fields[field_name]
            value = self.serialize_field(field, field_name, item[field_name])
        else:
            value = default_value
        yield field_name, value
But remember, this is not a universal solution.
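To show the override in a runnable form without installing Scrapy, here is a minimal stand-in that mimics the relevant part of the exporter API. The base class below is a sketch, not the real scrapy BaseItemExporter; only the method names and the yield pattern follow the snippet above:

```python
class BaseItemExporter(object):
    """Minimal stand-in for Scrapy's BaseItemExporter (sketch, not the real class)."""

    def serialize_field(self, field, name, value):
        # The real exporter may apply per-field serializers; here we pass through.
        return value


class OrderedExporter(BaseItemExporter):
    # Hypothetical explicit field order, mirroring the Item class declaration.
    fields_to_export = ['title', 'link', 'desc']

    def _get_serialized_fields(self, item, default_value=None):
        # Yield key-value pairs in declaration order instead of dict order.
        for field_name in self.fields_to_export:
            if field_name in item:
                value = self.serialize_field({}, field_name, item[field_name])
            else:
                value = default_value
            yield field_name, value


exporter = OrderedExporter()
item = {"desc": [], "link": ["/computers/programming/"], "title": ["programming"]}
print(list(exporter._get_serialized_fields(item)))
# [('title', ['programming']), ('link', ['/computers/programming/']), ('desc', [])]
```

In current Scrapy versions the same effect is usually achieved by setting fields_to_export on the exporter or the FEED_EXPORT_FIELDS setting, rather than overriding the method by hand.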
python-2.7 scrapy web-crawler