Thursday, 15 September 2011

python - Print all elements in <li> - BeautifulSoup

I am web scraping with Python and BeautifulSoup.

I need to scrape this:

    <li class="review-rating">
      <h5 class="review-rating__title">location:</h5>
      <span class="review-rating__score">5</span>
      <h5 class="review-rating__title">value:</h5>
      <span class="review-rating__score">3</span>
      <h5 class="review-rating__title">facilities:</h5>
      <span class="review-rating__score">4</span>
      <h5 class="review-rating__title">service:</h5>
      <span class="review-rating__score">4</span>
      <h5 class="review-rating__title">cleanliness:</h5>
      <span class="review-rating__score">5</span>
    </li>

I have scraped the markup with this code:

    for scores_of_this_customer in tt.select('li.review-rating'):
        print(scores_of_this_customer.select('h5.review-rating__title')[0].text
              + " "
              + scores_of_this_customer.select('span.review-rating__score')[0].text)

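But it only prints "location: 5", because [0] selects just the first title and the first score.

I want a way to print all the scores using a loop.

I know I can print the other scores by indexing them with [1], [2], and so on, but I don't want to write 5 print statements.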

PS: this code worked for me.

    # find() returns the first matching <li>, or None if there is none
    rating = tt.find("li", {"class": "review-rating"})
    if rating:
        keys = rating.find_all("h5", {"class": "review-rating__title"})
        values = rating.find_all("span", {"class": "review-rating__score"})
        # The titles already end with a colon, so a space is enough as separator
        for key, value in zip(keys, values):
            print(key.text + " " + value.text)

I believe it is possible to access them directly. Try this:

    import urllib.request  # Python 3; in Python 2 this was urllib.urlopen
    import bs4

    url = "http://yoururl.com"
    html = urllib.request.urlopen(url).read()
    soup = bs4.BeautifulSoup(html, "html.parser")

If you only need the results inside <li class="review-rating">, you can uncomment the next part:

    # soup = soup.find("li", {"class": "review-rating"})

Then the next part should go through the key/value combinations nicely:

    keys = soup.find_all("h5", {"class": "review-rating__title"})
    values = soup.find_all("span", {"class": "review-rating__score"})
    # The titles already end with a colon, so a space is enough as separator
    for key, value in zip(keys, values):
        print(key.text + " " + value.text)
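The pairing idea can be tried end to end on the markup from the question, without fetching any page. This is only a minimal sketch: the names SAMPLE_HTML and extract_scores are made up for illustration, and it assumes every title is followed by exactly one score inside the same <li>, so that zip lines them up correctly.

```python
from bs4 import BeautifulSoup

# Markup copied from the question (hypothetical constant name).
SAMPLE_HTML = """
<li class="review-rating">
  <h5 class="review-rating__title">location:</h5>
  <span class="review-rating__score">5</span>
  <h5 class="review-rating__title">value:</h5>
  <span class="review-rating__score">3</span>
  <h5 class="review-rating__title">facilities:</h5>
  <span class="review-rating__score">4</span>
  <h5 class="review-rating__title">service:</h5>
  <span class="review-rating__score">4</span>
  <h5 class="review-rating__title">cleanliness:</h5>
  <span class="review-rating__score">5</span>
</li>
"""


def extract_scores(html):
    """Return (title, score) text pairs for every li.review-rating block."""
    soup = BeautifulSoup(html, "html.parser")
    results = []
    # One iteration per customer review block
    for rating in soup.select("li.review-rating"):
        titles = rating.select("h5.review-rating__title")
        scores = rating.select("span.review-rating__score")
        # zip pairs the n-th title with the n-th score
        for title, score in zip(titles, scores):
            results.append((title.get_text(strip=True),
                            score.get_text(strip=True)))
    return results


for title, score in extract_scores(SAMPLE_HTML):
    print(title, score)
```

Searching inside each <li> rather than the whole page keeps the titles and scores of different customers from being mixed together when a page holds several reviews.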


python python-3.x beautifulsoup
