Monday, 15 March 2010

Python Selenium screen capture not getting whole page




I am trying to create a generic web crawler that will go to a site and take a screenshot. I am using Python, Selenium, and PhantomJS. The problem is that the screenshot does not capture all the images on the page. For example, if I go to YouTube, it doesn't capture the images below the main page image. (I don't have enough rep to post a screenshot.) I think this may have to do with dynamic content, but I have tried the wait functions such as implicitly_wait and the set_page_load_timeout method. Because this is a generic crawler, I can't wait for a specific event (I want to crawl hundreds of sites).

Is it possible to create a generic web crawler that can do the screen capture I am trying to do? The code I am using is:

    from selenium import webdriver

    phantom = webdriver.PhantomJS()
    phantom.set_page_load_timeout(30)
    phantom.get(response.url)
    img = phantom.get_screenshot_as_png()  # PNG image as binary data
    phantom.quit()


Your suggestion solved my problem. I used the following code (stolen in part from a reply to another question):

    from selenium import webdriver

    driver = webdriver.PhantomJS()
    driver.maximize_window()
    driver.get('http://youtube.com')

    # Scroll through the page in small steps so the images get loaded
    scheight = .1
    while scheight < 9.9:
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight/%s);" % scheight)
        scheight += .01

    driver.save_screenshot('screenshot.png')
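For a crawler that visits many sites, the same scroll-before-screenshot idea can be wrapped in a small helper. The following is only a sketch under a few assumptions: the names scroll_offsets and scroll_then_screenshot are made up for illustration, the step count and pause are arbitrary defaults, and the page height is read only once, so a page that keeps growing as it loads may need more passes. It relies only on the driver's execute_script and save_screenshot methods.

```python
import time


def scroll_offsets(page_height, steps):
    """Return evenly spaced scroll positions ending at the page bottom."""
    return [page_height * i // steps for i in range(1, steps + 1)]


def scroll_then_screenshot(driver, path, steps=20, pause=0.2):
    """Scroll down the page in steps, then save a screenshot.

    `driver` is assumed to be a Selenium WebDriver, or any object that
    exposes execute_script() and save_screenshot().
    """
    # Height is measured once, before scrolling (an assumption of this sketch).
    height = driver.execute_script("return document.body.scrollHeight;")
    for y in scroll_offsets(height, steps):
        driver.execute_script("window.scrollTo(0, arguments[0]);", y)
        time.sleep(pause)  # give the page a moment to load new content
    # Return to the top so the capture starts from the page header.
    driver.execute_script("window.scrollTo(0, 0);")
    return driver.save_screenshot(path)
```

It would be called as scroll_then_screenshot(driver, 'screenshot.png') after driver.get(...). Unlike the loop above, it takes a fixed number of evenly spaced steps from top to bottom, which keeps the number of script calls small when crawling hundreds of sites.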

python selenium scrapy phantomjs
