biweeklybudget.screenscraper module

class biweeklybudget.screenscraper.ScreenScraper(savedir='./', screenshot=False)[source]

Bases: object

Base class for screen-scraping bank/financial websites.

do_screenshot()[source]

Take a debug screenshot.

doc_readystate_is_complete(foo)[source]

Return True if the document is ready/complete, False otherwise.

error_screenshot(fname=None)[source]

get_browser(browser_name)[source]

Get a webdriver browser instance.

jquery_finished(foo)[source]

Return True if jQuery.active == 0, False otherwise.

load_cookies(cookie_file)[source]

Load cookies from a JSON cookie file on disk. This file is not in the format used natively by PhantomJS; it is the JSON-serialized representation of the list of cookie dicts returned by selenium.webdriver.remote.webdriver.WebDriver.get_cookies().

Cookies are loaded via selenium.webdriver.remote.webdriver.WebDriver.add_cookie().

Parameters: cookie_file (str) – path to the cookie file on disk

save_cookies(cookie_file)[source]

Save cookies to a JSON cookie file on disk. This file is not in the format used natively by PhantomJS; it is the JSON-serialized representation of the list of cookie dicts returned by selenium.webdriver.remote.webdriver.WebDriver.get_cookies().

Parameters: cookie_file (str) – path to the cookie file on disk

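The cookie round trip described for save_cookies() and load_cookies() can be sketched as below. This is a minimal illustration under stated assumptions, not the module's actual implementation: FakeDriver is a hypothetical stand-in for a Selenium WebDriver so the example runs without a browser, and the two functions take the driver as an explicit argument rather than being methods of ScreenScraper.

```python
import json
import os
import tempfile


class FakeDriver:
    """Hypothetical stand-in for a Selenium WebDriver (illustration only)."""

    def __init__(self, cookies=None):
        self._cookies = list(cookies or [])

    def get_cookies(self):
        # Selenium's get_cookies() returns a list of dicts, one per cookie
        return list(self._cookies)

    def add_cookie(self, cookie):
        self._cookies.append(cookie)


def save_cookies(driver, cookie_file):
    """Write the driver's cookies to cookie_file as JSON."""
    with open(cookie_file, 'w') as fh:
        json.dump(driver.get_cookies(), fh)


def load_cookies(driver, cookie_file):
    """Read cookies from cookie_file and add each one to the driver."""
    with open(cookie_file, 'r') as fh:
        cookies = json.load(fh)
    for cookie in cookies:
        driver.add_cookie(cookie)


# Usage: round-trip one cookie through a temporary file
path = os.path.join(tempfile.mkdtemp(), 'cookies.json')
src = FakeDriver([{'name': 'session', 'value': 'abc123'}])
save_cookies(src, path)

dst = FakeDriver()
load_cookies(dst, path)
```

With a real browser, Selenium generally only accepts a cookie via add_cookie() once the browser has loaded a page on that cookie's domain, so loading cookies usually follows an initial page request.
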
wait_for_ajax_load(timeout=20)[source]

Wait for an AJAX event to finish and trigger a page load, as with the Janrain login form.

Pieced together from http://stackoverflow.com/a/15791319

timeout is in seconds
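The waiting behaviour can be pictured as polling a condition such as jquery_finished() until it holds or the timeout expires. The sketch below only illustrates that idea and is not taken from the module's source: wait_until and FakeDriver are hypothetical names, the fake driver simply reports a falling count of pending requests, and the helper returns False on timeout where Selenium's WebDriverWait would raise TimeoutException.

```python
import time


class FakeDriver:
    """Hypothetical stand-in for a WebDriver; reports pending jQuery requests."""

    def __init__(self, active_counts):
        self._counts = list(active_counts)

    def execute_script(self, script):
        # Pretend one pending request completes per poll; repeat the last value
        if len(self._counts) > 1:
            return self._counts.pop(0)
        return self._counts[0]


def jquery_finished(driver):
    """Return True if jQuery reports no active AJAX requests."""
    return driver.execute_script('return jQuery.active') == 0


def wait_until(driver, condition, timeout=20, poll_interval=0.5):
    """Poll condition(driver) until it is truthy or timeout seconds elapse."""
    deadline = time.monotonic() + timeout
    while True:
        if condition(driver):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval)


# Usage: two requests are pending, then none
driver = FakeDriver([2, 1, 0])
finished = wait_until(driver, jquery_finished, timeout=5, poll_interval=0.01)

# A driver whose request never completes times out instead
stuck = wait_until(FakeDriver([1]), jquery_finished,
                   timeout=0.05, poll_interval=0.01)
```
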

xhr_get_url(url)[source]

Use JS to download a given URL and return its contents.

xhr_post_urlencoded(url, data, headers={})[source]

Use JS to POST URL-encoded data to a given URL and return the response contents.
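One way such a request could be issued from the browser is to generate a small synchronous XMLHttpRequest script and hand it to the driver's execute_script(). The sketch below is an assumption about the general technique, not the module's actual code: build_xhr_post_script is a hypothetical helper, and the URL and form fields are made up for illustration.

```python
import json
from urllib.parse import urlencode


def build_xhr_post_script(url, data, headers=None):
    """Return JavaScript that POSTs URL-encoded data via a synchronous XHR."""
    headers = dict(headers or {})
    headers.setdefault('Content-Type', 'application/x-www-form-urlencoded')
    body = urlencode(data)  # e.g. {'a': '1 2'} becomes 'a=1+2'
    lines = [
        'var xhr = new XMLHttpRequest();',
        # Third argument false makes the request synchronous
        'xhr.open("POST", %s, false);' % json.dumps(url),
    ]
    for name, value in headers.items():
        lines.append('xhr.setRequestHeader(%s, %s);'
                     % (json.dumps(name), json.dumps(value)))
    # json.dumps produces a safely quoted JavaScript string literal
    lines.append('xhr.send(%s);' % json.dumps(body))
    lines.append('return xhr.responseText;')
    return '\n'.join(lines)


# Usage: hypothetical URL and form fields
script = build_xhr_post_script(
    'https://example.com/login',
    {'user': 'alice', 'pin': '12 34'},
)
```

The resulting string could then be passed to a real WebDriver's execute_script(), which returns whatever the script returns. Because the request runs inside the page, it carries the browser session's cookies, which is presumably the reason for going through JS rather than a separate HTTP client.
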