Accessing buttons on websites where the url doesn't change (Python)

Here is an example website that I am trying to access automatically with Python: http://www.waynecounty.com/sheriff/1359.htm

 

The problem is that the URL doesn't change at all, which suggests the page is running its own program behind the scenes. First, I need to automatically detect and click the Accept button. On the page that follows, I need to find the last name and first name fields and fill them in with raw_input('last name: ') and raw_input('first name: '). Then, to make it even more complex, I need to click the "more info" button for that particular inmate and pull out the text (which also needs to be put in order) so that the inmate's charges and bond information can be sent back to the program.

 

I've tried:

import splinter
import selenium
from splinter import Browser
with Browser() as browser:
	browser.visit('http://www.waynecounty.com/sheriff/1359.htm')
	browser.find_by_name('Accept').click()

Traceback (most recent call last):
  File "<pyshell#14>", line 3, in <module>
    browser.find_by_name('Accept').click()
  File "C:\Python27\lib\site-packages\splinter\element_list.py", line 75, in __getattr__
    self.__class__.__name__, name))
AttributeError: 'ElementList' object has no attribute 'click'

import time

with Browser() as browser:
	browser.visit('http://www.waynecounty.com/sheriff/1359.htm')
	time.sleep(10)
	browser.find_by_name('Accept').click()

	

Traceback (most recent call last):
  File "<pyshell#27>", line 4, in <module>
    browser.find_by_name('Accept').click()
  File "C:\Python27\lib\site-packages\splinter\element_list.py", line 75, in __getattr__
    self.__class__.__name__, name))
AttributeError: 'ElementList' object has no attribute 'click'

from selenium import webdriver

def SearchWayne(url):
	driver = webdriver.PhantomJS()
	driver.set_window_size(1024,768)
	driver.get(url)
	driver.save_screenshot('screen.png')
	sbtn = driver.find_element_by_css_selector('Accept')
	sbtn.click()

SearchWayne('http://www.waynecounty.com/sheriff/1359.htm')

Traceback (most recent call last):
  File "<pyshell#37>", line 1, in <module>
    SearchWayne('http://www.waynecounty.com/sheriff/1359.htm')
  File "<pyshell#36>", line 2, in SearchWayne
    driver = webdriver.PhantomJS()
  File "C:\Python27\lib\site-packages\selenium\webdriver\phantomjs\webdriver.py", line 50, in __init__
    self.service.start()
  File "C:\Python27\lib\site-packages\selenium\webdriver\phantomjs\service.py", line 69, in start
    raise WebDriverException("Unable to start phantomjs with ghostdriver.", e)
WebDriverException: Message: 'Unable to start phantomjs with ghostdriver.' ; Screenshot: available via screen 



No dice.
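
A possible reading of the two splinter tracebacks above: the error is raised from `__getattr__` in `element_list.py`, which suggests that `ElementList` forwards unknown attributes such as `click` to its first element, and that the forwarding fails when `find_by_name('Accept')` matched nothing. The sketch below is a simplified stand-in written for illustration, not splinter's actual source. `ElementListSketch`, `FakeButton`, and `click_if_present` are hypothetical names, and whether the real button is named something other than `Accept` (or sits inside a frame) is an assumption that would have to be checked against the page source.

```python
class ElementDoesNotExist(Exception):
    """Stand-in for the exception raised when a query matched nothing."""


class ElementListSketch(object):
    """Hypothetical, simplified stand-in for a list of matched elements."""

    def __init__(self, elements):
        self._elements = list(elements)

    @property
    def first(self):
        if not self._elements:
            raise ElementDoesNotExist("no elements matched the query")
        return self._elements[0]

    def is_empty(self):
        return len(self._elements) == 0

    def __getattr__(self, name):
        # Unknown attributes (e.g. click) are forwarded to the first match.
        # With an empty list there is no first match, so the caller sees a
        # plain AttributeError that hides the real cause.
        try:
            return getattr(self.first, name)
        except (ElementDoesNotExist, AttributeError):
            raise AttributeError(
                "'%s' object has no attribute '%s'"
                % (self.__class__.__name__, name)
            )


class FakeButton(object):
    """Hypothetical element used only to exercise the sketch."""

    def __init__(self):
        self.clicked = False

    def click(self):
        self.clicked = True


def click_if_present(elements):
    """Click the first match and return True, or return False if none matched."""
    if elements.is_empty():
        return False
    elements.first.click()
    return True
```

If this reading is right, checking whether the result is empty before clicking would at least turn the confusing AttributeError into a clear "nothing matched" signal, which points at the selector rather than at the click itself.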

  • Author

Thank you. I was just talking to my boss about this and he said to try accessing the page's source code. I was worried because the URL doesn't change. We need to scrape the data for all the inmates on that website. On the same topic, I'm going to face another problem: my boss says that the people responsible for this website (http://itasw0aepv01.macombcountymi.gov/jil/faces/InmateSearch.jsp) are a little more attentive to data miners, so I'm wondering whether the same approach applies there.

 

P.S.: Very surprised to see the admin jump in on this one. I'm honored.
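
As a rough illustration of what "accessing the source code" could look like in practice, the sketch below lists the forms in a page's HTML together with the names of their input fields, using only the standard library. The sample markup, the field names `last_name` and `first_name`, and the helper names `FormFieldParser` and `list_forms` are all made up for the example. The real pages may be structured quite differently, so treat this as a starting point for inspection rather than a description of either site.

```python
try:
    from html.parser import HTMLParser  # Python 3
except ImportError:
    from HTMLParser import HTMLParser  # Python 2


class FormFieldParser(HTMLParser):
    """Collect each form's action, method, and the names of its fields."""

    def __init__(self):
        HTMLParser.__init__(self)
        self.forms = []
        self._current = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            self._current = {
                "action": attrs.get("action", ""),
                "method": (attrs.get("method") or "get").lower(),
                "fields": [],
            }
            self.forms.append(self._current)
        elif tag in ("input", "select", "textarea") and self._current is not None:
            name = attrs.get("name")
            if name:
                self._current["fields"].append(name)

    def handle_endtag(self, tag):
        if tag == "form":
            self._current = None


# Hypothetical markup, standing in for whatever the real page contains.
SAMPLE_HTML = """
<form action="/search" method="POST">
  <input type="text" name="last_name">
  <input type="text" name="first_name">
  <input type="submit" name="Accept" value="Accept">
</form>
"""


def list_forms(html):
    """Return a list of dicts describing the forms found in the HTML."""
    parser = FormFieldParser()
    parser.feed(html)
    return parser.forms


if __name__ == "__main__":
    for form in list_forms(SAMPLE_HTML):
        print(form)
```

Knowing the form's action and field names shows what the browser actually submits when a button is clicked, which is why a page can change its content while the URL stays the same.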

  • 2 weeks later...
  • Author

I hate to come back to this subject because I've already completed a working program, but I need to know whether there's any way to make it more efficient. Right now, the code is epic. It does what we need it to do, but it takes approximately 6 hours to run through the entire thing. What I'm wondering is whether there is a way to access the data directly, as opposed to using the actual website and AI to do the scraping.

That would typically be covered by some sort of official API or database access, which you would need to ask the owners of the websites you are scraping about. I would also not tell them that you are currently scraping them...

