WebDriver 101: Intro to (Python) Selenium WebDriver (Part 1)

- December 4, 2018

Selenium WebDriver is a popular open-source software testing framework for automating a Web browser's interactions with Web applications. WebDriver is cross-platform, working on Windows, macOS, and Linux. It's also cross-browser, with downloadable drivers for Google Chrome, Mozilla Firefox, Microsoft Internet Explorer, Microsoft Edge, Opera, and Apple Safari. Earlier versions of WebDriver supported Firefox natively, but the current version needs a separate driver for it, as it does for the other browsers. WebDriver can be programmed in several languages. The officially supported ones are C#, Java, JavaScript (Node.js), Python, and Ruby. Others that can be used include PHP, Perl, Haskell, Objective-C, R, and Elixir.

This tutorial uses Selenium WebDriver to open the Mozilla Firefox browser on Microsoft Windows, type a search term into the DuckDuckGo search engine, submit the search to DuckDuckGo, and display the search results. It uses the Python programming language. [It briefly mentions how to use Google Chrome instead of Firefox.]

Installation

If you don't already have Mozilla Firefox installed, then the first step is to download it from the Download Firefox in your language page, and install it. [If you don't have Mozilla Firefox installed, but do have Google Chrome installed, you can use that instead.]

Next, download the latest version (currently 3.7.1) of Python for Windows from the Download Python page, and install it. If the installation prompts you, asking whether to add Python to your Windows path, answer "Yes".

Next, install the Python version of WebDriver by going to the Windows command prompt and typing:

pip install selenium
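
To confirm that pip really installed the package, one option is to ask Python for the installed version. The sketch below is only an illustration: it assumes Python 3.8 or later (importlib.metadata is newer than the 3.7.1 release mentioned above), and the helper name installed_version is made up for this example.

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version("selenium")
    if version is None:
        print("selenium is not installed")
    else:
        print("selenium " + version)
```

If it reports that selenium is not installed, the most likely cause is that pip belongs to a different Python installation than the one running the script.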

Then, download the latest version of geckodriver, the driver WebDriver uses to control browsers built on Gecko, the browser engine for Mozilla Firefox (and Thunderbird). Download the Zip file via the Releases link on the Mozilla GeckoDriver - GitHub page, and extract the geckodriver.exe file to the Windows folder you want to run this script in. [If you'll be using Google Chrome, then download the latest Zip file from the ChromeDriver - WebDriver for Chrome page, and extract the chromedriver.exe file from it to that same folder.]

Copy my WebDriver101.py file into that same folder.

Executing and Explaining WebDriver101.py

Open a Windows command prompt in that folder and type:

python WebDriver101.py

Python will then execute the WebDriver101.py script, which does the following:

from selenium import webdriver
import time

The 1st line tells Python we'll be using the Python Selenium WebDriver you installed with "pip". The 2nd line tells Python we'll be using functions [such as time.sleep()] from the "time" module in Python's standard library.

driver = webdriver.Firefox(executable_path="geckodriver.exe")

That line says to use geckodriver (the file 'geckodriver.exe' in the current folder) to open an instance of the Mozilla Firefox browser.
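
The executable_path argument shown above belongs to the Selenium 3 releases that were current when this was written. In Selenium 4 it was deprecated in favour of a Service object and later removed. Here is a hedged sketch of the newer form, which also covers the Google Chrome alternative. The function name make_driver and its parameters are hypothetical, the default file names simply follow this tutorial, and the Selenium imports sit inside the function so that loading the file doesn't start a browser.

```python
# Default driver file names, following this tutorial's folder layout.
DEFAULT_DRIVER_FILES = {
    "firefox": "geckodriver.exe",
    "chrome": "chromedriver.exe",
}


def make_driver(browser="firefox", driver_path=None):
    """Start Firefox or Chrome using Selenium 4's Service objects."""
    browser = browser.lower()
    if browser not in DEFAULT_DRIVER_FILES:
        raise ValueError("Unsupported browser: " + browser)
    path = driver_path or DEFAULT_DRIVER_FILES[browser]

    # Imported here so that merely loading this file does not require Selenium.
    from selenium import webdriver

    if browser == "firefox":
        from selenium.webdriver.firefox.service import Service
        return webdriver.Firefox(service=Service(executable_path=path))

    from selenium.webdriver.chrome.service import Service
    return webdriver.Chrome(service=Service(executable_path=path))
```

With this in place, driver = make_driver("chrome") would play the same role as the webdriver.Firefox(...) line, but with Chrome and chromedriver.exe.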

assert ("https://duckduckgo.com/" in driver.current_url), \
    "The current URL should include 'https://duckduckgo.com/', but it doesn't."
assert ("DuckDuckGo" in driver.title), \
    "The page title should include 'DuckDuckGo', but it doesn't."

Those lines "assert" things about the current URL and page title which should be True once the browser has loaded the DuckDuckGo home page. If either of them is instead False, Python raises an AssertionError, prints the corresponding message (such as "The current URL should include ...") to the screen, and stops the script.
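
These checks only make sense after the browser has navigated to DuckDuckGo. That step isn't shown in the excerpt above; in Selenium it is normally done with driver.get(url). The sketch below combines navigation and the two checks. The helper open_and_check is hypothetical and is not part of WebDriver101.py.

```python
def open_and_check(driver, url, title_part):
    """Load `url` in the browser, then verify the resulting URL and page title."""
    driver.get(url)  # navigate the browser to the page
    assert url in driver.current_url, \
        "The current URL should include '%s', but it is '%s'." % (url, driver.current_url)
    assert title_part in driver.title, \
        "The page title should include '%s', but it is '%s'." % (title_part, driver.title)
```

It would be called as open_and_check(driver, "https://duckduckgo.com/", "DuckDuckGo"). Keep in mind that Python skips assert statements entirely when run with the -O option, so asserts suit quick scripts like this one better than checks that must always run.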

driver.find_element_by_id("search_form_input_homepage").send_keys("python webdriver")

That line uses Selenium WebDriver to find the HTML element with an ID of "search_form_input_homepage", and send the keystrokes needed to type "python webdriver" into it. That HTML element is DuckDuckGo's search text box:

<input id="search_form_input_homepage" class="js-search-input search__input--adv" type="text" autocomplete="off" name="q" tabindex="1" value="" autocapitalize="off" autocorrect="off">
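
The find_element_by_id() method used above was removed in Selenium 4.3. Newer releases take the locating strategy as the first argument instead, as in driver.find_element(By.ID, ...). Below is a minimal sketch of the same step in that style. The helper name type_into is hypothetical, and the plain string "id" is used because it is the value of By.ID in selenium.webdriver.common.by, which keeps the sketch free of imports.

```python
def type_into(driver, element_id, text):
    """Find an element by its HTML id and type `text` into it."""
    # "id" is the value of selenium.webdriver.common.by.By.ID
    element = driver.find_element("id", element_id)
    element.send_keys(text)
    return element
```

For this tutorial the call would be type_into(driver, "search_form_input_homepage", "python webdriver"). The ID is whatever DuckDuckGo's page used when this was written, so it may need updating if the site's HTML has changed.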


The next line:

time.sleep(5)  # Wait 5 seconds for the user to see what was typed

uses the sleep() function from Python's time module to do what its comment ("Wait 5 seconds for the user to see what was typed") says.

The lines:

driver.find_element_by_id("search_form_input_homepage").send_keys(u'\ue004')
time.sleep(5)  # Wait 5 seconds for the user to see what was typed

send a TAB character to the search text box, and wait 5 seconds. It looks just like you'd typed the characters in yourself and then hit the TAB key:

[Screenshot: Firefox (brought up by Selenium WebDriver) showing 'python webdriver' typed in the DuckDuckGo search box]


[For a list of the key codes for special characters like TAB, see the Python WebDriver API - Special Keys page in the unofficial documentation for Selenium (WebDriver) with Python.]
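
The string u'\ue004' is a single Unicode character from the Private Use Area, which WebDriver interprets as the TAB key. Selenium's Python bindings expose these codes by name in selenium.webdriver.common.keys.Keys, so Keys.TAB can be written instead of the raw escape. The sketch below copies a few of those values into a plain dictionary (SPECIAL_KEYS is a name made up here) to show what they look like.

```python
# A few values copied from selenium.webdriver.common.keys.Keys
SPECIAL_KEYS = {
    "BACKSPACE": "\ue003",
    "TAB": "\ue004",
    "RETURN": "\ue006",
    "ENTER": "\ue007",
    "ESCAPE": "\ue00c",
}

for name, code in sorted(SPECIAL_KEYS.items()):
    # Each code is one character; show its Unicode code point in hex.
    print("%-9s U+%04X" % (name, ord(code)))
```

In a real script, send_keys(Keys.TAB), after importing Keys from selenium.webdriver.common.keys, reads more clearly than send_keys(u'\ue004') and sends the same character.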

And the lines:

driver.find_element_by_id("search_form_input_homepage").submit()
time.sleep(5)  # Wait 5 seconds for the user to see the search results

submit the search text to DuckDuckGo, and wait 5 seconds.

The final line:

driver.quit()

closes the browser and ends the WebDriver session, after which the script exits.
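
If one of the asserts fails or an element can't be found, the script stops before it reaches driver.quit(), and the browser window stays open. One common way to guard against that is a try/finally block. Here is a sketch, using a hypothetical run_steps helper that takes the driver and a list of step functions:

```python
def run_steps(driver, steps):
    """Run each step with the driver, always closing the browser afterwards."""
    try:
        for step in steps:
            step(driver)
    finally:
        # Runs whether the steps succeeded or raised an exception.
        driver.quit()
```

Each step is any function that accepts the driver, for example lambda d: d.get("https://duckduckgo.com/"). An exception raised by a step still reaches the caller, but only after the browser has been closed.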

For the sake of simplicity, this tutorial limits itself to finding one HTML element using WebDriver's find_element_by_id() function. WebDriver can also find elements using an element's name attribute, its class attribute, its tag name (such as "input", "button", "a", etc.), its XPath (XML Path), or its CSS selector. For links (HTML <a> elements), it can also find them using link text or partial link text. I hope to cover some of those in future tutorials, but if you're looking for more information now, please see the Selenium (WebDriver) with Python - Locating Elements page.
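
As a rough preview of those other strategies, the sketch below lists several ways the same search box could be located, based only on the attributes visible in the HTML shown earlier (id, name, class, and tag). The strategy strings are the values of the constants in selenium.webdriver.common.by.By, written out so the sketch needs no imports. SEARCH_BOX_LOCATORS and find_first are hypothetical names, and the calls use the general find_element(strategy, value) form.

```python
SEARCH_BOX_LOCATORS = [
    ("id", "search_form_input_homepage"),                   # By.ID
    ("name", "q"),                                          # By.NAME
    ("class name", "js-search-input"),                      # By.CLASS_NAME
    ("css selector", "input#search_form_input_homepage"),   # By.CSS_SELECTOR
    ("xpath", "//input[@name='q']"),                        # By.XPATH
]


def find_first(driver, locators):
    """Try each (strategy, value) pair in turn and return the first match."""
    last_error = None
    for strategy, value in locators:
        try:
            return driver.find_element(strategy, value)
        except Exception as error:  # Selenium raises NoSuchElementException here
            last_error = error
    raise LookupError("No locator matched: %r" % (last_error,))
```

Trying several locators in sequence is just one possible design. In practice it's usually better to pick the single most stable locator for a page (often an ID or a name) and use that one.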