Understanding Web Browser Automation
So what did this code actually do?
NOTE: This section goes much deeper into how web browser automation works. Feel free to skip ahead and come back when you're ready to dig deeper.
How Thirtyfour Works
Well, the first thing to know is that thirtyfour
doesn't talk directly to the Web Browser.
It simply fires off commands to the webdriver server as HTTP Requests.
The webdriver then talks to the Web Browser and tells it to execute each command, and then
returns the response from the Web Browser to thirtyfour
. As long as the webdriver is
running, thirtyfour
can do just about anything a human can do in a web browser.
Explaining The Code
So let's go through the code and see what is going on.
#![allow(unused)] fn main() { let caps = DesiredCapabilities::chrome(); let driver = WebDriver::new("http://localhost:9515", caps).await?; }
This is where we actually make the initial connection to the webdriver and start a new session in the web browser. This will open the browser in a new profile (so it won't add anything to your history etc.) and navigate to the default start page.
Capabilities
The way we tell it what browser we want is by using DesiredCapabilities
. In this case,
we construct a new ChromeCapabilities
instance. Each *Capabilities
struct will have
additional helper methods for setting options like headless mode, proxy, and so on.
See the documentation for more details.
You may have heard of
selenium
.Selenium
is simply a proxy that forwards requests to one or more webdriver servers. UsingSelenium
you can interact with multiple browsers at once. You simply tellSelenium
where each of the webdrivers are, and some details of each browser, and then in your code you givethirtyfour
the address of the selenium server, not the webdriver, and use theDesiredCapabilities
to request a particular browser.
Element Queries
Next, we tell the browser to navigate to Wikipedia:
#![allow(unused)] fn main() { driver.goto("https://wikipedia.org").await?; }
And then we look for an element on the page:
#![allow(unused)] fn main() { let elem_form = driver.find(By::Id("search-form")).await?; }
We search for elements using what we call a selector
or locator
. In this case we are looking for
an element with the id of search-form
.
If you actually navigate to https://wikipedia.org and open your browser's devtools (F12 in most browsers), then go to the
Inspector
tab, you will see the raw HTML for the page. In the search box at the top of the inspector, if you type#search-form
(the # is how we specify an id) you will see that it highlights the search form element.
This is the container element that contains both the input field and the button itself.
But we want to type into the field, so we need to do another query:
#![allow(unused)] fn main() { let elem_text = elem_form.find(By::Id("searchInput")).await?; }
And again, if you search in the inspector for #searchInput
you get an <input />
element
which is the one we want to type into.
Typing Text Into An Element
To type into a field, we just tell thirtyfour
to send the keys we want to type:
#![allow(unused)] fn main() { elem_text.send_keys("selenium").await?; }
This will literally type the text into the input field.
Now we need to find the search button and click it. Finding the element means doing another query:
#![allow(unused)] fn main() { let elem_button = elem_form.find(By::Css("button[type='submit']")).await?; }
This time we use a CSS
selector to locate a <button>
element with an attribute type
that
has the value submit
. To learn more about selectors, click here.
Clicking The Button
Next, the call to click()
tells thirtyfour
to simulate a click on that element:
#![allow(unused)] fn main() { elem_button.click().await?; }
Dealing With Page Loading Times
The page now starts loading the result of our search on Wikipedia. This brings us to our first issue when automating a web browser. How does our code know when the page has finished loading? If we had a slow internet connection, we might try to find an element on the page only for it to fail because that element hasn't loaded yet.
How do we solve this?
Well, one option is to simply tell our code to wait for a few seconds and then try to find the element we are looking for. But this is brittle and likely to fail. Don't do this. There are cases where you are forced to use this approach, but it should be a last resort. Incidentally, from a website testing perspective, it probably also means a human is going to be unsure of when the page has actually finished loading, and this is an indicator of a poor user experience.
Instead, we usually want to look for an element on the page and wait until that element is visible.
#![allow(unused)] fn main() { driver.query(By::ClassName("firstHeading")).first().await?; assert_eq!(driver.title().await?, "Selenium - Wikipedia"); }
Better Element Queries
To have thirtyfour
"wait" until an element has loaded, we use the query()
method which provides
a "builder" interface for constructing more flexible queries. Again, this uses one of the By
selectors, and we tell it to return only the first matching element.
The query()
method will poll every half-second until it finds the element. If it cannot find the
element within 30 seconds, it will timeout and return an error. The polling time and timeout duration
can be changed using WebDriverConfig
, or you can also override them for a given query.
The query()
method is the recommended way to search for elements. The find()
and find_all()
methods exist only for compatibility with the webdriver spec. The query()
method uses these
under the hood, but adds polling and other niceties on top, including giving you more details about
your query if an element was not found.
See the ElementQuery
documentation for more details on the kinds of queries it supports.
Closing The Browser
Finally we need to close the browser window. If we don't, it will remain open after our application has exited.
#![allow(unused)] fn main() { driver.quit().await?; }
And this concludes the introduction.
Happy Browser Automation!