Most Selenium code involves working with web elements.
Web elements
- 1: File Upload
- 2: Locator strategies
- 3: Finding web elements
- 4: Interacting with web elements
- 5: Information about web elements
1 - File Upload
Because Selenium cannot interact with the file upload dialog, it provides a way to upload files without opening the dialog. If the element is an input element with type file, you can use the send keys method to send the full path to the file that will be uploaded.
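A minimal sketch of this approach in Python follows. It assumes `driver` is a Selenium WebDriver; the helper name `upload_file` and both CSS selectors are hypothetical and need adapting to the page under test. The string "css selector" is the value behind Selenium's `By.CSS_SELECTOR` constant, which you would normally use instead.

```python
from pathlib import Path


def upload_file(driver, file_path, input_css="input[type='file']", submit_css=None):
    """Send a full file path to a file input instead of opening the OS dialog.

    Both CSS selectors are hypothetical defaults, not part of Selenium.
    """
    # The file input expects a full path, so resolve relative paths first.
    absolute_path = str(Path(file_path).resolve())

    # Typing the path into the input selects the file without any dialog.
    file_input = driver.find_element("css selector", input_css)
    file_input.send_keys(absolute_path)

    # Optionally click an upload button if the page has one.
    if submit_css is not None:
        driver.find_element("css selector", submit_css).click()
    return absolute_path
```

The function returns the resolved path so a test can log or assert which file was sent.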
2 - Locator strategies
A locator is a way to identify elements on a page. It is the argument passed to the Finding element methods.
Check out our encouraged test practices for tips on locators, including which to use when and why to declare locators separately from the finding methods.
Traditional Locators
Selenium provides support for these 8 traditional location strategies in WebDriver:
Locator | Description |
---|---|
class name | Locates elements whose class name contains the search value (compound class names are not permitted) |
css selector | Locates elements matching a CSS selector |
id | Locates elements whose ID attribute matches the search value |
name | Locates elements whose NAME attribute matches the search value |
link text | Locates anchor elements whose visible text matches the search value |
partial link text | Locates anchor elements whose visible text contains the search value. If multiple elements match, only the first one is selected. |
tag name | Locates elements whose tag name matches the search value |
xpath | Locates elements matching an XPath expression |
Creating Locators
To work with a web element using Selenium, we first need to locate it on the web page. Selenium provides the strategies listed above for locating an element on the page. To understand how to create locators, we will use the following HTML snippet.
class name
An HTML element can have a class attribute. We can see an example in the HTML snippet above. We can identify such elements using the class name locator available in Selenium.
css selector
CSS is the language used to style HTML pages. We can use the css selector locator strategy to identify an element on the page. If the element has an id, we create the locator as css = #id. Otherwise, the format we follow is css = [attribute=value]. Let us see an example from the HTML snippet above: we will create a locator for the First Name textbox using CSS.
id
We can use the ID attribute of an element in a web page to locate it. Generally the ID property should be unique for each element on the web page. We will identify the Last Name field using it.
name
We can use the NAME attribute of an element in a web page to locate it. Generally the NAME property should be unique for each element on the web page. We will identify the Newsletter checkbox using it.
link text
If the element we want to locate is a link, we can use the link text locator to identify it on the web page. The link text is the text displayed for the link. The HTML snippet contains a link, so let’s see how we locate it.
partial link text
If the element we want to locate is a link, we can use the partial link text locator to identify it on the web page. The link text is the text displayed for the link, and we can pass only part of that text as the value. The HTML snippet contains a link, so let’s see how we locate it.
tag name
We can use the HTML tag itself as a locator to identify the web element on the page. From the HTML snippet above, let’s identify the link using its HTML tag “a”.
xpath
An HTML document can be treated as an XML document, so we can use XPath, the path traversed to reach the element of interest, to locate the element. The XPath can be absolute, created from the root of the document, for example /html/form/input[1], which returns the male radio button. Or the XPath can be relative, for example //input[@name='fname'], which returns the first name text box. Let us create a locator for the female radio button using XPath.
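The eight strategies can be summarised as (strategy, value) pairs, sketched here in Python. The strategy strings are the values held by Selenium's `By` constants (for example `By.ID` is "id"). Apart from the name `fname`, every value below is an assumption about the page's markup, and `LOCATORS` and `locate` are hypothetical names used only for illustration.

```python
# One example locator per traditional strategy.
# Only "fname" comes from the text above; all other values are assumed.
LOCATORS = {
    "info_block": ("class name", "information"),
    "first_name": ("css selector", "input[name='fname']"),
    "last_name": ("id", "lname"),
    "newsletter": ("name", "newsletter"),
    "link": ("link text", "Selenium Official Page"),
    "link_partial": ("partial link text", "Official Page"),
    "any_link": ("tag name", "a"),
    "female_radio": ("xpath", "//input[@value='f']"),
}


def locate(driver, key):
    """Look up a named locator and return the first matching element."""
    strategy, value = LOCATORS[key]
    return driver.find_element(strategy, value)
```

Keeping locators in one place, separate from the find calls, makes them easier to update when the markup changes.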
Utilizing Locators
FindElement makes using locators straightforward. In most languages, all you need to do is use webdriver.common.by.By; in others, it is as simple as setting a parameter in the FindElement function.
By
ByChained
The ByChained class enables you to chain two By locators together. For example, instead of having to locate a parent element, and then a child element of that parent, you can instead combine those two FindElement() functions into one.
ByAll
The ByAll class enables you to utilize two By locators at once, finding elements that match either of your By locators. For example, instead of having to utilize two FindElement() functions to find the username and password input fields separately, you can instead find them together in one clean FindElements() call.
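To illustrate what chaining and combining locators mean, here is a hand-rolled Python sketch built only on `find_elements`. It is not the ByChained or ByAll implementation; `find_chained` and `find_any` are hypothetical helpers, and `context` is assumed to be a driver or element that offers `find_elements(strategy, value)`.

```python
def find_chained(context, *locators):
    """Apply each locator inside the matches of the previous one."""
    contexts = [context]
    for strategy, value in locators:
        matches = []
        for ctx in contexts:
            matches.extend(ctx.find_elements(strategy, value))
        # The matches become the search scope for the next locator.
        contexts = matches
    return contexts


def find_any(context, *locators):
    """Return elements matching at least one locator, without duplicates."""
    found = []
    for strategy, value in locators:
        for element in context.find_elements(strategy, value):
            if element not in found:
                found.append(element)
    return found
```

For example, `find_any(driver, ("id", "username"), ("id", "password"))` would return both input fields in one call, assuming those ids exist on the page.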
Relative Locators
Selenium 4 introduces Relative Locators (previously called Friendly Locators). These locators are helpful when it is not easy to construct a locator for the desired element, but easy to describe spatially where the element is in relation to an element that does have an easily constructed locator.
How it works
Selenium uses the JavaScript function getBoundingClientRect() to determine the size and position of elements on the page, and can use this information to locate neighboring elements.
Relative locator methods take the point of origin as their argument, which can be either a previously located element reference or another locator. In these examples we’ll be using locators only, but you could swap the locator in the final method with an element object and it will work the same.
Let us consider the below example for understanding the relative locators.

Available relative locators
Above
If the email text field element is not easily identifiable for some reason, but the password text field element is, we can locate the text field element using the fact that it is an “input” element “above” the password element.
Below
If the password text field element is not easily identifiable for some reason, but the email text field element is, we can locate the text field element using the fact that it is an “input” element “below” the email element.
Left of
If the cancel button is not easily identifiable for some reason, but the submit button element is, we can locate the cancel button element using the fact that it is a “button” element to the “left of” the submit element.
Right of
If the submit button is not easily identifiable for some reason, but the cancel button element is, we can locate the submit button element using the fact that it is a “button” element “to the right of” the cancel element.
Near
If the relative positioning is not obvious, or it varies based on window size, you can use the near method to identify an element that is at most 50px away from the provided locator.
One great use case for this is to work with a form element that doesn’t have an easily constructed locator, but its associated input label element does.
Chaining relative locators
You can also chain locators if needed. Sometimes the element is most easily identified as being both above/below one element and right/left of another.
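The relative locators described in this section could be written in Python roughly as follows, using `locate_with` from Selenium 4. All element ids (`email`, `password`, `submit`, `cancel`, `lbl-email`) are assumptions about the example form, and `build_relative_locators` is a hypothetical helper name.

```python
def build_relative_locators():
    """Build one relative locator per method described above."""
    # Imported here so the sketch can be loaded where Selenium is not installed.
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support.relative_locator import locate_with

    inputs = lambda: locate_with(By.TAG_NAME, "input")
    buttons = lambda: locate_with(By.TAG_NAME, "button")

    return {
        "email": inputs().above({By.ID: "password"}),
        "password": inputs().below({By.ID: "email"}),
        "cancel": buttons().to_left_of({By.ID: "submit"}),
        "submit": buttons().to_right_of({By.ID: "cancel"}),
        # near() uses a default distance of 50 pixels.
        "email_by_label": inputs().near({By.ID: "lbl-email"}),
        # Chained: below the email field and to the right of cancel.
        "chained": buttons().below({By.ID: "email"}).to_right_of({By.ID: "cancel"}),
    }
```

Each value can be passed directly to the find method, for example `driver.find_element(build_relative_locators()["email"])`.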
3 - Finding web elements
One of the most fundamental aspects of using Selenium is obtaining element references to work with. Selenium offers a number of built-in locator strategies to uniquely identify an element. There are many ways to use the locators in very advanced scenarios. For the purposes of this documentation, let’s consider this HTML snippet:
First matching element
Many locators will match multiple elements on the page. The singular find element method will return a reference to the first element found within a given context.
Evaluating entire DOM
When the find element method is called on the driver instance, it returns a reference to the first element in the DOM that matches with the provided locator. This value can be stored and used for future element actions. In our example HTML above, there are two elements that have a class name of “tomatoes” so this method will return the element in the “vegetables” list.
Evaluating a subset of the DOM
Rather than finding a unique locator in the entire DOM, it is often useful to narrow the search to the scope of another located element. In the above example there are two elements with a class name of “tomatoes” and it is a little more challenging to get the reference for the second one.
One solution is to locate an element with a unique attribute that is an ancestor of the desired element and not an ancestor of the undesired element, then call find element on that object:
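A short Python sketch contrasts the two searches. It assumes the fruits list has the id `fruits` (the example markup is not reproduced here, so this id is a guess) and that `driver` is a Selenium WebDriver; the function names are hypothetical.

```python
def first_tomato_in_dom(driver):
    """Search the whole DOM: returns the first element with class 'tomatoes'."""
    return driver.find_element("class name", "tomatoes")


def tomato_in_fruits(driver):
    """Search only inside the fruits list (assumed id='fruits')."""
    # First locate the ancestor that is unique to the element we want...
    fruits = driver.find_element("id", "fruits")
    # ...then call find_element on that element to narrow the scope.
    return fruits.find_element("class name", "tomatoes")
```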
Java and C#: WebDriver, WebElement and ShadowRoot classes all implement a SearchContext interface, which is considered a role-based interface. Role-based interfaces allow you to determine whether a particular driver implementation supports a given feature. These interfaces are clearly defined and try to adhere to having only a single role of responsibility.
Evaluating the Shadow DOM
The Shadow DOM is an encapsulated DOM tree hidden inside an element. As of v96 of Chromium browsers, Selenium lets you access this tree with easy-to-use shadow root methods. NOTE: These methods require Selenium 4.0 or greater.
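In Python, the shadow root is exposed as the `shadow_root` property of an element. The sketch below assumes `driver` is a Selenium 4 WebDriver; `find_in_shadow` and the selectors passed to it are hypothetical.

```python
def find_in_shadow(driver, host_css, inner_css):
    """Locate an element inside the shadow tree attached to a host element."""
    # The host is a regular element in the main DOM.
    shadow_host = driver.find_element("css selector", host_css)
    # shadow_root gives a search context scoped to the encapsulated tree.
    shadow_root = shadow_host.shadow_root
    return shadow_root.find_element("css selector", inner_css)
```

A call might look like `find_in_shadow(driver, "#shadow_host", "#shadow_content")`, where both selectors are placeholders.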
Optimized locator
A nested lookup might not be the most effective location strategy since it requires two separate commands to be issued to the browser.
To improve the performance slightly, we can use either CSS or XPath to find this element in a single command. See the Locator strategy suggestions in our Encouraged test practices section.
For this example, we’ll use a CSS Selector:
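One possible form of that selector in Python, assuming the fruits list has the id `fruits` (an assumption about the example markup) and `driver` is a Selenium WebDriver:

```python
def tomato_in_fruits_css(driver):
    """Find the fruit tomato with a single command."""
    # Descendant combinator: an element with class "tomatoes"
    # anywhere inside the element with id "fruits".
    return driver.find_element("css selector", "#fruits .tomatoes")
```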
All matching elements
There are several use cases for needing to get references to all elements that match a locator, rather than just the first one. The plural find elements methods return a collection of element references. If there are no matches, an empty list is returned. In this case, references to all fruits and vegetable list items will be returned in a collection.
Get element
Often you get a collection of elements but want to work with a specific element, which means you need to iterate over the collection and identify the one you want.
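A minimal Python sketch of that iteration, assuming `driver` is a Selenium WebDriver; `find_by_text` is a hypothetical helper, not part of Selenium.

```python
def find_by_text(driver, strategy, value, wanted_text):
    """Return the first matching element whose visible text equals wanted_text."""
    # find_elements returns a list, which is empty when nothing matches.
    for element in driver.find_elements(strategy, value):
        if element.text.strip() == wanted_text:
            return element
    return None
```

For example, `find_by_text(driver, "tag name", "li", "Tomato")` would scan every list item on the page.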
Find Elements From Element
Finds the list of matching child WebElements within the context of a parent element. To do this, call ‘findElements’ on the parent WebElement rather than on the driver.
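In Python the same idea looks like the sketch below. Locators are passed as (strategy, value) pairs; `child_texts` is a hypothetical helper and `driver` is assumed to be a Selenium WebDriver.

```python
def child_texts(driver, parent_locator, child_locator):
    """Collect the visible text of every matching child of one parent."""
    parent = driver.find_element(*parent_locator)
    # Calling find_elements on the parent limits the search to its subtree.
    children = parent.find_elements(*child_locator)
    return [child.text for child in children]
```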
Get Active Element
Finds the DOM element that has the focus in the current browsing context.
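In Python the focused element is available through `driver.switch_to.active_element`. The sketch below assumes `driver` is a Selenium WebDriver; `focused_attribute` and its default attribute name are hypothetical.

```python
def focused_attribute(driver, name="title"):
    """Return an attribute of whichever element currently has focus."""
    active = driver.switch_to.active_element
    return active.get_attribute(name)
```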
4 - Interacting with web elements
There are only 5 basic commands that can be executed on an element:
- click (applies to any element)
- send keys (only applies to text fields and content editable elements)
- clear (only applies to text fields and content editable elements)
- submit (only applies to form elements)
- select (see Select List Elements)
Additional validations
These methods are designed to closely emulate a user’s experience, so, unlike the Actions API, Selenium attempts to perform two things before attempting the specified action.
- If it determines the element is outside the viewport, it scrolls the element into view, specifically it will align the bottom of the element with the bottom of the viewport.
- It ensures the element is interactable before taking the action. This could mean that the scrolling was unsuccessful, or that the element is not otherwise displayed. Determining if an element is displayed on a page was too difficult to define directly in the webdriver specification, so Selenium sends an execute command with a JavaScript atom that checks for things that would keep the element from being displayed. If it determines an element is not in the viewport, not displayed, not keyboard-interactable, or not pointer-interactable, it returns an element not interactable error.
Click
The element click command is executed on the center of the element. If the center of the element is obscured for some reason, Selenium will return an element click intercepted error.
Send keys
The element send keys command types the provided keys into an editable element. Typically, this means an element is an input element of a form with a text type or an element with a content-editable attribute. If it is not editable, an invalid element state error is returned.
Here is the list of possible keystrokes that WebDriver supports.
Clear
The element clear command resets the content of an element. This requires an element to be editable and resettable. Typically, this means an element is an input element of a form with a text type or an element with a content-editable attribute. If these conditions are not met, an invalid element state error is returned.
Submit
In Selenium 4 this is no longer implemented with a separate endpoint and functions by executing a script. As such, it is recommended not to use this method and to click the applicable form submission button instead.
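The basic commands combine into a typical form interaction, sketched here in Python. It clicks the submission button rather than calling submit, as recommended above. `fill_and_submit` is a hypothetical helper, locators are (strategy, value) pairs, and `driver` is assumed to be a Selenium WebDriver.

```python
def fill_and_submit(driver, field_locator, text, submit_locator):
    """Clear a text field, type into it, then click the submit button."""
    field = driver.find_element(*field_locator)
    field.clear()          # reset any existing content
    field.send_keys(text)  # type the new value
    # Click the form's button instead of using the submit command.
    driver.find_element(*submit_locator).click()
```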
5 - Information about web elements
There are a number of details you can query about a specific element.
Is Displayed
This method checks whether the connected element is displayed on a webpage. It returns a Boolean value: true if the connected element is displayed in the current browsing context, otherwise false.
This functionality is mentioned in, but not defined by, the W3C specification due to the impossibility of covering all potential conditions. As such, Selenium cannot expect drivers to implement this functionality directly, and now relies on executing a large JavaScript function directly. This function makes many approximations about an element’s nature and relationship in the tree to return a value.
Is Enabled
This method checks whether the connected element is enabled or disabled on a webpage. It returns a Boolean value: true if the connected element is enabled in the current browsing context, otherwise false.
Is Selected
This method determines whether the referenced element is selected. It is widely used on checkboxes, radio buttons, input elements, and option elements.
It returns a Boolean value: true if the referenced element is selected in the current browsing context, otherwise false.
Tag Name
Fetches the tag name of the referenced element in the current browsing context.
Size and Position
Fetches the dimensions and coordinates of the referenced element.
The fetched data contains the following details:
- X-axis position of the top-left corner of the element
- Y-axis position of the top-left corner of the element
- Height of the element
- Width of the element
Get CSS Value
Retrieves the value of the specified computed style property of an element in the current browsing context.
Text Content
Retrieves the rendered text of the specified element.
Fetching Attributes or Properties
Fetches the run-time value associated with a DOM attribute. It returns the data associated with the DOM attribute or property of the element.
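The queries in this section map onto Python's WebElement API roughly as in the sketch below. `describe` is a hypothetical helper, and its default CSS property and attribute names are arbitrary choices for illustration.

```python
def describe(element, css_property="color", attribute="value"):
    """Gather the common element queries into one dictionary."""
    rect = element.rect  # dict with "x", "y", "width" and "height"
    return {
        "displayed": element.is_displayed(),
        "enabled": element.is_enabled(),
        "selected": element.is_selected(),
        "tag_name": element.tag_name,
        "position": (rect["x"], rect["y"]),
        "size": (rect["width"], rect["height"]),
        "css": {css_property: element.value_of_css_property(css_property)},
        "text": element.text,
        "attribute": {attribute: element.get_attribute(attribute)},
    }
```

Collecting the values in one place can be handy when logging why an assertion about an element failed.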