What is Selenium WebDriver? [The Complete Guide]


Sameeksha Medewar
Last updated on November 5, 2022

    Selenium is one of the most widely used frameworks for automated testing of web applications. It consists of four components: Selenium Remote Control (RC), Selenium Grid, Selenium IDE, and Selenium WebDriver. Each has unique characteristics and serves different testing requirements.

    Jason Huggins created Selenium in 2004 to fulfill the need for a tool that would automate web testing. Selenium is a suite of tools, each developed by a different person. Here is the list of developers and their contributions to the Selenium suite:

    • Paul Hammant developed Selenium RC or Selenium 1, the first component of Selenium. It is a server that uses the HTTP protocol to accept commands for the browser.
    • Patrick Lightbody developed Selenium Grid, which enables multiple Selenium tests to run in parallel across different machines.
    • Shinya Kasatani developed Selenium IDE, an integrated development environment for running test scripts.
    • Simon Stewart developed Selenium WebDriver, a successor to Selenium RC.

    This article focuses on Selenium WebDriver, one of the most significant components of Selenium. It covers the features of Selenium WebDriver, its architecture, and browser-specific web drivers. Let us start with an explanation of Selenium WebDriver.

    What is Selenium WebDriver?

    Selenium WebDriver is a core component of Selenium and was introduced as a part of Selenium 2.0. It is an interface, enabling us to write instructions that can run across modern browsers. Selenium WebDriver accepts commands via a client API or those sent in Selenese and sends them to a browser. Selenese is a test domain-specific language provided by Selenium.

    It permits us to write test scripts in various programming languages, such as Python, Ruby, Groovy, Scala, C#, JavaScript, Java, Perl, and PHP.

    Selenium WebDriver supports browsers such as Internet Explorer, Microsoft Edge, Safari, Opera, Google Chrome, and Mozilla Firefox, as well as the headless HtmlUnit browser. It uses a browser-specific driver to send commands to a browser and fetch results. Unlike Selenium RC, Selenium WebDriver does not require a dedicated server to run tests.

    Instead, it communicates directly with the web browser. Therefore, Selenium WebDriver is faster than Selenium RC. Additionally, we can use Selenium Grid with Selenium WebDriver to run tests on remote machines.

    Features of Selenium WebDriver

    The following features make Selenium WebDriver a strong choice for running cross-browser tests:

    1. Multi-Browser Compatibility

    Selenium WebDriver supports almost all modern browsers, including Chrome, Firefox, Opera, and Safari. With Selenium WebDriver, you can launch any of the supported browsers using a simple command. For instance, in Java you can launch Google Chrome using the following command: WebDriver driver = new ChromeDriver(); It also supports HtmlUnitDriver and mobile drivers such as AndroidDriver and iPhoneDriver.

    2. Multiple Language Support

    Selenium WebDriver supports different programming languages, like JavaScript, PHP, Ruby, Perl, C#, Python, and Java. Therefore, we can choose any of these supported programming languages to write automation test scripts.

    3. Speed and Performance

    Selenium WebDriver is fast. Unlike its predecessor, Selenium RC, it communicates directly with a browser without requiring an intermediate server, which removes one hop from every command.

    4. Handles Dynamic Web Elements

    Handling dynamic web elements is considered one of the most challenging tasks in automation testing. Selenium WebDriver makes it easier for developers to handle dynamic web elements, such as checkboxes, alerts, and dropdowns. Three common ways to locate such elements are an absolute XPath and the XPath functions contains() and starts-with().
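
    As a minimal sketch of the contains() and starts-with() approach, the two helpers below build XPath expressions that match an attribute whose value only partly stays stable between page loads. The function names are hypothetical and not part of Selenium; the resulting strings are ordinary XPath that could be passed to a WebDriver XPath locator.

```python
def xpath_contains(tag: str, attribute: str, fragment: str) -> str:
    """Build an XPath matching elements whose attribute contains `fragment`."""
    # Assumption: `fragment` has no single quotes; this sketch does no escaping.
    return f"//{tag}[contains(@{attribute}, '{fragment}')]"


def xpath_starts_with(tag: str, attribute: str, prefix: str) -> str:
    """Build an XPath matching elements whose attribute starts with `prefix`."""
    # Assumption: `prefix` has no single quotes; this sketch does no escaping.
    return f"//{tag}[starts-with(@{attribute}, '{prefix}')]"


if __name__ == "__main__":
    # A button whose id is generated as "submit_1234", "submit_5678", and so on.
    print(xpath_starts_with("button", "id", "submit_"))
    # An input whose name always includes "email" somewhere.
    print(xpath_contains("input", "name", "email"))
```

    Matching on the stable part of an attribute is usually less brittle than an absolute XPath, which breaks as soon as the page layout changes.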

    5. Easy to Identify Web Elements

    WebDriver provides a collection of locators that help find web elements on a web page. Below is a list of some commonly used locators:

    • Name
    • ID
    • XPath
    • DOM
    • ClassName
    • LinkText
    • PartialLinkText
    • CSS Selector
    • TagName
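
    To illustrate how a locator ends up as data, the sketch below maps most of the locator names above to the strategy strings used in a JSON Wire Protocol "find element" request, whose body has a "using" field and a "value" field. The dictionary and function names are hypothetical, and "DOM" is left out because this sketch only covers strategies with a wire-level name.

```python
import json

# Locator name from the list above -> strategy string sent over the wire.
LOCATOR_STRATEGIES = {
    "Name": "name",
    "ID": "id",
    "XPath": "xpath",
    "ClassName": "class name",
    "LinkText": "link text",
    "PartialLinkText": "partial link text",
    "CSS Selector": "css selector",
    "TagName": "tag name",
}


def find_element_body(locator: str, value: str) -> str:
    """Return the JSON body of a 'find element' request for the given locator."""
    strategy = LOCATOR_STRATEGIES[locator]  # raises KeyError for unknown locators
    return json.dumps({"using": strategy, "value": value})


if __name__ == "__main__":
    print(find_element_body("ID", "username"))
    print(find_element_body("CSS Selector", "#login-form input[type='submit']"))
```

    In practice the client library builds this body for you; the sketch only shows that each locator is a (strategy, value) pair.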

    Selenium WebDriver Architecture

    Selenium 2.0 integrated the WebDriver API into Selenium. This API facilitates communication between different languages and browsers. The Selenium WebDriver architecture described here consists of four major components: Selenium client libraries, the JSON Wire Protocol, browser drivers, and browsers. (Selenium 4 replaces the JSON Wire Protocol with the W3C WebDriver protocol, but the overall structure is the same.) Let us look at each of them in detail:

    1. Selenium Client Libraries (a.k.a. Selenium Language Bindings)

    With Selenium client libraries, you can use any supported programming languages, like C#, Java, JavaScript, Python, Ruby, and Perl to write Selenium test scripts. Therefore, this component of Selenium WebDriver enables us to choose the programming language of our choice.

    Selenium client libraries are language-specific packages (for example, jar files for Java) that contain the classes and methods required for creating automation test scripts. For example, if you choose PHP to write test scripts, you will need the PHP language bindings. You can download the supported language bindings or client libraries from the Selenium website.

    2. JSON Wire Protocol

    JavaScript Object Notation (JSON) is one of the most popular and commonly used data interchange formats. It facilitates data exchange between clients and servers on the web. The JSON Wire Protocol in Selenium handles the communication between the client libraries and the different driver implementations. It transfers information between a client and the HTTP server using a REST API.
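
    A minimal sketch of what such REST-style commands look like is shown below. It only builds the HTTP method, path, and JSON body for two JSON Wire Protocol commands (create a session, navigate to a URL) and does not send anything. The WireCommand class, the helper names, and the session id "abc123" are hypothetical, chosen for illustration.

```python
import json
from dataclasses import dataclass


@dataclass
class WireCommand:
    """One command as the client would send it to the driver's HTTP server."""
    method: str
    path: str
    body: dict

    def to_http(self) -> str:
        """Render the command as a request line followed by its JSON body."""
        return f"{self.method} {self.path}\n{json.dumps(self.body)}"


def new_session(browser_name: str) -> WireCommand:
    """Command that asks the driver to start a browser session."""
    return WireCommand(
        "POST", "/session", {"desiredCapabilities": {"browserName": browser_name}}
    )


def navigate(session_id: str, url: str) -> WireCommand:
    """Command that tells an existing session to open a URL."""
    return WireCommand("POST", f"/session/{session_id}/url", {"url": url})


if __name__ == "__main__":
    print(new_session("chrome").to_http())
    # "abc123" stands in for the session id the driver would return.
    print(navigate("abc123", "https://example.com").to_http())
```

    Because every command is just an HTTP request with a JSON body, any language that can speak HTTP can have a client library.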

    3. Browser Drivers

    Each browser has its own browser driver. For example, Google Chrome has ChromeDriver, and Firefox has FirefoxDriver. Without disclosing the internal logic of a browser’s functionality, these drivers establish a connection and communicate with their respective browsers. They act as a bridge between the browser and the test code.

    4. Browsers

    Selenium WebDriver supports almost all modern browsers.
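
    The four components can be pictured with a toy, in-process model: a client turns method calls into JSON commands, a driver decodes them and acts on a browser, and the result travels back as JSON. Everything below is a simplified stand-in with hypothetical class names. Real drivers listen on an HTTP port and include a session id in each path, which this sketch skips.

```python
import json


class ToyBrowser:
    """Stand-in for a real browser; it only remembers the current URL."""

    def __init__(self) -> None:
        self.current_url = "about:blank"

    def open(self, url: str) -> None:
        self.current_url = url


class ToyBrowserDriver:
    """Stand-in for a browser driver: decodes JSON commands and drives the browser."""

    def __init__(self, browser: ToyBrowser) -> None:
        self.browser = browser

    def handle(self, method: str, path: str, body_json: str) -> str:
        body = json.loads(body_json) if body_json else {}
        if method == "POST" and path == "/url":
            self.browser.open(body["url"])
            return json.dumps({"status": 0, "value": None})
        if method == "GET" and path == "/url":
            return json.dumps({"status": 0, "value": self.browser.current_url})
        # Non-zero status marks a failure in this toy model.
        return json.dumps({"status": 9, "value": "unknown command"})


class ToyClient:
    """Stand-in for a client library: exposes methods, sends JSON underneath."""

    def __init__(self, driver: ToyBrowserDriver) -> None:
        self.driver = driver

    def get(self, url: str) -> None:
        self.driver.handle("POST", "/url", json.dumps({"url": url}))

    @property
    def current_url(self) -> str:
        return json.loads(self.driver.handle("GET", "/url", ""))["value"]


if __name__ == "__main__":
    client = ToyClient(ToyBrowserDriver(ToyBrowser()))
    client.get("https://example.com")
    print(client.current_url)
```

    The test script only ever talks to the client; swapping the driver and browser underneath does not change the script, which is what makes cross-browser testing practical.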

    Selenium WebDriver vs Selenium RC

    Selenium WebDriver is the improved version of Selenium RC. Selenium RC had many limitations, which led to the development of Selenium WebDriver. The following comparison table illustrates how Selenium WebDriver outperforms Selenium RC:

    | Parameter | Selenium Remote Control | Selenium WebDriver |
    | --- | --- | --- |
    | Architecture | The architecture of Selenium RC is complex since it involves an intermediate server that facilitates communication between Selenium commands and a browser. You need to install the intermediate RC server before running the test scripts. | The architecture of Selenium WebDriver is simple as it does not require any intermediate server. |
    | Speed | Selenium RC is slower than Selenium WebDriver because it uses an intermediate server to communicate with a browser. | Selenium WebDriver is faster than Selenium RC as it communicates directly with the browser. |
    | Browser Support | It does not support HtmlUnit. | Selenium WebDriver supports the HtmlUnit browser, which is a headless and fast browser, accelerating the test execution cycles. |
    | API | Selenium RC's API has confusing and redundant commands, making it more complex. | It has a simple API without any unnecessary commands. |

    Types of Browser-specific Web Drivers

    To run automation test scripts in browsers, we require browser-specific drivers. Let us discuss a few browser-specific web drivers and see the commands that create each of them in different programming languages.

    1. HtmlUnitDriver

    HtmlUnitDriver is the Selenium web driver for HtmlUnit, a headless browser. Although similar to other web drivers, it does not have a graphical user interface. Thus, we cannot see the test execution on the screen. Use the following commands to create an HtmlUnitDriver:

    • Java
    WebDriver driver = new HtmlUnitDriver();
    • Csharp
    IWebDriver driver = new RemoteWebDriver(new Uri(""), DesiredCapabilities.HtmlUnit());
    • Perl
    my $driver = Selenium::Remote::Driver->new(browser_name=>'htmlunit', remote_server_addr=> 'localhost');
    • Ruby
    driver = Selenium::WebDriver.for :remote, :url => "http://localhost:4444/wd/hub", :desired_capabilities => :htmlunit
    • Python
    driver = webdriver.Remote("http://localhost:4444/wd/hub", webdriver.DesiredCapabilities.HTMLUNIT.copy())

    2. FirefoxDriver

    FirefoxDriver runs Selenium test scripts on Mozilla Firefox. It works through GeckoDriver, a proxy developed by Mozilla that links Selenium test scripts with the Firefox browser (Gecko is the browser engine behind Firefox). The following are the commands in different programming languages to use the FirefoxDriver:

    • Java
    WebDriver driver = new FirefoxDriver();
    • Csharp
    IWebDriver driver = new FirefoxDriver();
    • Perl
    my $driver = Selenium::Remote::Driver->new;
    • Ruby
    driver = Selenium::WebDriver.for :firefox
    • Python
    driver = webdriver.Firefox()

    3. InternetExplorerDriver

    InternetExplorerDriver is used to run the Selenium test scripts on the Internet Explorer browser. It supports Internet Explorer 7 and later. The following are the commands in different programming languages to use the InternetExplorerDriver:

    • Java
    WebDriver driver = new InternetExplorerDriver();
    • Csharp
    IWebDriver driver = new InternetExplorerDriver();
    • Perl
    my $driver = Selenium::Remote::Driver->new(browser_name=>'internet explorer');
    • Ruby
    driver = Selenium::WebDriver.for :ie
    • Python
    driver = webdriver.Ie()

    4. ChromeDriver

    ChromeDriver runs Selenium test scripts on Google Chrome. Use the following commands in your code to create a ChromeDriver:

    • Java
    WebDriver driver = new ChromeDriver();
    • Csharp
    IWebDriver driver = new ChromeDriver();
    • Perl
    my $driver = Selenium::Remote::Driver->new(browser_name => 'chrome');
    • Ruby
    driver = Selenium::WebDriver.for :chrome
    • Python
    driver = webdriver.Chrome()


    Selenium WebDriver is a popular choice for developers to write automated cross-browser test scripts. The ease of use, simple architecture, flexibility of language choices, compatibility with Android and iOS, and support for all modern browsers make it a powerful tool for developers and testers.



    There are four major components of Selenium WebDriver: 1. Selenium Client Libraries 2. JSON Wire Protocol 3. Browser Drivers 4. Browsers.

    Selenium WebDriver is an API that lets users interact with browsers on an operating system. The primary use of this API is to test web applications. However, you can use it for any task wherever there is a need for browser automation.

    Some major drivers available in Selenium WebDriver are ChromeDriver, OperaDriver, FirefoxDriver, SafariDriver, InternetExplorerDriver, EdgeDriver, EventFiringWebDriver, and RemoteWebDriver.
