Technologies:

javascript
Tolerim
3 days ago

How can I extract only numeric values from a string of text?

I need to fetch prices from a website so I can compare them with the prices on my own site. However, the scraped value comes back as text in the form $XXX.XXX, whereas I need a plain number. Here's my code:

import org.openqa.selenium.By;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.chrome.ChromeDriver;

public class scrape {
    public static void main(String[] args) {
        System.setProperty("webdriver.chrome.driver","...driver\\chromedriver.exe");
        ChromeDriver driver = new ChromeDriver();
        driver.manage().window().maximize();
        driver.get("...");
        WebElement price = driver.findElement(By.xpath("//p[contains(@class, 'box-price-present')]"));
        System.out.println("Price: "+price.getText().replaceAll("[^\\d.]", ""));
        driver.quit();
    }
}
As the extracted data is in text form, I need to convert it into a pure numerical format.

Answers(1)
Tolerim
3 days ago
Verified Answer
To turn the price text into a plain number, you can use Java's String.replaceAll() method with a regular expression. Here's an example of how you can modify your code to achieve this:
import org.openqa.selenium.By;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.chrome.ChromeDriver;

public class scrape {
    public static void main(String[] args) {
        System.setProperty("webdriver.chrome.driver","...driver\\chromedriver.exe");
        ChromeDriver driver = new ChromeDriver();
        driver.manage().window().maximize();
        driver.get("...");
        WebElement price = driver.findElement(By.xpath("//p[contains(@class, 'box-price-present')]"));
        String priceText = price.getText();
        String priceNumber = priceText.replaceAll("\\D", "");
        // "\\D" is a regular expression that matches every non-digit character;
        // replaceAll() replaces each match with an empty string
        System.out.println("price: "+priceNumber);
        driver.quit();
    }
}
In this modified code, we first read the text content of the price element into the priceText variable. We then call replaceAll() to replace every non-digit character in priceText with an empty string, store the result in priceNumber, and print it to the console. Note that \D also strips the decimal point, so "$123.456" becomes "123456"; if you want to keep the point, use "[^\\d.]" as in your original code. Either way, priceNumber is still a String containing only the characters of the number.
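If you need an actual numeric type rather than a cleaned-up string, a minimal sketch could look like the one below. It assumes the price uses a dot as the decimal separator and at most one dot survives the cleanup; PriceParser and parsePrice are hypothetical names for illustration, not part of Selenium. BigDecimal is used here instead of double to avoid rounding surprises with money.

```java
import java.math.BigDecimal;

public class PriceParser {

    // Hypothetical helper: strips everything except digits and dots,
    // then parses what is left as a decimal number.
    // Assumption: the dot is a decimal separator, not a thousands separator.
    static BigDecimal parsePrice(String priceText) {
        String cleaned = priceText.replaceAll("[^\\d.]", "");
        if (cleaned.isEmpty()) {
            throw new IllegalArgumentException("No digits found in: " + priceText);
        }
        // Throws NumberFormatException if more than one dot remains.
        return new BigDecimal(cleaned);
    }

    public static void main(String[] args) {
        // In the scraper, you would pass price.getText() instead of a literal.
        BigDecimal value = parsePrice("$123.45");
        System.out.println("price: " + value); // prints "price: 123.45"
    }
}
```

If the site uses the dot as a thousands separator instead, strip it as well (for example with "\\D") before parsing.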