2,970 questions
2 votes, 1 answer, 155 views
Extract tables from website with dynamic content with R
I'm trying to extract tables from this site:
https://www.dnb.com/business-directory/company-information.beverage_manufacturing.br.html
As you can see, the complete table has 14,387 rows and each page ...
1 vote, 1 answer, 73 views
I can't select the CSS element to scrape data in R. How do I format it to select the information?
I am learning to scrape data and using the website https://quotes.toscrape.com/ as a training dataset. When I try to collect the about section links, I get this error: Error in html_attr(html_elements(...
1 vote, 2 answers, 72 views
How to extract tables hierarchically (grouping by title) on a website using rvest?
The website https://www.environnement.gouv.qc.ca/eau/potable/distribution/resultats.asp stores its data in three different tables: 1. region, 2. mrc, and 3. reseau. Essentially, I'm trying to extract the ...
0 votes, 1 answer, 117 views
Trying to download PDFs in R
I am trying to get links to PDFs from a site in R, but the rvest read_html() function just sits there, seemingly making no progress.
Here is my code:
# Load required libraries
library(tidyverse)
...
0 votes, 3 answers, 100 views
How to turn HTML text into multiple different columns in R
This is the code I wrote to generate the data:
info <- html_nodes(manga, ".mt4") %>% html_text2() %>% strsplit("\n")
It returns 50 rows of lists that look like this:
[...
0 votes, 1 answer, 65 views
Web scraping nodes and elements in R
I am trying to scrape the names and locations from https://www.casa.gov.au/search-centre/aerodromes, but I get an empty data frame. Any help is appreciated! I've tried using a CSS selector and XPath ...
4 votes, 1 answer, 90 views
How to extract text from a website that uses JavaScript with rvest?
At this link (https://portraits.ouranos.ca/fr/spatial?a=0&c=0&discrete=1&e=CMIP6&i=tg_mean&p=50&r=mrc001&s=annual&scen=ssp370&w=0&yr=2071) there is a tag (...
1 vote, 1 answer, 254 views
I'm getting an error "cannot open the connection" when trying to scrape a website with rvest
library(tidyverse)
library(rvest)
fruits <- read_html("https://tmarketonline.bg/category/plodove-zelenchuci-i-yadki?page=1")
fruits_df <- fruits %>%
html_elements("._product&...
0 votes, 1 answer, 55 views
Web scraping the tipti page, which requires login
I'm trying to extract the names and prices of products from the AKI supermarket in Ecuador. There is a page called tipti that gathers products from several supermarkets.
However, it requires login and the page seems ...
5 votes, 1 answer, 432 views
"Target position can only be set for new windows" in chromote in R
I'm guessing this is probably some strange confluence of the latest Chrome version and chromote, but since about 24 hours ago, I get "Error in callback(...) : code: -32602
message: Target ...
0 votes, 1 answer, 68 views
Web scrape a table using rvest
I am attempting to scrape the table on this page using rvest:
https://www.nrl.com/ladder/?competition=111&round=27&season=2024
This is what I have tried so far:
library(rvest)
page <- ...
1 vote, 0 answers, 62 views
Web scraping a tournament bracket from trackwrestling.com
I'm trying to use the rvest package in R to scrape a tournament bracket from the 2024 NCAA Division I wrestling tournament. I've used the SelectorGadget tool to determine that the CSS selector for the ...
2 votes, 2 answers, 110 views
rvest returns only some html_nodes
I'm trying to scrape the gbarbosa page, but it returns only 8 nodes, while page one has 16 products in total.
Any suggestions?
library(rvest)
url <- "https://www.gbarbosa.com.br/...
1 vote, 1 answer, 143 views
rvest read_html_live: memory usage and Google Chrome Helper processes
I am running a loop through a large number of pages and noticing that my Mac is slowing down and sometimes crashes completely. After reviewing Activity Monitor, there are dozens of "Google ...
0 votes, 1 answer, 194 views
Using RStudio to scrape Airbnb data, but receiving NA values
I'm new to scraping and could use some advice.
Using rvest, I'm able to scrape information from certain areas of Airbnb, but not from the important areas having to do with the actual rooms/homes ...