51,778 questions
-3
votes
0
answers
46
views
How to download product images by SKU from a website using Python?
I have an Excel (or CSV) file with a list of product SKUs (like 1806B, 1911B, HR2470, etc.), and I would like to write a Python script that does the following:
Use each SKU to search for the product ...
-6
votes
0
answers
183
views
How can I extract the caption/description from a single Instagram post URL in Python? [closed]
I'm trying to write a simple program that can take one Instagram post link and extract the caption/description of that particular post.
For example, given a link like:
https://www.instagram.com/p/...
-4
votes
0
answers
61
views
How to decode utf-8 text from newspaper3k library
class ArticleScraper:
    def __init__(self):
        pass
    def articleScraper(self, article_links):
        article_content = []
        for url in article_links:
            url_i = ...
-8
votes
0
answers
84
views
Beginner in Python and Web Scraping - Looking for Feedback on My Script [closed]
I'm a software engineering student currently doing an internship in the Business Intelligence area at a university. As part of a project, I decided to create a script that scrapes job postings from a ...
1
vote
1
answer
137
views
Trouble scraping dynamic lottery results table - inconsistent parsing
I've been trying to scrape lottery results from a website that shows draws. The data is presented in a results table, but I keep running into strange issues where sometimes the numbers are captured ...
-8
votes
0
answers
85
views
Python 3.9, get in MS excel ALL physical addresses from the URL -> https://www.sappi.com/en-gb/about-us/locations [closed]
Get all physical addresses into MS Excel from this URL [https://www.sappi.com/en-gb/about-us/locations]. The code produces no output.
import requests
from bs4 import BeautifulSoup
import pandas as pd
url = ...
2
votes
1
answer
152
views
Extract tables from website with dynamic content with R
I'm trying to extract tables from this site:
https://www.dnb.com/business-directory/company-information.beverage_manufacturing.br.html
As you can see, the complete table has 14,387 rows and each page ...
0
votes
0
answers
52
views
Disable assignment of window.location in Selenium
I'm trying to extract data from a website using Selenium. On random occasions, the page will do a client-side redirect with window.location. How can I disable this?
I've tried redefining the property ...
-4
votes
0
answers
81
views
How to fetch real-time updates from an API without CDN-induced delays? [closed]
I'm building a service that monitors announcements from Upbit.
The main announcements page is here:
https://upbit.com/service_center/notice
That page fetches its data from this API endpoint:
https://...
1
vote
1
answer
51
views
Firecrawl self-hosted crawler throws Connection violated security rules error
I set up a self-hosted Firecrawl instance and I want to crawl my internal intranet site (e.g. https://intranet.xxx.gov.tr/).
I can access the site directly both from the host machine and from inside ...
0
votes
1
answer
92
views
Python Selenium find nested element [closed]
On this page I want to parse a few elements.
I would like to get the text in the circles and sometimes use an attribute value to click.
That code does not return anything. With this code I want to get all attribute ...
2
votes
1
answer
79
views
How to disable selenium logs AND run the browser in headless mode
This is my code as of now:
from selenium import webdriver
from webdriver_manager.chrome import ChromeDriverManager
from selenium.webdriver.chrome.service import Service
options = webdriver....
1
vote
3
answers
132
views
How to scrape a website that has <span class="ellipsis">…</span> between numbers on a dynamic table with Selenium in Python
I am trying to scrape dividend data for the stock "Vale" on the site https://investidor10.com.br/acoes/vale3/. The dividend table has 8 buttons (1, 2, 3, ..., 8) and "Next" and &...
0
votes
1
answer
108
views
Pytube consistently fails with HTTP Error 400: Bad Request also on latest version
I am trying to use pytube (v15.0.0) to fetch the titles of YouTube videos. However, for every video I try, my script fails with the same error: HTTP Error 400: Bad Request.
I have already updated ...
0
votes
0
answers
92
views
m3u8 HLS URL video not playing with hls.js and Art Player
I have a Node scraper that uses a Playwright browser to scrape the HLS streaming URL, which gives a master playlist like:
https://example.com/master.m3u8
Then that Master Playlist does have a cors ...