The Web Data API for AI - Turn entire websites into LLM-ready markdown or structured data 🔥
Updated Sep 8, 2025 - TypeScript
Python scraper based on AI
🕷️ An undetectable, powerful, flexible, high-performance Python library to make Web Scraping Easy and Effortless as it should be!
AnyCrawl: A Node.js/TypeScript crawler that turns websites into LLM-ready data and extracts structured SERP results from Google/Bing/Baidu/etc. Native multi-threading for bulk processing.
A powerful web scraper powered by LLMs | OpenAI, Gemini & Ollama
Lightweight library for scraping websites with LLMs
🔥 This repository contains complete application examples, including websites and other projects, developed using Firecrawl.
Stripped-down, stable version of Firecrawl optimized for self-hosting and ease of contribution. Billing logic and AI features are completely removed. Crawl and convert any website into LLM-ready markdown.
⬇️ A simple all-in-one CLI tool to download EVERYTHING from a URL (like youtube-dl/yt-dlp, forum-dl, gallery-dl, simpler ArchiveBox). Uses headless Chrome to get HTML, JS, CSS, images/video/audio/subtitles, PDFs, screenshots, article text, git repos, and more...
[Mirror] Self-hosted abuse detection and rule enforcement against low-effort mass AI scraping and bots.
AI web scraper built with Crawl4AI for extracting structured leads data from websites.
How-to guides on web crawling and scraping
Python, JavaScript, and Rust libraries for the Spider Cloud API.
All scraper resources available here! Give us stars
Fastest and cheapest distributed residential proxy network.
Extract Google Maps business leads and enrich contact details using AI & web scraping
Oxylabs AI Studio Python SDK
AI Scraper: scrape and extract data from websites in any format (CSV, JSON, HTML...) using Selenium or Crawl4AI, with the Ollama or SambaNova API for the model and a Streamlit chatbot UI
A CLI tool and REST API that converts web content to clean Markdown, bypassing anti-scraping measures using headless browsers. Perfect for AI/LLM applications
Use LLaMA 3 and Python to extract structured data from websites like Amazon, leveraging LLM-powered parsing for resilient, AI-driven web scraping.