
ScrapeGraph AI

ScrapeGraph AI is a service that provides AI-powered web scraping capabilities. It offers tools for extracting structured data, converting webpages to markdown, and processing local HTML content using natural language prompts.

Installation and Setup

Install the required package:

pip install langchain-scrapegraph

Set up your API key:

export SGAI_API_KEY="your-scrapegraph-api-key"
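
If you would rather fail early when the key is missing, a small check like the one below can run before any tool is created. This is only a sketch: the helper name `require_api_key` is hypothetical and not part of `langchain-scrapegraph`; the only thing taken from this page is the `SGAI_API_KEY` variable name.

```python
import os


def require_api_key(env=os.environ):
    """Return the ScrapeGraph AI key from SGAI_API_KEY, or raise a clear error.

    `require_api_key` is an illustrative helper, not a library function.
    """
    key = env.get("SGAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "SGAI_API_KEY is not set; export it before using the tools."
        )
    return key
```

Calling `require_api_key()` at startup surfaces a missing key immediately instead of at the first scraping request.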

Tools

See a usage example.

There are four tools available:

from langchain_scrapegraph.tools import (
    SmartScraperTool,  # Extract structured data from websites
    MarkdownifyTool,   # Convert webpages to markdown
    LocalScraperTool,  # Process local HTML content
    GetCreditsTool,    # Check remaining API credits
)

Each tool serves a specific purpose:

  • SmartScraperTool: Extract structured data from websites given a URL, prompt and optional output schema
  • MarkdownifyTool: Convert any webpage to clean markdown format
  • LocalScraperTool: Extract structured data from local HTML content given a prompt and optional output schema
  • GetCreditsTool: Check your remaining ScrapeGraph AI credits
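
The tools above can be combined roughly as follows. This is a minimal sketch under stated assumptions, not the package's reference usage: the input keys `website_url`, `user_prompt`, and `website_html` are assumed here and should be checked against the package documentation, and the helpers `scrape_and_convert` and `demo` are hypothetical names introduced for illustration. The tools are passed in as arguments so the helper works with any object that exposes the standard LangChain `invoke` method.

```python
def scrape_and_convert(smartscraper, markdownify, url, prompt):
    """Run a structured extraction and a markdown conversion on one URL.

    Both tool arguments only need an `invoke(dict)` method. The payload
    keys below are assumptions, not confirmed by this page.
    """
    data = smartscraper.invoke({"website_url": url, "user_prompt": prompt})
    markdown = markdownify.invoke({"website_url": url})
    return {"data": data, "markdown": markdown}


def demo():
    """Illustrative end-to-end run; needs the package and SGAI_API_KEY set."""
    # Imported lazily so this module loads even without the package installed.
    from langchain_scrapegraph.tools import (
        GetCreditsTool,
        LocalScraperTool,
        MarkdownifyTool,
        SmartScraperTool,
    )

    result = scrape_and_convert(
        SmartScraperTool(),
        MarkdownifyTool(),
        url="https://example.com",
        prompt="Extract the page title and main heading",
    )
    print(result["data"])
    print(result["markdown"])

    # Assumed input keys for processing HTML that is already in memory.
    html = "<html><body><h1>Hello</h1></body></html>"
    local = LocalScraperTool().invoke(
        {"user_prompt": "Extract the heading text", "website_html": html}
    )
    print(local)

    # Assumed to take an empty input, since it needs no parameters.
    print(GetCreditsTool().invoke({}))
```

Call `demo()` only after installing `langchain-scrapegraph` and exporting `SGAI_API_KEY`; each call that reaches the service is expected to consume API credits.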