- Instant Scalability - Spin up hundreds of browser sessions in seconds without infrastructure headaches
- Simple Integration - Works seamlessly with popular tools like Puppeteer and Playwright
- Powerful APIs - Easy-to-use APIs for scraping/crawling any site, and much more
- Bypass Anti-Bot Measures - Built-in stealth mode, ad blocking, automatic CAPTCHA solving, and rotating proxies
Key Capabilities
Scrape
Hyperbrowser provides powerful scraping capabilities that allow you to extract data from any webpage. The scraping tool can convert web content into structured formats like markdown or HTML, making it easy to process and analyze the data.
Crawl
The crawling functionality enables you to navigate through multiple pages of a website automatically. You can set parameters like page limits to control how extensively the crawler explores the site, collecting data from each page it visits.
Extract
Hyperbrowser’s extraction capabilities use AI to pull specific information from webpages according to your defined schema. This allows you to transform unstructured web content into structured data that matches your exact requirements.
Overview
Integration details
| Tool | Package | Local | Serializable | JS support |
|---|---|---|---|---|
| Crawl Tool | langchain-hyperbrowser | ❌ | ❌ | ❌ |
| Scrape Tool | langchain-hyperbrowser | ❌ | ❌ | ❌ |
| Extract Tool | langchain-hyperbrowser | ❌ | ❌ | ❌ |
Setup
To access the Hyperbrowser web tools, you’ll need to install the langchain-hyperbrowser integration package, create a Hyperbrowser account, and get an API key.
Credentials
Head to Hyperbrowser to sign up and generate an API key. Once you’ve done this, set the HYPERBROWSER_API_KEY environment variable.
Installation
Install langchain-hyperbrowser.
Instantiation
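With the package installed and the key set, the three tools can be created together. The following is a minimal sketch, assuming the package is importable as langchain_hyperbrowser (inferred from the package name) and that each tool can be constructed with no arguments once HYPERBROWSER_API_KEY is present. The make_web_tools helper is hypothetical and not part of the package.

```python
import os


def make_web_tools():
    """Return the three Hyperbrowser web tools keyed by a short name."""
    # The tools authenticate with the key described under Credentials.
    if not os.environ.get("HYPERBROWSER_API_KEY"):
        raise RuntimeError("Set HYPERBROWSER_API_KEY before creating the tools")

    # Imported inside the function so the key check above runs first.
    # The module name is an assumption based on the package name.
    from langchain_hyperbrowser import (
        HyperbrowserCrawlTool,
        HyperbrowserExtractTool,
        HyperbrowserScrapeTool,
    )

    return {
        "crawl": HyperbrowserCrawlTool(),
        "scrape": HyperbrowserScrapeTool(),
        "extract": HyperbrowserExtractTool(),
    }
```

Failing early on a missing key keeps the error close to its cause instead of surfacing later during a request.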
Crawl Tool
The HyperbrowserCrawlTool is a powerful tool that can crawl entire websites, starting from a given URL. It supports configurable page limits and scraping options.
Scrape Tool
The HyperbrowserScrapeTool is a tool that can scrape content from web pages. It supports both markdown and HTML output formats, along with metadata extraction.
Extract Tool
The HyperbrowserExtractTool is a powerful tool that uses AI to extract structured data from web pages. It can extract information based on predefined schemas.
Invocation
Basic Usage
Crawl Tool
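A minimal sketch of a basic crawl call. It assumes the tool takes its options as a single dictionary through the standard LangChain invoke method, with the option names listed under Configuration Options. The crawl_site helper is hypothetical; the tool is passed in, so any HyperbrowserCrawlTool instance can be used.

```python
def crawl_site(tool, url, max_pages=2):
    """Crawl up to max_pages pages starting at url and return the tool's result."""
    payload = {"url": url, "max_pages": max_pages}
    # tool is expected to be a HyperbrowserCrawlTool instance; invoke() is the
    # standard LangChain entry point for running a tool with a dict of arguments.
    return tool.invoke(payload)
```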
Scrape Tool
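A minimal sketch of a basic scrape call that requests markdown output. It assumes the formats list is nested under scrape_options, as the option names under Configuration Options suggest. The scrape_page helper is hypothetical.

```python
def scrape_page(tool, url):
    """Scrape a single page as markdown and return the tool's result."""
    payload = {
        "url": url,
        "scrape_options": {"formats": ["markdown"]},
    }
    # tool is expected to be a HyperbrowserScrapeTool instance.
    return tool.invoke(payload)
```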
Extract Tool
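A minimal sketch of a basic extraction call. It assumes the tool accepts the schema under the schema key of its input dictionary; the schema itself is supplied by the caller and, per the option list, is expected to be a Pydantic model class. The extract_from_page helper is hypothetical.

```python
def extract_from_page(tool, url, schema):
    """Extract data matching schema from url and return the tool's result."""
    # tool is expected to be a HyperbrowserExtractTool instance, and schema a
    # Pydantic model class describing the fields to pull from the page.
    return tool.invoke({"url": url, "schema": schema})
```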
With Custom Options
Crawl Tool with Custom Options
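A sketch of a crawl that sets both output formats and session options. It assumes use_proxy and solve_captchas are nested under session_options, and formats under scrape_options. The crawl_with_options helper and its defaults are hypothetical.

```python
def crawl_with_options(
    tool,
    url,
    max_pages=5,
    formats=("markdown", "html"),
    use_proxy=False,
    solve_captchas=False,
):
    """Crawl a site with explicit scrape and session options."""
    payload = {
        "url": url,
        "max_pages": max_pages,
        # Applied to every page the crawler visits.
        "scrape_options": {"formats": list(formats)},
        # Browser session configuration shared by the whole crawl.
        "session_options": {
            "use_proxy": use_proxy,
            "solve_captchas": solve_captchas,
        },
    }
    return tool.invoke(payload)
```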
Scrape Tool with Custom Options
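A sketch of a scrape call with caller-chosen formats and optional session options. Session options are only sent when provided, so the tool's own defaults apply otherwise. The scrape_with_options helper is hypothetical, and the nesting of the option keys is an assumption based on the option list.

```python
def scrape_with_options(tool, url, formats=("html",), session_options=None):
    """Scrape url in the given formats, optionally with session options."""
    payload = {
        "url": url,
        "scrape_options": {"formats": list(formats)},
    }
    if session_options:
        # Copy so later changes by the caller do not alter the sent payload.
        payload["session_options"] = dict(session_options)
    return tool.invoke(payload)
```

For example, scrape_with_options(tool, url, formats=("markdown", "html"), session_options={"accept_cookies": True}) would request both formats and ask the session to accept cookies.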
Extract Tool with Custom Schema
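A sketch of an extraction call that combines a custom schema with a natural language prompt. The schema is left to the caller so the example runs without extra dependencies; the commented model shows the kind of Pydantic class that could be passed. The extract_with_schema helper and the PageSummary model are hypothetical.

```python
# A schema would typically be a Pydantic model class, for example:
#
#     from pydantic import BaseModel
#
#     class PageSummary(BaseModel):
#         title: str
#         summary: str


def extract_with_schema(tool, url, schema, extraction_prompt=None):
    """Extract data shaped like schema from url, guided by an optional prompt."""
    payload = {"url": url, "schema": schema}
    if extraction_prompt:
        # The prompt tells the extractor what to look for on the page.
        payload["extraction_prompt"] = extraction_prompt
    return tool.invoke(payload)
```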
Async Usage
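A minimal sketch of running a crawl and a scrape concurrently. It relies on ainvoke, the standard LangChain async counterpart to invoke, and assumes both tools accept the same input dictionaries as in the synchronous case. The crawl_and_scrape helper is hypothetical.

```python
import asyncio


async def crawl_and_scrape(crawl_tool, scrape_tool, url):
    """Run a crawl and a scrape of the same URL at the same time."""
    crawl_call = crawl_tool.ainvoke({"url": url, "max_pages": 2})
    scrape_call = scrape_tool.ainvoke(
        {"url": url, "scrape_options": {"formats": ["markdown"]}}
    )
    # gather() awaits both calls concurrently and preserves their order.
    crawl_result, scrape_result = await asyncio.gather(crawl_call, scrape_call)
    return crawl_result, scrape_result
```

From synchronous code, the coroutine can be driven with asyncio.run(crawl_and_scrape(crawl_tool, scrape_tool, url)).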
All tools support async usage.
Use within an agent
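A sketch of handing the web tools to an agent. The agent framework is not named here, so this assumes LangGraph's prebuilt create_react_agent as one possible choice; depending on the installed version, a different constructor may be preferred. The build_web_agent helper is hypothetical, and model stands for any LangChain chat model that supports tool calling.

```python
def build_web_agent(model):
    """Create a tool-calling agent that can crawl and scrape the web."""
    # Imported inside the function so this module loads without the packages.
    # Both import paths are assumptions about the installed environment.
    from langchain_hyperbrowser import HyperbrowserCrawlTool, HyperbrowserScrapeTool
    from langgraph.prebuilt import create_react_agent

    tools = [HyperbrowserCrawlTool(), HyperbrowserScrapeTool()]
    # The agent decides at run time which tool to call and with what input.
    return create_react_agent(model, tools)
```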
Any of the web tools can be used within an agent.
Configuration Options
Common Options
All tools support these basic configuration options:
- url: The URL to process
- session_options: Browser session configuration
  - use_proxy: Whether to use a proxy
  - solve_captchas: Whether to automatically solve CAPTCHAs
  - accept_cookies: Whether to accept cookies
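The common options can be modeled as a small typed structure before being turned into a tool input. This is a sketch under the assumption that the three boolean flags are nested under session_options; the SessionOptions class, its False defaults, and the common_payload helper are hypothetical.

```python
from dataclasses import asdict, dataclass


@dataclass
class SessionOptions:
    """Browser session flags shared by all three tools."""

    use_proxy: bool = False
    solve_captchas: bool = False
    accept_cookies: bool = False


def common_payload(url, session=None):
    """Build the part of a tool input that every tool understands."""
    payload = {"url": url}
    if session is not None:
        # asdict() turns the dataclass into the plain dict the tool expects.
        payload["session_options"] = asdict(session)
    return payload
```

Tool-specific keys such as max_pages or schema can then be added to the returned dictionary before it is passed to invoke.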
Tool-Specific Options
Crawl Tool
- max_pages: Maximum number of pages to crawl
- scrape_options: Options for scraping each page
  - formats: List of output formats (markdown, html)
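Checking the crawl options locally can catch mistakes before a request is made. The following sketch validates the two values described above; the crawl_options helper is hypothetical, and the lower bound of one page is an assumption rather than a documented limit.

```python
ALLOWED_FORMATS = {"markdown", "html"}


def crawl_options(max_pages, formats=("markdown",)):
    """Validate crawl settings and return them as tool input keys."""
    if not isinstance(max_pages, int) or max_pages < 1:
        raise ValueError("max_pages must be a positive integer")
    unknown = set(formats) - ALLOWED_FORMATS
    if unknown:
        raise ValueError(f"Unsupported formats: {sorted(unknown)}")
    return {
        "max_pages": max_pages,
        "scrape_options": {"formats": list(formats)},
    }
```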
Scrape Tool
- scrape_options: Options for scraping the page
  - formats: List of output formats (markdown, html)
Extract Tool
- schema: Pydantic model defining the structure to extract
- extraction_prompt: Natural language prompt for extraction