- Instant Scalability - Spin up hundreds of browser sessions in seconds without infrastructure headaches
- Simple Integration - Works seamlessly with popular tools like Puppeteer and Playwright
- Powerful APIs - Easy-to-use APIs for scraping/crawling any site, and much more
- Bypass Anti-Bot Measures - Built-in stealth mode, ad blocking, automatic CAPTCHA solving, and rotating proxies
Overview
Integration details
Class | Package | Local | Serializable | JS support |
---|---|---|---|---|
HyperbrowserLoader | langchain-hyperbrowser | ❌ | ❌ | ❌ |
Loader features
Source | Document Lazy Loading | Native Async Support |
---|---|---|
HyperbrowserLoader | ✅ | ✅ |
Setup
To access the Hyperbrowser document loader, you'll need to install the langchain-hyperbrowser integration package, create a Hyperbrowser account, and get an API key.
Credentials
Head to Hyperbrowser to sign up and generate an API key. Once you've done this, set the HYPERBROWSER_API_KEY environment variable.
Installation
Install the langchain-hyperbrowser package, for example with pip install langchain-hyperbrowser.
Initialization
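Before constructing the loader, it can help to fail early when the key from the Credentials step is missing. The sketch below is one way to do that; the helper name hyperbrowser_api_key is hypothetical and not part of langchain-hyperbrowser.

```python
import os


def hyperbrowser_api_key(env=os.environ):
    """Return the key stored in HYPERBROWSER_API_KEY, or raise a clear error.

    Hypothetical helper for illustration; not part of langchain-hyperbrowser.
    """
    key = env.get("HYPERBROWSER_API_KEY", "").strip()
    if not key:
        raise RuntimeError("HYPERBROWSER_API_KEY is not set; generate a key first.")
    return key
```

Calling hyperbrowser_api_key() with no arguments reads the process environment; passing a plain dict instead makes the check easy to exercise in tests.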
Now we can instantiate the loader object and load documents.
Load
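A minimal sketch of a load call. It assumes langchain-hyperbrowser is installed, that HyperbrowserLoader accepts urls and api_key keyword arguments, and that HYPERBROWSER_API_KEY is set; https://example.com is a placeholder URL, and load_pages and summarize are hypothetical helper names.

```python
import os


def load_pages(urls):
    """Scrape `urls` (one URL or a list) and return the loaded documents."""
    # Imported inside the function so the rest of this module works
    # even where langchain-hyperbrowser is not installed.
    from langchain_hyperbrowser import HyperbrowserLoader

    # Assumption: the constructor takes `urls` and `api_key` keyword arguments.
    loader = HyperbrowserLoader(
        urls=urls,
        api_key=os.environ["HYPERBROWSER_API_KEY"],
    )
    return loader.load()  # load() returns a list of Document objects


def summarize(docs, width=60):
    """Return the first `width` characters of each document's page_content."""
    return [doc.page_content[:width] for doc in docs]


# Example (requires network access and a valid key):
# docs = load_pages("https://example.com")
# print(summarize(docs))
```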
Lazy Load
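lazy_load() yields documents one at a time, which suits processing results in fixed-size pages rather than holding every document in memory. The sketch below shows that pattern with a hypothetical batched helper; the commented usage assumes loader is an existing HyperbrowserLoader instance.

```python
def batched(documents, size):
    """Yield lists of at most `size` items from any iterable of documents."""
    if size < 1:
        raise ValueError("size must be at least 1")
    page = []
    for doc in documents:
        page.append(doc)
        if len(page) == size:
            yield page
            page = []
    if page:  # emit the final, possibly shorter, page
        yield page


# Example (assumes `loader` is a HyperbrowserLoader instance):
# for page in batched(loader.lazy_load(), size=10):
#     index_documents(page)  # hypothetical downstream step
```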
Advanced Usage
You can specify the operation to be performed by the loader. The default operation is scrape. For scrape, you can provide a single URL or a list of URLs to be scraped. For crawl, you can only provide a single URL. The crawl operation will crawl the provided page and its subpages and return a document for each page.
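The rule above (scrape takes one URL or a list, crawl takes exactly one URL) can be checked before the loader is built. This is only a sketch: loader_kwargs is a hypothetical helper, and the operation keyword passed to HyperbrowserLoader is an assumption based on the description above.

```python
def loader_kwargs(urls, operation="scrape"):
    """Validate `urls` against `operation` and return constructor kwargs."""
    # Only the two operations named in this section are accepted here.
    if operation not in ("scrape", "crawl"):
        raise ValueError(f"unsupported operation: {operation!r}")
    if operation == "crawl" and isinstance(urls, (list, tuple)):
        if len(urls) != 1:
            raise ValueError("crawl accepts exactly one URL")
        urls = urls[0]  # unwrap a single-item list into a plain URL
    return {"urls": urls, "operation": operation}


# Example (assumes langchain-hyperbrowser is installed and a valid key):
# from langchain_hyperbrowser import HyperbrowserLoader
# loader = HyperbrowserLoader(
#     api_key="YOUR_API_KEY",
#     **loader_kwargs("https://example.com", operation="crawl"),
# )
```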
Optional parameters for the loader can be passed in the params argument. For more information on the supported params, visit docs.hyperbrowser.ai/reference/sdks/python/scrape#start-scrape-job-and-wait for scrape or docs.hyperbrowser.ai/reference/sdks/python/crawl#start-crawl-job-and-wait for crawl.
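As a sketch of the params argument, the helper below builds a nested options dictionary. The scrape_options and include_tags keys are assumptions about the Hyperbrowser Python SDK's scrape parameters and should be checked against the reference pages linked above; scrape_params is a hypothetical helper name.

```python
def scrape_params(include_tags):
    """Build a `params` dict that limits scraped content to the given HTML tags."""
    # Assumption: tag filters live under params["scrape_options"]["include_tags"].
    return {"scrape_options": {"include_tags": list(include_tags)}}


# Example (assumes langchain-hyperbrowser is installed and a valid key):
# from langchain_hyperbrowser import HyperbrowserLoader
# loader = HyperbrowserLoader(
#     urls="https://example.com",
#     api_key="YOUR_API_KEY",
#     operation="scrape",
#     params=scrape_params(["h1", "h2", "p"]),
# )
```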