Scrape API
The Scrape API allows you to extract clean, structured content from any webpage in multiple formats including markdown, HTML, and JSON. It’s perfect for content extraction, data mining, and web automation tasks.
Base URL
Authentication
All requests require authentication using your API key in the Authorization header.
Single Page Scraping
POST /v1/scrape
Extract content from a single webpage with customizable options.
Request Body
| Parameter | Type | Required | Description |
|---|---|---|---|
| url | string | Yes | The URL to scrape |
| options | object | No | Scraping configuration options |
Options Object
| Parameter | Type | Default | Description |
|---|---|---|---|
| format | string | markdown | Output format: markdown, html, text, json, structured |
| engine | string | lightweight | Scraping engine: lightweight, playwright, puppeteer |
| includeScreenshot | boolean | false | Capture a screenshot of the page |
| includePdf | boolean | false | Generate a PDF of the page |
| mobile | boolean | false | Use mobile user agent |
| waitTime | number | 0 | Time to wait before scraping (0-30 seconds) |
| javascript | boolean | false | Enable JavaScript rendering |
| cookies | object | {} | Custom cookies to send with request |
| headers | object | {} | Custom headers to send with request |
| timeout | number | 30 | Request timeout in seconds (5-120) |
| useCache | boolean | false | Use cached results if available |
| cacheTtl | number | 300 | Cache time-to-live in seconds |
| webhook | string | - | Webhook URL for completion notification |
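The documented enums and ranges can be checked on the client before a request is sent. The following is a minimal sketch, not part of the API: `build_scrape_body` is a hypothetical helper name, and only the allowed values and ranges come from the table above.

```python
from typing import Any, Dict

# Allowed values copied from the options table.
ALLOWED_FORMATS = {"markdown", "html", "text", "json", "structured"}
ALLOWED_ENGINES = {"lightweight", "playwright", "puppeteer"}


def build_scrape_body(url: str, **options: Any) -> Dict[str, Any]:
    """Build a body for POST /v1/scrape, rejecting out-of-range options.

    Hypothetical helper: validation happens client-side only.
    """
    fmt = options.get("format", "markdown")
    if fmt not in ALLOWED_FORMATS:
        raise ValueError(f"unsupported format: {fmt!r}")

    engine = options.get("engine", "lightweight")
    if engine not in ALLOWED_ENGINES:
        raise ValueError(f"unsupported engine: {engine!r}")

    wait_time = options.get("waitTime", 0)
    if not 0 <= wait_time <= 30:
        raise ValueError("waitTime must be between 0 and 30 seconds")

    timeout = options.get("timeout", 30)
    if not 5 <= timeout <= 120:
        raise ValueError("timeout must be between 5 and 120 seconds")

    body: Dict[str, Any] = {"url": url}
    if options:
        # Send only the options the caller set; the server applies defaults.
        body["options"] = dict(options)
    return body
```

Calling `build_scrape_body("https://example.com", format="html", waitTime=5)` yields a body with `url` and an `options` object containing just those two keys.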
Response
Code Examples
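A minimal Python sketch of a single-page scrape using only the standard library. Several details here are assumptions rather than documented facts: `BASE_URL` is a placeholder because the base URL is not shown in this text, the `Bearer` scheme is a guess at how the API key is passed in the Authorization header, and the response is assumed to be JSON.

```python
import json
import urllib.request
from typing import Any, Dict, Optional

BASE_URL = "https://api.example.com"  # placeholder: substitute the real base URL


def make_scrape_request(api_key: str, url: str,
                        options: Optional[Dict[str, Any]] = None) -> urllib.request.Request:
    """Prepare (but do not send) a POST /v1/scrape request."""
    body: Dict[str, Any] = {"url": url}
    if options:
        body["options"] = options
    return urllib.request.Request(
        f"{BASE_URL}/v1/scrape",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
        method="POST",
    )


def scrape(api_key: str, url: str, options: Optional[Dict[str, Any]] = None,
           timeout: float = 30.0) -> Any:
    """Send the request and decode the body, assuming a JSON response."""
    request = make_scrape_request(api_key, url, options)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Separating request construction from sending keeps the network call in one place, which makes the payload easy to inspect or log before it goes out.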
Batch Scraping
POST /v1/scrape/batch
Scrape multiple URLs simultaneously for efficient bulk operations.
Request Body
| Parameter | Type | Required | Description |
|---|---|---|---|
| urls | array | Yes | Array of URLs to scrape (max 100) |
| options | object | No | Global scraping options |
| webhook | string | No | Webhook URL for batch completion |
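Because a batch accepts at most 100 URLs, a longer list has to be split across several requests. The sketch below shows one way to do that; `batch_bodies` is a hypothetical helper, and reusing the same options and webhook for every chunk is a design choice, not documented behavior.

```python
from typing import Any, Dict, Iterator, List, Optional

MAX_BATCH_SIZE = 100  # documented maximum number of URLs per batch request


def batch_bodies(urls: List[str],
                 options: Optional[Dict[str, Any]] = None,
                 webhook: Optional[str] = None) -> Iterator[Dict[str, Any]]:
    """Yield one POST /v1/scrape/batch body per chunk of up to 100 URLs."""
    for start in range(0, len(urls), MAX_BATCH_SIZE):
        body: Dict[str, Any] = {"urls": urls[start:start + MAX_BATCH_SIZE]}
        if options:
            body["options"] = options  # applied to every URL in the chunk
        if webhook:
            body["webhook"] = webhook
        yield body
```

With 250 URLs this produces three bodies holding 100, 100, and 50 URLs.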
Response
Error Handling
HTTP Status Codes
| Code | Description |
|---|---|
| 200 | Success |
| 400 | Bad Request - Invalid parameters |
| 401 | Unauthorized - Invalid API key |
| 402 | Payment Required - Insufficient credits |
| 429 | Too Many Requests - Rate limit exceeded |
| 500 | Internal Server Error |
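The status codes fall into two groups for a client: those worth retrying and those that need a fix first. The sketch below treats 429 and 500 as retryable and everything else as final; that split is an assumption, as is the exponential backoff schedule. `send` is a hypothetical callable returning a `(status, body)` pair.

```python
import time
from typing import Any, Callable, Tuple

# Assumption: only rate-limit and server errors are worth retrying.
# 400, 401, and 402 need a corrected request, key, or credit balance.
RETRYABLE_STATUS = {429, 500}


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 60.0) -> float:
    """Exponential backoff: base, 2*base, 4*base, ... capped at `cap` seconds."""
    return min(cap, base * (2 ** attempt))


def call_with_retry(send: Callable[[], Tuple[int, Any]],
                    max_attempts: int = 4,
                    sleep: Callable[[float], None] = time.sleep) -> Tuple[int, Any]:
    """Call `send` until it returns a non-retryable status or attempts run out."""
    for attempt in range(max_attempts):
        status, body = send()
        if status not in RETRYABLE_STATUS or attempt == max_attempts - 1:
            return status, body
        sleep(backoff_delay(attempt))
    raise ValueError("max_attempts must be at least 1")
```

Passing `sleep` as a parameter lets the retry loop be exercised without real delays.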
Error Response Format
Rate Limits
Rate limits vary by plan:

| Plan | Requests/Hour | Requests/Day | Concurrent Jobs |
|---|---|---|---|
| Free | 10 | 100 | 1 |
| Starter | 50 | 500 | 3 |
| Pro | 200 | 2000 | 10 |
| Enterprise | 1000 | 10000 | 50 |
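For a job that runs continuously, the table implies a minimum spacing between requests. The sketch below derives that spacing from whichever of the hourly and daily limits is tighter; the lowercase plan keys and the `min_interval_seconds` helper are illustrative, and how the service actually windows its limits is not stated here.

```python
# Limits copied from the rate limit table.
PLAN_LIMITS = {
    "free": {"per_hour": 10, "per_day": 100, "concurrent": 1},
    "starter": {"per_hour": 50, "per_day": 500, "concurrent": 3},
    "pro": {"per_hour": 200, "per_day": 2000, "concurrent": 10},
    "enterprise": {"per_hour": 1000, "per_day": 10000, "concurrent": 50},
}


def min_interval_seconds(plan: str) -> float:
    """Seconds between requests that keeps a steady job under both limits."""
    limits = PLAN_LIMITS[plan]
    hourly_spacing = 3600 / limits["per_hour"]
    daily_spacing = 86400 / limits["per_day"]
    # The larger spacing is the binding constraint.
    return max(hourly_spacing, daily_spacing)
```

On every plan in the table the daily limit equals ten hours of the hourly limit, so for round-the-clock jobs the daily figure is the one that binds.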
Credit Costs
| Feature | Credits |
|---|---|
| Basic scraping | 1 credit per page |
| JavaScript rendering | +1 credit |
| Screenshot capture | +1 credit |
| PDF generation | +1 credit |
| AI extraction | +2 credits |
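The table can be turned into a quick cost estimate before a job is submitted. This sketch assumes the surcharges are additive and charged per page, and that they are triggered by the `javascript`, `includeScreenshot`, and `includePdf` options. The option that enables AI extraction is not documented here, so it is passed as a separate hypothetical flag.

```python
from typing import Any, Dict, Optional

# Surcharges from the credit table, keyed by the option assumed to enable each.
SURCHARGES = {"javascript": 1, "includeScreenshot": 1, "includePdf": 1}


def estimate_credits(options: Optional[Dict[str, Any]] = None,
                     pages: int = 1,
                     ai_extraction: bool = False) -> int:
    """Estimate total credits for `pages` pages scraped with `options`."""
    options = options or {}
    per_page = 1  # basic scraping
    per_page += sum(cost for name, cost in SURCHARGES.items() if options.get(name))
    if ai_extraction:
        per_page += 2
    return per_page * pages
```

For example, a page scraped with JavaScript rendering and a screenshot would come to 3 credits under these assumptions.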
Use Cases
Content Aggregation
Suited to news sites, blogs, and content platforms that need to aggregate content from multiple sources.
Market Research
Extract product information and pricing data from e-commerce sites for competitor analysis.
SEO Analysis
Scrape meta tags, headings, and content structure for SEO optimization and analysis.
Lead Generation
Extract contact information and business data from directories and websites.
Best Practices
- Respect robots.txt - Always check and respect website robots.txt files
- Use appropriate delays - Set reasonable wait times between requests
- Handle errors gracefully - Implement proper error handling and retry logic
- Cache when possible - Use caching to reduce API calls and costs
- Monitor rate limits - Track your usage to avoid hitting rate limits
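The robots.txt check from the list above can be done with Python's standard `urllib.robotparser` module. In this sketch the robots.txt text is passed in as a string so the check itself needs no network access; `allowed_by_robots` and the `my-bot` user agent are illustrative names.

```python
from urllib.robotparser import RobotFileParser


def allowed_by_robots(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if `user_agent` may fetch `url` under the given robots.txt."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


rules = "User-agent: *\nDisallow: /private/\n"
if allowed_by_robots(rules, "my-bot", "https://example.com/blog/post"):
    print("OK to submit this URL for scraping")
```

To load the rules directly from a site instead, `RobotFileParser` also provides `set_url()` and `read()`.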