Web Archive API

The Web Archive API provides access to historical versions of websites through integration with the Internet Archive’s Wayback Machine. Retrieve archived content, search historical snapshots, and analyze website evolution over time.

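Credits Required: Archive operations consume 1-2 credits per request; availability checks are free. In the example responses on this page, search and timeline requests use 1 credit and snapshot retrieval uses 2.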

Authentication

All archive endpoints require authentication using either:

API Key: Send the header Authorization: Bearer YOUR_API_KEY with each request
Session Token: Use Supabase session for dashboard access
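
As a minimal sketch of API-key authentication, the helper below builds the two headers used by the curl examples on this page. The function name build_headers is illustrative and not part of the API.

```python
def build_headers(api_key: str) -> dict[str, str]:
    """Return the HTTP headers used by the curl examples in this reference."""
    if not api_key or not api_key.strip():
        raise ValueError("api_key must be a non-empty string")
    return {
        "Authorization": f"Bearer {api_key.strip()}",
        "Content-Type": "application/json",
    }


headers = build_headers("YOUR_API_KEY")
print(headers["Authorization"])  # Bearer YOUR_API_KEY
```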

Endpoints Overview

Search Snapshots

Find historical snapshots of a URL within date ranges

Retrieve Snapshot

Download content from a specific archived version

Check Availability

Verify if archived versions exist (FREE)

Generate Timeline

Analyze archive history with timeline visualization

Search Snapshots

Search for historical snapshots of a URL, with optional date filtering and result limiting.

curl -X POST "https://api.whizo.ai/v1/archive/search" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com",
    "dateRange": {
      "from": "20220101",
      "to": "20231231"
    },
    "limit": 50,
    "provider": "wayback",
    "includeMetadata": true
  }'
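
The same request can be assembled in Python. This is a sketch using only the standard library: it builds the request object without sending it, and validates the documented limit range and YYYYMMDD date format on the client side. The helper name build_search_request and the client-side checks are assumptions for illustration; the server may validate differently.

```python
import json
import re
import urllib.request

API_BASE = "https://api.whizo.ai/v1/archive"  # base URL taken from the curl example
DATE_RE = re.compile(r"^\d{8}$")  # YYYYMMDD


def build_search_request(api_key, url, date_from=None, date_to=None, limit=50):
    """Build (but do not send) a POST request for the search endpoint."""
    if not 1 <= limit <= 1000:
        raise ValueError("limit must be between 1 and 1000")
    payload = {"url": url, "limit": limit, "provider": "wayback"}
    if date_from or date_to:
        # dateRange requires both ends when it is present
        if not (date_from and date_to):
            raise ValueError("dateRange needs both 'from' and 'to'")
        for value in (date_from, date_to):
            if not DATE_RE.match(value):
                raise ValueError(f"expected YYYYMMDD, got {value!r}")
        payload["dateRange"] = {"from": date_from, "to": date_to}
    return urllib.request.Request(
        f"{API_BASE}/search",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_search_request("YOUR_API_KEY", "https://example.com", "20220101", "20231231")
print(req.get_method(), req.full_url)  # POST https://api.whizo.ai/v1/archive/search
```

Sending is left to the caller, for example with urllib.request.urlopen(req).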

Request Body

url (string, required): The URL to search for archived snapshots
timestamp (string): Specific timestamp in YYYYMMDD format (searches around this date)
dateRange (object): Search within a date range
    from (string, required): Start date in YYYYMMDD format
    to (string, required): End date in YYYYMMDD format
limit (number, default: 50): Maximum number of snapshots to return (1-1000)
provider (string, default: "wayback"): Archive provider to search. Options: wayback, archive_today, memento
includeMetadata (boolean, default: false): Include additional metadata about each snapshot
fallbackToClosest (boolean, default: true): Return the closest available snapshots if no exact matches are found

Response

{
  "success": true,
  "data": {
    "url": "https://example.com",
    "provider": "wayback",
    "snapshots": [
      {
        "timestamp": "20230615120000",
        "url": "https://example.com",
        "mimetype": "text/html",
        "statuscode": "200",
        "digest": "sha1:ABCD1234...",
        "length": "15432"
      }
    ],
    "totalFound": 1,
    "query": {
      "dateRange": {
        "from": "20220101",
        "to": "20231231"
      },
      "limit": 50
    },
    "creditsUsed": 1
  },
  "metadata": {
    "searchTime": "2024-01-15T10:30:00Z",
    "provider": "wayback",
    "cached": false
  }
}
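
A short sketch of consuming this response: the example shows statuscode and length as strings and timestamp as a 14-digit value, so they need converting before use. The function names are illustrative. Treating timestamps as UTC and length as a byte count are assumptions based on how the Wayback Machine reports captures, not something this response states.

```python
from datetime import datetime, timezone


def parse_timestamp(ts: str) -> datetime:
    """Convert a 14-digit YYYYMMDDHHMMSS snapshot timestamp to a datetime (UTC assumed)."""
    return datetime.strptime(ts, "%Y%m%d%H%M%S").replace(tzinfo=timezone.utc)


def successful_snapshots(response: dict) -> list[tuple[datetime, int]]:
    """Return (captured_at, length) pairs for snapshots whose status code is 200."""
    rows = []
    for snap in response["data"]["snapshots"]:
        if snap["statuscode"] != "200":  # status codes arrive as strings
            continue
        rows.append((parse_timestamp(snap["timestamp"]), int(snap["length"])))
    return sorted(rows)


# Trimmed copy of the example response above
sample = {
    "success": True,
    "data": {
        "snapshots": [
            {"timestamp": "20230615120000", "statuscode": "200", "length": "15432"}
        ]
    },
}

for captured_at, length in successful_snapshots(sample):
    print(captured_at.isoformat(), length)  # 2023-06-15T12:00:00+00:00 15432
```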

Retrieve Snapshot

Download the actual content from a specific archived snapshot.

curl -X POST "https://api.whizo.ai/v1/archive/snapshot" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com",
    "timestamp": "20230615120000",
    "provider": "wayback"
  }'

Request Body

url (string, required): The original URL of the archived page
timestamp (string, required): Exact timestamp in YYYYMMDDHHMMSS format
provider (string, default: "wayback"): Archive provider. Options: wayback, archive_today, memento

Response

{
  "success": true,
  "data": {
    "url": "https://example.com",
    "timestamp": "20230615120000",
    "provider": "wayback",
    "snapshot": {
      "content": "<!DOCTYPE html><html>...",
      "metadata": {
        "statusCode": 200,
        "contentType": "text/html"
      },
      "timestamp": "20230615120000",
      "url": "https://example.com",
      "available": true
    },
    "creditsUsed": 2
  },
  "metadata": {
    "retrievedAt": "2024-01-15T10:30:00Z",
    "provider": "wayback",
    "cached": false
  }
}
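
One way to read this response defensively is sketched below: check success and snapshot.available before touching the content. The helper name extract_content is illustrative, and raising LookupError for an unavailable snapshot is a design choice of this sketch, not behaviour defined by the API.

```python
def extract_content(response: dict) -> tuple[str, str]:
    """Return (content, content_type) from a snapshot response, or raise LookupError."""
    snapshot = (response.get("data") or {}).get("snapshot") or {}
    if not response.get("success") or not snapshot.get("available"):
        raise LookupError("snapshot is not available")
    content_type = (snapshot.get("metadata") or {}).get("contentType", "")
    return snapshot["content"], content_type


# Trimmed copy of the example response above
sample = {
    "success": True,
    "data": {
        "snapshot": {
            "content": "<!DOCTYPE html><html>...",
            "metadata": {"statusCode": 200, "contentType": "text/html"},
            "available": True,
        }
    },
}

content, content_type = extract_content(sample)
print(content_type, len(content))
```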

Check Availability

Check if archived versions exist for a URL without consuming credits.

curl -X POST "https://api.whizo.ai/v1/archive/availability" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com",
    "timestamp": "20230615"
  }'

Request Body

url (string, required): The URL to check for archived versions
timestamp (string): Check availability around a specific date (YYYYMMDD format)

Response

{
  "success": true,
  "data": {
    "url": "https://example.com",
    "timestamp": "20230615",
    "availability": {
      "hasSnapshots": true,
      "closestDate": "20230615120000"
    },
    "creditsUsed": 0
  },
  "metadata": {
    "checkedAt": "2024-01-15T10:30:00Z",
    "cached": false
  }
}
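
Because this check is free, it can gate the 2-credit snapshot retrieval. The sketch below turns an availability response into the request body for the snapshot endpoint, or None when nothing is archived. It assumes closestDate can be passed unchanged as the snapshot timestamp, which matches the 14-digit format in the examples but is not stated explicitly; the function name is hypothetical.

```python
from typing import Optional


def snapshot_payload_from_availability(response: dict) -> Optional[dict]:
    """Build a snapshot request body from an availability response, if possible."""
    data = response.get("data") or {}
    availability = data.get("availability") or {}
    closest = availability.get("closestDate")
    if not availability.get("hasSnapshots") or not closest:
        return None  # nothing to retrieve, so no credits should be spent
    return {"url": data["url"], "timestamp": closest, "provider": "wayback"}


# Trimmed copy of the example response above
sample = {
    "success": True,
    "data": {
        "url": "https://example.com",
        "availability": {"hasSnapshots": True, "closestDate": "20230615120000"},
    },
}

print(snapshot_payload_from_availability(sample))
```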

Generate Timeline

Generate a timeline showing the archive history of a URL with customizable granularity.

curl -X POST "https://api.whizo.ai/v1/archive/timeline" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com",
    "dateRange": {
      "from": "20220101",
      "to": "20231231"
    },
    "granularity": "month"
  }'

Request Body

url (string, required): The URL to generate a timeline for
dateRange (object, required): Date range for timeline analysis
    from (string, required): Start date in YYYYMMDD format
    to (string, required): End date in YYYYMMDD format
granularity (string, default: "month"): Timeline granularity. Options: day, week, month, year

Response

{
  "success": true,
  "data": {
    "url": "https://example.com",
    "dateRange": {
      "from": "20220101",
      "to": "20231231"
    },
    "granularity": "month",
    "timeline": [
      {
        "period": "2022-01",
        "count": 3,
        "snapshots": ["20220105120000", "20220115080000", "20220128140000"]
      },
      {
        "period": "2022-02",
        "count": 2,
        "snapshots": ["20220207100000", "20220224160000"]
      }
    ],
    "totalSnapshots": 5,
    "creditsUsed": 1
  }
}
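
As an illustration of working with the timeline array, the sketch below totals the per-period counts, finds the busiest period, and checks the sum against totalSnapshots. The function name and the shape of the returned summary are choices made for this example.

```python
def summarize_timeline(response: dict) -> dict:
    """Summarize a timeline response: total captures, busiest period, consistency check."""
    data = response["data"]
    counts = {entry["period"]: entry["count"] for entry in data["timeline"]}
    total = sum(counts.values())
    busiest = max(counts, key=counts.get) if counts else None
    return {
        "total": total,
        "busiest_period": busiest,
        "matches_reported_total": total == data.get("totalSnapshots"),
    }


# Trimmed copy of the example response above
sample = {
    "data": {
        "timeline": [
            {"period": "2022-01", "count": 3},
            {"period": "2022-02", "count": 2},
        ],
        "totalSnapshots": 5,
    }
}

print(summarize_timeline(sample))
# {'total': 5, 'busiest_period': '2022-01', 'matches_reported_total': True}
```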

Error Handling

The Web Archive API returns standard HTTP status codes and structured error responses:

{
  "success": false,
  "error": "Insufficient credits",
  "details": {
    "required": 2,
    "available": 0
  }
}
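
A client might convert this structure into an exception, as in the sketch below. ArchiveApiError and raise_for_error are hypothetical names, and pairing this body with HTTP 402 is an assumption based on the status code table in this section.

```python
class ArchiveApiError(Exception):
    """Raised for a response whose "success" field is false."""

    def __init__(self, status: int, message: str, details: dict):
        super().__init__(f"HTTP {status}: {message}")
        self.status = status
        self.message = message
        self.details = details

    @property
    def credit_shortfall(self) -> int:
        """Credits missing for the request, or 0 if the details carry no credit info."""
        required = self.details.get("required", 0)
        available = self.details.get("available", 0)
        return max(0, required - available)


def raise_for_error(status: int, body: dict) -> None:
    """Do nothing for a successful body; otherwise raise ArchiveApiError."""
    if body.get("success"):
        return
    raise ArchiveApiError(status, body.get("error", "Unknown error"), body.get("details") or {})


error_body = {
    "success": False,
    "error": "Insufficient credits",
    "details": {"required": 2, "available": 0},
}

try:
    raise_for_error(402, error_body)
except ArchiveApiError as exc:
    print(exc, "| short by", exc.credit_shortfall)
    # HTTP 402: Insufficient credits | short by 2
```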

Common Error Codes

400 - Bad Request

Invalid request parameters or malformed data

401 - Unauthorized

Invalid or missing authentication credentials

402 - Payment Required

Insufficient credits for the requested operation

404 - Not Found

URL not found in archive or no snapshots available

429 - Rate Limited

Too many requests - please slow down

500 - Server Error

Internal server error or Wayback Machine unavailable
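
Of these codes, 429 and 500 describe conditions that may clear on their own, so a client could retry them with exponential backoff while failing fast on the rest. The sketch below is one possible policy; the retry count, the delays, and the send callable (any zero-argument function returning a status code and a parsed body) are assumptions, not values recommended by the API.

```python
import time

RETRYABLE_STATUS = {429, 500}


def call_with_retries(send, max_retries=4, base_delay=1.0, max_delay=30.0, sleep=time.sleep):
    """Call send() until it returns a non-retryable status or retries run out.

    Waits base_delay, then 2x, 4x, ... (capped at max_delay) between attempts.
    """
    for attempt in range(max_retries + 1):
        status, body = send()
        if status not in RETRYABLE_STATUS or attempt == max_retries:
            return status, body
        sleep(min(max_delay, base_delay * 2 ** attempt))


# Demonstration with a fake sender, so no network access is needed
responses = iter([(429, {}), (500, {}), (200, {"success": True})])
waits = []
result = call_with_retries(lambda: next(responses), sleep=waits.append)
print(result, waits)  # (200, {'success': True}) [1.0, 2.0]
```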

Use Cases

Website Evolution Analysis

Track how a website has changed over time by analyzing archived snapshots

Content Recovery

Recover deleted content or previous versions of web pages

Competitor Research

Study competitor websites’ historical changes and strategies

SEO Analysis

Analyze historical SEO changes and their impact

Rate Limits

Search operations: 60 requests per minute
Snapshot retrieval: 30 requests per minute
Availability checks: 120 requests per minute (free operations)
Timeline generation: 20 requests per minute
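
To stay under these limits, a client can space its requests evenly, for example one timeline request every 3 seconds (60 / 20). The Pacer class below is a hypothetical single-threaded sketch. Even spacing is stricter than a per-minute quota requires, and how the server actually counts its windows is not documented here.

```python
import time

# Limits copied from the list above (requests per minute)
REQUESTS_PER_MINUTE = {"search": 60, "snapshot": 30, "availability": 120, "timeline": 20}


class Pacer:
    """Space out calls per operation so each stays within its per-minute limit."""

    def __init__(self, clock=time.monotonic, sleep=time.sleep):
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = {}

    def wait(self, operation: str) -> float:
        """Block until the operation may run; return the seconds waited."""
        interval = 60.0 / REQUESTS_PER_MINUTE[operation]
        now = self._clock()
        ready_at = self._next_allowed.get(operation, now)
        delay = max(0.0, ready_at - now)
        if delay:
            self._sleep(delay)
        # The next call may start one interval after this one starts
        self._next_allowed[operation] = max(now, ready_at) + interval
        return delay


pacer = Pacer()
pacer.wait("timeline")  # the first call for an operation does not wait
```

Call pacer.wait("snapshot") immediately before each snapshot request, and likewise for the other operations.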

Best Practices

Use availability checks before expensive snapshot operations
Implement proper error handling for archive service unavailability
Cache results when possible to reduce credit usage
Use appropriate date ranges to limit search scope
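
The caching advice above can be sketched as an in-memory cache keyed by URL and timestamp, on the assumption that an archived snapshot at a fixed timestamp does not change. SnapshotCache and the fetch callable are placeholders for whatever client code performs the real request.

```python
class SnapshotCache:
    """Memoize snapshot retrievals so repeated lookups do not spend credits again."""

    def __init__(self, fetch):
        self._fetch = fetch  # callable (url, timestamp) -> snapshot data
        self._store = {}
        self.misses = 0

    def get(self, url: str, timestamp: str):
        key = (url, timestamp)
        if key not in self._store:
            self.misses += 1  # each miss corresponds to one paid request
            self._store[key] = self._fetch(url, timestamp)
        return self._store[key]


# Demonstration with a fake fetcher, so no network access is needed
cache = SnapshotCache(lambda url, ts: f"content of {url} at {ts}")
cache.get("https://example.com", "20230615120000")
cache.get("https://example.com", "20230615120000")
print(cache.misses)  # 1
```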

Note the following limitations:

Archive content may be limited by robots.txt restrictions
Some snapshots may be incomplete or corrupted
Response times vary based on archive age and size
