Whizo Agent

The Whizo Agent is our premium AI-powered extraction system that provides intelligent content analysis, structured data extraction, and advanced processing capabilities. It offers superior performance compared to legacy systems while maintaining full backward compatibility.
Enhanced Capabilities: Whizo Agent includes NLP, entity recognition, ML-based schema inference, and intelligent extraction optimization.

Key Features

Advanced NLP

Natural Language Processing with entity recognition and sentiment analysis

Schema Inference

Automatic detection and generation of optimal data extraction schemas

Intelligent Extraction

ML-powered extraction strategies that adapt to content structure

Legacy Compatibility

Full backward compatibility with existing integrations

Agent Parameter

The Whizo Agent is activated by including the agent parameter in your extraction and scraping requests:
{
  "agent": {
    "model": "Whizo-Agent",
    "prompt": "Extract the main title and description from this webpage"
  }
}
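As a minimal sketch of how a client might assemble this object, the Python helper below builds the `agent` fragment shown above. The function name `make_agent_config` is hypothetical and not part of any Whizo SDK; only the `model` and `prompt` fields come from this page.

```python
import json


def make_agent_config(prompt: str, model: str = "Whizo-Agent") -> dict:
    """Build the `agent` fragment of a request body (hypothetical helper)."""
    if not prompt.strip():
        raise ValueError("prompt must be a non-empty string")
    return {"agent": {"model": model, "prompt": prompt}}


payload_fragment = make_agent_config(
    "Extract the main title and description from this webpage"
)
print(json.dumps(payload_fragment, indent=2))
```

The fragment can then be merged into an extraction or scraping request body alongside the other fields.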

Agent Models

Whizo-Agent (Recommended)
Our latest AI model, optimized for web content extraction:
  • Advanced NLP and entity recognition
  • Superior accuracy and speed
  • Optimized for web content structures
  • Enhanced schema inference capabilities
Legacy Compatibility
Legacy model names such as "FIRE-1" are still accepted and are routed to Whizo-Agent (see the Migration Guide).

Enhanced Extraction API

Use Whizo Agent for intelligent data extraction from web pages with custom schemas and prompts.
curl -X POST "https://api.whizo.ai/v1/extract" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "urls": ["https://example.com"],
    "schema": {
      "type": "object",
      "properties": {
        "title": {"type": "string"},
        "description": {"type": "string"},
        "price": {"type": "number"},
        "availability": {"type": "boolean"}
      }
    },
    "agent": {
      "model": "Whizo-Agent",
      "prompt": "Extract product information including title, description, price, and availability status"
    }
  }'
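The same request can be prepared from Python. The sketch below uses only the standard library and builds the request without sending it; the endpoint, headers, and body fields are taken from the curl example, while the function name `build_extract_request` is an assumption made for illustration. The response format is not described on this page, so the sketch stops before parsing any result.

```python
import json
import urllib.request

API_URL = "https://api.whizo.ai/v1/extract"  # endpoint from the curl example


def build_extract_request(api_key, urls, schema, prompt):
    """Prepare (but do not send) a POST request mirroring the curl example."""
    body = {
        "urls": urls,
        "schema": schema,
        "agent": {"model": "Whizo-Agent", "prompt": prompt},
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


product_schema = {
    "type": "object",
    "properties": {
        "title": {"type": "string"},
        "description": {"type": "string"},
        "price": {"type": "number"},
        "availability": {"type": "boolean"},
    },
}

request = build_extract_request(
    "YOUR_API_KEY",
    ["https://example.com"],
    product_schema,
    "Extract product information including title, description, price, "
    "and availability status",
)
print(request.get_method(), request.full_url)
# To send it: urllib.request.urlopen(request) returns the raw HTTP response.
```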

Agent Parameters

agent (object): Whizo Agent configuration for intelligent extraction. The examples on this page set two fields: model and prompt.

Agent-Enhanced Scraping

Combine web scraping with Whizo Agent for intelligent content processing and analysis.
curl -X POST "https://api.whizo.ai/v1/scrape" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com/article",
    "format": "markdown",
    "agent": {
      "model": "Whizo-Agent",
      "prompt": "Focus on extracting the main article content and remove navigation elements"
    }
  }'
When the agent parameter is included in scraping requests, Whizo Agent:
  • Intelligently filters content based on the provided prompt
  • Removes noise and irrelevant elements automatically
  • Enhances content structure for better readability
  • Applies advanced processing like entity recognition and content classification
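Since the agent is an optional addition to a scrape request, a client might build the body with and without it from one function. This is a sketch under that assumption; `build_scrape_payload` is a hypothetical name, and the field names (`url`, `format`, `agent`) are the ones used in the curl example above.

```python
import json


def build_scrape_payload(url, fmt="markdown", prompt=None):
    """Return a /v1/scrape body; the agent block is added only when a prompt is given."""
    payload = {"url": url, "format": fmt}
    if prompt is not None:
        payload["agent"] = {"model": "Whizo-Agent", "prompt": prompt}
    return payload


plain = build_scrape_payload("https://example.com/article")
enhanced = build_scrape_payload(
    "https://example.com/article",
    prompt="Focus on extracting the main article content and remove navigation elements",
)
print(json.dumps(enhanced, indent=2))
```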

Schema Inference

Whizo Agent can automatically infer optimal extraction schemas based on content analysis, reducing the need for manual schema design.
curl -X POST "https://api.whizo.ai/v1/extract" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "urls": ["https://example.com/product"],
    "agent": {
      "model": "Whizo-Agent",
      "prompt": "Analyze this product page and extract all relevant information"
    }
  }'
Benefits of Schema Inference:
  • Automatically detects optimal data structures
  • Adapts to different page types and content formats
  • Reduces development time and complexity
  • Improves extraction accuracy through ML optimization

Advanced Capabilities

Natural Language Processing

Whizo Agent includes advanced NLP capabilities.
Entity recognition automatically identifies and extracts entities such as:
  • People: Names, roles, titles
  • Organizations: Companies, institutions
  • Locations: Addresses, cities, countries
  • Dates: Publication dates, events, deadlines
  • Financial: Prices, currencies, financial metrics
Sentiment analysis evaluates content sentiment and emotional tone:
  • Positive/Negative/Neutral classification
  • Confidence scores for sentiment predictions
  • Emotion detection (joy, anger, fear, etc.)
  • Intent analysis for user-generated content
Content classification categorizes content:
  • Content type detection (article, product, review, etc.)
  • Topic classification using ML models
  • Quality scoring based on content richness
  • Relevance assessment for extraction targets

Intelligent Content Processing

Noise Removal:
  • Automatically removes advertisements and promotional content
  • Filters out navigation elements and boilerplate text
  • Eliminates duplicate and redundant information
  • Preserves only relevant, high-quality content
Structure Enhancement:
  • Improves content hierarchy and organization
  • Standardizes formatting across different sources
  • Enhances readability through intelligent formatting
  • Maintains semantic relationships in extracted data
Context Awareness:
  • Understands content context and relationships
  • Maintains semantic meaning during extraction
  • Adapts extraction strategy based on page type
  • Preserves important contextual information

Migration Guide

From Legacy Systems

If you’re currently using legacy extraction systems, migrating to Whizo Agent is simple.
Immediate Compatibility
Zero code changes required. Your existing integrations will automatically work with Whizo Agent. For example, a request that names the legacy model "FIRE-1" is routed to Whizo-Agent and continues to work unchanged:
{
  "agent": {
    "model": "FIRE-1",
    "prompt": "Extract title and content"
  }
}
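No change is required, but a team that prefers to name the current model explicitly in stored request bodies could rewrite them with a small helper. The sketch below is an optional, client-side illustration; `use_current_model_name` and `LEGACY_MODEL_NAMES` are hypothetical names, and "FIRE-1" is the only legacy model name this page mentions.

```python
import copy

LEGACY_MODEL_NAMES = {"FIRE-1"}  # the only legacy name shown on this page


def use_current_model_name(payload: dict) -> dict:
    """Return a copy of the payload with a legacy model name replaced."""
    updated = copy.deepcopy(payload)  # leave the caller's payload untouched
    agent = updated.get("agent")
    if isinstance(agent, dict) and agent.get("model") in LEGACY_MODEL_NAMES:
        agent["model"] = "Whizo-Agent"
    return updated


legacy_payload = {
    "agent": {"model": "FIRE-1", "prompt": "Extract title and content"}
}
migrated_payload = use_current_model_name(legacy_payload)
print(migrated_payload["agent"]["model"])
```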

Performance Improvements

When migrating to Whizo Agent, you’ll experience:

Speed Improvements

40% faster processing times through optimized ML models

Accuracy Gains

25% better extraction accuracy with advanced NLP

Cost Efficiency

Better value through intelligent processing and reduced retries

Enhanced Features

New capabilities like schema inference and sentiment analysis

Best Practices

Prompt Optimization

Effective Prompts for Whizo Agent:
  • Be specific about the data you want to extract
  • Include context about the expected content format
  • Specify any quality or filtering requirements
  • Use clear, concise language for instructions
Example Good Prompt:
"Extract product information including name, price in USD, description,
availability status, and customer rating. Focus on main product content
and ignore promotional banners or related product suggestions."
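One way to keep prompts specific and consistent is to assemble them from the fields to extract, what to focus on, and what to ignore. The helper below is a hypothetical sketch of that idea, not a Whizo feature; it only builds a string in the style of the example prompt above.

```python
def join_items(items):
    """Join items as 'a', 'a and b', or 'a, b, and c'."""
    items = list(items)
    if not items:
        raise ValueError("at least one item is required")
    if len(items) == 1:
        return items[0]
    if len(items) == 2:
        return f"{items[0]} and {items[1]}"
    return ", ".join(items[:-1]) + f", and {items[-1]}"


def build_prompt(subject, fields, focus=None, ignore=None):
    """Compose a prompt that names the fields, the focus, and what to skip."""
    sentences = [f"Extract {subject} including {join_items(fields)}."]
    if focus:
        sentences.append(f"Focus on {focus}.")
    if ignore:
        sentences.append(f"Ignore {join_items(ignore)}.")
    return " ".join(sentences)


prompt = build_prompt(
    "product information",
    ["name", "price in USD", "description", "availability status", "customer rating"],
    focus="main product content",
    ignore=["promotional banners", "related product suggestions"],
)
print(prompt)
```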

Schema Design

Schema Best Practices:
  • Use descriptive property names that match content semantics
  • Specify appropriate data types (string, number, boolean, array)
  • Include optional properties for flexible extraction
  • Consider nested objects for complex structured data
Example Optimized Schema:
{
  "type": "object",
  "properties": {
    "title": {"type": "string", "description": "Product name"},
    "pricing": {
      "type": "object",
      "properties": {
        "amount": {"type": "number"},
        "currency": {"type": "string"},
        "discount": {"type": "number"}
      }
    },
    "specifications": {
      "type": "array",
      "items": {"type": "string"}
    }
  }
}
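To sanity-check extracted data against a schema like the one above, a client could run a small type check before using the result. The sketch below handles only the keywords used on this page (`type`, `properties`, `items`) and treats every property as optional; it is not a full JSON Schema validator. The `sample_result` value is invented for illustration, since this page does not show a response body.

```python
def matches_schema(value, schema):
    """Check a value against a small subset of JSON Schema."""
    expected = schema.get("type")
    if expected == "object":
        if not isinstance(value, dict):
            return False
        properties = schema.get("properties", {})
        # Properties are optional: only the keys that are present are checked.
        return all(
            matches_schema(value[key], sub_schema)
            for key, sub_schema in properties.items()
            if key in value
        )
    if expected == "array":
        if not isinstance(value, list):
            return False
        item_schema = schema.get("items", {})
        return all(matches_schema(item, item_schema) for item in value)
    if expected == "string":
        return isinstance(value, str)
    if expected == "number":
        # bool is a subclass of int in Python, so exclude it explicitly.
        return isinstance(value, (int, float)) and not isinstance(value, bool)
    if expected == "boolean":
        return isinstance(value, bool)
    return True  # unknown or missing type: accept


SCHEMA = {
    "type": "object",
    "properties": {
        "title": {"type": "string", "description": "Product name"},
        "pricing": {
            "type": "object",
            "properties": {
                "amount": {"type": "number"},
                "currency": {"type": "string"},
                "discount": {"type": "number"},
            },
        },
        "specifications": {"type": "array", "items": {"type": "string"}},
    },
}

# Hypothetical extraction result, used only to exercise the check.
sample_result = {
    "title": "Widget",
    "pricing": {"amount": 19.99, "currency": "USD"},
    "specifications": ["steel", "2 kg"],
}
print(matches_schema(sample_result, SCHEMA))
```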

Performance Optimization

Optimization Tips:
  • Use specific selectors when possible to reduce processing scope
  • Implement caching for frequently accessed content
  • Batch multiple URLs in single requests when applicable
  • Monitor credit usage and optimize extraction patterns
  • Use availability checks before expensive operations
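The caching and batching tips can be combined on the client side. The sketch below groups uncached URLs into batches and remembers results in memory; the class name `CachedExtractor`, the batch size, and the shape of the `extract_batch` callable are all assumptions, and the stand-in `fake_extract_batch` replaces a real API call so the example runs offline.

```python
from typing import Callable, Dict, Iterable, Iterator, List


def batched(urls: Iterable[str], size: int) -> Iterator[List[str]]:
    """Yield lists of at most `size` URLs, preserving order."""
    if size < 1:
        raise ValueError("size must be at least 1")
    batch: List[str] = []
    for url in urls:
        batch.append(url)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


class CachedExtractor:
    """Call `extract_batch` only for URLs that are not already cached."""

    def __init__(self, extract_batch: Callable[[List[str]], Dict[str, dict]],
                 batch_size: int = 10):  # batch size is an assumed value
        self._extract_batch = extract_batch
        self._batch_size = batch_size
        self._cache: Dict[str, dict] = {}

    def extract(self, urls: Iterable[str]) -> Dict[str, dict]:
        urls = list(urls)
        # dict.fromkeys removes duplicates while keeping the original order.
        missing = [u for u in dict.fromkeys(urls) if u not in self._cache]
        for batch in batched(missing, self._batch_size):
            self._cache.update(self._extract_batch(batch))
        return {u: self._cache[u] for u in urls if u in self._cache}


calls: List[List[str]] = []


def fake_extract_batch(batch: List[str]) -> Dict[str, dict]:
    """Stand-in for a real API call; records each batch it receives."""
    calls.append(list(batch))
    return {url: {"source": url} for url in batch}


extractor = CachedExtractor(fake_extract_batch, batch_size=2)
first = extractor.extract(["https://a.example", "https://b.example", "https://c.example"])
second = extractor.extract(["https://a.example", "https://d.example"])
print(len(calls))
```

In this run, the second call sends only the one URL that was not fetched before.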

Competitive Advantages

vs Legacy Systems

Superior AI Models

Latest transformer models optimized for web content

Advanced NLP

Entity recognition, sentiment analysis, and content classification

Schema Inference

Automatic schema detection reduces development overhead

Better Performance

Faster processing with higher accuracy rates

Technical Excellence

  • ML-Powered: Continuously improving through machine learning
  • Context-Aware: Understands content relationships and semantics
  • Adaptive: Adjusts extraction strategies based on content type
  • Scalable: Optimized for high-volume processing
  • Reliable: Enterprise-grade stability and error handling

Rate Limits & Credits

Credit Costs:
  • Basic extraction: 3 credits per page
  • Complex extraction: 6 credits per page
  • Agent-enhanced scraping: +1 credit per page
  • Schema inference: Included at no extra cost
Rate Limits:
  • Extract operations: 100 requests per minute
  • Scrape operations: 200 requests per minute
  • Batch processing: 50 requests per minute
  • Enterprise users: Custom limits available
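The figures above lend themselves to quick budgeting arithmetic. The sketch below estimates extraction credits and the minimum spacing between requests that stays within each per-minute limit. The function names are hypothetical, and because this page does not state the base cost of a scrape, only the agent surcharge is computed for scraping.

```python
CREDITS_PER_PAGE = {"basic": 3, "complex": 6}   # extraction costs listed above
AGENT_SCRAPE_SURCHARGE = 1                       # extra credits per scraped page
REQUESTS_PER_MINUTE = {"extract": 100, "scrape": 200, "batch": 50}


def extraction_credits(pages: int, kind: str = "basic") -> int:
    """Credits needed to extract `pages` pages of the given kind."""
    return pages * CREDITS_PER_PAGE[kind]


def agent_scrape_surcharge(pages: int) -> int:
    """Additional credits for using the agent on `pages` scraped pages."""
    return pages * AGENT_SCRAPE_SURCHARGE


def min_seconds_between_requests(operation: str) -> float:
    """Evenly spaced interval that stays within the per-minute limit."""
    return 60.0 / REQUESTS_PER_MINUTE[operation]


print(extraction_credits(250, "complex"))        # 250 pages at 6 credits each
print(agent_scrape_surcharge(250))               # surcharge only, base cost excluded
print(min_seconds_between_requests("extract"))   # spacing for 100 requests per minute
```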
Ready to experience the power of Whizo Agent? Start with our Quick Start Guide or explore the Extract API documentation.