Overview

n8n is a powerful, self-hosted workflow automation platform. Combine it with WhizoAI for complete control over your web scraping workflows—no vendor lock-in, unlimited executions, and full data privacy.

Why n8n + WhizoAI?

Self-Hosted

Run on your own infrastructure for complete data control

Unlimited Executions

No execution limits, unlike cloud-based alternatives

Visual Workflows

Build complex workflows with a drag-and-drop interface

Open Source

Customize and extend to fit your exact needs

Installation

Quick Start with Docker

# Create docker-compose.yml
cat > docker-compose.yml <<EOF
version: '3'
services:
  n8n:
    image: n8nio/n8n
    ports:
      - "5678:5678"
    volumes:
      - n8n_data:/home/node/.n8n
    environment:
      - N8N_BASIC_AUTH_ACTIVE=true
      - N8N_BASIC_AUTH_USER=admin
      - N8N_BASIC_AUTH_PASSWORD=your_password

volumes:
  n8n_data:
EOF

# Start n8n
docker-compose up -d

# Access at http://localhost:5678

npm Installation

npm install -g n8n
n8n start

Set Up WhizoAI in n8n

Method 1: HTTP Request Node (For API Calls)

  1. Add “HTTP Request” node
  2. Configure:
    • Method: POST
    • URL: https://api.whizo.ai/v1/scrape
    • Authentication: Generic Credential Type
    • Add header: Authorization = Bearer whizo_YOUR-API-KEY
    • Body: JSON with scraping options
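For reference, the request these settings produce can be sketched as a plain object (the `buildScrapeRequest` helper is hypothetical; the endpoint, header, and body fields mirror the configuration above):

```javascript
// Builds the request the HTTP Request node sends, as a fetch-style
// { url, init } pair. Purely illustrative; field names mirror the
// node configuration listed above.
function buildScrapeRequest(targetUrl, apiKey) {
  return {
    url: 'https://api.whizo.ai/v1/scrape',
    init: {
      method: 'POST',
      headers: {
        Authorization: `Bearer ${apiKey}`,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify({ url: targetUrl, options: { format: 'markdown' } }),
    },
  };
}
```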

Method 2: Webhook Node (For Receiving Events)

  1. Add “Webhook” node to receive WhizoAI webhook events
  2. Copy the webhook URL
  3. Register in WhizoAI dashboard

Common Workflows

1. Scheduled Website Monitoring

Use Case: Monitor a website daily and alert on changes. Workflow:
Cron Node (Daily at 9 AM)

HTTP Request (WhizoAI Scrape)

Compare with Previous Data

[If Changed] → Send Alert

Store New Data
Implementation: Node 1: Cron
  • Mode: Every day
  • Hour: 9
  • Minute: 0
Node 2: HTTP Request (WhizoAI)
  • Method: POST
  • URL: https://api.whizo.ai/v1/scrape
  • Headers:
    {
      "Authorization": "Bearer whizo_YOUR-API-KEY",
      "Content-Type": "application/json"
    }
    
  • Body:
    {
      "url": "https://competitor.com/pricing",
      "options": {
        "format": "markdown"
      }
    }
    
Node 3: Set (Store Current Data)
  • Add field: currentContent = {{$json["content"]}}
Node 4: Read Binary File (Previous Data)
  • File Path: /data/previous_scrape.json
Node 5: IF (Compare)
  • Condition: {{$node["Set"].json["currentContent"]}} is not equal to {{$node["Read Binary File"].json["content"]}}
Node 6a: Slack (Alert if Changed)
  • Channel: #alerts
  • Message: Website changed! Previous: {{$node["Read Binary File"].json["content"].slice(0, 100)}}... Current: {{$node["Set"].json["currentContent"].slice(0, 100)}}...
Node 6b: Write Binary File (Update Stored Data)
  • File Path: /data/previous_scrape.json
  • Data: {{$node["Set"].json}}
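The compare-and-preview logic in Nodes 5 and 6a can also be collapsed into a single Function-node helper; a sketch (the `detectChange` name and its field names are illustrative, not part of the workflow above):

```javascript
// Compares freshly scraped content with the previously stored copy
// and flags whether anything changed, plus short previews for the
// alert message. Field names are illustrative.
function detectChange(current, previous) {
  return {
    changed: current !== previous,
    previousPreview: (previous || '').slice(0, 100),
    currentPreview: (current || '').slice(0, 100),
  };
}
```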

2. Form Submission → Research → CRM Update

Use Case: Automatically research companies submitted via a form. Workflow:
Webhook (Form Submission)

HTTP Request (Scrape Website)

HTTP Request (AI Extract Company Data)

HTTP Request (Update CRM)

Email (Confirmation)
Node 1: Webhook
  • Path: /form-webhook
Node 2: HTTP Request (Scrape)
  • URL: https://api.whizo.ai/v1/scrape
  • Body:
    {
      "url": "{{$json["body"]["company_website"]}}",
      "options": {
        "format": "markdown"
      }
    }
    
Node 3: HTTP Request (AI Extract)
  • URL: https://api.whizo.ai/v1/extract
  • Body:
    {
      "content": "{{$node["HTTP Request"].json["content"]}}",
      "schema": {
        "company_name": "Company name",
        "industry": "Industry",
        "employee_count": "Number of employees (number)",
        "description": "Brief description"
      },
      "options": {
        "model": "gpt-4"
      }
    }
    
Node 4: HTTP Request (HubSpot CRM)
  • Method: POST
  • URL: https://api.hubapi.com/crm/v3/objects/companies
  • Body: Mapped from AI extraction
Node 5: Send Email
  • To: {{$node["Webhook"].json["body"]["email"]}}
  • Subject: Company Research Complete: {{$node["HTTP Request 1"].json["extractedData"]["company_name"]}}
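The mapping in Node 4 can be sketched as a small helper, assuming HubSpot's standard company properties (`name`, `industry`, `numberofemployees`, `description`) and the extraction schema from Node 3:

```javascript
// Maps the AI extraction result onto the body HubSpot's
// POST /crm/v3/objects/companies endpoint expects ({ properties: ... }).
// The input shape follows the schema defined in Node 3; the HubSpot
// property names are assumptions based on its default company fields.
function toHubSpotCompany(extractedData) {
  return {
    properties: {
      name: extractedData.company_name,
      industry: extractedData.industry,
      numberofemployees: extractedData.employee_count,
      description: extractedData.description,
    },
  };
}
```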

3. RSS Feed → Scrape → Summarize → Publish

Use Case: Aggregate industry news, scrape full articles, summarize, publish to blog. Workflow:
RSS Feed Read

Loop through Items

Scrape Full Article

AI Summarize

Post to WordPress
Node 1: RSS Feed Read
  • URL: https://example.com/rss
Node 2: Split in Batches
  • Batch Size: 5
Node 3: HTTP Request (Scrape)
  • URL: https://api.whizo.ai/v1/batch
  • Body:
    {
      "urls": {{$json["items"].map(item => item.link)}},
      "options": {
        "format": "markdown"
      }
    }
    
Node 4: Wait (For Batch Completion)
  • Amount: 2
  • Unit: minutes
Node 5: HTTP Request (Get Results)
  • URL: https://api.whizo.ai/v1/jobs/{{$node["HTTP Request"].json["jobId"]}}/results
Node 6: HTTP Request (AI Summarize Each)
  • URL: https://api.whizo.ai/v1/extract
  • Body:
    {
      "content": "{{$json["content"]}}",
      "schema": {
        "summary": "2-3 sentence summary",
        "key_points": "3-5 bullet points (array)"
      }
    }
    
Node 7: WordPress
  • Operation: Create Post
  • Title: {{$json["title"]}}
  • Content: Formatted with summary and key points
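Node 7's "formatted with summary and key points" step can be sketched as a Function-node helper (the `formatPost` name and the HTML layout are illustrative):

```javascript
// Formats the summary and key points from Node 6 into simple HTML
// for the WordPress post body. Layout is illustrative.
function formatPost(summary, keyPoints) {
  const bullets = keyPoints.map(point => `<li>${point}</li>`).join('');
  return `<p>${summary}</p><ul>${bullets}</ul>`;
}
```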

4. Batch URL Processing

Use Case: Process a list of URLs from a CSV file or database. Workflow:
Read CSV/Database

Batch URLs (groups of 10)

Scrape Batch

Wait for Completion

Get Results

Save to Database/CSV
Node 1: Spreadsheet File (Read CSV)
  • File Path: /data/urls.csv
Node 2: Function (Create Batches)
const items = $input.all();
const batchSize = 10;
const batches = [];

for (let i = 0; i < items.length; i += batchSize) {
  batches.push({
    urls: items.slice(i, i + batchSize).map(item => item.json.url)
  });
}

return batches.map(batch => ({ json: batch }));
Node 3: HTTP Request (Batch Scrape)
  • URL: https://api.whizo.ai/v1/batch
  • Body:
    { "urls": {{$json["urls"]}} }
Node 4: Wait
  • Amount: {{Math.ceil($json["urls"].length / 5)}} minutes
Node 5: HTTP Request (Get Results)
  • URL: https://api.whizo.ai/v1/jobs/{{$node["HTTP Request"].json["jobId"]}}/results
Node 6: Spreadsheet File (Write Results)
  • File Path: /data/results.csv
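Instead of a fixed Wait, the job can be polled until it finishes. A sketch with the status fetcher injected so the loop stays testable; the `status` values (`completed`, `failed`) are assumptions about the jobs endpoint's response shape:

```javascript
// Polls a job-status endpoint until the job completes or the attempt
// budget runs out. `fetchStatus` is injected (e.g. a fetch wrapper
// around /v1/jobs/{id}) so the loop itself can be tested; the status
// strings are assumptions, not confirmed API values.
async function waitForJob(jobId, fetchStatus, { maxAttempts = 30, delayMs = 10000 } = {}) {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const { status } = await fetchStatus(jobId);
    if (status === 'completed') return true;
    if (status === 'failed') throw new Error(`Job ${jobId} failed`);
    await new Promise(resolve => setTimeout(resolve, delayMs));
  }
  throw new Error(`Job ${jobId} did not finish in time`);
}
```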

Advanced Patterns

Error Handling

Add error handling to workflows: Node: IF (Check Status)
  • Condition 1: {{$node["HTTP Request"].statusCode}} = 200
    • Success path
  • Condition 2: {{$node["HTTP Request"].statusCode}} ≠ 200
    • Error path → Log → Alert → Retry
Retry Logic:
// In Function node
const maxRetries = 3;
const currentRetry = $json.retryCount || 0;

if (currentRetry < maxRetries) {
  return {
    json: {
      ...$json,
      retryCount: currentRetry + 1
    }
  };
}

// Give up after max retries
throw new Error('Max retries exceeded');
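The retry counter above pairs naturally with an exponential backoff before each attempt; a minimal sketch (base and cap values are arbitrary examples):

```javascript
// Computes the delay before a given retry using exponential backoff:
// baseMs * 2^retryCount, capped at maxMs. Values are illustrative.
function backoffDelay(retryCount, baseMs = 1000, maxMs = 60000) {
  return Math.min(baseMs * 2 ** retryCount, maxMs);
}
```

Feed the result into a “Wait” node before re-running the request.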

Webhook-Driven Workflows

Receive WhizoAI job completion webhooks: Node 1: Webhook
  • Path: /whizoai-webhook
  • Authentication: Header Auth
    • Name: X-WhizoAI-Signature
    • Value: Verify with your secret
Node 2: Function (Verify Signature)
// Requires NODE_FUNCTION_ALLOW_BUILTIN=crypto in the n8n environment
const crypto = require('crypto');

const signature = $node["Webhook"].json.headers["x-whizoai-signature"];
const payload = JSON.stringify($node["Webhook"].json.body);
// Expression syntax ({{ }}) is not available inside Function code;
// read the secret from an environment variable instead
const secret = process.env.WHIZOAI_WEBHOOK_SECRET;

const expectedSignature = crypto
  .createHmac('sha256', secret)
  .update(payload)
  .digest('hex');

if (signature !== expectedSignature) {
  throw new Error('Invalid signature');
}

return { json: $node["Webhook"].json.body };
Node 3: Switch (Event Type)
  • Route by {{$json["event"]}}
    • job.completed → Success handler
    • job.failed → Error handler
    • credit.low → Alert handler

Data Transformation

Transform scraped data before saving: Node: Function (Transform Data)
const items = $input.all();

return items.map(item => ({
  json: {
    url: item.json.metadata.url,
    title: item.json.metadata.title,
    content: item.json.content,
    wordCount: item.json.content.split(' ').length,
    scrapedAt: item.json.metadata.extractedAt,
    summary: item.json.content.substring(0, 200) + '...'
  }
}));

Best Practices

Store API keys securely:
  1. Go to n8n Settings → Credentials
  2. Add “Header Auth” credential
  3. Name: WhizoAI API Key
  4. Value: Bearer whizo_YOUR-API-KEY
  5. Use in HTTP Request nodes
Always log errors:
  • Add “Write Binary File” node to error paths
  • Log to file: /logs/errors-{{$now.format("YYYY-MM-DD")}}.json
  • Include full error context and request details
Optimize performance:
  • Use batch operations when scraping multiple URLs
  • Implement rate limiting with “Wait” nodes
  • Cache results to avoid duplicate scraping
  • Use lazy loading for large datasets
Keep workflows maintainable:
  • Use sticky notes to document complex logic
  • Group related nodes together
  • Name nodes descriptively
  • Version control your workflows (export JSON)
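The caching advice can be sketched as a small TTL cache keyed by URL (illustrative only; a real workflow would persist this, for example via n8n workflow static data):

```javascript
// In-memory TTL cache keyed by URL: returns the cached result while
// it is still fresh, otherwise undefined. The clock is injected so
// the behavior can be tested; values are illustrative.
function makeScrapeCache(ttlMs = 24 * 60 * 60 * 1000, now = Date.now) {
  const store = new Map();
  return {
    get(url) {
      const hit = store.get(url);
      return hit && now() - hit.at < ttlMs ? hit.result : undefined;
    },
    set(url, result) {
      store.set(url, { at: now(), result });
    },
  };
}
```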

Example Workflow JSON

Basic Scraping Workflow

{
  "name": "WhizoAI Scrape Example",
  "nodes": [
    {
      "parameters": {
        "rule": {
          "interval": [{"field": "hours", "hoursInterval": 24}]
        }
      },
      "name": "Schedule",
      "type": "n8n-nodes-base.cron",
      "position": [250, 300]
    },
    {
      "parameters": {
        "url": "https://api.whizo.ai/v1/scrape",
        "authentication": "predefinedCredentialType",
        "nodeCredentialType": "headerAuth",
        "sendBody": true,
        "bodyParameters": {
          "parameters": [
            {
              "name": "url",
              "value": "https://example.com"
            },
            {
              "name": "options",
              "value": {"format": "markdown"}
            }
          ]
        }
      },
      "name": "WhizoAI Scrape",
      "type": "n8n-nodes-base.httpRequest",
      "position": [450, 300]
    }
  ],
  "connections": {
    "Schedule": {
      "main": [[{"node": "WhizoAI Scrape", "type": "main", "index": 0}]]
    }
  }
}
Import this into n8n to get started quickly!

Monitoring & Debugging

Enable Execution Logging

# In docker-compose.yml or environment
N8N_LOG_LEVEL=debug
N8N_LOG_OUTPUT=console,file

View Execution History

  1. Click “Executions” in n8n sidebar
  2. Filter by workflow
  3. Click execution to see detailed logs
  4. Review each node’s input/output

Common Issues

| Issue | Solution |
| --- | --- |
| Authentication failed | Check API key format and credentials |
| Timeout errors | Increase timeout in HTTP Request node settings |
| Memory issues | Process data in smaller batches |
| Webhook not receiving | Verify webhook URL and check firewall rules |