
API Documentation

Complete reference for the Data From URL REST API

POST /api/extract

Extract structured data from a single URL

Request Body

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| url | string | Required | The URL to extract data from |
| options.contentType | string | Optional | Expected content type: 'article', 'product', 'general', or 'auto' |
| options.preferBrowser | boolean | Optional | Use browser rendering for JavaScript-heavy sites |
| options.includeMetadata | boolean | Optional | Include additional metadata in response |
| options.timeout | number | Optional | Request timeout in milliseconds (default: 8000) |
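
The parameters above can be assembled into a request body as in the following sketch. The function name `build_extract_payload` and its keyword arguments are hypothetical and not part of the API; the client-side validation only mirrors what the table states and may differ from what the server enforces.

```python
import json

# Values listed for options.contentType in the parameter table.
ALLOWED_CONTENT_TYPES = {"article", "product", "general", "auto"}


def build_extract_payload(url, content_type=None, prefer_browser=None,
                          include_metadata=None, timeout_ms=None):
    """Build the JSON body for POST /api/extract, omitting unset options."""
    if not isinstance(url, str) or not url:
        raise ValueError("url is required and must be a non-empty string")
    if content_type is not None and content_type not in ALLOWED_CONTENT_TYPES:
        raise ValueError(f"unsupported contentType: {content_type!r}")
    if timeout_ms is not None and timeout_ms <= 0:
        raise ValueError("timeout must be a positive number of milliseconds")

    options = {
        "contentType": content_type,
        "preferBrowser": prefer_browser,
        "includeMetadata": include_metadata,
        "timeout": timeout_ms,
    }
    # Leave out options the caller did not set so the server defaults apply.
    options = {key: value for key, value in options.items() if value is not None}

    payload = {"url": url}
    if options:
        payload["options"] = options
    return payload


body = build_extract_payload("https://example.com/article",
                             content_type="article", include_metadata=True)
print(json.dumps(body, indent=2))
```

Omitting unset options, rather than sending explicit nulls, leaves defaults such as the 8000 ms timeout to the server.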

Example Request

curl -X POST http://localhost:4000/api/extract \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com/article",
    "options": {
      "contentType": "article",
      "includeMetadata": true
    }
  }'

Example Response

{
  "success": true,
  "data": {
    "title": "Example Article Title",
    "description": "Article description...",
    "content": "Full article text...",
    "author": "John Doe",
    "publishedDate": "2025-01-15",
    "url": "https://example.com/article"
  },
  "metadata": {
    "contentType": "article",
    "confidence": 0.95,
    "extractionTime": 1234
  }
}
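
A minimal Python sketch of calling this endpoint and reading the response, using only the standard library. The helper names `extract` and `summarize` and the `min_confidence` threshold are illustrative assumptions; the response shape is taken from the example response, and error responses are not documented here, so they are not handled beyond the `success` flag.

```python
import json
import urllib.request

BASE_URL = "http://localhost:4000"  # host and port taken from the curl example


def extract(url, options=None, http_timeout_s=15.0):
    """POST to /api/extract and return the decoded JSON response."""
    payload = {"url": url}
    if options:
        payload["options"] = options
    request = urllib.request.Request(
        f"{BASE_URL}/api/extract",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=http_timeout_s) as response:
        return json.loads(response.read().decode("utf-8"))


def summarize(result, min_confidence=0.8):
    """Return (title, trusted) from a response shaped like the example.

    min_confidence is an arbitrary illustrative threshold, not an API value.
    """
    if not result.get("success"):
        raise RuntimeError("extraction did not succeed")
    title = result["data"].get("title")
    # metadata is assumed to be present only when includeMetadata was requested.
    confidence = result.get("metadata", {}).get("confidence")
    trusted = confidence is not None and confidence >= min_confidence
    return title, trusted


# Usage, with the service running locally:
#   result = extract("https://example.com/article",
#                    {"contentType": "article", "includeMetadata": True})
#   title, trusted = summarize(result)
```
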
POST /api/extract/batch

Extract data from multiple URLs at once (max 10)

Request Body

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| urls | string[] | Required | Array of URLs to extract (max 10) |
| options | object | Optional | Same options as single extraction |

Example Request

curl -X POST http://localhost:4000/api/extract/batch \
  -H "Content-Type: application/json" \
  -d '{
    "urls": [
      "https://example.com/page1",
      "https://example.com/page2"
    ],
    "options": {
      "includeMetadata": true
    }
  }'
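
Because a batch request accepts at most 10 URLs, a longer list has to be split across several requests. The following sketch shows one way to do that on the client side; `chunk_urls` and `batch_payloads` are hypothetical helper names, and how the server responds to an oversized batch is not documented here.

```python
def chunk_urls(urls, max_batch_size=10):
    """Split urls into consecutive lists of at most max_batch_size items."""
    if max_batch_size < 1:
        raise ValueError("max_batch_size must be at least 1")
    return [urls[i:i + max_batch_size]
            for i in range(0, len(urls), max_batch_size)]


def batch_payloads(urls, options=None):
    """Build one POST /api/extract/batch body per group of up to 10 URLs."""
    payloads = []
    for group in chunk_urls(list(urls)):
        payload = {"urls": group}
        if options:
            payload["options"] = options
        payloads.append(payload)
    return payloads


urls = [f"https://example.com/page{i}" for i in range(1, 24)]  # 23 URLs
payloads = batch_payloads(urls, {"includeMetadata": True})
print([len(p["urls"]) for p in payloads])  # prints [10, 10, 3]
```
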
POST /api/extract/async

Submit an extraction job that runs asynchronously, for long-running extractions

Example Request

curl -X POST http://localhost:4000/api/extract/async \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com/slow-page"
  }'

Example Response

{
  "success": true,
  "jobId": "job_abc123",
  "status": "pending"
}
GET /api/jobs/{jobId}

Check the status of an async job, using the jobId returned by POST /api/extract/async
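
The response format of this endpoint is not shown here, so the polling sketch below rests on assumptions: that the response carries a `status` field like the one in the job submission response, and that any value other than "pending" means the job has finished. The names `fetch_job` and `wait_for_job`, the polling interval, and the attempt limit are all hypothetical.

```python
import json
import time
import urllib.parse
import urllib.request

BASE_URL = "http://localhost:4000"  # host and port from the curl examples


def fetch_job(job_id):
    """GET /api/jobs/{jobId} and return the decoded JSON body."""
    path = urllib.parse.quote(job_id, safe="")
    with urllib.request.urlopen(f"{BASE_URL}/api/jobs/{path}", timeout=15) as response:
        return json.loads(response.read().decode("utf-8"))


def wait_for_job(job_id, fetch=fetch_job, interval_s=1.0, max_attempts=30,
                 sleep=time.sleep):
    """Poll until the job leaves "pending" or the attempts run out.

    ASSUMPTION: the status response has a "status" field, and any value
    other than "pending" is final.
    """
    for attempt in range(max_attempts):
        job = fetch(job_id)
        if job.get("status") != "pending":
            return job
        if attempt < max_attempts - 1:
            sleep(interval_s)
    raise TimeoutError(f"job {job_id} still pending after {max_attempts} checks")


# Usage, with the service running locally:
#   job = wait_for_job("job_abc123")
```

The `fetch` and `sleep` parameters are injectable so the loop can be exercised without a running server.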

GET /api/health

Check service health and available capabilities

Example Response

{
  "status": "healthy",
  "capabilities": {
    "aiParsing": true,
    "browserRendering": true,
    "proxySupport": false
  }
}
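
One possible use of the health response is to decide which extraction options to request. The sketch below sets `preferBrowser` only when the service reports `browserRendering`; linking that capability flag to that option is an assumption based on their names, and `options_for` is a hypothetical helper.

```python
def options_for(health, wants_browser):
    """Choose extraction options from a parsed /api/health response."""
    capabilities = health.get("capabilities", {})
    options = {}
    # ASSUMPTION: browserRendering indicates whether preferBrowser can be honored.
    if (wants_browser
            and health.get("status") == "healthy"
            and capabilities.get("browserRendering")):
        options["preferBrowser"] = True
    return options


health = {
    "status": "healthy",
    "capabilities": {
        "aiParsing": True,
        "browserRendering": True,
        "proxySupport": False,
    },
}
print(options_for(health, wants_browser=True))  # prints {'preferBrowser': True}
```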