Update a crawler
PATCH /api/v2/organizations/{organization}/projects/{project}/crawlers/{crawler}
Authorizations
Parameters
Path Parameters
Organization identifier
Project identifier
Crawler identifier
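As a minimal sketch of how the path parameters fit into the endpoint, the snippet below prepares (but does not send) a PATCH request using only the Python standard library. The base URL, the bearer-token header, the placeholder identifiers, and the `name` body field are all assumptions made for illustration; this reference does not state the host, the authorization scheme, or the JSON field names.

```python
import json
import urllib.request
from urllib.parse import quote

# Assumption: placeholder host; the reference does not name the API host.
BASE_URL = "https://api.example.com"


def build_update_request(organization, project, crawler, changes, token):
    """Prepare (but do not send) a PATCH request for one crawler."""
    path = "/api/v2/organizations/{}/projects/{}/crawlers/{}".format(
        quote(str(organization), safe=""),
        quote(str(project), safe=""),
        quote(str(crawler), safe=""),
    )
    return urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(changes).encode("utf-8"),
        method="PATCH",
        headers={
            "Content-Type": "application/json",
            # Assumption: bearer-token auth; the scheme is not documented here.
            "Authorization": "Bearer " + token,
        },
    )


request = build_update_request(
    "my-org",          # placeholder organization identifier
    "my-project",      # placeholder project identifier
    "my-crawler",      # placeholder crawler identifier
    {"name": "Test Crawler"},  # "name" is an assumed field name
    token="token123",
)
```

Passing the prepared object to `urllib.request.urlopen(request)` would send it. Each path segment is percent-encoded so that an identifier containing a slash cannot change the route.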
Request Body required
object
WAF operation mode
OWASP paranoia level
WAF rule IDs to allow/whitelist
IP addresses to allow
IP addresses to block
ASN numbers to block
User agent patterns to block
Referer patterns to block
Slack webhook URL for notifications
Example: https://hooks.slack.com/services/XXX
Minimum hits per minute to trigger Slack notification
Example: 100
Email addresses for notifications
Project Honey Pot HTTP:BL configuration
object
Enable HTTP:BL
Block suspicious IPs
Block email harvesters
Block spam sources
Block search engines
HTTP:BL API key
Enable predefined block lists
object
Block known bad user agents
Block known bad referers
Block known bad IPs
Block AI crawlers
Rate limiting thresholds
object
Threshold type
Requests per second limit (for ip/header)
Example: 10
Hit count limit (for waf_hit_by_ip)
Example: 10
Time window in minutes (for waf_hit_by_ip)
Example: 5
Cooldown period in seconds
Example: 30
Threshold enforcement mode
Header name (for header type)
Slack webhook for this threshold
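The threshold fields above apply to different threshold types: the requests-per-second limit to `ip` and `header`, the hit count and time window to `waf_hit_by_ip`, and the header name to `header`. The sketch below checks a threshold object against that mapping before it is sent. The key names (`type`, `rps`, `hits`, `minutes`, `value`) are hypothetical, since this reference lists descriptions only, and treating these fields as required for their type is also an assumption.

```python
# Hypothetical key names; only the type values come from the field
# descriptions (ip, header, waf_hit_by_ip).
FIELDS_BY_TYPE = {
    "ip": ("rps",),
    "header": ("rps", "value"),
    "waf_hit_by_ip": ("hits", "minutes"),
}


def missing_threshold_fields(threshold):
    """Return the type-specific fields that a threshold does not set."""
    kind = threshold.get("type")
    if kind not in FIELDS_BY_TYPE:
        raise ValueError(f"unknown threshold type: {kind!r}")
    return [name for name in FIELDS_BY_TYPE[kind] if name not in threshold]


complete = missing_threshold_fields({"type": "ip", "rps": 10, "cooldown": 30})
incomplete = missing_threshold_fields({"type": "waf_hit_by_ip", "hits": 10})
```

Here `complete` is an empty list, while `incomplete` names the time window that a `waf_hit_by_ip` threshold would still need under these assumptions.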
Crawler name
Example: Test Crawler
Domain to crawl
Example: test-domain.com
Enable browser mode
Execute JavaScript during asset collection (only when browser_mode is enabled)
Example: true
URLs to crawl
Example: ["/", "/about", "/contact"]
Starting URLs for crawl
Example: ["/", "/blog"]
Custom headers
object
Example: {"Authorization": "Bearer token123", "X-Custom-Header": "value"}
URL patterns to exclude (regex)
Example: ["/admin/*", "/private/*"]
URL patterns to include (regex)
Example: ["/blog/*", "/products/*"]
Webhook URL for notifications
Example: https://example.com/webhook
Authorization header for webhook
Example: Bearer token123
Extra variables for webhook
Example: key1=value1&key2=value2
Number of concurrent workers (verified domains only)
Example: 4
Delay between requests in seconds (verified domains only)
Example: 0.25
Maximum crawl depth, -1 for unlimited (verified domains only)
Example: -1
Maximum total requests, 0 for unlimited (verified domains only)
Example: 1000
Maximum HTML pages, 0 for unlimited (verified domains only)
Example: 100
HTTP status codes that will result in content being captured and pushed to Quant (verified domains only)
Example: [200, 201]
Sitemap configuration (verified domains only)
object
Example: [{"url": "/sitemap.xml", "recursive": true}]
Allowed domains for multi-domain crawling, automatically enables merge_domains (verified domains only)
Example: ["example.com", "assets.example.com"]
Custom user agent, only when browser_mode is false (verified domains only)
Example: Mozilla/5.0...
Asset harvesting configuration (verified domains only)
object
Example: {"network_intercept": {"enabled": true, "timeout": 30}}
Maximum errors before stopping crawl (verified domains only)
Example: 1000
Responses
200
The request has succeeded.
object
WAF operation mode
OWASP paranoia level
WAF rule IDs to allow/whitelist
IP addresses to allow
IP addresses to block
ASN numbers to block
User agent patterns to block
Referer patterns to block
Slack webhook URL for notifications
Example: https://hooks.slack.com/services/XXX
Minimum hits per minute to trigger Slack notification
Example: 100
Email addresses for notifications
Project Honey Pot HTTP:BL configuration
object
Enable HTTP:BL
Block suspicious IPs
Block email harvesters
Block spam sources
Block search engines
HTTP:BL API key
Enable predefined block lists
object
Block known bad user agents
Block known bad referers
Block known bad IPs
Block AI crawlers
Rate limiting thresholds
object
Threshold type
Requests per second limit (for ip/header)
Example: 10
Hit count limit (for waf_hit_by_ip)
Example: 10
Time window in minutes (for waf_hit_by_ip)
Example: 5
Cooldown period in seconds
Example: 30
Threshold enforcement mode
Header name (for header type)
Slack webhook for this threshold
Crawler ID
Example: 456
Crawler name
Example: Test Crawler
Project ID
Example: 789
Crawler UUID
Example: 550e8400-e29b-41d4-a716-446655440000
Crawler configuration (YAML)
Example: domain: test-domain.com\nconfig:\n  max_html: 100\n  browser_mode: false
Crawler domain
Example: test-domain.com
Domain verification status
Example: 1
URLs list (YAML)
Example: single_url:\n  - /\n  - /about\n  - /contact
Creation timestamp
Example: 2024-01-20T09:15:00Z
Last update timestamp
Example: 2024-10-11T16:45:00Z
Deletion timestamp
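The timestamps in the 200 response are shown as UTC strings ending in `Z`. As a rough sketch, a client might convert them to timezone-aware values after decoding the JSON body. The key names (`uuid`, `created_at`, `updated_at`, `deleted_at`) are assumptions, as is the reading that a null or absent deletion timestamp means the crawler has not been deleted.

```python
from datetime import datetime, timezone


def parse_timestamp(value):
    """Parse a UTC timestamp like 2024-01-20T09:15:00Z; pass None through."""
    if value is None:
        return None
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ")
    return parsed.replace(tzinfo=timezone.utc)


def summarize_crawler(body):
    """Reduce a decoded 200 response body to a few typed values."""
    return {
        "uuid": body.get("uuid"),
        "created": parse_timestamp(body.get("created_at")),
        "updated": parse_timestamp(body.get("updated_at")),
        # Assumption: a missing or null deletion timestamp means "not deleted".
        "deleted": body.get("deleted_at") is not None,
    }


# Sample values taken from the examples in this reference; key names assumed.
sample = {
    "uuid": "550e8400-e29b-41d4-a716-446655440000",
    "created_at": "2024-01-20T09:15:00Z",
    "updated_at": "2024-10-11T16:45:00Z",
    "deleted_at": None,
}
summary = summarize_crawler(sample)
```

Comparing `summary["updated"]` with a locally stored value is one way to confirm that an update was applied.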
400
The server could not understand the request due to invalid syntax.
object
Error message
Example: The requested resource was not found
Error flag
Example: true
403
Access is forbidden.
object
Error message
Example: The requested resource was not found
Error flag
Example: true
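Both error responses carry an error message and an error flag. The sketch below shows one way a client could turn a failed update into a single log line. The `message` key is an assumed name for the error message field, and the fallback behaviour for bodies that are not JSON objects is a design choice of this example, not something the API specifies.

```python
import json

# Status descriptions as given in the Responses section.
STATUS_TEXT = {
    400: "The server could not understand the request due to invalid syntax.",
    403: "Access is forbidden.",
}


def describe_failure(status, raw_body):
    """Build a one-line description of a failed crawler update."""
    try:
        body = json.loads(raw_body)
    except (ValueError, TypeError):  # body is not JSON, or is None
        body = {}
    if not isinstance(body, dict):
        body = {}
    # "message" is an assumed key for the "Error message" field.
    message = body.get("message") or "no error message in response body"
    reason = STATUS_TEXT.get(status, "Unexpected status.")
    return f"HTTP {status}: {reason} Server said: {message}"


line = describe_failure(
    403, '{"message": "The requested resource was not found", "error": true}'
)
```

Falling back to a generic message keeps the helper usable when a proxy or gateway returns a body that is not JSON.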