Taint Analysis

Runtime DOM XSS detection via browser-based taint tracking. Traces data flow from attacker-controlled sources to dangerous sinks using CDP JavaScript injection.

🔗How It Works

  1. InjectTAINT_INSTRUMENTATION_JS is injected into the page via execute_js(). It hooks source reads and sink writes using Object.defineProperty, Proxy, and function patching.
  2. Navigate — the target URL loads with hooks active. Every time tainted data reaches a sink, a flow record is created.
  3. CollectTAINT_COLLECTOR_JS harvests window.__huginTaintFlows from the page and same-origin iframes.
  4. Classify — each flow gets a severity based on sink type and a CWE category.
  5. Report — confirmed flows are auto-created as FindingRecord entries in the store, deduped by URL + sink type.

🔗Source Types (7)

SourceWhat it tracks
url_paramlocation.search parameters
url_hashlocation.hash fragment
referrerdocument.referrer
cookiedocument.cookie reads
window_namewindow.name
postmessageevent.data from message events
storagelocalStorage.getItem() / sessionStorage.getItem()

🔗Sink Types (14)

SinkRiskSeverity
innerHTMLDOM XSSHigh
outerHTMLDOM XSSHigh
document.writeDOM XSSHigh
evalCode executionHigh
Function constructorCode executionHigh
setTimeout(string)Code executionHigh
setInterval(string)Code executionHigh
insertAdjacentHTMLDOM XSSHigh
jQuery.htmlDOM XSSHigh
dynamic_scriptScript injectionHigh
location.assignOpen redirectMedium
location.replaceOpen redirectMedium
location.hrefOpen redirectMedium
script.srcScript injectionMedium

🔗Usage

🔗MCP Tool: taint

Actions:

  • scan — quick URL scan (no full browser session, fastest path)
  • scan_browser — scan within an already-running browser session
  • analyze — full all-in-one: navigate, inject, optionally interact, collect
  • analyze_flow — re-analyze a previously captured flow
  • collect — harvest taint flows from an already-instrumented page

All-in-one analysis:

taint action:"analyze" url:"https://target.com/page?q=test" wait_ms:5000 interact:true

wait_ms defaults to 3000. interact:true triggers hashchange / popstate / message events to surface listener-based sinks.

Manual workflow:

taint action:"scan_browser"   # inject + collect against the live browser session

🔗Orchestrator (LLM Agent)

The LLM chains browser + JS tools:

  1. browser_navigate(url: "https://target.com") — open in proxied Chrome
  2. browser_exec_js(script: "<instrumentation JS>") — inject hooks
  3. browser_exec_js(script: "document.querySelector('input').value = '<img src=x>'") — simulate input
  4. browser_exec_js(script: "<collector JS>") — harvest taint flows
  5. findings_create(...) — record confirmed DOM XSS

🔗Auto-Findings

When analyze finds source→sink flows, it automatically creates findings:

  • Title: DOM XSS: url_param → innerHTML (example)
  • Severity: High (DOM XSS sinks) or Medium (redirect sinks)
  • CWE: CWE-79 (DOM XSS), CWE-601 (Open Redirect), CWE-829 (Cross-Origin)
  • Evidence: source value, sink element, JavaScript call stack
  • Dedup key: SHA-256 of URL + sink type (prevents duplicate findings)

🔗Implementation

  • Types: hugin-shared/src/taint.rsTaintFlow, TaintAnalysisResult, severity/category classifiers
  • JS: TAINT_INSTRUMENTATION_JS (~350 lines) + TAINT_COLLECTOR_JS in same file
  • MCP tool: hugin-mcp/src/tools/taint.rsanalyze, collect, inject actions
  • Orchestrator: browser tools in hugin-service/src/orchestration/tools.rs

🔗Resilience

  • Every hook wrapped in try/catch — instrumentation never breaks the page
  • Errors collected in window.__huginTaintErrors for debugging
  • Idempotent — re-injection checks window.__huginTaintInstrumented flag
  • Values truncated to 500 chars to prevent memory issues
  • Works with CSP since injection is via CDP (not external script tag)