Taint Analysis
Qryon's taint analysis engine tracks untrusted data from sources to sinks, detecting vulnerabilities that span multiple functions and files.
What is Taint Analysis?
Taint analysis tracks the flow of untrusted ("tainted") data through your code. It identifies when data from external sources (like user input) reaches sensitive operations (like SQL queries) without proper sanitization.
Key Concepts
| Term | Definition | Examples |
|---|---|---|
| Source | Origin of untrusted data | req.body, request.args, stdin |
| Sink | Security-sensitive operation | db.query(), shell(), DOM methods |
| Sanitizer | Function that makes data safe | escape(), parseInt(), validators |
| Propagator | Transfers taint to new values | concat(), slice(), + |
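The four roles in the table can be illustrated with a toy analysis over a list of statements. This is a minimal sketch for intuition only, not how Qryon is implemented: the `Stmt` type, the `find_flows` function, and the three rule sets are hypothetical names introduced here, and a real engine works on parsed source code rather than hand-built records.

```python
from dataclasses import dataclass

# Hypothetical rule sets for this sketch; real rule names will differ.
SOURCES = {"req.params.id", "req.body.username"}
SINKS = {"db.query", "shell"}
SANITIZERS = {"parseInt", "escape"}


@dataclass
class Stmt:
    target: str   # variable assigned ("" when the result is unused)
    func: str     # function applied ("" for a plain read or copy)
    args: tuple   # variable names or source expressions passed in


def find_flows(program):
    """Return (sink, argument) pairs where tainted data reaches a sink."""
    tainted = set()
    findings = []
    for stmt in program:
        tainted_args = [a for a in stmt.args if a in SOURCES or a in tainted]
        if stmt.func in SINKS:
            findings.extend((stmt.func, a) for a in tainted_args)
        if not stmt.target:
            continue
        if stmt.func in SANITIZERS or not tainted_args:
            tainted.discard(stmt.target)  # sanitizer or clean input: result is safe
        else:
            tainted.add(stmt.target)      # propagator: taint carries over


    return findings


program = [
    Stmt("id", "", ("req.params.id",)),     # source
    Stmt("upper", "toUpperCase", ("id",)),  # propagator
    Stmt("num", "parseInt", ("id",)),       # sanitizer
    Stmt("", "db.query", ("num",)),         # safe: sanitized value
    Stmt("", "db.query", ("upper",)),       # finding: tainted value
]
print(find_flows(program))
```

Running the sketch reports a single flow, `('db.query', 'upper')`: the value that passed through `parseInt` is treated as clean, while the upper-cased copy still carries taint from the source.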
How It Works
1. Source Identification
Qryon starts by identifying the expressions that introduce untrusted data:
// JavaScript/TypeScript sources
req.params.id // URL parameters
req.query.search // Query string
req.body.username // POST body
req.headers['x-token'] // HTTP headers
req.cookies.session // Cookies
# Python sources
request.args.get('id') # Flask query params
request.form['username'] # Form data
request.json # JSON body
os.environ.get('INPUT') # Environment vars
// Java sources
request.getParameter("id") // Servlet params
request.getHeader("token") // Headers
System.getenv("INPUT") // Environment
2. Taint Propagation
Taint flows through operations that use or transform the data:
const id = req.params.id; // id is tainted
const upper = id.toUpperCase(); // upper is tainted
const query = "SELECT * FROM " + id; // query is tainted
const parts = id.split('-'); // parts[0], parts[1] are tainted
3. Sink Detection
Qryon alerts when tainted data reaches dangerous operations:
// VULNERABLE: Tainted data in SQL query
const id = req.params.id;
db.query(`SELECT * FROM users WHERE id = '${id}'`);
// Finding: sql-injection - Tainted data flows to db.query()
4. Sanitizer Recognition
Qryon recognizes when data has been sanitized or is used safely, and treats those flows as safe:
const id = req.params.id;
// SAFE: Sanitized with parseInt
const numId = parseInt(id, 10);
db.query('SELECT * FROM users WHERE id = ?', [numId]);
// SAFE: Using parameterized query
db.query('SELECT * FROM users WHERE id = ?', [id]);
// SAFE: Using allowlist validation
if (ALLOWED_IDS.includes(id)) {
db.query(`SELECT * FROM users WHERE id = '${id}'`);
}
Cross-File Analysis
Qryon tracks tainted data across file boundaries using import resolution and call graph analysis:
// routes/users.js
import { findUser } from '../services/userService';
app.get('/user/:id', (req, res) => {
const user = findUser(req.params.id); // Taint flows to findUser
res.json(user);
});
// services/userService.js
import { db } from '../db';
export function findUser(id) { // id is tainted from caller
// VULNERABLE: Tainted data reaches SQL sink
return db.query(`SELECT * FROM users WHERE id = '${id}'`);
}
Interprocedural Flow Report
[HIGH] sql-injection
Flow: req.params.id -> findUser(id) -> db.query()
Step 1: routes/users.js:4
Source: HTTP request parameter
Step 2: services/userService.js:4
Propagation: Function parameter
Step 3: services/userService.js:6
Sink: SQL query
Taint Sources by Language
JavaScript/TypeScript
// Express.js
req.params, req.query, req.body, req.headers, req.cookies
// Node.js
process.argv, process.env, fs.readFileSync()
// Browser
window.location, document.URL, document.cookie
document.getElementById().value, localStorage.getItem()
Python
# Flask
request.args, request.form, request.json, request.headers
# Django
request.GET, request.POST, request.body
# General
sys.argv, os.environ, input(), open().read()
Java
// Servlet
request.getParameter(), request.getHeader(), request.getCookies()
// Spring
@RequestParam, @PathVariable, @RequestBody, @RequestHeader
// General
System.getProperty(), System.getenv(), Scanner.next()
Configuring Taint Analysis
# rma.toml
[taint]
# Enable cross-file analysis
interprocedural = true
# Maximum call depth for tracking
max_depth = 10
# Custom sources
[[taint.sources]]
pattern = "getUntrustedInput()"
languages = ["javascript", "typescript"]
# Custom sinks
[[taint.sinks]]
pattern = "dangerousOperation($ARG)"
languages = ["javascript", "typescript"]
sink_arg = "$ARG"
# Custom sanitizers
[[taint.sanitizers]]
pattern = "sanitize($INPUT)"
languages = ["javascript", "typescript"]
sanitizes = "$INPUT"
Viewing Taint Flows
# Show all taint flows
rma scan . --show-flows
# Interactive TUI (recommended)
rma scan --interactive
# Press 'f' to view cross-file flows
# JSON output with flow details
rma scan . --format json --show-flows | jq '.findings[].flow'
Limitations
- Dynamic code: dynamically generated code and dynamic requires may not be analyzed
- Reflection: calls made through Java reflection or Python getattr may not be resolved
- Callbacks: Complex callback chains may lose taint
- External libraries: Taint may not propagate through non-analyzed code
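One way to picture why flows through non-analyzed code can be missed is a bounded walk over a call graph. The sketch below is an assumption-laden illustration, not Qryon's algorithm: `CALL_GRAPH`, `route_handler`, `vendor.transform`, and `reachable_sinks` are hypothetical, and the `max_depth` parameter only borrows the name of the configuration option.

```python
# Hypothetical call graph: each function maps to the callees that receive
# its tainted argument. Names loosely mirror the examples in this document.
CALL_GRAPH = {
    "route_handler": ["findUser", "vendor.transform"],
    "findUser": ["db.query"],
    # "vendor.transform" has no entry: its code was not analyzed.
}
SINKS = {"db.query"}


def reachable_sinks(start, max_depth):
    """Return every call path from `start` to a sink within `max_depth` calls."""
    paths = []
    stack = [(start, [start])]
    while stack:
        func, path = stack.pop()
        if func in SINKS:
            paths.append(path)
            continue
        if len(path) - 1 >= max_depth:
            continue  # depth budget used up: deeper flows go unreported
        # A callee without an entry contributes no edges, so taint that
        # enters it is silently dropped in this model.
        for callee in CALL_GRAPH.get(func, []):
            stack.append((callee, path + [callee]))
    return paths


print(reachable_sinks("route_handler", max_depth=10))
print(reachable_sinks("route_handler", max_depth=1))
```

With a depth of 10 the walk finds the path `route_handler -> findUser -> db.query`; with a depth of 1 it finds nothing. In both cases any sink reached only through `vendor.transform` stays invisible, because the walk has no edges to follow out of it.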
Best Practices
- Use parameterized queries instead of string interpolation
- Validate early - sanitize input at the boundary
- Use typed parsers - parseInt(), JSON.parse()
- Allowlist validation - prefer allowlists over blocklists
- Context-aware encoding - HTML encode for HTML, URL encode for URLs
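Several of these practices can be combined in one lookup function. The following is a minimal sketch using Python's standard `sqlite3` module; the `users` table, the `find_user` helper, and the `ALLOWED_SORT_COLUMNS` allowlist are assumptions made for illustration.

```python
import sqlite3

ALLOWED_SORT_COLUMNS = {"id", "name"}  # hypothetical allowlist


def find_user(conn, raw_id, sort_by="id"):
    """Look up a user from untrusted input without building SQL from it."""
    user_id = int(raw_id)  # typed parser: raises ValueError on non-numeric input
    if sort_by not in ALLOWED_SORT_COLUMNS:  # allowlist for identifiers
        raise ValueError(f"unsupported sort column: {sort_by!r}")
    # The value is bound as a parameter; only the allowlisted identifier is
    # interpolated, because placeholders cannot stand in for column names.
    query = f"SELECT id, name FROM users WHERE id = ? ORDER BY {sort_by}"
    return conn.execute(query, (user_id,)).fetchall()


# In-memory database so the sketch runs on its own.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO users (id, name) VALUES (?, ?)", (1, "alice"))

print(find_user(conn, "1"))
```

A request for `"1"` returns the matching row, while an injection attempt such as `"1 OR 1=1"` fails at the `int()` call with a `ValueError` before any SQL is executed.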