Web Scanner lets users target the body data of a webpage in several ways, so security teams can scan for similar websites based on hash values, similarity scores, JavaScript data, and dark web content.
Here's a selection of useful data types. Click here for a full list of field names.
Data types at a glance
Data type | What it does | Use case | Field Names | Examples |
---|---|---|---|---|
SHA-256 Body Data | Generates exact-match hashes for webpage body, header, or footer content. | Detect identical pages (e.g., error or holding pages). | | Two websites with identical 404 error pages share the same |
JavaScript Data | Tracks JavaScript files via SHA-256 (exact) and ssdeep (fuzzy) hashes. | Identify variations in JS files, even with different parameters (e.g., | | A phishing site uses |
Language Data | Identifies the website’s language to reveal intended audience. | Flag mismatches (e.g., Chinese content on a | | A |
Onion Data | Lists Tor | Connect clearnet sites to dark web activity (e.g., phishing kits). | | A clearnet site links to |
Script Hash Value (SHV) | Creates a fingerprint of script names, ignoring parameters, for fuzzy matching. | Group similar websites using common scripts (e.g., phishing kits with jQuery). | | Two phishing sites use |
HTML Body Similarity | Measures similarity (0–100) between current and previous webpage scans. | Track changes in content (e.g., 91 means 9% difference). | | A site’s similarity score drops to 85, indicating a 15% content change, possibly a new phishing page. |
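
To make three of these data types more concrete, the sketch below shows one way the underlying ideas could be reproduced with the Python standard library. It is an illustration only and does not describe how Web Scanner computes its values: the function names (`sha256_body`, `script_name_fingerprint`, `body_similarity`) are hypothetical, the script-name fingerprint is an assumed stand-in for the Script Hash Value, and the 0–100 score uses `difflib` as a substitute for whatever comparison the product performs. The ssdeep fuzzy hash is left out because it requires a third-party library.

```python
import hashlib
import re
from difflib import SequenceMatcher
from urllib.parse import urlsplit


def sha256_body(body: str) -> str:
    """Exact-match hash: any change to the body changes the digest."""
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


# Assumption: a simple regex is enough for this illustration.
# A real crawler would use a proper HTML parser instead.
SCRIPT_SRC = re.compile(r'<script[^>]*\bsrc=["\']([^"\']+)["\']', re.IGNORECASE)


def script_name_fingerprint(html: str) -> str:
    """Hash the sorted script file names, ignoring paths and query parameters.

    Hypothetical stand-in for a script-name fingerprint such as SHV.
    """
    names = sorted(
        urlsplit(src).path.rsplit("/", 1)[-1]
        for src in SCRIPT_SRC.findall(html)
    )
    return hashlib.sha256("\n".join(names).encode("utf-8")).hexdigest()


def body_similarity(previous: str, current: str) -> int:
    """Similarity on a 0-100 scale, where 100 means the bodies are identical."""
    return round(SequenceMatcher(None, previous, current).ratio() * 100)


# Two made-up pages that load the same script with different parameters.
page_a = '<html><script src="/js/jquery.min.js?v=1"></script><p>Login</p></html>'
page_b = '<html><script src="/static/jquery.min.js?v=2"></script><p>Log in</p></html>'

print(sha256_body(page_a) == sha256_body(page_b))
print(script_name_fingerprint(page_a) == script_name_fingerprint(page_b))
print(body_similarity(page_a, page_b))
```

In this sketch the two pages produce different body hashes, because an exact-match hash changes with any edit, yet they share the same script-name fingerprint, because only the file name `jquery.min.js` contributes to it. That contrast is the reason a parameter-insensitive fingerprint is useful for grouping related sites.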