trust-scoring · threat-intel · domain-security

Deterministic vs Probabilistic: How Entropy0 Scores Domain Trust

Most threat intel tools give you a binary flag: safe or unsafe. Entropy0 returns three continuous scores. Here is the technical reasoning behind that design decision — and why it matters for real-world use.

Entropy0 Team · 7 min read

When you query a threat intelligence API about a domain, most tools return something like this:

{ "malicious": false, "score": 0 }

That is a deterministic answer: safe or not safe, scored on a simple scale against a known-bad list. It is useful for known threats. It is useless for everything else.

The problem is that most threats are not known-bad at query time. A phishing domain registered three hours ago will not appear in any blocklist. A typosquat set up for a targeted campaign may never be indexed by any threat feed. And a legitimate domain that has been compromised will score clean right up until someone notices.

Entropy0 takes a different approach. Instead of asking "is this domain on a list?", it asks "what does this domain's structure tell us?" The result is three continuous scores — not a flag.

The Three Signals

Trust Score (0–100)

Trust measures the cumulative weight of positive infrastructure signals. It answers: does this domain behave like a domain operated by a real, stable entity?

Contributing signals include:

- Domain age and registration history
- SSL certificate quality and issuer
- Registrar reputation
- WHOIS identity transparency
- Email infrastructure
- Content reachability

A trust score of 85+ means the domain has strong, consistent signals across most dimensions. A score of 30 means several dimensions are weak or missing — not necessarily malicious, but structurally thin.
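As a rough sketch, trust aggregation can be thought of as a weighted sum over normalized signal strengths, where missing dimensions simply contribute nothing. The signal names and weights below are illustrative assumptions, not Entropy0's actual model:

```python
# Illustrative sketch: trust as a weighted sum of normalized signal
# strengths (0.0-1.0). Weights and signal names are assumptions for
# demonstration, not Entropy0's real coefficients.
TRUST_WEIGHTS = {
    "domain_age": 0.30,
    "ssl_quality": 0.20,
    "registrar_reputation": 0.20,
    "whois_identity": 0.15,
    "email_infrastructure": 0.15,
}

def trust_score(signals: dict) -> int:
    """Aggregate signal strengths into a 0-100 trust score.

    A missing dimension contributes zero, so a structurally thin
    domain scores low even without any negative evidence.
    """
    total = sum(TRUST_WEIGHTS[name] * strength
                for name, strength in signals.items()
                if name in TRUST_WEIGHTS)
    return round(100 * total)

# Strong, consistent signals across most dimensions:
strong = {"domain_age": 0.9, "ssl_quality": 1.0, "registrar_reputation": 0.9,
          "whois_identity": 0.8, "email_infrastructure": 0.9}

# Structurally thin: most dimensions weak or missing.
thin = {"domain_age": 0.1, "ssl_quality": 0.5}
```

Note how the thin domain is penalised by absence alone: nothing about it is malicious, it just lacks evidence.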

Threat Score (0–100)

Threat measures the cumulative weight of negative intent signals. It answers: does this domain show signs of being purpose-built to deceive?

Contributing signals include:

- Very recent registration
- Typosquatting or brand keyword injection
- Free or short-lived TLS certificates
- Redacted WHOIS identity
- Suspicious redirect chains

Critically, threat score components are not additive without context. A newly registered .com is suspicious. A newly registered .com that is also a typosquat of a financial brand, with a Let's Encrypt cert and no WHOIS identity, is a compound signal. The scoring reflects that compounding.
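One way to picture that compounding is multiplicative boosts applied when specific signals co-occur, so the combined score exceeds the additive sum. The weights and boost factors below are illustrative assumptions, not Entropy0's actual scoring:

```python
# Illustrative sketch of signal compounding: co-occurring threat
# signals get a multiplicative boost beyond their additive sum.
# Weights and boost factors are assumptions for demonstration.
BASE_WEIGHTS = {
    "newly_registered": 20,
    "brand_typosquat": 25,
    "free_cert": 10,
    "redacted_whois": 10,
}

# Signal pairs whose co-occurrence is stronger evidence than either alone.
COMPOUND_BOOST = {
    frozenset({"newly_registered", "brand_typosquat"}): 1.4,
    frozenset({"brand_typosquat", "free_cert"}): 1.2,
}

def threat_score(signals: set) -> int:
    base = sum(BASE_WEIGHTS.get(s, 0) for s in signals)
    boost = 1.0
    for pair, factor in COMPOUND_BOOST.items():
        if pair <= signals:  # both signals of the pair are present
            boost *= factor
    return min(100, round(base * boost))
```

A lone fresh registration scores modestly; the full combination blows past the sum of its parts, which is the behaviour the paragraph above describes.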

Deviation Score (0–100)

Deviation is the most unusual of the three. It answers: how different is this domain from the baseline population of scanned domains, along the dimensions that matter most?

It is computed as a power-law weighted anomaly across the same signal dimensions — but instead of asking whether a signal is good or bad, it asks whether the signal combination is unusual.

This matters because some threats are sophisticated enough to look clean on every individual signal while being structurally anomalous in aggregate. A domain with a 5-year history, valid SSL, enterprise registrar, but no content reachability and a suspicious redirect chain has a high deviation score even though its trust score would otherwise be decent.

Deviation is your signal for "something is off here, even if I cannot name it".

Why Continuous Scores Beat Binary Flags

Binary classification gives you high-confidence answers about known threats and tells you nothing about unknown ones. The entire distribution between "definitely clean" and "definitely malicious" is collapsed into "safe."

Continuous scores let you tune your policy to the context:

| Use case | Policy |
|----------|--------|
| Blocking known phishing in email | threatScore >= 60 — high precision, accept false negatives |
| Agent fetch gate in a financial app | threatScore >= 30 OR deviationFinal >= 50 — higher sensitivity |
| Link preview in a consumer product | threatScore >= 50 — balance between UX and safety |
| RAG pipeline source validation | trustScore >= 70 AND deviationFinal < 40 — positive framing |
| Brand protection watchlist monitoring | Alert on any score change — not just threshold breach |

The same underlying scores, different thresholds, different appropriate actions. A binary API cannot support this without you running your own classifier on top of it.
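In code, per-context tuning is just a threshold choice, not a new classifier. A sketch using the thresholds from the table above (field names follow the response shape shown in this post; the use-case labels are illustrative):

```python
# Sketch: one set of scores, different policies per context.
# Thresholds mirror the policy table; use-case names are illustrative.
def should_block(scores: dict, use_case: str) -> bool:
    t = scores["threatScore"]
    d = scores["deviationFinal"]
    tr = scores["trustScore"]
    if use_case == "email_phishing":
        return t >= 60                    # high precision, accept false negatives
    if use_case == "agent_fetch_financial":
        return t >= 30 or d >= 50         # higher sensitivity
    if use_case == "link_preview":
        return t >= 50                    # balance between UX and safety
    if use_case == "rag_source":
        return not (tr >= 70 and d < 40)  # positive framing: require trust
    raise ValueError(f"unknown use case: {use_case}")

# One response, three different outcomes:
scores = {"threatScore": 35, "deviationFinal": 20, "trustScore": 80}
```

With these scores, the email gate lets the domain through, the financial agent gate blocks it, and the RAG gate accepts it on positive evidence: same data, three policies.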

The Compounding Problem

Here is a concrete example of why the combination matters more than any individual signal:

Domain A: stripe-payments.io

Individual signals:

- Registered very recently
- Brand keyword ("stripe") injected into the name
- Free TLS certificate
- Redacted WHOIS identity
- No email infrastructure

None of these individually triggers a blocklist. The combination — brand keyword injection on a fresh domain with a free cert, redacted WHOIS, and no email infrastructure — maps to a threat score of 72 and a deviation score of 88.

The scoring engine weights each signal by its evidential strength in this specific configuration. The compounding is what catches it.

The Epistemic Status Field

One field in the response that most users initially overlook:

{
  "epistemicStatus": "sufficient_signal"
}

This field tells you how much evidence the scoring engine had to work with. Possible values:

| Value | Meaning |
|-------|---------|
| sufficient_signal | High coverage across dimensions — scores are reliable |
| partial_signal | Some dimensions missing — scores are directional |
| minimal_signal | Most dimensions unavailable — treat scores as estimates |
| warming_up | Domain is new enough that baseline comparison is limited |

A domain with trustScore: 90 and epistemicStatus: partial_signal is less reliable than a domain with trustScore: 75 and epistemicStatus: sufficient_signal. Exposing this uncertainty is a deliberate design decision — a system that hides its own confidence limits is a liability in production.
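One practical way to act on this field is to discount a score by a confidence multiplier per status, so a high score on thin evidence never outranks a moderate score on full coverage. The multipliers below are illustrative assumptions:

```python
# Sketch: discount scores by epistemic status so thin evidence never
# outranks full coverage. Multipliers are illustrative assumptions.
STATUS_CONFIDENCE = {
    "sufficient_signal": 1.0,
    "partial_signal": 0.7,
    "minimal_signal": 0.4,
    "warming_up": 0.5,
}

def effective_trust(trust_score: int, epistemic_status: str) -> float:
    """Confidence-weighted trust for ranking and policy decisions."""
    return trust_score * STATUS_CONFIDENCE[epistemic_status]
```

Under this weighting, trustScore 90 with partial_signal ranks below trustScore 75 with sufficient_signal, matching the reliability ordering described above.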

How the Decision Engine Uses All Three

The /decide endpoint synthesises all three scores into a single recommended action:

proceed              → trust ≥ 75, threat < 20, deviation < 30
proceed_with_caution → moderate scores, low-risk context
sandbox              → medium threat or high deviation — fetch but isolate
escalate_to_human    → high deviation, ambiguous signals — needs review
deny                 → threat ≥ 60, or compound signals above policy threshold

The action is always qualified with the policy profile (open, balanced, strict, critical) and the interaction context (kind: fetch/navigate, sensitivity: low/medium/high). The same domain can receive different decisions for different contexts — accessing it in read-only mode vs. submitting a form to it are different risk profiles.
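A sketch of how context can tighten the thresholds in the mapping above. The offsets, and the rule that navigation tightens more than a read-only fetch, are illustrative assumptions, not Entropy0's actual policy:

```python
# Sketch: context-aware decision synthesis. The threshold offsets per
# sensitivity level and interaction kind are illustrative assumptions.
def decide(trust: int, threat: int, deviation: int,
           kind: str = "fetch", sensitivity: str = "low") -> str:
    # Higher sensitivity tightens every threshold; submitting a form
    # ("navigate") is riskier than a read-only fetch, so it tightens more.
    tighten = {"low": 0, "medium": 10, "high": 20}[sensitivity]
    if kind == "navigate":
        tighten += 10
    if threat >= 60 - tighten:
        return "deny"
    if deviation >= 60 - tighten:
        return "escalate_to_human"
    if threat >= 30 - tighten or deviation >= 40 - tighten:
        return "sandbox"
    if trust >= 75 and threat < 20 and deviation < 30:
        return "proceed"
    return "proceed_with_caution"
```

The same domain (trust 80, threat 25, deviation 20) gets proceed_with_caution for a low-sensitivity fetch but sandbox for a high-sensitivity form submission: identical scores, different risk profiles, different actions.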

The Moat

The design moat here is not the scores themselves. Any team can build a domain scorer. The moat is:

  1. Continuous + explainable — not a black-box flag, a reasoned score with evidence arrays
  2. Context-aware decisions — the same signal produces different actions for different interaction types
  3. Uncertainty visibility — epistemic status and coverage metadata are first-class response fields
  4. Population-relative deviation — signals are compared against a corpus of scanned domains, not just against a heuristic threshold

The goal is to be the trust infrastructure layer that sits under AI agents and security systems — not a tool you consult manually, but an API your stack calls before every external interaction.


See the full scoring model in the API docs. Get a free API key to run your first scan.

Try Entropy0 — free API key

Scan any domain and get trust, threat, and deviation scores with full signal explanations. No credit card. Takes 30 seconds.