Cookieless is not a clean slate

Clearing cookies used to feel like a reset.

It is not.

Modern tracking increasingly relies on measurement (fingerprinting) and on side channels (caches and protocols people rarely think about). To make this concrete, I tested my own browser with the EFF’s Cover Your Tracks tool. The result: a fingerprint unique among the 303,650 browsers tested over the preceding 45 days, conveying at least 18.21 bits of identifying information.
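The “bits of identifying information” figure follows directly from the population size: a fingerprint that is unique among N observed browsers conveys at least log2(N) bits. A quick check of the numbers from the test above:

```python
import math

# A fingerprint unique among N observed browsers conveys
# at least log2(N) bits of identifying information.
population = 303_650
bits = math.log2(population)
print(f"{bits:.2f} bits")  # ~18.21, matching the figure reported above
```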

This article is split into two parts:

Part 1: A detailed, practical catalog of the signals that can be used to track (or at least link) a browser and device.
Part 2: Documented real-world use cases where these techniques have been used in advertising, fraud prevention, and bot detection.


Part 1 — The Full Technical Surface of Browser Tracking (Detailed)

0) The mental model: storage vs measurement vs network behavior

Before listing signals, it helps to classify them:

  1. Storage-based persistence
    Cookies and “cookie-like” storage that can persist an identifier.
  2. Measurement-based fingerprinting
    Signals derived from how your device/browser behaves when asked to render, compute, or report capabilities.
  3. Network/protocol fingerprinting
    Signals derived from how your client negotiates and speaks over network protocols (TLS, HTTP/2, QUIC/HTTP/3).

A key point:
Fingerprinting often does not “identify you as a person” by itself. It creates a stable handle that becomes powerful when it touches a login, checkout, email click, or any account-linked event.


1) HTTP Headers (sent on every request)

These are transmitted with every page load. They are “free” for trackers because they arrive without any JavaScript.

1.1 User-Agent (UA)

What it is:
A header that reports your browser family, rendering engine, OS, and often a detailed browser version.

Why it matters:
UA strings can be very specific and can heavily narrow the population. Even when UA reduction is deployed, other signals often compensate.

How it’s used:

Limitations:

1.2 Accept / Accept-Encoding / Accept-Language (often summarized as “HTTP_ACCEPT headers”)

What it is:

Why it matters:
These values are surprisingly stable over time and vary between browsers and platforms.
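EFF-style tools score each header by its surprisal: if a fraction p of observed browsers share your exact value, it conveys -log2(p) bits. A sketch with invented prevalence numbers:

```python
import math

def surprisal_bits(prevalence: float) -> float:
    """Bits of identifying information carried by a value that a
    fraction `prevalence` of the observed population shares."""
    return -math.log2(prevalence)

# Illustrative prevalences, not real measurements:
print(surprisal_bits(0.5))   # a very common Accept header: 1 bit
print(surprisal_bits(0.01))  # a rare Accept-Language combo: ~6.64 bits
```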

How it’s used:

Limitations:

1.3 Do Not Track (DNT) header

What it is:
A header indicating a preference not to be tracked.

Why it matters:
Paradoxically, because it is relatively rare in some populations, it can add entropy.

How it’s used:

Limitations:


2) Time & Locale Signals

2.1 Timezone (IANA name) and timezone offset

What it is:
Your timezone expressed as an offset and/or a named timezone (e.g., Europe/Paris).
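One common consistency check is whether the named timezone and the reported UTC offset agree (and whether both agree with the IP’s geolocation). A sketch using Python’s zoneinfo; the specific zone and dates are just examples:

```python
from datetime import datetime, timedelta
from zoneinfo import ZoneInfo  # Python 3.9+

def offset_matches(tz_name: str, reported_offset_min: int, when: datetime) -> bool:
    """True if the claimed UTC offset (in minutes) is what the named
    zone actually has at `when` (DST included)."""
    actual = when.replace(tzinfo=ZoneInfo(tz_name)).utcoffset()
    return actual == timedelta(minutes=reported_offset_min)

winter = datetime(2024, 1, 15, 12, 0)
print(offset_matches("Europe/Paris", 60, winter))   # True: CET is UTC+1
print(offset_matches("Europe/Paris", 120, winter))  # False: +2 only in summer
```

Note that JavaScript’s `getTimezoneOffset()` reports minutes *behind* UTC, so its sign is flipped relative to the IANA-style offset used here.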

Why it matters:

How it’s used:

Limitations:

2.2 Language / Locale

What it is:
Your preferred language for content.

Why it matters:
Hard to change without breaking UX, and it adds entropy—especially when uncommon for your timezone.

How it’s used:

Limitations:


3) Screen, Window, and Pixel Geometry

3.1 Screen size, window size, and color depth

What it is:
Dimensions of your current browser window (or screen), plus color depth.

Why it matters:
It can be highly discriminating but brittle, because users resize windows.
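A common tracker tactic for coping with that brittleness is to bucket dimensions coarsely, so small resizes keep the same value. A minimal sketch (the 100px step is an arbitrary example):

```python
def bucket(value: int, step: int = 100) -> int:
    """Round a window dimension down to a coarse bucket so the
    signal survives small resizes, a common tracker tactic."""
    return value - (value % step)

print(bucket(1463), bucket(1498))  # 1400 1400 -> same bucket despite a resize
print(bucket(1523))                # 1500 -> a real change still registers
```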

How it’s used:

Limitations:

3.2 Device Pixel Ratio (DPR), zoom level, font scaling

What it is:
How CSS pixels map to physical pixels; user zoom and scaling preferences.

Why it matters:
Subtle differences here can be surprisingly identifying.

How it’s used:

Limitations:


4) Installed Fonts (Font Fingerprinting)

What it is:
A site infers which fonts are installed by rendering text in many candidate fonts and measuring layout changes (width/height differences).
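The measurement trick can be simulated: render a test string as "Candidate, fallback" and compare against the bare fallback’s known width; if the width changed, the browser substituted the candidate. In the browser this is done with JavaScript and hidden DOM elements; the widths below are invented for illustration:

```python
# Measured widths (px) of the same test string per generic fallback.
# All numbers here are invented for illustration.
FALLBACK_WIDTH = {"serif": 512.0, "sans-serif": 498.0}

def font_installed(candidate: str, measured: dict) -> bool:
    """If rendering "<candidate>, serif" differs in width from plain
    "serif", the browser substituted the candidate: it is installed."""
    return any(measured[(candidate, fb)] != w for fb, w in FALLBACK_WIDTH.items())

measurements = {
    ("Helvetica Neue", "serif"): 505.5, ("Helvetica Neue", "sans-serif"): 498.0,
    ("SomeNicheFont", "serif"): 512.0,  ("SomeNicheFont", "sans-serif"): 498.0,
}
print(font_installed("Helvetica Neue", measurements))  # True: width changed
print(font_installed("SomeNicheFont", measurements))   # False: fallback was used
```

Repeating this over a few hundred candidate fonts yields the installed-font bitmap that makes this signal so distinctive.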

Why it matters:
Font sets can be highly distinctive, especially if you install niche fonts (design tools, brand fonts, language packs).

How it’s used:

Limitations:

Practical note:
If you want a low-entropy everyday browsing profile, the best move is often to keep fonts boring and separate “creative” browsing from “privacy-sensitive” browsing.


5) Canvas Fingerprinting (2D Graphics)

What it is:
A site draws shapes/text to an invisible HTML5 canvas, reads back pixel data, serializes it, and hashes the result.
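Whatever is drawn, the tracker only keeps a digest of the pixel buffer. A sketch of the read-back-and-hash step, with synthetic byte buffers standing in for real `getImageData()`/`toDataURL()` output:

```python
import hashlib

def canvas_hash(pixels: bytes) -> str:
    """Hash the serialized pixel data. Two machines rendering the same
    drawing through different GPU/driver/font stacks differ somewhere
    in these bytes, so the digests diverge."""
    return hashlib.sha256(pixels).hexdigest()

# Synthetic RGBA buffers standing in for real canvas read-back.
machine_a = bytes([255, 0, 0, 255] * 64)
machine_b = bytes([255, 0, 0, 255] * 63 + [254, 0, 0, 255])  # one channel off by 1
print(canvas_hash(machine_a) == canvas_hash(machine_b))  # False: 1 byte is enough
```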

Why it matters:
The final pixels depend on a complex intersection of:

How it’s used:

Limitations:


6) WebGL Fingerprinting (3D Graphics) + GPU Exposure

6.1 WebGL fingerprint hash

What it is:
Similar to canvas, but using WebGL rendering, which brings GPU and 3D pipeline characteristics into the signature.

Why it matters:
WebGL often provides even richer entropy than canvas alone.

How it’s used:

Limitations:

6.2 WebGL Vendor & Renderer

What it is:
Strings that can expose the GPU vendor and renderer path.

Why it matters:
Hardware-level uniqueness becomes visible.

How it’s used:

Limitations:


7) WebGPU (Newer GPU Surface)

What it is:
A modern API for high-performance GPU access in browsers.

Why it matters:
More capability often means more measurable traits (unless browsers proactively reduce exposed entropy). It is a new measurement surface as deployment grows.

How it’s used (emerging):

Limitations:


8) Audio Fingerprinting (AudioContext / WebAudio)

What it is:
A script generates an audio signal internally, processes it, and hashes the computed output.

Important clarification:
This is typically about internal audio computation behavior, not recording microphone input.
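The computation can be mimicked end to end with no audio hardware at all: synthesize a waveform, run it through a deterministic processing step, and hash the rounded samples. This sketch mirrors the shape of the WebAudio technique (real sites use an OfflineAudioContext and nodes such as a dynamics compressor); the processing function here is a stand-in:

```python
import hashlib
import math

def audio_fingerprint(n_samples: int = 1024) -> str:
    """Generate a 440 Hz sine, apply a nonlinear "processing" step
    (stand-in for e.g. a compressor node), and hash the result.
    In a real browser, floating-point quirks in the audio pipeline
    make this output differ subtly across hardware and builds."""
    samples = (math.sin(2 * math.pi * 440 * t / 44100) for t in range(n_samples))
    processed = (math.tanh(3.0 * s) for s in samples)  # soft clipping
    blob = ",".join(f"{s:.9f}" for s in processed).encode()
    return hashlib.sha256(blob).hexdigest()

print(audio_fingerprint())  # deterministic here; varies across real pipelines
```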

Why it matters:
Audio pipelines differ subtly due to:

How it’s used:

Why VPN/incognito doesn’t help:
It is neither storage nor network identity. It is local computation behavior.

Limitations:


9) Hardware Signals (Low Alone, Strong in Aggregate)

9.1 Platform / architecture

What it is:
A JS-exposed label indicating platform family (e.g., MacIntel).

How it’s used:

9.2 Hardware concurrency (CPU cores)

What it is:
Reported number of logical CPU cores available.

How it’s used:

9.3 Device memory

What it is:
Reported RAM (often rounded).
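The section title is the whole point: assuming roughly independent signals, their surprisals simply add. With invented prevalences for illustration:

```python
import math

# Illustrative prevalences (fraction of browsers sharing your value):
signals = {
    "platform=MacIntel": 0.10,
    "hardwareConcurrency=10": 0.05,
    "deviceMemory=16": 0.25,
}

total_bits = sum(-math.log2(p) for p in signals.values())
for name, p in signals.items():
    print(f"{name}: {-math.log2(p):.2f} bits")
print(f"total: {total_bits:.2f} bits")  # each weak alone, strong together
```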

How it’s used:

Limitations:


10) Touch Support and Input Capabilities

What it is:
Reported touchpoints and whether certain touch events are supported.

How it’s used:

Limitations:


11) Cookies Enabled (Binary but Still Useful)

What it is:
Whether the browser allows cookies.

Why it matters:
Alone, it’s minimal information. Combined, it helps cluster your configuration and detect hardened modes.


12) “Supercookie” Storage Surfaces (Beyond Cookies)

Even when cookies are cleared, other storage may persist:

Why it matters:
A tracker can store identifiers in places users don’t intuitively clear.


13) Cache Supercookies (Favicon-based tracking)

What it is:
A site encodes an identifier into cache state (e.g., favicon requests), then “reads” the identifier later by checking which requests are missing because resources were cached.
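The mechanism is just a bit vector written into cache state. A simulation of the read/write logic (paths and the 8-bit ID size are arbitrary):

```python
def write_id(user_id: int, n_bits: int, cache: set) -> None:
    """'Write' an identifier by caching the favicon of subpath i
    exactly when bit i of the ID is 1."""
    for i in range(n_bits):
        if (user_id >> i) & 1:
            cache.add(f"/f/{i}/favicon.ico")

def read_id(n_bits: int, cache: set) -> int:
    """'Read' it back later: a cached favicon produces no network
    request, so the missing requests reveal the 1-bits."""
    return sum(1 << i for i in range(n_bits) if f"/f/{i}/favicon.ico" in cache)

browser_cache: set = set()
write_id(0b1011_0010, 8, browser_cache)  # first visit: assign ID 178
print(read_id(8, browser_cache))         # later visit: 178 recovered
```

Real attacks redirect the browser through N subpaths so the server can observe which favicon requests never arrive; the simulation above keeps only that bit-encoding core.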

Why it matters:
It exploits a gap between:

Mitigations:


14) Client Hints (Optimization Signal That Can Over-Disclose)

What it is:
Structured hints browsers can send to help servers optimize content delivery (device class, platform info, etc.).

Why it matters:
If overly detailed and unconstrained, Client Hints expand the fingerprint surface even as the User-Agent string is reduced.
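For a concrete sense of the format, low-entropy UA Client Hints arrive as structured headers such as Sec-CH-UA. A minimal parser sketch (the header value below is a representative example, not captured traffic, and real structured-field parsing is stricter):

```python
import re

def parse_sec_ch_ua(value: str) -> dict:
    """Extract brand -> version pairs from a Sec-CH-UA header value."""
    return dict(re.findall(r'"([^"]*)";v="([^"]*)"', value))

header = '"Chromium";v="120", "Google Chrome";v="120", "Not?A_Brand";v="24"'
print(parse_sec_ch_ua(header))
# {'Chromium': '120', 'Google Chrome': '120', 'Not?A_Brand': '24'}
```

High-entropy hints (full version, model, platform version) are only sent when the server opts in via Accept-CH, which is exactly the constraint mechanism browsers rely on.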

Mitigations:


15) Layout Measurement (ClientRects / Subpixel Geometry)

What it is:
Scripts measure how elements render down to subpixel differences. Layout depends on fonts, OS rendering, zoom, pixel ratio, and compositing behavior.

Why it matters:
Small differences add up when a tracker takes many measurements.

Mitigations:


16) WebRTC and DNS Leak Surfaces (Identity Adjacent)

These are not always “fingerprints,” but they can leak metadata that undermines privacy expectations:

Mitigations:


17) TLS / HTTP/2 / QUIC Fingerprinting (Network Layer)

What it is:
Even below browser APIs, clients have distinctive “handshakes” and protocol behaviors.

Examples:

Why it matters:
Changing IP does not necessarily change how your client negotiates protocols. Protocol behavior can become a stable handle and is heavily used in bot detection and security analytics.
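JA3, the best-known scheme, concatenates fields from the TLS ClientHello and hashes them with MD5. A simplified sketch of the string-building step (the field values below are illustrative, not a real browser’s handshake):

```python
import hashlib

def ja3_digest(version: int, ciphers: list, extensions: list,
               curves: list, point_formats: list) -> str:
    """Build the JA3 string 'version,ciphers,extensions,curves,formats'
    (list items joined with '-') and MD5 it, as the JA3 scheme does."""
    parts = [str(version)] + [
        "-".join(map(str, xs)) for xs in (ciphers, extensions, curves, point_formats)
    ]
    return hashlib.md5(",".join(parts).encode()).hexdigest()

# Illustrative ClientHello parameters (not real traffic):
fp = ja3_digest(771, [4865, 4866, 4867], [0, 23, 65281], [29, 23, 24], [0])
print(fp)  # same client build -> same digest, regardless of IP address
```

Because these fields come from the TLS library compiled into the client, every copy of the same browser build tends to produce the same digest, which is what makes it useful for both bot detection and client clustering.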

Mitigation reality:
You often cannot fully eliminate protocol fingerprints as an end user. The pragmatic approach is:


Part 1 — Sources


Part 2 — Documented Use Cases: Where These Techniques Are Actually Used

Advanced tracking techniques show up in multiple domains. The same mechanism can serve different intentions:

Below are documented cases and “where it happens in practice.”


A) Advertising and Cross-Site Measurement

A1) AddThis and canvas fingerprinting (2014 backlash)

What happened:
EFF documented that AddThis began using canvas fingerprinting in 2014 and faced a strong negative reaction, after which the practice was reportedly stopped.

Why it matters:
This is a canonical example showing canvas fingerprinting was not merely theoretical—major widget/ad-tech providers experimented with it at scale.

A2) Canvas fingerprinting at scale (research + reporting)

What happened:
Investigations reported canvas fingerprinting being used across thousands of sites (including high-profile domains).

Why it matters:
It supports the broader claim: fingerprinting techniques have been deployed widely enough to trigger major public reporting and policy discussion.


B) ISP / Network-Level Identifier Injection

B1) Verizon UIDH “supercookie” header injection (2012–2016 era)

What happened:
Verizon injected a Unique Identifier Header (UIDH) into customer HTTP requests, enabling tracking beyond what users could control locally. The practice led to FCC action/settlement and public documentation.

Why it matters:
This demonstrates tracking can happen below the browser—even perfect cookie hygiene is insufficient if identifiers are injected at the network layer.


C) Fraud Prevention and Risk Scoring (Device Intelligence)

This is one of the most common real-world contexts for device fingerprinting.

C1) ThreatMetrix device fingerprinting (LexisNexis Risk)

What it is used for:
Device intelligence and risk scoring—detecting suspicious activity even when cookies are deleted or private browsing is used.

Why it matters:
It shows fingerprinting is an explicit commercial feature in fraud stacks (often positioned as security, not ads).

C2) TransUnion iovation (device reputation / intelligence)

What it is used for:
Online fraud and cybercrime detection services (device intelligence and reputation).

Why it matters:
It is a large-scale, long-lived category of “device intelligence” products used by many businesses for account security and abuse prevention.


D) Bot Detection and Anti-Abuse Infrastructure

In bot management, fingerprinting is often framed as “distinguishing humans from automation.”

D1) Cloudflare Bot Management — JA3/JA4 fingerprints

What it is used for:
Profiling SSL/TLS clients (JA3/JA4) for bot detection and request scoring.

Why it matters:
It is an explicit, documented example of protocol-level fingerprinting being part of an enterprise bot product.

D2) Akamai Bot Manager — browser fingerprinting and TLS observations

What it is used for:
Bot detection using behavior analysis and browser fingerprinting, with additional technical discussion around TLS fingerprinting and bot evasion.

Why it matters:
It shows two things:


E) TLS Fingerprinting in Security Analytics (JA3)

E1) JA3 / JA3S in practice

What it is used for:
Security teams use JA3/JA3S to fingerprint TLS clients/servers for detection and correlation.

Why it matters:
This is a “beyond the browser” fingerprint layer: even if you block JS and clear storage, your TLS client behavior can still be profiled by network observers and defense systems.


F) Cross-Device / Cross-Screen Device Graph Products

F1) BlueCava (cross-device / cross-screen identity products)

What happened:
BlueCava operated in cross-screen marketing and device identification, and later merged with Qualia (2016), with coverage describing cross-screen conversion/identity use cases.

Why it matters:
It illustrates an industry category built around probabilistic device identity—often positioned for marketing outcomes (cross-device linkage).


Part 2 — Sources (by case cluster)