Argon2id benchmark heatmap

Legend: within ±50% of target; faster than target (darker = more so); slower than target (darker = more so); estimated; OOM on some shown devices (marginal); OOM on every shown device; no data.

About this data

Each cell shows how a device (or filtered set of devices) performs the Argon2id hash at one (memory hardness × time hardness) parameter point. Memory hardness sets how much RAM the hash allocates per attempt; time hardness sets how many passes it makes over that memory. The displayed value is either hashes per second or the time per hash — toggle with the Display option.

Filtering

Each filter narrows the set of devices used in the heatmap. Filters compose by intersection: picking year=2020 and a device that wasn't released in 2020 will yield an empty heatmap. Within each dropdown, every option carries a sample count in parentheses so you can see how much data it represents, and options that would yield zero devices (given your other selections) are greyed out. The "Reset filters" button clears them all.
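As a rough illustration of intersection filtering and the per-option counts, here is a minimal Python sketch. The record fields (model, year, platform), the sample devices, and both function names are hypothetical rather than the app's actual schema or code, and it counts devices instead of samples to keep things short.

```python
from collections import Counter

# Hypothetical device records; field names are illustrative only.
DEVICES = [
    {"model": "Phone A", "year": 2019, "platform": "android"},
    {"model": "Phone B", "year": 2020, "platform": "android"},
    {"model": "Phone C", "year": 2020, "platform": "ios"},
]


def apply_filters(devices, filters):
    """Keep only devices that match every active filter (intersection)."""
    return [
        d for d in devices
        if all(d[field] == value for field, value in filters.items())
    ]


def option_counts(devices, filters, field):
    """Count matching devices for each option of `field`.

    The filter on `field` itself is ignored so that every option in that
    dropdown is evaluated against the *other* selections.
    """
    others = {k: v for k, v in filters.items() if k != field}
    counts = Counter(d[field] for d in apply_filters(devices, others))
    all_options = sorted({d[field] for d in devices})
    # Options with a count of 0 are the ones that would be greyed out.
    return {opt: counts.get(opt, 0) for opt in all_options}
```

With these sample records, selecting year=2020 together with model "Phone A" yields an empty list, and with platform=ios selected the year dropdown would grey out 2019.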

Why some cells are striped (estimated)

Running every parameter combination on every device would take far too long. Once an individual hash on a device exceeds a configured wall-clock cutoff, the app stops actually computing further (more expensive) combinations and instead extrapolates an estimated time from prior measurements for the same device. Those rows are recorded with stopped_reason = "estimated_timeout" and the cell is shown with a diagonal-stripe overlay. If both real and estimated measurements exist for a cell, the real ones are used and the overlay is suppressed.
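The "real beats estimated" rule can be sketched as follows. Only the stopped_reason field and its "estimated_timeout" value come from the description above; the "ms" field name and the resolve_cell function are assumptions made for illustration, and the extrapolation itself is not shown because its method isn't specified here.

```python
ESTIMATED = "estimated_timeout"  # value of stopped_reason described above


def resolve_cell(samples):
    """Pick the samples to display for one cell and decide on the overlay.

    `samples` is a list of dicts with a hypothetical "ms" field and an
    optional "stopped_reason" field. Returns (times_ms, striped).
    """
    real = [s["ms"] for s in samples if s.get("stopped_reason") != ESTIMATED]
    if real:
        # Real measurements exist: use them and suppress the stripe overlay.
        return real, False
    estimated = [s["ms"] for s in samples if s.get("stopped_reason") == ESTIMATED]
    # Only estimates (or nothing at all): stripe the cell if there is data.
    return estimated, bool(estimated)
```

A cell with no samples at all comes back as an empty list with no overlay, which corresponds to a blank cell.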

Why some cells are blank

As memory hardness grows, the Android OS eventually kills the benchmark process before the hash completes (out of memory). To minimise crashes, the app tests lower memory parameters first and works upward; once a memory level OOMs we stop pushing further on that device. So missing cells at higher memory typically mean "the OS killed us before we got there." Missing cells at higher time hardness usually mean we hit the estimated-timeout threshold for that device.
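One way to picture the "start low, work upward, stop at the first OOM" strategy is the loop below. It is a sketch under several assumptions: the loop order, the sweep and run_hash names, and the cutoff handling are all hypothetical, and MemoryError merely stands in for the OS killing the process (a real kill cannot be caught like an exception).

```python
def sweep(memory_levels, time_levels, run_hash, cutoff_ms):
    """Measure (memory, time) points in ascending order.

    `run_hash(m, t)` is a hypothetical callback that returns the hash time
    in ms, or raises MemoryError to simulate an out-of-memory kill.
    Returns a dict mapping (m, t) to the measured time in ms.
    """
    results = {}
    for m in sorted(memory_levels):
        oomed = False
        for t in sorted(time_levels):
            try:
                ms = run_hash(m, t)
            except MemoryError:
                oomed = True
                break
            results[(m, t)] = ms
            if ms > cutoff_ms:
                # Past the wall-clock cutoff: skip the more expensive
                # time levels for this memory level.
                break
        if oomed:
            # Higher memory levels are never attempted on this device.
            break
    return results
```

Every memory level above the first one that OOMs stays absent from the result, which is what shows up as blank cells at the high-memory end.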

Why some cells are hatched in red or orange (OOM)

When a hash attempt is killed by the OS for running out of memory, we record an OOM event for that parameter set. OOMs are bucketed per memory column (collapsing across time hardness, since memory cost is what drives peak allocation). The marker is then chosen by counting the devices in the current filter that have any OOM event in that column: one marker when only some of the shown devices have OOMed (marginal), another when every shown device has.

No marker is drawn for columns where none of the shown devices have OOMed, even if the column is past the highest measured point — missing data alone is no longer promoted to a red "wall".
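The per-column marker choice described in this section can be sketched like this. The tuple layout of the OOM events, the function name, and the "some"/"every" labels are assumptions for illustration; they mirror the legend entries "OOM on some shown devices" and "OOM on every shown device".

```python
def oom_marker(shown_devices, oom_events, memory):
    """Choose the OOM marker for one memory column.

    `shown_devices` is a set of device ids matching the current filters.
    `oom_events` is an iterable of (device_id, memory, time) tuples
    (a hypothetical layout). Time hardness is ignored on purpose, because
    OOMs are bucketed per memory column.
    """
    oomed = {
        dev for dev, m, _t in oom_events
        if m == memory and dev in shown_devices
    }
    if not oomed:
        return None      # no marker, even if the column has no data
    if oomed == set(shown_devices):
        return "every"   # OOM on every shown device
    return "some"        # OOM on some shown devices (marginal)
```

Events from devices outside the current filter are ignored, so narrowing the filters can change a column's marker.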

Aggregation, target, and colors

When multiple devices match the current filters, each cell aggregates across all matching runs: best shows the fastest device, worst the slowest, and median and average are also available. The target value defines the dividing line: cells within ±50% of target render blue, faster ones green (darker = much faster), slower ones red (darker = much slower). Switching between hashes/sec and hash time converts the target value automatically and relabels the aggregation options so that the meaning of your selection is preserved (e.g. "max (best)" becomes "min (best)").
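The aggregation and colour rules can be sketched as below. All names are hypothetical, and the sketch assumes the ±50% band is applied in whichever unit is currently displayed, which the description above doesn't confirm.

```python
from statistics import mean, median


def ms_to_hashes_per_sec(ms_per_hash):
    """Convert a per-hash time in ms to hashes per second."""
    return 1000.0 / ms_per_hash


def aggregate(values, mode, higher_is_better):
    """Aggregate one cell. "best" means max for hashes/sec, min for time."""
    if mode == "best":
        return max(values) if higher_is_better else min(values)
    if mode == "worst":
        return min(values) if higher_is_better else max(values)
    if mode == "median":
        return median(values)
    if mode == "average":
        return mean(values)
    raise ValueError(f"unknown aggregation mode: {mode}")


def classify(value, target, higher_is_better):
    """Return "within", "faster", or "slower" relative to the target."""
    if 0.5 * target <= value <= 1.5 * target:
        return "within"
    if higher_is_better:
        return "faster" if value > target else "slower"
    return "faster" if value < target else "slower"
```

Passing higher_is_better=True corresponds to the hashes/sec display and False to the hash-time display, which is why "max (best)" and "min (best)" select the same device.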

Measurement precision (ms vs ns)

Early Android measurements were recorded only in whole milliseconds — the benchmark client used System.currentTimeMillis()-style timing, so a hash that actually took 2.4 ms was logged as 2. Newer Android builds and all GPU runs record a ns_per_hash field with full nanosecond precision.

The heatmap surfaces this honestly: a cell rendered as an integer (e.g. 5) was sourced from ms-only samples and could be anywhere in [5, 6) ms; a cell with decimals (e.g. 5.123) was sourced from at least one ns-precision sample. When a cell aggregates across a mix of ms- and ns-precision samples, the ns reading wins for display since it's the more accurate measurement of that point. Hashes faster than 1 ms with no ns data available show as <1.
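The display rule above could be implemented roughly as follows. The ns_per_hash name comes from the text; the function name, the whole-ms argument, and the choice of three decimals (taken from the 5.123 example) are assumptions.

```python
def format_cell(ms_whole, ns_per_hash=None):
    """Format one cell's hash time for display, in ms.

    `ms_whole` is a whole-millisecond reading (already truncated);
    `ns_per_hash` is the nanosecond reading, if any sample has one.
    """
    if ns_per_hash is not None:
        # The ns reading wins whenever it is available.
        return f"{ns_per_hash / 1_000_000:.3f}"
    if ms_whole < 1:
        # Sub-millisecond hash with no ns data to refine it.
        return "<1"
    return str(int(ms_whole))
```

For example, format_cell(5) gives "5", while format_cell(5, 5_123_000) gives "5.123".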

The "device capability over time" chart

Each dot is one device's "max m=t at target": the largest matching memory/time point (with m=t along the heatmap diagonal) at which it could still finish one hash in the target time. It is computed by log-log interpolation between the device's two diagonal cells that bracket the target (one with measured time ≤ target, one with measured time ≥ target). Devices whose measurements don't bracket the target (e.g. a phone whose only samples are far below it) are excluded, because extrapolating the slope outside the measured range is unreliable: ms-rounded measurements often look much shallower than the true Argon2 cost curve and would predict wildly large m=t values. The bigger dots are the geometric mean of the surviving per-device values, grouped by release year and platform.
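A minimal sketch of the bracketing and log-log interpolation step follows. The function name and the (x, time) point layout are hypothetical, with x standing for the shared m=t value on the diagonal; statistics.geometric_mean from the standard library then gives the value for a bigger dot.

```python
import math
from statistics import geometric_mean


def max_diag_at_target(points, target):
    """Interpolate the largest m=t value that still meets the target time.

    `points` is a list of (x, time) pairs along the m=t diagonal, where x
    is the shared m=t value. Returns None when no adjacent pair brackets
    the target, so no extrapolation is ever attempted.
    """
    pts = sorted(points)
    for (x0, t0), (x1, t1) in zip(pts, pts[1:]):
        if t0 <= target <= t1:
            if t0 == t1:
                return float(x1)
            # Linear interpolation in log(time) versus log(x).
            frac = (math.log(target) - math.log(t0)) / (math.log(t1) - math.log(t0))
            return math.exp(math.log(x0) + frac * (math.log(x1) - math.log(x0)))
    return None


# Geometric mean of the surviving per-device values, e.g. for one
# (release year, platform) group.
group_value = geometric_mean([8.0, 32.0])
```

With points (16, 100) and (64, 1600) and a target of 400, the interpolated value is approximately 32; with a target of 50 nothing brackets it, so the device would be excluded.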

The trend line is a least-squares fit of (year, log₂(m=t)) — in other words, an exponential fit, which is what we expect for hardware improvement (every device generation roughly multiplies capability rather than adding to it). The legend shows the fit's implied doubling cadence and R² so you can judge how much the line is actually saying. The dotted section is the same fit extended 10 years into the future, with a faint vertical line marking where the data ends.
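The fit, the doubling cadence, and R² can be computed with ordinary least squares on (year, log₂ value) pairs. The sketch below uses hypothetical function names and plain Python; it is not the chart's actual code.

```python
import math


def fit_trend(points):
    """Least-squares fit of log2(value) = intercept + slope * year.

    `points` is a list of (year, value) pairs. Returns
    (slope, intercept, r_squared, doubling_years).
    """
    xs = [year for year, _ in points]
    ys = [math.log2(value) for _, value in points]
    n = len(points)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                      # doublings per year
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    r_squared = 1.0 - ss_res / ss_tot if ss_tot else 1.0
    doubling_years = 1.0 / slope if slope else math.inf
    return slope, intercept, r_squared, doubling_years


def predict(slope, intercept, year):
    """Evaluate the fitted exponential at `year` (used for the dotted part)."""
    return 2.0 ** (intercept + slope * year)
```

Because the fit is linear in log₂ space, the slope is the number of doublings per year and its reciprocal is the doubling cadence reported in the legend.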

Unlike the heatmap, this chart ignores the filter dropdowns — it always uses every device in the dataset. It does follow the target value, since "max m=t at target" only has a meaning relative to a target.

Context & raw data

This dataset is being compiled while we research realistic Argon2id parameter choices for the Who Technologies mTOTP protocol; we're publishing the heatmap because the data is likely useful to the broader community working with Argon2 on mobile. The full raw dataset (one row per measured hash) will be released free for anyone to use once collection is complete.