erikinkinen/AES

Detect and surface overlapping / identical strategy curves in analysis plots #80

Closed

opened 2026-03-06 13:44:26 +01:00 by erikinkinen · 0 comments

erikinkinen commented

2026-03-06 13:44:26 +01:00

Owner

Problem

Several AES result plots contain strategy curves that visually overlap. At the moment, this creates two problems:

it is hard to tell whether the overlap is only a plotting artifact
it is hard to tell whether two strategies are actually producing identical metric values

This makes the graphs harder to interpret and weakens the empirical clarity of the evaluation.

Goal

Amend the analysis pipeline so that overlapping curves are treated as an explicit result rather than a plotting ambiguity.

The system should automatically identify which strategies are:

exactly identical on a plotted metric
numerically equivalent within a defined tolerance
visibly close but distinct

Scope

This issue applies to the main AES comparison plots, especially:

depth sensitivity: semantic total cost
fan-out sensitivity: semantic total cost
post-revoke hot-path cumulative cost
residual risk vs revoke latency proxy
any future multi-strategy line plot or scatter plot where overlap can occur

Requirements

1. Automatic tie-group detection

For each plot dataset, group strategies by their plotted value vectors.

Examples:

for a line plot: compare full (x, y) series by strategy
for a scatter plot: compare plotted point coordinates by strategy
for aggregated plots: compare the final plotted values, not raw run logs

Support two modes:

exact equality
tolerance-based equality

Tolerance should be configurable and default to a small value appropriate for floating-point outputs.
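The grouping step could be sketched as below. This is only an illustration under stated assumptions: the function names, the `Series` alias, and the `abs_tol` parameter are hypothetical and not taken from the AES code base. Tolerance-based equality is not transitive, so the sketch compares each strategy against the first member of an existing group; a real implementation may prefer a different policy.

```python
import math
from typing import Dict, List, Sequence, Tuple

# Hypothetical representation: ordered (x, y) points for one strategy.
Series = Sequence[Tuple[float, float]]


def series_equal(a: Series, b: Series, abs_tol: float = 0.0) -> bool:
    """True if both series share the same ordered x-values and their
    y-values agree within abs_tol (abs_tol=0.0 means exact equality)."""
    if len(a) != len(b):
        return False
    for (xa, ya), (xb, yb) in zip(a, b):
        if xa != xb:
            return False
        if not math.isclose(ya, yb, rel_tol=0.0, abs_tol=abs_tol):
            return False
    return True


def tie_groups(series_by_strategy: Dict[str, Series],
               abs_tol: float = 0.0) -> List[List[str]]:
    """Group strategy names whose plotted series are equal.

    Strategies are visited in sorted order and compared against the first
    member of each existing group, so the result is deterministic.
    """
    groups: List[List[str]] = []
    for name in sorted(series_by_strategy):
        for group in groups:
            representative = series_by_strategy[group[0]]
            if series_equal(representative, series_by_strategy[name], abs_tol):
                group.append(name)
                break
        else:
            # No existing group matched, so this strategy starts a new one.
            groups.append([name])
    return groups
```

Calling `tie_groups(data)` gives the exact mode, and `tie_groups(data, abs_tol=...)` gives the tolerance-based mode with the same code path.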

2. Tie-group summary output

For every relevant figure, emit a machine-readable and human-readable tie-group summary.

Example format:

depth_sensitivity_total_cost
- exact groups:
  - {direct, snapshot_direct}
  - {eager_bfs}
  - {eager_dfs}
  - {lineage_basic}
  - {epoch_indirection}

This summary should be available:

in analysis artifacts
in console/log output
optionally in a sidecar JSON file
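One possible shape for the summary is sketched below: a single dictionary that is serialized to JSON for the sidecar file and formatted as text for console output. The field names (`figure`, `mode`, `abs_tol`, `groups`) and function names are assumptions for illustration, not an agreed schema.

```python
import json
from typing import Any, Dict, List


def tie_group_summary(figure: str, groups: List[List[str]],
                      mode: str = "exact", abs_tol: float = 0.0) -> Dict[str, Any]:
    """Build a machine-readable summary for one figure (hypothetical schema)."""
    return {
        "figure": figure,
        "mode": mode,
        "abs_tol": abs_tol,
        "groups": [sorted(group) for group in groups],
    }


def summary_to_json(summary: Dict[str, Any]) -> str:
    """Serialize the summary, e.g. for a <figure_name>.tie_groups.json sidecar."""
    return json.dumps(summary, indent=2, sort_keys=True)


def format_summary(summary: Dict[str, Any]) -> str:
    """Render the summary in the human-readable layout shown in the example."""
    lines = [summary["figure"], f"- {summary['mode']} groups:"]
    for group in summary["groups"]:
        lines.append("  - {" + ", ".join(group) + "}")
    return "\n".join(lines)
```

Deriving both outputs from the same dictionary keeps the console text and the JSON artifact from drifting apart.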

3. Delta-to-baseline plots

For selected line plots, generate an additional plot relative to a baseline strategy.

Default baseline:

direct

Example derived metric:

delta_semantic_total_cost = strategy_cost - direct_cost

This should make equal strategies appear as a flat zero line and expose small deviations clearly.
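A minimal sketch of the derived metric follows, assuming each strategy's series is a list of `(x, y)` pairs on the same x grid as the baseline. The function name and data layout are hypothetical.

```python
from typing import Dict, List, Tuple

Series = List[Tuple[float, float]]  # hypothetical (x, y) layout


def delta_to_baseline(series_by_strategy: Dict[str, Series],
                      baseline: str = "direct") -> Dict[str, Series]:
    """Return y - baseline_y for every strategy, keeping the x-values.

    Raises ValueError if a strategy does not share the baseline's x-values,
    because a pointwise difference would be meaningless in that case.
    """
    base = series_by_strategy[baseline]
    base_x = [x for x, _ in base]
    deltas: Dict[str, Series] = {}
    for name, series in series_by_strategy.items():
        if [x for x, _ in series] != base_x:
            raise ValueError(f"{name} does not share x-values with {baseline}")
        deltas[name] = [(x, y - base_y)
                        for (x, y), (_, base_y) in zip(series, base)]
    return deltas
```

A strategy identical to the baseline yields a series of zeros, which is what the companion plot would render as the flat zero line.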

4. Plot readability improvements

Amend plotting defaults so near-overlapping curves are easier to distinguish:

assign stable marker shapes per strategy
assign stable line styles per strategy
place markers at each x-value
support optional log-scale y-axis for cost plots
support optional small visual offsets for exactly coincident series

If visual offsets are used, they must be display-only and clearly documented in the caption or metadata.
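Stable per-strategy styling could be derived from a fixed registry rather than from the order in which strategies happen to be plotted. The sketch below assumes matplotlib-style marker and line-style strings and uses the strategy names from the example summary above as the registry; the registry contents and the `style_for` helper are illustrative assumptions.

```python
from typing import Dict, List

# Hypothetical fixed registry: the position here, not the plotting order,
# determines the style, so a strategy looks the same in every figure.
STRATEGY_ORDER: List[str] = [
    "direct",
    "snapshot_direct",
    "eager_bfs",
    "eager_dfs",
    "lineage_basic",
    "epoch_indirection",
]

MARKERS = ["o", "s", "^", "D", "v", "x"]   # matplotlib marker codes
LINESTYLES = ["-", "--", "-.", ":"]        # matplotlib line-style codes


def style_for(strategy: str) -> Dict[str, str]:
    """Return the marker and line style assigned to a registered strategy."""
    index = STRATEGY_ORDER.index(strategy)  # ValueError if not registered
    return {
        "marker": MARKERS[index % len(MARKERS)],
        "linestyle": LINESTYLES[index % len(LINESTYLES)],
    }
```

With matplotlib, the returned dictionary can be passed as keyword arguments, for example `ax.plot(xs, ys, label=name, **style_for(name))`, which also places a marker at each x-value.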

5. Optional pairwise equality matrix

Add an analysis artifact that reports pairwise equality or maximum difference between strategies for each figure.

Example outputs:

boolean equality matrix
max absolute difference matrix
tolerance-based equivalence matrix

This is especially useful for thesis appendix tables.
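The matrices could be computed along the following lines. This is a sketch under assumptions: series are `(x, y)` lists, strategies on different x grids are reported with an infinite difference, and all names are hypothetical.

```python
import math
from typing import Dict, List, Tuple

Series = List[Tuple[float, float]]  # hypothetical (x, y) layout
Matrix = Dict[str, Dict[str, float]]


def max_abs_diff(a: Series, b: Series) -> float:
    """Largest |ya - yb| over shared x-values; inf if the x grids differ."""
    if [x for x, _ in a] != [x for x, _ in b]:
        return math.inf
    return max((abs(ya - yb) for (_, ya), (_, yb) in zip(a, b)), default=0.0)


def pairwise_matrices(series_by_strategy: Dict[str, Series],
                      abs_tol: float = 0.0):
    """Return (max-abs-difference matrix, equivalence matrix).

    With abs_tol=0.0 the equivalence matrix is the boolean equality matrix;
    with abs_tol>0 it is the tolerance-based equivalence matrix.
    """
    names = sorted(series_by_strategy)
    diff: Matrix = {
        a: {b: max_abs_diff(series_by_strategy[a], series_by_strategy[b])
            for b in names}
        for a in names
    }
    equivalent = {a: {b: diff[a][b] <= abs_tol for b in names} for a in names}
    return diff, equivalent
```

Both matrices are symmetric with a zero (or `True`) diagonal, so an appendix table could show only the upper triangle.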

Non-goals

This issue does not require:

inventing new revocation metrics
changing simulator semantics
changing workload generation
changing the underlying experiment results

The purpose is to improve interpretability of existing and future plots.

Implementation notes

Suggested comparison logic:

line plot equality:
- strategies are equal if they share the same ordered x-values and their y-vectors are equal within tolerance
scatter plot equality:
- strategies are equal if their plotted point sets are equal within tolerance
tolerance:
- use configurable absolute tolerance
- optionally support relative tolerance if needed later
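For the scatter case, point-set equality ignores point order, which makes it slightly different from the line-plot comparison. The sketch below uses a greedy one-to-one matching with an absolute tolerance on each coordinate. That is an assumption that only holds when the tolerance is small relative to the spacing between points; with a large tolerance a proper bipartite matching would be needed. The function name is hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def point_sets_equal(a: List[Point], b: List[Point],
                     abs_tol: float = 0.0) -> bool:
    """True if every point in a can be paired with a distinct point in b
    whose x and y coordinates each differ by at most abs_tol."""
    if len(a) != len(b):
        return False
    remaining = list(b)  # points in b that have not been matched yet
    for xa, ya in a:
        for i, (xb, yb) in enumerate(remaining):
            if abs(xa - xb) <= abs_tol and abs(ya - yb) <= abs_tol:
                del remaining[i]  # each point in b may be used only once
                break
        else:
            return False  # no unmatched point in b is close enough
    return True
```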

Suggested outputs per figure:

<figure_name>.png
<figure_name>.tie_groups.json
<figure_name>.delta.png where applicable
optional <figure_name>.pairwise_diff.csv

Deliverables

tie-group detection utility in analysis code
tie-group summaries for main figures
delta-to-baseline plot generation for core line plots
updated plotting styles for overlapping series
documentation describing how overlapping curves are handled

Acceptance criteria

Running the analysis pipeline on current AES experiment outputs produces tie-group summaries for the main figures
At least one main line plot includes a delta-to-baseline companion figure
Overlapping strategies such as direct and snapshot_direct, when identical, are automatically reported as a tie group
Near-overlapping but non-identical curves are visually distinguishable via markers and/or line style
Cost plots support log-scale rendering to improve visibility when one strategy dominates the axis
The analysis output makes it unambiguous whether two strategies overlap because they are equal or because the plot is visually crowded

Rationale

This change turns overlap from a plotting nuisance into an empirical result.

If two strategies remain indistinguishable under a workload and metric, AES should say so explicitly. If they differ slightly, the analysis should reveal that difference clearly. This improves both thesis readability and experimental rigor.


erikinkinen added this to the Phase 1 milestone

2026-03-06 13:44:26 +01:00

erikinkinen added the figure, phase-1, and bug labels

2026-03-06 13:44:26 +01:00

erikinkinen self-assigned this

2026-03-06 13:44:26 +01:00

erikinkinen added this to the AES — Active Workboard project

2026-03-06 13:44:26 +01:00