Update to data
This commit is contained in:
163
paper/index.qmd
163
paper/index.qmd
@@ -22,11 +22,13 @@ abstract: |
|
||||
into a single prototype and validate the combined system on an industrial
|
||||
Raspberry Pi CM5 (Cortex-A76) receiving real QUIC traffic from a dedicated
|
||||
traffic generator. An empirical sweep across 50k--200k asset instances and
|
||||
0--5\% packet loss confirms that ECS tick rate remains stable under network
|
||||
loss, that cross-tier head-of-line blocking isolation holds end-to-end
|
||||
through both the QUIC transport layer and the ECS ingest layer, and that
|
||||
memory scales linearly at less than 0.2~MB per 1{,}000 entities on target edge
|
||||
hardware. Finally, the prototype functions as an active edge controller rather
|
||||
0--5\% packet loss confirms that the ECS tick rate remains an order of
|
||||
magnitude above the cadence required for industrial DT operation under all
|
||||
tested conditions, that cross-tier head-of-line blocking isolation holds
|
||||
end-to-end -- the lossy datagram tier surfaces no measurable loss-induced
|
||||
latency while the reliable bidirectional tier absorbs the expected QUIC
|
||||
retransmit cost -- and that memory scales linearly at less than $0.2$~MB
|
||||
per 1,000 entities on target edge hardware. Finally, the prototype functions as an active edge controller rather
|
||||
than a passive telemetry pipeline, executing end-to-end closed-loop actuation
|
||||
triggered directly from a standard Grafana observability dashboard.
|
||||
|
||||
@@ -52,7 +54,6 @@ from pathlib import Path
|
||||
|
||||
# Paths relative to paper/
|
||||
DATA_TWO_MACHINE = Path("../data/two_machine")
|
||||
DATA_LOCAL = Path("../data/local")
|
||||
FIGURES = Path("figures")
|
||||
FIGURES.mkdir(exist_ok=True)
|
||||
|
||||
@@ -63,25 +64,24 @@ def load_csv(path: Path) -> pd.DataFrame:
|
||||
return pd.DataFrame()
|
||||
|
||||
# CM5 sweep (M4 Max generator → CM5 substrate, 1 Gbps direct Ethernet).
|
||||
# Holds both per-tier latency and per-entity-count throughput / RSS.
|
||||
# The 10k-entity rows are dropped as warmup: their per-connection clock-offset
|
||||
# baseline differs from the larger sweeps by ~18 ms, dominating the loss signal.
|
||||
# Holds T1 P99, T3 RTT P99, per-entity-count throughput / RSS.
|
||||
# The 10k-entity rows are dropped: the across-row clock-offset baseline drift
|
||||
# (~17 ms) dominates the loss signal at the smallest entity count.
|
||||
df_sweep = load_csv(DATA_TWO_MACHINE / "final_table.csv")
|
||||
if len(df_sweep):
|
||||
df_sweep = df_sweep.query("entities >= 50000").reset_index(drop=True)
|
||||
df_latency = df_sweep
|
||||
df_throughput = df_sweep
|
||||
|
||||
# Cross-tier isolation sweep (local; T1 rate swept, T3 held at 100 Hz).
|
||||
df_isolation = load_csv(DATA_LOCAL / "cross_tier.csv")
|
||||
# Per-cell value lookups for the result tables.
|
||||
def _t1(e, l): return float(df_latency.query(f"entities=={e} and loss_pct=={l}")["t1_p99_us"].iloc[0]) / 1000.0
|
||||
def _t3(e, l): return float(df_latency.query(f"entities=={e} and loss_pct=={l}")["t3_rtt_us"].iloc[0]) / 1000.0
|
||||
def _hz(e, l): return int(round(float(df_throughput.query(f"entities=={e} and loss_pct=={l}")["hz"].iloc[0])))
|
||||
def _rss(e): return float(df_throughput.query(f"entities=={e}")["rss_mb"].mean())
|
||||
|
||||
# Key scalars used inline in the prose.
|
||||
hz_at_100k_0pct = float(
|
||||
df_throughput.query("entities == 100000 and loss_pct == 0")["hz"].iloc[0]
|
||||
)
|
||||
hz_at_100k_5pct = float(
|
||||
df_throughput.query("entities == 100000 and loss_pct == 5")["hz"].iloc[0]
|
||||
)
|
||||
hz_at_100k_0pct = _hz(100000, 0)
|
||||
hz_at_100k_5pct = _hz(100000, 5)
|
||||
rss_at_100k = float(
|
||||
df_throughput.query("entities == 100000 and loss_pct == 0")["rss_mb"].iloc[0]
|
||||
)
|
||||
@@ -134,7 +134,7 @@ for DT sensor transport [@plantevin2026quic]. The present paper asks: do they
|
||||
compose? Does integrating real QUIC traffic into the ECS ingest path introduce
|
||||
coupling that degrades either substrate's claimed properties?
|
||||
|
||||
This paper makes three primary contributions. First, we provide a formal argument that ECS and QUIC are *complementary* substrates whose system boundary maps cleanly onto the DT runtime architecture (@sec-architecture). Second, we present an integrated prototype connecting a QUIC server (Quinn/Rust) to a Bevy ECS world via a three-tier channel bridge. This prototype functions not just as a telemetry pipeline, but as an active edge controller with continuous export to, and closed-loop actuation triggered from, a Grafana/Victoria Metrics observability stack (@sec-implementation). Finally, we conduct an empirical sweep on an industrial Raspberry Pi CM5 (Cortex-A76) confirming that the ECS tick rate remains stable under 0--5\% network loss. The sweep demonstrates that cross-tier QUIC isolation holds end-to-end through the ECS ingest layer and that the integration overhead remains negligible relative to independent substrate costs (@sec-evaluation).
|
||||
This paper makes three primary contributions. First, we provide a formal argument that ECS and QUIC are *complementary* substrates whose system boundary maps cleanly onto the DT runtime architecture (@sec-architecture). Second, we present an integrated prototype connecting a QUIC server (Quinn/Rust) to a Bevy ECS world via a three-tier channel bridge. This prototype functions not just as a telemetry pipeline, but as an active edge controller with continuous export to, and closed-loop actuation triggered from, a Grafana/Victoria Metrics observability stack (@sec-implementation). Finally, we conduct an empirical sweep on an industrial Raspberry Pi CM5 (Cortex-A76) confirming that the ECS tick rate stays an order of magnitude above the cadence required for industrial DT operation across 0--5\% packet loss, and that cross-tier head-of-line blocking isolation holds end-to-end --- the lossy datagram tier surfaces no measurable loss-induced latency while the reliable bidirectional tier absorbs the expected QUIC retransmit cost (@sec-evaluation).
|
||||
|
||||
# Background {#sec-background}
|
||||
|
||||
@@ -292,82 +292,29 @@ accuracy; throughput measurements used the two-machine setup.
|
||||
|
||||
## Results
|
||||
|
||||
```{python}
|
||||
#| label: tbl-latency
|
||||
#| tbl-cap: "T1 datagram P99 latency (ms) on the CM5 across entity counts
|
||||
#| and packet loss rates. Cross-host one-way timestamps include a
|
||||
#| clock-offset component between the M4 Max generator and the
|
||||
#| CM5 substrate; the additional latency induced by 1\\% and 5\\%
|
||||
#| loss is within $\\pm 2$~ms of the 0\\%-loss baseline at all
|
||||
#| entity counts, confirming that QUIC datagram delivery is not
|
||||
#| measurably delayed by loss at the operational scale tested."
|
||||
| Entities | 0% loss | 1% loss | 5% loss |
|
||||
|---:|---:|---:|---:|
|
||||
| 50k | `{python} f"{_t1(50000,0):.1f}"` | `{python} f"{_t1(50000,1):.1f}"` | `{python} f"{_t1(50000,5):.1f}"` |
|
||||
| 100k | `{python} f"{_t1(100000,0):.1f}"` | `{python} f"{_t1(100000,1):.1f}"` | `{python} f"{_t1(100000,5):.1f}"` |
|
||||
| 200k | `{python} f"{_t1(200000,0):.1f}"` | `{python} f"{_t1(200000,1):.1f}"` | `{python} f"{_t1(200000,5):.1f}"` |
|
||||
|
||||
from IPython.display import Markdown, display
|
||||
: T1 datagram P99 latency (ms) on the CM5 across entity counts and packet loss rates. Cross-host one-way timestamps include a clock-offset component between the M4 Max generator and the CM5 substrate; the across-row baseline drop from $\sim 47$~ms at 50k entities to $\sim 28$~ms at 200k entities reflects NTP convergence over the bench duration and is not an entity-count effect. The load-bearing signal is within-row: the additional latency induced by 1\% and 5\% loss is within $\pm 3$~ms of the 0\%-loss baseline at every entity count, confirming that the lossy T1 tier absorbs datagram drops without surfacing retransmit latency. {#tbl-latency}
|
||||
|
||||
wide = df_latency.pivot_table(
|
||||
index="entities", columns="loss_pct",
|
||||
values="t1_p99_us", aggfunc="mean"
|
||||
).sort_index()
|
||||
wide.columns = [f"{int(c)}% loss" for c in wide.columns]
|
||||
wide = (wide / 1000.0).round(1) # µs → ms
|
||||
wide.insert(0, "Entities",
|
||||
[f"{int(n/1000)}k" for n in wide.index])
|
||||
tbl_lat = wide.reset_index(drop=True)
|
||||
display(Markdown(tbl_lat.to_markdown(index=False)))
|
||||
```
|
||||
| Entities | Hz (0% loss) | Hz (1% loss) | Hz (5% loss) | RSS (MB) |
|
||||
|---:|---:|---:|---:|---:|
|
||||
| 50k | `{python} f"{_hz(50000,0):,}"` | `{python} f"{_hz(50000,1):,}"` | `{python} f"{_hz(50000,5):,}"` | `{python} f"{_rss(50000):.1f}"` |
|
||||
| 100k | `{python} f"{_hz(100000,0):,}"` | `{python} f"{_hz(100000,1):,}"` | `{python} f"{_hz(100000,5):,}"` | `{python} f"{_rss(100000):.1f}"` |
|
||||
| 200k | `{python} f"{_hz(200000,0):,}"` | `{python} f"{_hz(200000,1):,}"` | `{python} f"{_hz(200000,5):,}"` | `{python} f"{_rss(200000):.1f}"` |
|
||||
|
||||
```{python}
|
||||
#| label: tbl-throughput
|
||||
#| tbl-cap: "ECS DT runtime throughput under real QUIC traffic on the CM5
|
||||
#| (two-machine, performance governor, 5,000 ticks).
|
||||
#| Tick rate remains within 3% of the synthetic-ingest baseline
|
||||
#| at all entity counts and loss rates."
|
||||
: ECS DT runtime throughput and RSS under real QUIC traffic on the CM5 (two-machine, performance governor, 50~s measurement window per cell). Tick rate degrades 19--32\% from 0\% to 5\% loss but remains an order of magnitude above the cadence required for industrial DT operation across the full sweep. RSS grows linearly with entity count (slope $\sim 0.12$~MB per 1,000 entities). {#tbl-throughput}
|
||||
|
||||
from IPython.display import Markdown, display
|
||||
| Entities | 0% loss | 1% loss | 5% loss |
|
||||
|---:|---:|---:|---:|
|
||||
| 50k | `{python} f"{_t3(50000,0):.1f}"` | `{python} f"{_t3(50000,1):.1f}"` | `{python} f"{_t3(50000,5):.1f}"` |
|
||||
| 100k | `{python} f"{_t3(100000,0):.1f}"` | `{python} f"{_t3(100000,1):.1f}"` | `{python} f"{_t3(100000,5):.1f}"` |
|
||||
| 200k | `{python} f"{_t3(200000,0):.1f}"` | `{python} f"{_t3(200000,1):.1f}"` | `{python} f"{_t3(200000,5):.1f}"` |
|
||||
|
||||
tbl = df_throughput.pivot_table(
|
||||
index="entities", columns="loss_pct",
|
||||
values="hz", aggfunc="mean"
|
||||
).sort_index()
|
||||
tbl.columns = [f"Hz ({int(c)}% loss)" for c in tbl.columns]
|
||||
tbl = tbl.round(0).astype(int)
|
||||
|
||||
rss_by_n = df_throughput.groupby("entities")["rss_mb"].mean().round(1)
|
||||
tbl.insert(len(tbl.columns), "RSS (MB)", rss_by_n)
|
||||
tbl.insert(0, "Entities", [f"{int(n/1000)}k" for n in tbl.index])
|
||||
display(Markdown(tbl.reset_index(drop=True).to_markdown(index=False)))
|
||||
```
|
||||
|
||||
```{python}
|
||||
#| label: fig-isolation
|
||||
#| fig-cap: "Cross-tier isolation: T3 bidirectional-stream P99 latency
|
||||
#| (reliable tier, held at a constant 100 Hz baseline) as the
|
||||
#| concurrent T1 datagram rate sweeps three orders of magnitude
|
||||
#| on the same QUIC connection. T3 latency remains flat at
|
||||
#| ~150–220 µs regardless of T1 load, confirming that QUIC
|
||||
#| head-of-line blocking isolation composes with the ECS ingest
|
||||
#| layer end-to-end."
|
||||
#| fig-width: 6
|
||||
#| fig-height: 3.2
|
||||
|
||||
iso = df_isolation.sort_values("rate_hz")
|
||||
rate = iso["rate_hz"].tolist()
|
||||
t1_p99 = iso["t1_p99_us"].tolist()
|
||||
t3_p99 = iso["t3_p99_us"].tolist()
|
||||
|
||||
fig, ax = plt.subplots(figsize=(6, 3.2))
|
||||
ax.plot(rate, t1_p99, "o-", label="T1 datagram P99", linewidth=1.5)
|
||||
ax.plot(rate, t3_p99, "^:", label="T3 RTT P99 (100 Hz)", linewidth=1.5)
|
||||
ax.set_xscale("log")
|
||||
ax.set_xlabel("Concurrent T1 datagram rate (Hz, log scale)")
|
||||
ax.set_ylabel("P99 latency (µs)")
|
||||
ax.set_ylim(0, max(max(t1_p99), max(t3_p99)) * 1.4)
|
||||
ax.legend(fontsize=9, loc="upper left")
|
||||
ax.spines[["top","right"]].set_visible(False)
|
||||
plt.tight_layout()
|
||||
#plt.savefig(FIGURES / "isolation.pdf", bbox_inches="tight")
|
||||
#plt.savefig(FIGURES / "isolation.png", dpi=150, bbox_inches="tight")
|
||||
```
|
||||
: Substrate-initiated T3 bidirectional-stream RTT P99 (ms) under the same sweep. Unlike the lossy T1 tier (@tbl-latency), the reliable T3 tier surfaces packet loss as additional RTT exactly as the QUIC contract dictates: a uniform $\sim 38$~ms of retransmit recovery at 5\% loss, independent of entity count. Together with @tbl-latency this confirms that each tier delivers its contracted reliability/latency tradeoff under loss, end-to-end through the ECS ingest layer. {#tbl-t3-rtt}
|
||||
|
||||
**ECS tick rate under real network load.** At 100k entities the integrated
|
||||
prototype sustains `{python} f"{hz_at_100k_0pct:,.0f}"`~Hz within
|
||||
@@ -381,35 +328,39 @@ the bounded ingest channel without stalling the ECS schedule.
|
||||
|
||||
**Cross-tier isolation.** @tbl-latency shows that T1 datagram delivery is
|
||||
not measurably delayed by packet loss at any tested entity count: the
|
||||
per-row difference between 0\% and 5\% loss falls within $\pm 2$~ms of the
|
||||
cross-host clock-offset baseline, indistinguishable from clock-drift noise.
|
||||
@fig-isolation independently confirms cross-tier isolation in the loopback
|
||||
regime where clock offset is absent: T3 P99 latency held at a 100~Hz
|
||||
baseline remains within a 150--220~µs band as the concurrent T1 datagram
|
||||
rate sweeps three orders of magnitude on the same QUIC connection.
|
||||
Together these results confirm that QUIC head-of-line blocking isolation
|
||||
and ECS system scheduling isolation compose without measurable interference
|
||||
through the integrated substrate.
|
||||
per-row difference between 0\% and 5\% loss falls within $\pm 3$~ms of
|
||||
the cross-host clock-offset baseline, indistinguishable from clock-drift
|
||||
noise. @tbl-t3-rtt shows the complementary picture for the reliable tier:
|
||||
substrate-initiated T3 round-trips climb from a $\sim 9$~ms baseline at
|
||||
0\% loss to $\sim 47$~ms at 5\% loss --- a uniform $\sim 38$~ms retransmit
|
||||
cost across all tested entity counts, in line with QUIC's reliable-stream
|
||||
recovery on a 1~Gbps link. The two tables together confirm that each tier
|
||||
delivers its contracted behaviour end-to-end through the integrated
|
||||
substrate: T1 absorbs loss silently as drops, T3 absorbs loss as RTT, and
|
||||
neither bleeds into the other.
|
||||
|
||||
**Memory scaling.** A linear regression of mean RSS against entity count yields
|
||||
a slope of `{python} f"{mb_per_1k:.2f}"`~MB per 1,000 entities
|
||||
(R^2^ = `{python} f"{r2_memory:.2f}"`), confirming that no per-entity heap
|
||||
allocation is accumulated tick-over-tick. The slope is well below the
|
||||
1.02~MB-per-1{,}000 figure reported for the standalone ECS benchmark on a
|
||||
1.02~MB-per-1,000 figure reported for the standalone ECS benchmark on a
|
||||
Pi~5 [@plantevin2026ecs] — consistent with the QUIC bridge and Victoria
|
||||
Metrics export adding no steady-state heap pressure of their own.
|
||||
|
||||
## Discussion
|
||||
|
||||
Three operational conclusions follow. First, ECS and QUIC are genuinely
|
||||
Two operational conclusions follow. First, ECS and QUIC are genuinely
|
||||
complementary: their system boundary (the three-tier channel bridge) is
|
||||
clean and the two runtimes' scheduling and isolation guarantees compose
|
||||
without interference. Second, the integration cost is negligible —
|
||||
`IngestSystem` drain time adds less than 5% to the total tick budget at
|
||||
100k entities, meaning the channel bridge is not a bottleneck at any tested
|
||||
scale. Third, the Grafana/Victoria Metrics export path adds no measurable
|
||||
runtime overhead, validating the "standard observability stack" claim without
|
||||
custom instrumentation.
|
||||
without measurable cross-tier interference, as @tbl-latency and
|
||||
@tbl-t3-rtt jointly demonstrate. Second, the per-tier reliability/latency
|
||||
tradeoffs that QUIC promises in isolation survive the integration: T1
|
||||
datagram delivery is unaffected by network loss at the entity counts and
|
||||
loss rates tested, while T3 absorbs the loss-induced retransmit cost
|
||||
predictably and bounded. The throughput cost of network loss (@tbl-throughput)
|
||||
manifests as ECS tick-rate degradation rather than as latency on either
|
||||
tier --- the substrate stays well above the cadence industrial DT
|
||||
operation requires across the full sweep.
|
||||
|
||||
# Related Work {#sec-related}
|
||||
|
||||
|
||||
Reference in New Issue
Block a user