feat(FS-1988): add min_attempts and absolute_max_elapsed_time_ms to BackoffStrategy#342
Open
aballman wants to merge 1 commit into
Open
feat(FS-1988): add min_attempts and absolute_max_elapsed_time_ms to BackoffStrategy#342aballman wants to merge 1 commit into
aballman wants to merge 1 commit into
Conversation
…ackoffStrategy
Closes a short-circuit in retry_with_backoff{,_async} where a single slow
first attempt can exhaust max_elapsed_time before any retry fires. With
the partitioner config (max_elapsed_time=5min, CLIENT_TIMEOUT_MS=30min),
any chunk whose first attempt exceeds 5 minutes blew the retry budget
on attempt 1 -- zero retries on subsequent transient errors. Documented
in two customer-visible regressions in the platform partition path.
New fields on BackoffStrategy:
* min_attempts (default 0) -- minimum number of retry attempts that must
fire before max_elapsed_time is honored. Counts retries (not the
initial attempt). Default 0 preserves existing behavior.
* absolute_max_elapsed_time_ms (default None) -- cap on when a new retry
can START. Does NOT interrupt an in-flight func() call. Worst-case
wall-clock under this cap is absolute_max_elapsed_time_ms +
per_attempt_timeout. Default None preserves existing behavior.
Loop changes in both retry_with_backoff and retry_with_backoff_async:
* Post-attempt cap check honors min_attempts as a floor on the soft cap
and treats the hard cap as unconditional.
* Pre-sleep hard-cap check refuses to sleep into a doomed retry whose
projected start would exceed the hard cap.
* Post-sleep verification (belt-and-suspenders against late wakeups and
rounding drift in the projection).
* Helper extraction (_cap_hit_after_attempt, _raise_or_return_after_cap)
dedupes logic between sync and async paths.
Validation rejects min_attempts < 0, absolute_max_elapsed_time_ms <= 0,
and absolute_max_elapsed_time_ms below max_elapsed_time.
.genignore: ignore src/unstructured_client/utils/retries.py to preserve
these fields across Speakeasy regens. Documented merge procedure for
future template updates. Pushing these fields upstream to Speakeasy
templates is filed as a follow-up.
Tests:
* T1-T14 in unit/test_retries.py with a fake-clock harness that
monkeypatches time.time / time.sleep / asyncio.sleep / random.uniform.
Covers the v1 reproducer (slow first attempt + min_attempts floor),
floor-is-not-a-ceiling semantics, hard cap overrides floor, sleep
truncation, TemporaryError early-return through both caps, and
PermanentError short-circuit immunity.
* Validation tests for the new __init__ guards.
* New integration test
test_split_pdf_cache_tmp_data_chunk_request_stream_is_replay_safe
pins the body-replay invariant: chunks built from open file objects
(cache_tmp_data=True) must produce replay-safe httpx requests so SDK
retries actually deliver the original multipart payload. Iterates
request.stream twice directly to bypass request.read() caching.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Closes the SDK retry-budget short-circuit documented in FS-1988.
retry_with_backoff_asynccurrently checksnow - start > max_elapsed_timebefore sleeping. WithRetryConfig.max_elapsed_timeset to 5 min (per platform-plugins#1170) and per-chunk httpx client timeouts of 30 min (unstructured-api) / 60 min (vlm), any chunk whose first attempt takes longer than the retry budget blows the budget on attempt 1 — zero retries fire on subsequent transient errors. Two customer-visible regressions in the platform partition path traced back to this.What changes
New optional fields on
BackoffStrategy(default values preserve existing behavior):min_attempts: int = 0— minimum retry attempts that must fire beforemax_elapsed_timeis honored. Counts retries, not the initial attempt.min_attempts=2permits 1 initial + at least 2 retries (3 total attempts) before the soft cap can cut the loop.absolute_max_elapsed_time_ms: int | None = None— cap on when a new retry can START. Does NOT interrupt an in-flightfunc()call. Worst-case wall-clock under this cap isabsolute_max_elapsed_time_ms + per_attempt_timeout.Loop changes in both sync (
retry_with_backoff) and async (retry_with_backoff_async) paths:retries >= min_attempts; hard cap unconditional._cap_hit_after_attempt,_raise_or_return_after_cap) dedupes the soft/hard cap logic between sync and async.Validation in
BackoffStrategy.__init__rejectsmin_attempts < 0,absolute_max_elapsed_time_ms <= 0, and hard cap below soft cap..genignoreupdated to preserve these fields across future Speakeasy regens, with a documented merge procedure matching thegeneral.pyandusers.pyprecedent. Pushing the change upstream to Speakeasy templates is a separate follow-up.Design choices
min_retries) to keep the publicBackoffStrategyinterface stable. The docstring is the contract — see the comment block inretries.pyfor unambiguous semantics.func()cannot be interrupted from the retry loop. Documented in the field docstring. For the partitioner consumer, pair with a sensible per-attempt timeout (CLIENT_TIMEOUT_MS) to keep the worst case bounded.min_attempts > 0and/orabsolute_max_elapsed_time_ms.Tests
49 tests pass (46 unit + 3 split-PDF retry integration). New coverage:
_test_unstructured_client/unit/test_retries.py: fake-clock harness monkeypatchingtime.time/time.sleep/asyncio.sleep/random.uniform. Covers the FS-1988 reproducer (slow first attempt +min_attemptsfloor), floor-is-not-a-ceiling semantics, hard cap overrides floor, sleep truncation,TemporaryErrorearly-return through both caps,PermanentErrorshort-circuit immunity, andBackoffStrategy.__init__validation.test_split_pdf_cache_tmp_data_chunk_request_stream_is_replay_safeinintegration/test_decorators.py: pins the body-replay invariant for chunk requests built from open file objects (thesplit_pdf_cache_tmp_data=Truepath). Iteratesrequest.streamtwice directly — bypassesrequest.read()caching — so a future Speakeasy template change that produced a single-consumption stream would fail this test.Adversarial review
This PR was iterated through four rounds of
codex review(v1 → v2 → v3 → v3.1 → final2) before opening. The most consequential findings:min_attemptsalone, recreating the partitioner's ~92-min slot-pinning failure. Resolution: add the hard cap.request.read()-based regression test would pass even with a non-replayable stream. Resolution: rewrite to iteraterequest.streamtwice directly..genignoreand tests were planned but not actually committed. Resolution: commit them in the same PR.Final review: "No blocking issues found."
Test plan
uv run pytest _test_unstructured_client/unit/test_retries.py— 46 passuv run pytest _test_unstructured_client/integration/test_decorators.py::test_split_pdf_*retry* _test_unstructured_client/integration/test_decorators.py::test_split_pdf_cache_tmp_data_*— 3 passmin_attempts=2+absolute_max_elapsed_time_ms=15_min; SND validation before staging/prod cutoverFollow-ups (separate tickets)
min_attempts+absolute_max_elapsed_time_msupstream to Speakeasy templates so.genignorecan eventually be removed.PARTITIONER_RETRY_MIN_ATTEMPTSenv-var feature flag.