2497 Commits

Author SHA1 Message Date
Ayaan Zaidi
6268ed57ea fix(agents): stop param shadowing in auth failure marker 2026-02-22 21:00:17 +05:30
Peter Steinberger
d0b59270a7 refactor: dedupe auth-profile failure marking and rotation test setup 2026-02-22 15:44:10 +01:00
Glucksberg
53adae9cec fix(telegram): add dnsResultOrder=ipv4first default on Node 22+ to fix fetch failures (#5405)
Merged via /review-pr -> /prepare-pr -> /merge-pr.

Prepared head SHA: 71366e9532b6c67f0413b65a9ac8623eae000e9b
Co-authored-by: Glucksberg <80581902+Glucksberg@users.noreply.github.com>
Co-authored-by: obviyus <22031114+obviyus@users.noreply.github.com>
Reviewed-by: @obviyus
2026-02-22 20:07:51 +05:30
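
Note on 53adae9cec: a minimal sketch of how an ipv4first default could be applied on Node 22+. The helper names `nodeMajor` and `resolveDnsResultOrder` are hypothetical and are not taken from the commit. Only `dns.setDefaultResultOrder` is a real Node API. How the project actually wires this into its Telegram fetch path is not shown in the log.

```typescript
import * as dns from "node:dns";

type DnsResultOrder = "ipv4first" | "verbatim";

// Hypothetical helper: extract the major version from a string like "22.3.0".
function nodeMajor(version: string): number {
  return Number.parseInt(version.split(".")[0] ?? "", 10);
}

// Hypothetical helper: an explicit setting wins; otherwise default to
// "ipv4first" on Node 22 and later, and leave older runtimes untouched.
function resolveDnsResultOrder(
  configured: DnsResultOrder | undefined,
  nodeVersion: string,
): DnsResultOrder | undefined {
  if (configured !== undefined) return configured;
  return nodeMajor(nodeVersion) >= 22 ? "ipv4first" : undefined;
}

const order = resolveDnsResultOrder(undefined, process.versions.node);
if (order !== undefined) {
  // Real Node API: sets the process-wide default order for dns.lookup() results.
  dns.setDefaultResultOrder(order);
}
```

Keeping the decision in a pure function makes the version gate testable without touching process-wide DNS state.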
Peter Steinberger
3e2849c578 fix: align timeout cooldown behavior docs/tests (#22622) (thanks @vageeshkumar) 2026-02-22 15:34:20 +01:00
Vageesh Kumar
71d0b86352 fix(agents): skip auth profile cooldown for timeout failures
A timeout is model/network-specific, not an auth issue. Marking the
auth profile as failed on timeout poisons fallback models on the same
provider (e.g. gpt-5.3 timeout would block gpt-5.2 via shared profile
cooldown). The prompt-phase path already guards against this; this
aligns the post-response timeout path to match.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-22 15:34:20 +01:00
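
Note on 71d0b86352: the rule described above can be sketched as a small guard. The `FailureReason` type, the function names, and the in-memory map are assumptions made for illustration; the project's real types and storage are not visible in this log.

```typescript
// Hypothetical failure categories; the project's real names may differ.
type FailureReason = "auth" | "rate_limit" | "billing" | "timeout";

// A timeout reflects the model or network, not the credentials, so it
// should not put the shared auth profile into cooldown.
function shouldMarkAuthProfileFailure(reason: FailureReason): boolean {
  return reason !== "timeout";
}

// Minimal in-memory stand-in for per-profile cooldown state.
const cooldownUntil = new Map<string, number>();

function recordFailure(
  profileId: string,
  reason: FailureReason,
  now: number,
  cooldownMs: number,
): void {
  if (!shouldMarkAuthProfileFailure(reason)) return; // timeouts leave the profile usable
  cooldownUntil.set(profileId, now + cooldownMs);
}
```

With this guard, a timeout on one model leaves the profile available to fallback models on the same provider, which is the behavior the commit message asks for.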
Peter Steinberger
4c355a28a3 refactor: centralize tool-error visibility policy 2026-02-22 15:30:53 +01:00
Peter Steinberger
835be4392e fix: gate tool error details behind verbose 2026-02-22 15:26:47 +01:00
Peter Steinberger
d116bcfb14 refactor(runtime): consolidate followup, gateway, and provider dedupe paths 2026-02-22 14:08:51 +00:00
Peter Steinberger
adfbbcf1f6 chore: merge origin/main into main 2026-02-22 13:42:52 +00:00
Peter Steinberger
1becebe188 fix: harden session lock contention and cleanup 2026-02-22 13:40:55 +00:00
Peter Steinberger
2c40a20737 test: trim background hold duration in abort coverage 2026-02-22 12:38:57 +00:00
Peter Steinberger
5b23159c4c test: create homedir before sandbox image mkdtemp 2026-02-22 12:35:38 +00:00
Peter Steinberger
96515a5729 test: merge duplicate read-tool content coverage cases 2026-02-22 12:32:05 +00:00
Peter Steinberger
c8a4977378 test: replace mtime sleep with explicit utimes bump 2026-02-22 12:29:53 +00:00
Peter Steinberger
dc356ae1c2 test: remove duplicate workspace path-resolution case 2026-02-22 12:27:55 +00:00
Peter Steinberger
c7a4346e4d test: remove sharp dependency from read-tool metadata test 2026-02-22 12:27:10 +00:00
Peter Steinberger
60a0291bf8 test: dedupe workspace path-resolution scenarios 2026-02-22 12:25:57 +00:00
Peter Steinberger
07527e22ce refactor(auth-profiles): centralize active-window logic + strengthen regression coverage 2026-02-22 13:23:19 +01:00
Peter Steinberger
1152b25866 fix(gateway): guard trim crashes in subagent flow 2026-02-22 13:21:26 +01:00
Peter Steinberger
0d0f4c6992 refactor(exec): centralize safe-bin policy checks 2026-02-22 13:18:25 +01:00
Artale
51e9c54f09 fix(agents): skip bootstrap files with undefined path (#22698)
* fix(agents): skip bootstrap files with undefined path

buildBootstrapContextFiles() called file.path.replace() without checking
that path was defined. If a hook pushed a bootstrap file using 'filePath'
instead of 'path', the function threw TypeError and crashed every agent
session — not just the misconfigured hook.

Fix: add a null-guard before the path.replace() call. Files with undefined
path are skipped with a warning so one bad hook can't take down all agents.

Also adds a test covering the undefined-path case.

Fixes #22693

* fix: harden bootstrap path validation and report guards (#22698) (thanks @arosstale)

---------

Co-authored-by: Peter Steinberger <steipete@gmail.com>
2026-02-22 13:17:07 +01:00
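
Note on 51e9c54f09: a minimal sketch of the null-guard described above. The `BootstrapFile` shape, the function name `buildContextFiles`, and the backslash normalization inside `replace()` are assumptions; the log only says that `buildBootstrapContextFiles()` called `file.path.replace()` on a possibly undefined `path`.

```typescript
// Assumed shape: a hook may forget to set `path` (e.g. it sets `filePath` instead).
interface BootstrapFile {
  path?: string;
  content: string;
}

interface ContextFile {
  path: string;
  content: string;
}

function buildContextFiles(
  files: BootstrapFile[],
  warn: (message: string) => void = console.warn,
): ContextFile[] {
  const result: ContextFile[] = [];
  for (const file of files) {
    // Guard: skip entries without a usable path instead of throwing a TypeError.
    if (typeof file.path !== "string" || file.path.trim() === "") {
      warn("skipping bootstrap file with missing path");
      continue;
    }
    // Illustrative normalization only; the real replace() arguments are not shown in the log.
    result.push({ path: file.path.replace(/\\/g, "/"), content: file.content });
  }
  return result;
}
```

Skipping with a warning keeps one misconfigured hook from failing every agent session.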
Peter Steinberger
7c3c406a35 fix: keep auth-profile cooldown windows immutable in-window (#23536) (thanks @arosstale) 2026-02-22 13:14:02 +01:00
artale
dc69610d51 fix(auth-profiles): never shorten cooldown deadline on retry
When the backoff saturates at 60 min and retries fire every 30 min
(e.g. cron jobs), each failed request was resetting cooldownUntil to
now+60m.  Because now+60m < existing deadline, the window kept getting
renewed and the profile never recovered without manually clearing
usageStats in auth-profiles.json.

Fix: only write a new cooldownUntil (or disabledUntil for billing) when
the new deadline is strictly later than the existing one.  This lets the
original window expire naturally while still allowing genuine backoff
extension when error counts climb further.

Fixes #23516

[AI-assisted]
2026-02-22 13:14:02 +01:00
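
Note on dc69610d51: the "only write when strictly later" rule can be sketched as below. `UsageStats` and `applyCooldown` are hypothetical names; only the `cooldownUntil` field name comes from the commit message, and timestamps are assumed to be epoch milliseconds.

```typescript
// Assumed shape of the per-profile stats; only `cooldownUntil` is named in the log.
interface UsageStats {
  cooldownUntil?: number; // epoch milliseconds
}

// Write a new deadline only when it is strictly later than the existing one,
// so a retry inside the window can never shorten or merely rewrite it.
function applyCooldown(stats: UsageStats, now: number, backoffMs: number): UsageStats {
  const candidate = now + backoffMs;
  if (stats.cooldownUntil !== undefined && candidate <= stats.cooldownUntil) {
    return stats; // keep the existing deadline
  }
  return { ...stats, cooldownUntil: candidate };
}
```

The same comparison would apply to `disabledUntil` for billing failures, which the commit message mentions alongside `cooldownUntil`.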
Peter Steinberger
47c3f742b6 fix(exec): require explicit safe-bin profiles 2026-02-22 12:58:55 +01:00
Peter Steinberger
29cc7f431f test: share runtime scan filters and cached test scans 2026-02-22 12:44:44 +01:00
Peter Steinberger
3a65e4b523 test: make snapshot env override assertion independent of host env 2026-02-22 12:40:30 +01:00
Peter Steinberger
a4607277a9 test: consolidate sessions_spawn and guardrail helpers 2026-02-22 12:34:55 +01:00
Peter Steinberger
c343132dbb fix(agents): harden bash tool and reply directive handling 2026-02-22 11:29:31 +00:00
Peter Steinberger
50c7aef22f test: stabilize session lock tests and move out of e2e 2026-02-22 11:28:20 +00:00
Peter Steinberger
401106b963 fix: harden flaky tests and cover native google thought signatures (#23457) (thanks @echoVic) 2026-02-22 12:24:53 +01:00
echoVic
9176571ec1 fix(gemini): sanitize thoughtSignatures for native Google provider
Native Google Gemini provider was accumulating 2K-8K tokens of Base64
thoughtSignature blobs per turn, causing premature context overflow.

The sanitizer was only enabled for OpenRouter Gemini, not native Google.

Fixes #23392
2026-02-22 12:24:53 +01:00
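
Note on 9176571ec1: the log does not say what the sanitizer does to each signature. One possible reading, sketched here purely as an assumption, is that stored history drops the `thoughtSignature` field from message parts before the next turn. The `Part` and `Message` shapes and the function name are hypothetical.

```typescript
// Assumed, simplified message shapes for illustration only.
interface Part {
  text?: string;
  thoughtSignature?: string; // Base64 blob that inflates context when kept
}

interface Message {
  role: string;
  parts: Part[];
}

// Return a copy of the history with every thoughtSignature removed.
// The input is not mutated.
function stripThoughtSignatures(messages: Message[]): Message[] {
  return messages.map((message) => ({
    ...message,
    parts: message.parts.map((part) => {
      const copy: Part = { ...part };
      delete copy.thoughtSignature;
      return copy;
    }),
  }));
}
```

Whatever the real sanitizer does, the commit's point is that it has to run for the native Google provider as well as for OpenRouter Gemini.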
Peter Steinberger
78c3c2a542 fix: stabilize flaky tests and sanitize directive-only chat tags 2026-02-22 12:19:33 +01:00
Peter Steinberger
7d09a9e74d test: update agent tool assertions and reclassify suites 2026-02-22 11:18:50 +00:00
Peter Steinberger
fcb86408fd test: move embedded and tool agent suites out of e2e 2026-02-22 11:17:47 +00:00
Peter Steinberger
e441390fd1 test: reclassify agent local suites out of e2e 2026-02-22 11:16:37 +00:00
Peter Steinberger
713e2928b2 test: move duplicate local scenario suites out of agents e2e 2026-02-22 10:56:58 +00:00
Peter Steinberger
bfada9e425 test: move more local agents helper suites out of e2e 2026-02-22 10:55:22 +00:00
Peter Steinberger
4267fc8593 test: reclassify pi embedded helper suites out of agents e2e 2026-02-22 10:53:50 +00:00
Peter Steinberger
adace58505 test: reclassify local helper suites out of agents e2e 2026-02-22 10:53:40 +00:00
Peter Steinberger
1d4e9ad8d1 test: reclassify remaining bash suites as unit tests 2026-02-22 10:48:32 +00:00
Peter Steinberger
ab38e1e6b2 test: reclassify image tool suite as unit test 2026-02-22 10:47:16 +00:00
Peter Steinberger
aa487bd4f3 test: reclassify bash pty suites as unit tests 2026-02-22 10:47:10 +00:00
Peter Steinberger
3c9f98452e test: reclassify tool-result persist hook suite as unit test 2026-02-22 10:46:02 +00:00
Peter Steinberger
047e18693e test: reclassify exec approval-id suite as unit test 2026-02-22 10:45:23 +00:00
Peter Steinberger
17a65a6f4c test: split pure docker exec arg checks from bash e2e suite 2026-02-22 10:44:40 +00:00
Peter Steinberger
239963ac44 perf(test): shrink bash command fixtures and polling windows 2026-02-22 10:43:22 +00:00
Peter Steinberger
1d7dbd8cd9 test: reclassify web fetch/readability suites as unit tests 2026-02-22 10:41:29 +00:00
Peter Steinberger
304eef575b test: reclassify sandbox and web/image tool suites as unit tests 2026-02-22 10:40:40 +00:00
Peter Steinberger
3b09a0d2d0 perf(test): trim bash e2e log fixtures and abort wait bounds 2026-02-22 10:39:18 +00:00
Peter Steinberger
c68bb8d6d5 test: stabilize bash e2e suites with explicit exec approvals mode 2026-02-22 10:37:44 +00:00