2361 Commits

Author SHA1 Message Date
Vignesh Natarajan
68b92e80f7 Agents: log lifecycle error text for embedded run failures 2026-02-21 19:24:45 -08:00
Vignesh Natarajan
35fe33aa90 Agents: classify Anthropic api_error internal server failures for fallback 2026-02-21 19:22:16 -08:00
Vignesh Natarajan
c45a5c551f Agents: preserve unsafe integer tool args in Ollama stream 2026-02-21 19:08:31 -08:00
Peter Steinberger
8083cb8e0b test(web-fetch): dedupe blocked-url SSRF assertions 2026-02-21 23:58:33 +00:00
Peter Steinberger
a97992fcf2 test(pi-tools): share safeBins e2e setup and teardown 2026-02-21 23:58:33 +00:00
Peter Steinberger
012654c7c5 test(sandbox): table-drive dangerous docker config rejection cases 2026-02-21 23:58:33 +00:00
Peter Steinberger
a353dae14f test(image-tool): share temp agent dirs and table-drive validation cases 2026-02-21 23:58:33 +00:00
Peter Steinberger
f589295a0a test(actions): table-drive discord presence mappings 2026-02-21 23:44:01 +00:00
Peter Steinberger
0afd5d38c5 test(actions): table-drive discord reaction and permission cases 2026-02-21 23:43:01 +00:00
Peter Steinberger
2595690a4d test(actions): table-drive slack and telegram action cases 2026-02-21 23:43:01 +00:00
Peter Steinberger
8922cb4085 test(sandbox): share sandbox-root setup across path cases 2026-02-21 23:38:43 +00:00
Peter Steinberger
d3991d6aa9 fix: harden sandbox tmp media validation (#17892) (thanks @dashed) 2026-02-22 00:31:21 +01:00
Alberto Leal
0bb81f7294 fix(media): allow os.tmpdir() paths in sandbox media source validation
resolveSandboxedMediaSource() rejected all paths outside the sandbox
workspace root, including /tmp. This blocked sandboxed agents from
sending locally-generated temp files (e.g. images from Python scripts)
via messaging actions.

Add an os.tmpdir() prefix check before the strict sandbox containment
assertion, consistent with buildMediaLocalRoots() which already
includes os.tmpdir() in its default allowlist. Path traversal through
/tmp (e.g. /tmp/../etc/passwd) is prevented by path.resolve()
normalization before the prefix check.

Relates-to: #16382, #14174
2026-02-22 00:31:21 +01:00
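
The check described in the commit above can be sketched roughly as follows. This is a minimal illustration, not the project's code: `isAllowedMediaPath` and `isUnderRoot` are hypothetical names, and the real signature of `resolveSandboxedMediaSource()` is not shown in this log. It assumes the candidate is already a local filesystem path (not a `file://` or HTTP URL).

```typescript
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical helper: true when `resolved` is `root` itself or lies below it.
// Appending path.sep keeps "/tmp-evil" from matching the "/tmp" prefix.
function isUnderRoot(resolved: string, root: string): boolean {
  return resolved === root || resolved.startsWith(root + path.sep);
}

// Hypothetical stand-in for the validation step described in the commit:
// normalize first, then allow os.tmpdir() paths, then fall back to the
// strict sandbox containment check.
function isAllowedMediaPath(candidate: string, sandboxRoot: string): boolean {
  // path.resolve() collapses ".." segments, so "/tmp/../etc/passwd"
  // becomes "/etc/passwd" before any prefix comparison happens.
  const resolved = path.resolve(candidate);
  const tmpRoot = path.resolve(os.tmpdir());
  if (isUnderRoot(resolved, tmpRoot)) {
    return true;
  }
  return isUnderRoot(resolved, path.resolve(sandboxRoot));
}
```

Note that `path.resolve()` only normalizes the path string; it does not follow symlinks, so this sketch says nothing about a symlink inside the temp directory that points elsewhere.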
Alberto Leal
4cf5c3e109 test: add unit tests for resolveSandboxedMediaSource
Add baseline test coverage for the previously untested
resolveSandboxedMediaSource() function, covering sandbox-relative
path resolution, rejection of paths outside the sandbox root,
path traversal prevention, file:// URL handling, HTTP URL
passthrough, and empty input edge cases.
2026-02-22 00:31:21 +01:00
Peter Steinberger
c9593c4c87 test(sandbox): table-drive bind and network validation cases 2026-02-21 23:28:07 +00:00
Peter Steinberger
780bbbd062 fix: restore CI checks after #23012 (thanks @druide67) 2026-02-22 00:16:15 +01:00
Peter Steinberger
843a037532 fix(test): repair readonly case table typing 2026-02-22 00:10:07 +01:00
Peter Steinberger
f3d4045c03 test: matrix owner and timezone system-prompt cases 2026-02-21 23:02:44 +00:00
Gustavo Madeira Santana
738e2c21dd chore(tests): properly check logging in tests 2026-02-21 17:21:48 -05:00
Harry Cui Kepler
ffa63173e0 refactor(agents): migrate console.warn/error/info to subsystem logger (#22906)
Merged via /review-pr -> /prepare-pr -> /merge-pr.

Prepared head SHA: a806c4cb2700564096ce8980a8d7f839f8a0d388
Co-authored-by: Kepler2024 <166882517+Kepler2024@users.noreply.github.com>
Co-authored-by: gumadeiras <5599352+gumadeiras@users.noreply.github.com>
Reviewed-by: @gumadeiras
2026-02-21 17:11:47 -05:00
Peter Steinberger
861718e4dc test: group remaining suite cleanups 2026-02-21 21:44:57 +00:00
Peter Steinberger
544a1142b0 test(agents): dedupe skill helper fixtures and cover empty-body rendering 2026-02-21 21:40:39 +00:00
Peter Steinberger
a418c6db06 test(agents): dedupe agent-path fixtures and cover env override precedence 2026-02-21 21:40:39 +00:00
Peter Steinberger
8f1b467646 test(agents): dedupe exec preflight fixtures and cover quoted-path skip 2026-02-21 21:40:39 +00:00
Peter Steinberger
e978297c28 test(agents): dedupe workspace template temp roots and cover fallback resolution 2026-02-21 21:40:38 +00:00
Val Alexander
b703ea3675 fix: prevent compaction "prompt too long" errors (#22921)
* Include prompt overhead in the compaction safeguard calculation.

Subtracts SUMMARIZATION_OVERHEAD_TOKENS from maxChunkTokens in both the main summarization path and the dropped-messages summarization path.

This ensures the chunk budget leaves room for the prompt overhead that generateSummary wraps around each chunk.

* Budget for overhead tokens by using an effectiveMax instead of using maxTokens naïvely.

- Added `SUMMARIZATION_OVERHEAD_TOKENS = 4096` — a budget for the tokens that `generateSummary` adds on top of the serialized conversation (system prompt, `<conversation>` tags, summarization instructions, `<previous-summary>` block, and reasoning: "high" thinking budget).
- `chunkMessagesByMaxTokens` now divides `maxTokens` by `SAFETY_MARGIN` (1.2) before comparing against estimated token counts. Previously, the safety margin was only used in `computeAdaptiveChunkRatio` and `isOversizedForSummary` but not in the actual chunking loop — so chunks could be built that fit the estimated budget but exceeded the real budget once the API tokenized them properly.
2026-02-21 14:42:18 -06:00
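
The two adjustments described in the commit above can be illustrated with a small sketch. Only the constant names and values (`SUMMARIZATION_OVERHEAD_TOKENS = 4096`, `SAFETY_MARGIN` of 1.2) come from the commit message; the `Message` type, the 4-characters-per-token estimator, the floor of 1 in `chunkBudgetSketch`, and both function names are assumptions made for illustration and are not the project's implementation.

```typescript
const SUMMARIZATION_OVERHEAD_TOKENS = 4096; // value stated in the commit message
const SAFETY_MARGIN = 1.2; // value stated in the commit message

// Hypothetical message shape for this sketch.
interface Message {
  text: string;
}

// Hypothetical estimator: roughly 4 characters per token.
function estimateTokens(message: Message): number {
  return Math.ceil(message.text.length / 4);
}

// Reserve room for the prompt that wraps each chunk.
// The floor of 1 is an assumption so the budget never goes non-positive.
function chunkBudgetSketch(maxChunkTokens: number): number {
  return Math.max(1, maxChunkTokens - SUMMARIZATION_OVERHEAD_TOKENS);
}

// Greedy chunking that compares estimates against maxTokens / SAFETY_MARGIN,
// so an underestimate of up to 20% still fits the real budget.
function chunkByMaxTokensSketch(messages: Message[], maxTokens: number): Message[][] {
  const effectiveMax = maxTokens / SAFETY_MARGIN;
  const chunks: Message[][] = [];
  let current: Message[] = [];
  let currentTokens = 0;
  for (const message of messages) {
    const tokens = estimateTokens(message);
    // Close the current chunk when adding this message would exceed the margin-adjusted cap.
    if (current.length > 0 && currentTokens + tokens > effectiveMax) {
      chunks.push(current);
      current = [];
      currentTokens = 0;
    }
    current.push(message);
    currentTokens += tokens;
  }
  if (current.length > 0) {
    chunks.push(current);
  }
  return chunks;
}
```

With five messages estimated at 100 tokens each and `maxTokens` of 300, the margin-adjusted cap is 250, so the sketch closes each chunk after two messages, where a comparison against the raw 300 would have packed three into the first chunk.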
Peter Steinberger
0a207b9860 refactor(test): share temp workspace helper in compact skill path tests 2026-02-21 19:16:22 +00:00
Peter Steinberger
324922f804 refactor(test): dedupe temp dir lifecycle in agents skills directory e2e 2026-02-21 19:16:22 +00:00
Peter Steinberger
b3c7fd6c69 refactor(test): dedupe temp dirs and skill writer in snapshot e2e 2026-02-21 19:16:22 +00:00
Peter Steinberger
85c768d3d2 refactor(test): dedupe temp workspace setup in skills load entries e2e 2026-02-21 19:16:22 +00:00
Peter Steinberger
0401762144 refactor(test): dedupe temp root setup in identity avatar e2e 2026-02-21 19:16:22 +00:00
Peter Steinberger
9ead79937e refactor(test): dedupe temp session path setup in file repair e2e 2026-02-21 19:16:22 +00:00
Peter Steinberger
70fdab6e95 test(agents): add coverage for shared skill writer helper 2026-02-21 19:16:21 +00:00
Peter Steinberger
0876fbde19 refactor(test): reuse shared skill writer in skills e2e 2026-02-21 19:16:21 +00:00
Peter Steinberger
f086245afe refactor(test): reuse shared skill writer in sandbox and bundled tests 2026-02-21 19:16:21 +00:00
Peter Steinberger
96ef00ec38 refactor(test): drop redundant env snapshots in skill download suites 2026-02-21 19:16:21 +00:00
Peter Steinberger
603e28648b refactor(test): centralize temp workspace env handling for skill install tests 2026-02-21 19:16:21 +00:00
Peter Steinberger
61817c90e7 refactor(test): share temp workspace helper for skill download suites 2026-02-21 19:16:21 +00:00
Peter Steinberger
a814cce359 refactor(test): share temp command dir helper in shell utils e2e 2026-02-21 19:16:21 +00:00
Peter Steinberger
3fd7dc5046 refactor(test): snapshot shell/path env in bash tools e2e 2026-02-21 19:16:21 +00:00
Peter Steinberger
272bf2d8bc refactor(test): dedupe env override assertions in skills e2e 2026-02-21 19:16:21 +00:00
Peter Steinberger
7ba09e414f refactor(test): snapshot env in shell utils e2e 2026-02-21 19:13:47 +00:00
Peter Steinberger
5dc1b5a8db refactor(test): reuse env helper in workspace skill sync gating 2026-02-21 19:13:47 +00:00
Peter Steinberger
c0706b7799 refactor(test): reuse env helper in workspace skill status tests 2026-02-21 19:13:47 +00:00
Peter Steinberger
cf371fde6d refactor(test): use env helper in workspace skills prompt gating 2026-02-21 19:13:47 +00:00
Peter Steinberger
8745964142 refactor(test): snapshot PATH env in bash tools exec path e2e 2026-02-21 19:13:47 +00:00
Peter Steinberger
af66e3103a test(agents): cover bundled skills env override and dedupe setup 2026-02-21 19:13:47 +00:00
Peter Steinberger
ae06dbb794 refactor(test): snapshot tar.bz2 skills install env 2026-02-21 19:13:47 +00:00
Peter Steinberger
b44aa5b1f7 refactor(test): snapshot skills install state dir env 2026-02-21 19:13:47 +00:00
Peter Steinberger
884166c7af refactor(test): snapshot telegram action env in e2e suite 2026-02-21 19:13:47 +00:00