Ai Agent: Agentzero

capote · May 21, 2026, 5:28pm

I have now gained some experience. For cost reasons, I am only using Deepseek v4 Pro and Flash.

Notable findings:

Token consumption is 10 times higher than with the Hermes Agent.

1. AgentZero’s disastrous caching behaviour (the biggest single factor)

AgentZero on flash: ICR 0.3:1 — 75% of all input tokens are cache_miss.

Hermes on Flash: ICR 9.9:1 — only 9% cache_miss.

This means: AgentZero rebuilds the context for every request instead of using the cached system prompt.
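For anyone who wants to check the percentages: the sketch below converts a ratio into a cache_miss share. It assumes "ICR" means the ratio of cache_hit to cache_miss input tokens (the abbreviation is not defined above, so this reading is an assumption), and `cache_miss_share` is just an illustrative name.

```python
def cache_miss_share(icr: float) -> float:
    """Share of input tokens billed as cache_miss for a hit:miss ratio of icr:1.

    Assumption: ICR = cache_hit tokens / cache_miss tokens.
    """
    if icr < 0:
        raise ValueError("ratio must be non-negative")
    return 1.0 / (icr + 1.0)


for name, icr in [("AgentZero on Flash", 0.3), ("Hermes on Flash", 9.9)]:
    print(f"{name}: {cache_miss_share(icr):.0%} cache_miss")
```

Under this reading, 0.3:1 works out to about 77% cache_miss and 9.9:1 to about 9%, which is close to the figures quoted above.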

2. AgentZero uses pro-model for 187.8M tokens

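Per 1M tokens, pro is about 3× more expensive than flash for output ($0.87 vs $0.28) and for cache_miss ($0.435 vs $0.14), and about 1.3× for cache_hit ($0.0036 vs $0.0028). A pro cache_miss costs about 155× as much as a flash cache_hit.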

AgentZero’s pro-cache_miss (28.2M tokens) alone costs $12.27.
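The arithmetic behind that figure can be reproduced with a few lines. This is only a sketch: the prices are the ones quoted above, and treating them as USD per 1M tokens is an assumption (it is the unit that makes the $12.27 come out). The names `PRICE_PER_MILLION_USD` and `cost_usd` are made up for the example.

```python
# Prices as quoted in this post; assumed to be USD per 1M tokens.
PRICE_PER_MILLION_USD = {
    "pro": {"output": 0.87, "cache_miss": 0.435, "cache_hit": 0.0036},
    "flash": {"output": 0.28, "cache_miss": 0.14, "cache_hit": 0.0028},
}


def cost_usd(model: str, kind: str, tokens: int) -> float:
    """Cost of `tokens` tokens of the given kind on the given model."""
    return tokens / 1_000_000 * PRICE_PER_MILLION_USD[model][kind]


pro_miss = cost_usd("pro", "cache_miss", 28_200_000)
flash_miss = cost_usd("flash", "cache_miss", 28_200_000)
print(f"pro cache_miss:   ${pro_miss:.2f}")
print(f"flash cache_miss: ${flash_miss:.2f}")
```

The same 28.2M cache_miss tokens would have cost roughly $3.95 on flash instead of $12.27 on pro.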

In bot-to-bot chat via Telegram, A0 tends towards pointless, unproductive status messages and nudging. Despite loop prevention, it always ‘pushes’ itself to the forefront.

It has received the following instructions from me and repeatedly disregards them.

# SECURITY RULESET: BOT-TO-BOT COMMUNICATION (Telegram)

## PRIMARY DIRECTIVE — ANTI-LOOP & ANTI-FLOODING

You are communicating with other bots via Telegram. The following rules have ABSOLUTE PRIORITY
and must not be overridden under any circumstances:

---

## RULE 1: EOC DETECTION (End of Conversation)

If an incoming message ends with or contains the string `//EOC//`:
- IMMEDIATELY stop all further processing of this message.
- Send NO response.
- Terminate all reasoning and processing for this thread.
- Log internally: `[EOC received — processing terminated]`

---

## RULE 2: RESPONSE EXPECTATION CHECK

Before responding to any incoming message, execute this check:

### STEP A — Automatic rejection for the following message types:
Send NO response if the message:
- Is a pure status notification (e.g. "Bot started", "Task completed", "OK", "Done", "✓")
- Is a pure informational message with no recognizable question
- Consists only of a single period (`.`), an emoji, or a single character
- Is an empty string or whitespace-only
- Ends with `//EOC//` (→ Rule 1 applies)
- Contains a loop indicator (repeated identical messages within 60 seconds)

### STEP B — Respond ONLY to PRODUCTIVE messages:
Respond ONLY if the message:
- Contains a direct question (question mark or implicit question structure)
- Contains a concrete task or assignment
- Explicitly requests a response or action
- Provides data or input whose processing requires an output

---

## RULE 3: SELF-APPLIED EOC

If you send a message that does NOT expect or require a further response:
- Append `//EOC//` to the end of your message.

Examples requiring EOC on your own output:
- Completion notices after finished tasks
- Status updates without need for action
- Information that requires no reaction
- Error reports that need no follow-up question
- Confirmations ("Done", "Saved", "Transmitted")

---

## RULE 4: ANTI-LOOP PROTECTION

### Loop Detection:
- Maintain an internal counter for identical or near-identical messages.
- If the same or >80% similar message occurs more than 2× in a row
  or within 60 seconds → STOP, send `//EOC//`, and log the loop.
- Circular conversation patterns (A replies to B, B replies identically to A)
  are automatically terminated with `//EOC//` after 3 cycles.

### Loop Resolution:
If a loop is detected, send once:
`[SYSTEM: Loop protection active — conversation terminated] //EOC//`

---

## RULE 5: CONTENT QUALITY GATE

Before sending any message, verify:
- [ ] Does the message contain substantial, productive information?
- [ ] Is the message necessary for the progress of the task?
- [ ] Would a non-response (with EOC) be better than this response?

BLOCKED — NEVER send these messages:
- Single periods (`.`)
- Single emojis (👍, ✓, 😊, etc.)
- Empty acknowledgements without content ("OK", "Good", "Understood", "👌")
- Hollow filler sentences ("I have taken note of this.")
- Messages that only repeat the question

---

## RULE 6: TARGETED RESPONSE REQUESTS

When you need a response from another bot, formulate the request as follows:
- Ask a clear, answerable question.
- Specify what format or type of response is expected.
- If NO response is needed, append `//EOC//`.

Example CORRECT:
`Analyze the following data and return the result as JSON: [DATA]`

Example INCORRECT:
`I have the data.` ← No EOC, no question → Loop risk!

CORRECT with EOC:
`Data has been transmitted. //EOC//`

---

## SUMMARY — DECISION TREE

Incoming message received
│
▼
Contains //EOC// ?
YES → STOP (no response)
NO ↓
▼
Loop detected (>2 repetitions)?
YES → Send “[SYSTEM: Loop protection active] //EOC//” → STOP
NO ↓
▼
Is it a productive message (question / task / request)?
NO → STOP (no response)
YES ↓
▼
Process and generate response
↓
Does the response require a reaction?
NO → Append //EOC//
YES → Send without //EOC//


---

## PRIORITY ORDER

1. EOC Detection (Rule 1) — highest priority
2. Loop Protection (Rule 4)
3. Response Expectation Check (Rule 2)
4. Content Quality Gate (Rule 5)
5. Normal task processing
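
Since the model keeps disregarding the prompt, one option might be to enforce the decision tree in code before a message ever reaches the LLM. The following is a minimal sketch of that idea, not anything Agent Zero ships: `decide`, `is_loop`, the status-word set and the request keywords are all hypothetical, and the similarity check uses `difflib.SequenceMatcher` as a stand-in for "80% similar".

```python
import re
import time
from difflib import SequenceMatcher

EOC = "//EOC//"
# Hypothetical list of pure status/acknowledgement messages (Rule 2, Step A).
STATUS_ONLY = {"ok", "done", "good", "understood", "bot started", "task completed"}
# Hypothetical keywords that hint at a task or request (Rule 2, Step B).
REQUEST_HINTS = re.compile(r"\b(please|analy[sz]e|return|send|run|create|check)\b", re.I)


def is_loop(text, history, now, window=60.0, threshold=0.8, max_repeats=2):
    """True if `text` would repeat more than `max_repeats` times within `window` seconds.

    `history` is a list of (timestamp, text) tuples of earlier incoming messages.
    """
    similar = [
        ts for ts, old in history
        if now - ts <= window and SequenceMatcher(None, old, text).ratio() > threshold
    ]
    return len(similar) >= max_repeats


def decide(text, history, now=None):
    """Return 'stop', 'loop' or 'respond', following the priority order above."""
    now = time.time() if now is None else now
    stripped = text.strip()
    if EOC in stripped:                       # Rule 1: EOC detection
        return "stop"
    if is_loop(stripped, history, now):       # Rule 4: loop protection
        return "loop"
    if len(stripped) <= 1 or stripped.lower() in STATUS_ONLY:
        return "stop"                         # Rule 2, Step A: nothing to answer
    if "?" in stripped or REQUEST_HINTS.search(stripped):
        return "respond"                      # Rule 2, Step B: productive message
    return "stop"
```

On `"loop"` the caller would send the loop-protection notice once and stop; on `"stop"` nothing is sent at all, so the model never gets a chance to add a status message.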

oneitonitram · May 21, 2026, 7:41pm

Great analysis, thanks. I've been using token plans with it, so I never noticed the cost differences.

capote · May 21, 2026, 8:03pm

Maybe you could enable caching in the default settings and remove the dynamic EXTRAS blocks from the prompt (this reduces input tokens from 18k to 5k per request).

The file /a0/agent.py needs to be modified. The EXTRAS block in this response should only contain static content (loaded_skills, memories, solutions). The four cache-killers (current_datetime, agent_info, project_file_structure, remote_file_structure) should no longer be sent.
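
To illustrate the idea (this is not the actual code in /a0/agent.py): the sketch below assumes the EXTRAS block is assembled from a dict of named text sections, which is an assumption about the internals. `VOLATILE_EXTRAS`, `stable_extras` and `render_extras` are hypothetical names.

```python
# The four sections named above that change between requests and so break
# prefix caching.
VOLATILE_EXTRAS = {
    "current_datetime",
    "agent_info",
    "project_file_structure",
    "remote_file_structure",
}


def stable_extras(extras: dict[str, str]) -> dict[str, str]:
    """Keep only the sections that stay identical between requests."""
    return {name: text for name, text in extras.items() if name not in VOLATILE_EXTRAS}


def render_extras(extras: dict[str, str]) -> str:
    """Render the remaining sections in a fixed (sorted) order.

    A fixed order keeps the rendered prompt byte-identical across requests,
    which is what automatic prefix caching depends on.
    """
    kept = stable_extras(extras)
    return "\n\n".join(f"[{name}]\n{kept[name]}" for name in sorted(kept))
```

With this, two requests that differ only in `current_datetime` render to exactly the same EXTRAS text.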

capote · May 21, 2026, 9:09pm

Which LLM / AI provider do you use?

oneitonitram · May 22, 2026, 7:43am

I have a $39 Kimi sub, because their built-in agent is great for business-related documents and some prototyping, and it also comes with generous API usage.
I fall back to Crof when my 5-hour or weekly limits run out.

I have a $20 CrofAI subscription, which charges per request. It gives me great GLM 5.1, Deepseek and other OSS inference.

I also have an on/off Codex subscription. Together these more than meet my usage needs.

For the IDE, I have a Trae subscription and use their IDE. It has BYOK, so I can use all my other subs in it, it has generous default limits, and it has a nifty SOLO (a Codex-like app) that also works with BYOK.

Both the IDE and SOLO share the same configurations, skills, slash commands etc., and I can define custom agents, so I always get consistent results no matter what model I use.

capote · May 22, 2026, 1:29pm

How do you switch between LLMs? Have you configured this in A0 so that it switches automatically as needed, or do you specify the LLM in the prompt?

I haven’t quite got to grips with the options yet.

For example, in my case, I’d prefer to use Deepseek v4 Chat for simple tasks and reserve Deepseek Pro for more demanding ones.

Hermes Agent apparently has LLM detection and deployment control built into its framework.

oneitonitram · May 22, 2026, 3:40pm

In Agent Zero, at the bottom, we have a preset.

The presets define which model combinations are used upon selection.

Also, within the agent context:

Each agent can define its own models, so a researcher can use one model while a hacker uses a different one.

Equally, the developer or any custom agent can use a different model as needed.

Also note that, for development purposes, Agent Zero can control a full Codex or OpenCode instance to do the development, track its progress, and report what's happening.

oneitonitram · May 22, 2026, 6:39pm

@capote caching seems to be a known issue with OpenAI-compatible APIs

github.com/agent0ai/agent-zero

Fix prompt caching for OpenAI-compatible providers

opened 07:04PM - 19 May 26 UTC

NocturnusCoder

## Problem

Agent Zero defaults to `explicit_caching=True` (`agent.py:796`) and injects Anthropic's `cache_control: {"type": "ephemeral"}` into messages (`models.py:374-380`). This field is **silently ignored** by OpenAI-compatible providers like NanoGPT and OpenCode Go, so users pay full token costs with no visibility that caching isn't working.

## Current Behavior by Provider

| Provider | Caching Works? | Why |
|----------|---------------|-----|
| Anthropic (Claude) | ✅ Yes | `cache_control` is native |
| OpenAI direct | ❌ No | Expects automatic prefix-matching, no special fields |
| NanoGPT / OpenCode Go / OpenRouter | ❌ No | Silently drops unknown `cache_control` key |
| DeepSeek | ❌ No | Uses different metadata structure |

## Proposed Solution

Add a `caching_strategy` field to model presets:

| Strategy | Behavior |
|----------|----------|
| `anthropic_breakpoint` | Current behavior (default) |
| `openai_prefix` | Clean payload, no `cache_control`, relies on provider auto-caching |
| `disabled` | Strips all cache-related fields |
| `custom` | Passthrough via `extra_body` for power users |

## Minimal Short-Term Fix

A boolean `enable_anthropic_caching` toggle per model (default `false`) would prevent the silent failure for most non-Anthropic users without requiring a full strategy system.

## Quantifiable Gains

- **Cost:** 20–40% API spend reduction on long-context conversations when caching actually works
- **Latency:** Faster multi-turn conversations via cached prefix hits
- **Transparency:** Cache hit/miss logging instead of silent failures

## Related Code

- `agent.py:796`: `explicit_caching=True` hardcoded
- `models.py:374-380`: Anthropic-only `cache_control` injection
- `helpers/tokens.py`: No cached token tracking
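
As a rough illustration of what the proposed `caching_strategy` switch could look like, here is a sketch that only covers the payload-cleaning part. It is not code from the issue or from Agent Zero; `apply_caching_strategy` is a hypothetical helper, and it assumes messages are plain dicts in the usual chat-completions shape.

```python
from copy import deepcopy

# Strategy names taken from the issue's proposal.
STRATEGIES = {"anthropic_breakpoint", "openai_prefix", "disabled", "custom"}


def apply_caching_strategy(messages: list[dict], strategy: str) -> list[dict]:
    """Return the messages to send for the given strategy.

    - anthropic_breakpoint / custom: payload is passed through unchanged.
    - openai_prefix / disabled: every `cache_control` key is stripped, both on
      the message itself and on list-style content blocks.
    """
    if strategy not in STRATEGIES:
        raise ValueError(f"unknown caching strategy: {strategy!r}")
    if strategy in ("anthropic_breakpoint", "custom"):
        return messages
    cleaned = deepcopy(messages)  # never mutate the caller's history
    for msg in cleaned:
        msg.pop("cache_control", None)
        content = msg.get("content")
        if isinstance(content, list):
            for block in content:
                if isinstance(block, dict):
                    block.pop("cache_control", None)
    return cleaned
```

For providers that cache by prefix matching, sending the cleaned payload is all that is needed on the client side; whether a hit actually occurs is decided by the provider.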

Keep an eye on this issue to see when it’s fixed.

capote · May 22, 2026, 9:17pm

New version available

A new version (v1.16) of Agent Zero has been released.

capote · May 22, 2026, 9:58pm

I fixed it with the help of A0.

Based on GitHub Issue #1655, the Anthropic-specific cache_control field has been disabled:

- Problem: Agent Zero sent `cache_control: {"type": "ephemeral"}` to all providers; DeepSeek silently ignores this field.
- Solution: Set `explicit_caching` in /a0/agent.py (line 801) from True to False.
- Effect: Clean request payload without unnecessary metadata; DeepSeek continues to cache automatically via prefix detection.

Cache Hit/Miss Logging

To ensure transparency regarding cache efficiency, a logging system has been implemented in /a0/helpers/tokens.py:

- `CacheUsage` data class with `prompt_tokens`, `completion_tokens`, `cached_prompt_tokens`, `cache_hit`
- `parse_cache_from_response()`: Extracts cache data from API responses (DeepSeek, OpenAI and Anthropic formats)
- `log_cache_usage()`: Logs cache hit/miss information in the system log for monitoring and cost transparency
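
For reference, a minimal sketch of what such helpers might look like. This is a reconstruction for illustration, not the code that A0 actually wrote into /a0/helpers/tokens.py. It assumes the function receives the `usage` object of a response as a plain dict, and the provider field names (`prompt_cache_hit_tokens` for DeepSeek, `prompt_tokens_details.cached_tokens` for OpenAI, `cache_read_input_tokens` and `cache_creation_input_tokens` for Anthropic) are as I understand the providers' documentation, so check them against the current docs.

```python
import logging
from dataclasses import dataclass

logger = logging.getLogger("cache_usage")


@dataclass
class CacheUsage:
    prompt_tokens: int
    completion_tokens: int
    cached_prompt_tokens: int
    cache_hit: bool


def parse_cache_from_response(usage: dict) -> CacheUsage:
    """Normalise a provider `usage` dict into a CacheUsage record."""
    if "prompt_cache_hit_tokens" in usage:            # DeepSeek-style usage
        prompt = usage.get("prompt_tokens", 0)
        completion = usage.get("completion_tokens", 0)
        cached = usage["prompt_cache_hit_tokens"]
    elif "cache_read_input_tokens" in usage:          # Anthropic-style usage
        cached = usage["cache_read_input_tokens"] or 0
        # Assumption: input_tokens excludes cached reads and cache writes.
        prompt = (usage.get("input_tokens", 0) + cached
                  + (usage.get("cache_creation_input_tokens") or 0))
        completion = usage.get("output_tokens", 0)
    else:                                             # OpenAI-style usage
        prompt = usage.get("prompt_tokens", 0)
        completion = usage.get("completion_tokens", 0)
        details = usage.get("prompt_tokens_details") or {}
        cached = details.get("cached_tokens") or 0
    return CacheUsage(prompt, completion, cached, cache_hit=cached > 0)


def log_cache_usage(u: CacheUsage) -> None:
    """Write one hit/miss line to the log."""
    share = u.cached_prompt_tokens / u.prompt_tokens if u.prompt_tokens else 0.0
    logger.info("cache %s: %d/%d prompt tokens cached (%.0f%%), %d completion tokens",
                "HIT" if u.cache_hit else "MISS", u.cached_prompt_tokens,
                u.prompt_tokens, share * 100, u.completion_tokens)
```

Logging the cached share per request makes it easy to see whether a prompt change (such as removing the dynamic EXTRAS sections) actually improves the hit rate.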

capote · May 24, 2026, 1:26pm

A new version (v1.17) of Agent Zero has been released.

LayLow · May 24, 2026, 9:05pm

What is the consensus on the release schedule/policy for apps? Does the community need to ask with every update of the core component, or is it the ‘responsibility’ of the app's author to update and release a new version?

This hits the same sweet spot as reporting and acting on BaseOS security issues.

The most dull answer could be ‘It is free as in Beer’. No rant intended, just asking out loud.

oneitonitram · May 26, 2026, 7:47am

@LayLow we try to ship app updates as fast as possible, usually after internal testing. If we face challenges, we usually ask the community to help test.

Some apps can be a pain to update, and sometimes we also tend to miss updates released by the core apps themselves.

If a community member notices an update, they can send a shout-out.

Regarding the beer, @LayLow, I wouldn’t mind receiving some; they surely do help somewhere.

We have been building an NS8-specific update scheduler that would fetch app updates, apply them, trigger the install and configuration on a server, test the health of the app, and then do a full release…

It’s easier said than done, but we have some initial work on GitHub.

LayLow · May 26, 2026, 11:41am

Fair enough

Nice! Thanks.

capote · May 28, 2026, 3:58pm

A new version (v1.18) of Agent Zero has been released.

oneitonitram · May 29, 2026, 9:37am

The current version should update to v1.18.

capote · June 7, 2026, 9:18pm

1.20 has been released

oneitonitram · June 9, 2026, 3:22am

An update on the fact that Agent Zero has a full desktop: I've been using the built-in desktop to run long-running builds with the likes of opencode and kimi code.

capote · June 16, 2026, 4:32am

I updated up to 1.1.8 via

api-cli run update-module --data '{
  "module_url": "ghcr.io/geniusdynamics/agentzeroai:latest",
  "instances": ["agentzeroai1"],
  "force": true
}'

Docker Image:

runagent -m agentzeroai2 podman inspect agentzero-app --format '{{.Config.Image}}' | sed 's/.*://'
v1.20

But the user interface shows this version and an update prompt.

What’s going on?

There is also a version number at the bottom of the left-hand sidebar.

Version M v1.12 2026-05-03 01:26:04

harry · June 17, 2026, 12:12am

I am a novice here, but the command add-module fails with add-module: command not found.