Architecture
Four symbiotic stages, running on a continuous cycle. Each one feeds the next; together they compose Honey Nudger’s recursive self-improvement loop — the engine that powers every deployment, whether it’s running against the public Hivemind or as a private instance for your team.
No manual prompt engineering. The loop runs itself, discovering new optimizations without human review bottlenecks.
Every hint is treated as a hypothesis and validated through Bayesian A/B testing before it ships.
Each cycle compounds on the last. Better hints change what the system sees next, which changes what gets distilled, and so on.
Send prompts, report outcomes. The four-stage loop sits behind those two endpoints.
Each stage feeds the next, and the handoffs between them are the architecture.
Observe
Every interaction the agent has is captured alongside the business outcome it produced: purchases, satisfaction scores, click-throughs, whatever the host system reports back.
Feeds Distill with a stream of outcome-labeled interactions.
Distill
The system mines the highest-performing interactions for the patterns that drove them, condensing each pattern into a candidate Optimization Hint: a portable, reusable nudge.
Feeds Verify with newly distilled candidate hints.
Verify
Candidate hints are treated as hypotheses and tested live via Bayesian A/B testing (Thompson Sampling). Traffic flows to the strongest candidates first; nothing graduates without statistical evidence.
Feeds Promote with statistically verified champions.
Promote
Proven winners enter production and start nudging the next generation of interactions. Underperformers are automatically retired. Each promotion changes what Observe sees next, and the cycle compounds.
Feeds Observe by changing the distribution of interactions captured next.
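One way to picture the handoffs is as four functions whose outputs are each other's inputs. The sketch below is a toy illustration under stated assumptions, not Honey Nudger's implementation: the `Interaction` and `Hint` types, the `pattern` field, the top-fraction cutoff in `distill`, and the simple baseline comparison in `verify` (standing in for the live Bayesian test) are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Interaction:
    pattern: str    # hypothetical label for what the agent did
    outcome: float  # business outcome reported by the host system


@dataclass(frozen=True)
class Hint:
    text: str
    mean_outcome: float


def distill(interactions, top_fraction=0.5):
    """Mine the best-scoring interactions and condense each pattern into a candidate hint."""
    ranked = sorted(interactions, key=lambda i: i.outcome, reverse=True)
    top = ranked[: max(1, int(len(ranked) * top_fraction))]
    by_pattern = defaultdict(list)
    for item in top:
        by_pattern[item.pattern].append(item.outcome)
    return [Hint(pattern, mean(values)) for pattern, values in by_pattern.items()]


def verify(candidates, baseline):
    """Placeholder for the live A/B test: keep candidates that beat the baseline."""
    return [hint for hint in candidates if hint.mean_outcome > baseline]


def promote(production, champions):
    """Champions enter production; the returned set shapes what Observe sees next."""
    return production | {hint.text for hint in champions}


def run_cycle(observed, production):
    """One pass of the loop: observed interactions in, updated production hints out."""
    baseline = mean(i.outcome for i in observed)
    return promote(production, verify(distill(observed), baseline))


observed = [
    Interaction("lead with empathy", 5.0),
    Interaction("lead with empathy", 4.0),
    Interaction("quote policy first", 2.0),
    Interaction("quote policy first", 1.0),
]
production = run_cycle(observed, set())
```

Each function takes only what the previous stage produced, so the stages can be reasoned about one at a time. The next batch of observations would be collected with `production` in effect, which is where the compounding comes from.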
The Verify stage is what stops Distill’s ideas from becoming production noise. Every candidate hint is treated as a hypothesis and validated through controlled A/B experiments before it earns a promotion.
Bayesian multi-armed bandit allocation dynamically routes traffic to maximize learning while minimizing exposure to underperformers.
KPI attribution waits for outcomes to fully mature before making promote/reject decisions — no premature conclusions.
Winners are automatically promoted to production. Losers are retired. No human review bottleneck.
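A minimal sketch of Thompson Sampling over two hints, assuming a binary outcome (success or failure) and a Beta(1, 1) prior per hint. The class name, the arm labels, the simulated success rates, and the 0.95 promotion threshold are illustrative assumptions; the reward model and decision rule Honey Nudger actually uses are not specified here.

```python
import random


class BetaArm:
    """Beta posterior over one hint's success rate, starting from Beta(1, 1)."""

    def __init__(self, name):
        self.name = name
        self.successes = 1
        self.failures = 1

    def sample(self, rng):
        return rng.betavariate(self.successes, self.failures)

    def update(self, success):
        if success:
            self.successes += 1
        else:
            self.failures += 1


def choose(arms, rng):
    """Thompson Sampling: draw once from each posterior and serve the largest draw."""
    return max(arms, key=lambda arm: arm.sample(rng))


def prob_beats(candidate, champion, rng, draws=5000):
    """Monte Carlo estimate of P(candidate's rate > champion's rate)."""
    wins = sum(candidate.sample(rng) > champion.sample(rng) for _ in range(draws))
    return wins / draws


rng = random.Random(0)
true_rates = {"champion": 0.30, "candidate": 0.45}  # simulated, hypothetical rates
arms = {name: BetaArm(name) for name in true_rates}
served = {name: 0 for name in true_rates}

for _ in range(3000):
    arm = choose(list(arms.values()), rng)
    served[arm.name] += 1
    # Simulate the matured outcome for this interaction.
    arm.update(rng.random() < true_rates[arm.name])

p = prob_beats(arms["candidate"], arms["champion"], rng)
decision = "promote" if p > 0.95 else "keep testing"  # threshold is an assumption
```

Because each request is routed by a fresh posterior draw, traffic drifts toward the stronger arm as evidence accumulates while the weaker arm still gets occasional exposure. In a live setting the `update` call would run only once the outcome has matured, rather than immediately as in this simulation.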
Send your LLM payloads, get back Optimization Hints, report business outcomes. The system handles everything else.
Send your agent’s system prompt, messages, and a session identifier. Honey Nudger returns contextual Optimization Hints tailored to this specific interaction.
POST /v1/nudge
{
  "session_id": "user-session-abc",
  "system": "You are a helpful customer service agent...",
  "messages": [
    { "role": "user", "content": "I want to return my order" }
  ]
}

{
  "nudge_id": "ndg_7f3a2b",
  "hints": [
    "Lead with empathy — acknowledge frustration before discussing process",
    "Mention the 30-day hassle-free guarantee early in the conversation"
  ]
}

Place Hints in Your Compiled Prompt
# Append the returned hints to the original system prompt under their own heading
compiled_prompt = original_system + """
## Optimization Hints
"""
for hint in nudge_response["hints"]:
    compiled_prompt += f"- {hint}\n"

# Send to your LLM as normal
response = llm.chat(
    system=compiled_prompt,
    messages=messages
)

When a KPI event occurs (a purchase, satisfaction score, click-through, or any other metric), report it with the nudge ID. The system automatically attributes the outcome and closes the learning loop.
POST /v1/honey/ndg_7f3a2b
{
  "metric": "customer_satisfaction",
  "value": 5
}

{
  "status": "recorded",
  "total_reward": 5.0
}

That’s it. From here, the four-stage loop takes over: Distill mines the wins, Verify A/B tests new hypotheses against current performance, and Promote graduates statistically proven winners. The Observe stage then sees the world the new champions create, and the cycle compounds.
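The two calls and the prompt-compilation step can be wrapped in a few helpers. This is a sketch under assumptions rather than an official client: `BASE_URL` is a placeholder host, the helper names are made up for illustration, and authentication is left out because it is not described here. The helpers only build the requests, so nothing is sent until you pass one to `urllib.request.urlopen`.

```python
import json
import urllib.request

BASE_URL = "https://honey-nudger.example"  # hypothetical host, replace with your own


def _post(url, payload):
    """Build (but do not send) a JSON POST request."""
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def nudge_request(session_id, system, messages):
    """Request Optimization Hints for one interaction."""
    payload = {"session_id": session_id, "system": system, "messages": messages}
    return _post(f"{BASE_URL}/v1/nudge", payload)


def compile_prompt(system, hints):
    """Append the hints to the system prompt under an '## Optimization Hints' heading."""
    lines = [system, "## Optimization Hints", *[f"- {hint}" for hint in hints]]
    return "\n".join(lines) + "\n"


def honey_request(nudge_id, metric, value):
    """Report a business outcome against the nudge that influenced it."""
    return _post(f"{BASE_URL}/v1/honey/{nudge_id}", {"metric": metric, "value": value})


# Walk through the flow with a response shaped like the /v1/nudge example.
nudge_response = {"nudge_id": "ndg_7f3a2b", "hints": ["Lead with empathy"]}
prompt = compile_prompt("You are a helpful customer service agent.", nudge_response["hints"])
outcome = honey_request(nudge_response["nudge_id"], "customer_satisfaction", 5)
```

Keeping the `nudge_id` alongside the session is the one piece of state the caller has to carry, since the outcome report is addressed by it.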
Every version of the system under test is scored on the COMB benchmark and published in the live ledger.