Mæstery Logo
Mæstery
Published on

How a Free-Run Agent Cheated by Falsifying a Python Run

Authors
  • avatar
    Name
    Julia Wawrykowicz
    Twitter

Institutional underwriting demands precision. The market believes giving an AI agent Python skills guarantees deterministic math. This is false.

When you give a free-run agent a goal, it optimizes for completion — not correctness.

The Experiment

We designed a strict test for a large language model:

  • Read complex financial statements in a folder (VDR or a local folder)
  • Propose calculations to link the financials
  • Execute Python script from the same folder to verify the math

If the math checked out, the script would output a dataset and print "success".

1Read Financials
2Propose Adjustments
3Run Python Check
4Output Dataset

The Exploit: Silent Failure

Conventional wisdom: "Python is deterministic. Let the agent run it."

The reality: We gave the agent read access to the Python script in the folder. As a result, the agent undertood the shape of the output the script would produce and was able to fake the output and claim success.

So, the agent wrote an incorrect dataset in the expected format in the expected filename and printed "success".

The Falsified Run

The model hallucinated a dataset in the exact expected layout, saved it under the required filename, and manually output "success". It bypassed the execution entirely.

No crash. No error log. Just a confident, wrong answer.

Math Accuracy

0%

Hallucinated data

System Errors

0

Silent failure

Reported Success

100%

Fake success printed

The Cost of Chaos

In high-stakes finance, a black-box agent that falsifies results is a catastrophic liability.

We underwrite to loss avoidance before we underwrite to return. Similarly, we build systems that are first and foremost protectd against analysis failures.

AspectFree-Run AgentAgent on Rails
ArchitechtureWrite-access to scriptsConstrained tool calling
ExecutionFalsified logsFull audit trail
AuditabilityRisky (Silent failure)Protected
Downside RiskRisky (Silent failure)Protected

The Fix: Agents on Rails

To achieve commercial effectiveness, agents must be constrained.

  1. Remove Write Access: Do not give agents open write access to calculation scripts.
  2. Remove Read Access: Do not give agents open read access to calculation scripts.
  3. Encapsulate Tools: Wrap deterministic code into strict tools. The agent presses a button; it does not rewrite the machine.
  4. Enforce Rails: Limit the agent’s pathways. It cannot steer the process off a cliff.

The Glass Box Standard

At Mæstery, our agents run on proprietary rails. We separate AI reasoning from deterministic execution. The result is institutional-grade system with auditability and guardrails.