In the Age of Uncertainty

07/03/24

Prompt engineering became more useful for me when I stopped treating it like a collection of clever phrases and started treating it like system design. A good prompt is not just persuasive text. It is an interface between intent, constraints, examples, and an unreliable but very capable model.

The shift was simple: stop asking whether a prompt “sounds smart” and start asking whether it survives real use. Can another person use it? Does it handle messy input? Does it fail clearly when context is missing? Does it produce an output shape that downstream work can trust?
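
One way to make "an output shape downstream work can trust" concrete is a small validator that rejects malformed responses before they travel further. This is a minimal sketch, not from the original post: the key names in `REQUIRED_KEYS` are hypothetical, and the JSON shape is just one possible contract.

```python
import json

# Hypothetical contract for a model response; a real workflow
# would define whatever keys its downstream steps depend on.
REQUIRED_KEYS = {"summary", "confidence", "missing_context"}

def validate_output(raw: str) -> dict:
    """Parse a model response and check it matches the expected shape.

    Fails loudly with a clear message instead of passing bad output
    downstream, which is what "fails clearly" looks like in practice.
    """
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"output is not valid JSON: {exc}") from exc
    missing = REQUIRED_KEYS - data.keys()
    if missing:
        raise ValueError(f"output is missing keys: {sorted(missing)}")
    return data
```

A check like this is cheap, and it turns "does downstream work trust the output?" into a yes/no question you can answer on every run.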

Start with the job to be done

Most weak prompts are written from the perspective of the person authoring them. Battle-hardened prompts are written from the perspective of the task. What is the actual decision being made? What information is always available? What ambiguities keep showing up? What does “good” look like in a way a model can actually follow?

If the job to be done is fuzzy, the prompt will be fuzzy. The model is not the first problem.

Design for failure before polish

The fastest way I have found to improve a prompt is to collect its worst outputs. Not the best ones. Not the demo case. The misses.

When a prompt fails, the failure usually fits one of a few patterns:

  • it invents missing context
  • it answers too broadly
  • it ignores the expected output format
  • it optimizes for sounding complete instead of being useful
  • it collapses edge cases into generic advice

Once you know the failure mode, the prompt gets easier to tighten.
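
Collecting the misses works better when each one is tagged as it comes in. A minimal sketch of that habit, with the failure-mode names invented here to mirror the patterns above (they are illustrative, not a standard taxonomy):

```python
from collections import Counter

# Hypothetical tags mirroring the failure patterns listed above.
FAILURE_MODES = (
    "invented_context",
    "too_broad",
    "wrong_format",
    "sounds_complete",
    "generic_advice",
)

failure_log: list[tuple[str, str]] = []  # (output_snippet, mode)

def record_failure(output_snippet: str, mode: str) -> None:
    """Log a bad output under one of the known failure patterns."""
    if mode not in FAILURE_MODES:
        raise ValueError(f"unknown failure mode: {mode}")
    failure_log.append((output_snippet, mode))

def top_failure_modes() -> list[tuple[str, int]]:
    """Tally which pattern dominates, to guide the next prompt edit."""
    return Counter(mode for _, mode in failure_log).most_common()
```

After a dozen entries, the tally usually points at one dominant pattern, and that is the part of the prompt worth tightening first.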

Constrain the shape, not just the tone

The most reliable improvements usually come from structure:

  • define the role narrowly
  • pass only the context that matters
  • show one or two representative examples
  • specify the output format explicitly
  • state what the model should do when required data is missing

Tone can help. Structure does more.
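
The five structural moves above can be assembled mechanically. A sketch, assuming a hypothetical support-ticket triage task (the role, keys, and wording are placeholders, not a recommended template):

```python
def build_prompt(context: str, examples: list[tuple[str, str]]) -> str:
    """Assemble a prompt from the structural pieces: narrow role,
    relevant context, representative examples, explicit output format,
    and a stated behavior for missing data."""
    example_block = "\n\n".join(
        f"Input: {inp}\nOutput: {out}" for inp, out in examples
    )
    return (
        # define the role narrowly
        "You are a support-ticket triager. Classify one ticket at a time.\n\n"
        # pass only the context that matters
        f"Context:\n{context}\n\n"
        # show one or two representative examples
        f"Examples:\n{example_block}\n\n"
        # specify the output format explicitly
        'Respond with JSON: {"category": "...", "urgency": "low|medium|high"}.\n'
        # state what to do when required data is missing
        'If the ticket lacks enough detail to classify, respond with\n'
        '{"category": "unknown", "urgency": "unknown"} instead of guessing.\n'
    )
```

Building the prompt from named pieces also makes each piece reviewable on its own, which matters once teammates start editing it.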

Test prompts like product surfaces

If a prompt is going to be used in a workflow, I test it the same way I would test a feature:

  • normal cases
  • incomplete cases
  • adversarial or noisy cases
  • cases from another teammate instead of my own assumptions

The goal is not perfection. The goal is a prompt that behaves predictably enough to become part of a real system.
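
A minimal harness for that kind of testing might run labeled cases through the workflow and report which ones break the output contract. Everything here is a sketch: `fake_model` is a deterministic stand-in for a real model call, and the JSON key is hypothetical.

```python
import json

def check_cases(model, cases):
    """Run a model callable over named cases (normal, incomplete, noisy)
    and report which inputs produced unparseable or mis-shaped output."""
    failures = []
    for name, ticket in cases:
        raw = model(ticket)
        try:
            data = json.loads(raw)
        except json.JSONDecodeError:
            failures.append((name, "not JSON"))
            continue
        if "category" not in data:
            failures.append((name, "missing 'category'"))
    return failures

# Stand-in for a real model call, so the harness itself can be tested.
def fake_model(ticket: str) -> str:
    if not ticket.strip():
        return '{"category": "unknown"}'
    return '{"category": "billing"}'
```

The cases list is where teammates contribute: their inputs, not the author's, are what make the suite honest.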

What battle-hardened means

For me, a battle-hardened prompt is one that can leave the lab. Another person can use it. The output is easy to inspect. The failure mode is understandable. And the prompt improves because it was used, not because it was endlessly admired.