LLM integration without the cargo cult
The current wave of LLM integration in production systems has produced a recognisable pattern: teams adopt the technology enthusiastically, copy patterns from blog posts and documentation examples, and end up with integrations that are fragile, expensive to run, and difficult to reason about. This is cargo cult engineering applied to a new domain.
What cargo cult integration looks like
The symptoms are consistent across organisations and stacks:
- Prompts that are hundreds of lines long, encoding business logic that belongs in code
- LLM calls in hot paths where latency and cost accumulate invisibly
- No fallback when the model returns something unexpected
- Output parsed with assumptions that break on minor model version changes
- Evaluation done by eyeballing a handful of examples rather than systematic testing
None of this is a criticism of the teams involved. The tooling encourages it, the documentation examples demonstrate it, and the time pressure to ship something working is real.
Treating the LLM as a component
The discipline that fixes most of these problems is treating the LLM as a component with a defined interface, not as a magic box you talk to. This means:
Define the contract. What goes in, what comes out, and what constitutes a valid response. If you cannot write this down, you do not yet understand the integration well enough to build it.
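One way to write the contract down is as types plus an explicit validity predicate. The sketch below assumes a hypothetical ticket-triage integration; the names, fields, and category set are illustrative, not from the text above.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TriageRequest:
    """What goes in: the raw ticket text."""
    ticket_text: str

@dataclass(frozen=True)
class TriageResponse:
    """What comes out: a category from a closed set, plus confidence."""
    category: str
    confidence: float

VALID_CATEGORIES = {"billing", "bug", "feature_request", "other"}

def is_valid(resp: TriageResponse) -> bool:
    """What constitutes a valid response, written down as code."""
    return resp.category in VALID_CATEGORIES and 0.0 <= resp.confidence <= 1.0
```

If you cannot fill in the types and the predicate, that is the signal that the integration is not yet understood well enough to build.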
Validate outputs. Parse and validate every response before using it. A response that does not conform to the expected schema should be treated as an error and handled through your error path, not patched around inline as a special case.
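Concretely, validation means a parser that either returns a conforming value or raises. A minimal sketch, reusing the hypothetical triage schema (`category`, `confidence` in a JSON object); a real system might use a schema library instead of hand-rolled checks.

```python
import json

class ModelResponseError(ValueError):
    """Raised when the model's output fails validation."""

def parse_triage(raw: str) -> dict:
    # Expected (hypothetical) schema: {"category": str, "confidence": float}
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ModelResponseError(f"not JSON: {raw!r}") from exc
    if not isinstance(data, dict):
        raise ModelResponseError("expected a JSON object")
    category = data.get("category")
    confidence = data.get("confidence")
    if category not in {"billing", "bug", "feature_request", "other"}:
        raise ModelResponseError(f"unknown category: {category!r}")
    if not isinstance(confidence, (int, float)) or not 0.0 <= confidence <= 1.0:
        raise ModelResponseError(f"bad confidence: {confidence!r}")
    return {"category": category, "confidence": float(confidence)}
```

Everything downstream of `parse_triage` can then assume a conforming value; everything that raises goes through the same error path as any other failed dependency.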
Separate prompt from logic. Prompts are configuration. Business logic is code. Mixing them produces something that is neither maintainable as configuration nor testable as code.
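The separation can be as simple as keeping the prompt as a template that code only renders. A sketch under the same hypothetical triage example; in practice the template would live in a file or config store rather than inline.

```python
from string import Template

# Prompt as configuration: no business logic, just a renderable template.
TRIAGE_PROMPT = Template(
    "Classify the following support ticket into exactly one of: "
    "billing, bug, feature_request, other.\n"
    'Respond with JSON: {"category": ..., "confidence": ...}\n\n'
    "Ticket:\n$ticket_text"
)

def build_prompt(ticket_text: str) -> str:
    # Code only renders the configuration; what to *do* with the
    # classification lives elsewhere, where it can be unit tested.
    return TRIAGE_PROMPT.substitute(ticket_text=ticket_text)
```

Swapping the template then requires no code change, and the logic can be tested without ever rendering a prompt.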
On cost and latency
LLM calls are not free and they are not fast. Both cost and latency scale with token volume, which makes prompt length a first-class engineering concern, not a detail to be optimised later.
Cache aggressively where the input is stable. Batch where latency permits. Move LLM calls out of synchronous request paths where possible. These are not optimisations for scale — they are correct engineering practice from the first integration.
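For stable inputs, aggressive caching can be as little as a memoised wrapper around the call. A minimal sketch; `call_model` is a stand-in for a real client, and the approach is only correct when identical prompts may legitimately return a reused response.

```python
import hashlib
from functools import lru_cache

def call_model(prompt: str) -> str:
    # Stand-in for a real LLM call; deterministic here so the
    # sketch is self-contained.
    return "response-" + hashlib.sha256(prompt.encode()).hexdigest()[:8]

@lru_cache(maxsize=4096)
def cached_call(prompt: str) -> str:
    # Identical prompts hit the in-process cache instead of the model,
    # saving both the tokens and the round trip.
    return call_model(prompt)
```

A shared cache (keyed on a hash of the prompt) extends the same idea across processes; the principle is unchanged.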
Summary
LLM integration does not require new engineering disciplines. It requires applying existing ones: defined interfaces, validated outputs, separated concerns, and measured costs. The cargo cult emerges when teams treat the novelty of the technology as a reason to suspend normal engineering judgement. It is not.