
Large language model

A large language model (LLM) is a type of AI model trained on massive amounts of text data to understand and generate human language. These models learn the statistical relationships between words and phrases at a scale that allows them to answer questions, summarize content, draft responses, and carry out complex reasoning tasks across a wide range of topics.

LLMs underpin most modern AI customer service systems. They power the ability of AI agents to understand nuanced customer requests, generate coherent and contextually appropriate replies, and handle multi-step interactions without requiring rigid, pre-scripted flows.

How large language models work

LLMs are neural networks built on the attention mechanism, a technique that allows the model to weigh the relevance of different words in a passage when generating or interpreting text. Training involves exposing the model to enormous corpora of text and repeatedly adjusting the model's internal parameters to improve its ability to predict what comes next in a sequence.
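The training objective described above, predicting what comes next in a sequence, can be illustrated with a toy sketch. This is not a real LLM (there is no neural network or attention here), just a bigram counter that learns next-word statistics from a tiny made-up corpus, to make the "predict the next token" idea concrete:

```python
from collections import Counter, defaultdict

# Toy corpus; a real LLM trains on trillions of tokens.
corpus = "the cat sat on the mat the cat ate the fish".split()

# Count how often each word follows each other word.
following = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev][nxt] += 1

def predict_next(word):
    """Return the word most often seen following `word` in training."""
    return following[word].most_common(1)[0][0]

print(predict_next("the"))  # "cat" follows "the" most often in this corpus
```

An LLM does the same kind of next-token prediction, but with billions of learned parameters and attention over long contexts rather than a lookup table of word pairs.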

The scale of an LLM is measured partly by the number of parameters, which are the adjustable values that encode learned patterns. Larger parameter counts generally produce more capable models, though training and running them requires proportionally more compute. Once trained, an LLM can be adapted for specific tasks through fine-tuning or guided at runtime using prompt engineering.

Two practical constraints shape how LLMs are used in production:

  • Context window: The maximum amount of text the model can consider at one time, measured in AI tokens. Longer context windows allow the model to handle extended conversations and large reference documents.
  • Inference time: The time required to generate a response after receiving an input. Lower inference time is critical for real-time support interactions where customers expect near-instant replies.
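The context-window constraint is typically handled by trimming conversation history to fit a token budget. A minimal sketch of that idea, assuming a crude whitespace tokenizer (production systems use the model's own tokenizer and smarter strategies such as summarizing older turns):

```python
def count_tokens(text):
    # Crude stand-in for a real tokenizer: one word = one token.
    return len(text.split())

def trim_to_window(messages, budget=12):
    """Keep the most recent messages that fit within the token budget."""
    kept, used = [], 0
    for msg in reversed(messages):      # walk newest-first
        cost = count_tokens(msg)
        if used + cost > budget:
            break
        kept.append(msg)
        used += cost
    return list(reversed(kept))         # restore chronological order

history = ["hello there", "I need help with my order",
           "sure what is the order number", "it is 12345"]
print(trim_to_window(history))  # only the newest turns that fit the budget
```

With a 12-token budget, the two oldest turns are dropped and the model sees only the most recent exchange.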

Why large language models matter for customer experience

Before LLMs, AI-driven support systems depended on rigid decision trees or narrow intent classifiers that broke down when customers phrased requests unexpectedly. LLMs handle variation in phrasing, follow complex instructions, and maintain coherent context across a multi-turn conversation, which makes them far better suited to realistic support interactions.

LLMs also reduce the cost of building support automation. Rather than manually scripting every possible interaction path, teams can configure an LLM-powered agent with policies, knowledge base content, and guidelines. The model handles the language variation, and the team focuses on defining what outcomes are acceptable. This flexibility accelerates deployment and reduces maintenance overhead compared to traditional chatbot approaches.

Deploying LLMs responsibly in support

LLMs can produce confident-sounding but incorrect outputs, a problem known as AI hallucinations. In customer-facing support, a hallucinated policy claim or incorrect product detail creates real problems. Teams mitigate this through retrieval-augmented generation (RAG), where the model is grounded in verified knowledge base content before generating a response, and through guardrails that restrict what topics the model can address.
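The RAG pattern can be sketched in a few lines. The retrieval step here is naive keyword overlap over a hypothetical three-entry knowledge base; production systems typically use vector embeddings and a real LLM call, both omitted here:

```python
# Hypothetical knowledge base snippets for illustration.
knowledge_base = [
    "Refunds are issued within 5 business days of approval.",
    "Orders ship from our warehouse within 24 hours.",
    "Premium support is available on the Enterprise plan.",
]

def retrieve(query, docs, top_k=1):
    """Rank snippets by how many words they share with the query."""
    q = set(query.lower().split())
    scored = sorted(docs,
                    key=lambda d: len(q & set(d.lower().split())),
                    reverse=True)
    return scored[:top_k]

def build_grounded_prompt(query):
    """Assemble the grounded prompt an LLM would receive (the model call itself is omitted)."""
    context = "\n".join(retrieve(query, knowledge_base))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

print(build_grounded_prompt("How fast do refunds arrive?"))
```

Because the prompt instructs the model to answer only from retrieved, verified content, the model is far less likely to invent a policy that does not exist.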

Ongoing evaluation is equally important. Model behavior can shift across versions, and production traffic reveals edge cases that testing misses. Decagon's guide to AI customer service agent capabilities covers how teams can apply LLMs reliably in support environments. AWS's explanation of large language models offers a solid technical foundation for teams evaluating deployment options.

For further reading, explore Decagon's guide to agentic AI for customer experience and Decagon's report on AI and the next generation of customer experience.

