Telephony
Telephony is the technology that carries voice communication over a distance. The word originally described the analog telephone network — copper wires, switches, and dial tones — but today it's an umbrella term covering everything from legacy landlines to internet-based voice systems and the AI voice agents built on top of them. Telephony is the foundation layer for every voice-driven customer experience.
For modern businesses, telephony almost always means cloud-based, software-defined voice. The legacy public switched telephone network (PSTN) still exists, but new deployments overwhelmingly run on internet protocols.
How modern telephony works
A voice call breaks down into three layers: signaling (setting up, modifying, and tearing down the call), media (the actual voice data flowing between endpoints), and routing (deciding where the call should go). On legacy networks, dedicated circuit-switched hardware handled all three. On modern IP networks, signaling and media are decoupled — signaling typically runs over the SIP protocol, while media flows as RTP packets over the internet.
This architectural shift is what makes everything from call recording to real-time transcription to AI voice agents possible. Once voice is software, anything you can do with audio data — transcribe it, analyze it, generate it — becomes available in the call path.
Telephony vs. VoIP vs. SIP
These three terms are often conflated. Telephony is the broad category — any technology that carries voice communication. VoIP (Voice over IP) is the family of technologies that delivers voice over internet protocols rather than the traditional phone network. SIP (Session Initiation Protocol) is the specific signaling protocol that most VoIP systems use to set up and manage calls. Said simply: telephony is the goal, VoIP is the modern way to do it, and SIP is the protocol most VoIP systems speak.
Telephony components in a modern stack
A production voice stack typically includes:
- Carriers and number providers: Assign phone numbers and route calls to and from the PSTN.
- Session border controllers (SBCs): Sit at the edge of the network, handling security, codec translation, and protocol normalization.
- PBX or cloud telephony platform: The brains — call routing, voicemail, IVR, call recording.
- Endpoints: Desk phones, softphones, browser-based clients, or AI voice agents.
- Media servers: Handle the audio stream — recording, mixing, transcription, text-to-speech.
Telephony in the contact center
For a contact center, telephony is the foundation everything else sits on. The automatic call distributor uses telephony signaling to queue and route calls. IVR and AI voice agents intercept calls at the telephony layer. Recording, real-time monitoring, and conversational analytics all tap into the media stream. Modern CCaaS platforms bundle the entire telephony layer so contact-center operators never have to think about SBCs or codecs — they just plug in numbers and start routing.
How AI voice agents use telephony
An AI voice agent participates in the same telephony fabric as a human agent. The call arrives at a SIP endpoint; the audio stream is fed into a speech-to-text model in real time; an LLM-driven conversational AI stack processes the transcript and generates a response; a text-to-speech model produces the audio reply; and the audio is sent back through the media stream. The user experiences a natural voice conversation. Everything that makes this feel real — sub-second latency, natural prompt design, barge-in support — depends on tight integration with the underlying telephony layer.
Trends in modern telephony
The PSTN is being progressively retired in many countries (the U.K. is shutting it down in 2027), and global business telephony is consolidating onto cloud platforms. As the FCC notes, VoIP and IP-based voice are now the default for both enterprises and consumers. The next wave is AI-native telephony — voice stacks that assume an AI agent participates in every call from the start rather than being bolted on.
Frequently asked questions
What is telephony? Telephony is the technology that transmits voice communication over a distance — historically over copper phone lines, today predominantly over internet protocols.
What is the difference between telephony and VoIP? Telephony is the general category for voice communication. VoIP is the specific modern technology that delivers it over internet protocols instead of the traditional phone network.
What is SIP in telephony? SIP — Session Initiation Protocol — is the signaling protocol most modern VoIP systems use to set up, modify, and tear down voice calls. It handles call control; the audio itself flows over a separate media protocol.
What is cloud telephony? Cloud telephony delivers voice services from a hosted, software-defined platform rather than from on-premise hardware. Calls are routed and processed in the provider's cloud and reach users over the internet.
How does AI fit into telephony? AI voice agents now participate in voice calls in real time — transcribing the caller, generating a response, and synthesizing voice — all through the same telephony fabric a human agent uses.
For a deeper dive, download Decagon's guide to agentic AI for customer experience.

