Basics of Troubleshooting VoIP (Part 1)
One of the difficulties we face when we put voice on IP-based networks is that different groups of administrators control different parts of the equipment string. Most notably, voice administrators and firewall administrators both have a large part to play in ensuring that voice works properly. Protocol interaction and firewall settings are not the first steps in troubleshooting voice, but in my experience they are where we struggle the most. If you are a voice administrator, you need to understand the protocols that make your voice work before you can start working with your firewall administrator.
When I troubleshoot voice, I think of it in two distinct phases: phase 1 is signaling and phase 2 is the actual voice. I normally explain it as being very similar to a PRI T-1. Phase 1 is the D channel. This is where the digits are passed from your CallManager over to the CallManager on the distant end. This is TCP-based communication, meaning that if you are calling from a phone registered to your CallManager to a phone registered to another CallManager, the first packet should have a source IP address of your CallManager, a destination IP address of the distant end CallManager, and it will be a TCP SYN message. The next packet is the response to the first. It will have a source IP address of the distant end CallManager, a destination IP address of your CallManager, and it will be a TCP SYN/ACK.
One of the most common issues I have seen is an incorrectly configured firewall somewhere in the equipment string, whether it is on your end, on the distant end, or between the two of you. When you run Wireshark, or a packet capture on the firewall, you see the SYN messages going out but never see a SYN/ACK in return. Because you never know who will initiate the call, you need to make sure that all of the firewalls allow both internal -> external (your CallManager -> distant end CallManager) and external -> internal (their CallManager -> your CallManager).
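The SYN without SYN/ACK symptom can be checked mechanically once a capture has been exported. The following is only a minimal sketch: the Packet record and the unanswered_syns function are hypothetical names made up for this illustration, and it assumes you have already reduced each captured packet to its addresses, ports, and TCP flags (it does not read capture files itself).

```python
from typing import NamedTuple


class Packet(NamedTuple):
    """Hypothetical record: one captured TCP packet, reduced to what we need."""
    src: str
    dst: str
    sport: int
    dport: int
    flags: str  # for example "SYN", "SYN/ACK", "ACK"


def unanswered_syns(packets):
    """Return every SYN for which no matching SYN/ACK appears in the capture."""
    # A SYN/ACK answers a SYN when its addresses and ports are mirrored,
    # so store each SYN/ACK keyed the way the original SYN would look.
    answered = {
        (p.dst, p.dport, p.src, p.sport)
        for p in packets
        if p.flags == "SYN/ACK"
    }
    return [
        p for p in packets
        if p.flags == "SYN" and (p.src, p.sport, p.dst, p.dport) not in answered
    ]


# Illustrative values only: the first call attempt completes the handshake,
# the second (initiated by the distant end) never gets a SYN/ACK back.
capture = [
    Packet("192.168.1.10", "192.168.2.10", 40000, 5060, "SYN"),
    Packet("192.168.2.10", "192.168.1.10", 5060, 40000, "SYN/ACK"),
    Packet("192.168.2.10", "192.168.1.10", 41000, 5060, "SYN"),
]
for syn in unanswered_syns(capture):
    print(f"no SYN/ACK seen for {syn.src}:{syn.sport} -> {syn.dst}:{syn.dport}")
```

Any SYN this reports is a good starting point for a conversation with the firewall administrator, since it tells you which direction is being dropped.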
If you are using an ICT (Intercluster Trunk), it should look like this:
What you can see from the capture above is that ICTs use ephemeral ports on both ends. These are random high-numbered ports; a quick Google search gave me a range of 32,768 to 61,000 for CUCM. You will need to make sure your firewall administrators know to open this range of ports if you are using an ICT.
If you are using a SIP trunk, it will look like this:
To generate the capture above, I placed a call from a phone registered to the 192.168.1.10 CallManager to a phone registered to the 192.168.2.10 CallManager. The CallManager that initiates the call will use an ephemeral source port and the SIP destination port (5060). Because not all of your calls will be outbound (you initiating the call), you will need to make sure this communication is allowed both ways.
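The "allowed both ways" requirement is easy to get half right, so it can help to write it down as a check. This is a sketch under simplifying assumptions: the Rule record is a hypothetical, stripped-down permit rule (source host, destination host, destination port), not the syntax of any real firewall, and missing_sip_rules is a name invented for this example.

```python
from typing import NamedTuple

SIP_PORT = 5060  # SIP destination port named in the text


class Rule(NamedTuple):
    """Hypothetical, simplified permit rule for TCP traffic."""
    src: str
    dst: str
    dport: int


def missing_sip_rules(rules, local_cm, remote_cm):
    """Return the SIP permit rules that are needed but not present."""
    needed = [
        Rule(local_cm, remote_cm, SIP_PORT),  # calls you initiate
        Rule(remote_cm, local_cm, SIP_PORT),  # calls the distant end initiates
    ]
    present = set(rules)
    return [rule for rule in needed if rule not in present]


# A policy that only thought about outbound calls.
policy = [Rule("192.168.1.10", "192.168.2.10", SIP_PORT)]
for rule in missing_sip_rules(policy, "192.168.1.10", "192.168.2.10"):
    print(f"missing permit: {rule.src} -> {rule.dst} tcp/{rule.dport}")
```

With the policy above, the check reports the inbound direction (192.168.2.10 -> 192.168.1.10 on 5060) as missing, which is exactly the case where your outbound calls work and inbound calls fail.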
Phase 2 is the bearer or “B” channel. There are two separate and distinct streams that make up this phase, one in each direction. Voice bearer traffic is UDP, specifically RTP carried over UDP. The main difference between UDP and TCP (for troubleshooting purposes) is that UDP traffic is unacknowledged. The significance of unacknowledged traffic is that it can work in one direction and not the other. Have you ever heard the term “one-way audio”? This normally happens when you have a firewall policy permitting your voice out but not letting the distant end’s voice come into your network (or vice versa). The RTP port range is 16,384 to 32,767. You need to allow this range into your voice subnet, and you need to allow your voice subnet to talk out to this range.
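When reading a capture or a firewall log, it can help to sort each flow into signaling or bearer by protocol and destination port. The sketch below uses only the port numbers quoted in this article (5060 for SIP, 32,768 to 61,000 for ephemeral ports, 16,384 to 32,767 for RTP); the classify function and its labels are hypothetical, and your deployment may use different ranges.

```python
SIP_PORT = 5060
RTP_PORTS = range(16384, 32768)        # 16,384 to 32,767 inclusive
EPHEMERAL_PORTS = range(32768, 61001)  # 32,768 to 61,000 inclusive


def classify(protocol, dport):
    """Label a flow by protocol and destination port, using the ranges above."""
    protocol = protocol.upper()
    if protocol == "UDP" and dport in RTP_PORTS:
        return "rtp"
    if protocol == "TCP" and dport == SIP_PORT:
        return "sip-signaling"
    if protocol == "TCP" and dport in EPHEMERAL_PORTS:
        return "ephemeral-signaling"
    return "other"


for proto, port in [("tcp", 5060), ("udp", 16384), ("udp", 32767), ("tcp", 40000)]:
    print(proto, port, classify(proto, port))
```

Anything that comes back as "other" is either not voice traffic or is using ports outside the ranges discussed here, and is worth a second look before you ask for a firewall change.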
If it is working properly, your RTP stream should look like this:
As described above, this second stream is separate and distinct from the signaling. If you look closely at the source and destination IP addresses, you will notice the RTP stream does not involve the CallManager (.80 and .90 are IP phones). You will also notice that the source and destination IP addresses flip from packet to packet, because both phones are sending voice at the same time. If you run a capture and notice the source is the same every time, that is normally a firewall issue.
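The "source is the same every time" test can also be automated. This is a minimal sketch, assuming the RTP packets have already been exported as (source, destination) address pairs; one_way_pairs is a hypothetical helper, and the two phone addresses are placeholders chosen for the example, not the addresses from the capture.

```python
from collections import Counter


def one_way_pairs(packets):
    """Return (source, destination) pairs that have no traffic coming back."""
    counts = Counter(packets)
    return sorted(
        (src, dst) for (src, dst) in counts if (dst, src) not in counts
    )


# Placeholder addresses for two IP phones (hypothetical values).
PHONE_A = "192.0.2.80"
PHONE_B = "192.0.2.90"

healthy = [(PHONE_A, PHONE_B), (PHONE_B, PHONE_A), (PHONE_A, PHONE_B)]
broken = [(PHONE_A, PHONE_B), (PHONE_A, PHONE_B), (PHONE_A, PHONE_B)]

print(one_way_pairs(healthy))  # no one-way pairs expected
print(one_way_pairs(broken))   # the A -> B direction has no return traffic
```

Keep in mind where the capture was taken: a pair reported here means the return stream never reached the capture point, so the blocking device sits somewhere between that point and the far phone.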