This post discusses the factors that affect the quality of Village Telco phone calls, in particular under marginal conditions like heavy mesh load and self-interference. This information was gathered from VOIP over Mesh papers (e.g. [1]) and discussion threads on the Village Telco Google Group in October and November 2009.
Each node in the Village Telco mesh will be a Mesh Potato (MP) or similar device such as a Nanostation 2 with similar processing power and Wifi performance.
In the typical Use Case a Mesh Potato (MP) is mounted on a short mast 1 m above the metal roof of a dwelling in a village or township. A secondary use case may see the MP mounted inside the dwelling. Other MPs will be located a few hundred metres away to form the mesh. The MPs will often have line of sight, but are unlikely to have a clear Fresnel zone. Signal strengths from adjacent nodes will be relatively high; however, there will be significant multipath.
In this environment:
- Multipath effects can change the received signal by as much as a change in transmitter power or receiver sensitivity would. For example, moving the antenna a few cm up or down on the mast may increase or decrease the signal strength by several dB due to signal reinforcement and cancellation in the multipath environment.
- Antenna directivity could be useful for removing the effects of directional interference. However, by definition a mesh network needs to be able to detect nodes in many directions, so at least in the centre of the mesh a roughly omnidirectional pattern is desirable (note that there is strong debate on this point on the mailing list).
- Surprisingly, the speech codec and channel bit rate have a fairly small effect on mesh capacity for VOIP traffic. This is because of large overheads in (i) the VOIP packet structure and (ii) the 802.11 MAC protocol. This is explained in the next section.
Packet and 802.11 MAC Overheads: Packet Rate is Key
VOIP packets are very small compared to other traffic on a Wifi network. Consider a 33 byte GSM codec packet compared to a packet of web traffic that may be up to 1500 bytes. To transmit this GSM codec packet using VOIP we must add an RTP header (12 bytes), a UDP header (8 bytes) and an IP header (20 bytes), giving a total IP packet size of 73 bytes. Sending one GSM codec payload packet every 20 ms (13.2 kbit/s) therefore requires an IP level bit stream of 29.2 kbit/s.
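The header arithmetic above can be checked with a few lines of Python. This is only a sketch of the calculation; the constant and function names are our own, and the sizes are the ones quoted in the text.

```python
# Sizes in bytes, taken from the figures quoted in the text.
GSM_PAYLOAD_BYTES = 33
RTP_HEADER_BYTES = 12
UDP_HEADER_BYTES = 8
IP_HEADER_BYTES = 20
PACKETS_PER_SECOND = 50  # one codec packet every 20 ms


def ip_packet_bytes(payload_bytes):
    """Size of the IP packet that carries one codec payload."""
    return payload_bytes + RTP_HEADER_BYTES + UDP_HEADER_BYTES + IP_HEADER_BYTES


def bit_rate_kbps(packet_bytes, packets_per_second=PACKETS_PER_SECOND):
    """Bit rate in kbit/s for a stream of fixed-size packets."""
    return packet_bytes * 8 * packets_per_second / 1000


codec_kbps = bit_rate_kbps(GSM_PAYLOAD_BYTES)
ip_kbps = bit_rate_kbps(ip_packet_bytes(GSM_PAYLOAD_BYTES))
# Fraction of every IP packet that is header rather than speech.
header_share = 1 - GSM_PAYLOAD_BYTES / ip_packet_bytes(GSM_PAYLOAD_BYTES)

print(f"IP packet size: {ip_packet_bytes(GSM_PAYLOAD_BYTES)} bytes")
print(f"Codec bit rate: {codec_kbps:.1f} kbit/s")
print(f"IP bit rate:    {ip_kbps:.1f} kbit/s")
print(f"Header share:   {header_share:.0%}")
```

More than half of each IP packet is header, before any 802.11 overhead is counted.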
There are further overheads due to the 802.11 MAC headers and the protocol used to reliably transmit data over the Wifi channel. These are optimised for larger packets (1500 bytes). The time required to send 73 bytes at 11 Mbit/s is just 53 µs; however, in practice around 800-1000 µs is required before the next packet can be sent. This time is consumed by MAC data, ACKs, physical layer synchronisation and various programmed delays and timers.
The Wifi channel capacity for VOIP is therefore dominated by 802.11 MAC layer overhead. This sets an upper limit on the number of packets/second we can send over the channel, and hence the number of VOIP calls that can be supported.
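To see how the per-packet time translates into a packet rate ceiling, here is a rough sketch using the 73 byte packet and the 800-1000 µs figure quoted above. The function names are ours, and the model simply treats the whole per-packet time as a fixed cost; it is an illustration, not a model of the 802.11 MAC.

```python
IP_PACKET_BYTES = 73
CHANNEL_MBPS = 11
PER_PACKET_TIME_US = (800, 1000)  # practical range quoted in the text


def payload_airtime_us(packet_bytes, channel_mbps):
    """Time to clock the packet bits onto the channel, in microseconds."""
    # 1 Mbit/s is one bit per microsecond, so bits / Mbit/s gives microseconds.
    return packet_bytes * 8 / channel_mbps


def max_packet_rate(per_packet_time_us):
    """Packets per second if every packet occupies the channel this long."""
    return 1_000_000 / per_packet_time_us


airtime = payload_airtime_us(IP_PACKET_BYTES, CHANNEL_MBPS)
print(f"Payload airtime: {airtime:.1f} us")
for total_us in PER_PACKET_TIME_US:
    share = airtime / total_us
    rate = max_packet_rate(total_us)
    print(f"{total_us} us per packet: {rate:.0f} packets/s, "
          f"{share:.1%} of the time spent on the IP packet")
```

With these numbers only a few percent of the channel time carries the IP packet itself, which is why the packet rate, not the bit rate, sets the limit.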
Packet rate (measured in packets/s) is the key factor for VOIP over Wifi capacity.
The channel bit rate has a relatively small effect as it only affects the small amount of time spent transmitting the payload IP packet. The speech codec bit rate has an even smaller effect as it only affects the size of a small part of the IP packet.
Using an x86 PC with built-in Wifi hardware Elektra has measured the maximum packet rate for 802.11b as 1350 packets/s and 802.11g as 3510 packets/s. This was measured using 100 byte packets with a small amount of ambient Wifi interference. Alex managed over 4000 packets/s on 802.11g. The use of x86 PCs ensures the tests were not CPU limited, i.e. the limits are likely to be the 802.11 MAC protocol.
The inefficiency of the VOIP over Wifi channel can be measured by comparing the throughput to the channel bit rate. The throughput at 11 Mbit/s (100 byte packets at 1350 packets/s) is 1.08 Mbit/s. The throughput at 54 Mbit/s (100 byte packets at 3510 packets/s) is 2.81 Mbit/s.
For comparison, with larger packets (1500 bytes) the Wifi protocols achieve throughputs of around half the channel bit rate, for example 5 Mbit/s over an 11 Mbit/s 802.11b link.
One way to improve throughput is to send multiple codec packets in every Wifi packet. In the literature this is known as source aggregation. We have been experimenting with 4 GSM codec packets per Wifi packet. This effectively increases the VOIP call capacity by a factor of 4, at the expense of increased delay and possibly reduced robustness under packet loss conditions.
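The throughput and efficiency figures follow directly from the measured packet rates. A short sketch of the calculation (names are ours; the measurements are the ones quoted above):

```python
PACKET_BYTES = 100

# (standard, channel bit rate in Mbit/s, measured packets/s)
MEASUREMENTS = [
    ("802.11b", 11, 1350),
    ("802.11g", 54, 3510),
]


def throughput_mbps(packet_bytes, packets_per_second):
    """Payload throughput in Mbit/s for fixed-size packets."""
    return packet_bytes * 8 * packets_per_second / 1_000_000


def efficiency(packet_bytes, packets_per_second, channel_mbps):
    """Throughput as a fraction of the channel bit rate."""
    return throughput_mbps(packet_bytes, packets_per_second) / channel_mbps


for name, channel_mbps, pps in MEASUREMENTS:
    tput = throughput_mbps(PACKET_BYTES, pps)
    eff = efficiency(PACKET_BYTES, pps, channel_mbps)
    print(f"{name}: {tput:.2f} Mbit/s of {channel_mbps} Mbit/s ({eff:.1%})")
```

Both cases use well under a tenth of the nominal channel bit rate.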
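The trade-off can be put into numbers with a small sketch. It assumes the aggregated codec packets share a single set of RTP, UDP and IP headers, which is our assumption about how the aggregation is packed rather than a description of the actual implementation. The function and key names are hypothetical.

```python
CODEC_PAYLOAD_BYTES = 33        # one GSM codec packet
HEADER_BYTES = 12 + 8 + 20      # RTP + UDP + IP, assumed shared by the aggregate
CODEC_FRAME_MS = 20             # speech covered by one codec packet
CODEC_PACKETS_PER_SECOND = 50   # per direction


def aggregate(frames_per_packet):
    """Packet size, rate and delay for a given aggregation factor."""
    return {
        "ip_packet_bytes": frames_per_packet * CODEC_PAYLOAD_BYTES + HEADER_BYTES,
        "packets_per_second": CODEC_PACKETS_PER_SECOND / frames_per_packet,
        # Speech lost if a single Wifi packet is dropped.
        "speech_per_packet_ms": frames_per_packet * CODEC_FRAME_MS,
        # The first codec packet must wait for the rest before sending.
        "added_delay_ms": (frames_per_packet - 1) * CODEC_FRAME_MS,
    }


for n in (1, 4):
    print(n, aggregate(n))
```

With 4 codec packets per Wifi packet the packet rate drops to a quarter, but each lost Wifi packet now removes 80 ms of speech and the sender adds at least 60 ms of buffering delay.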
Given the higher potential packet rates of 802.11g Elektra has suggested we configure the nodes to run 802.11g only – in 802.11bg compatibility mode the 802.11g packet rates are limited to 802.11b levels.
Self Interference and a Model for Mesh Capacity
In the use case above we assume good physical layer links. In this case the main interference source will be other nodes on our own mesh. The Wifi protocols arbitrate use of the channel between nodes, but clearly the capacity of the link must decrease if multiple nearby nodes all wish to transmit on the same channel.
The purpose of a mesh network is to relay data between distant nodes, e.g. node A sends data to C via B because A and C cannot reach each other directly. However, only one node can transmit at any one time. Sending a single packet between A and C requires two packet transmissions, consuming 2 packets from the channel capacity.
Therefore the number of VOIP calls the mesh can support depends on the number of hops between nodes. In practice calls will span different numbers of hops. However, we can model the number of calls as (packet capacity)/((packets/s per call) x (average number of hops)).
Here is an example:
For a full duplex call using the GSM codec we require 50 codec packets/s in each direction, or a total of 100 packets/s. We aggregate 4 codec packets in each Wifi packet, which reduces the mesh load to 100/4 = 25 packets/s for each call.
In this example we assume we have measured our mesh capacity as 1000 packets/s. Therefore the call capacity is 1000/25 = 40 calls for single hop links. If the average number of hops is 3, we have a call capacity of 1000/(25*3) = 13 calls.
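The model is simple enough to write as a couple of functions, which makes it easy to try other capacities and hop counts. This is a sketch of the model as described above; the function and parameter names are ours.

```python
import math


def packets_per_call(codec_packets_per_second=50, frames_per_wifi_packet=4,
                     directions=2):
    """Wifi packets/s that one full duplex call places on the mesh."""
    return codec_packets_per_second * directions / frames_per_wifi_packet


def call_capacity(mesh_capacity_pps, per_call_pps, average_hops):
    """Whole number of calls the mesh can carry under the simple model."""
    # Every hop retransmits the packet, so each call costs per_call_pps per hop.
    return math.floor(mesh_capacity_pps / (per_call_pps * average_hops))


per_call = packets_per_call()
print(f"Packets/s per call: {per_call:g}")
for hops in (1, 2, 3):
    print(f"{hops} hop(s): {call_capacity(1000, per_call, hops)} calls")
```

The average number of hops does not need to be an integer, so a measured average such as 2.4 hops can be used directly.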
CPU Load and Test Mesh Results
The VOIP over Wifi channel is characterised by many small packets, rather than a smaller number of large packets found in other forms of Internet traffic (e.g. fetching Web pages).
Just like the 802.11 protocols, low speed commodity router CPUs are designed to handle large packets efficiently. Many small packets create many interrupts and context switches inside the router CPU, which increases the CPU load. Data throughput at high packet rates tends to drop.
In addition to routing Mesh Wifi packets, each Mesh Potato must perform CPU-intensive operations such as echo cancellation, DTMF detection, and speech compression (the speech codec).
Some real world tests were performed using a two hop mesh. Two of the nodes were Nanostation 2s; the third node was a V1.1 Mesh Potato. All of the CPUs are Atheros AR231x variants, of approximately the same CPU performance (180 MHz). The nodes were all in the same room, giving good physical layer links, presumably 802.11g at a high bit rate. The mesh was loaded using iperf in UDP mode. This allowed a variable number of packets/s to be sent over the two hop mesh.
It was found that a throughput of around 900 packets/s (at < 1% packet loss, 200 byte packets) was possible over the two hop mesh with the Mesh Potato in the on-hook (inactive) state. Over a single hop around 1900 packets/s was possible with 1% packet loss. For comparison, the 802.11g x86 Wifi tests described above achieved around 3500 packets/s.
During the two-hop tests the CPU load of the Mesh Potato was measured as 85% (all in the kernel). This suggests that receiving, routing, and re-transmitting mesh Wifi packets consumes a considerable amount of CPU on the Mesh Potato.
These tests suggest that the VOIP capacity of the Village Telco using 802.11g is currently constrained by CPU. This could be improved by optimising the kernel code responsible for receiving, routing, and transmitting the Wifi packets, possibly via kernel compilation options.
When we try to send more packets than the mesh network can support we experience packet loss. To test the effect of packet loss we loaded the two hop mesh network at 900 packets/s, then placed a VOIP test call across the mesh. The quality of the call could be monitored at variable packet loss rates. At 2% packet loss some clicks and pops started to become noticeable; however, the call was still intelligible at 5-10% packet loss.
This is a fortunate result: call quality degrades gracefully in the presence of packet loss rather than all calls on the mesh falling over. Adding a few more calls to a heavily loaded mesh should not cause a drastic drop in quality.
Consider a typical Village Telco mesh containing 150 nodes. We estimate a 10% activity factor, so we shall design the networks such that 15 calls may be operating at any one time. As the number of nodes (150) is much greater than the capacity of the mesh (15 calls) some control mechanism may be necessary to prevent a large number of calls being made at the same time. This could be achieved via scripts running on the Village Telco’s Asterisk server.
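As an illustration of the kind of control mechanism meant here, the sketch below sizes the call limit from the activity factor and then refuses calls beyond that limit. It is a stand-alone toy: the class and function names are hypothetical, and it does not use any Asterisk interface. How such a check would actually be hooked into Asterisk is left open.

```python
def design_call_limit(nodes, activity_factor):
    """Number of simultaneous calls to design for."""
    return round(nodes * activity_factor)


class CallLimiter:
    """Tracks active calls and rejects new ones once the limit is reached."""

    def __init__(self, max_calls):
        self.max_calls = max_calls
        self.active = set()

    def try_start(self, call_id):
        """Return True if the call is admitted, False if the mesh is full."""
        if call_id in self.active:
            return True  # already admitted
        if len(self.active) >= self.max_calls:
            return False
        self.active.add(call_id)
        return True

    def end(self, call_id):
        """Release the slot held by a finished call."""
        self.active.discard(call_id)


limit = design_call_limit(nodes=150, activity_factor=0.10)
limiter = CallLimiter(limit)
# Simulate 20 nodes trying to call at once; only `limit` are admitted.
admitted = sum(limiter.try_start(f"call-{i}") for i in range(20))
print(f"Design limit: {limit} calls, admitted {admitted} of 20 attempts")
```

A rejected caller could be given a busy tone or asked to retry, so that the calls already in progress keep their quality.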
Conclusions
In the typical Village Telco Use Case we assume good physical layer links between mesh nodes. We have therefore not considered weak signal performance or performance in the presence of interfering signals from other services. After the initial deployments we should revisit these areas based on practical experience.
The throughput the mesh network can support is only about 5-10% of the channel bit rate (1.08 Mbit/s of 11 Mbit/s for 802.11b, 2.81 Mbit/s of 54 Mbit/s for 802.11g) due to the effects of short VOIP packets.
For VOIP over mesh the maximum packet rate that the mesh can support is key. This is constrained by the 802.11 MAC algorithms to be around 3500 packets/s for 802.11g and around 1350 packets/s for 802.11b.
In our Use Case self-interference from other nodes on our mesh limits call capacity. A model for estimating the number of simultaneous calls possible on a mesh network is given above.
In practice the low-end routers we are using are CPU-limited, which further limits the packet rate to around 1900 packets/s for a single hop link (900 packets/s for a two hop link).
VOIP signals are reasonably robust to packet loss when under high load, but it is advisable to limit the total number of active calls to avoid overload of the mesh.
In the next post we will discuss various ways to improve Village Telco performance.
References
[1] There are many papers discussing VOIP over mesh network performance. A good starting point is the list of publications posted as a comment by Saritha Kalyanam on my blog. The list includes the following papers:
Performance Optimizations for Deploying VoIP Services in Mesh Networks
http://www.cs.sunysb.edu/~samir/Pubs/jsac-mesh-06.pdf
VoIP over WLAN 802.11b simulations for infrastructure and ad-hoc networks
http://www.it.uc3m.es/~acuevasr/publicaciones/LCS06.pdf
10 things you should know about VoIP over wireless
http://blogs.techrepublic.com.com/10things/?p=205
Simultaneous VoIP Calls Capacity Over an 802.11 Ad Hoc Network
http://ww1.ucmss.com/books/LFS/CSREA2006/ICW5018.pdf
Analysis of UDP, TCP and Voice Performance in IEEE 802.11b Multihop Networks
http://www.ew2007.org/papers/1569014366.pdf