We have made good progress on the Mesh Potato (MP) so far. The basic functionality (making phone calls over a Wifi mesh network) is working OK. Over the past few weeks Elektra and I have been concentrating on several specific areas of the MP and testing them in more depth. This post focuses on some of the tests I have performed, the bugs found, and the final results.
Stability
One simple stability test is uptime. I have left both my MP prototypes running for up to 5 days at a time with good results – Linux and Wifi stayed up, and we still had dial tone from the FXS port. This simple test says a lot, e.g. no memory leaks, no CPU instability, and no drastic over-temperature problems.
Another test I like to do to all my telephony devices is hammer them with phone calls for 24 hours. Here is the test set-up:
An x86 Asterisk box acts as the call generator. It starts a new call to the MP. When the MP receives the call it starts the FXS port ringing. Another Asterisk box (an IP04 with an FXO port) answers the call, plays a prompt for a few seconds, then hangs up. The entire sequence is then repeated, thousands of times over a 24 hour period. The SIP call is placed over Mesh Wifi to make sure all parts of the MP system are being exercised.
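One way to drive a loop like this from the call generator box is with Asterisk call files dropped into the outgoing spool directory. The sketch below is only an illustration of that approach, not the set-up actually used for these tests: the peer name `meshpotato`, the extension `4000`, the dialplan context `loadtest` and the timing values are all hypothetical.

```python
import os
import shutil
import tempfile

SPOOL_DIR = "/var/spool/asterisk/outgoing"  # default Asterisk outgoing spool


def build_call_file(channel, context, extension, wait_time=30):
    """Return the text of an Asterisk call file for a single test call."""
    lines = [
        f"Channel: {channel}",      # who to ring, e.g. the MP over SIP
        "MaxRetries: 0",            # a failed call should count as a failure
        f"WaitTime: {wait_time}",   # seconds to wait for an answer
        f"Context: {context}",      # dialplan context to run once answered
        f"Extension: {extension}",
        "Priority: 1",
    ]
    return "\n".join(lines) + "\n"


def queue_call(text, spool_dir=SPOOL_DIR, staging_dir=None):
    """Write the call file outside the spool, then move it in.

    Asterisk picks up files as soon as they appear, so the file is written
    completely before it is moved. The move is atomic only when both
    directories are on the same filesystem.
    """
    fd, tmp_path = tempfile.mkstemp(suffix=".call", dir=staging_dir)
    with os.fdopen(fd, "w") as f:
        f.write(text)
    final_path = os.path.join(spool_dir, os.path.basename(tmp_path))
    shutil.move(tmp_path, final_path)
    return final_path


if __name__ == "__main__":
    # Hypothetical peer, extension and context, for illustration only.
    print(build_call_file("SIP/meshpotato/4000", "loadtest", "s"), end="")
```

A small wrapper that calls `queue_call` and then sleeps for the length of one call cycle would be enough to generate calls around the clock.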
This test gives the MP quite a hard time as much of the call processing load is involved in set up and tear down of calls, and unless you are my teenage daughter you can’t really make 3500 calls in 24 hours.
When I started the tests it threw up problems nearly immediately – Asterisk was seg faulting with no feedback as to why it was crashing. This led to a week of fun and games where the problem was eventually traced to thread problems in the Asterisk channel driver I had written (chan_mp). This channel driver has been a painful experience for me (even though I have written a few before), and has taken perhaps 3 weeks of part time work to write and debug. Lots of fun with threads and null pointers. Even now I still don’t quite understand all the issues around Asterisk channel drivers. I would appreciate a code review of the driver if there are any Asterisk channel driver experts out there.
Anyway I finally tracked down the source of the problem and modified the channel driver. It passed the 24 hour stability tests and made 3500 calls.
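To put the 3500 calls in 24 hours in perspective, here is the arithmetic as a small sketch. The function names are my own, and it assumes the calls were spread evenly over the whole period.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def seconds_per_call(total_calls, duration_s=SECONDS_PER_DAY):
    """Average length of one complete call cycle (set up, prompt, tear down)."""
    return duration_s / total_calls


def calls_per_hour(total_calls, duration_s=SECONDS_PER_DAY):
    """Average number of call cycles completed per hour."""
    return total_calls * 3600 / duration_s


print(f"{seconds_per_call(3500):.1f} s per call cycle")   # 24.7 s per call cycle
print(f"{calls_per_hour(3500):.0f} calls per hour")       # 146 calls per hour
```

So the MP was setting up and tearing down a call roughly every 25 seconds for a full day.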
Mesh Load Test
This is a repeat of the load test from Phase 1 of the Mesh Potato Project, but this time on the actual MP hardware with the FXS driver running.
A big difference between the Village Telco and a cell phone network is that there are no base stations (cell phone towers). Instead, the calls are routed via the client nodes. It’s a peer-to-peer network rather than a client-server architecture.
The idea of this test is to make sure that a given Mesh Potato node can relay 15 phone calls for other people while simultaneously making a phone call of its own. This scenario places significant CPU load on the router due to the number of Wifi packets that must be processed at the same time as DSP intensive code like echo cancellation and speech compression.
The set up details of this test are on the wiki and have been blogged about earlier. Speech quality was fine, and the load average was about 0.71, similar to the Phase 1 results. Occasionally I could hear some packet loss but overall the call was fine. A good result considering all that processing (Linux, Wifi, mesh, Asterisk, echo cancelling, GSM speech compression, FXS driver) on that little router chip!
One interesting fact is that even with a total of 16 calls the bit rate is only around 500 kbit/s, so the bandwidth of the mesh is quite lightly loaded. However speech packets are rather short, so raising the number of phone calls would likely run into CPU load (rather than Wifi bandwidth) issues due to the per-packet processing load.
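A rough back-of-envelope sketch shows why the packet count matters more than the bit rate. The numbers here are assumptions rather than measurements from the test: GSM full rate speech in 33 byte frames, one frame per RTP packet every 20 ms, 40 bytes of IPv4, UDP and RTP headers, and no allowance for 802.11 or mesh protocol overhead.

```python
FRAME_MS = 20               # assumed: one GSM frame per RTP packet
GSM_FRAME_BYTES = 33        # GSM full rate, 13 kbit/s, 20 ms frame
HEADER_BYTES = 20 + 8 + 12  # IPv4 + UDP + RTP headers


def per_call_stats(payload_bytes=GSM_FRAME_BYTES,
                   header_bytes=HEADER_BYTES,
                   frame_ms=FRAME_MS):
    """Return (packets/s, bit/s) for one direction of one call."""
    packets_per_s = 1000 / frame_ms
    bits_per_s = (payload_bytes + header_bytes) * 8 * packets_per_s
    return packets_per_s, bits_per_s


def mesh_load(calls, **kwargs):
    """Return (packets/s, kbit/s) for one direction of `calls` calls."""
    pps, bps = per_call_stats(**kwargs)
    return calls * pps, calls * bps / 1000


pps, kbps = mesh_load(16)
payload_share = GSM_FRAME_BYTES / (GSM_FRAME_BYTES + HEADER_BYTES)
print(f"{pps:.0f} packets/s, {kbps:.1f} kbit/s per direction")  # 800 packets/s, 467.2 kbit/s
print(f"speech payload is {payload_share:.0%} of each packet")  # 45%
```

Under these assumptions, 16 calls come to about 467 kbit/s in each direction, the same ballpark as the figure above, yet the router still has to handle hundreds of small packets every second, and less than half of each packet is actual speech.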
Wifi Range Test
Wifi is mainly the domain of Elektra and Jeff on this project but I was keen to get some rough idea of how well the radio was working. I had made a few calls around the house OK but noticed that the received signal level was quite low. A bit of thought led us to a static protection diode we had placed across the antenna circuit. Although a good idea for the RX side it was conducting when transmit signal was present. After removing this diode the TX level jumped right up to roughly comparable levels to a DIR-300 router that uses the same Atheros AR2317 SoC.
However, it is hard to get consistent results using radiated signal levels, as they bounce around all over the place. Jeff suggested connecting a combination of DIR-300 and Mesh Potatoes in A-B tests via cables and calibrated attenuators to obtain consistent measurements.
Over a few days I made several attempts at range tests. However where I live is quite flat so it’s difficult to get a good Line-Of-Sight (LOS). I did manage to make phone calls over about 250m which is pretty much what we need for a Village Telco Network.
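For a feel of what 250m means in radio terms, the standard free-space path loss formula gives a best-case figure. This is a sketch under stated assumptions, not a measurement: the channel 6 centre frequency of 2.437 GHz is my assumption (the channel used is not recorded here), and a real link close to flat ground will lose more than free space predicts.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # m/s


def fspl_db(distance_m, freq_hz):
    """Free-space path loss in dB between two isotropic antennas."""
    return 20 * math.log10(4 * math.pi * distance_m * freq_hz / SPEED_OF_LIGHT)


# Assumed 2.4 GHz channel 6 centre frequency.
print(f"{fspl_db(250, 2.437e9):.1f} dB at 250 m")  # 88.1 dB at 250 m
```

Doubling the distance adds about 6 dB to this figure, which is why a few dB of lost transmit level is quite noticeable in range.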
Elektra and I feel the radio performance could probably be improved with some adjustment of the RF circuit components and possibly PCB layout. Elektra has access to an RF test lab later this week so we will get some more concrete information on Wifi performance.
One area we really need more information on is radio calibration data and the Hardware Design Guide document for the AR2317. However, we have been unable to obtain this important information from Atheros.
More blogging to come from Elektra on Wifi radio and power supply performance.