Why compression
Everybody likes it cheap (says the marketing division). True or not, a conversation contains a lot of pauses, and the human voice has other "properties" that allow it to be transmitted over a data line more efficiently than as an uncompressed stream on a single 64k channel. So compression is recommended.
The question is how much compression is acceptable and what the restrictions are.
As explained on the previous pages, this has to be a balance between CPU power, the size of the packets, the latency and the remaining quality of the voice. One of the early lossless compression methods was STAC compression, more than 20 years ago. Then the scientists found out that voice does not need to be "lossless".
You may want to read the publications about MP3 and how it works. MP3 suppresses the parts of the sound that a human cannot hear. There are many articles about MP3 on the web. VoIP compression is similar. In the end, the listener at the other end wants to understand the voice as clearly as possible, standing out from the background hum and noise of the line.
What is a codec?
The codec has two jobs. It must cut the voice stream into packets for Internet transmission, and it may or may not compress the contents of each packet.
So the technician has to select one of many methods to compress and decompress the voice. These methods are mathematical operations, specially designed to compress small packets (about 20 ms) in real time.
We call them codecs (coder and decoder = codec). Each codec has its own characteristics. Some are really fast but not very efficient, others are really clever but slow.
We will not go into depth here; it is very confusing and does not help in making a decision. Many codecs are dictated and driven by the chip or DSP, so you must check what your selected hardware can do according to its specs. And you should follow the standards.
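To make the "about 20 ms" packet size a bit more concrete, here is a minimal Python sketch of the first codec job, cutting an uncompressed voice stream into packets. It assumes 8 kHz sampling with 8 bits per sample (which gives the 64k line rate mentioned above); the function names are made up for this illustration and do not belong to any real codec library.

```python
SAMPLE_RATE_HZ = 8000   # assumption: narrowband telephony sampling rate
BITS_PER_SAMPLE = 8     # assumption: one byte per sample, as on a 64k line
FRAME_MS = 20           # the "about 20 ms" packet length from the text


def samples_per_frame(sample_rate_hz=SAMPLE_RATE_HZ, frame_ms=FRAME_MS):
    """Number of samples carried by one frame (packet)."""
    return sample_rate_hz * frame_ms // 1000


def split_into_frames(stream, frame_bytes):
    """Cut a byte stream into full frames; an incomplete tail is dropped."""
    full = len(stream) - len(stream) % frame_bytes
    return [stream[i:i + frame_bytes] for i in range(0, full, frame_bytes)]


if __name__ == "__main__":
    line_rate_bps = SAMPLE_RATE_HZ * BITS_PER_SAMPLE          # 64000 bit/s
    frame_bytes = samples_per_frame() * BITS_PER_SAMPLE // 8  # 160 bytes
    one_second = bytes(SAMPLE_RATE_HZ)  # one second of dummy silent samples
    frames = split_into_frames(one_second, frame_bytes)
    print(line_rate_bps, frame_bytes, len(frames))
```

With these assumptions one second of voice becomes 50 packets of 160 bytes each. A compressing codec would then shrink each of these packets before sending it.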
Codec comparison by features
Codec    | sampling rate (kHz) | bitrate (kbps) | frame size (ms) | delay (ms) | license |
Speex    |                     | 2.15-24.6      |                 | 30-34      | open source |
iLBC     |                     | 13.3           | 30              |            | no charge, but not open source |
GSM-FR   |                     | 13             | 20              |            | proprietary |
GSM-EFR  |                     | 12.2           | 20              |            | proprietary |
GSM 6.10 | 8                   | 13             | 22.5            |            | |
G.711    | 8                   |                |                 |            | |
G.711A   |                     |                |                 |            | |
G.711U   |                     |                |                 |            | |
G.721    | 8                   |                |                 |            | |
G.722    | 16                  | 48 / 56 / 64   | ?               | ?          | |
G.722.1  | 16                  |                | 20              |            | |
G.722.2  |                     | 6.6 - 23.85    | 20              |            | proprietary |
G.723    | 8                   |                |                 |            | |
G.723.1  | 8                   | 5.3 - 6.3      | 30              | 37.5       | proprietary |
G.726    | 8                   |                |                 |            | |
G.727    |                     |                |                 |            | |
G.728    | 8                   | 16             | 0.625           |            | proprietary |
G.729    | 8                   |                | 10              | 15         | proprietary |
G.729A   |                     |                |                 |            | proprietary |
G.729B   |                     |                |                 |            | proprietary |
G.729AB  |                     |                |                 |            | proprietary |
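To get a feeling for what the numbers in the table mean, the following small Python sketch estimates how many bytes of coded voice one packet carries and how much smaller the stream is than an uncompressed 64 kbps channel. It only uses codecs for which the table gives both a bitrate and a frame size. The helper names are made up for this illustration, and because the bitrates in the table are rounded, the results are approximations, not the exact frame sizes defined by the standards.

```python
def payload_bytes(bitrate_kbps, frame_ms):
    """Approximate coded bytes per frame: kbit/s * ms = bits, / 8 = bytes."""
    return bitrate_kbps * frame_ms / 8


def ratio_to_64k(bitrate_kbps):
    """How many times smaller the stream is than an uncompressed 64 kbps line."""
    return 64.0 / bitrate_kbps


# (bitrate in kbps, frame size in ms), values taken from the table above
CODECS = {
    "iLBC": (13.3, 30),
    "GSM-FR": (13, 20),
    "GSM-EFR": (12.2, 20),
    "G.723.1 (low rate)": (5.3, 30),
    "G.723.1 (high rate)": (6.3, 30),
}

if __name__ == "__main__":
    for name, (kbps, ms) in CODECS.items():
        print(f"{name:20s} ~{payload_bytes(kbps, ms):6.1f} bytes/frame, "
              f"about 1:{ratio_to_64k(kbps):.1f} against 64 kbps")
```

Note that this counts only the voice payload. Every packet also carries protocol headers, so the bandwidth saved on the line is smaller than these ratios suggest, especially with short frames.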
Some really impressive samples are here:
http://www.speex.org/samples.html
and here: