The CD Format
Understanding the CD format and other formats at a deeper level will make it much easier to pursue high-fidelity audio playback and recording. To understand the CD format we must first understand PCM. PCM means "Pulse Code Modulation", which is a rather stupid and misleading name for what it really does, as there are neither pulses nor modulation present in the process of PCM ;-)
What PCM does
Well, it does two things:
1) An analog signal (which is continuous by nature) is captured by only looking at it from time to time. If we look at a continuous signal at equally spaced time instants, what we see is only an amplitude: as we look NOW, the amplitude is high. At the next NOW-instant (equally spaced) the amplitude is low. This process is called discrete time sampling. We look at a continuous wave, but only at discrete (and equally spaced) time-points. The wave is continuous, but our sampling is time-discrete.
2) At every discrete timepoint at which we look at the wave signal, we write down the amplitude that is present at this exact timepoint. And now the C from PCM comes into action: we measure the amplitude and write it down as a number. We humans have 10 fingers and so we would use the decimal code for writing down numbers. The digital technology we invented works with 2 fingers only, and therefore when writing down the amplitudes, we have to use the binary code.
A number code is a powerful thing! If I would like to communicate that I want to sell you 100 apples, I could draw an apple on a piece of paper and then make 100 dots below the apple, which would be an uncoded value. Then you can count the dots and know how many apples you are able to buy from me. Stupid idea, you say? Right you are. It's a good thing to have a number code, so as a human with 10 fingers I will draw an apple and below the apple I would write 100. If we had only two fingers, I would write down the number in binary code: 1100100, and you would certainly know how many apples I mean. With the aid of a number code, we can save storage space. Instead of making 100 dots for 100 apples, I simply write 100 (decimal) or 1100100 (binary). The real power of the number code comes into action when larger numbers are to be communicated: whenever I add a digit to a binary coded number, I double the number of values I can express.
When I add a digit in our decimal code system, I can express 10 times as many values. 3 digits in decimal give 1,000 combinations, 4 digits give 10,000, 5 digits give 100,000, wow !!! 1 digit in binary gives 2 combinations (1 & 0), 2 digits give 4, 3 give 8 ... 8 digits make up 256 combinations, and 16 binary digits go as large as 65,536 combinations, wow !!! Why on earth am I telling you all this? Trust me, you will need it later when we go on to explore recent 'other than PCM' formats ;-)
Okay, let's summarize: PCM is time-discrete sampling of a continuous wave and writing down (storing) the amplitude values in binary coded form. A much closer expression would be TDAC: Time Discrete Amplitude Coding. But let's just stay with that old PCM.
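The summary above can be sketched in a few lines of Python. This is only an illustration, not how any particular recorder works: the function name `pcm_encode`, the test signal `sine_1khz`, and the choice of signed integer codes are assumptions made for this example.

```python
import math

def pcm_encode(signal, duration_s, sample_rate_hz, bits):
    """Sample signal(t) at equally spaced instants and code each amplitude as an integer."""
    max_code = 2 ** (bits - 1) - 1           # assumption: signed codes, e.g. 32767 for 16 bits
    n_samples = int(round(duration_s * sample_rate_hz))
    codes = []
    for n in range(n_samples):
        t = n / sample_rate_hz               # discrete, equally spaced time instant
        amplitude = signal(t)                # expected in the range -1.0 .. +1.0
        codes.append(round(amplitude * max_code))
    return codes

def sine_1khz(t):
    """Hypothetical test signal: a 1kHz sine with full amplitude."""
    return math.sin(2 * math.pi * 1000 * t)

codes = pcm_encode(sine_1khz, duration_s=0.001, sample_rate_hz=44100, bits=16)
print(len(codes))            # number of samples taken in one millisecond
print(format(100, "b"))      # the 100 apples written in binary code
print(2 ** 8, 2 ** 16)       # combinations available with 8 and 16 binary digits
```

Each entry in `codes` is one written-down amplitude; a real format would then store these numbers as binary words.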
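PCM has exactly Two Parameters
These are the sampling rate (how often we look at the wave) and the word length (how many binary digits we use to write down each amplitude).
As for the CD-format the parameters are: a sampling rate of 44.1kHz and a word length of 16 bits.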
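To put numbers on the CD parameters (16 bits, 44.1kHz), here is a small Python calculation. The variable names are made up for this example, and the bit rate is given per channel only.

```python
SAMPLE_RATE_HZ = 44_100   # CD sampling rate
BITS = 16                 # CD word length

sample_period_us = 1_000_000 / SAMPLE_RATE_HZ   # time between two samples in microseconds
levels = 2 ** BITS                              # distinct amplitude values that can be written down
bits_per_second = SAMPLE_RATE_HZ * BITS         # raw data rate for a single channel

print(f"{sample_period_us:.2f} us between samples")
print(f"{levels} amplitude levels")
print(f"{bits_per_second} bits per second per channel")
```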
Wow, that's pretty fast and pretty accurate. Let's take a look at it. The following diagram shows you a 1kHz sine wave, sampled with CD parameters (16bits, 44.1kHz):
Wow, this looks like a sine wave. It seems that this PCM works pretty well. Now, what happens if we open the throttle a little and choose to sample a higher-frequency sine, let's say 6kHz:
Okay, with 6kHz we have only one sixth as many sample points per wave-period compared to 1kHz, so the 6kHz wave looks a little coarse, as one wave is sampled at only about 7 discrete timepoints (44.1kHz / 6kHz = 7.35). But hey, if we smooth out the squares with some kind of filter, it will come very close to a real sine. So let's stay optimistic and further increase the frequency of our sampled sine wave to 14kHz:
Err, this looks strange. Does this look like a sine wave? There seems to be some really strong amplitude modulation. Is this the M of PCM? Certainly not according to its inventor ;-) As you see, there are fewer than 4 sample points per wave-period. How can we guarantee that when we sample, we always catch the highest and lowest point of the wave? Well, actually we cannot, as our sampling frequency of 44.1kHz is not synchronized to our sine wave and we have so few sampling points per wave. However, we still see that there is a 14kHz wave, but its volume (amplitude) has been modulated by a lower frequency of about 2kHz. Let's call this a beat-frequency.
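The effect can be reproduced numerically. The following Python sketch samples a 14kHz sine at 44.1kHz and records the largest sample value found in each group of 3 consecutive samples (roughly one wave-period). The helper names `sample_sine` and `peak_per_block` are assumptions for this example, and the raw samples are looked at without any reconstruction filter.

```python
import math

FS = 44_100  # CD sampling rate in Hz

def sample_sine(freq_hz, n_samples, fs=FS):
    """Sample a full-amplitude sine at equally spaced instants."""
    return [math.sin(2 * math.pi * freq_hz * n / fs) for n in range(n_samples)]

def peak_per_block(samples, block):
    """Largest sample in each consecutive block of `block` samples."""
    return [max(samples[i:i + block]) for i in range(0, len(samples) - block + 1, block)]

samples = sample_sine(14_000, 63)   # 63 samples cover exactly 20 periods of 14kHz
peaks = peak_per_block(samples, 3)

print(f"samples per period:    {FS / 14_000:.2f}")
print(f"highest captured peak: {max(peaks):.3f}")
print(f"lowest captured peak:  {min(peaks):.3f}")
```

The true peak of the sine is always 1.0, yet the captured peaks wander up and down from block to block, which is the amplitude modulation seen in the sampled points.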
In fact, we always get that beat-frequency with any sampled wave whose frequency is not exactly a whole division of the sample-rate. So we expect no beats with 22.05kHz, 14.7kHz, 11.025kHz, 8820Hz, 7350Hz, 6300Hz, and so on. Let's give that a try and take a look at 14.7kHz:
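As a quick check in Python, the whole divisions of the sample-rate can be listed, and a sampled 14.7kHz sine can be compared with a 14kHz one. The helper `distinct_sample_values` and the starting phase of 0.3 radians are arbitrary choices made for this sketch.

```python
import math

FS = 44_100  # CD sampling rate in Hz

# Whole divisions of the sample-rate: FS/2, FS/3, ... FS/7
no_beat_freqs = [FS / n for n in range(2, 8)]
print(no_beat_freqs)

def distinct_sample_values(freq_hz, n_samples, phase=0.3, fs=FS):
    """Sorted set of the different amplitude values that the sampler captures."""
    values = {round(math.sin(2 * math.pi * freq_hz * n / fs + phase), 6)
              for n in range(n_samples)}
    return sorted(values)

locked = distinct_sample_values(14_700, 63)    # exactly 3 samples per period
drifting = distinct_sample_values(14_000, 63)  # 3.15 samples per period

print(len(locked), locked)
print(len(drifting))
```

At 14.7kHz the same three values repeat forever, so there is nothing that could beat. At 14kHz the sample points slide along the wave and many different values show up.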
Yes. There's no beat to see. All exactly the same amplitude. But what happened to the waveform? This looks more like a sawtooth than a sine wave. Well, with exactly 3 sample points per wave-period, it cannot be any other way. Above and below any whole division of the sample-rate we get beats. The higher the frequency (or the fewer the available sample points per wave-period), the higher the amplitude of the beat. From approx. 18kHz on we get a beat (amplitude modulation) of 100%. So this is 18kHz:
You can hardly tell what this one wants to be. However, as our recorded frequency approaches the next whole division of the sample-rate, the beat frequency slows down. Look at 20kHz:
This has a 4kHz beat with 100% amplitude modulation. The following is 21kHz:
The beat frequency goes down to approx. 2kHz. When we increase the frequency of our sampled sine to 22kHz, we get a really slow beat:
Let's step back a little:
Wow, a 22kHz sine wave, sampled at the CD rate of 44.1kHz, results in a sine wave with a 100Hz beat. As a matter of fact, this does not only happen when we sample very high frequencies, but with every frequency that approaches a whole division of the sample-rate. For lower frequencies we just have a lower amplitude modulation.
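The beat figures quoted for 20kHz, 21kHz and 22kHz all fit one simple relation: close to half the sample-rate, the beat frequency of the raw samples equals the sample-rate minus twice the sine frequency. The Python sketch below evaluates that relation and checks it against actual sample values; the function name `beat_near_half_rate` is an assumption for this example, and the relation is only meant for frequencies just below 22.05kHz.

```python
import math

FS = 44_100  # CD sampling rate in Hz

def beat_near_half_rate(freq_hz, fs=FS):
    """Beat frequency of the raw samples for a sine just below fs/2."""
    return fs - 2 * freq_hz

for f in (20_000, 21_000, 22_000):
    print(f, "Hz sine ->", beat_near_half_rate(f), "Hz beat")

# Check for 22kHz: the size of every sample follows |sin| of a 50Hz wave,
# and |sin| of a 50Hz wave rises and falls 100 times per second.
worst = 0.0
for n in range(1000):
    sample = math.sin(2 * math.pi * 22_000 * n / FS)
    envelope = abs(math.sin(2 * math.pi * 50 * n / FS))
    worst = max(worst, abs(abs(sample) - envelope))
print("largest deviation from the envelope:", worst)
```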
Yes, But ... err... Nyquist ... eeh ...
Well, what did Nyquist say?
maximum data rate in a noiseless channel: $C = 2W \log_2 L$ bits/sec
* where $2W$ is 2 times the highest frequency contained in the noiseless channel, and
* where $L$ is the number of discrete levels (e.g., binary = two levels, 0 and 1)
As Nyquist seems to have been more interested in data transmission than in high fidelity, we should not be surprised that his statement just defines a maximum data rate of a communications channel. If we consider the presence of a frequency in a communication channel to be a piece of information, we can agree that we need a sampling rate of twice that frequency in order for it to show up. And as we see in the above diagrams, although those signals can look awful, the frequency 'as an information' shows up.
Later, Claude Shannon said: If a function $s(x)$ has a Fourier transform $F[s(x)] = S(f) = 0$ for $|f| > W$, then it is completely determined by giving the value of the function at a series of points spaced $1/(2W)$ apart. The values $s_n = s(n/(2W))$ are called the samples of $s(x)$.
This goes much further than Nyquist's words, in that it states that a signal which consists of sine waves with a maximum frequency of W is completely described by sampling its values at a rate of twice W. The really cool thing is that Shannon also gave an interpolation formula to get back to the original signal:
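The interpolation is commonly written as the Whittaker-Shannon (sinc) formula, $s(x) = \sum_n s_n \, \mathrm{sinc}(2Wx - n)$ with $\mathrm{sinc}(u) = \sin(\pi u)/(\pi u)$. As a rough Python sketch of the idea, the code below rebuilds the value of a 14kHz sine halfway between two of its 44.1kHz samples. The function names are made up for this example, and the sum is cut off after 2000 samples (the exact formula needs infinitely many), so the result is only approximate.

```python
import math

FS = 44_100     # sampling rate in Hz, plays the role of 2W
FREQ = 14_000   # frequency of the sampled sine in Hz

def sinc(u):
    """Normalized sinc: sin(pi*u) / (pi*u), with sinc(0) = 1."""
    return 1.0 if u == 0 else math.sin(math.pi * u) / (math.pi * u)

def reconstruct(samples, fs, t):
    """Truncated sinc interpolation: sum of s_n * sinc(fs*t - n)."""
    return sum(s_n * sinc(fs * t - n) for n, s_n in enumerate(samples))

samples = [math.sin(2 * math.pi * FREQ * n / FS) for n in range(2000)]

t = 1000.5 / FS                        # halfway between sample 1000 and sample 1001
estimate = reconstruct(samples, FS, t)
exact = math.sin(2 * math.pi * FREQ * t)

print(f"reconstructed: {estimate:.4f}")
print(f"original:      {exact:.4f}")
```

Even though the raw 14kHz sample points look heavily modulated, the value in between comes back close to the original sine, with a small error left over from cutting the sum short.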