Part 1 of this article can be found here.
Part 1 gives a basic overview of color sampling, but it’s far from accurate. It was simplified to make color subsampling easier to understand. So, if you’ve read it, great. But don’t go repeating it until you’ve read this.
Cameras do not capture information the way described in part 1. Color subsampling is a compression issue, not necessarily a camera issue. Digital cameras, scanners, and basically any other digital imaging device capture images using one or more image sensors (“chips”), traditionally Charge-Coupled Devices or CCDs. Lower end cameras use a single chip to capture all of the color information. Mid to top end cameras use 3 chips, one dedicated to each color of light: Red, Green, and Blue.
Even with today’s fast computers, processing all of this imagery at roughly 30 frames per second is very taxing and takes very high end equipment. The solution is to compress the data for storage. Even back in the old days of analog standard def television, you were watching compressed visual data. The human eye is more sensitive to lights and darks than it is to color, so the color is what gets compressed.
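To put a rough number on "very taxing," here is a minimal back-of-the-envelope sketch of the uncompressed data rate. The frame sizes (720x480 and 1920x1080), the 8 bits per channel, and the function name are assumptions chosen for illustration, not figures from any particular camera.

```python
# Rough estimate of the data rate for uncompressed video.
# Assumptions (for illustration only): 3 color channels,
# 1 byte (8 bits) per channel, 30 frames per second.

def raw_bytes_per_second(width, height, fps, channels=3, bytes_per_channel=1):
    """Return the uncompressed data rate in bytes per second."""
    return width * height * channels * bytes_per_channel * fps

sd_rate = raw_bytes_per_second(720, 480, 30)     # standard def frame
hd_rate = raw_bytes_per_second(1920, 1080, 30)   # 1080-line HD frame

print(f"SD: {sd_rate / 1e6:.1f} MB/s")
print(f"HD: {hd_rate / 1e6:.1f} MB/s")
```

Even at standard def, that is tens of megabytes every second before any audio, which is why something has to give when the picture is stored.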
So, the camera may “see” all the data, but it has to store it on something. Even storing to magnetic tape requires some compression. Remember MiniDV? Those little tapes needed a lot more compression because the tape width was so limited. The MiniDV cameras would capture the image, reduce the color resolution to 1/4 that of the luminance (lights and darks), and store it on the tiny magnetic tape. That 1/4 of the color is where we get the 4:1:1 color sampling. The three numbers do not represent Red, Green, and Blue. They refer to how the components of YUV (or YCbCr) are sampled: Y is the luminance and the other two carry the color. This is better explained in this Wiki article.
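The 4:1:1 idea can be sketched in a few lines. This is a simplified illustration, not how a DV codec is actually implemented: it assumes one row of 8-bit samples, keeps every luminance value, and replaces each group of 4 horizontal color samples with their average. The function names and sample values are made up for the example.

```python
# Simplified sketch of 4:1:1 subsampling on a single row of pixels.
# Assumption: color is reduced by averaging each group of 4 horizontal
# samples. Real codecs may filter or pick samples differently.

def subsample_411(chroma_row):
    """Return one color sample for every 4 horizontal pixels."""
    out = []
    for i in range(0, len(chroma_row), 4):
        group = chroma_row[i:i + 4]
        out.append(sum(group) / len(group))
    return out

def samples_per_frame(width, height, chroma_divisor):
    """Count stored samples: full luminance plus two reduced color channels."""
    luma = width * height
    chroma = 2 * (width // chroma_divisor) * height
    return luma + chroma

y_row = [16, 32, 48, 64, 80, 96, 112, 128]            # luminance: kept as-is
cb_row = [100, 104, 108, 112, 200, 204, 208, 212]     # one color channel

cb_411 = subsample_411(cb_row)
print(len(y_row), "luminance samples,", len(cb_411), "color samples")

full = samples_per_frame(720, 480, 1)   # 4:4:4, nothing thrown away
dv = samples_per_frame(720, 480, 4)     # 4:1:1, color at 1/4 width
print("4:1:1 stores", dv / full, "of the 4:4:4 samples")
```

Because the luminance is untouched and only the two color channels shrink, the total comes out to half of the 4:4:4 data rather than a quarter.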
So, in summary, when buying a camera, knowing the type of compression it uses will help you determine its color subsampling. If you want raw uncompressed 4:4:4 color, you’ll need a camera with an output that can carry it, such as dual-link HD-SDI (a single HD-SDI link typically carries 4:2:2). Then you’ll need a super awesome capture device to capture that data in real time during live recording.