Originally posted by Gooshin:
people who have mulled over it try and break it down to me than a textbook.
Well, then start with this; others may follow up.
- Compression reduces the size of data, as the term says.
- Compression may be lossless or not.
- Lossless compression does not destroy information; it only changes the way the data is represented inside a data container, so the original can always be restored exactly. The easy example is this:
Transform "00000011100000" -> "6x0,3x1,5x0" and you compressed w/o a loss (14 characters became 11, and the original can be rebuilt from them). Lossless compression is the default for general-purpose tools; the bzip2 algorithm is very good at it.
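The run-length idea in that example can be sketched in a few lines of Python. This is only a toy illustration, not how bzip2 works internally; the function names `rle_encode` and `rle_decode` are made up for this post, and the sketch assumes the input never contains the characters "x" or ",".

```python
from itertools import groupby


def rle_encode(bits: str) -> str:
    """Run-length encode a string, e.g. '0001' -> '3x0,1x1'."""
    # groupby yields one (symbol, run) pair per stretch of identical symbols.
    return ",".join(f"{len(list(run))}x{symbol}" for symbol, run in groupby(bits))


def rle_decode(encoded: str) -> str:
    """Undo rle_encode and rebuild the original string exactly."""
    if not encoded:
        return ""
    parts = []
    for item in encoded.split(","):
        count, symbol = item.split("x")
        parts.append(symbol * int(count))
    return "".join(parts)


original = "00000011100000"
packed = rle_encode(original)
print(packed)                          # 6x0,3x1,5x0
print(rle_decode(packed) == original)  # True
```

The round trip giving back the exact input is what makes it lossless. Note that on data without long runs (say "0101010101") this scheme makes things bigger, not smaller.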
Now a big IF
- If the meaning of the data is known (image, music, text etc.), one gets new options:
  - Content-specific lossless compression (FLAC for music, PNG for images etc.)
  - Compression with a loss of information (i.e., it cannot be undone)
The easy example is this:
Transform "00000011100000" -> "000000111" if you know that the loss of the trailing stuff may pass unnoticed. Examples are MP3 for music, H.264/AVC (usually stored in MP4 files) for movies, JPEG or JPEG2000 for images etc. (JPEG2000 also has a lossless mode). The more the algorithm knows about the content, the better it can compress without a noticeable loss.
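The lossy example above, as a toy Python sketch. The name `lossy_trim` is made up for this post, and "trailing zeros do not matter" is just the assumption from the example; real codecs like MP3 or JPEG use far more elaborate models of what a listener or viewer will not notice.

```python
def lossy_trim(bits: str) -> str:
    """Drop trailing zeros, assuming nobody will miss them."""
    return bits.rstrip("0")


a = "00000011100000"
b = "0000001110"

print(lossy_trim(a))                   # 000000111
print(lossy_trim(a) == lossy_trim(b))  # True
```

Two different inputs end up as the same output, so there is no way to tell afterwards which one you started with. That is what "cannot be undone" means here.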
Sidenote:
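There is a theorem saying that random noise cannot be compressed without a loss. A simple counting argument shows why: there are fewer short outputs than long inputs, so no lossless compressor can shrink every possible input, and truly random data has no redundancy left to remove.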
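You can try the noise claim yourself with Python's standard `bz2` module. This is a rough experiment rather than a proof; the helper name `packed_size` and the 100,000-byte sample size are arbitrary choices for this post.

```python
import bz2
import os


def packed_size(data: bytes) -> int:
    """Return the size in bytes of data after bzip2 compression."""
    return len(bz2.compress(data))


noise = os.urandom(100_000)   # random bytes from the operating system
repetitive = b"0" * 100_000   # the same byte repeated over and over

for label, data in (("noise", noise), ("repetitive", repetitive)):
    print(f"{label}: {len(data)} -> {packed_size(data)} bytes")
```

The repetitive input should shrink to a tiny fraction of its size, while the noise should come out slightly larger than it went in, because the compressor adds its own bookkeeping and finds nothing to squeeze.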