compression and file formats, some clarification?

Gooshin · 02-20-2008, 09:36 AM

i had a friend sit and explain to me over a beer how digital data processing and compression work on a mass scale, and how compression is just taking a file that's 010010010001 and creating an algorithm to make it 0101101 but display the identical thing, thereby knocking off that last string of 0's and 1's and making the file smaller.

he also tried to explain to me how an operating system works but that just went over my head.

a digital light sensor converts light intensity into electric signals that are then recorded and stored as data.

that data is then saved to a file, in our case let's take a .PEF on a K100D. this amounts to either a 10.6 megabyte file or a 9.7 megabyte file, what this depends on i still don't know.

is this data compressed? how RAW is it?

and if it is technically uncompressed and 12 bit in nature, why does a conversion to TIFF somehow add information to this and double the file size? is it extrapolating, is it uncompressing whatever was never technically compressed to an even greater size?

should i convert all my .PEF files from the camera to TIFF first before i do any sort of post processing since there is more information there?

inquiring minds want to know!

Last edited by Gooshin; 02-20-2008 at 09:41 AM.

procyon · 02-20-2008, 11:23 AM

PEF files are compressed losslessly, which means that when your post-processing software uncompresses one, you get exactly the same data you had before compression.
JPEG compression, on the other hand, is lossy, which means that some information is irreversibly lost during compression.
The more data you are willing to discard, the better compression you get.
And you won't get any more data by converting your PEF files to TIFF, since you are just saving them in another container (if you pour your milk into a huge water tank, you don't get any more milk, you just change the container). The reason some people convert to TIFF is just that TIFF files are more standardized and better supported by editing software.
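To make the lossless/lossy distinction above concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: `zlib` stands in for whatever lossless scheme a raw file actually uses, and throwing away low bits stands in for the far more sophisticated way JPEG discards information. The function names are made up for this example.

```python
import zlib

def lossless_round_trip(data: bytes) -> bytes:
    """Compress and decompress with zlib; the result is byte-for-byte identical."""
    return zlib.decompress(zlib.compress(data))

def lossy_round_trip(samples: list[int], keep_bits: int = 4) -> list[int]:
    """Keep only the top `keep_bits` of each 8-bit sample.

    This is a crude stand-in for lossy compression: the discarded low
    bits cannot be recovered afterwards.
    """
    shift = 8 - keep_bits
    return [(s >> shift) << shift for s in samples]

# A small ramp of 8-bit brightness values used as test data.
samples = list(range(0, 256, 5))

restored = lossless_round_trip(bytes(samples))
degraded = lossy_round_trip(samples)

print(restored == bytes(samples))   # lossless: same data back
print(degraded == samples)          # lossy: some detail is gone for good
```

The lossless path always returns the original bytes, while the lossy path collapses the ramp onto a coarser set of values, which is the "irreversibly lost" part.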

Gooshin · 02-20-2008, 11:27 AM

Originally posted by procyon

The more data you are willing to discard the better compression you get.
And you won't get any more data by converting your PEF files to TIFF, since you are just saving them in another container (if you pour your milk into a huge water tank, you don't get any more milk, you just change the container). The reason some people convert to TIFF is just that TIFF files are more standardized and better supported by editing software.

that's a very interesting analogy!

so what is it, physically (or "programmatically") that makes a TIFF file larger if it's not adding to the data? does it use more complex code?

and why then would TIFF be popular if you've got something like, say, DNG floating around?

Photo Tramp · 02-20-2008, 11:29 AM

WOW,
There is no simple way of answering this question without getting into formal theory, meaning it will take too long to explain. Raw files are just that: you have all the data and colors the lens transmitted to the sensor. If you want to work with the image in post-processing, the best file formats to use (where all the information is available to the program you are using) are PEF (RAW files), DNG (a mirror image of the RAW files) and TIFF (which also contains all the data and color information). It is best to have all the information of the image for post-processing so you can manipulate the data to improve or change the image. When you convert down to JPEG you lose some of that data, which limits post-processing.
Know this: for storage purposes it is better to use RAW or DNG files than TIFF files. You will take up less drive space using DNG and RAW files.

I'm sorry I just don't know how to give you the answer in writing.

attack11 · 02-20-2008, 01:29 PM

Originally posted by Gooshin

that's a very interesting analogy!

so what is it, physically (or "programmatically") that makes a TIFF file larger if it's not adding to the data? does it use more complex code?

and why then would TIFF be popular if you've got something like, say, DNG floating around?

tiffs have multiple compression options, as well as support for stacked layers. dng is a raw standard; a completely different beast.

tiffs are handy for sending final comps to a printer, etc.

procyon · 02-20-2008, 02:18 PM

Originally posted by Gooshin

so what is it, physically (or "programmatically") that makes a TIFF file larger if it's not adding to the data? does it use more complex code?
and why then would TIFF be popular if you've got something like, say, DNG floating around?

In RAW files you have the sensor values, 12 bits of info for each sensor element (photosite), but all this info is monochromatic, because each sensor element only registers one color (red, green or blue). So the uncompressed size of a RAW image is roughly: megapixels * 12 bits.
If you develop the RAW image, the color information is computed by interpolation. Now all of a sudden you have 3 color channels for each pixel! If each of these color channels contains 8 bits of information (the most common option), your file size becomes: megapixels * 3 channels * 8 bits. For efficient calculations an additional 8 bits might be added (then you get 32 bits per pixel, which is convenient because computer processors also work with 32-bit values (well, some use 64 nowadays)). So the file size increases. And if you want to avoid losing color data in a TIFF file, you might consider saving at 16 bits per channel. You might ask why not 12 (since that is what you have in a RAW file)? Well, it's again all about computer processors. They just like powers of 2, so after 8 bits comes 16.
As to your other question... DNG has been here for how long? A couple of years. The current version of TIFF was created in 1992... It's all about compatibility with existing software.
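The size arithmetic above can be worked through in a few lines of Python. This is a rough sketch only: the round figure of 6 million pixels is an assumption chosen for easy numbers, and the helper `image_size_mb` is made up for this example. It ignores headers, embedded previews and any compression, so real files will not match exactly.

```python
def image_size_mb(pixels: int, channels: int, bits_per_channel: int) -> float:
    """Uncompressed pixel data size in megabytes (1 MB = 1,000,000 bytes)."""
    return pixels * channels * bits_per_channel / 8 / 1_000_000

PIXELS = 6_000_000  # assumption: a round 6-megapixel sensor

raw_mb = image_size_mb(PIXELS, channels=1, bits_per_channel=12)         # one 12-bit value per photosite
tiff8_mb = image_size_mb(PIXELS, channels=3, bits_per_channel=8)        # interpolated RGB, 8 bits each
tiff8_padded_mb = image_size_mb(PIXELS, channels=4, bits_per_channel=8) # RGB padded to 32 bits per pixel
tiff16_mb = image_size_mb(PIXELS, channels=3, bits_per_channel=16)      # RGB, 16 bits each

print(raw_mb, tiff8_mb, tiff8_padded_mb, tiff16_mb)  # 9.0 18.0 24.0 36.0
```

Under these assumptions the 8-bit RGB data is already twice the size of the 12-bit sensor data, and the 16-bit version is four times the size, even though no new measurements were taken.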

AlexL · 02-20-2008, 02:26 PM

procyon—that is a very simple and to the point reply, good analogy too

Gooshin: DNG and TIFF serve different purposes, even though they do overlap. DNG is a form of RAW, and you're not supposed to be able to directly modify a RAW image. Software programs that edit RAW don't directly edit it; they write the changes to another file that accompanies the RAW, or they "export" the RAW image with the changes to another format, like TIFF.

With TIFF, you can directly modify the file to your heart's delight.

Think of a DNG and RAW like the original roll of film from a camera, and something like TIFF like a print you make from the film. Once you take the picture and develop the film, you can't modify the film anymore. You can make modifications when you're making a print, like changing the colors around.

For this reason, you can't use DNG and TIFF interchangeably, you can't really replace one with the other. You would want to keep a DNG as the original of whatever photo you're working on, and then export the modifications done to it to a TIFF or something else. It's always good to have the original.

Alex

Gooshin · 02-20-2008, 02:32 PM

this is good info, good work guys.

follow-up question: for printing, should i save my files as 16 bit TIFFs?

my buddy keeps telling me jpegs are fine, but i don't know... a good photolab should be able to take advantage of more colour information, right?

procyon · 02-20-2008, 03:03 PM

The main reason to use 16-bit TIFF files is to have more headroom for post-processing. It really helps to keep smooth color gradients from becoming banded, etc.
Personally I think you should be fine if you save your final result as 8-bit.
Since the photo lab is unlikely to process the image further, they just don't need that extra headroom. And since 16-bit TIFF files are less common, their software might not even accept them.
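The "headroom" point can be illustrated with a small Python sketch. It is a toy model under stated assumptions, not how any particular editor works: a smooth gradient is stored at a given bit depth, darkened by a factor of 4 with integer math, brightened back, and finally reduced to 8 bits for output. The function name and the choice of edit are made up for this example.

```python
def distinct_levels_after_edit(bit_depth: int, samples: int = 1024) -> int:
    """Darken a smooth gradient 4x, brighten it back, and count how many
    distinct 8-bit output levels survive the round trip."""
    max_val = 2 ** bit_depth - 1
    survivors = set()
    for i in range(samples):
        # Quantize a 0..1 ramp to the working bit depth (rounded).
        stored = (i * max_val + (samples - 1) // 2) // (samples - 1)
        darkened = stored // 4    # edit 1: integer division discards the low bits
        restored = darkened * 4   # edit 2: brighten back by the same factor
        # Reduce to 8 bits for the final output (rounded).
        survivors.add((restored * 255 + max_val // 2) // max_val)
    return len(survivors)

levels_8bit = distinct_levels_after_edit(8)
levels_16bit = distinct_levels_after_edit(16)
print(levels_8bit, levels_16bit)
```

In this toy model the 8-bit working copy ends up with 64 distinct output levels (visible banding in a smooth gradient), while the 16-bit working copy still delivers all 256 levels once it is reduced to 8 bits at the very end. That is why the extra bits matter during editing but much less for a finished file.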

falconeye · 02-23-2008, 04:35 PM

Hi Gooshin,

are you doing an experiment how willingly this forum would answer your questions?

I don't believe that you don't know the answer. But if you really don't, here are all your answers: Image compression - Wikipedia, the free encyclopedia

AlexL · 02-24-2008, 03:53 PM

Originally posted by falconeye

Hi Gooshin,

are you doing an experiment how willingly this forum would answer your questions?

I don't believe that you don't know the answer. But if you really don't, here are all your answers: Image compression - Wikipedia, the free encyclopedia

It's a pretty good experiment, I'll say. I think the experiment showed how well some people on this forum can explain complicated topics to people willing to ask the questions. It's more responsive than Wikipedia.

Alex

Gooshin · 02-24-2008, 04:28 PM

Originally posted by falconeye

Hi Gooshin,

are you doing an experiment how willingly this forum would answer your questions?

I don't believe that you don't know the answer. But if you really don't, here are all your answers: Image compression - Wikipedia, the free encyclopedia

as someone that peer-tutored a lot of friends during high school and university, i can tell you that having someone break it down for you in layman's terms or spin the argument from a different angle is much better than reading a textbook.

you're right, i do "know" the answer, the same way i know how a car engine works. i know that there are pistons and a crankshaft, and you have 4 strokes to create movement, but start asking me about volumetric efficiency and the mathematical effects of camshaft overlap and lift duration and i'll be drawing a blank.

same thing with computer data, i "know" the top layer, but i don't know the gritty part about it... so i'm asking, and i'd much rather have people who have mulled over it try and break it down for me than a textbook.

falconeye · 02-24-2008, 06:31 PM

Originally posted by Gooshin

people who have mulled over it try and break it down for me than a textbook.

Well, then start with this; others may follow up

- Compression reduces the size of data, as the term says.

- Compression may be lossless or not.

- Lossless compression does not destroy information, it only reformats the way data is represented inside a data container. The easy example is this:
Transform "00000011100000" -> "6x0,3x1,5x0" and you have compressed w/o a loss (this trick is known as run-length encoding). Lossless compression is the default for general-purpose tools; the bzip2 algorithm is very good at it.

Now a big IF

- If the meaning of the data is known (image, music, text etc.) one gets new options:

- Content-specific lossless (FLAC for music, PNG for images etc.)

- Compression with a loss of information (i.e., cannot be undone)
The easy example is this:
Transform "00000011100000" -> "000000111" if you know that the loss of trailing stuff may pass unnoticed. Examples are MP3 for music, MP4/AVC for movies, JPEG2000 for images etc. The more the algorithm knows about the content the better it can compress without a noticeable loss.
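The two toy transforms in the list above can be written out as a short Python sketch. The function names are made up for this example, and real codecs are far more elaborate; the point is only that the first transform can be undone exactly and the second cannot.

```python
from itertools import groupby

def rle_encode(bits: str) -> str:
    """Run-length encode a string, e.g. '0001' -> '3x0,1x1'."""
    return ",".join(f"{len(list(run))}x{symbol}" for symbol, run in groupby(bits))

def rle_decode(encoded: str) -> str:
    """Invert rle_encode exactly."""
    if not encoded:
        return ""
    pieces = []
    for token in encoded.split(","):
        count, symbol = token.split("x")
        pieces.append(symbol * int(count))
    return "".join(pieces)

def drop_trailing_zeros(bits: str) -> str:
    """Lossy toy transform: the number of trailing zeros is forgotten."""
    return bits.rstrip("0")

original = "00000011100000"
encoded = rle_encode(original)           # "6x0,3x1,5x0"
decoded = rle_decode(encoded)            # identical to original
truncated = drop_trailing_zeros(original)  # "000000111"

print(encoded, decoded == original, truncated)
```

Nothing in `truncated` records how many zeros were removed, so there is no way to write a matching decode function for it. That missing inverse is what "lossy" means.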

Sidenote:
There is a theorem saying that noise cannot be compressed (without a loss).
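The sidenote is easy to try out. The sketch below is only an illustration, with `zlib` chosen as a convenient general-purpose lossless compressor and `os.urandom` as a source of noise; the helper name is made up for this example.

```python
import os
import zlib

def compression_ratio(data: bytes) -> float:
    """Compressed size divided by original size (smaller is better)."""
    return len(zlib.compress(data, 9)) / len(data)

SIZE = 100_000
flat = bytes(SIZE)        # 100,000 zero bytes: extremely repetitive
noise = os.urandom(SIZE)  # 100,000 random bytes: no pattern to exploit

flat_ratio = compression_ratio(flat)
noise_ratio = compression_ratio(noise)
print(f"flat: {flat_ratio:.4f}  noise: {noise_ratio:.4f}")
```

The repetitive input shrinks to a tiny fraction of its size, while the random input does not shrink at all and typically comes out slightly larger because of the compressor's own bookkeeping. Sensor noise in a photo behaves similarly, which is one reason noisy images compress worse.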
