A few scanning tips
The most common image file formats, the most important for cameras, printing, scanning, and internet use, are JPG, TIF, PNG, and GIF.
Frankly, JPG is used when small file size is more important than maximum image quality (web pages, email, memory cards, etc). But JPG is good enough in many cases, if we don't overdo the compression, and perhaps good enough for some uses even if we do overdo it (web pages, etc). But if you are concerned with maximum quality for archiving your important images, then you do need to know two things: 1) always choose a higher JPG Quality setting and accept the larger file, and 2) do NOT keep editing and re-saving your JPG images repeatedly, because more quality is lost every time you save as JPG (in the form of added JPG artifacts: lossy compression changes pixels to colors they ought not to be). More at the JPG link at page bottom.
We could argue that there really is no concept of RAW files from the scanner. Vuescan does offer an output called RAW, which is 16 bits, includes the fourth Infrared noise correction channel data if any, and defers gamma correction. Vuescan itself is the only post-processor for these. But scanner color images are already RGB color, instead of Bayer pattern data like from cameras. Camera RAW images are not RGB (the meaning of RAW), and must be converted to RGB for any use.
||Photographic Images||Graphics, including Logos or Line art|
|Properties||Photos are continuous tones, 24-bit color or 8-bit Gray, no text, few lines and edges||Graphics are often solid colors, with few colors, up to 256 colors, with text or lines and sharp edges|
|For Unquestionable Best Quality||TIF or PNG (lossless compression and no JPG artifacts)||PNG or TIF (lossless compression, and no JPG artifacts)|
|Smallest File Size||JPG with a higher Quality factor can be decent.||TIF LZW or GIF or PNG (graphics/logos without gradients normally permit indexed color of 2 to 16 colors for smallest file size)|
|Compatibility (PC, Mac, Unix)||TIF or JPG||TIF or GIF|
|Worst Choice||256 color GIF is very limited color, and is a larger file than 24-bit JPG||JPG compression adds artifacts, smears text and lines and edges|
These are not the only choices, but they are good and reasonable choices.
Major considerations to choose the necessary file type include:
The only reason for using lossy compression is for smaller file size, usually due to internet transmission speed or storage space. Web pages require JPG or GIF or PNG image types, because some browsers do not show TIF files. On the web, JPG is the clear choice for photo images (smallest file, with image quality being less important than file size), and GIF is common for graphic images, but indexed color is not normally used for color photos (PNG can do either on the web).
Other than the web, TIF file format is the undisputed leader when best quality is desired, largely because TIF is so important in commercial printing environments. High Quality JPG can be pretty good too, but don't ruin them by making the files too small. If the goal is high quality, you don't want small. Choose the larger JPG file instead, and plan your work so that you save as JPG only one or two times. Adobe RGB color space may be OK for your home printer and profiles, but if you send your pictures out to be printed, the mass market printing labs normally accept only JPG files, and process only sRGB color space.
Photo images have continuous tones, meaning that adjacent pixels often have very similar colors, for example, a blue sky might have many shades of blue in it. Normally this is 24-bit RGB color, or 8-bit grayscale, and a typical color photo may contain perhaps a hundred thousand RGB colors, out of the possible set of 16 million colors in 24-bit RGB color.
Graphic images are normally not continuous tone (gradients are possible in graphics, but are seen less often). Graphics are drawings, not photos, and they use relatively few colors, maybe only two or three, often less than 16 colors in the entire image. In a color graphic cartoon, the entire sky will be only one shade of blue where a photo might have dozens of shades. A map for example is graphics, maybe 4 or 5 map colors plus 2 or 3 colors of text, plus blue water and white paper, often less than 16 colors overall. These few colors are well suited for Indexed Color, which can re-purify the colors. Don't cut your color count too short though - there will be more colors than you count. Every edge between two solid colors likely has maybe six shades of anti-aliasing smoothing the jaggies (examine it at maybe 500% size). Insufficient colors can rough up the edges. Scanners have three modes to create the image: color (for all color work), grayscale (like B&W photos), and lineart. Line art is a special case, only two colors (black or white, with no gray), for example clip art, fax, and of course text. Low resolution line art (like cartoons on the web) is often better as grayscale, to add anti-aliasing to hide the jaggies.
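The idea of Indexed Color can be sketched in a few lines of Python. This is only a minimal illustration, not how any particular photo editor does it: the function names `build_palette` and `bits_per_index` and the tiny sample row of pixels are hypothetical, made up for this example. It shows that one anti-aliasing shade on an edge already adds a color to the count, and how the color count sets the bits needed per pixel.

```python
def build_palette(pixels):
    """Return (palette, indexes) for a sequence of (r, g, b) tuples.

    The palette lists each distinct color once, in order of first use.
    The indexes replace every pixel with its position in the palette.
    """
    palette = []
    lookup = {}
    indexes = []
    for rgb in pixels:
        if rgb not in lookup:
            lookup[rgb] = len(palette)
            palette.append(rgb)
        indexes.append(lookup[rgb])
    return palette, indexes


def bits_per_index(color_count):
    """Smallest number of bits able to address color_count palette entries."""
    return max(1, (color_count - 1).bit_length())


# Hypothetical sample data: blue water, white paper, and one
# anti-aliasing shade where the two meet.
WHITE = (255, 255, 255)
BLUE = (0, 0, 255)
EDGE = (128, 128, 255)
row = [BLUE, BLUE, EDGE, WHITE, WHITE, EDGE, BLUE]

palette, indexes = build_palette(row)
print(len(palette), "colors,", bits_per_index(len(palette)), "bits per pixel")
print(indexes)
```

Two solid colors would fit in 1 bit per pixel, but the single edge shade makes three colors and pushes it to 2 bits. Real file formats may round this up to the bit depths they allow.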
JPG files are very small files for continuous tone photo images, but JPG is poor for graphics, without a high Quality setting. JPG requires 24-bit color or 8-bit grayscale, and the JPG artifacts are most noticeable in the hard edges of graphics or text. GIF files (and other indexed color files) are good for graphics, but are poor for photos (too few colors possible). However, graphics are normally not many colors anyway. Formats like TIF and PNG can be used either way, 24-bit or indexed color - these file types have different internal modes to accommodate either type optimally.
Something we all need to know, but it takes more to show this, so it was placed on its own page.
Our digital images are dimensioned in pixels (not bytes, and definitely not inches). And a pixel is simply a color definition, the color that this tiny dot of sampled image area ought to be. Put all those colored dots together, and our brain sees the image. The losses of image data we are speaking about are altered pixel colors.
Image data consists of pixels, and pixels are "colors", simply the storage of the three RGB data components (see What is a Digital Image Anyway?).
Any 24-bit RGB image will use three bytes per pixel (see Color Bit-Depth - Memory Size).
So, for example, any 10 megapixel camera image data will occupy 3x10 = 30 million bytes, by definition of RGB color. This number is the "data size" (when opened into computer memory for use). A TIF file will be near that size (and is lossless), but JPG is normally compressed very heavily (lossy, not lossless) to store in a JPG file of perhaps 1/10 this size (variable with JPG Quality setting), which is "file size" (not image size and not data size). This example image size is still 10 megapixels (dimensioned in pixels, width x height), and the data size is 30 million bytes, but the JPG file size might be 3 MB (lossy compression takes a few liberties). The image will still come out of the JPG file as the same 10 megapixels and the same 30 million bytes when the 3 MB JPG file is opened. We hope its quality also comes out about the same (the JPG losses are altered color values of some of the pixels).
Image size (pixels) determines how we can use the image - everything is about the pixels. See a summary of digital basics.
All photo editor programs will support these file formats, which will generally support and store images in the following color modes:
|Color data mode -bits per pixel|
|JPG||RGB - 24-bits (8-bit color),
Grayscale - 8-bits
JPEG always uses lossy JPG compression, but its degree is selectable, for higher quality and larger files, or lower quality and smaller files. JPG is for photo images, and is the worst possible choice for most graphics or text data.
|TIF||Versatile, many formats supported.
Mode: RGB or CMYK or LAB, and others, almost anything.
8 or 16-bits per color channel, called 8 or 16-bit "color" (24 or 48-bit RGB files).
Grayscale - 8 or 16-bits,
Indexed color - 1 to 8-bits,
Line Art (bilevel)- 1-bit
For TIF files, most programs allow either no compression or LZW compression (LZW is lossless, but is less effective for color images). Adobe Photoshop also provides JPG or ZIP compression in TIF files (but this greatly reduces third party compatibility of TIF files). "Document programs" allow CCITT G3 or G4 compression for 1-bit text (Fax is G3 or G4 TIF files), which is lossless and tremendously effective (small). Many specialized image file types (like camera RAW files) are TIF file format, but using special proprietary data tags.
24-bits is called 8-bit color, three 8-bit bytes for RGB (256x256x256 = 16.7 million colors maximum.)
|PNG||RGB - 24 or 48-bits (called 8-bit or 16-bit "color"),
Alpha channel for RGB transparency - 32 bits
Grayscale - 8 or 16-bits,
Indexed color - 1 to 8-bits,
Line Art (bilevel) - 1-bit
Supports transparency in regular indexed color, and also there can be a fourth channel (called Alpha) which can map RGB graduated transparency (by pixel location, instead of only one color, and graduated, instead of only on or off).
PNG also supports animation (like GIF), through the APNG extension, showing several sequential frames fast to simulate motion.
PNG uses ZIP compression which is lossless, and somewhat more effective color compression than TIF LZW. For photo data, PNG is somewhat smaller files than TIF LZW, but larger files than JPG (however PNG is lossless, and JPG is not.) PNG is a newer format than the others, designed to be both versatile and royalty free, back when the patent for LZW compression was disputed for GIF and TIF files.
|GIF||Indexed color - 1 to 8-bits (8-bit indexes, limiting to only 256 colors maximum.) Each palette color is 24-bit color, but the image can use only 256 of them.
One color in indexed color can be marked transparent, allowing the underlying background to be seen (very important for text, for example). GIF is an online video screen image; the file contains no dpi information for printing. It was designed by CompuServe for online images in the days of dialup and 8-bit indexed computer video, whereas other file formats can be 24-bits now. However, GIF is still great for web use of graphics containing only a few colors, when it is a small lossless file, much smaller and better than JPG for this.
GIF uses lossless LZW compression (for Indexed Color, see the second page at the GIF link at page bottom).
GIF also supports animation, showing several sequential frames fast to simulate motion.
Note that if your image size is say 3000x2000 pixels, then this is 3000x2000 = 6 million pixels (6 megapixels). Assuming this 6 megapixel image data is RGB color and 24-bits (or 3 bytes per pixel of RGB color information), then the size of this image data is 6 million x 3 bytes RGB = 18 million bytes. That is simply how large your image data is (see more). Then file compression like JPG or LZW can make the file smaller, but when you open the image in computer memory for use, the JPG may not still have the same image quality, but it is always still 3000x2000 pixels and 18 million bytes. This is simply how large your 6 megapixel RGB image data is (megapixels x 3 bytes per pixel).
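That arithmetic is simple enough to put in a few lines of Python. This is just a sketch of the calculation described above; the function name `image_data_size` is made up for this example, and the default of 3 bytes per pixel assumes 24-bit RGB.

```python
def image_data_size(width_px, height_px, bytes_per_pixel=3):
    """Uncompressed data size in bytes of an image opened in memory.

    bytes_per_pixel is 3 for 24-bit RGB, 1 for 8-bit grayscale,
    and 6 for 48-bit RGB (16 bits per channel).
    """
    return width_px * height_px * bytes_per_pixel


width, height = 3000, 2000
pixels = width * height
data_bytes = image_data_size(width, height)
print(f"{pixels / 1e6:.1f} megapixels, {data_bytes / 1e6:.1f} million bytes")
# prints: 6.0 megapixels, 18.0 million bytes
```

The file format never appears in this calculation. Compression changes only the size of the file on disk, not the size of the data once it is opened.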
The most common image file formats, the most important for general purposes today, are JPG, TIF, PNG and GIF. These are not the only choices of course, but they are good and reasonable choices for general purposes. Newer formats like JPEG 2000 never acquired popular usage, and are not widely supported by web browsers, and so are not the most compatible choice.
PNG and TIF LZW are lossless compression, so their file size reduction is not as extreme as the wild heroics JPG can dream up. In general, selecting lower JPG Quality gives a smaller file of worse quality, and higher JPG Quality gives a larger file of better quality. Your 12 megapixel RGB image data is three bytes per pixel, or 36 million bytes. That is simply how big your image data is. Your JPG file size might be only 5-20% of that, literally. TIF LZW might be 65-80%, and PNG might be 50-65% (very rough ballpark for 24-bit color images). We cannot predict sizes precisely because compression always varies with image detail. Blank areas, like sky and walls, compress much smaller than extremely detailed areas like a tree full of leaves. But the JPG file can be much smaller, because JPG is not required to recover the original image intact; losses are acceptable. In contrast, the only goal of PNG and TIF LZW is to be 100% lossless, which means the file is not as heroically small, but there is never any concern about compression quality with PNG or TIF LZW. They still do impressive amounts of file size compression; remember, the RGB image data is actually three bytes per pixel.
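Both points, that lossless means every byte comes back and that blank areas compress far better than detailed ones, can be tried with Python's standard `zlib` module, which implements the same Deflate method that PNG's ZIP compression is built on. This is only a rough sketch under loud assumptions: the "sky" and "leaves" data are synthetic stand-ins, not real photos (random bytes are the worst case, and real detail compresses better than that), and a real PNG file also filters the rows before compressing. The function name `deflate_ratio` is made up for this example.

```python
import random
import zlib


def deflate_ratio(data):
    """Compress with Deflate and return compressed size / original size."""
    packed = zlib.compress(data, 9)
    # Lossless: decompressing must give back exactly the original bytes.
    assert zlib.decompress(packed) == data
    return len(packed) / len(data)


# Assumption: 100,000 identical RGB pixels stand in for a blank blue sky.
flat_sky = bytes([90, 140, 230]) * 100_000

# Assumption: seeded random bytes stand in for extremely fine detail.
rng = random.Random(42)
busy_leaves = bytes(rng.randrange(256) for _ in range(300_000))

print(f"flat sky:    {deflate_ratio(flat_sky):.2%} of original size")
print(f"busy leaves: {deflate_ratio(busy_leaves):.2%} of original size")
```

The flat data shrinks to a tiny fraction of its size, while the random data does not shrink at all, yet both come back bit for bit identical. Real photos fall somewhere between these two extremes.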
Camera RAW files are one way to bypass this JPG issue, at least until the one final save as JPG when required. And RAW offers additional processing advantages too: the RAW software has better and easier tools than JPG editing has, and the RAW data has wider range than JPG has. These are much the same controls as in the camera, which you would have needed anyway, but this step is done after you see the camera results, so you know exactly what the image still needs, and can simply tweak and judge it by eye (as opposed to settings made in the camera in advance, as hopeful wishing).
We hear: But RAW images require an editing step first. Some people do seem terrified of the word "edit", but no matter what, we do always have to stop and look at our images on the computer, every one of them. That is the same extra step. Surely we have to crop them a bit, and resample smaller, and many of mine will need a slight Exposure or White Balance tweak to be their best. It makes a tremendous difference. That is the same editing, a few seconds each, a few clicks, and then the file must be saved again. You might as well do this step in the RAW software, which has better easier tools to do it, and more range to do it. If your session included 100 images of same lighting situation, just select them all, edit ONE of them (say White Balance and Exposure, even Cropping, etc), and the same edit clicks are applied to all of the selected RAW images in one click. Extremely convenient. And no JPG artifacts of course, no losses, and any changes can easily be Undone anytime later, with full recovery of our original RAW master copy. RAW is the trivial, easy, and good way, Day and Night good, if you care about these things.
We all have our own notions, but here is a popular opinion about the ultimate, in quality, in versatility, in convenience. RAW files are popular indeed, from most DSLR cameras. When we take any digital picture, the camera has a RAW sensor, but normally processes and outputs the image as a JPG file. But often we can choose to output the original RAW image instead, to defer that JPG step until later. We cannot view or use that RAW file any way other than to process it in computer software and then output a final TIF or JPG image, however postponing this processing offers a few serious advantages, better editing options, and we can bypass all JPG artifacts entirely, until the one final output Save for whatever purpose. RAW allows us to tweak exposure and color, and defer White Balance decisions until later when we can see the image first, and judge any trial results. The 12-bit RAW file offers greater range for any of our adjustments, often on multiple files simultaneously. And RAW always preserves the intact original version, so we can easily back out any editing changes we made, crop size for example. An argument is made that processing RAW requires this extra step, but of course, same is true of any editing that is required. RAW is the easy way, with the best results.
The Next button will browse through the descriptions on the next pages, or you can use these shortcut links directly:
|PNG Format||TIF Format||JPG Format||GIF Format|