A few scanning tips

www.scantips.com

Understanding File Types, Bit Depth, & Memory Cost of Images

Calculators below:

Image Size Goal for Desired Print Size

The Four Sizes of a Digital Image - How many bytes?

Convert Bytes to KB, MB, GB, and TB size numbers

Scanned Image Size

Large photo images consume much memory and can make our computers struggle. Uploads can be very slow. Memory cost for an image is computed from the image size. Our common 24-bit RGB image size is three bytes per pixel when uncompressed in memory (so 24 megapixels is x3 or 72,000,000 bytes, which is 68.7 MB uncompressed in memory, but can be smaller in a compressed file). Digital camera images today are typically much larger than most viewing or printing purposes can use (but the plentiful pixels offer advantages for the largest prints or more extreme cropping, etc).

One basic we need is the image size that has sufficient pixels to properly print a photo, which is this very simple calculation:

[Calculator: Image Size Goal for Desired Print Size. Inputs: print width x height (inches or mm) and dpi resolution.]

There is a larger dpi calculator that knows about scanning, printing, and enlargement.

This little calculator has these purposes:

File size is shown at The Four Sizes of a Digital Image below

Printing photos at 250 or 300 dpi is considered very desirable and optimum. But this dpi number does NOT need to be exact, 10% or 15% variation won't have great effect. Planning image size to have sufficient pixels to be somewhere around 240 to 300 pixels per inch is a very good thing for printing, called "photo quality". More than 300 dpi really can't help photo prints, but less than 200 dpi can suffer lower image quality. It's generally about what our eye is capable of seeing, but varies with the media. See a printing guideline for the resolution needed for several common purposes.

That's a pretty simple calculation. More pixels will work too (but are slow to upload, essentially wasted effort). The printer or the print lab will simply discard the excess, but far too few pixels can seriously limit the resolution and sharpness of the printed copy.
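As a minimal sketch of that calculation (the function names here are only illustrative, not taken from the calculator on this page), the pixels needed are just the print dimensions in inches multiplied by the dpi:

```python
def pixels_for_print(width_in, height_in, dpi=300):
    """Pixels needed to print width_in x height_in inches at the given dpi."""
    return round(width_in * dpi), round(height_in * dpi)


def mm_to_inches(mm):
    """Convert millimeters to inches (25.4 mm per inch)."""
    return mm / 25.4


print(pixels_for_print(10, 8, 300))   # (3000, 2400)
print(pixels_for_print(7, 5, 300))    # (2100, 1500)
# Paper size given in mm: 152.4 x 101.6 mm is 6 x 4 inches, here at 250 dpi
print(pixels_for_print(mm_to_inches(152.4), mm_to_inches(101.6), 250))
```

The default of 300 dpi is only a convenient assumption; anything around 240 to 300 is in the same "photo quality" range.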

Cropping Aspect Ratio to fit the paper size is an important concern too.

The memory cost for the initial default 8x10 inch color image is:

  3000 x 2400 pixels x 3 = 21.6 million bytes = 20.6 megabytes.

The last "× 3" is for 3 bytes of RGB color information per pixel for 24-bit color (3 RGB values per pixel, which is one 8-bit byte for each RGB value, which totals 24-bit color).
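The same arithmetic as a short sketch (the helper names are illustrative assumptions), for any pixel dimensions and assuming 24-bit RGB at 3 bytes per pixel:

```python
def data_size_bytes(width_px, height_px, bytes_per_pixel=3):
    """Uncompressed data size in bytes (3 bytes per pixel for 24-bit RGB)."""
    return width_px * height_px * bytes_per_pixel


def to_binary_mb(n_bytes):
    """Convert bytes to megabytes counted in units of 1024 x 1024."""
    return n_bytes / (1024 * 1024)


size = data_size_bytes(3000, 2400)
print(size)                             # 21600000
print(round(to_binary_mb(size), 1))     # 20.6

# The 24 megapixel example, 6000 x 4000 pixels
print(round(to_binary_mb(data_size_bytes(6000, 4000)), 1))   # 68.7
```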

But the compressed file will be smaller (maybe 10% of that size for JPG), selected by our choice for JPG Quality. But the smaller it is, the worse the image quality. The larger it is, the better the image quality. If uncompressed, the data is three bytes per pixel.

Data Compression and File Sizes

Image size is always dimensioned in pixels, for example 6000x4000 pixels, or 24 megapixels. Expressing Image Size in bytes of File Size is useless to describe how the image might be used, because images get variably compressed smaller for storage.

Data and File size is dimensioned in bytes, for example, 12 megabytes (but JPG is compressed smaller, not lossless).
24-bit RGB photo data is always 3 bytes per pixel (when uncompressed for use). 24-bit PNG is lossless for photos, and TIF with LZW compression is lossless too, but neither is near as small as JPG, because JPG takes liberties with the data in order to be smaller. Still, saving JPG as High Quality JPG is very adequate for viewing or printing, but the lossless files are much better ways to archive it.

Data is often compressed to smaller size for storage in the file (very radically smaller for JPG). It must be uncompressed for use, but lossless compression comes back out exactly as it went in.

Lossy compression can make small changes in the data values. Lossy compression cannot be used in programs like Quicken, Excel or Word, or in backup software, because we insist that every byte compressed come back out exactly as it went into the file. Anything else is corruption. However, image tonal values can be more forgiving in casual viewing, until the compression becomes excessive.

Image Data Compression is of two types, lossless or lossy.

Data compression while in the file varies data size too much for bytes to have specific meaning about image size. Say the size of our 24 megapixel image is 6000x4000 pixels. This "dimension in pixels" is the important parameter that tells us how we can use that image. The data size may be 72 MB (uncompressed, or maybe 12 MB or other numbers if compressed in a JPG file), but that file size doesn't tell us anything about the image size, only about storage space or internet speed. For example, normally we have a 24-bit color photo image which is 3 bytes of data per pixel when uncompressed (one byte each of RGB data). That means any 24 megapixel camera takes RGB images of data size 72 million bytes (the calculator below converts this to 68.7 MB, the data size before compression). However, data compression techniques can make this data smaller while stored in the file. In some cases drastically smaller, and maybe the 68.7 MB goes into perhaps a 4 to 16 MB file with JPG compression. We can't state any exact size numbers, because when creating the JPG file (in camera or in editor), we can select different JPG Quality settings. For this example 24 megapixel image, the JPG results might range from:

Of course, we do prefer higher quality. We do our photos no favor by choosing lower JPG quality. However, emailing grandma a picture of the kids doesn't need to be 24 megapixels. Maximum dimension of maybe 1000 pixels is reasonable for email, still large on the screen. Or even less if to a cell phone. Even printing 5x7 inches only needs 1500x2100 pixels. But this resample should be a COPY. Never overwrite your original image.
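A small sketch of choosing the smaller dimensions for such a copy. This is only the arithmetic (the function name and the 1000 pixel default are assumptions for illustration); the actual resample would be done in a photo editor, saved under a new file name:

```python
def fit_within(width_px, height_px, max_dim=1000):
    """Scale dimensions so the longest side is at most max_dim pixels,
    keeping the same aspect ratio. Never enlarges a smaller image."""
    longest = max(width_px, height_px)
    if longest <= max_dim:
        return width_px, height_px
    return (round(width_px * max_dim / longest),
            round(height_px * max_dim / longest))


print(fit_within(6000, 4000))        # (1000, 667)
print(fit_within(4000, 6000))        # (667, 1000)
print(fit_within(800, 600))          # (800, 600), already small enough
```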

JPG files made too small are certainly not a plus, larger is better image quality. Surely we want our camera images to be the best they can be. Also this compressed file size naturally varies some with image content too. Images containing much fine detail everywhere (a tree full of small leaves) will be a little larger, and images with much blank featureless content (walls or blue sky, etc.) will be noticeably smaller (better compressed). File sizes might vary over a 2:1 range due to extreme scene detail differences. But JPG files are typically 1/5 to 1/12 of the image data size (but other extremes do exist). Both larger and smaller are possible (an optional choice set by JPG Quality setting).

Then when the file is opened and the image data is uncompressed and shown, the image data comes back out of the file at its original size, with the original number of bytes and pixels when open in computer memory. Still the same pixel counts, but JPG Quality differences affect the color accuracy of some of the pixels (image detail is shown by the pixel colors). Bad compression effects can add visible JPG artifacts, which we can learn to see.

Best Safe Plan to Use JPG Images

Simply opening and viewing a JPG image does no more harm whatsoever, but editing and SAVING that JPG does do JPG compression again. And maybe again and again and again if that is your style of working.

The wise choice is to ALWAYS first archive and preserve the original pristine JPG image from the camera. Then whatever might happen, you still have the original. When edit or resize is needed, edit the image as desired, but never overwrite that archived original file. Only make another high quality JPG file COPY to use (with a different file name). Never overwrite your original archived file, it may be important to have later. The more important the image, the more important it is to preserve a copy of the pristine original image intact. There is no other way for a JPG file to go back.

And a second reason: Don't SAVE again any JPG copy additional times. I am suggesting Never, but at least extremely few times. Meaning if subsequent plans require SAVING yet another edit or resized image, NEVER start from that previously edited JPG file. This is because JPG lossy compression means it already has two sets of JPG artifacts in it, from first the camera and then the first edit (or crop or resize), so a third or fourth time won't help that. Treat a JPG copy as expendable, discard it when done with it. To make changes (editing or just resizing) START OVER from the archived unmodified original file. Because, each SAVE operation to a JPG file does the JPG compression again, on top of any previous Saves as JPG. Or the Easy and fail-safe Way, if your edit was extensive work (more than you want to repeat next time), or if maybe you are just not done yet with the edit that day, you could think ahead then to also save that work into a lossless file (TIF LZW or 24-bit PNG, which are lossless and will not add additional JPG artifacts). Then when your work is complete, then also save that finished image as a TIF LZW or PNG archive, and then have the choice to use it as a master version in the future, and make any subsequent JPG copy from it. Both TIF and PNG are much larger than a JPG, because they are lossless. PNG is slightly smaller than TIF, but also slightly slower to open, but these facts are not very important. This lossless SAVE will not remove any existing JPG artifacts in the image data, but it will not in itself add more.

Among the many advantages of RAW images is that they don't have this JPG artifact concern. The original RAW image is always automatically preserved (and no way is provided to change it). Also the list of past edit operations are automatically saved, and any new edit starts with the original RAW image and this list of past edits. Any new edit simply edits the saved list of edit operations, which is then saved for next time, and then can be followed by only the one first save of the JPG for use. This RAW procedure is called lossless editing (always starting from the preserved original full image and the saved list of past edits). RAW is a pretty big deal.

Photo programs differ in how they describe JPG Quality. The software has options about how it is done, and Quality 100 is arbitrary (Not a percentage of anything), and it NEVER means 100% Quality. It is always JPG. But Maximum JPG Quality at 100, and even Quality of 90 (or 9 on a ten scale) should be pretty decent. I usually use Adobe Quality 9 for JPG pictures to be printed, as "plenty good enough". Web pictures can be less quality, because file size is so important on the web, and they are only glanced at one time.

13 MB JPG from 68.7 MB data would be 19% original size (~1/5), and we'd expect fine quality (not exactly perfect, but extremely adequate, hard to fault).

6 MB JPG from 68.7 MB would be compression to 8% size (~1/12), and we would Not expect best quality. Possibly perhaps acceptable for some casual uses, like for the internet, but this much JPG compression would likely be bad news. Don't foolishly create your JPG to be small storage. It is wise to make them large for good quality. Use a High JPG Quality setting when you save a JPG.

Compromising small, down towards 1/10 size (10%) might be a typical and reasonable file size for JPG, except when we might prefer better results. We should realize too that images with many blank featureless areas like sky or blank smooth walls can compress exceptionally well, to less than 10%, which is not an issue in itself, but a number like 10% is just a very vague specification. File size is not the final criterion, we have to judge how the picture looks. We can learn to see and judge JPG artifacts. We would prefer not to see any of them in our images.

But there are downsides with JPG, because it is lossy compression, and image quality can be lost (not recoverable). The only way to recover is to discard the bad JPG copy and start over again from the pristine original camera image. Selecting higher JPG Quality is better image quality but a larger file size. Lower JPG Quality is a smaller file, but lower image quality. Don't cut off your nose to spite your face. Large is Good regarding JPG, the large one is still small. File size may matter when the file is stored, but image quality is important when we look at the image. Lower JPG quality causes JPG artifacts (lossy compression) which means the pixels may not all still be the same original color (image quality suffers from visible artifacts). There are the same original number of bytes and pixels when opened, but the original image quality may not be retained if JPG compression was too great. Most other types of file compression (including PNG and GIF and TIF LZW) are lossless, never any issue, but while impressive, they are not as dramatically effective (both vary greatly, perhaps 70% size instead of 10% size).

How many bytes? There are four sizes of a digital image.

Image Size is dimensioned in pixels, which is important to determine how the image might be used. The FIRST numbers you need to know about using a digital image are its dimensions in pixels (and the image size viewed on the monitor screen is also dimensioned in pixels).

Data Size is its uncompressed size in bytes when the file is opened into computer memory. If the usual 24-bit color image, that data size will be 3 bytes per pixel. If 24 megapixels, then 72 million bytes, which in binary megabytes (units of 1024x1024 bytes) is about 68.7 MB. Again, that is size in memory, and image data is usually compressed smaller while in the image file (like .JPG).

File Size is its size in bytes in a disk file (which is Not a meaningful number regarding how the image might be used, because image size is instead in pixels, not bytes). Data compression (such as JPG) can reduce the file size drastically, but image size in pixels and data size in bytes remain the original same when recovered into computer memory.

Print Size is its size when printed on paper (in inches or mm). The size of film is also inches or mm. Digital sensor size is mm, smaller, which must be enlarged more to the print or viewing size.

Again, image size on a monitor screen is still dimensioned in pixels (print paper is dimensioned in inches or mm, but video screens are dimensioned in pixels). If the image size is larger than the screen size, we normally are shown a temporary resampled smaller copy of more suitable smaller size.

The usual and most common type of color image (such as any JPG file) is the 24-bit RGB choice.

Calculate the Four Sizes of an Image

[Calculator: The Four Sizes of an Image. Inputs: image size either as width x height in pixels, or as megapixels and aspect ratio; the data type; an optional estimated Exif size (in bytes or KB); and the pixels per inch if printed. Outputs: Image Size, Data Size, File Size, Print Size.]

Disclaimer: Image Size is the actual size of binary image, in pixels. Data Size is the uncompressed data bytes for the image pixels when the file is opened into computer memory. These parts are known and simple, but there are also other factors.

Note that uncompressed 24-bit RGB data is always three bytes per pixel, regardless of image size. Color data in JPG files is 24-bit RGB. For example, an uncompressed 24 megapixel 6000x4000 pixel image is 6000x4000 x 3 = 72 million bytes, also 24 x 3, every time. That is its actual size in computer memory bytes when the file is opened. Fill in your own numbers, but converting to MB units is bytes divided by 1048576 (or just divide by 1024 twice) which converts units to 68.66 megabytes. The JPG files will vary in size, because JPG compression degree varies with scene detail level, and with the proper JPG Quality factor specified when writing the JPG.

Speaking of scene size variations, if you have several dozen JPG images from widely assorted random scenes, in one folder (but specifically, all written from one source at the same image size with same JPG settings), and then sorted by size, the largest and smallest file might often vary by 2:1 file size (possibly much more for extremes). Smooth areas of featureless detail (cloudless sky, smooth walls, etc) compress significantly smaller than a scene full of highly detailed areas (like many trees or many tree leaves for example). If a JPG in this 24 megapixel example is say 12.7 MB size, then (ignoring small Exif) it is 12.7 MB / 68.66 MB = 18.5% size of uncompressed, which is 1/0.185 = 5.4 : 1 size reduction. That would be a high quality JPG. But JPG file size does also vary with the degree of scene detail, so file size is not a hard answer of quality. See a sample of this JPG size variation. See more detail about pixels.
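The ratio arithmetic in that example can be sketched as below (the function name is an illustrative assumption, and it ignores the small Exif data, as the text does):

```python
def jpg_ratio(file_mb, width_px, height_px, bytes_per_pixel=3):
    """Return (fraction of uncompressed size, size reduction factor)
    for a JPG file of file_mb binary megabytes."""
    data_mb = width_px * height_px * bytes_per_pixel / (1024 * 1024)
    fraction = file_mb / data_mb
    return fraction, 1 / fraction


fraction, reduction = jpg_ratio(12.7, 6000, 4000)
print(f"{fraction:.1%}")          # 18.5%
print(f"{reduction:.1f} : 1")     # 5.4 : 1
```

The result describes only the file size, not how the picture looks, so it is a rough hint of JPG quality at best.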

Compatible File Types

Different color modes have different size data values, as shown.

| Image Type | Bytes per pixel | Possible color combinations | Compatible File Types |
|---|---|---|---|
| 1-bit Line art | 1/8 byte per pixel | 2 colors, 1 bit per pixel. One ink on white paper | TIF, PNG, GIF |
| 8-bit Indexed Color | Up to 1 byte per pixel if 256 colors | 256 colors maximum. For graphics use today | TIF, PNG8, GIF |
| 8-bit Grayscale | 1 byte per pixel | 256 shades of gray | Lossy: JPG. Lossless: TIF, PNG |
| 16-bit Grayscale | 2 bytes per pixel | 65536 shades of gray | TIF, PNG |
| 24-bit RGB (8-bit mode) | 3 bytes per pixel (one byte each for R, G, B) | Computes 16.77 million colors max. (normally well less than 1 million in use). 24-bits is the "Norm" for photo images, e.g., JPG | Lossy: JPG. Lossless: TIF, PNG24 |
| 32-bit CMYK | 4 bytes per pixel, for Prepress | Cyan, Magenta, Yellow and Black ink, typically in halftones | TIF |
| 48-bit RGB (16-bit mode) | 6 bytes per pixel | 281 trillion colors max. Except we don't have 16-bit display devices | TIF, PNG |

The number of color combinations are the "maximum possible" computed. The human eye is limited, and might be able to distinguish 1 to 3 million of the 16.77 million technically possible in 24-bit color. A typical real photo JPG image might have about 100K to 500K unique colors used.
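Those "maximum possible" numbers are simply 2 raised to the number of bits per pixel, and bytes per pixel is bits divided by 8. A brief sketch (the dictionary and function names are illustrative):

```python
# Bits per pixel for each image type in the table above
IMAGE_TYPES = {
    "1-bit line art": 1,
    "8-bit indexed": 8,
    "8-bit grayscale": 8,
    "16-bit grayscale": 16,
    "24-bit RGB": 24,
    "32-bit CMYK": 32,
    "48-bit RGB": 48,
}


def max_combinations(bits):
    """Number of distinct values that fit in the given number of bits."""
    return 2 ** bits


for name, bits in IMAGE_TYPES.items():
    print(f"{name:18s} {bits / 8:5.3f} bytes per pixel")

print(max_combinations(8))    # 256
print(max_combinations(16))   # 65536
print(max_combinations(24))   # 16777216
print(max_combinations(48))   # 281474976710656
```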

A few notes:

A few features of common file types
File PropertyJPG   TIF   PNG   GIF
Web pages can show itALL ALLALL
Uncompressed option Yes
Lossy compressionALL
Lossless compression YesALLALL
GrayscaleYesYesYesYes
8-bit RGB color (24-bits)ALLYesYes
16-bit RGB color (48-bits) YesYes
CMYK or LAB colorYes
Indexed color option YesYesALL
Transparency option YesYes
Animation option Yes

The term ALL means it is the only option. Yes means it is an available option. Blank means there is no option.

8-bits: As is common practice, there are often multiple definitions used for the same words, with different meanings: 8-bits is one of those.

In RGB images - 8-bit "mode" means three 8-bit channels of RGB data, also called 24-bit "color depth" data. This is three 8-bit channels, one byte for each of the R or G or B components, which is 3 bytes per pixel, 24-bit color, and up to 16.7 million possible color combinations (256 x 256 x 256). Our monitors or printers are 8-bit devices, meaning 24-bit color. 24-bits is very good for photos.

In Grayscale images (B&W photos), the pixel values are one channel of 8-bit data, of single numbers representing a shade of gray from black (0) to white (255).

Indexed color: Typically used for graphics containing relatively few colors (more than 256 colors is not possible in 8-bit indexed color, but graphics likely have only 4 or 8 or 16 colors). All GIF and PNG8 files are indexed color, and indexed is an option in TIF. These indexed files include a color palette (which is just a list of the actual RGB colors). An 8-bit index is 2^8 = 256 values of 0..255, which indexes into a 256 color palette. Or a 3-bit index is 2^3 = 8 values of 0..7, which indexes into an 8 color palette. The actual pixel data is this index number into that limited palette of colors. For example, the pixel data might say "use color number 3", so the pixel color comes from the palette color number 3, which could be any 24-bit RGB color stored there. The editor creating the indexed file rounds all image colors into the closest values of just this limited number of possible palette values. The indexed pixel data is most commonly still one byte per pixel before compression, but if the bytes only contain these small index numbers for say 4-bit 16 colors, compression (lossless) can do awesome size reductions in the file. Being limited to only 256 colors is not good for photo images, which normally contain maybe up to 500K colors. But a graphic of maybe 8 or 16 colors is a very small indexed color file, and indexed color is very suitable for graphics. More on Indexed color.
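A toy sketch of the idea, assuming a made-up 4 color palette and a simple "closest RGB value" match (real editors may choose the palette and do the matching differently, often with dithering):

```python
# Hypothetical 4-color palette: each entry is an actual 24-bit RGB color
PALETTE = [(0, 0, 0), (255, 255, 255), (255, 0, 0), (0, 0, 255)]


def nearest_index(rgb, palette):
    """Index of the palette color closest to rgb (squared RGB distance)."""
    def dist2(color):
        return sum((a - b) ** 2 for a, b in zip(rgb, color))
    return min(range(len(palette)), key=lambda i: dist2(palette[i]))


pixels = [(250, 10, 5), (3, 3, 3), (10, 20, 240)]

# The stored pixel data is only the small index numbers
indexed = [nearest_index(p, PALETTE) for p in pixels]
print(indexed)                          # [2, 0, 3]

# Showing the image looks each index up in the palette again
print([PALETTE[i] for i in indexed])    # [(255, 0, 0), (0, 0, 0), (0, 0, 255)]
```

Note the original pixel colors were rounded to palette colors, which is why a small palette suits graphics but not photos.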

The first Microsoft DOS had only 8 colors, plus a low intensity option for each, a total of 16 colors. There were no computer images yet. Then later, 8-bit indexed color was in general use (it was all there was) before our current 24-bit color hardware became available. A note from history: we might still see old mentions of "web safe colors". This wasn't about security, this was back in the day when our 8-bit monitors could only show 256 indexed colors. The "web safe" standard palette was six specific shades of each of R, G, B (6x6x6 = 216 colors), plus 40 system colors that the OS might use. These colors would be rendered correctly, any others were just hopefully the nearest match. Having two indexed images (not of the standard web-safe palette) on the same screen was general confusion, since one palette would be used by both (not a problem today). The term "Web-safe" is obsolete now, every RGB color is "safe" for 24-bit color systems today. Today, 24-bit RGB color shows 256 shades of each of red, green, and blue, which is 256x256x256 = 16.78 million possible color combinations. But most actual photo images contain well less than 1 million colors.

Line Art (also called Bilevel) is two colors, normally black ink dots on white paper (the printing press could use a different color of ink or paper, but your home printer will only use black ink). Line art is packed bits (each 0 or 1 for black or white) and is not indexed. It is not the same as 1-bit Indexed (2 colors, index also 0 or 1), which can be any two colors from a palette. Scanners have three standard scan modes, Line art, Grayscale, or Color mode (they may call them these names, or some (HP) may call them B&W mode and B&W Photo mode and Color, same thing). Line art is the smallest, simplest, oldest image type, 1 bit per pixel, where each pixel is simply either a 0 or a 1. Examples are that fax is line art, sheet music would be best as line art, and printed text pages are normally best scanned in line art mode (except for any photo images on the same page). The name comes from line drawings such as newspaper cartoons, which are normally line art (color may be added today inside the black lines, like a kid's coloring book).

We routinely scan color work at 300 dpi, but line art has sharper lines if created at 600 dpi, or commercially even 1200 dpi if you have some way to print that (the high resolution works because it's only one ink dot, there are no color dots that have to be dithered from multiple inks). Line art also makes very small files (and more so if compressed). Line art is great stuff when applicable, the obvious first choice for these special cases.

Line art mode in Photoshop is reached at Image - Mode - BitMap, where it won't say line art, but line art is created by selecting 50% Threshold there in BitMap (the image has to already be grayscale to reach BitMap). BitMap there is actually for halftones, except selecting 50% Threshold there means all tones darker than middle will simply be black, and all tones lighter than middle will be white, which is line art. Two colors, black and white.
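A minimal sketch of a 50% threshold and of packing the resulting bits 8 pixels per byte. The threshold value of 128, the choice of 1 for white, and the most-significant-bit-first packing are assumptions for illustration; actual file formats define their own conventions:

```python
def to_line_art(gray_rows, threshold=128):
    """Map 8-bit gray values to 0 (black) or 1 (white) at the threshold."""
    return [[1 if v >= threshold else 0 for v in row] for row in gray_rows]


def pack_row(bits):
    """Pack a list of 0/1 pixels into bytes, 8 pixels per byte, first pixel
    in the highest bit. A short final group is padded with zeros."""
    out = bytearray()
    for i in range(0, len(bits), 8):
        chunk = bits[i:i + 8]
        chunk = chunk + [0] * (8 - len(chunk))
        byte = 0
        for b in chunk:
            byte = (byte << 1) | b
        out.append(byte)
    return bytes(out)


row = [0, 50, 127, 128, 200, 255, 10, 240]   # eight grayscale pixels
bits = to_line_art([row])[0]
print(bits)               # [0, 0, 0, 1, 1, 1, 0, 1]
print(pack_row(bits))     # b'\x1d'
```

Eight grayscale bytes became one line art byte, which is the 1/8 byte per pixel shown in the table.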

One MB is a Little More Than One Million Bytes

The memory size of images is often shown in megabytes. You may notice a little discrepancy from the number you calculate from pixels with WxHx3 bytes. This is because (as regarding memory sizes) "megabytes" and "millions of bytes" are not quite the same units.

Memory sizes in terms like KB, MB, GB, and TB count in units of 1024 bytes for binary K, whereas humans count thousands in units of 1000 for decimal K.

A million is 1000x1000 = 1,000,000, powers of 10, or 10^6, and the standard International System of Units (SI) defines the prefix kilo as 1000 (10^3). Per this definition, one kilo is 1000. However binary units were used for memory sizes, powers of 2, where one kilobyte is 1024 bytes, and one megabyte is 1024x1024 = 1,048,576 bytes, or 2^20. So a number like 10 million bytes is 10,000,000 / (1024x1024) = 9.54 megabytes. One binary megabyte holds 4.86% more bytes than one million (1024x1024/1,000,000 = 1.048576), so the same data is about 4.6% fewer megabytes than millions of bytes.
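The numbers in that paragraph can be checked with a few lines (the constant name MIB is just an illustrative label for 1024 x 1024):

```python
MIB = 1024 * 1024          # bytes in one binary megabyte: 1,048,576

# 10 million bytes expressed in binary megabytes
print(round(10_000_000 / MIB, 2))                # 9.54

# How many more bytes a binary megabyte holds than one million, in percent
print(round((MIB / 1_000_000 - 1) * 100, 2))     # 4.86

# How much smaller the megabyte count is than the count of millions, in percent
print(round((1 - 1_000_000 / MIB) * 100, 2))     # 4.63
```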

Converting Bytes to KB, MB, GB, TB Size Units of Memory

[Calculator: Convert Bytes, KB, MB, GB, TB. Type a value into any one of the Bytes, KB, MB, GB or TB fields and click that Units button to convert to the other units. Two modes: memory sizes in units of 1024, or megapixel and disk sizes in units of 1000.]

If changing mode between 1024 (2^10) and 1000 (10^3) units, it will retain and use the previous K choice.

If you might see a format like an "e-7" in a result, it just means to move the decimal point 7 places to the left (or e+7, move to right). Example: 9.53e-7 is 0.000000953

Any computed fractional bytes are rounded to whole bytes. In binary mode, each line in the calculator is 1024 times the line below it (powers of 2). Which is binary, and is how memory computes byte addresses. However humans normally use 1000 units for their stuff (powers of 10). To be very clear:

Binary powers of 2 are 1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024 ... which is 2 to the power of 0, 1, 2, 3, 4, 5, etc.

Decimal powers of 10 are 1, 10, 100, 1000, 10000, 100000 ... which is 10 to the power of 0, 1, 2, 3, 4, 5, etc.

Specifically, megapixels and the GB or TB hard disk drive we buy are correctly dimensioned in 1000 units, and a 500 GB drive is correctly 500,000,000,000 bytes. However when we format the drive, Windows then shows 1024 units, calling it 465.7 GB, which is exactly the same number of bytes either way. Memory chips (including SSD and camera cards and USB sticks) necessarily use 1024 units. File sizes do not need 1024 units, however it has been common practice anyway. Windows may show file size either way, depending on location. Windows File Explorer normally shows binary KB, but CMD DIR shows actual decimal bytes, so the two do not agree unless you know the rules and do the conversion.

Convert with direct math
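| From \ To | B | KB | MB | GB | TB |
|---|---|---|---|---|---|
| B | - | /1024 | /1024 2 times | /1024 3 times | /1024 4 times |
| KB | x1024 | - | /1024 | /1024 2 times | /1024 3 times |
| MB | x1024 2 times | x1024 | - | /1024 | /1024 2 times |
| GB | x1024 3 times | x1024 2 times | x1024 | - | /1024 |
| TB | x1024 4 times | x1024 3 times | x1024 2 times | x1024 | - |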

The math is easy to do directly. Conversion goes from left to right in the table. If you want to convert bytes to MB, Bytes to MB is two steps right in the list (B, KB, MB, GB, TB), so just divide bytes by 1024 twice to get MB. Or divide three times for GB.

Example:
3 GB = 3×1024×1024 = 3,145,728 KB
( x 1024 two times for GB to KB)
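The "count the steps in the list" rule can be sketched as one small function (the name convert and the k parameter are illustrative assumptions; k=1000 gives the decimal units instead):

```python
UNITS = ["B", "KB", "MB", "GB", "TB"]


def convert(value, from_unit, to_unit, k=1024):
    """Convert between size units by counting steps in the UNITS list.
    Each step to the right divides by k, each step to the left multiplies."""
    steps = UNITS.index(to_unit) - UNITS.index(from_unit)
    if steps >= 0:
        return value / k ** steps
    return value * k ** (-steps)


print(convert(3, "GB", "KB"))                      # 3145728
print(round(convert(72_000_000, "B", "MB"), 2))    # 68.66
print(convert(2_000_000_000_000, "B", "TB", k=1000))   # 2.0
```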

We also see units of Mb as a rate of bandwidth. Small b is bits, as in Mb/second of bandwidth. Capital B is bytes of data, as in MB size. Bandwidth uses decimal units, powers of 10. There are eight bits per byte, so Mb = MB x 8.
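As a rough sketch of what that means for transfer time (illustrative function names, and an idealized estimate that ignores protocol overhead and real network conditions):

```python
def megabytes_to_megabits(mb):
    """Eight bits per byte, so Mb = MB x 8."""
    return mb * 8


def upload_seconds(file_bytes, megabits_per_second):
    """Ideal transfer time, with the rate in decimal megabits per second."""
    bits = file_bytes * 8
    return bits / (megabits_per_second * 1_000_000)


print(megabytes_to_megabits(12))         # 96
print(upload_seconds(13_000_000, 10))    # 10.4
```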

About Megabyte and Megapixel numbers

Humans count in decimal units of 10 or 1000 (which is 10^3), but binary units are powers of 2 (like 1024, which is 2^10). But binary units are indeed necessary for memory chips, including SSD and flash drives. These are different numbers.

Because every memory chip address line to select a byte can have two values, 0 and 1, the total byte count of the memory chip must be a power of 2 (for example 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024, etc). But then computer operating systems arbitrarily got the notion to also use 1024 units for file sizes, which is not necessary for file sizes, and it just confuses most humans. 😊 All other human counting uses normal decimal 1000 units (powers of 10 instead of binary 2).

Specifically, specifications for megapixels in digital images, and hard disk drive size in gigabytes are both advertised as multiples of decimal thousands (meaning NOT 1024). So millions are 1000x1000. Or giga is 1000x1000x1000. Same way as humans count. That is the existing definition of kilo, mega, giga, and tera. The calculator offers a mode for units of 1000 to make the point about the difference. That 1000 is a smaller unit than 1024, therefore there are fewer memory units of KB, MB and GB, each of which holds more bytes than 1000 units do. The same number of bytes just has different counting units. Thousands is just how humans count (in powers of 10), and million IS THE DEFINITION of Mega.

However, after formatting the disk, the computer operating system has notions to count it in binary 1024 units. There's no good reason for doing that on hard drives, it is merely a complication. The disk manufacturer DOES advertise the size correctly as decimal (like humans count), and formatting does NOT make the disk smaller, the computer just changes the units (in computer lingo, 1K became counted as 1024 bytes instead of 1000 bytes). So the size is a smaller number when said in the larger binary GB unit than in the decimal GB unit. This is why we buy a 500 GB hard disk drive (sold as 1000's, the actual real count, the decimal way humans count), and it does mean exactly 500,000,000,000 bytes (500 billion), and we do get them all. But then we format it, and then we see it said to be 465 gigabytes of binary file space (using units of 1024). Both numbering systems are numerically correct in their own way.

An actual 2 TB hard disk is sold as 2,000,000,000,000 decimal bytes (two trillion), but the exact same number of bytes is shown as 1.819 TB binary when formatted in the computer operating system. Still the same exact number of bytes either way. But users who don't understand this numbering system switch might assume the disk manufacturer cheated them somehow. Instead, no, not at all, they know how to count, and you got the honest count. The disk is just counted as 2,000,000,000,000 decimal, the same way as we humans count. No crime in that, tera does in fact mean trillion (10^12), and we do count in decimal (powers of 10 instead of 2). It is the operating system that confuses us, calling tera units something different, as powers of 2 (2^40 is approximately 1.1 trillion).

So again, note that a 2 TB hard disk drive does actually have 2,000,000,000,000 bytes as claimed. But then our operating system converts it to specify it as 1.818989403546 TB (binary, which does Not seem useful to me, because it really does have exactly 2 TB of bytes, 2,000,000,000,000, in the way humans count in powers of 10). Powers of 10 is also true of camera megapixels, which also have no need to use the binary counting system (megapixels are NOT binary powers of 2). A camera sensor of 6000×4000 pixels is in fact exactly 24 megapixels. The 24 megapixels really is the true 24,000,000 pixels (6000 × 4000). Neither megapixels nor hard disk sizes are in binary units. But all memory chips are, including SSD and flash drives.
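A short sketch showing the same byte count reported both ways (the function name human_size is an illustrative assumption, not how any particular operating system does it):

```python
def human_size(n_bytes, k=1024):
    """Return (value, unit), stepping up through the units in multiples of k.
    k=1024 is the binary count, k=1000 is the decimal count."""
    units = ["B", "KB", "MB", "GB", "TB"]
    value = float(n_bytes)
    i = 0
    while value >= k and i < len(units) - 1:
        value /= k
        i += 1
    return value, units[i]


for n_bytes in (500_000_000_000, 2_000_000_000_000):
    dec_value, dec_unit = human_size(n_bytes, k=1000)
    bin_value, bin_unit = human_size(n_bytes, k=1024)
    print(f"{n_bytes:,} bytes = {dec_value:.1f} {dec_unit} decimal"
          f" = {bin_value:.3f} {bin_unit} binary")
```

This prints 500.0 GB decimal against 465.661 GB binary, and 2.0 TB decimal against 1.819 TB binary, for exactly the same bytes.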

So kilo, mega, giga and tera terms were defined as powers of 10, but were corrupted to have two meanings. Computers used those existing terms with different meanings for memory sizes. Instead of human units like 10, 100, 1000, etc, units become like 128, 256, 512, 1024, 4096, etc. Memory chips necessarily must use the binary counting system, but it is not necessary for hard disks or disk files (even if the operating system insists on using it anyway). The Kilo, Mega, Giga and Tera prefixes mean, and always have meant, decimal units of 1000. With the goal of preserving their actual decimal meanings, new binary prefixes Ki, Mi and Gi (kibi, mebi, gibi) were defined by the IEC in 1998 for the binary power units, same numeric units, but they have not caught on, and are little used. So, this is still a complication today. Memory chips are binary, but there is absolutely no reason why our computer operating system still does this regarding file sizes. Humans count in decimal powers of 10, and so do the hard disk manufacturers counting disk bytes and camera megapixels counting pixels. But memory devices use 1024.

However, memory chips (also including SSD and camera memory cards and USB flash sticks, which are all memory chips) are different, and their construction requires using binary kilobytes (counting in 1024 units) or megabytes (1024x1024) or gigabytes (1024x1024x1024). This is because each added address line exactly doubles the size (powers of 2). Example: four address lines form a 4-bit number counting up to 1111 binary, which is 15 decimal (the maximum value that can be stored in 4 bits), which therefore can address 16 locations of memory (0 to 15). Or 8 bits counts 256 values, or 16 bits addresses 65536 values. So if the memory chip has N address lines, it necessarily provides 2^N bytes of memory. That's why memory size is dimensioned in units of 1024 bytes (2^10) for what we call a 1K step. When two of these 1K chips are connected together, the plan is that they count up to 2x or 2048 bytes. But if each implemented only 1000 bytes, that leaves a missing 24 byte gap between them, where memory addressing would fail.

So there are necessary technical reasons for memory chips to use binary numbers, because each address bit is a power of two. The sequence 1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024 makes it extremely impractical (simply unthinkable) to build a 1000 byte memory chip in a chip that counts in multiples of 1024. It simply would not come out even. The binary address lines count 0 to 1023, so it is necessary to add the other 24 bytes to fill it up. By totally filling the memory chip address lines, we can connect a few chips in series and have continuous larger memory. However, leaving any gaps in the addressing would totally ruin it (simply unusable bad byte values there), so it is never done (unthinkable).
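The addressing argument above can be sketched in a few lines of Python. This is only an illustration under simple assumptions (one byte per address, each chip occupying a full 2^N slot of the address space), and the function names are made up for this example.

```python
def addressable_bytes(address_lines):
    """N address lines select 2**N locations, numbered 0 .. 2**N - 1."""
    return 1 << address_lines


def unmapped_addresses(chip_capacity, address_lines, chips):
    """Addresses with no byte behind them, if each chip holds only
    `chip_capacity` bytes but occupies a full 2**address_lines slot."""
    slot = addressable_bytes(address_lines)
    gaps = []
    for chip in range(chips):
        base = chip * slot  # first address decoded by this chip
        gaps.extend(range(base + chip_capacity, base + slot))
    return gaps


for lines in (4, 8, 10, 16):
    print(lines, "address lines ->", addressable_bytes(lines), "locations")

# Two hypothetical "1000 byte" chips on 10 address lines each
gaps = unmapped_addresses(chip_capacity=1000, address_lines=10, chips=2)
print(len(gaps), "unmapped addresses, the first run is", gaps[0], "to", gaps[23])
```

With full 1024 byte chips the list of gaps is empty, which is the whole point: only a power of 2 fills the address lines exactly.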

In the early days, memory chips were very small, and it was a concern if they could hold the size of one specific file. Describing these files in binary terms to match the memory chip was useful then to know if it would fit. However, there is no good reason for file sizes in binary today. Files are just a sequential string of bytes, which can be any total number. But memory chip size must be a binary power of 2, to match the address lines. The memory chip arrays today likely hold gigabytes and thousands of files. So it is no longer important to know an exact binary count for a file, and counting files in binary is a useless complication now. It would be much more practical to just know the actual byte size of a gigabyte. Nevertheless, counting in binary 1024 units is still commonly done by the operating system on files too. If we did have a file of actual size exactly 200,000 bytes (base 10), the computer operating system will call it 195.3 KB (base 2). That procedure seems pointless.
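The two examples so far (the 2 TB disk and the 200,000 byte file) can be checked with a small Python sketch. The function names are hypothetical, and the KiB/MiB/GiB/TiB labels are simply the 1998 binary prefixes used here to keep the two counting systems apart. Operating systems often print the binary number with the plain KB or TB label instead.

```python
DECIMAL_UNITS = ["bytes", "KB", "MB", "GB", "TB"]      # steps of 1000
BINARY_UNITS = ["bytes", "KiB", "MiB", "GiB", "TiB"]   # steps of 1024


def scale_bytes(n_bytes, step, units):
    """Divide by `step` until the value is below one step (or units run out)."""
    value = float(n_bytes)
    index = 0
    while value >= step and index < len(units) - 1:
        value /= step
        index += 1
    return value, units[index]


def describe(n_bytes):
    """Show one byte count in both counting systems."""
    dec_value, dec_unit = scale_bytes(n_bytes, 1000, DECIMAL_UNITS)
    bin_value, bin_unit = scale_bytes(n_bytes, 1024, BINARY_UNITS)
    return (f"{n_bytes:,} bytes = {dec_value:.3f} {dec_unit} (decimal)"
            f" = {bin_value:.3f} {bin_unit} (binary)")


print(describe(2_000_000_000_000))  # the 2 TB disk, about 1.819 in binary units
print(describe(200_000))            # the 200,000 byte file, about 195.3 in binary units
```

The byte count never changes, only the divisor does.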

In base 10, we know the largest numeric value we can represent in 3 digits is 999. That's 9 + 90 + 900 = 999, and including 0 makes the range 0..999, a maximum of 10^3 = 1000 values. Binary base 2 works the same way: the largest number that can be stored in 8 bits is 255, because 2^8 = 256 (which is 256 values stored as 0..255, in 8 bits). So 1 + 2 + 4 + 8 + 16 + 32 + 64 + 128 = 255. And 16 bits can contain addresses 0..65535, because 2^16 = 65536, which is values 0..65535.
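The same place-value sums can be written as a short Python sketch that works for any base. The function names are only for illustration.

```python
def max_value(base, digits):
    """Largest number in `digits` positions: every position holds base - 1."""
    return sum((base - 1) * base**position for position in range(digits))


def value_count(base, digits):
    """How many distinct values fit, counting zero."""
    return base**digits


print(max_value(10, 3), value_count(10, 3))   # 999 1000
print(max_value(2, 8), value_count(2, 8))     # 255 256
print(max_value(2, 16), value_count(2, 16))   # 65535 65536
```

In every base the maximum is one less than the count, because zero uses up one of the values.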

Units of 10 or 1000 are extremely handy for humans, we can convert KB and MB and GB in our head, by simply moving the decimal point. Units of 1024 are not so easy, but they came about in the early computer days, back when 1024 bytes was a pretty large memory chip. Historically, we had to count bytes precisely to ensure the data would fit into the chip, and the 1024 number was important for programmers. Not so true today, chips are huge, and exact counts are unimportant now. Hard drives dimension size in units of 1000, but our operating systems still like to convert file sizes to 1024 units. There is no valid reason why today...

But it reminds me of a memory. As a computer programmer decades ago, I had the job of modifying a computer's boot loader in a 256 byte PROM (1/4 of 1K binary, so 256 instead of 250). It was used with an 8080 chip in factory test stations that booted from a console cassette tape, and I had to add an option for booting from a central computer disk if it was present. I added the code, but then it was too large. After my best tries, the change was still 257 bytes, simply one byte too large to fit in the 256 byte PROM chip. It took some dirty tricks to make it fit and work. So exact memory size was very important in the earliest days (of tiny memory chips), but today, our computers have several GB of memory and terabytes of disk storage, and the exact precise file sizes really matter little. Interesting color maybe, at least for me. 😊

The definition of the unit prefix "Mega" absolutely has always meant millions (decimal factors of 1000x1000), and it still does mean units of 1000. It does NOT mean 1024 units (except it is of course used that way too). Memory chips are necessarily dimensioned in binary units (factors of 1024), and years ago they simply and incorrectly appropriated the terms kilo and mega... so that's special, but we do use it that way. In the early days, when memory chips were tiny, it was useful to think of file sizes in binary, when they had to fit. Since then though, chips have become huge, and files can be relatively huge too, and we don't sweat a few bytes now.

Note that you may see different numbers in different units for the same file size dimension:

Scanning Size calculator

Scanning any 6x4 inch photo will occupy the amounts of memory shown in the table below. I hope you realize that extreme resolution rapidly becomes impossible.

You may enter another resolution and scan size here, and it will also be calculated on the last line of the chart below. Seeing a result of NaN means that some input was Not a Number.

[Calculator inputs: scan size (width by height, in inches or cm) and scan resolution (dpi). The result is shown on the last row of the table below.]

When people ask how to fix memory errors when scanning photos or documents at 9600 dpi, the answer is "don't do that" if you don't have 8 gigabytes of memory, and a 9600 dpi scanner, and have a special reason. It is normally correct to scan at 300 dpi to reprint at original size (600 dpi can help line art scans, but normally not if color or grayscale photos).
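To see why 9600 dpi is a problem, here is a rough Python sketch of the memory cost of a scan. It assumes uncompressed 24-bit RGB at three bytes per pixel, and the function name and the list of resolutions are only for illustration.

```python
BYTES_PER_PIXEL = 3  # assumption: uncompressed 24-bit RGB


def scan_size(width_in, height_in, dpi, bytes_per_pixel=BYTES_PER_PIXEL):
    """Return (width in pixels, height in pixels, bytes in memory) for a scan."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    return width_px, height_px, width_px * height_px * bytes_per_pixel


for dpi in (300, 600, 1200, 9600):
    w, h, n = scan_size(6, 4, dpi)
    print(f"{dpi:>5} dpi: {w} x {h} pixels, {n:,} bytes "
          f"({n / 1024**3:.2f} GB in binary units)")
```

The 6x4 inch photo at 300 dpi is 1800 x 1200 pixels and 6,480,000 bytes. At 9600 dpi it is 57600 x 38400 pixels and 6,635,520,000 bytes, for one image, before the scanning software makes any working copies.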

Saying that again (this is about a common first-time error):

Scanning a 35 mm slide to print at 8x10 inches is very roughly 9x enlargement (approximate, allowing for very slight cropping).
The goal is that to print 8x10 inches at 300 dpi needs 2400x3000 pixels.
Two scanning methods work. Both examples here will scan at 2700 dpi:

The scan Input size is the 35 mm film size. The Output size is the print 8x10 inches.
You mark the Input size on the scanner preview with the mouse.

The pixels are the same either way (A or B), about 2400 x 3000 pixels. If sending it out with instruction to print 8x10, it will be 8x10 either way. You do need sufficient pixels (reasonably close), but it need not be precisely 300 dpi, most shops will probably print at 250 dpi anyway.

There are two points here:

Notice that when you increase resolution, the size formula above multiplies the memory cost by that resolution number twice, in both width and height. The memory cost for an image increases as the square of the resolution. The square of say 300 dpi is a pretty large number (more than double the square of 200).

Scan resolution and print resolution are two very different things. The idea is that we might scan about 1x1 inch of film at say 2400 dpi, and then print it 8x size at 300 dpi at 8x8 inches. We always want to print photos at about 300 dpi, greater scan resolution is only for enlargement purposes.
The enlargement factor is Scanning resolution / printing resolution. A scan at 600 dpi will print 2x size at 300 dpi.
Emphasizing, unless it is small film to be enlarged, you do not want a high resolution scan of letter size paper. You may want a 300 dpi scan to reprint it at original size.
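The relation between scan resolution, print resolution and enlargement is simple arithmetic, sketched here in Python. The function names are made up for this example, and 300 dpi is used as the default printing resolution only because that is the target discussed above.

```python
def pixels_for_print(width_in, height_in, print_dpi=300):
    """Pixels needed to print the given size at print_dpi."""
    return round(width_in * print_dpi), round(height_in * print_dpi)


def enlargement(scan_dpi, print_dpi=300):
    """Enlargement factor = scanning resolution / printing resolution."""
    return scan_dpi / print_dpi


def scan_dpi_needed(enlargement_factor, print_dpi=300):
    """Scan resolution that gives the wanted enlargement at print_dpi."""
    return enlargement_factor * print_dpi


print(pixels_for_print(8, 10))  # (2400, 3000)
print(enlargement(600))         # 2.0
print(enlargement(2400))        # 8.0
print(enlargement(2700))        # 9.0
print(scan_dpi_needed(1))       # 300, reprint at original size
```

So a letter size page that will be reprinted at original size needs only about 300 dpi, and the high numbers are only for small film.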

When we double the scan resolution, memory cost goes up 4 times. Multiply resolution by 3 and the memory cost increases 9 times, etc. So this seems a very clear argument to use only the amount of resolution we actually need to improve the image results for the job purpose. More than that is waste. It's often even painful. Well, virtual pain.  😊
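As a quick check of that square rule, a minimal Python sketch (the function name is hypothetical):

```python
def memory_cost_ratio(old_dpi, new_dpi):
    """Memory grows with the square of resolution: both width and height scale."""
    return (new_dpi / old_dpi) ** 2


print(memory_cost_ratio(300, 600))  # 4.0, double the resolution
print(memory_cost_ratio(300, 900))  # 9.0, triple the resolution
print(memory_cost_ratio(200, 300))  # 2.25, more than double
```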

Copyright © 1997-2024 by Wayne Fulton - All rights are reserved.
