Automatic Color Detection: How to Reduce the Size of your Documents

Hi folks,

In this article we explain what color detection is all about and how it can be used to dramatically reduce the size of your electronically stored documents.

In a previous article, we showed that bitmaps (or raster images) are made of pixels (arranged in arrays or matrices, each pixel having its own coordinates and color), much as mosaics are made of pieces of colored glass.
Since bits (« 0 » and « 1 ») are used to store color information, it follows that the more colors an image needs to encode, the more bits per pixel (or « bpp ») are required, and therefore the larger the bitmap image file will be.

From a color point of view, bitmaps can be:

– Black and white
With only 2 colors, they are encoded in 1 bpp (either « 0 » or « 1 » for either black or white), so these bitmaps need the smallest possible amount of data for color information.

– Grayscale

Such images consist of black, white and a range of intermediate gray shades.
Generally, an 8 bpp encoding (256 gray levels) is considered acceptable, but note that each pixel already requires 8 times more data than in a B/W image.

– Color

Such images use color palettes of various sizes, but 24 bpp encoding is considered satisfactory, as it can store over 16.7 million colors while the human eye can discern only about 10 million.
Of course, each pixel in such an image takes 3 times more data than at 8 bpp and 24 times more data than at 1 bpp.
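To put numbers on these ratios, here is a small Python sketch that computes the raw (uncompressed, header-free) size of one page at each bit depth. The page dimensions are an assumption chosen for illustration (roughly an A4 sheet scanned at 300 dpi), and `raw_size_bytes` is a hypothetical helper, not part of any ORPALIS product.

```python
def raw_size_bytes(width_px, height_px, bits_per_pixel):
    """Size of the pixel data alone: no file headers, no compression."""
    total_bits = width_px * height_px * bits_per_pixel
    return (total_bits + 7) // 8  # round up to whole bytes


# Assumed page size: roughly A4 at 300 dpi.
WIDTH, HEIGHT = 2480, 3508

for label, bpp in (("black and white", 1), ("grayscale", 8), ("color", 24)):
    size = raw_size_bytes(WIDTH, HEIGHT, bpp)
    print(f"{label:>15}: {bpp:2d} bpp -> {size / 1_000_000:.1f} MB")
```

Real files are compressed, so the sizes on disk will be much smaller, but the 1 : 8 : 24 ratio between the three encodings is where the savings come from.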

Now, why is all this so important?

In real life, not only document storage professionals but most of us are forced to compromise between storing documents at the highest possible quality and keeping them at the smallest possible size (mainly for sharing purposes).
To achieve that, scanning operators have to separate B/W pages from grayscale and color ones, then scan each set at 1 bpp, 8 bpp and 24 bpp, respectively.
This is a terribly slow and painful task, and one that is prone to human error.

What if everything could be done instantly, automatically and with no scanning constraints?

Well, we at ORPALIS have developed a patent-pending, proprietary technology for automatic color detection.
All you have to do is put all your documents in one batch, no matter their color type, scan them all in color mode and our software will automatically determine the color type of each page.
Then, depending on the detected color type, the filter will automatically encode the image at the bits-per-pixel depth that suits it best.
In other words, you get the best quality at the smallest possible size.
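To give a feel for what "determining the color type of a page" means, here is a deliberately naive Python sketch. It is not the ORPALIS algorithm, which is proprietary and not described here. The function name `classify_page`, the `tolerance` threshold and the `TARGET_BPP` table are all assumptions made up for this illustration, and pixels are taken to be plain `(r, g, b)` tuples with values from 0 to 255.

```python
def classify_page(pixels, tolerance=8):
    """Return 'bw', 'grayscale' or 'color' for an iterable of (r, g, b) tuples.

    `tolerance` is a hypothetical threshold meant to absorb scanner noise.
    """
    only_black_and_white = True
    for r, g, b in pixels:
        # Channels that differ noticeably mean the pixel carries real color.
        if max(r, g, b) - min(r, g, b) > tolerance:
            return "color"
        # Neutral pixel: is it close to pure black or pure white?
        level = (r + g + b) // 3
        if tolerance < level < 255 - tolerance:
            only_black_and_white = False
    return "bw" if only_black_and_white else "grayscale"


# Bit depth to use once the page type is known (from the list above).
TARGET_BPP = {"bw": 1, "grayscale": 8, "color": 24}

page = [(0, 0, 0), (255, 255, 255), (128, 128, 128)]
kind = classify_page(page)
print(kind, TARGET_BPP[kind])
```

A practical detector would need to be more forgiving than this, for example by ignoring a few stray colored pixels caused by scanner noise, but the principle is the same: look at the pixels, decide on the page type, then pick the matching bit depth.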

This feature is already implemented in PaperScan Pro starting with version 1.6 and will be fully available programmatically in the next major GdPicture.NET release.

Care for a practical test?

Make sure you have the latest PaperScan Pro (even a trial version) installed.
For your convenience, we provide 3 TIFF test files in a zipped folder to use for batch import, but you can test with your own images, either acquired from a scanner or imported from existing image files.
Each TIFF file is bigger than 1 MB, so the 3 files total more than 3 MB.
Now import them and save them as a multipage PDF.
The resulting PDF file (PaperScan creates it using JPEG optimization and PDF pack technology) will be about 800 KB in size.
Not bad, but if you think we can’t do even better, you’ll have to think again!

From the main menu, go to « Options / Batch Acquisition/Import Filters… ».

[Screenshot: PaperScan Pro Batch Acquisition/Import Filters…]

Select the « Automatic Color Detection » option and click « Save ».

[Screenshot: PaperScan Pro Automatic Color Detection]

Now import the TIFF files again and save them as a multipage PDF: the resulting file is 65 KB in size!
Ta-daaam!

Our next step is to provide automatic color detection for individual regions within a single document.
This will be available to end users in one of the upcoming PaperScan versions and, of course, programmatically to developers using our next GdPicture.NET toolkit!

Cheers!

Bogdan
