Here are the links for the two previous articles of the series:
- https://www.gdpicture.com/blog/pdf-optimization-series-part1-methods/
- https://www.gdpicture.com/blog/pdf-optimization-series-part2-lossless-methods/
A PDF document is composed of various data structures, described as different objects.
In the scope of optimization, one is particularly important: the stream object.
A stream object stores binary data such as the content of an image, a font file, a color profile, or an embedded file.
Stream objects offer various filters to compress the data they encapsulate. At the same time, they are the only objects whose contents can be compressed.
Version 1.5 of the PDF format introduced a new type of stream, the object stream (ObjStm). It gathers many PDF objects inside a single binary stream. Its purpose is to allow the compression of PDF objects that are not themselves streams.
This process considerably reduces the size of PDF files.
Compression of objects of type Stream
As noted above, every stream object can store its data in compressed form.
However, it is not uncommon for applications that produce PDF files to skip this option. In those situations, it is worth regenerating the file with the data of these objects compressed.
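As a rough illustration of the effect, the sketch below serializes the same data twice as a PDF stream object, once raw and once with the standard /FlateDecode filter, using Python's zlib module. The helper name `build_stream_object`, the object number, and the sample content are assumptions made for this example; it is not how GdPicture.NET implements compression.

```python
import zlib


def build_stream_object(obj_num: int, data: bytes, compress: bool) -> bytes:
    """Serialize `data` as a PDF stream object, optionally Flate-compressed."""
    if compress:
        # /FlateDecode expects zlib/deflate data, which zlib.compress produces.
        body = zlib.compress(data, 9)
        entries = b"/Length %d /Filter /FlateDecode" % len(body)
    else:
        body = data
        entries = b"/Length %d" % len(body)
    return b"%d 0 obj\n<< %s >>\nstream\n%s\nendstream\nendobj\n" % (
        obj_num,
        entries,
        body,
    )


# Hypothetical page content: a repetitive sequence of drawing operators.
content = b"0 0 m 100 100 l S\n" * 500

raw_obj = build_stream_object(4, content, compress=False)
packed_obj = build_stream_object(4, content, compress=True)

# Print both sizes in bytes; the compressed object is the smaller one here.
print(len(raw_obj), len(packed_obj))
```

The gain depends entirely on the data: repetitive content such as the sample shrinks a lot, while data that is already compressed (a JPEG image, for instance) gains little or nothing from a second pass.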
The attached example illustrates file size reduction after compressing a stream:
Compression of other types of objects using Object Streams
Many PDF creators do not use object streams (ObjStm) and therefore do not compress all the data embedded in the files they generate.
It is easy to determine whether a given file uses this feature. With a simple text editor, you can check whether the PDF version of the file is 1.5 or later, which is a prerequisite for object streams. You can also check whether data such as booleans, numbers, strings, and dictionaries is “readable”, that is, stored as plain text.
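The same manual check can be approximated in a few lines of Python. This is a minimal sketch: the function name `inspect_pdf` and the keys of the returned dictionary are made up for this example, and it only looks at the file header, whereas a `/Version` entry in the document catalog can declare a later version than the header does.

```python
import re


def inspect_pdf(data: bytes) -> dict:
    """Report the header version and whether an object stream is present."""
    # The header normally sits at the very start of the file.
    match = re.search(rb"%PDF-(\d+)\.(\d+)", data[:1024])
    version = (int(match.group(1)), int(match.group(2))) if match else None
    # Stream dictionaries are stored as plain text, so the type name of an
    # object stream can be found without decompressing anything.
    uses_objstm = re.search(rb"/Type\s*/ObjStm", data) is not None
    return {
        "version": version,
        "supports_object_streams": version is not None and version >= (1, 5),
        "uses_object_streams": uses_objstm,
    }


# Hypothetical fragment of a PDF 1.4 file, used here instead of a real file.
sample = b"%PDF-1.4\n1 0 obj\n<< /Type /Catalog /Pages 2 0 R >>\nendobj\n"
report = inspect_pdf(sample)
print(report)
```

For a real document, the bytes would come from reading the file in binary mode. A file for which `uses_object_streams` is false stores its non-stream objects uncompressed.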
If such objects are readable, it is worth regenerating the file by grouping the objects that are not streams into object streams and compressing them. The attached example demonstrates the resulting file size reduction.
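To make the grouping step more concrete, here is a simplified sketch of how non-stream objects can be packed into one object stream. It follows the layout defined by the PDF specification: an index of object number and offset pairs, then the object bodies without their `obj`/`endobj` wrappers, with `/N` and `/First` describing that layout. The function name `build_object_stream` and the sample annotation dictionaries are assumptions for this example. A real writer must also emit a cross-reference stream that points to the packed objects, which is omitted here.

```python
import zlib


def build_object_stream(stream_num: int, objects: dict) -> bytes:
    """Pack non-stream objects into one Flate-compressed /ObjStm object."""
    index_parts = []
    bodies = []
    offset = 0  # offset of each object relative to the first object
    for obj_num, body in objects.items():
        index_parts.append(b"%d %d" % (obj_num, offset))
        bodies.append(body)
        offset += len(body) + 1  # +1 for the newline separating objects
    index = b" ".join(index_parts) + b"\n"
    payload = index + b"\n".join(bodies)
    packed = zlib.compress(payload, 9)
    header = b"<< /Type /ObjStm /N %d /First %d /Length %d /Filter /FlateDecode >>" % (
        len(objects),
        len(index),  # byte position of the first object inside the payload
        len(packed),
    )
    return b"%d 0 obj\n%s\nstream\n%s\nendstream\nendobj\n" % (
        stream_num,
        header,
        packed,
    )


# Hypothetical objects: fifty small, similar annotation dictionaries.
objects = {
    n: b"<< /Type /Annot /Subtype /Link /Rect [0 0 %d 20] /Border [0 0 0] >>" % n
    for n in range(10, 60)
}

# The same objects written the classic way, one uncompressed object each.
plain = b"".join(
    b"%d 0 obj\n%s\nendobj\n" % (n, body) for n, body in objects.items()
)
objstm = build_object_stream(60, objects)

# Print both sizes in bytes.
print(len(plain), len(objstm))
```

Small dictionaries like these share most of their text, which is why compressing them together tends to be far more effective than anything that could be done to them individually.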
GdPicture.NET offers compression of PDF files by using the EnableCompression() method.
You will find an example of usage in our reference guide.