Compression
Compression is the process of cataloguing repeated pieces
of data or eliminating unnecessary bits to shrink data or
files so that they can be transferred or transmitted faster.
The objective of a compression program is to reduce the
overall number of bits and bytes in a file or data set at
the point of compression, and to recreate the data or file
in its original form, or the nearest possible form, when expanded.
Different compression programs use different methods to
compress data or files; some use adaptive dictionary-based
algorithms while others do not. Those that use an adaptive
dictionary-based algorithm usually expand the data back to its
original size and composition and are consequently regarded
as lossless compression programs. Lossy compression
programs, on the other hand, work differently: they simply
eliminate "unnecessary" bits of information, tailoring
the file so that it is smaller, and they do not reproduce
the file or data in its original size or composition when expanded.
A lossy program merely expands the file back to its nearest
composition, using its own interpretation to recreate the
truncated data. Note that this may affect the quality of the
compressed file positively or negatively. Take, for example,
noise truncated from a compressed music file: when the file is
expanded back without such noise, its quality can be better.
Lossy compression is very good for compressing picture, music,
and video files because colour or sound bits can be swapped or
even dropped without losing the image or sound completely, but
it is certainly not suitable for compressing software
applications, databases and other files that must be recreated
exactly.
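The idea of dropping bits and then expanding to the "nearest
composition" can be sketched in a few lines of Python. This is only
an illustration under stated assumptions: the sample values and the
function names lossy_compress and lossy_expand are made up for the
example, and real lossy formats use far more elaborate methods.

```python
# Hypothetical 8-bit sound or colour samples (illustrative values only).
samples = [12, 200, 37, 255, 128, 90]

def lossy_compress(values, dropped_bits=4):
    """Keep only the high bits of each value; the low bits are discarded."""
    return [v >> dropped_bits for v in values]

def lossy_expand(codes, dropped_bits=4):
    """Recreate an approximation by placing each code mid-way in its range."""
    half = 1 << (dropped_bits - 1)
    return [(c << dropped_bits) + half for c in codes]

codes = lossy_compress(samples)
restored = lossy_expand(codes)

print("original:", samples)
print("restored:", restored)
print("exact match:", restored == samples)
```

Each value now needs 4 bits instead of 8, but the expanded values are
only close to the originals, which is why this approach cannot be used
for data that must be recreated exactly.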
Lossless compression programs usually track attributes
or patterns within a data set or file in a catalogue. They then
replace each attribute or pattern with a shorter representation,
a number for example, within the data set or file. The catalogue
is also transmitted alongside the compressed data,
making it possible to regenerate the original exactly when
the data is expanded.
The crude example below illustrates what happens
when a file or data set is compressed using a lossless compression
program and thereafter expanded:
Before compression:
"I hate talking but talking is the best way of expression.
Fortunately talking is not the only way to show expression."
Cataloguing, compression in process…
The compression program catalogues the repeated words. Read
back from the compressed statements, the catalogue is:
1 = talking, 2 = the, 3 = way, 4 = expression, 5 = is
The new look of the statements:
"I hate 1 but 1 5 2 best 3 of 4. Fortunately 1 5 not 2 only
3 to show 4."
The compressed statements are then transmitted together with
the catalogue. On the receiving side, the program uses the
catalogue together with the transmitted data to expand the
file back to its original form.
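The word-cataloguing steps above can be sketched in Python as follows.
This is a minimal illustration, not how any particular compression
program works: the function names build_catalogue, compress and expand
are made up for the example, numbers are assigned in order of first
appearance (so they differ from the numbering used in the example
above), and the scheme assumes the original text contains no digits.

```python
import re
from collections import Counter

TEXT = ("I hate talking but talking is the best way of expression. "
        "Fortunately talking is not the only way to show expression.")

def build_catalogue(text):
    """Give a number to every word that occurs more than once."""
    words = re.findall(r"[A-Za-z]+", text)
    counts = Counter(words)
    # dict.fromkeys keeps the order in which words first appear.
    repeated = [w for w in dict.fromkeys(words) if counts[w] > 1]
    return {word: str(i) for i, word in enumerate(repeated, start=1)}

def compress(text, catalogue):
    """Replace each catalogued word with its number."""
    return re.sub(r"[A-Za-z]+",
                  lambda m: catalogue.get(m.group(), m.group()), text)

def expand(text, catalogue):
    """Replace each number with the word it stands for."""
    reverse = {number: word for word, number in catalogue.items()}
    return re.sub(r"\d+",
                  lambda m: reverse.get(m.group(), m.group()), text)

catalogue = build_catalogue(TEXT)
compressed = compress(TEXT, catalogue)
restored = expand(compressed, catalogue)

print(catalogue)
print(compressed)
print("exact match:", restored == TEXT)
```

Because the catalogue travels with the compressed text, the expansion
step needs nothing else to rebuild the original statements exactly.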
How good a compression program is depends on its file-reduction
ratio, which is in turn affected by a number of factors, including
file type, file size and compression scheme. For example,
a compression program that can reduce the size of a document
by 75% and recreate it exactly on expansion is better than
one that can only reduce the same document by 50%. However,
if after expansion it is discovered that part of the content
is not reproduced exactly, then no matter what its compression
ratio is, such a program is not suitable for document compression.
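Both checks described above, the size reduction and the exact
recreation, can be measured with a short Python sketch. It uses the
zlib module from the Python standard library as one example of a
lossless compressor; the helper reduction_percent and the sample
document are assumptions made for the illustration.

```python
import zlib

def reduction_percent(original: bytes, compressed: bytes) -> float:
    """Percentage by which the compressed form is smaller than the original."""
    return 100.0 * (1 - len(compressed) / len(original))

# A repetitive sample document (illustrative only).
sentence = "I hate talking but talking is the best way of expression. "
document = (sentence * 50).encode("utf-8")

packed = zlib.compress(document, 9)   # 9 = strongest compression level
restored = zlib.decompress(packed)

print(f"{len(document)} bytes -> {len(packed)} bytes")
print(f"reduction: {reduction_percent(document, packed):.1f}%")
print("exact round trip:", restored == document)
```

The reduction figure depends heavily on the input: highly repetitive
text like this sample shrinks far more than data that is already
compressed or close to random.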