🧠

Compression

The act of making something smaller
Often used to store/transmit efficiently
Reduction in the amount of space needed to store a piece of data or the bandwidth to transmit it
Compression ratio
- Size compressed data / size of original data
Can be lossy or lossless
- Lossless --> compression where no data is lost
- Lossy --> compression where some data is lost

Text Compression

Three common types:

Keyword Encoding

Frequently used words are replaced with characters
We do not bother encoding single length words (I, a, etc) because we wouldn't save any space
The longer the word, the higher the compression ratio

Keyword Encoding Example

word	symbol
As*	^
the	~
and	+
that	$
must	&

*note the uppercase A

As the Sun and the Moon circle the Earth, we must remember that we are small.

becomes...

^ ~ Sun + ~ Moon circle ~ Earth, we & remember $ we are small.

Run-Length Encoding

When characters are repeated, we remove all duplicates and replace them with a count
- AAAAAAA ---> *A7
- * is a flag indicating compression needs to be decoded

Huffman Encoding

Uses variable-length bit strings to represent each characters
Frequently used characters have short codes
- Saves space
Every bit string representing a character must have a unique prefix
- Allows you to decode

Decoding

Look for match left --> right (bit by bit)
Replace when match found
Repeat until decoded

Huffman Encoding Example

Huffman Code	Character
00	A
01	E
100	L
110	O
111	R
1010	B
1011	D

Encoding DOORBELL would result in 1011110110111101001100100
If this used a fixed-size bit string for each character (say 8 bits) it would take 64 bits to store the word (8 x 8)
The compressed version only takes 25 bits
Compression ratio is 25/64 (~0.39)

Video Compression

Temporal Compression

Looks for differences between consecutive frames
If most of an image in two frames hasn't changed, we don't need all of the data of the second frame
Keyframe is chosen to compare differences
- Entire keyframe is stored
Consecutive images only store changes
- Delta frames

Spatial Compression

Removes redundant information within a frame
Similar issues faced by compression still images
Group pixels into blocks that have same color
- Instead of storing each pixel, color and coordinates are stored
- Similar to run-length encoding

Computer Science

Backlinks