
02-28-2002, 07:40 PM
Keefe
Administrator
 
Join Date: May 2002
Location: Wisconsin
Posts: 2,337
Term of the Day: Huffman compression

Also known as Huffman encoding, Huffman compression is an algorithm for the lossless compression of files, based on how frequently each symbol occurs in the file being compressed. The algorithm is a form of statistical coding, meaning that the probability of a symbol has a direct bearing on the length of its representation: the more probable a symbol is, the shorter its bit representation will be.

In any file, certain characters are used more than others. With a fixed-length binary representation, the number of bits needed per character depends on how many distinct characters must be represented: one bit can distinguish two characters (0 for the first, 1 for the second), two bits can distinguish four characters, and so on.

Unlike ASCII, which is a fixed-length code using seven bits per character, Huffman compression is a variable-length coding system that assigns shorter codes to more frequently used characters and longer codes to less frequently used characters, in order to reduce the size of files being compressed and transferred.

For example, in a file with the following data:

XXXXXXYYYYZZ

the frequency of "X" is 6, the frequency of "Y" is 4, and the frequency of "Z" is 2. If each character is represented using a fixed-length code of three bits, then the number of bits required to store this file would be 36, i.e., (3 x 6) + (3 x 4) + (3 x 2) = 36.
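
As a quick check of that arithmetic, here is a minimal Python sketch assuming the three-bits-per-character fixed-length code used above; the names data, freqs, and fixed_total are just illustrative.

Code:
# Rough check of the fixed-length figure above (names are illustrative).
from collections import Counter

data = "XXXXXXYYYYZZ"
freqs = Counter(data)            # Counter({'X': 6, 'Y': 4, 'Z': 2})
bits_per_char = 3                # the fixed-length code assumed in the example
fixed_total = bits_per_char * len(data)
print(freqs)                     # the frequencies used above
print(fixed_total)               # 36 bits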

If the same data were compressed using Huffman compression, the more frequently occurring characters would be assigned shorter codes. Building the Huffman tree for these frequencies gives, for example:

X by the code 0 (1 bit)
Y by the code 10 (2 bits)
Z by the code 11 (2 bits)

therefore the size of the file becomes 18 bits, i.e., (1 x 6) + (2 x 4) + (2 x 2) = 18.

In the above example, more frequently occurring characters are assigned shorter codes, resulting in a smaller number of bits in the final compressed file.
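
To make the procedure concrete, below is a small Python sketch of the Huffman construction using the standard heapq and collections modules; the function name build_huffman_codes and the variable names are illustrative, not from any particular library. It repeatedly merges the two least frequent subtrees, prepending one bit to every symbol's code each time, and reproduces the 18-bit total for the example string.

Code:
# Minimal sketch of Huffman coding, applied to the example string above.
# build_huffman_codes is an illustrative name, not a library function.
import heapq
from collections import Counter
from itertools import count

def build_huffman_codes(data):
    """Return a {symbol: bit-string} Huffman code table for the symbols in data."""
    freqs = Counter(data)
    tiebreak = count()  # keeps heap entries comparable when frequencies are equal
    # Each heap entry is (subtree frequency, tiebreaker, {symbol: code so far}).
    heap = [(f, next(tiebreak), {sym: ""}) for sym, f in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)    # least frequent subtree
        f2, _, right = heapq.heappop(heap)   # next least frequent subtree
        # Merging two subtrees adds one leading bit to every code inside them.
        merged = {sym: "0" + code for sym, code in left.items()}
        merged.update({sym: "1" + code for sym, code in right.items()})
        heapq.heappush(heap, (f1 + f2, next(tiebreak), merged))
    return heap[0][2]

data = "XXXXXXYYYYZZ"
codes = build_huffman_codes(data)
print(codes)  # X gets a 1-bit code; Y and Z get 2-bit codes (exact bits may vary)
print(sum(len(codes[ch]) for ch in data))  # 18 bits, versus 36 at 3 bits per character

The loop is the heart of the algorithm: always take the two least frequent subtrees and merge them, so rare symbols end up deeper in the tree with longer codes and common symbols end up near the root with shorter ones.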

Huffman compression is named after its inventor, David Huffman, who published the algorithm in 1952.
__________________
It's crazy I'm thinking, just knowing that the world is round.
-http://www.techwarepc.com/ - The Technology Experts