

Thus, we assign 'a' the shortest codeword, 0, and 'c' a longer one, 1100. In this example, 'a' appears 51 times out of 100 and has the highest frequency, while 'c' appears only 2 times out of 100 and has the lowest. Because high-frequency characters get shorter codewords, they take fewer bits, which reduces the space required to store the file.
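To make the saving concrete, here is a small sketch that counts the bits spent on 'a' and 'c' with the codewords above and compares them with a fixed-length code. The 3-bit fixed width and the names `FIXED_WIDTH` and `bits_used` are assumptions made for illustration; the other characters of the file are left out because their frequencies are not given here.

```python
# Frequencies (out of 100 characters) and codewords taken from the example above.
freq = {"a": 51, "c": 2}
codeword = {"a": "0", "c": "1100"}

FIXED_WIDTH = 3  # assumed width of a fixed-length code, used only for comparison


def bits_used(freq, codeword):
    """Return the number of bits each character contributes to the encoded file."""
    return {ch: n * len(codeword[ch]) for ch, n in freq.items()}


variable_bits = bits_used(freq, codeword)
fixed_bits = {ch: n * FIXED_WIDTH for ch, n in freq.items()}

for ch in freq:
    print(ch, "variable:", variable_bits[ch], "fixed:", fixed_bits[ch])
```

The rare character 'c' costs slightly more bits than it would under the fixed-length code, but the frequent character 'a' costs far fewer, so the total goes down.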

We are going to use Binary Tree and Minimum Priority Queue in this chapter. You can learn about them from the linked chapters if you are not familiar with them. Huffman code is a data compression algorithm that uses the greedy technique in its implementation. The algorithm is based on the frequency of the characters appearing in a file. Files are stored as binary code in a computer: each character of the file is assigned a binary character code, and normally these codes have the same fixed length for every character. For example, if we assign 'a' as 000 and 'b' as 001, the codeword length is fixed, i.e., both 'a' and 'b' take 3 bits. Huffman code does not use a fixed-length codeword for each character; it assigns codewords according to how frequently each character appears in the file. It gives a shorter codeword to a character that is used more often (high frequency) and a longer codeword to a character that is used less often (low frequency).
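The idea can be sketched in a few lines of Python, using the standard `heapq` module as the minimum priority queue and nested tuples as the binary tree. This is a minimal illustration and not a definitive implementation: the function name `build_huffman_codes`, the tie-breaking counter, and the choice of "0" for the left branch and "1" for the right are all assumptions.

```python
import heapq
from collections import Counter
from itertools import count


def build_huffman_codes(text):
    """Return a dict mapping each character of `text` to a Huffman codeword."""
    freq = Counter(text)
    if not freq:
        return {}

    # Heap entries are (frequency, tiebreak, node). The unique tiebreak value
    # keeps the heap from ever comparing two nodes directly.
    tiebreak = count()
    heap = [(n, next(tiebreak), ch) for ch, n in freq.items()]
    heapq.heapify(heap)

    # Special case (assumed convention): a single distinct character gets "0".
    if len(heap) == 1:
        return {heap[0][2]: "0"}

    # Greedy step: repeatedly merge the two least frequent nodes.
    while len(heap) > 1:
        n1, _, left = heapq.heappop(heap)
        n2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (n1 + n2, next(tiebreak), (left, right)))

    codes = {}

    def walk(node, prefix):
        # Internal nodes are (left, right) tuples; leaves are single characters.
        if isinstance(node, tuple):
            walk(node[0], prefix + "0")
            walk(node[1], prefix + "1")
        else:
            codes[node] = prefix

    walk(heap[0][2], "")
    return codes
```

Calling `build_huffman_codes` on a string returns one codeword per distinct character. The exact bit patterns depend on how ties are broken, but more frequent characters never receive longer codewords than less frequent ones.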
