Huffman Encoding

(a.) (6 points) Suppose a certain file contains only the following letters with the corresponding frequencies:

Letters: A, B, ...
Frequencies: 73, 9, 30, 44, 130, 28, 16

In a fixed-length encoding scheme, each character is given a binary representation with the same number of bits. What is the minimum number of bits required to represent each letter of this file under a fixed-length encoding scheme? Describe how to encode all seven letters in this file using the number of bits you gave earlier. What is the length of the encoded file under this encoding scheme?

(b.) Huffman coding is a way to encode information using variable-length binary strings to represent symbols, depending on the frequency of each individual letter. Specifically, letters that appear more frequently are encoded as shorter strings, while rarer letters are encoded as longer binary strings. A Huffman code is a more efficient way to encode a message, because the output uses no more bits than a fixed-length code would. In addition, this encoding scheme is provably unambiguous, in the sense that we can always identify the boundaries between letters and uniquely decode an encoded message. That is, no letter's encoding can be a prefix of another letter's encoding (e.g., we cannot have both 00 and 001 as the encodings of two different letters).

Huffman encoding is managed through a full binary tree, called the Huffman tree. Here, the leaf vertices are the letters in the original message. For each internal vertex, the left outgoing edge (branch) is labeled with a 0 and the right outgoing edge is labeled with a 1. The path from the root to a leaf gives the encoding of the letter in that leaf.

The following is the algorithm for building a Huffman tree (Rosen p.764):

procedure Huffman(C: letters a_i with frequencies w_i)
1. F := collection of n rooted trees, each consisting of the single vertex a_i and assigned weight w_i
2. while F is not a single tree
     Replace the rooted trees T and T' of least weights from F, with w(T) >= w(T'), with a tree having a new root that has T as its left subtree and T' as its right subtree.
     Label the new edge to T with 0 and the new edge to T' with 1.
     Assign w(T) + w(T') as the weight of the new tree.
The Huffman coding for the symbol a_i is the concatenation of the labels of the edges in the unique path from the root to the vertex a_i.

(12 points) Construct the Huffman code that enables you to encode the same file from part (a) so that you can store it using the least number of bits. Specify the binary representation for each letter and compute the length of the encoded file under Huffman coding.
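One way to check an answer to both parts is to run the procedure quoted above mechanically. The following is a minimal Python sketch, not a reference solution. It assumes the seven letters are named A through G in the order the frequencies are listed (only the frequencies are legible in the question, so the labels are hypothetical), and the helper names fixed_length_bits, huffman_code and encoded_length are made up for this illustration. A heap stands in for the forest F, and the heavier of the two lightest trees is placed on the left (edge 0), following the w(T) >= w(T') convention.

```python
import heapq
from itertools import count

# Frequencies from the question. The labels A-G are an ASSUMPTION:
# the letter row of the original table is not legible.
FREQ = {"A": 73, "B": 9, "C": 30, "D": 44, "E": 130, "F": 28, "G": 16}


def fixed_length_bits(freq):
    """Return (bits per letter, total bits) for a fixed-length code."""
    # Smallest b with 2**b >= number of distinct letters (at least 1 bit).
    bits = max(1, (len(freq) - 1).bit_length())
    return bits, bits * sum(freq.values())


def huffman_code(freq):
    """Build a Huffman code by repeatedly merging the two lightest trees."""
    tie = count()  # unique tie-breaker so the dicts are never compared
    # Each forest entry is (weight, tie-breaker, {letter: code so far}).
    forest = [(w, next(tie), {s: ""}) for s, w in freq.items()]
    heapq.heapify(forest)
    while len(forest) > 1:
        w_light, _, t_light = heapq.heappop(forest)  # T' (lighter tree)
        w_heavy, _, t_heavy = heapq.heappop(forest)  # T  (heavier tree)
        # T becomes the left subtree (edge 0), T' the right subtree (edge 1).
        merged = {s: "0" + c for s, c in t_heavy.items()}
        merged.update({s: "1" + c for s, c in t_light.items()})
        heapq.heappush(forest, (w_light + w_heavy, next(tie), merged))
    return forest[0][2]


def encoded_length(freq, code):
    """Total number of bits needed to encode the whole file."""
    return sum(freq[s] * len(code[s]) for s in freq)


if __name__ == "__main__":
    print(fixed_length_bits(FREQ))        # (3, 990)
    code = huffman_code(FREQ)
    for letter in sorted(code):
        print(letter, code[letter])
    print(encoded_length(FREQ, code))     # 808
```

Under these assumptions the fixed-length code needs 3 bits per letter, for 3 x 330 = 990 bits in total, while the Huffman code needs 808 bits. No two weights tie at any merge step for these frequencies, so the code lengths (1 bit for the letter with frequency 130, 5 bits for the letters with frequencies 9 and 16) do not depend on tie-breaking. The exact bit patterns do depend on the left/right labeling convention.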