路径长度:从树中一个节点到另一个节点之间的分支构成两个节点之间的路径,路径上的分支数目就成为路径长度。
树的路径长度:从树根到每一节点的路径长度之和。
带权路径长度:考虑到带权的节点,节点的带权路径长度是从该节点到根之间的路径长度与节点权的乘积。
带权路径长度(WPL):树中所有叶子节点的带权路径长度之和。
在上图中,根节点到节点D的路径长度是4,二叉树的路径长度=1+1+2+2+3+3+4+4=20。WPL(a)=51+152+403+304+10*4=315。
假设有n各权值{w1,w2,…,wn},构造一棵有n各叶子节点的二叉树,每个叶子节点带权wk,每个叶子的路径长度为lk,WPL=w1l1+w2l2+…+wn*ln,记带权路径长度WPL最小的二叉树为哈夫曼树,也称其为最优二叉树。
注意:二叉树的存储以数组形式保存,而且0下标不存储元素,i下标的左孩子2*i,右孩子2*i+2.(之前有关二叉树的文章介绍过,大家可以自行查看)。
#include
#include
#include
#include using namespace std;typedef unsigned int WeigthType;typedef unsigned int NodeType;typedef struct
{WeigthType weight;NodeType parent, leftchild, rightchild;
}HNode;typedef struct IndexWeight
{int index;WeigthType weight;operator WeigthType() const { return weight; }//按照权值比较
};class HuffManTree
{
private:vector w;vector hft;int n = 0;
public://构造函数(初始化Huffman树) (叶子节点树,权重,hufman树)HuffManTree(vector w1){w = w1;n = w1.size();hft.resize(n * 2);for (int i = 1; i <= n; i++){hft.at(i).weight = w[i-1];} }void CreatHuffman(){//优先级队列priority_queue, std::greater> qu;for (int i = 1; i <= n; i++){qu.push(IndexWeight{ i,hft.at(i).weight });}int k = n +1;while (!qu.empty()){//左节点if (qu.empty()) break;IndexWeight left = qu.top(); qu.pop();//右节点if (qu.empty()) break;IndexWeight right = qu.top(); qu.pop();hft.at(k).weight = left.weight + right.weight;hft.at(k).leftchild = left.index;hft.at(k).rightchild = right.index;hft.at(left.index).parent = k;hft.at(right.index).parent = k;qu.push(IndexWeight{ k,hft.at(k).weight });k++; }}void PrintTree(){int lenght = hft.size();for (int i = 1; i < lenght; i++){cout << "index:" << i<< "\tweight:" << hft.at(i).weight<< "\tparent:" << hft.at(i).parent<< "\tleftchile:" << hft.at(i).leftchild<< "\trightchild:" << hft.at(i).rightchild << endl; }cout << endl;}vector GetHft(){return hft;}
};
int main()
{vector w{ 5,29,7,8,14,23,3,11 };HuffManTree tree(w);tree.CreatHuffman();cout << "打印哈夫曼树:" << endl;tree.PrintTree(); return 0;
}
哈夫曼树是为了解决远距离通信的数据传输的最优化问题。
比如,有一段文字内容为ABCDEFGH,现需要将其传输给别人,通常就会用二进制来表示每个字符,如下(A的ASCII码为65):
字母 | A | B | C | D | E | F | G | H |
---|---|---|---|---|---|---|---|---|
二进制字符 | 100 0001 | 100 0010 | 100 0011 | 100 0100 | 100 0101 | 100 0110 | 100 0111 | 100 1000 |
按照上述所说,传输的内容变为:“100 0001100 0010100 0011100 0100100 010100 0110100 0111100 1000”接受时可以三位来解码。但是当文档很大时,所需要的二进制串是非常可怕的,而且有些字符出现的频率是非常高的。
为提高数据传输的效率,一般采用哈夫曼编码对数据进行压缩,随着字符的增多和多字符权重的不同,这种压缩会更加显出优势。
前缀编码:编码中非0即1,长度不等的话其实很容易混淆,若要设计长短不等的编码,则必须是任一字符的编码都不是另一个字符的编码的前缀,这种编码称为前缀编码。
一般地,设需要编码的字符集为{d1,d2,...,dn},各个字符在电文中出现的次数或频率集合为{w1,w2,...,wn},以d1,d2,...,dn作为叶子节点,以w1,w2,...,wn作为相应叶子节点的权值来构造一棵哈夫曼树。规定哈夫曼树的左分支表示0,右分支表示1,则从根节点到叶子节点所经过的路径分支组成的0和1的序列便为该节点对于字符的编码,这就是哈夫曼编码。
字母 | A | B | C | D | E | F | G | H |
---|---|---|---|---|---|---|---|---|
出现的频率*100 | 5 | 29 | 7 | 8 | 14 | 23 | 3 | 11 |
二进制字符 | 11111 | 10 | 1110 | 000 | 110 | 01 | 11110 | 001 |
#include
#include
#include
#include using namespace std;typedef unsigned int WeigthType;typedef unsigned int NodeType;typedef struct
{WeigthType weight;NodeType parent, leftchild, rightchild;
}HNode;typedef struct IndexWeight
{int index;WeigthType weight;operator WeigthType() const { return weight; }//按照权值比较
};class HuffManTree
{
private:vector w;vector hft;int n = 0;
public://构造函数(初始化Huffman树) (叶子节点树,权重,hufman树)HuffManTree(vector w1){w = w1;n = w1.size();hft.resize(n * 2);for (int i = 1; i <= n; i++){hft.at(i).weight = w[i-1];} }void CreatHuffman(){//优先级队列priority_queue, std::greater> qu;//按照下标比较for (int i = 1; i <= n; i++){qu.push(IndexWeight{ i,hft.at(i).weight });}int k = n +1;while (!qu.empty()){//左节点if (qu.empty()) break;IndexWeight left = qu.top(); qu.pop();//右节点if (qu.empty()) break;IndexWeight right = qu.top(); qu.pop();hft.at(k).weight = left.weight + right.weight;hft.at(k).leftchild = left.index;hft.at(k).rightchild = right.index;hft.at(left.index).parent = k;hft.at(right.index).parent = k;qu.push(IndexWeight{ k,hft.at(k).weight });k++; }}void PrintTree(){int lenght = hft.size();for (int i = 1; i < lenght; i++){cout << "index:" << i<< "\tweight:" << hft.at(i).weight<< "\tparent:" << hft.at(i).parent<< "\tleftchile:" << hft.at(i).leftchild<< "\trightchild:" << hft.at(i).rightchild << endl; }cout << endl;}vector GetHft(){return hft;}
};typedef struct HuffmanCodeNode
{char ch;string code;
}HuffmanCodeNode;class HuffManCode:public HuffManTree
{
private:vector huffcode;vector hft;
public:HuffManCode(const vector *ch1, vector w1):HuffManTree(w1){int length = ch1->size();huffcode.resize(length+1);//0下标不用,从1下标开始for (int i = 1; i <= length; i++){huffcode.at(i).ch = ch1->at(i-1); } }void CreateHuffmanCode(){hft = HuffManTree::GetHft();int length = huffcode.size();for (int i = 1; i < length; i++)//只需计算0-8下标的huffman编码{string code1;int k = length;int c = i;int parent = hft.at(c).parent;while (parent != 0){string temp = hft.at(parent).leftchild == c ? "0" : "1";//code1.insert(0, temp);huffcode.at(i).code.insert(0, temp);c = parent;parent = hft.at(c).parent;}}}void PrintCode(){int length = huffcode.size();for (int i = 1; i < length; i++){cout << "data:" << huffcode.at(i).ch << " --> " << hft.at(i).weight<< " --> " << "huffmancode:" << huffcode.at(i).code << endl;}}
};int main()
{vector w{ 5,29,7,8,14,23,3,11 };vector ch{ 'A','B','C','D','E','F','G','H' };HuffManCode hc(&ch,w);hc.CreatHuffman();hc.CreateHuffmanCode();cout << "打印哈夫曼树:" << endl;hc.PrintTree();cout << "打印哈夫曼编码:" << endl;hc.PrintCode(); return 0;
}