Complete/"optimal" assignment of values to binary tree nodes to cover a set of sequences

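I have an alphabet of K symbols, a set S of sequences of length L over this alphabet (no duplicates), and a given binary tree with 2^N nodes. I want to assign symbols to the nodes (one symbol per node) so that every sequence in the set appears at least once as a path in the tree, i.e. the tree covers the whole set. If that is not possible, I want as many sequences as possible to appear at least once.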

Note 1: Not every node has to be assigned; only the portion of the tree needed to cover as much of the set as possible.

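Note 2: The tree is traversed (and the output is therefore ordered) from a node towards the root. For a node n and L symbols, the first symbol of the resulting L-sequence is n.symbol, the second is n.parent.symbol, and so on. All nodes are initialized with an empty symbol (no symbol).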
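To make the reading direction in Note 2 concrete, here is a minimal Python sketch. The `Node` class, its `symbol`, `parent` and `children` fields, and the `length` argument of `sequence` are illustrative assumptions rather than anything given in the question; an empty symbol is represented by `None`.

```python
from typing import List, Optional


class Node:
    """Hypothetical tree node; symbol None means 'empty / unassigned'."""

    def __init__(self, parent: Optional["Node"] = None, symbol: Optional[str] = None):
        self.parent = parent
        self.symbol = symbol
        self.children: List["Node"] = []
        if parent is not None:
            parent.children.append(self)

    def sequence(self, length: int) -> Optional[List[Optional[str]]]:
        """Symbols read from this node up towards the root.

        Returns None if the node has fewer than length-1 ancestors,
        i.e. no full L-path starts at this node.
        """
        out = []
        node = self
        for _ in range(length):
            if node is None:
                return None
            out.append(node.symbol)
            node = node.parent
        return out


# A chain root -> a -> b, read bottom-up from b with L = 3.
root = Node(symbol="x")
a = Node(parent=root, symbol="y")
b = Node(parent=a, symbol="z")
print(b.sequence(3))  # ['z', 'y', 'x']
```

For an unassigned node the first entry is `None` and the remaining L-1 entries are the "exposed" symbols of its ancestors.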

My approach so far is the following (in Python-like pseudocode):

 1  Create a dictionary D (or HashMap/HashTable) whose keys are the unique (L-1)-subsequences (last L-1 symbols) of S and whose values are lists of all (first) symbols seen before each key
 2  fringe ← Ø
 3  Assign a random sequence from S to the top L nodes (starting at the root), remove it from S, and add their unassigned children to the fringe
 4  while fringe ≠ Ø and S ≠ Ø
 5      for n in fringe                       # Fringe contains all the unassigned children of all assigned nodes
 6          sub_seq = n.sequence()            # Unassigned nodes output the last L-1 symbols of their L-path (their own first symbol is still empty)
 7          best_score_1 ← -1                 # Used to evaluate the possible symbols for node n
 8          best_score_2 ← -1                 # Tie-breaker for score_1
 9          
10          for symbol in D[sub_seq]          # Due to line 24, D[sub_seq] may be an empty list
11              L_seq = symbol + sub_seq
12              score_1 = 1 if L_seq in S else 0
13              score_2 = <Table/Function below for arguments: node n, first L-1 symbols of L_seq>
14              
15              if score_1 > best_score_1 or (score_1 == best_score_1 and score_2 > best_score_2)
16                  best_score_1 = score_1
17                  best_score_2 = score_2
18                  best_of_node = n
19                  best_with_sub_seq = sub_seq
20                  best_for_symbol = symbol
21          
22          if best_score_1 > -1
23              best_of_node.set_symbol(best_for_symbol)
24              D[best_with_sub_seq].remove(best_for_symbol)
25              S.discard(best_of_node.sequence())      # discard, not remove: the sequence is not in S when score_1 was 0
26              fringe.remove(best_of_node)
27              fringe.extend(best_of_node.children)    # extend, not append: add each child, not the list itself
28              break
29      
30      if best_score_1 == -1                 # If D[sub_seq] is empty for all nodes in fringe, best_score_1 will be -1
31          print("Assignment incomplete: ran out of good options")
32          break
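Step 1 of the pseudocode can be sketched as follows. This is only an illustration under the assumption that each sequence is a Python string of single-character symbols; the name `build_suffix_index` is hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def build_suffix_index(sequences: Iterable[str]) -> Dict[str, List[str]]:
    """Map each (L-1)-suffix to the first symbols that precede it in S."""
    index: Dict[str, List[str]] = defaultdict(list)
    for seq in sequences:
        first, suffix = seq[0], seq[1:]
        index[suffix].append(first)
    return dict(index)


S = {"abc", "bbc", "cab"}
D = build_suffix_index(sorted(S))  # sorted only to make the output order stable
print(D)  # {'bc': ['a', 'b'], 'ab': ['c']}
```

Because a plain `dict` is returned, looking up a suffix that never occurs in S raises `KeyError`; `D.get(sub_seq, [])` avoids that in the inner loop.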

The function for score_2 (for node n and its top_sub_seq):

                            | len( n.children ) == 2 | len( n.children ) < 2 |
----------------------------+------------------------+-----------------------+
len( D[top_sub_seq] )  > 1  |           1            |           0           |
----------------------------+------------------------+-----------------------+
len( D[top_sub_seq] ) <= 1  |          -1            |           1           |
----------------------------+------------------------+-----------------------+
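The table can be written as a small function. This is a sketch that assumes D maps string suffixes to lists of symbols and that the node's child count is passed in directly; the signature is an assumption made for illustration.

```python
from typing import Dict, List


def score_2(num_children: int, top_sub_seq: str, D: Dict[str, List[str]]) -> int:
    """Tie-breaker from the table: favour forking nodes when several
    continuations remain for top_sub_seq, and non-forking nodes otherwise."""
    many_options = len(D.get(top_sub_seq, [])) > 1
    forking = num_children == 2
    if many_options:
        return 1 if forking else 0
    return -1 if forking else 1


D = {"ab": ["c", "d"], "bc": ["a"]}
print(score_2(2, "ab", D))  # 1
print(score_2(2, "bc", D))  # -1
```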

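Both scores serve as heuristics, based on an intuitive understanding of the properties of an "optimal" solution. The first score tells us whether we are creating a sequence that is wanted and has not been seen before. The second score evaluates whether the assignment should be made at a forking node (a node with two children) or at a non-forking node.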

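The algorithm terminates when none of the remaining sequences in S can overlap (with L-1 common symbols) any of the "exposed" sequences in the fringe. Obviously, we could continue the assignment by looking for smaller overlaps (with L-j common symbols) and "initializing" the j-1 unassigned nodes below the fringe, but I am trying to reach an optimal assignment and I am not sure this is a good way to go.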
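The termination condition, and its relaxed form with L-j common symbols, can be sketched as below. It assumes sequences and exposed suffixes are strings, with each exposed suffix listing the fringe node's ancestors from parent upwards; `can_continue` is a hypothetical helper, and j = 1 corresponds to the check the algorithm currently uses.

```python
from typing import Iterable, Set


def can_continue(exposed_suffixes: Iterable[str], remaining: Set[str], j: int = 1) -> bool:
    """True if some remaining sequence shares its last L-j symbols with
    the first L-j symbols of an exposed (L-1)-suffix in the fringe."""
    exposed = list(exposed_suffixes)
    return any(
        seq[j:] == suffix[: len(seq) - j]
        for seq in remaining
        for suffix in exposed
    )


print(can_continue(["bc"], {"abc", "cab"}))  # True: "abc" ends with "bc"
print(can_continue(["bc"], {"xyb"}))         # False with the full L-1 overlap
print(can_continue(["bc"], {"xyb"}, j=2))    # True: only "b" has to match
```

With j = 2 and L = 3, placing "xyb" means assigning y to the fringe node and x to one new node below it, which matches the j-1 extra nodes mentioned above.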

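Questions:

Can this problem be solved optimally?
Given the set S and the tree, can we theoretically compute the ratio achieved by an optimal solution (sequences of S covered / nodes assigned), either the optimal ratio itself or a "good" upper bound?
What would a better algorithm for this problem look like?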
