Dynamic Programming with Trees

Consider the following problem: (Source: leetcode problem 40)

Given a collection of candidate numbers (C) and a target number (T), find all unique combinations inCwhere the candidate numbers sums to T.

Each number in C may only be used once in the combination.

Note:

All numbers (including target) will be positive integers.
The solution set must not contain duplicate combinations.

For example, given candidate set[10, 1, 2, 7, 6, 1, 5]and target8, A solution set is:

[
  [1, 7],
  [1, 2, 5],
  [2, 6],
  [1, 1, 6]
]

The Idea:

Lets begin with a brute force approach, because doing so often reveals areas we can potentially improve upon. With brute force, our goal is to build a combination tree that will identify all sub combinations within the array. We can then traverse this tree to obtain all path paths that sum to target k.

Consider the array [2,1,1] and target k = 3

Iterating through the tree, we can identify the accumulative summation that follows from each nodes parent. Using this approach, we can identify the following solutions.

The general strategy in building any sort of tree to first build each node, and then form the connection as the algorithm recurs back up. However, rather than forming a single connection at a time, when it comes to an n-ary tree, it is best to be concerned with a vector of children at a time. To get myself started, I like to build the default connection root node, connect all it's children, and then work on each subbranch independently.

Can you spot any redundancies or issues in the image above? Here are a few questions we can ask:

How can we be smarted about the order of the elements within the array?
When can we conclude the construction of a subtree?
How can we incorporate dynamic programming to both save even more time and space?
Duplicates. How can we efficiently acknowledge that a path is duplicated?

I am going to address these questions in order.

How can we be smarted about the order of the elements within the array?

Sorting the array in ascending order is a solid investment because it ensures that our algorithm does not have to deal with elements that already exceed the target k. Because our algorithm iterates left to right on each subbranch, each subbranch also carries this advantage.

When can we conclude the construction of a subtree?

For the same reason, it would be unwise to continue our search for summations that exceed target k. My first implementation deals with this concern, while my second one is does not.

How can we incorporate dynamic programming to both save even more time and space?

Because trees have a recurring structure, it is quite likely that our algorithm would be redoing the same computation over and over again. Fun fact: if we print all combinations in a systematic way, we will notice visual fractal like properties. Dynamic programming allows us to identify these kind of circumstances and essentially just store the result in some form or other so it does not become necessary to perform the same computation. There are two major advantages here. One, in our case, this would be mean avoiding having to recompute a potentially large subtree. Instead, we can create a pointer to that subtree and immediately move on from there. Secondly, we save space from having allocate new memory to every node in this subtree. Since we are redirecting pointers here, it is important to note iterating through the tree bares no advantage, but in terms of time and space - we save a ton.

The following image demonstrates recurring the kind of recurring subtrees we want to look out for.

Using dynamic programming, we can vastly reduce the complexity.

Duplicates. How can we efficiently acknowledge that a path is duplicated?

One solution is to store paths into an unordered_set<string> that can be used to as a check for unique path solutions. (See root_leaf_paths_rec)

Previousk-Clustering NextImplementation 1

Last updated 4 years ago