## Monday, May 19, 2014

### Back To The Basics: The Binary Tree Insert and Find Operations

In my previous post Back To The Basics: The Binary Tree I described The Binary Tree data structure as well as provided a basic template for the class in Java. In this post we’ll take a look at the insert and find operations.

The insert and find operations on a Binary Search Tree are actually quite similar. They both recurse through the tree starting at a particular node (typically the root) using a similar method to a binary search to find the appropriate place that the node should exist at. Once the appropriate place for the node has been determined the find method will return the node if it exists and the insert method will insert the node into that position if it does not already exist. Both operations take time proportional to the height of the tree.

Let’s first take a look at the insert operation.

public void insert(T newData)
this.insert(this, newData);
}

private void insert(Tree<T> parent, T newData)
if(parent.data == null{
parent.data = newData;
return;
}

if(newData.compareTo(parent.data) == 0)
return;

if(newData.compareTo(parent.data) < 0{
if(parent.left == null) parent.left = new Tree<T>();
this.insert(parent.left, newData);
} else {
if(parent.right == null) parent.right = new Tree<T>();
this.insert(parent.right, newData);
}
}

Now let's take a look at the find operation.

public
Tree<T> find(T dataToFind) {
return this.find(this, dataToFind);
}

private Tree<T> find(Tree<T> parent, T dataToFind) {
if(parent == null)
return parent;

if(parent.data.compareTo(dataToFind) == 0)
return parent;

if(dataToFind.compareTo(parent.data) < 0)
return this.find(parent.left, dataToFind);

return this.find(parent.right, dataToFind);
}

What you should notice is that these two operations are strikingly similar. They both take advantage of recursion. Both of the algorithms are made up of the same three parts.

The first part of the algorithm checks to see if the current node exists. In the case of the insert operation this is true if the data of the node is null.* If this is true we know that we can safely set the nodes data and exit. In the find operation this is true if the node itself is null in which case we didn't find what we're looking for. In either case this is an exit condition of the recursive calls.

The second part of the algorithm checks to see if the current nodes data is equal to the data passed in to the method. In the case of the insert operation this is a signal that duplicate data is attempting to be inserted into the tree. In this case we can terminate the recursion since we know we no longer have any additional work to do. In the case of the find operation this is the signal that we've found the data that we're looking for so we can return the current node and exit our recursion.

The last part of the algorithm recurses to the next level of the tree. It decides which branch of the tree to recurse down based on whether the data passed in to the method is less or greater than the current nodes value.  This is what allows this algorithm to have time proportional to the height of the tree. Because we only recurse down one child node we're only ever acting on one node at each level of the tree.

* The insert operation needs to set the data on a an existing node and cannot act on a null node. The reason for this is that we'd need a way to create the new node and assign it to the parent. Because we're using recursion this is difficult but easily overcome if before we recurse into the insert operation we first make sure the node we're sending in IS NOT NULL.