Understanding Tree Serialization: Implementation Guide
Implement BFS and DFS tree serialization in Python, Java, and C++. Step-by-step code with detailed line-by-line explanations.
Implement BFS and DFS tree serialization in Python, Java, and C++. Step-by-step code with detailed line-by-line explanations.
Author
Mr. Oz
Date
Read
8 mins
Level 2
In Level 1, we used the shipping container analogy to understand what serialization and deserialization are at a conceptual level. Now it's time to roll up our sleeves and write the actual code.
We'll cover two approaches: BFS (level-order) and DFS (preorder), with full implementations in Python, Java, and C++.
Throughout this guide, we'll work with this binary tree:
1
/ \
2 3
/ \
4 5
Our goal: convert this tree to the string "1,2,3,null,null,4,5" and then back again.
BFS visits the tree level by level, left to right. We use a queue to track which node to visit next, and we include null markers for missing children.
from collections import deque
class TreeNode:
def __init__(self, val=0, left=None, right=None):
self.val = val
self.left = left
self.right = right
def serialize(root):
if not root:
return "null"
queue = deque([root])
result = []
while queue:
node = queue.popleft()
if node:
result.append(str(node.val))
queue.append(node.left)
queue.append(node.right)
else:
result.append("null")
# Trim trailing nulls for cleaner output
while result and result[-1] == "null":
result.pop()
return ",".join(result)
Line 1: Import the queue data structure from collections
Lines 3-6: Define the tree node with a value and two child pointers
Line 9: Handle the edge case of an empty tree immediately
Line 10: Start BFS with the root node in the queue
Lines 13-20: Dequeue each node, record its value (or "null"), and enqueue its children regardless of whether they exist
Lines 22-23: Remove trailing nulls to keep the output compact and readable
def deserialize(data):
if data == "null":
return None
values = data.split(",")
root = TreeNode(int(values[0]))
queue = deque([root])
i = 1
while queue and i < len(values):
node = queue.popleft()
if i < len(values) and values[i] != "null":
node.left = TreeNode(int(values[i]))
queue.append(node.left)
i += 1
if i < len(values) and values[i] != "null":
node.right = TreeNode(int(values[i]))
queue.append(node.right)
i += 1
return root
Lines 2-3: Return None if the serialized data represents an empty tree
Line 5: Split the comma-separated string into individual value tokens
Line 6: Create the root node from the first value
Lines 8-17: Use the same BFS pattern — each dequeued node gets its left and right children assigned from the next two values in the list. Non-null values become new nodes added to the queue for further processing
The key insight: The queue-based approach guarantees that parent nodes are processed before their children, so the value index i always stays synchronized with the correct position in the serialized data.
DFS preorder visits root first, then the entire left subtree, then the entire right subtree. This produces a different — often more compact — format for sparse trees.
public class Codec {
public String serialize(TreeNode root) {
StringBuilder sb = new StringBuilder();
buildString(root, sb);
return sb.toString();
}
private void buildString(TreeNode node, StringBuilder sb) {
if (node == null) {
sb.append("null").append(",");
return;
}
sb.append(node.val).append(",");
buildString(node.left, sb);
buildString(node.right, sb);
}
public TreeNode deserialize(String data) {
LinkedList<String> nodes = new LinkedList<>(
Arrays.asList(data.split(","))
);
return buildTree(nodes);
}
private TreeNode buildTree(LinkedList<String> nodes) {
String val = nodes.removeFirst();
if (val.equals("null")) return null;
TreeNode node = new TreeNode(Integer.parseInt(val));
node.left = buildTree(nodes);
node.right = buildTree(nodes);
return node;
}
}
Lines 2-4: Public serialize method creates a StringBuilder and delegates to the recursive helper
Lines 7-13: The recursive preorder traversal — append the current node's value, then recursively process left, then right. Null nodes are explicitly marked with "null,"
Lines 16-18: Deserialization splits the string into a linked list of tokens and passes it to the recursive builder
Lines 21-27: The recursive builder removes tokens from the front of the list. Each "null" returns null; each value creates a node and recursively builds its left then right children. The linked list acts as a queue, maintaining order
#include <string>
#include <sstream>
#include <queue>
using namespace std;
struct TreeNode {
int val;
TreeNode *left, *right;
TreeNode(int x) : val(x), left(nullptr), right(nullptr) {}
};
string serialize(TreeNode* root) {
if (!root) return "null";
queue<TreeNode*> q;
q.push(root);
ostringstream oss;
while (!q.empty()) {
TreeNode* node = q.front();
q.pop();
if (node) {
oss << node->val << ",";
q.push(node->left);
q.push(node->right);
} else {
oss << "null,";
}
}
string result = oss.str();
// Trim trailing commas and nulls
while (!result.empty() && result.back() == ',') result.pop_back();
size_t pos = result.rfind(",null");
while (pos != string::npos && pos == result.size() - 5) {
result.erase(pos);
pos = result.rfind(",null");
}
return result.empty() ? to_string(root->val) : result;
}
Lines 1-3: Include necessary standard library headers for string operations and queue
Lines 5-8: Define the TreeNode struct with constructor initialization
Lines 11-28: BFS traversal using ostringstream for efficient string building — the stream avoids the overhead of repeated string concatenation in C++
Lines 29-35: Post-processing to clean up trailing commas and null values from the output string
split(",") on an empty string can produce unexpected arrays
i < len(values) before accessing the array
"1,2,3,null,null,4,5"
"1,2,null,null,3,4,null,null,5,null,null"
Want the deep dive?
Level 1
Learn what tree serialization is through a fun analogy about shipping a treehouse across the country.
Author
Mr. Oz
Duration
5 mins
Level 2
Implementation guide with BFS and DFS serialization in Python, Java, and C++ with line-by-line explanations.
Author
Mr. Oz
Duration
8 mins
Level 3
Advanced optimization with compression, binary formats, and real-world production use cases.
Author
Mr. Oz
Duration
12 mins