Understanding Tree Serialization: Implementation Guide

Implement BFS and DFS tree serialization in Python, Java, and C++. Step-by-step code with detailed line-by-line explanations.

Author

Mr. Oz

Date

Read

8 mins

Level 2

Side-by-side code editor showing tree serialization implementation with tree diagrams transforming into string arrays

Author

Mr. Oz

Date

Read

8 mins

Share

In Level 1, we used the shipping container analogy to understand what serialization and deserialization are at a conceptual level. Now it's time to roll up our sleeves and write the actual code.

We'll cover two approaches: BFS (level-order) and DFS (preorder), with full implementations in Python, Java, and C++.

The Tree We'll Serialize

Throughout this guide, we'll work with this binary tree:

        1
       / \
      2   3
         / \
        4   5

Our goal: convert this tree to the string "1,2,3,null,null,4,5" and then back again.

Approach 1: BFS Level-Order Serialization (Python)

BFS visits the tree level by level, left to right. We use a queue to track which node to visit next, and we include null markers for missing children.

from collections import deque

class TreeNode:
    def __init__(self, val=0, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right

def serialize(root):
    if not root:
        return "null"
    queue = deque([root])
    result = []
    while queue:
        node = queue.popleft()
        if node:
            result.append(str(node.val))
            queue.append(node.left)
            queue.append(node.right)
        else:
            result.append("null")
    # Trim trailing nulls for cleaner output
    while result and result[-1] == "null":
        result.pop()
    return ",".join(result)

Line 1: Import the queue data structure from collections

Lines 3-6: Define the tree node with a value and two child pointers

Line 9: Handle the edge case of an empty tree immediately

Line 10: Start BFS with the root node in the queue

Lines 13-20: Dequeue each node, record its value (or "null"), and enqueue its children regardless of whether they exist

Lines 22-23: Remove trailing nulls to keep the output compact and readable

BFS Deserialization (Python)

def deserialize(data):
    if data == "null":
        return None
    values = data.split(",")
    root = TreeNode(int(values[0]))
    queue = deque([root])
    i = 1
    while queue and i < len(values):
        node = queue.popleft()
        if i < len(values) and values[i] != "null":
            node.left = TreeNode(int(values[i]))
            queue.append(node.left)
        i += 1
        if i < len(values) and values[i] != "null":
            node.right = TreeNode(int(values[i]))
            queue.append(node.right)
        i += 1
    return root

Lines 2-3: Return None if the serialized data represents an empty tree

Line 5: Split the comma-separated string into individual value tokens

Line 6: Create the root node from the first value

Lines 8-17: Use the same BFS pattern — each dequeued node gets its left and right children assigned from the next two values in the list. Non-null values become new nodes added to the queue for further processing

The key insight: The queue-based approach guarantees that parent nodes are processed before their children, so the value index i always stays synchronized with the correct position in the serialized data.

Approach 2: DFS Preorder Serialization (Java)

DFS preorder visits root first, then the entire left subtree, then the entire right subtree. This produces a different — often more compact — format for sparse trees.

public class Codec {
    public String serialize(TreeNode root) {
        StringBuilder sb = new StringBuilder();
        buildString(root, sb);
        return sb.toString();
    }

    private void buildString(TreeNode node, StringBuilder sb) {
        if (node == null) {
            sb.append("null").append(",");
            return;
        }
        sb.append(node.val).append(",");
        buildString(node.left, sb);
        buildString(node.right, sb);
    }

    public TreeNode deserialize(String data) {
        LinkedList<String> nodes = new LinkedList<>(
            Arrays.asList(data.split(","))
        );
        return buildTree(nodes);
    }

    private TreeNode buildTree(LinkedList<String> nodes) {
        String val = nodes.removeFirst();
        if (val.equals("null")) return null;
        TreeNode node = new TreeNode(Integer.parseInt(val));
        node.left = buildTree(nodes);
        node.right = buildTree(nodes);
        return node;
    }
}

Lines 2-4: Public serialize method creates a StringBuilder and delegates to the recursive helper

Lines 7-13: The recursive preorder traversal — append the current node's value, then recursively process left, then right. Null nodes are explicitly marked with "null,"

Lines 16-18: Deserialization splits the string into a linked list of tokens and passes it to the recursive builder

Lines 21-27: The recursive builder removes tokens from the front of the list. Each "null" returns null; each value creates a node and recursively builds its left then right children. The linked list acts as a queue, maintaining order

BFS Serialization (C++)

#include <string>
#include <sstream>
#include <queue>
using namespace std;

struct TreeNode {
    int val;
    TreeNode *left, *right;
    TreeNode(int x) : val(x), left(nullptr), right(nullptr) {}
};

string serialize(TreeNode* root) {
    if (!root) return "null";
    queue<TreeNode*> q;
    q.push(root);
    ostringstream oss;
    while (!q.empty()) {
        TreeNode* node = q.front();
        q.pop();
        if (node) {
            oss << node->val << ",";
            q.push(node->left);
            q.push(node->right);
        } else {
            oss << "null,";
        }
    }
    string result = oss.str();
    // Trim trailing commas and nulls
    while (!result.empty() && result.back() == ',') result.pop_back();
    size_t pos = result.rfind(",null");
    while (pos != string::npos && pos == result.size() - 5) {
        result.erase(pos);
        pos = result.rfind(",null");
    }
    return result.empty() ? to_string(root->val) : result;
}

Lines 1-3: Include necessary standard library headers for string operations and queue

Lines 5-8: Define the TreeNode struct with constructor initialization

Lines 11-28: BFS traversal using ostringstream for efficient string building — the stream avoids the overhead of repeated string concatenation in C++

Lines 29-35: Post-processing to clean up trailing commas and null values from the output string

Common Pitfalls

  • Trailing commas: When building the string, it's easy to end with a comma. Always trim it before returning, or use a join method that avoids it
  • Empty string input: Always check for empty or "null" input before attempting to split and process. An unguarded split(",") on an empty string can produce unexpected arrays
  • Leading/trailing nulls: In BFS serialization, all the nulls at the end of the last level can be safely trimmed. But nulls in the middle levels must be preserved, or the deserialization will produce a different tree structure
  • Integer parsing errors: When deserializing, always validate that non-null tokens are valid integers before calling parseInt or stoi
  • Index out of bounds: In BFS deserialization, always check i < len(values) before accessing the array

BFS vs DFS Serialization Formats

BFS Level-Order

"1,2,3,null,null,4,5"

  • Preserves the visual tree shape
  • LeetCode standard format
  • Easier to verify visually
  • Can include trailing nulls (trimmed for cleanliness)
DFS Preorder

"1,2,null,null,3,4,null,null,5,null,null"

  • More explicit null markers
  • Naturally recursive implementation
  • Compact for balanced trees
  • Verbose for wide/shallow trees

Production Tips

  • Use StringBuilder / ostringstream: Never concatenate strings in a loop. It creates O(n^2) temporary objects
  • Choose your delimiter carefully: Commas work for integers, but if your tree stores strings, use a delimiter that won't appear in the data (like a pipe character or a length-prefixed format)
  • Validate input during deserialization: Production code should handle malformed input gracefully — return null or throw a descriptive exception
  • Consider using integer arrays instead of strings: If you control both serialization and deserialization, passing an integer array (with a sentinel like Integer.MIN_VALUE for null) avoids string parsing overhead entirely

Key Takeaways

  • BFS serialization uses a queue and processes nodes level by level
  • DFS preorder serialization uses recursion and processes root-left-right
  • Null markers are essential for both approaches to preserve tree structure
  • BFS is the LeetCode standard format; DFS is more natural for recursive implementations
  • Always handle edge cases: empty tree, single node, and completely unbalanced trees
  • Use StringBuilder (Java) or ostringstream (C++) instead of string concatenation in loops

All Levels