Understanding Arrays vs Strings: Implementation Deep Dive

Dive into the technical implementation of arrays and strings. Learn how memory allocation, mutability, and character encoding affect your code's performance.

Level 2

Technical diagram showing memory layout of arrays and strings with visual representation of data structures

Author

Mr. Oz

Date

4 February 2026

Read

8 mins

Memory Layout: The Foundation

At the memory level, arrays and strings share a fundamental characteristic: contiguous memory allocation. When you create an array of integers or a string of characters, your computer allocates a continuous block of memory.

For an array [10, 20, 30, 40], memory might look like:

Address  | Value
0x1000   | 10
0x1004   | 20
0x1008   | 30
0x100C   | 40

Each integer occupies 4 bytes (the usual size of a 32-bit int), and they're stored sequentially. This layout enables O(1) random access: to read the element at index 2, the computer calculates base_address + (2 × 4) and jumps directly to 0x1008.
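The address arithmetic can be sketched in a few lines. This is only an illustration of the formula, not how any particular runtime is implemented; element_address is a hypothetical helper, and the base address 0x1000 and 4-byte element size are taken from the example table.

```python
def element_address(base_address: int, index: int, element_size: int) -> int:
    """Return the address of element `index` in a contiguous array."""
    return base_address + index * element_size


# Values from the example table: base 0x1000, 4-byte integers.
addresses = [hex(element_address(0x1000, i, 4)) for i in range(4)]
print(addresses)  # ['0x1000', '0x1004', '0x1008', '0x100c']
```

The calculation is one multiplication and one addition regardless of the index, which is where the O(1) comes from.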

Strings as Character Arrays

Strings are essentially arrays of characters, but with special semantics. The string "hello" is stored as:

Address  | Value
0x2000   | 'h' (104)
0x2001   | 'e' (101)
0x2002   | 'l' (108)
0x2003   | 'l' (108)
0x2004   | 'o' (111)
0x2005   | '\0' (0)   ← Null terminator (C-style strings)

In C-style strings, the null character '\0' marks the end. Modern languages like Python and Java use length fields instead of null terminators, but the principle remains: strings are sequential character data.
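To make the two conventions concrete, here is a rough sketch in Python. Both c_style_length and LengthPrefixedString are hypothetical names invented for this illustration; real runtimes store more metadata than a single length field, and the scan assumes the buffer really does contain a terminator.

```python
def c_style_length(buffer: bytes) -> int:
    """Scan for the first null byte, as C's strlen does: O(n)."""
    length = 0
    while buffer[length] != 0:   # assumes a terminator is present
        length += 1
    return length


class LengthPrefixedString:
    """Hypothetical string type that stores its length next to the data."""

    def __init__(self, text: str) -> None:
        self.data = text.encode("ascii")
        self.length = len(self.data)  # recorded once at creation


c_buffer = b"hello\x00"
print(c_style_length(c_buffer))              # 5, found by scanning
print(LengthPrefixedString("hello").length)  # 5, read directly: O(1)
```

The trade-off is a few extra bytes of metadata per string in exchange for a constant-time length lookup.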

Mutability: The Critical Difference

One of the most important distinctions between arrays and strings is mutability—whether you can change the data after creation.

In Python, lists (dynamic arrays) are mutable, but strings are not:

arr = [1, 2, 3]
arr[0] = 99      # ✓ Works: arr is now [99, 2, 3]

s = "hello"
s[0] = 'H'       # ✗ TypeError: strings are immutable

This immutability has profound implications. When you "modify" a string in Python or Java, you're actually creating a new string and abandoning the old one:

s = "hello"
s = s + " world"   # Creates new string "hello world"
                    # Original "hello" can be reclaimed once nothing references it

Arrays don't have this limitation. You can modify individual elements without copying the entire structure.
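One way to observe the difference is to compare object identity before and after a change. A small sketch, where the variable names are chosen only for this illustration:

```python
arr = [1, 2, 3]
arr_id_before = id(arr)
arr[0] = 99                      # modifies the existing list in place
same_list = id(arr) == arr_id_before

s = "hello"
original = s                     # keep a second reference to the old string
s = s + " world"                 # builds a new string object

print(same_list)                 # True
print(original)                  # hello (the old string is unchanged)
print(s is original)             # False
```

The list keeps its identity after the assignment, while the name s ends up bound to a different object.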

Common Operations and Their Complexities

Understanding the time complexity of operations helps you write efficient code:

Operation           | Array          | String
Access by index     | O(1)           | O(1)
Search for value    | O(n)           | O(n)
Insert at beginning | O(n)           | O(n), creates new string
Append at end       | O(1) amortized | O(n), creates new string

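Notice that the operations that modify a string are slower than their array counterparts because each one creates a new copy; read-only operations such as indexing and searching cost the same. This is why building a large string by repeated concatenation is an anti-pattern: use join() in Python or a StringBuilder in Java instead.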
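The "amortized" in the append row can be made visible by counting how often a list's backing storage actually grows. The sketch below relies on sys.getsizeof reflecting allocated capacity, which is a CPython implementation detail; other interpreters may behave differently, and the exact count is not guaranteed.

```python
import sys

items = []
resizes = 0
last_size = sys.getsizeof(items)

for i in range(1000):
    items.append(i)
    size = sys.getsizeof(items)
    if size != last_size:        # the backing buffer was reallocated
        resizes += 1
        last_size = size

print(f"1000 appends, {resizes} reallocations")
```

Because the list over-allocates each time it grows, only a small fraction of appends pay the O(n) copying cost, and the average cost per append stays constant.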

Character Encoding: The Hidden Complexity

Strings bring encoding complexity that arrays don't have. Depending on the encoding, a character may take 1 byte (ASCII), 1 to 4 bytes (UTF-8), 2 or 4 bytes (UTF-16), or always 4 bytes (UTF-32).

# Python 3: Unicode strings
s = "café"
print(len(s))        # Output: 4 (characters)
print(len(s.encode('utf-8')))  # Output: 5 (bytes)

# The 'é' uses 2 bytes in UTF-8

Arrays don't have this complexity—each element is a fixed size. An array of 32-bit integers always uses exactly 4 bytes per element, regardless of the value stored.
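To see how the byte count varies, the following sketch encodes four sample characters in three encodings. The sample characters are arbitrary choices for this illustration, written as escapes so the code points are unambiguous.

```python
samples = ["a", "\u00e9", "\u20ac", "\U0001F600"]   # a, é, €, and an emoji
encodings = ["utf-8", "utf-16-le", "utf-32-le"]

# Bytes needed per character in each encoding. The "-le" variants are
# used so that no byte-order mark is added to the output.
sizes = {
    ch: [len(ch.encode(enc)) for enc in encodings]
    for ch in samples
}

for ch, byte_counts in sizes.items():
    print(f"U+{ord(ch):04X}", byte_counts)
# U+0061 [1, 2, 4]
# U+00E9 [2, 2, 4]
# U+20AC [3, 2, 4]
# U+1F600 [4, 4, 4]
```

Only UTF-32 behaves like a fixed-width array; in UTF-8 and UTF-16, finding the nth character in raw bytes can require scanning from the start.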

Professional Patterns and Pitfalls

Pitfall: String Concatenation in Loops

Never build large strings by concatenating in a loop. Each concatenation creates a new string, copying all previous characters.

# BAD: O(n²) time in the worst case (CPython can sometimes resize in place, but don't rely on it)
s = ""
for i in range(10000):
    s += str(i)  # Creates new string every iteration

# GOOD: O(n) time
parts = [str(i) for i in range(10000)]
s = "".join(parts)
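The quadratic cost is easier to see with a simple counting model than with a stopwatch. The two functions below are hypothetical helpers that count characters copied, assuming every concatenation writes out a brand-new string and ignoring any interpreter optimizations.

```python
def chars_copied_by_concat(parts: list) -> int:
    """Count characters copied if every += builds a fresh string."""
    copied = 0
    current_length = 0
    for part in parts:
        current_length += len(part)
        copied += current_length   # the whole new string is written out
    return copied


def chars_copied_by_join(parts: list) -> int:
    """join() sizes the result once and copies each character once."""
    return sum(len(part) for part in parts)


parts = ["x"] * 10_000
print(chars_copied_by_concat(parts))  # 50005000
print(chars_copied_by_join(parts))    # 10000
```

Under this model, 10,000 one-character pieces cost about 50 million character copies with repeated concatenation, compared with 10,000 for a single join.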

Best Practice: Character Arrays for Manipulation

When you need to perform many modifications, convert strings to character arrays first.

# Convert to list for in-place modifications
s = "hello"
arr = list(s)      # ['h', 'e', 'l', 'l', 'o']
arr[0] = 'H'        # ✓ Works
s = "".join(arr)   # Convert back to string
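A slightly larger example of the same pattern is an in-place reversal with two indices. This is a sketch for illustration; reverse_string is a name chosen here, and in everyday Python the slice s[::-1] does the same job more concisely.

```python
def reverse_string(s: str) -> str:
    """Reverse a string by swapping characters in a mutable list."""
    chars = list(s)
    left, right = 0, len(chars) - 1
    while left < right:
        chars[left], chars[right] = chars[right], chars[left]
        left += 1
        right -= 1
    return "".join(chars)


print(reverse_string("hello"))  # olleh
```

All the swaps happen on the list, so only two copies are made in total: one when converting to a list and one when joining back.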

Best Practice: Use StringBuilder for Java

In Java, use StringBuilder instead of repeated string concatenation when you build a string incrementally, such as inside a loop.

// GOOD: O(n) time
StringBuilder sb = new StringBuilder();
for (int i = 0; i < 10000; i++) {
    sb.append(i);
}
String result = sb.toString();
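For readers following along in Python, a rough counterpart to the StringBuilder loop is io.StringIO from the standard library. This is offered as an analogy rather than an exact equivalent; for most cases "".join() remains the more common idiom.

```python
import io

buffer = io.StringIO()
for i in range(10000):
    buffer.write(str(i))      # appends to an internal buffer
result = buffer.getvalue()    # one final string, like sb.toString()

print(result[:15])   # 012345678910111
print(len(result))   # 38890
```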

Production Considerations

When working with strings in production, consider these factors:

  • Memory overhead: Strings are objects with metadata (length, encoding, hash cache). Character arrays are more lightweight.
  • Interning: Some languages intern strings (reuse memory for identical strings), which can save memory but adds lookup overhead and can keep strings alive longer than needed.
  • Immutability safety: String immutability makes them safe to share between threads and suitable as hash map keys. Mutable arrays require synchronization when they are shared and modified concurrently.
  • Substring efficiency: In some languages, substrings share memory with the original string. In others, they copy data entirely.
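Three of these points can be observed directly in Python. The sketch below uses only standard-library calls (sys.intern and memoryview); the variable names are illustrative, and the behavior shown is specific to Python rather than a statement about other languages.

```python
import sys

# Immutability: strings are hashable, so they work as dict keys.
ages = {"alice": 30}
try:
    bad = {["alice"]: 30}        # lists are mutable, hence unhashable
except TypeError:
    list_key_rejected = True
else:
    list_key_rejected = False

# Interning: sys.intern returns one shared object per distinct value.
a = sys.intern("".join(["he", "llo"]))   # built at runtime
b = sys.intern("hello")
shared = a is b

# Substrings: slicing bytes copies, a memoryview slice shares the buffer.
data = b"hello world"
view = memoryview(data)[0:5]
shares_buffer = view.obj is data

print(list_key_rejected, shared, shares_buffer)  # True True True
```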

Building on Level 1

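In Level 1, we compared arrays to numbered lockers and strings to alphabet chains. Now you understand where those analogies come from: both structures sit in contiguous memory and give direct access by index, but you can swap the contents of a single "locker" in place, while changing one link of an immutable string means building a whole new "chain."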

But there's more beneath the surface. How does the CPU cache interact with these contiguous memory layouts? Why do some string operations seem faster than others? In Level 3, we'll explore the hardware-level details that explain these behaviors.