String Concatenation in Java — Level 3

A deep dive for JVM enthusiasts: from pre–Java 9 desugaring to invokedynamic-based concatenation via StringConcatFactory, with practical guidance on escape analysis, allocation behavior, and measurement.

Author

Mr. Oz

Date

Read

15 mins

Concatenation is ultimately about constructing a contiguous String value. The question is how many temporary objects and copies you create along the way, and whether the JIT can eliminate them.

  1. Pre–Java 9: desugaring to StringBuilder

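    Before Java 9, javac compiles each non-constant + expression into a fresh StringBuilder: one append per operand, then toString(). Expressions made only of compile-time constants are folded by the compiler and need no builder.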

    // Real Java source
    String greet(String name) {
      return "Hello " + name + "!";
    }
    // Rough pseudo bytecode
    0: new           #java/lang/StringBuilder
    3: dup
    4: invokespecial #StringBuilder.<init>()
    7: ldc           #"Hello "
    9: invokevirtual #StringBuilder.append(...)
    12: aload_1      // name
    13: invokevirtual #StringBuilder.append(...)
    16: ldc           #"!"
    18: invokevirtual #StringBuilder.append(...)
    21: invokevirtual #StringBuilder.toString()
    24: areturn      // greet returns the result directly
    // Bad in hot loops: every iteration builds a new String and re-copies everything accumulated so far
    String joinPlus(List<String> parts) {
      String s = "";
      for (String p : parts) {
        s += p;
      }
      return s;
    }
    
    // Better: one builder, optional capacity hint
    String joinBuilder(List<String> parts) {
      int est = parts.stream().mapToInt(String::length).sum();
      StringBuilder sb = new StringBuilder(est);
      for (String p : parts) sb.append(p);
      return sb.toString();
    }
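    To make the cost of += in a loop concrete, here is a minimal sketch of roughly what the pre-Java 9 scheme does on each iteration, written out by hand. The class and method names (LoopDesugar, joinPlusDesugared, charsCopied) are made up for illustration, and the copy count is a simplified model that ignores the extra copy made by toString().

```java
import java.util.List;

class LoopDesugar {
    // Hand-written equivalent of `s += p` under the StringBuilder desugaring:
    // a new builder per iteration, which re-copies the accumulated string.
    static String joinPlusDesugared(List<String> parts) {
        String s = "";
        for (String p : parts) {
            s = new StringBuilder().append(s).append(p).toString();
        }
        return s;
    }

    // Simplified model: characters copied into builders over the whole loop.
    static long charsCopied(List<String> parts) {
        long copied = 0;
        int length = 0;
        for (String p : parts) {
            length += p.length();
            copied += length; // the whole prefix plus p is copied again
        }
        return copied;
    }
}
```

    For three two-character parts the model counts 2 + 4 + 6 = 12 copied characters for a 6-character result, and the gap grows quadratically with the number of parts.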
  2. Java 9+: invokedynamic and StringConcatFactory

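    With JEP 280 (Java 9), javac emits an invokedynamic call site for most string concatenations. The site is linked to StringConcatFactory the first time it runs, so the concatenation strategy is chosen by the runtime instead of being fixed in the bytecode.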

    // Pseudo bytecode for: "Hello, " + name + "!"
    0: invokedynamic makeConcatWithConstants(a0) : String, 
       BootstrapMethod: StringConcatFactory.makeConcatWithConstants
       Recipe: "Hello, \u0001!" // \u0001 placeholders for args
    // Real Java (JDK 9+):
    String full(String name) { return "Hello, " + name + "!"; }
    
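    // Disassembly (abridged; constant-pool indices vary by JDK):
    // javac Full.java && javap -v Full
    //   1: invokedynamic #N,  0   // InvokeDynamic #0:makeConcatWithConstants:(Ljava/lang/String;)Ljava/lang/String;
    // The recipe is listed under BootstrapMethods, not next to the instruction:
    //   0: ... java/lang/invoke/StringConcatFactory.makeConcatWithConstants ...
    //     Method arguments: Hello, \u0001!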
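    The bootstrap method can also be called by hand, which shows what the call site is linked to. The sketch below is only an illustration: the class name IndyConcatDemo and the call-site name "greet" are arbitrary, and it links on every call, whereas a real invokedynamic site is linked once and reused.

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.StringConcatFactory;

class IndyConcatDemo {
    // Builds "Hello, " + name + "!" through StringConcatFactory directly.
    static String full(String name) {
        try {
            // Call-site signature: (String) -> String
            MethodType type = MethodType.methodType(String.class, String.class);
            CallSite site = StringConcatFactory.makeConcatWithConstants(
                    MethodHandles.lookup(), // lookup context of this class
                    "greet",                // arbitrary name chosen for the demo
                    type,
                    "Hello, \u0001!");      // recipe with one argument slot (U+0001)
            MethodHandle concat = site.dynamicInvoker();
            return (String) concat.invokeExact(name);
        } catch (Throwable t) {
            throw new IllegalStateException("linking or invocation failed", t);
        }
    }
}
```

    Calling IndyConcatDemo.full("Ada") goes through the same factory that javac targets for the full(String) method above.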
  3. Recipe forms and constants

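    When concatenation mixes constants and values, javac folds the constants into the recipe string and marks each dynamic argument with \u0001. A constant that cannot be written into the recipe itself is passed as an extra bootstrap argument and marked with \u0002.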

    // Example with three values and constants
    String msg = a + ":" + b + "/" + c;
    // Recipe (illustrative): "\u0001:\u0001/\u0001"
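    One way to read a recipe is as a template with argument slots. The following is a toy model of that reading, not the JDK's implementation: RecipeModel and apply are hypothetical names, it handles only the U+0001 marker, and it boxes primitives, which the real call site does not need to do.

```java
class RecipeModel {
    private static final char ARG = '\u0001'; // marker for "next dynamic argument"

    // Fills each U+0001 slot in the recipe with the next argument, in order.
    static String apply(String recipe, Object... args) {
        StringBuilder out = new StringBuilder(recipe.length());
        int next = 0;
        for (int i = 0; i < recipe.length(); i++) {
            char c = recipe.charAt(i);
            if (c == ARG) {
                if (next >= args.length) {
                    throw new IllegalArgumentException("more slots than arguments");
                }
                out.append(args[next++]); // append(Object) renders null as "null"
            } else {
                out.append(c);            // constant text is copied through
            }
        }
        if (next != args.length) {
            throw new IllegalArgumentException("more arguments than slots");
        }
        return out.toString();
    }
}
```

    With this model, RecipeModel.apply("\u0001:\u0001/\u0001", "a", 2, 3L) produces "a:2/3", matching the a + ":" + b + "/" + c example.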
  4. Choose the right tool: concrete scenarios

    • Logging: Prefer parameterized APIs.
      // Good (SLF4J): placeholders defer formatting
      log.info("User {} uploaded {} files", userId, count);
    • SQL: Use prepared statements.
      PreparedStatement ps = conn.prepareStatement(
        "SELECT * FROM users WHERE id = ? AND status = ?");
      ps.setLong(1, userId);
      ps.setString(2, "ACTIVE");
    • Joining collections: Use library helpers.
      String csv = String.join(",", items);
      String path = String.join("/", List.of("api", version, resource));
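    When a join also needs a prefix, a suffix, or filtering, the standard library covers that too. A short sketch using Collectors.joining and StringJoiner; the class and method names (JoinExamples, bracketed, path) are illustrative only.

```java
import java.util.List;
import java.util.StringJoiner;
import java.util.stream.Collectors;

class JoinExamples {
    // Delimiter, prefix, and suffix in a single collector.
    static String bracketed(List<String> items) {
        return items.stream().collect(Collectors.joining(", ", "[", "]"));
    }

    // Builds a slash-separated path with a leading slash, skipping empty segments.
    static String path(String... segments) {
        StringJoiner joiner = new StringJoiner("/", "/", "");
        for (String segment : segments) {
            if (!segment.isEmpty()) {
                joiner.add(segment);
            }
        }
        return joiner.toString();
    }
}
```

    Both helpers use a single builder internally, so they avoid the repeated copying of += in a loop.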
  5. Allocation behavior: EA, TLAB, compact strings

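    • Escape analysis: after inlining, HotSpot's C2 compiler can scalar‑replace a builder that never escapes the method, removing its allocation. The String it produces is still allocated if that String escapes.
    • TLABs: Most young‑gen allocations are bump‑pointer allocations in a thread‑local buffer, so short‑lived temporaries are cheap. They are not free: the copying and the extra GC work remain.
    • Compact strings (JDK 9, JEP 254): a String is backed by a byte[] and uses one byte per char (LATIN1) when every char fits, otherwise two (UTF16).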
    // Capacity hint helps: avoids internal array resizes/copies
    int estimated = prefix.length() + parts.stream().mapToInt(String::length).sum();
    StringBuilder sb = new StringBuilder(estimated + 16);
    sb.append(prefix);
    for (String p : parts) sb.append(p);
    String out = sb.toString();
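    The effect of a capacity hint can be observed through StringBuilder.capacity(). The sketch below counts how often the capacity changes while appending; CapacityDemo and resizes are hypothetical names, and the exact growth steps of an unsized builder are an implementation detail, so only "zero" versus "more than zero" is meaningful here.

```java
class CapacityDemo {
    // Counts how many times the builder's capacity changes during the appends.
    static int resizes(StringBuilder sb, String piece, int times) {
        int count = 0;
        int capacity = sb.capacity();
        for (int i = 0; i < times; i++) {
            sb.append(piece);
            if (sb.capacity() != capacity) { // backing array was replaced by a larger one
                count++;
                capacity = sb.capacity();
            }
        }
        return count;
    }
}
```

    Appending 100 ten-character pieces to new StringBuilder(1000) should report no resizes, while new StringBuilder() starts at the documented default capacity of 16 and has to grow several times.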
  6. Benchmark like a pro (JMH)

    Measure with proper warm‑up and isolation.

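    import org.openjdk.jmh.annotations.Benchmark;
    import org.openjdk.jmh.annotations.Param;
    import org.openjdk.jmh.annotations.Scope;
    import org.openjdk.jmh.annotations.Setup;
    import org.openjdk.jmh.annotations.State;
    
    @State(Scope.Thread)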
    public class ConcatBench {
      @Param({"10", "100"}) int n;
      String[] parts;
    
      @Setup public void setup() {
        parts = new String[n];
        for (int i = 0; i < n; i++) parts[i] = String.valueOf(i);
      }
    
      @Benchmark public String plusInLoop() {
        String s = "";
        for (String p : parts) s += p;
        return s;
      }
    
      @Benchmark public String builderOnce() {
        StringBuilder sb = new StringBuilder();
        for (String p : parts) sb.append(p);
        return sb.toString();
      }
    }

    Run with a proper harness and warmup:

    # Example commands (assumes the project defines a jmh Maven profile that builds target/benchmarks.jar)
    mvn -DskipTests -Pjmh clean install
    java -jar target/benchmarks.jar ConcatBench -wi 5 -i 10 -f 2
  7. Edge cases and correctness

    • nulls: a null operand concatenates as "null", and String.valueOf(Object) returns "null" without an NPE (the char[] overload does throw on null).
    • Primitives: append(int), append(double) avoid boxing.
    • Unicode: Watch combining marks and surrogate pairs when slicing.
    • Formatting: String.format uses the default locale unless you pass one; specify Locale.ROOT for stable machine output.
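    These edge cases are easy to check in a few lines. A small sketch with illustrative names (EdgeCases and its members are not from any library):

```java
import java.util.Locale;

class EdgeCases {
    // A null reference is rendered as the text "null", with no exception.
    static String withNull() {
        String name = null;
        return "name=" + name;
    }

    // The same format string gives different text under different locales.
    static String formatted(Locale locale) {
        return String.format(locale, "%.2f", 1.5);
    }

    // 'a' followed by U+1F600, which needs a surrogate pair (two chars).
    static final String SMILE = "a\uD83D\uDE00";

    // Takes the first n code points without splitting a surrogate pair.
    static String firstCodePoints(String s, int n) {
        return s.substring(0, s.offsetByCodePoints(0, n));
    }
}
```

    Here formatted(Locale.ROOT) gives "1.50" and formatted(Locale.GERMANY) gives "1,50". SMILE has length 3 but only 2 code points, so SMILE.substring(0, 2) would end in a lone high surrogate, while firstCodePoints(SMILE, 2) returns the whole string.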
  8. Takeaways

    • Use + for simple, non‑loop expressions.
    • Use a single StringBuilder in loops.
    • On Java 9+, invokedynamic handles most cases efficiently; still avoid += inside hot loops.

Tips and pitfalls

  • Pre‑size builders when estimating result length is easy.
  • Benchmark with JMH, not ad‑hoc timing. Warm‑up matters.
  • Beware of hidden allocations from boxing and toString() of complex objects.