String Concatenation in Java — Level 3

A deep dive for JVM enthusiasts: from pre–Java 9 desugaring to invokedynamic-based concatenation via StringConcatFactory, with practical guidance on escape analysis, allocation behavior, and measurement.

Text

Mr. Oz

Date

Read

15 mins

Level 3


Concatenation is ultimately about constructing a contiguous String value. The question is how many temporary objects and copies you create along the way, and whether the JIT can eliminate them.

  1. Pre–Java 9: desugaring to StringBuilder

    Before Java 9, javac desugars each non-constant + expression into a fresh StringBuilder chain; compile‑time constant expressions are folded into a single literal instead. The essential pattern looks like:

    // Java source, as a method
    String greet(String name) {
      return "Hello " + name + "!";
    }
    // The same expression as a local assignment (the form the bytecode below shows)
    String s = "Hello " + name + "!";
    
    // Rough pseudo bytecode
    0: new           #java/lang/StringBuilder
    3: dup
    4: invokespecial #StringBuilder.<init>()
    7: ldc           #"Hello "
    9: invokevirtual #StringBuilder.append(Ljava/lang/String;)Ljava/lang/StringBuilder;
    12: aload_1      // name
    13: invokevirtual #StringBuilder.append(Ljava/lang/String;)Ljava/lang/StringBuilder;
    16: ldc           #"!"
    18: invokevirtual #StringBuilder.append(Ljava/lang/String;)Ljava/lang/StringBuilder;
    21: invokevirtual #StringBuilder.toString()Ljava/lang/String;
    24: astore_2     // s

    In loops, s += part allocates a new builder and a new intermediate String on every iteration, and each iteration copies everything accumulated so far. That increases pressure on the young generation and the GC.

    // Bad in hot loops: every += copies the whole accumulated string, so total copying is quadratic (on Java 9+ as well)
    String joinPlus(List<String> parts) {
      String s = "";
      for (String p : parts) {
        s += p;
      }
      return s;
    }
    
    // Better: one builder, optional capacity hint
    String joinBuilder(List<String> parts) {
      int est = parts.stream().mapToInt(String::length).sum();
      StringBuilder sb = new StringBuilder(est);
      for (String p : parts) sb.append(p);
      return sb.toString();
    }
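
The "unless it is a compile‑time constant" exception can be observed directly. The sketch below relies on the language rule that constant expressions of type String are interned, while a concatenation involving a non-constant operand creates a new String object at run time. The class and method names are hypothetical, chosen only for illustration.

```java
final class ConstantFoldingDemo {
  // Compile-time constant: javac folds this into the single literal "Hello World".
  static final String FOLDED = "Hello " + "World";

  // Constant expressions are interned, so both sides refer to the same object.
  static boolean foldedIsInterned() {
    return FOLDED == "Hello World";
  }

  // 'name' is not a constant, so the concatenation happens at run time
  // and yields a newly created String, never the interned literal.
  static boolean runtimeResultIsSameObject(String name) {
    String s = "Hello " + name;
    return s == "Hello World";
  }
}
```

Reference comparison with == is used here only to expose object identity; ordinary code should compare strings with equals.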
  2. Java 9+: invokedynamic and StringConcatFactory

    JEP 280 changed javac to compile most string concatenations into an invokedynamic call site bootstrapped by StringConcatFactory. javac emits the recipe and the bootstrap method reference; at link time the factory picks a concatenation strategy suited to the operand types.

    // Pseudo bytecode for: "Hello, " + name + "!"
    0: invokedynamic makeConcatWithConstants(a0) : String, 
       BootstrapMethod: StringConcatFactory.makeConcatWithConstants
       Recipe: "Hello, \u0001!" // \u0001 marks the slot for each argument

    Recipes encode the constant parts and the argument layout. Hot call sites can be optimized aggressively by C2, often removing intermediate allocations entirely.

    // Real Java (JDK 9+):
    String full(String name) { return "Hello, " + name + "!"; }
    
    // Disassembly (abridged; constant-pool indices vary):
    // javac Full.java && javap -v Full
    //   1: invokedynamic #7,  0   // InvokeDynamic #0:makeConcatWithConstants:(Ljava/lang/String;)Ljava/lang/String;
    //   BootstrapMethods: ... Method arguments: Hello, \u0001!
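
To make the recipe mechanism concrete, the bootstrap method can also be called by hand. This is a minimal sketch, not what javac generates: it links one call site with StringConcatFactory.makeConcatWithConstants and invokes the resulting method handle. The class name RecipeDemo and the call-site name "greet" are illustrative assumptions (the factory treats the name as arbitrary).

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.StringConcatFactory;

final class RecipeDemo {
  // Builds the same shape as "Hello, " + name + "!" by linking the call site manually.
  static String greet(String name) {
    try {
      CallSite site = StringConcatFactory.makeConcatWithConstants(
          MethodHandles.lookup(),
          "greet",                                            // arbitrary call-site name
          MethodType.methodType(String.class, String.class),  // (String) -> String
          "Hello, \u0001!");                                  // \u0001 = argument slot
      MethodHandle concat = site.getTarget();
      return (String) concat.invokeExact(name);
    } catch (Throwable t) {
      // Linking or invocation failed; wrapped so callers need no checked exceptions.
      throw new IllegalStateException(t);
    }
  }
}
```

Calling RecipeDemo.greet("Ada") returns "Hello, Ada!". Real code should simply use +; linking a call site per invocation, as this sketch does, is far slower than the compiled form.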
  3. Recipe forms and constants

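    When concatenation mixes constants and values, the recipe string marks each argument position with a \u0001 character and carries the constant text inline (\u0002 marks a constant passed as a separate bootstrap argument). The choice of bootstrap method is made by javac, not by the factory: by default it emits makeConcatWithConstants, while makeConcat, which takes no recipe and concatenates its arguments in order, is emitted only under the non-default -XDstringConcat=indy setting.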

    // Example with three values and constants
    String msg = a + ":" + b + "/" + c;
    // Recipe (illustrative): "\u0001:\u0001/\u0001"
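
The two bootstrap shapes can be compared side by side. The following sketch links one call site of each kind by hand, with mixed primitive and reference operands; the class name RecipeShapes and its method names are hypothetical, and the operand types are chosen only to show that primitives are passed without boxing.

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.StringConcatFactory;

final class RecipeShapes {
  // makeConcat: no recipe, the arguments are concatenated in order.
  static String plain(int count, String unit) {
    try {
      CallSite site = StringConcatFactory.makeConcat(
          MethodHandles.lookup(), "plain",
          MethodType.methodType(String.class, int.class, String.class));
      return (String) site.getTarget().invokeExact(count, unit);
    } catch (Throwable t) {
      throw new IllegalStateException(t);
    }
  }

  // makeConcatWithConstants: the recipe interleaves constants with \u0001 slots,
  // matching the shape of a + ":" + b + "/" + c.
  static String withConstants(String a, int b, long c) {
    try {
      CallSite site = StringConcatFactory.makeConcatWithConstants(
          MethodHandles.lookup(), "withConstants",
          MethodType.methodType(String.class, String.class, int.class, long.class),
          "\u0001:\u0001/\u0001");
      return (String) site.getTarget().invokeExact(a, b, c);
    } catch (Throwable t) {
      throw new IllegalStateException(t);
    }
  }
}
```

Here RecipeShapes.plain(7, "ms") yields "7ms" and RecipeShapes.withConstants("a", 1, 2L) yields "a:1/2".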
  4. Choose the right tool: concrete scenarios

    • Logging: Prefer parameterized APIs to avoid eager concatenation.
      // Good (SLF4J): placeholders defer formatting
      log.info("User {} uploaded {} files", userId, count);
      // Avoid: log.info("User " + userId + " uploaded " + count + " files");
    • SQL: Use prepared statements, not concatenation.
      PreparedStatement ps = conn.prepareStatement(
        "SELECT * FROM users WHERE id = ? AND status = ?");
      ps.setLong(1, userId);
      ps.setString(2, "ACTIVE");
    • Joining collections: Use library helpers.
      String csv = String.join(",", items);
      String path = String.join("/", List.of("api", version, resource));
    • CSV/JSON text: Prefer builders and joining collectors.
      String csv = list.stream()
        .map(s -> s.replace("\"", "\"\"")) // naive CSV escape: doubles embedded quotes only
        .collect(Collectors.joining(","));
      
      // For JSON, use a JSON library instead of manual concatenation.
    • URLs: Avoid manual concatenation of query params.
      String url = String.format(
        "https://example.com/search?q=%s&page=%d",
        URLEncoder.encode(query, StandardCharsets.UTF_8), page);
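
When the number of query parameters is not fixed, the same idea extends to a joining collector. This is a small sketch under stated assumptions: QueryStrings and build are hypothetical names, the parameters arrive in a Map whose iteration order is the desired order, and form-style encoding (spaces become +) is acceptable for the target server.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.stream.Collectors;

final class QueryStrings {
  // Appends URL-encoded key=value pairs to base, separated by '&'.
  static String build(String base, Map<String, String> params) {
    if (params.isEmpty()) {
      return base; // no '?' when there is nothing to append
    }
    return params.entrySet().stream()
        .map(e -> encode(e.getKey()) + "=" + encode(e.getValue()))
        .collect(Collectors.joining("&", base + "?", ""));
  }

  // URLEncoder applies application/x-www-form-urlencoded rules.
  private static String encode(String s) {
    return URLEncoder.encode(s, StandardCharsets.UTF_8);
  }
}
```

With a LinkedHashMap holding q = "a b&c" and page = "2", build("https://example.com/search", params) returns "https://example.com/search?q=a+b%26c&page=2".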
  5. Allocation behavior: EA, TLAB, compact strings

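    • Escape analysis: HotSpot's C2 compiler can scalar‑replace a builder that does not escape the compiled method, eliminating that allocation; the resulting String is still allocated if it escapes.
    • TLABs: Most young‑gen allocations come from thread‑local buffers and are cheap; the cost shows up as GC work when allocation volume is high or objects live long enough to be promoted.
    • Compact strings (JDK 9): String stores one byte per character (LATIN1) when possible, halving the footprint of content that fits in Latin‑1, including all ASCII.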
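
Allocation claims like these are best checked rather than assumed. One rough way, sketched below, is HotSpot's per-thread allocation counter exposed through com.sun.management.ThreadMXBean. This is a JDK-specific extension, not a standard API, so the sketch returns -1 when it is unavailable; AllocationProbe, allocatedBytes, and sink are hypothetical names.

```java
import java.lang.management.ManagementFactory;

final class AllocationProbe {
  // Publishing a result here forces it to escape, so the JIT cannot remove the allocation.
  static volatile Object sink;

  // Returns the bytes allocated by the current thread while running task, or -1 if unsupported.
  static long allocatedBytes(Runnable task) {
    java.lang.management.ThreadMXBean std = ManagementFactory.getThreadMXBean();
    if (!(std instanceof com.sun.management.ThreadMXBean)) {
      return -1; // not a HotSpot-style JVM
    }
    com.sun.management.ThreadMXBean bean = (com.sun.management.ThreadMXBean) std;
    if (!bean.isThreadAllocatedMemorySupported() || !bean.isThreadAllocatedMemoryEnabled()) {
      return -1;
    }
    long id = Thread.currentThread().getId();
    long before = bean.getThreadAllocatedBytes(id);
    task.run();
    return bean.getThreadAllocatedBytes(id) - before;
  }
}
```

For example, allocatedBytes(() -> AllocationProbe.sink = String.join(",", parts)) gives a coarse per-call figure for a hypothetical parts list. The number includes anything else the thread allocates during the call, so treat it as an estimate and confirm with a profiler or JMH's GC profiler.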
  6. Capacity hints

    // Capacity hint helps: avoids internal array resizes/copies
    int estimated = prefix.length() + parts.stream().mapToInt(String::length).sum();
    StringBuilder sb = new StringBuilder(estimated + 16);
    sb.append(prefix);
    for (String p : parts) sb.append(p);
    String out = sb.toString();
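
The effect of a capacity hint can be seen through StringBuilder.capacity(). The sketch below uses hypothetical names (CapacityDemo and its methods) and relies only on documented behavior: a default builder starts with capacity 16, and a builder created with an explicit capacity does not grow until that capacity is exceeded.

```java
final class CapacityDemo {
  // Capacity of a builder created with the no-argument constructor.
  static int defaultCapacity() {
    return new StringBuilder().capacity();
  }

  // Creates a builder with the given hint, appends 'chars' characters,
  // and reports the capacity afterwards.
  static int capacityAfterFill(int hint, int chars) {
    StringBuilder sb = new StringBuilder(hint);
    for (int i = 0; i < chars; i++) {
      sb.append('x');
    }
    return sb.capacity();
  }
}
```

capacityAfterFill(100, 100) stays at 100, so no internal array was reallocated, whereas capacityAfterFill(16, 17) reports a larger value because the builder had to grow and copy.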
  7. Benchmark like a pro (JMH)

    Measure with proper warm‑up and isolation. Microbenchmarks are sensitive to dead‑code elimination and inlining.

    import org.openjdk.jmh.annotations.*;
    
    @State(Scope.Thread)
    public class ConcatBench {
      @Param({"10", "100"}) int n;
      String[] parts;
    
      @Setup public void setup() {
        parts = new String[n];
        for (int i = 0; i < n; i++) parts[i] = String.valueOf(i);
      }
    
      @Benchmark public String plusInLoop() {
        String s = "";
        for (String p : parts) s += p; // intentionally bad in loops
        return s;
      }
    
      @Benchmark public String builderOnce() {
        StringBuilder sb = new StringBuilder();
        for (String p : parts) sb.append(p);
        return sb.toString();
      }
    }

    Run with a proper harness and warmup:

    # Example commands
    mvn -DskipTests -Pjmh clean install
    java -jar target/benchmarks.jar ConcatBench -wi 5 -i 10 -f 2
  8. Edge cases and correctness

    • nulls: String.valueOf(x) yields "null" without NPE; Objects.toString(x, "") allows defaults.
    • Primitives: append(int), append(double) avoid boxing.
    • Unicode: Watch combining marks and surrogate pairs when slicing; concatenation itself preserves code units.
    • Formatting: String.format obeys locale; specify Locale.ROOT for stable machine output.
    // Unicode combining example: one visible character, two code points
    String accent = "e\u0301"; // 'e' + U+0301 COMBINING ACUTE ACCENT
    String word = "caf" + accent;
    int cps = word.codePointCount(0, word.length()); // 5 code points for 4 visible characters
    // Slicing by char index can split a surrogate pair or detach the accent from its base letter;
    // use code point APIs for the former and java.text.BreakIterator for the latter.
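
The null, locale, and surrogate-pair points above can be checked with a few one-liners. This is a small sketch with hypothetical names (EdgeCases and its methods); Locale.GERMANY is used only as an example of a locale whose decimal separator is a comma.

```java
import java.util.Locale;
import java.util.Objects;

final class EdgeCases {
  // Concatenating a null reference appends the text "null" instead of throwing.
  static String concatNull(Object x) {
    return "value=" + x;
  }

  // Substitutes an empty string for null.
  static String orEmpty(Object x) {
    return Objects.toString(x, "");
  }

  // Locale-sensitive formatting: the decimal separator depends on the locale.
  static String german(double d) {
    return String.format(Locale.GERMANY, "%.2f", d);
  }

  // Locale.ROOT gives stable, machine-readable output.
  static String stable(double d) {
    return String.format(Locale.ROOT, "%.2f", d);
  }

  // Counts code points rather than char units.
  static int codePoints(String s) {
    return s.codePointCount(0, s.length());
  }
}
```

For instance, german(1.5) returns "1,50" while stable(1.5) returns "1.50", and the emoji "\uD83D\uDE00" has length() 2 but a code point count of 1.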
  9. Alternatives and advanced patterns

    • String.join, StringJoiner, Collectors.joining for delimited joins.
    • StringWriter or Formatter for stream‑like assembly.
    • Pool StringBuilder only with care; a builder shared across methods can retain large buffers and escapes the method, which defeats escape analysis.
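    // Formatter with StringWriter for structured output
    static String point(int count, double ratio) {
      StringWriter w = new StringWriter(); // in-memory; kept out of try-with-resources because close() declares IOException
      try (Formatter f = new Formatter(w, Locale.ROOT)) {
        f.format("(%d, %.2f)", count, ratio);
        return w.toString();
      }
    }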
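
For delimited joins with a prefix and suffix, StringJoiner and Collectors.joining cover most needs. The sketch below uses hypothetical names (Joiners, bracketed, viaCollector) and shows one difference worth knowing: StringJoiner lets you choose what an empty input produces.

```java
import java.util.List;
import java.util.StringJoiner;
import java.util.stream.Collectors;

final class Joiners {
  // StringJoiner with delimiter, prefix, suffix, and a custom value for empty input.
  static String bracketed(List<String> items) {
    StringJoiner joiner = new StringJoiner(", ", "[", "]");
    joiner.setEmptyValue("(none)");
    for (String s : items) {
      joiner.add(s);
    }
    return joiner.toString();
  }

  // The stream equivalent; empty input yields just prefix + suffix.
  static String viaCollector(List<String> items) {
    return items.stream().collect(Collectors.joining(", ", "[", "]"));
  }
}
```

bracketed(List.of("a", "b")) returns "[a, b]", bracketed(List.of()) returns "(none)", and viaCollector(List.of()) returns "[]".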
  10. Takeaways

    • Use + for simple, non‑loop expressions.
    • Use a single StringBuilder in loops.
    • On Java 9+, invokedynamic handles most cases efficiently; still avoid += inside hot loops.

Tips and pitfalls

  • Pre‑size builders when estimating result length is easy.
  • Benchmark with JMH, not ad‑hoc timing. Warm‑up matters.
  • Beware of hidden allocations from boxing and toString() of complex objects.
