File 4: Advanced Java Concepts
This file covers the topics that make Java powerful in long-running, high-throughput systems: concurrency, the memory model, garbage collection, streams, and performance thinking. These are the areas where beginner knowledge often stops and engineering maturity starts.
You do not need to become a JVM internals expert on day one, but you do need a solid mental model. Otherwise, concurrency bugs, latency spikes, and memory issues will feel mysterious. The goal here is to remove that mystery.
Multithreading and Concurrency
Intuition
A thread is an independent path of execution inside a process. Java supports multiple threads so a program can do more than one thing at the same time or keep making progress while other work waits on I/O.
In a backend service, threads may handle:
- incoming HTTP requests
- scheduled jobs
- asynchronous message processing
- background cleanup work
- database or network calls coordinated by worker pools
Why Concurrency Exists
Without concurrency, a service that waits on slow operations would waste time doing nothing. With concurrency, other work can proceed while one task waits.
Basic Example
Runnable task = () -> System.out.println("Processing in " + Thread.currentThread().getName());
Thread worker = new Thread(task);
worker.start();
This creates and starts a new thread, but in production systems you usually prefer executors and thread pools over manually creating raw threads.
Thread Pool Mental Model
Instead of constantly creating and destroying threads, a thread pool reuses a fixed or managed set of worker threads.
That reduces overhead and gives you control over concurrency level.
ExecutorService executor = Executors.newFixedThreadPool(4);
executor.submit(() -> processOrder(orderId));
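A slightly fuller sketch of the same idea follows. It assumes a hypothetical processOrder that returns a status string, since the original does not define it. It keeps each Future so the caller can read the results, and it shuts the pool down when finished, which the two-line snippet above leaves out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class OrderPoolSketch {

    // Hypothetical stand-in for real order processing.
    static String processOrder(int orderId) {
        return "order-" + orderId + " done";
    }

    // Submits one task per order to a pool of 4 workers and
    // returns the results in submission order.
    static List<String> processAll(List<Integer> orderIds) {
        ExecutorService executor = Executors.newFixedThreadPool(4);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (int id : orderIds) {
                futures.add(executor.submit(() -> processOrder(id)));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> future : futures) {
                results.add(future.get()); // blocks until that task has finished
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for callers
            throw new IllegalStateException("interrupted while waiting", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("task failed", e.getCause());
        } finally {
            executor.shutdown(); // stop accepting work so the worker threads can exit
        }
    }

    public static void main(String[] args) {
        System.out.println(processAll(List.of(1, 2, 3)));
    }
}
```

In a real service the pool would normally be created once and shared, not built per call. It is created inside the method here only to keep the sketch self-contained.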
Real-World Use Cases
- processing multiple independent tasks in parallel
- handling high request volume in servers
- offloading email or report generation from the request thread
- consuming events from a queue with worker threads
Common Pitfall
More threads do not automatically mean better performance. Too many threads can increase context switching, memory pressure, lock contention, and tail latency.
Thread Lifecycle
stateDiagram-v2
[*] --> New
New --> Runnable: start()
Runnable --> Running: scheduled by CPU
Running --> Blocked: waiting for lock or I/O
Blocked --> Runnable: lock or I/O available
Running --> Waiting: wait/sleep/join
Waiting --> Runnable: notified or timeout
Running --> Terminated: run completes
Why the Lifecycle Matters
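If you debug a production incident involving stuck requests or slow jobs, knowing whether threads are runnable, blocked, waiting, or deadlocked is essential, because thread dumps only make sense in terms of these states. The diagram above is a conceptual model. The JVM itself reports the Thread.State values NEW, RUNNABLE, BLOCKED, WAITING, TIMED_WAITING, and TERMINATED. It has no separate "running" state, BLOCKED refers specifically to waiting for a monitor lock, and a thread blocked in I/O usually appears as RUNNABLE in a thread dump.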
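You can observe two of these states directly with Thread.getState(). The sketch below is a minimal illustration, not a diagnostic tool. The class and method names are made up for this example.

```java
class ThreadStateSketch {

    // Returns the state seen before start() and the state seen after the thread finished.
    static String observeStates() {
        Thread worker = new Thread(() -> { /* trivial task */ });
        Thread.State before = worker.getState();   // NEW: created but not started
        worker.start();
        try {
            worker.join();                          // wait for run() to complete
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while joining", e);
        }
        Thread.State after = worker.getState();    // TERMINATED: run() has returned
        return before + " -> " + after;
    }

    public static void main(String[] args) {
        System.out.println(observeStates()); // prints "NEW -> TERMINATED"
    }
}
```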
Synchronization
Concurrency becomes difficult when multiple threads access shared mutable state.
The Core Problem
Imagine two threads incrementing the same counter.
counter++;
This looks atomic, but it is actually multiple steps:
- read current value
- add one
- write new value back
If two threads interleave these steps, updates can be lost.
synchronized
Java's synchronized keyword provides mutual exclusion and visibility guarantees.
public class Counter {
    private int value;
    public synchronized void increment() {
        value++;
    }
    public synchronized int getValue() {
        return value;
    }
}
Only one thread can execute a synchronized method or block on the same monitor at a time.
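To see the guarantee in action, the sketch below hammers a synchronized counter from several threads and checks that no increment is lost. The nested Counter mirrors the class above. The countWith helper and its parameters are assumptions made for this example.

```java
class SynchronizedCounterSketch {

    // Same shape as the Counter class in the text.
    static class Counter {
        private int value;

        synchronized void increment() {
            value++;
        }

        synchronized int getValue() {
            return value;
        }
    }

    // Starts `threads` workers that each increment the shared counter
    // `incrementsPerThread` times, then returns the final value.
    static int countWith(int threads, int incrementsPerThread) {
        Counter counter = new Counter();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < incrementsPerThread; j++) {
                    counter.increment();
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while joining", e);
            }
        }
        return counter.getValue();
    }

    public static void main(String[] args) {
        System.out.println(countWith(4, 10_000)); // prints 40000
    }
}
```

If you remove synchronized from increment(), the result can come out lower than 40000 on some runs, which is the lost-update problem described earlier.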
Why It Works
Synchronization does two important things:
- prevents unsafe concurrent access to the protected region
- ensures memory visibility: changes made before a thread releases a monitor are visible to any thread that later acquires the same monitor
Real-World Use Cases
- protecting shared in-memory caches or mutable counters
- coordinating state transitions in schedulers and worker systems
- ensuring thread-safe updates in domain objects used by multiple threads
Common Pitfall
Synchronizing too broadly can kill throughput. Synchronizing too narrowly can fail to protect the actual shared invariant.
Locks and Higher-Level Concurrency Utilities
Beyond synchronized, Java provides explicit lock types and concurrency utilities in java.util.concurrent.
ReentrantLock
This can be useful when you need features beyond simple intrinsic locking, such as timed lock acquisition or finer control.
Lock lock = new ReentrantLock();
lock.lock();
try {
    updateSharedState();
} finally {
    lock.unlock();
}
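Timed acquisition is one of the features synchronized cannot offer. The following sketch shows one way to use ReentrantLock.tryLock with a timeout. The TimedLockSketch class, its balance field, and the method names are hypothetical and exist only for illustration.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

class TimedLockSketch {
    private final ReentrantLock lock = new ReentrantLock();
    private int balance; // guarded by lock

    // Attempts the update, but gives up if the lock is not free within the timeout.
    boolean tryDeposit(int amount, long timeoutMillis) {
        boolean acquired;
        try {
            acquired = lock.tryLock(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for callers
            return false;
        }
        if (!acquired) {
            return false; // caller can retry, fail fast, or report contention
        }
        try {
            balance += amount;
            return true;
        } finally {
            lock.unlock(); // always release, even if the update throws
        }
    }

    int balance() {
        lock.lock();
        try {
            return balance;
        } finally {
            lock.unlock();
        }
    }

    public static void main(String[] args) {
        TimedLockSketch account = new TimedLockSketch();
        System.out.println(account.tryDeposit(50, 100) + " " + account.balance());
    }
}
```

Returning false instead of blocking indefinitely lets a request thread fail fast under contention, which is often preferable to queuing behind a slow lock holder.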
Other Important Utilities
- AtomicInteger, AtomicLong: lock-free atomic updates for simple counters and state
- ConcurrentHashMap: concurrent map implementation for many common shared lookup patterns
- CountDownLatch: wait until several tasks complete
- Semaphore: limit concurrent access to a resource
- CompletableFuture: compose asynchronous operations more cleanly
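Here is a small sketch that combines three of these utilities: an AtomicInteger for a total, a ConcurrentHashMap for per-key counts, and a CountDownLatch to wait for the workers. The event categories ("even" and "odd") and the countEvents helper are invented for the example.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

class UtilitiesSketch {

    // Returns {total events, even-numbered events, odd-numbered events}.
    static int[] countEvents(int workers, int eventsPerWorker) {
        AtomicInteger total = new AtomicInteger();
        ConcurrentHashMap<String, Integer> perType = new ConcurrentHashMap<>();
        CountDownLatch done = new CountDownLatch(workers);

        for (int w = 0; w < workers; w++) {
            new Thread(() -> {
                for (int i = 0; i < eventsPerWorker; i++) {
                    total.incrementAndGet();                        // atomic increment, no lock
                    String type = (i % 2 == 0) ? "even" : "odd";
                    perType.merge(type, 1, Integer::sum);           // atomic per-key update
                }
                done.countDown();                                   // signal this worker is finished
            }).start();
        }

        try {
            done.await();                                           // wait for all workers
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return new int[] {
            total.get(),
            perType.getOrDefault("even", 0),
            perType.getOrDefault("odd", 0)
        };
    }

    public static void main(String[] args) {
        int[] counts = countEvents(4, 100);
        System.out.println(counts[0] + " " + counts[1] + " " + counts[2]); // prints "400 200 200"
    }
}
```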
Production Relevance
These utilities are common in rate limiting, request fan-out, concurrent caches, asynchronous orchestration, and worker coordination.
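As one illustration of request fan-out, the sketch below runs two independent lookups concurrently with CompletableFuture and combines their results. The fetchPrice and fetchStock methods are hypothetical placeholders for remote calls and return made-up values.

```java
import java.util.concurrent.CompletableFuture;

class FanOutSketch {

    // Hypothetical stand-ins for two independent remote calls.
    static int fetchPrice(String sku) {
        return sku.length() * 10;
    }

    static int fetchStock(String sku) {
        return 7;
    }

    // Starts both lookups concurrently and combines them once both have completed.
    static String describe(String sku) {
        CompletableFuture<Integer> price = CompletableFuture.supplyAsync(() -> fetchPrice(sku));
        CompletableFuture<Integer> stock = CompletableFuture.supplyAsync(() -> fetchStock(sku));
        return price
            .thenCombine(stock, (p, s) -> sku + ": price=" + p + ", stock=" + s)
            .join(); // blocks the caller; failures surface as an unchecked CompletionException
    }

    public static void main(String[] args) {
        System.out.println(describe("abc")); // prints "abc: price=30, stock=7"
    }
}
```

supplyAsync without an explicit executor uses the common ForkJoinPool. For blocking I/O you would typically pass a dedicated executor as the second argument.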
volatile
volatile is often misunderstood.
What It Guarantees
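A volatile field provides visibility. When one thread writes a new value, later reads of that field by other threads see that value and not a stale cached copy. A volatile write also orders the writer's earlier writes before it, so a thread that reads the new value also sees everything the writer did before the write.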
private volatile boolean running = true;
This is useful for simple state flags.
What It Does Not Guarantee
volatile does not make compound operations atomic.
This is still unsafe:
volatile int count;
count++;
The increment is still a read-modify-write sequence.
Good Use Case
Stopping a background loop cleanly:
while (running) {
    pollQueue();
}
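A complete version of this pattern might look like the sketch below. The class and method names are assumptions, and Thread.onSpinWait() stands in for pollQueue(), which the original does not define.

```java
class StoppableWorkerSketch implements Runnable {

    // volatile so that the write in stop() becomes visible to the worker thread
    private volatile boolean running = true;

    @Override
    public void run() {
        while (running) {
            Thread.onSpinWait(); // placeholder for real work such as polling a queue
        }
    }

    void stop() {
        running = false;
    }

    // Starts a worker, asks it to stop, and reports whether it actually exited.
    static boolean startThenStop() {
        StoppableWorkerSketch worker = new StoppableWorkerSketch();
        Thread thread = new Thread(worker);
        thread.start();
        worker.stop();
        try {
            thread.join(5_000); // generous upper bound; normally returns almost immediately
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while joining", e);
        }
        return !thread.isAlive();
    }

    public static void main(String[] args) {
        System.out.println(startThenStop()); // prints true
    }
}
```

Without volatile, the JIT compiler is allowed to hoist the read of running out of the loop, so the worker might never notice the change.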
Bad Use Case
Using volatile as a replacement for proper locking around shared mutable objects.
JVM Memory Model
The Java Memory Model explains how threads interact through memory and what guarantees exist around reads, writes, ordering, and visibility.
Intuition
In a multithreaded system, one thread can update a value, but another thread may not immediately observe that update unless the program uses the right synchronization rules.
This is because CPUs, caches, compilers, and runtimes all optimize memory access.
Diagram
flowchart LR
A[Main Memory] --> B[Thread 1 Working Memory]
A --> C[Thread 2 Working Memory]
B --> D[Local reads and writes]
C --> E[Local reads and writes]
F[synchronized or volatile] --> A
Why Engineers Care
Without the memory model, some code would appear to "work on my machine" and fail under load or on different hardware.
The important practical lesson is not to memorize the whole specification. The important lesson is:
- do not share mutable state casually between threads
- use proper synchronization tools when you do share it
- visibility and atomicity are different concerns
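The sketch below shows the visibility side of these lessons under one assumption: a single writer publishes a value once. A plain field is written first, then a volatile flag. Because the volatile write happens-before any read that observes it, a reader that sees the flag is guaranteed to see the payload too. All names here are illustrative.

```java
class PublicationSketch {
    private int payload;             // plain field, written before the flag
    private volatile boolean ready;  // volatile guard that publishes the payload

    void publish(int value) {
        payload = value; // ordinary write
        ready = true;    // volatile write: makes the earlier write visible to readers of `ready`
    }

    int awaitPayload() {
        while (!ready) {         // volatile read on every iteration
            Thread.onSpinWait();
        }
        return payload;          // guaranteed to see the value written before `ready = true`
    }

    static int demo() {
        PublicationSketch box = new PublicationSketch();
        new Thread(() -> box.publish(42)).start();
        return box.awaitPayload();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints 42
    }
}
```

This works for one-time publication by a single writer. It would not make a concurrent payload++ safe, because that is an atomicity problem, not a visibility problem.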
Garbage Collection
Java manages memory automatically through garbage collection, but automatic does not mean irrelevant.
Intuition
Instead of manually freeing objects, Java tracks which objects are still reachable. Unreachable objects become candidates for collection.
Simplified Heap View
flowchart TD
A[Application creates objects] --> B[Objects live on heap]
B --> C[Reachable from roots]
B --> D[Unreachable objects]
D --> E[Garbage collector reclaims memory]
What Are Roots
Objects are considered reachable if they can be reached from GC roots such as:
- active thread stacks
- static references
- JNI references and runtime internals
Why This Matters in Real Systems
GC behavior affects:
- latency spikes
- memory footprint
- throughput
- container sizing decisions
Common Memory Problems in Java
- retaining objects in caches without eviction
- storing large collections in long-lived singletons
- leaking listeners or thread-local data
- creating too many short-lived temporary objects in hot code paths
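The first item on this list, a cache without eviction, is often the easiest to fix. One possible approach, sketched here, is a size-bounded LRU map built on LinkedHashMap. The class name and the limit are assumptions for illustration, and this version is not thread-safe, so it would need external synchronization or a purpose-built cache library in concurrent code.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class BoundedCacheSketch<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    BoundedCacheSketch(int maxEntries) {
        super(16, 0.75f, true); // accessOrder = true: iteration order is least recently used first
        this.maxEntries = maxEntries;
    }

    // Called by LinkedHashMap after each insertion; returning true evicts the eldest entry.
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
    }

    public static void main(String[] args) {
        BoundedCacheSketch<String, Integer> cache = new BoundedCacheSketch<>(2);
        cache.put("a", 1);
        cache.put("b", 2);
        cache.get("a");      // touch "a" so that "b" becomes the least recently used entry
        cache.put("c", 3);   // exceeds the limit, so "b" is evicted
        System.out.println(cache.keySet()); // prints [a, c]
    }
}
```

Because evicted entries are no longer referenced by the map, they become unreachable and the garbage collector can reclaim them.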
Misconception
"Java cannot have memory leaks because it has garbage collection" is false. Java absolutely can have memory leaks if your code keeps references to objects that are no longer useful.
Streams API and Functional Programming Style
Java's Streams API provides a declarative way to process collections and sequences of data.
Intuition
Instead of describing every loop step manually, you describe a pipeline of transformations.
List<String> emails = users.stream()
    .filter(User::isActive)
    .map(User::getEmail)
    .sorted()
    .toList();
What Happens Conceptually
The stream pipeline is lazy until a terminal operation runs. Operations like filter and map describe work. A terminal operation such as toList(), count(), or forEach() triggers execution.
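You can make the laziness visible by logging inside the lambdas. The sketch below does that purely for demonstration. Side effects like this are normally something to avoid in stream operations, and the names are made up for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

class LazyStreamSketch {

    // Returns a log of the order in which things happen.
    static List<String> trace() {
        List<String> log = new ArrayList<>();

        // Building the pipeline runs neither lambda.
        Stream<String> pipeline = Stream.of("a", "bb", "ccc")
            .filter(s -> { log.add("filter " + s); return s.length() > 1; })
            .map(s -> { log.add("map " + s); return s.toUpperCase(); });
        log.add("pipeline built");

        // The terminal operation pulls elements through the stages one at a time.
        List<String> result = pipeline.toList();
        log.add("result " + result);
        return log;
    }

    public static void main(String[] args) {
        trace().forEach(System.out::println);
        // pipeline built
        // filter a
        // filter bb
        // map bb
        // filter ccc
        // map ccc
        // result [BB, CCC]
    }
}
```

Note that "pipeline built" is logged first, and that each element flows through filter and then map before the next element is examined.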
Why This Is Useful
Streams can make transformation pipelines clearer when the logic is naturally data-oriented.
Real-world examples:
- filtering valid requests
- mapping entities to DTOs
- aggregating metrics
- grouping records for reporting
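As an example of the last two items, the sketch below groups records by a key and sums a field per group. The Order record and its fields are hypothetical. A TreeMap is used only so that the printed order is predictable.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

class GroupingSketch {

    // Hypothetical record type for the example.
    record Order(String region, int amount) {}

    // Groups orders by region and sums the amounts within each group.
    static Map<String, Integer> totalByRegion(List<Order> orders) {
        return orders.stream()
            .collect(Collectors.groupingBy(
                Order::region,
                TreeMap::new,                          // sorted keys for stable output
                Collectors.summingInt(Order::amount)));
    }

    public static void main(String[] args) {
        List<Order> orders = List.of(
            new Order("eu", 10),
            new Order("us", 20),
            new Order("eu", 5));
        System.out.println(totalByRegion(orders)); // prints {eu=15, us=20}
    }
}
```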
When Streams Help
- straightforward transformations and aggregations
- collection processing where each step is conceptually distinct
- code that benefits from a pipeline style
When Plain Loops Are Better
- highly stateful logic
- error handling that becomes awkward in a stream chain
- performance-sensitive paths where allocation and readability must be controlled carefully
Common Pitfalls
- overusing streams for complex business workflows that become unreadable
- assuming streams are always faster than loops
- using shared mutable state inside stream operations
- misunderstanding laziness and side effects
Performance Considerations
Performance in Java is not just about writing "fast code." It is about understanding tradeoffs across CPU, memory, I/O, latency, and concurrency.
Key Areas to Watch
- object allocation rate
- unnecessary boxing and unboxing
- poor data structure choice
- lock contention
- blocking I/O on critical threads
- large heap pressure and GC pauses
Example: Data Structure Choice Matters
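If you repeatedly test membership in a list of one million IDs, a HashSet is usually a much better fit than a List. HashSet.contains is a hash lookup that takes roughly constant time on average, while List.contains scans the elements one by one and takes time proportional to the size of the list.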
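The sketch below runs the same membership checks against both structures. It uses 100,000 IDs instead of one million so the list scan finishes quickly, and the timing it prints is a rough indication only, not a rigorous benchmark. All names are made up for the example.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class MembershipSketch {

    // Counts how many probe values are present in the collection.
    static long countHits(Collection<Integer> ids, int[] probes) {
        long hits = 0;
        for (int probe : probes) {
            if (ids.contains(probe)) {
                hits++;
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        int size = 100_000;
        List<Integer> list = new ArrayList<>();
        for (int i = 0; i < size; i++) {
            list.add(i);
        }
        Set<Integer> set = new HashSet<>(list);

        int[] probes = new int[1_000];
        for (int i = 0; i < probes.length; i++) {
            probes[i] = i * 200; // half of the probes fall outside the ID range
        }

        long t0 = System.nanoTime();
        long listHits = countHits(list, probes); // linear scan per probe
        long t1 = System.nanoTime();
        long setHits = countHits(set, probes);   // hash lookup per probe
        long t2 = System.nanoTime();

        System.out.printf("list: %d hits in %d us%n", listHits, (t1 - t0) / 1_000);
        System.out.printf("set:  %d hits in %d us%n", setHits, (t2 - t1) / 1_000);
    }
}
```

Both calls return the same hit count. Only the cost of each contains call differs, and the gap widens as the collection grows.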
Example: Avoiding Work on Hot Paths
If a request path is called tens of thousands of times per second, avoid unnecessary logging, object churn, repeated parsing, or expensive string formatting.
Production Perspective
Java performance work should be evidence-driven. Good engineers measure before changing code. They use:
- metrics and tracing
- profiling tools
- thread dumps
- heap dumps
- realistic load tests
Premature optimization is a problem, but ignoring obvious bottlenecks is also a problem. Performance work is about judgment, not superstition.
How Advanced Java Shows Up in Real Systems
Consider a service that consumes messages from a queue:
- a thread pool pulls messages concurrently
- shared state such as rate limits or caches needs safe access
- visibility and ordering matter between worker threads
- processed payloads create object allocations on the heap
- stream pipelines may transform batches for enrichment or filtering
- GC and lock contention affect latency under load
This is why advanced Java matters. These are not academic concerns. They directly influence system correctness and performance.
Key Takeaways
- Concurrency is about safely making progress with multiple threads, not just spawning more work in parallel.
- synchronized, locks, atomics, and concurrent collections exist because shared mutable state is hard to manage correctly.
- volatile provides visibility, not full atomicity.
- The Java Memory Model explains why synchronization rules matter for correctness across threads.
- Garbage collection removes manual memory management, but memory leaks and GC-related latency issues still exist.
- Streams are powerful for data transformation pipelines, but they are not automatically clearer or faster in every situation.
- Strong Java engineers combine runtime understanding with measurement rather than guessing about performance.