🌐 What is the Fork/Join Framework?
The Fork/Join Framework is a high-performance parallel computing framework designed to split large tasks into smaller subtasks, which can be processed in parallel on multiple CPU cores. After processing, results are combined (joined) to produce the final result.
Key Concept: Divide-and-Conquer
✅ Fork: Split big task into smaller pieces (subtasks).
✅ Join: Combine results from subtasks into final result.
✅ Why Use Fork/Join?
✔️ Automatically scales with number of available CPU cores (uses a work-stealing pool).
✔️ Efficient for recursive tasks, like processing large arrays, matrices, or directories.
✔️ Makes parallelism easier by providing built-in mechanisms for splitting, processing, and combining work.
📦 Key Classes
Class/Interface | Purpose |
---|---|
ForkJoinPool | Special thread pool designed for parallel divide-and-conquer tasks |
RecursiveTask<V> | For tasks that return a result |
RecursiveAction | For tasks that return no result (void work) |
🔎 Example — Summing Array Elements (RecursiveTask)
Step 1: Define Task (RecursiveTask)
import java.util.concurrent.RecursiveTask;
public class SumTask extends RecursiveTask<Long> {
private static final int THRESHOLD = 100; // When to stop splitting
private final int[] array;
private final int start, end;
public SumTask(int[] array, int start, int end) {
this.array = array;
this.start = start;
this.end = end;
}
@Override
protected Long compute() {
if (end - start <= THRESHOLD) {
// Small enough — compute directly
long sum = 0;
for (int i = start; i < end; i++) {
sum += array[i];
}
return sum;
} else {
// Split into two subtasks
int mid = (start + end) / 2;
SumTask leftTask = new SumTask(array, start, mid);
SumTask rightTask = new SumTask(array, mid, end);
// Fork both subtasks
leftTask.fork(); // Asynchronously compute left
long rightResult = rightTask.compute(); // Compute right in current thread
long leftResult = leftTask.join(); // Wait for left result
// Combine result
return leftResult + rightResult;
}
}
}
Step 2: Set Up Pool and Execute
import java.util.concurrent.ForkJoinPool;
public class ForkJoinExample {
public static void main(String[] args) {
ForkJoinPool pool = new ForkJoinPool(); // Uses all available cores
int[] array = new int[1_000_000];
for (int i = 0; i < array.length; i++) {
array[i] = i + 1; // Fill array with numbers 1 to 1_000_000
}
SumTask task = new SumTask(array, 0, array.length);
long result = pool.invoke(task); // Start the recursive process
System.out.println("Sum: " + result); // Expected: 500000500000
}
}
🔗 Explanation
Step | What Happens? |
---|---|
Create Task | SumTask created for whole array |
Fork/JoinPool.invoke() | Pool starts processing the task |
Task Splits | Task splits into two subtasks if size > threshold |
Fork | Subtasks are submitted back to pool |
Join | Parent waits for subtasks to complete |
Combine | Subtask results are combined (added) |
Final Result | Parent returns combined result |
⚡ Key Benefits
✅ Automatically manages thread usage.
✅ Uses work-stealing — idle threads steal work from busy threads.
✅ Optimized for CPU-bound tasks (not ideal for I/O).
📊 Summary Table — RecursiveTask vs RecursiveAction
Feature | RecursiveTask | RecursiveAction |
---|---|---|
Returns Result? | ✅ Yes | ❌ No |
Example Task | Sum numbers | Process files (no result needed) |
Typical Return Type | Integer, Long, etc. | void |
🚀 Work-Stealing — Built-in Optimization
- Each thread has its own queue of tasks.
- If a thread finishes its work and finds nothing in its own queue, it steals a task from another busy thread’s queue.
- This improves load balancing automatically.
🔥 When to Use Fork/Join
✅ Recursive, divide-and-conquer problems
✅ CPU-bound processing (arrays, matrices, graphs)
✅ Large tasks that can be split into independent subtasks
❌ When NOT to Use
🚫 For simple parallel loops, parallelStream()
is often better.
🚫 For blocking I/O tasks, regular ExecutorService
is better.
📣 Final Pro Tip for Interviews
✅ Explain:
“Fork/Join is designed for parallel divide-and-conquer problems. It splits tasks recursively into smaller parts, executes them in parallel, and combines results — with automatic work-stealing for optimal thread utilization.”
✅ If asked to compare with ExecutorService
:
Feature | Fork/Join | ExecutorService |
---|---|---|
Best for | Recursive CPU-bound work | Independent tasks (web requests, I/O) |
Work-Stealing | ✅ Yes | ❌ No |
Recursive Task Handling | ✅ Built-in | ❌ Manual split/merge required |
Connection to parallel streams
Yes, Fork/Join and parallel streams actually have quite a bit in common — but they operate at slightly different levels of abstraction. Let me break it down:
✅ What they have in common
1. Both rely on the ForkJoinPool
- Parallel Streams internally use the common ForkJoinPool to parallelize stream operations.
- If you create your own Fork/Join task, you can either use the common pool or a custom ForkJoinPool.
2. Both work with the divide and conquer approach
- Both frameworks split large tasks into smaller subtasks that can be processed in parallel, and then the results are combined.
- This recursive splitting and merging is central to both.
3. Both are about parallelism (multithreading across multiple cores)
- The goal is to break a workload into independent tasks that can be computed in parallel.
- Both benefit from multicore processors.