Just as a spoonful of sugar will cure your hiccups, running your JVM with -XX:+UseShenandoahGC will cure your Java garbage collection hiccups. Shenandoah GC is a new garbage collection algorithm developed for OpenJDK at Red Hat, which will produce much shorter pause times than the currently available algorithms without a significant decrease in throughput. In this session, we'll explain how Shenandoah works and compare it to the currently available OpenJDK garbage collectors.
10. Brief Intro to Compacting GCs
Heap after several binary tree modifications
Heap after compaction and reclamation of unreachable objects
Two phases:
1) Trace
2) Compact
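The two phases can be illustrated on a toy heap. This is a minimal sketch, not how HotSpot lays out memory: `ToyObj` and `ToyHeap` are hypothetical names, and the "heap" is just a list of slots.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Toy model only: an object with outgoing references and a mark bit.
class ToyObj {
    final String name;
    final List<ToyObj> refs = new ArrayList<>();
    boolean marked;
    ToyObj(String name) { this.name = name; }
}

class ToyHeap {
    final List<ToyObj> slots = new ArrayList<>();

    // Phase 1: trace everything reachable from the roots.
    void trace(List<ToyObj> roots) {
        Deque<ToyObj> work = new ArrayDeque<>(roots);
        while (!work.isEmpty()) {
            ToyObj o = work.pop();
            if (o.marked) continue;      // already visited
            o.marked = true;
            work.addAll(o.refs);
        }
    }

    // Phase 2: slide live objects to the front and drop the unreachable ones.
    void compact() {
        List<ToyObj> live = new ArrayList<>();
        for (ToyObj o : slots) {
            if (o.marked) {
                o.marked = false;        // reset the mark for the next cycle
                live.add(o);
            }
        }
        slots.clear();
        slots.addAll(live);
    }
}
```

After `trace` and `compact`, the surviving objects sit contiguously at the start of `slots`, which is the property the "heap after compaction" picture shows.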
11. Concurrent Tracing
● Solved Problem
● Snapshot At The Beginning (SATB)
– Used by several OpenJDK GC algorithms
● CMS
● G1
● Shenandoah
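As a rough illustration of the SATB idea: while marking is active, a reference store first records the value it is about to overwrite, so everything reachable at the start of marking still gets traced. The sketch below is a single-threaded toy; `SatbNode`, `SatbSketch`, `markingActive`, and `satbQueue` are hypothetical names, not OpenJDK APIs.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class SatbNode {
    SatbNode next;
}

class SatbSketch {
    // Assumed flag: set by the collector while concurrent marking runs.
    static boolean markingActive;
    // Overwritten references the marker must still visit.
    static final Deque<SatbNode> satbQueue = new ArrayDeque<>();

    // Every store to a reference field goes through this barrier.
    static void storeNext(SatbNode holder, SatbNode newValue) {
        SatbNode old = holder.next;
        if (markingActive && old != null) {
            satbQueue.add(old);          // preserve the snapshot
        }
        holder.next = newValue;
    }
}
```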
15. What we want to happen when the GC Thread copies Foo
● Before: T1, T2, and T3 reference Foo
● After: T1, T2, and T3 reference Foo'
But finding and updating all the references to Foo takes time.
16. What Shenandoah does
● Before: T1, T2, T3 → [Indirection Pointer | Foo]
● After: T1, T2, T3 → [Indirection Pointer | Foo] → [Indirection Pointer | Foo']
Almost as good, as long as all accesses go through the forwarding (indirection) pointer.
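The indirection scheme can be modeled in a few lines of Java. This is a sketch under stated assumptions: `Cell`, `forwardee`, `resolve`, and `evacuate` are hypothetical names, and a real collector keeps the pointer in a word next to the object rather than in a Java field.

```java
// Hypothetical model: every object carries a forwarding reference that
// points to itself until the GC copies the object.
class Cell {
    volatile Cell forwardee = this;
    int value;

    Cell(int value) { this.value = value; }

    // All accesses go through the forwarding pointer.
    static Cell resolve(Cell c) { return c.forwardee; }
}

class ForwardingDemo {
    // What the GC thread does: copy the object, then publish the copy
    // with a single store to the old object's forwarding pointer.
    static Cell evacuate(Cell from) {
        Cell to = new Cell(from.value);
        from.forwardee = to;
        return to;
    }
}
```

T1, T2, and T3 can keep their stale references to the old object: `Cell.resolve` takes them to the copy without the collector having to find and update each reference first.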
18. Read Barriers: Reading a Field
● Without Shenandoah
0x00007fffe1102cd1: mov 0x10(%rsi),%rsi ;*getfield value
; - java.lang.String::equals@20 (line 982)
● With Shenandoah
0x00007fffe1102ccd: mov -0x8(%rsi),%rsi ; read the contents of the indirection pointer for the address contained in register rsi back into rsi
0x00007fffe1102cd1: mov 0x10(%rsi),%rsi ;*getfield value
; - java.lang.String::equals@20 (line 982)
A smart compiler will fill the delay slots.
19. But there is still a race condition
● Java Thread
Read ResolveLocation(Foo – 0x8)
…
Writes to Foo
● GC Thread
...
Copies Foo to Foo'
● Solution: Copying write barriers.
● Java threads aid in evacuation: they never write to an object targeted for evacuation. They copy it first, then write to the copy.
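A copying write barrier can be sketched as follows. This is an illustration under assumptions, not Shenandoah's actual code: `Box`, `inCollectionSet`, `writeBarrier`, and `putCount` are hypothetical names, and the sketch ignores how the "targeted for evacuation" flag is published safely to running threads.

```java
import java.util.concurrent.atomic.AtomicReference;

class Box {
    // Forwarding pointer; refers to this object until a copy is installed.
    final AtomicReference<Box> forwardee = new AtomicReference<>();
    volatile int count;
    // Assumed flag: true when the object sits in a region being evacuated.
    volatile boolean inCollectionSet;

    Box(int count) {
        this.count = count;
        forwardee.set(this);
    }
}

class WriteBarrierSketch {
    // Returns the object that is safe to write to.
    static Box writeBarrier(Box obj) {
        Box current = obj.forwardee.get();
        if (current != obj || !obj.inCollectionSet) {
            return current;              // already copied, or not targeted
        }
        Box copy = new Box(obj.count);
        // Java threads and the GC thread race to install a copy; one wins.
        if (obj.forwardee.compareAndSet(obj, copy)) {
            return copy;
        }
        return obj.forwardee.get();      // lost the race: use the winner's copy
    }

    static void putCount(Box obj, int value) {
        writeBarrier(obj).count = value;
    }
}
```

Because the write always lands on whatever the forwarding pointer names after the compare-and-set, a write can no longer slip in between the GC thread's copy and its pointer update.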
20. Write Barrier
0x00007fffe1110318: movabs $0x7fffec0b92c0,%rax
0x00007fffe1110322: mov (%rax,%rbx,1),%al
0x00007fffe1110325: test $0x1,%al ← evacuation in progress?
0x00007fffe1110328: je 0x00007fffe1110339 ← if not jump to putfield
0x00007fffe111032e: xchg %rdi,%rax
0x00007fffe1110331: callq 0x00007fffe10ffd20 ; {runtime_call} ← else make a call out to the runtime to copy the object to an evacuation region.
0x00007fffe1110336: xchg %rax,%rdi
0x00007fffe1110339: mov %esi,0x10(%rdi) ;*putfield count
; - java.util.Hashtable::addEntry@83 (line 436)
22. So, what do these barriers cost?
● Not as much as you might think.
– Barrier Optimizations
● New Objects
● Immutable Fields
● Array Size
● Class Pointers
● Read after Read
● Read after Write
● Hoisting
23. We ran several DaCapo Benchmarks Without Any GC Activity
Benchmark   Shenandoah   G1       Percentage Overhead
Avrora      2096ms       2052ms   2.1%
FOP         1103ms       1044ms   5.6%
LUIndex     861ms        832ms    3.5%
25. Why Not Generational?
● The generational hypothesis is the observation that, in most cases, young objects are much more likely to die than old objects.
– Memory management Glossary
26. Why Not Generational?
● LRU Benchmark
– Models a URL cache mapping URL to web page content.
– Generational GC pays a steep penalty for copying data.
Collector    Total Time   Total Pause Time   Average Pause Time   Max Pause Time
Shenandoah   15167ms      3.81s              23.19ms              44.85ms
G1           178244ms     11.89s             116.60ms             230.573ms
28. Currently Available OpenJDK GCs
● Serial GC
– Small Footprint
– Minimal overhead
● Parallel GC
– High Throughput
● G1
– Managed Pause Times
– Compaction
● ParNew/CMS
– Minimal Pause Times
30. Shorter Pause Times
● We are moving more of our work into concurrent phases to meet the original 10ms goal.
31. Shenandoah 2.0
● Observations
– Marking the entire heap takes a long time and touches rarely used parts of memory.
– Garbage is only created by stack changes or writes to the heap.
32. Focus GC wherever writes are happening.
● Generational Application
– Writes happening in recently allocated regions
● LRU
– Writes happening in oldest regions
33. Shenandoah 2.0 Theory
● Keep track of writes to regions.
● Focus on regions which have changed
● Collect Region Sets together.
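One way to picture "keep track of writes to regions" is a per-region write counter indexed by address. The sketch below is a guess at the shape of such bookkeeping, not the Shenandoah 2.0 design: `RegionWriteTracker`, the power-of-two region size, and the heap-offset addressing are all assumptions.

```java
import java.util.ArrayList;
import java.util.List;

class RegionWriteTracker {
    private final int regionShift;   // log2 of the region size in bytes (assumed)
    private final long[] writes;     // write count per region

    RegionWriteTracker(int regionCount, int regionShift) {
        this.regionShift = regionShift;
        this.writes = new long[regionCount];
    }

    // Called for each heap write; heapOffset is the byte offset from heap start.
    void recordWrite(long heapOffset) {
        writes[(int) (heapOffset >>> regionShift)]++;
    }

    // Regions that have changed since the last reset: where the GC should focus.
    List<Integer> changedRegions() {
        List<Integer> changed = new ArrayList<>();
        for (int i = 0; i < writes.length; i++) {
            if (writes[i] > 0) changed.add(i);
        }
        return changed;
    }

    void reset() {
        java.util.Arrays.fill(writes, 0L);
    }
}
```

With 1 MB regions (`regionShift` of 20), a generational workload would show up as writes clustered in the most recently allocated regions, while the LRU case would show them in the oldest ones.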
34. Table of Inter-Region References
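[Table: an 8 x 8 grid with regions 0-7 as both rows and columns; an X marks a reference between two regions. Rows 3, 5, and 7 each contain an X.]
Regions 3 & 5 are collected together.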
35. Table of Inter-Region References
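[Table: an 8 x 8 grid with regions 0-7 as both rows and columns; an X marks a reference between two regions. Rows 3, 5, and 7 each contain an X.]
Scan region 7 when collecting region 6.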
37. Partial Collections
● Scan Thread Stacks and Other Roots
● Scan Entire Region Group and Referencing Regions.
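The inter-region table and the scan rule for a partial collection can be sketched together. The sketch assumes that a row is the referencing region and a column is the referenced region, and that regions 3 and 5 reference each other while region 7 references region 6; `InterRegionTable`, `recordReference`, and `regionsToScan` are hypothetical names.

```java
import java.util.BitSet;

class InterRegionTable {
    // refs[from].get(to) is true when region `from` holds a reference into `to`.
    private final BitSet[] refs;

    InterRegionTable(int regionCount) {
        refs = new BitSet[regionCount];
        for (int i = 0; i < regionCount; i++) {
            refs[i] = new BitSet(regionCount);
        }
    }

    void recordReference(int from, int to) {
        if (from != to) refs[from].set(to);   // only inter-region references matter
    }

    // Regions to scan when collecting `group`: the group itself plus every
    // region that references into it. Thread stacks and other roots are
    // scanned separately and are not modeled here.
    BitSet regionsToScan(BitSet group) {
        BitSet scan = (BitSet) group.clone();
        for (int from = 0; from < refs.length; from++) {
            if (refs[from].intersects(group)) scan.set(from);
        }
        return scan;
    }
}
```

Under those assumptions, collecting the group {3, 5} scans only regions 3 and 5, and collecting region 6 alone also scans region 7.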
38. Region Groups Help NUMA
● Regions that reference each other will be collected together.
39. NUMA Aware GC Threads
● NUMA node 1 ● NUMA node 2
[Figure: each NUMA node has its own Java threads and concurrent GC threads. Heap regions are labeled N1 Region, N2 Region, Shared Region, or Empty Region.]