Garbage Collectors (HotSpot VM)

There are a variety of options regarding which garbage collector to use in Java. For the Hotspot VM, the set of garbage collectors which can be used includes:

  • Serial GC (the default on single-processor / client-class machines)
  • Parallel GC (throughput; the default on server-class machines)
  • CMS (Concurrent Mark and Sweep) GC (low latency)
  • G1 GC (the next-generation replacement for CMS)

Determining which garbage collector is being used in your application is not necessarily straightforward; it depends on various settings and your machine's hardware. You can, however, add the following command-line flag to your java invocation to print the selected GC to the console:

java -XX:+PrintCommandLineFlags -version
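If you want to pick a collector yourself rather than accept the default, HotSpot exposes a selection flag per collector. A sketch of the standard flags is below (app.jar is a placeholder; exact flag availability varies by JDK version, and the CMS flag was deprecated and later removed in modern JDKs):

```shell
# Select a collector explicitly (one at a time):
java -XX:+UseSerialGC -jar app.jar          # Serial GC
java -XX:+UseParallelGC -jar app.jar        # Parallel (throughput) GC
java -XX:+UseConcMarkSweepGC -jar app.jar   # CMS GC
java -XX:+UseG1GC -jar app.jar              # G1 GC
```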

Terminology

Eden – Where new objects are allocated.
Survivor Spaces – Where objects are moved and aged if they survive a minor GC.
Young Generation – Eden + Survivor Spaces
Old Generation – Where aged objects are promoted if they survive a threshold number of GCs.
Stop-The-World – An event that requires all application threads to pause during its execution.

The Serial GC

The serial garbage collector is the most basic one. Its minor and major collection stages are both stop-the-world events. After each cycle, the Serial GC compacts the old generation's heap toward the bottom to prevent fragmentation. This makes new allocations (on object promotion to the old generation) very fast: the collector just takes the top-of-heap pointer and moves it forward by the size of the new object, since everything beyond that pointer is known to be free space. This GC does not parallelize its operation. It is good for client-side applications that can tolerate substantial pauses, or in situations where processors aren't highly available (to reduce thread contention).
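The bump-the-pointer allocation described above can be sketched with a toy model. This is purely illustrative (a simulated region backed by a byte array and a single top index), not how HotSpot actually represents its heap:

```java
//Illustrative sketch of bump-the-pointer allocation in a
//compacted region. Everything at or above 'top' is known to be
//free, so allocating is just an index increment.
public class BumpPointerHeap {

    private final byte[] heap;
    private int top = 0; //First free byte; all bytes >= top are free.

    public BumpPointerHeap(int size) {
        heap = new byte[size];
    }

    //Returns the start offset of the new object, or -1 if the
    //region is exhausted (a real GC would collect or promote here).
    public int allocate(int objectSize) {
        if (top + objectSize > heap.length) return -1;
        int start = top;
        top += objectSize; //"Bump" the pointer forward.
        return start;
    }

    public static void main(String[] args) {
        BumpPointerHeap h = new BumpPointerHeap(64);
        System.out.println(h.allocate(16)); //0
        System.out.println(h.allocate(16)); //16
        System.out.println(h.allocate(48)); //-1, region full
    }
}
```

Because compaction guarantees that free space is one contiguous block, no free-list search is needed; contrast this with CMS's free-list approach below.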

The Parallel GC

The parallel garbage collector works much like the serial garbage collector. The only significant difference is that both minor and full garbage collections take place in parallel (across multiple threads) to increase throughput. It is suitable for server machines with lots of resources, and is primarily used in batch processing or data-crunching applications that care more about overall throughput than pause/response time.

CMS – Concurrent Mark and Sweep GC

The CMS garbage collector has been the preferred GC for applications with rapid response time requirements for some time (though the new G1 GC is coming to replace it soon). It has greatly shortened stop-the-world phases by doing a substantial part of its marking work while the user is still using the system. Its implementation means the major GCs are much faster, but the minor ones are a little slower. Its young generation works the same way as the previous GCs, but its old generation works by:

  • Initial Marking – Stop the World – identifying objects immediately reachable from outside the old generation.
  • Concurrent Marking – Marks all live objects reachable from the initial set.
  • Pre-Cleaning – Revisits objects modified during concurrent marking, reducing work for remark.
  • Remark – Stop the World – Visits all objects modified during the Pre-Cleaning phase, ensuring none are missed.
  • Concurrent Sweep – With all live objects now marked, this phase sweeps the heap and de-allocates garbage objects.

The CMS does not compact its old generation heap during the concurrent sweep; instead it maintains “free lists” to track free memory. This reduces the pause time, but makes each promotion to the old generation take longer. Fragmentation can occur, causing new allocations to fail; in that case, a special stop-the-world compacting phase can be triggered for this GC, resulting in a longer than standard pause.

The CMS requires a bigger heap as new objects are still being created during concurrent marking. Object counts only decrease during the sweep phase. It should also be noted that while the CMS is guaranteed to mark all live objects, it may miss some garbage objects if they become garbage during the concurrent phases. These objects will then “float” to the next cycle before being reclaimed. The CMS GC is widely used in web services and similar applications.

G1 GC – Replacement for CMS

According to the Oracle site documentation at http://www.oracle.com/technetwork/tutorials/tutorials-1876574.html, the G1 garbage collector is a server style GC for high powered machines. It is designed to meet pause time goals with a high probability while achieving higher throughput than the CMS. It is a compacting GC which avoids CMS’s fragmentation issues.

The G1 collector takes a different stance than the previous HotSpot collectors; it does not partition the heap into young, old, and permanent generations. Instead, equal-sized heap regions are maintained in contiguous ranges. Sets of regions are assigned the same roles (Eden, Survivor, Old), but no fixed size is given to each role, which increases flexibility. When a GC is executed, a concurrent global marking phase runs, and that information is used to locate the regions that are mostly empty. These regions are targeted first, and their contents are evacuated (copied) in parallel to other regions in the heap as part of a compacting process. The time taken to evacuate regions is monitored to maintain a reasonable estimate of the pause times that will be encountered while collecting future regions. This estimate is used to meet the user-provided pause-time goal whenever possible.


Concurrent Collections in Java

This post is intended to tell you what concurrent collections are, and why they are an improvement over the synchronized collections which have been available since the first versions of the JDK. You will also be introduced to some basic multi-threaded terminology regarding check-then-act sequences and composite operations.

If you are aware of what concurrent collections are and just want information on which ones are available and what their benefits are, you should view [this post] instead.

Synchronized Collections

Before version 1.5, the “thread-safe” collections provided with Java were not very sophisticated. They were called synchronized collections, and they worked by simply wrapping an instance of a collection (e.g. a hash map), and putting the “synchronized” keyword on all of the wrapper methods.

This might sound okay at first, but if we think about it, there are a number of issues:

  • Contention; every method with the “synchronized” keyword requires the monitor lock of the wrapper object. This means that while multiple threads may be attempting to use the collection at once, only one at a time can actually do so. The rest perform a blocking wait.
  • Composite operations; many useful operations require a check-then-act sequence, or are otherwise composed of multiple operations. Synchronized methods guarantee that each individual method executes without interference from other threads, but they do not help you compose operations. So, if you wanted two threads to (1) check if a value exists, and (2) add it if it didn’t already exist, you would have to use additional client-side locking to make sure those operations were executed together. Without the additional locking, the operations of the two threads could interleave, resulting in:
    • Thread 1 and thread 2 both check and find {x} does not exist.
    • Thread 1 adds {x}.
    • Thread 2 tries to add {x} and fails because it already exists.

An interesting thing to note here is that synchronized collections always synchronize on a single, publicly accessible lock (the wrapper object itself). This is actually a benefit in some ways, because it means that the client can acquire that same lock to form their own composite operations. Since concurrent collections use various internal locking mechanisms, it is not possible to use client-side locking to form your own composite operations with them. So, you get built-in, better-performing common composite operations with concurrent collections, but you lose the ability to make your own.
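As a sketch of that trade-off, the check-then-act sequence from the bullet list above can be made safe on a synchronized collection with client-side locking. Per the Collections.synchronizedMap documentation, the lock must be the wrapper object returned by the factory method (the key and value types here are arbitrary):

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

public class ClientSideLocking {

    private static final Map<String, Integer> map =
            Collections.synchronizedMap(new HashMap<>());

    //A put-if-absent built from client-side locking. Holding the
    //wrapper's lock across both steps ensures no other thread can
    //interleave between the check and the act.
    public static Integer putIfAbsent(String key, Integer value) {
        synchronized (map) {
            if (!map.containsKey(key)) {
                return map.put(key, value); //Returns null on success.
            }
            return map.get(key);
        }
    }

    public static void main(String[] args) {
        System.out.println(putIfAbsent("x", 1)); //null - newly added
        System.out.println(putIfAbsent("x", 2)); //1 - already present
    }
}
```

Note that this only works because every other access to the map also goes through the wrapper's lock; with a concurrent collection there is no such common lock, which is why the composite operations must be built in.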

Concurrent Collections

In Java 1.5, concurrent collection classes were added to address the shortcomings of the synchronized collections. Unlike synchronized collections, concurrent collections are not implemented by serializing access to the underlying class via synchronized methods. Instead, they were designed to minimize the scope of internal locking wherever possible. They allow multiple parallel readers and a limited number of parallel writers. They also may implement different interfaces to allow them to provide additional operations useful in a concurrent setting. For example, common composite operations like put-if-absent and replace are built right into ConcurrentHashMap, which implements the ConcurrentMap interface.

Let’s take a look at the JavaDoc for the ConcurrentMap interface to see an example: https://docs.oracle.com/javase/8/docs/api/java/util/concurrent/ConcurrentMap.html.

We can see the following:

Methods inherited from java.util.Map:

clear, containsKey, containsValue, entrySet, equals, get, hashCode, isEmpty, keySet, put, putAll, remove, size, values

New Methods Specified:

putIfAbsent : If the specified key is not already associated with a value, associate it with the given value.
remove : Removes the entry for a key only if currently mapped to a given value.
replace : Replaces the entry for a key only if currently mapped to some value.

You can refer to the documentation link above to get more information about these composite methods; this list was copied from there. The purpose of this post is just to inform you that they exist and are better supported by concurrent collections than by synchronized collections.

Sample Composite Operation

Now… let’s look at an example of a composite operation. A typical composite operation is a check-then-act sequence where you (1) check a condition and then (2) act on it. Whenever you do this, you open the door to threading issues due to interleaving of operations, as mentioned earlier. The “putIfAbsent()” operation in the ConcurrentMap interface is an example of a concurrent collection providing composite-action support. It performs the following atomically to ensure no interleaving can occur:

V putIfAbsent(K key, V value):

              if (!map.containsKey(key))
                  return map.put(key, value);
              else
                  return map.get(key);
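A short usage sketch of the built-in version (the keys and values here are arbitrary):

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class PutIfAbsentExample {
    public static void main(String[] args) {
        ConcurrentMap<String, Integer> ids = new ConcurrentHashMap<>();

        //The first call wins and returns null; no client-side
        //locking is needed because the check and the put happen
        //atomically inside the collection.
        System.out.println(ids.putIfAbsent("alice", 1)); //null
        System.out.println(ids.putIfAbsent("alice", 2)); //1
        System.out.println(ids.get("alice"));            //1
    }
}
```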

Wrapping Up

Now that you’ve been exposed to the concepts of synchronized collections and concurrent collections, you should take a look at [next post] to see the various kinds of concurrent collections, and the new operations they support. Knowing what’s available in the Java libraries in advance can save you from implementing many of these features yourself (and having to test them) in the future.

Binary Search Algorithm

What is a Binary Search?

The class below implements and tests a binary search algorithm. A binary search works by taking a sorted array and repeatedly examining the middle element of the current search range. If the middle element is the target, its index is returned. If the target is lower than the middle element, the search repeats on the elements to the left of the midpoint. If the target is higher than the middle element, the search repeats on the elements to the right of the midpoint.

How Does it Perform?

This halves the search range on each attempt, which gives the algorithm O(log n) performance.  To put this in perspective, this means you can find the target element in a one-billion record array within 30 iterations (30 being the worst-case scenario).
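The 30-iteration figure can be checked directly: each step halves the remaining range, so it takes ⌈log2(1,000,000,000)⌉ ≈ 30 halvings to get down to a single element. A quick sketch:

```java
public class LogCheck {
    public static void main(String[] args) {
        long n = 1_000_000_000L;
        //Count how many halvings it takes to reduce n to 1.
        int steps = 0;
        while (n > 1) {
            n = (n + 1) / 2; //Ceiling halving: worst-case remaining size.
            steps++;
        }
        System.out.println(steps); //30
    }
}
```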

public class BinarySearch {

     //Perform binary search if the provided array actually
     //contains values.
     public static int binarySearch(int[] a, int target) {
         if (a == null || a.length == 0) return -1;
         return binarySearch(a, target, 0, a.length - 1);
     }

     //Recursive binary search algorithm taking in the array,
     //target value, and the start and end indices (the end index
     //is inclusive: the last element, not one past it).
     private static int binarySearch(int[] a, int target,
             int start, int end) {

         //If the search range is empty, the target is not present.
         //(Checking start > end rather than start == end prevents
         //infinite recursion when the target is below every element.)
         if (start > end) return -1;

         //Get the midpoint index (computed this way to avoid int
         //overflow on very large arrays) and the value present in it.
         int midIndex = start + (end - start) / 2;
         int midValue = a[midIndex];

         //If the midpoint is the target, return its index.
         if (midValue == target) return midIndex;

         //If the midpoint wasn't the target, recurse to look lower
         //if the value was less than the middle value or recurse to
         //search higher if the value was greater than the midpoint.
         if (target < midValue)
             return binarySearch(a, target, start, midIndex - 1);

         return binarySearch(a, target, midIndex + 1, end);
     }

     //Test the algorithm and throw an exception on failure.  This
     //should be a JUnit test, but it is being left as a main in
     //this class to make the example standalone.
     public static void main(String[] args) throws Exception{

         //Make a sample array - must be sorted for search to work.
         int[] a = { 1, 4, 7, 9, 13, 15, 18, 21, 37, 56, 72 };

         final String error = "Unexpected result, algorithm failed.";

         //Run test assertions.
         if(!(binarySearch(a, 9) == 3)) throw new Exception(error);
         if(!(binarySearch(a, 4) == 1)) throw new Exception(error);
         if(!(binarySearch(a, 7) == 2)) throw new Exception(error);
         if(!(binarySearch(a, 1) == 0)) throw new Exception(error);
         if(!(binarySearch(a, 13) == 4)) throw new Exception(error);
         if(!(binarySearch(a, 15) == 5)) throw new Exception(error);
         if(!(binarySearch(a, 18) == 6)) throw new Exception(error);
         if(!(binarySearch(a, 21) == 7)) throw new Exception(error);
         if(!(binarySearch(a, 37) == 8)) throw new Exception(error);
         if(!(binarySearch(a, 56) == 9)) throw new Exception(error);
         if(!(binarySearch(a, 72) == 10)) throw new Exception(error);

         //Test some cases that should return -1 (not found).
         if(!(binarySearch(a, 27) == -1)) throw new Exception(error);
         if(!(binarySearch(a, 10) == -1)) throw new Exception(error);

         //Print the success log.
         System.out.println("All results were as expected.");
     }
 }
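The same algorithm is often written iteratively to avoid the recursion overhead. A sketch of the iterative form, using the loop-until-empty-range condition and the overflow-safe midpoint, might look like:

```java
public class IterativeBinarySearch {

    //Iterative binary search; returns the index of target in the
    //sorted array a, or -1 if it is not present.
    public static int search(int[] a, int target) {
        int start = 0, end = a.length - 1;
        while (start <= end) {
            //Overflow-safe midpoint computation.
            int mid = start + (end - start) / 2;
            if (a[mid] == target) return mid;
            if (target < a[mid]) end = mid - 1; //Search the left half.
            else start = mid + 1;               //Search the right half.
        }
        return -1; //Range became empty; target not present.
    }

    public static void main(String[] args) {
        int[] a = { 1, 4, 7, 9, 13, 15, 18, 21, 37, 56, 72 };
        System.out.println(search(a, 9));  //3
        System.out.println(search(a, 10)); //-1
    }
}
```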

Dijkstra’s Algorithm (Java)

Dijkstra’s algorithm finds the shortest path from a source node to a target node in a directed or undirected graph with non-negative edge weights.

It’s actually a pretty simple algorithm once you break it down properly.

Solution Explanation

It would be fair to summarize the entire solution as “do a best-first traversal of the graph from the source node (always expanding the cheapest unvisited node), and as we process each edge, update the node it points to if that edge gives us a shorter path to it.”

The specific steps in this approach can be implemented as follows:

  1. Give all nodes in the graph a value of Integer.MAX_VALUE to indicate they have not been visited.  Also, set their parent node index to -1 to indicate that we haven’t found their parent yet in the shortest path solution.
  2. Give the node you are starting from a value of 0.
  3. Put all nodes into a minimum priority queue (a data structure that always gives the minimum cost element out next).  The starting node will automatically be the first in the queue as it has value 0.
  4. While the queue is not empty:
    • Take the next node N from the queue.
    • If N is the target/destination node, then the hard part is done!  Record that we found the target and break.
    • Otherwise, loop over each edge E in N‘s adjacency list.  We’ll call the node E points to P.
      • If cost( N ) + weight( E ) < cost( P ), then:
        • Set the parent of node P to node N.  This means that we just found a better route from the source to node P.
        • Also, set the new cost of P to cost( N ) + weight( E ).
        • Also, remove and re-add P to the priority queue as we lowered its cost.  This will change its place in the queue.
  5. Assuming we did find the target node earlier, now we just have to print out the results.  To do this, we just start at the target node and follow the parent indexes backwards to the source node (which will still have -1 as its parent).

Implementation

import java.util.*;

public class Dijkstra {

    public static final int V = 5;

    private static class Node {
        int minimumPathCost;
        int node;
        int prev;
    }

    private static class Edge {
        public int weight;
        public int next;
        public Edge(int w, int n) {
            weight = w;
            next = n;
        }
    }

    //Undirected graph, have to add edges in both directions.
    public static void addEdge(List<List<Edge>> graph, int src, int dest, int weight) {
        graph.get(src).add(new Edge(weight, dest));
        graph.get(dest).add(new Edge(weight, src));
    }

    public static List<Integer> dijkstra(List<List<Edge>> graph, int source, int target) {

        //Add all to priority queue
        PriorityQueue<Node> q = new PriorityQueue<>(Comparator.comparingInt(o -> o.minimumPathCost));
        Node[] nodes = new Node[V];
        for (int i = 0; i < V; ++i) {
            Node n = new Node();
            n.minimumPathCost = i == source ? 0 : Integer.MAX_VALUE;
            n.node = i;
            n.prev = -1;
            nodes[i] = n;
            q.add(n);
        }

        //While the queue is not empty, take the next node and visit
        //all adjacent nodes, setting their minimumPathCost to this
        //node's minimumPathCost + the edge weight (if that's less
        //than their current cost). When updating the cost, also
        //record the predecessor node.
        while (!q.isEmpty()) {
            Node c = q.remove();
            if (c.node == target || c.minimumPathCost == Integer.MAX_VALUE) {
                break;
            }

            for (Edge e : graph.get(c.node)) {
                int newWeight = c.minimumPathCost + e.weight;
                if (newWeight < nodes[e.next].minimumPathCost) {
                    nodes[e.next].minimumPathCost = newWeight;
                    nodes[e.next].prev = c.node;
                    q.remove(nodes[e.next]);
                    q.add(nodes[e.next]);
                }
            }
        }

        //Rebuild the path from the source node to the target node.
        List<Integer> path = new ArrayList<>();
        int current = target;
        while(current != -1) {
            path.add(current);
            current = nodes[current].prev;
        }
        Collections.reverse(path);
        return path;
    }

    public static void main(String[] args) {

        //Build a sample graph. The most efficient
        //solution from 0 -> 4 is [0 -> 2 -> 3 -> 4].
        List<List<Edge>> graph = new ArrayList<>();
        for (int i = 0; i < V; ++i) {
            graph.add(new ArrayList<>());
        }
        addEdge(graph, 0, 2, 2);
        addEdge(graph, 0, 1, 18);
        addEdge(graph, 1, 3, 1);
        addEdge(graph, 2, 3, 3);
        addEdge(graph, 2, 4, 9);
        addEdge(graph, 3, 4, 1);

        //Execute the algorithm and print the solution path.
        System.out.println(dijkstra(graph, 0, 4));
    }
}

 

Verify New Code w/ Debugger & Loggers!

Quick Summary

This is just a short post that shows a real-life example of why it is often good to view the debug log and check things out in the debugger even when things seem to be working fine.

The Bug

Today I found a very small bug in some code with a very large impact (thankfully in our development environment).

Some Java code was mapping namespace strings to IDs.  It did this by:

  • Checking a cache for the mapping.
  • Checking the database for the mapping if it didn’t exist in the cache.
  • Creating the mapping in the DB if it didn’t exist and putting it into the cache.

This had all been working great… until I refactored it to allow nested namespaces.  Suddenly, the namespace being used in the key wasn’t a string, it was an object!

We were building the cache key explicitly by concatenating the namespace into the key string.  So, the fix was as simple as adding a toString() method to the new namespace class.  Without it, the default Object.toString() was used, which includes the instance’s identity hash code, so every single key coming in was unique!

So… effectively we got no caching and had our caches growing rapidly with useless data (the second part of which was somewhat okay as they were LRU caches).
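The failure mode can be sketched in a few lines. Object.toString() returns the class name plus an identity hash code, so two logically equal namespaces produce two different cache keys unless toString() is overridden. (The namespace classes here are simplified stand-ins for the real one.)

```java
//Simplified stand-in for the refactored namespace class.
//Without a toString() override, string concatenation falls back
//to Object.toString(), which includes an identity hash code.
class BadNamespace {
    final String name;
    BadNamespace(String name) { this.name = name; }
}

class GoodNamespace {
    final String name;
    GoodNamespace(String name) { this.name = name; }
    @Override
    public String toString() { return name; } //The one-line fix.
}

public class CacheKeyBug {
    public static void main(String[] args) {
        //Two logically identical namespaces...
        String badKey1 = "id-" + new BadNamespace("orders");
        String badKey2 = "id-" + new BadNamespace("orders");
        //...produce different keys, so the cache never hits.
        System.out.println(badKey1.equals(badKey2)); //false

        String goodKey1 = "id-" + new GoodNamespace("orders");
        String goodKey2 = "id-" + new GoodNamespace("orders");
        System.out.println(goodKey1.equals(goodKey2)); //true
    }
}
```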

Finding the Bug

This bug definitely could have been found with unit tests had one been there.  But… one wasn’t. We noticed the bug when a consumer dumped millions of records of data against our environment and pointed out that it was vastly slower than it had been before.

It was still working in a way that got it past our general testing; it was just slower.

Once the consumer pointed this out, I turned on debug logging and immediately saw that every single record was incurring the overhead of 6 database calls, even though this should have stopped after the first thousand records or so triggered data caching.

Sending just one request against my IDE immediately showed that the cache key we were using was garbage as per the reasons noted above.

Lesson Learned!

It immediately occurred to me how silly it is that I no longer had debug logging on in my IDE.  Also, on top of that, it occurred to me how silly it is that I don’t habitually take 30 seconds to watch the new code execute in a debugger just to verify it does what I intended.

Again, unit tests would catch many of these situations… but some standard debug logging and debugger checks can definitely help to significantly limit errors in new code.