Simple String Reversal (Java)

The purpose of this algorithm is to reverse a given string. It achieves this by advancing from the start of the string to the midpoint (or one before it) and swapping the current character with the one equally distant from the end of the string. The string is treated as an array of characters during this process, and that array is converted back into a string when returned.

public class ReverseString {

    //Reverse the provided string.
    public static String reverse(String input) {

        //Turn the string into an array of characters so we don't
        //keep rebuilding immutable strings as we reverse the
        //contents.
        char[] chars = input.toCharArray();

        //Advance from the beginning to the middle of the array.
        //Swap the current index (l) from the left side of the
        //array with the one equally distant from the end of the
        //right side (r).
        for (int l = 0; l < chars.length / 2; ++l) {
            int r = chars.length - 1 - l;
            swap(chars, l, r);
        }

        //Turn the array back into a string and return it.
        return new String(chars);
    }

    private static void swap(char[] a, int l, int r) {
        char c = a[r];
        a[r] = a[l];
        a[l] = c;
    }

    public static void main(String[] args) {
        //Test with even and odd number of characters.
        System.out.println(reverse("Hello World!"));
        System.out.println(reverse("Hello World"));
    }
}

Wait & Notify Concurrency in Java

This example demonstrates how to use wait() and notify() in order to safely publish objects from one thread to another using blocking semantics. In real applications you should use higher level threading constructs, but understanding these lower level mechanisms is useful, especially if you ever need to implement your own higher level classes.

The DropBox Class

The key class in this program is the DropBox class which can be seen below. This class provides put() and take() functions which are implemented with blocking semantics. These blocking semantics are achieved using the wait() and notifyAll() functions which will be described later.

public class DropBox {

    private int value;
    private boolean empty;

    public DropBox() {
        value = -1;
        empty = true;
    }

    public synchronized void put(int value) {
        while (!empty) {
            try { wait(); }
            catch(InterruptedException ex) {}
        }
        empty = false;
        this.value = value;
        notifyAll();
    }

    public synchronized int take() {
        while (empty) {
            try { wait(); }
            catch(InterruptedException ex) {}
        }
        empty = true;
        notifyAll();
        return value;
    }
}

Guarded Blocks

Both put() and take() work in a similar manner. Each starts with what is known as a guarded block. In the guarded block, the method polls a condition, in this case the empty variable, to see if it is okay to proceed with its work. If it is not okay to proceed (for example, when the drop box is empty and we’re trying to take an element), then the wait() function is called. The guarded block is in a while loop because it is good practice to retest the condition being waited on after the wait ends. The thread that ends the wait() call may be signalling any event, not necessarily the one this particular thread was waiting for. The while loop therefore ensures we immediately start waiting again if our desired event has not yet occurred.

The wait() and notifyAll() Functions

In most cases, you will use higher level threading mechanisms to achieve what we’re doing here with wait() and notifyAll(). But most of these higher level mechanisms are built on these lower level ones, so it is good to understand them. There are a few things you should know about these functions:

  • The wait() function blocks on the intrinsic monitor lock associated with “this” object.
  • That lock must be owned by the current thread in order for wait() to be called. That is the reason these functions are “synchronized”.
  • A plain wait() call blocks indefinitely; it does not time out on its own (though timed overloads of wait() exist as well).
  • The wait() call ends when another thread calls notifyAll() or notify() on “this” object, or when the waiting thread is interrupted.
  • Once wait() begins blocking, it releases “this” object’s monitor lock so other threads can use it. Those threads in turn must own the lock in order to call notifyAll() or notify() to wake this thread up.
  • The notifyAll() function is typically used because notify() only signals one thread, and you cannot choose which thread that is. The notify() function is generally only useful in massively parallel applications where you don’t care which of a large number of threads does the work.
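One subtlety worth noting: the DropBox code above swallows InterruptedException inside its wait loops. As a minimal sketch (the Gate class below is a hypothetical illustration, not part of the sample program), here is a guarded block that restores the thread’s interrupt flag instead of discarding it:

```java
//Hypothetical sketch of a guarded block that preserves the
//interrupt flag rather than silently swallowing the exception.
public class Gate {

    private boolean isOpen = false;

    //Block until open() is called. If interrupted, stop waiting
    //and leave the interrupt flag set for the caller to observe.
    public synchronized boolean await() {
        while (!isOpen) {
            try {
                wait(); //releases the monitor until notified
            } catch (InterruptedException ex) {
                Thread.currentThread().interrupt(); //preserve status
                return false; //report that we gave up waiting
            }
        }
        return true;
    }

    //Open the gate and wake every waiting thread.
    public synchronized void open() {
        isOpen = true;
        notifyAll();
    }
}
```

Restoring the flag lets code higher up the call stack see that an interrupt occurred, which is generally preferable to discarding it as the sample does for brevity.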

The Producer Class

The producer class is very simple since the DropBox class did most of the groundwork for us. It receives a reference to the DropBox it will use upon construction. It then implements Runnable and overrides the run() method. The run() method loops 50 times, and on each iteration it calls put() on the drop box, which does a blocking add of the next integer value. It then sleeps for 250ms just to make the application output easier to follow.

public class Producer implements Runnable {

    private final DropBox next;

    public Producer(DropBox next) {
        this.next = next;
    }

    @Override
    public void run() {
        for (int i = 0; i < 50; ++i) {
            next.put(i + 1);
            try { Thread.sleep(250); }
            catch (InterruptedException e) {}
        }
    }
}

The Consumer Class

The consumer class works just like the producer class, but in reverse. It simply implements Runnable and does a blocking take() on the DropBox 50 times in a row before running to completion.

public class Consumer implements Runnable {

    private final DropBox next;

    public Consumer(DropBox dropBox) {
        this.next = dropBox;
    }

    @Override
    public void run() {
        for (int i = 0; i < 50; ++i) {
            System.out.println(next.take());
        }
    }
}

Running the Program

The following class creates a DropBox and then creates a Producer and a Consumer instance which both share that DropBox. Next, it creates two new threads, passing the Producer and Consumer Runnables to them. Finally, it starts both threads and joins on them in order to wait for them to complete before shutting down the application. Running this program will result in the Consumer printing the numbers 1-50, one line at a time. Once 50 is reached, both threads run to completion, and the join() calls finish blocking, allowing the application to terminate successfully.

public class ProducerConsumerSample {
    public static void main(String[] args) {
        DropBox dropBox = new DropBox();
        Consumer c = new Consumer(dropBox);
        Producer p = new Producer(dropBox);
        Thread t1 = new Thread(c);
        Thread t2 = new Thread(p);
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        }
        catch (Exception ex) {
            System.err.println("Error joining thread.");
        }
    }
}

Stack w/ Extract Min Value Operation

Algorithm Description

Create a stack that supports an additional operation which can always return the minimum value present in the stack in constant time.

Considerations

This algorithm sounds pretty simple, but it can be a little trickier than it looks on the surface:

  • You must have constant time access to the minimum value.
  • Once the minimum value is popped, you must know the next smallest value.
  • You must know how many times the minimum value exists in the stack so you don’t pop it prematurely.

A good way to address these considerations is to use two stacks; one to hold the real values, and one to hold the minimum values. The minimum value stack must record how many times the current minimum exists in the stack. So, if we push {5, 4, 3, 3, 4} to the real stack, then the min stack would hold {5, 4, 3, 3}. The last 4 does not need to be recorded as it has no effect on the current minimum. Both 3’s need to be recorded as 3 is still the minimum even if we pop the 1st 3.

Note that the implementation can be simplified by always re-adding the current minimum to the min-stack whenever a new element is added to the normal stack. This way, both stacks always have the same number of values, and we always do a push or a pop on both stacks at once. This simplification comes at a slight cost though. If we were unlucky, we may add a million elements to the stack, and the first one may be the minimum. At this point, our chosen implementation would only have 1 element in the min stack, while the simplified implementation would have a million elements in the min stack. This is a typical tradeoff with algorithms.
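The simplified variant described above can be sketched as follows (SimpleMinStack is a hypothetical name for illustration; it is not the implementation shown below). Both stacks always grow and shrink together, at the cost of storing one "current minimum" entry per element:

```java
import java.util.ArrayDeque;
import java.util.Deque;

//Hypothetical sketch of the "simplified" min-stack variant: the
//current minimum is re-pushed on every push, so both stacks always
//hold the same number of elements and push/pop stay in lock step.
public class SimpleMinStack {

    private final Deque<Integer> values = new ArrayDeque<>();
    private final Deque<Integer> mins = new ArrayDeque<>();

    public void push(int v) {
        values.push(v);
        //Push the smaller of the new value and the current minimum.
        mins.push(mins.isEmpty() ? v : Math.min(v, mins.peek()));
    }

    public int pop() {
        mins.pop(); //always pop both stacks together
        return values.pop();
    }

    //Constant-time minimum lookup.
    public int min() {
        return mins.peek();
    }
}
```

Pushing {5, 1, 3} leaves the min stack holding {5, 1, 1}, so the minimum survives any sequence of pops without extra bookkeeping.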

Sample Implementation

import java.util.ArrayDeque;
import java.util.Deque;

//This class extends the ArrayDeque class in order to create a
//generic minimum stack implementation. The underlying ArrayDeque
//is used to hold all of the elements of the real stack, and a
//secondary ArrayDeque is used in order to track the minimum
//elements on that stack so that the minimum can be retrieved
//in constant time.  Whenever a new element is pushed to the stack,
//we check if it is less than or equal to the current minimum, and
//if it is, we also push it to the min stack.  The reverse is done
//when popping elements from the stack.  We need to push new
//elements equal to the current minimum so that we can record the
//fact that the current minimum exists multiple times in the stack.
public class MinStack<E extends Comparable<E>>
        extends ArrayDeque<E> {

    //Secondary minimum stack to track the minimum element.
    private final Deque<E> minStack;

    //Allocate the minimum stack when the primary is created.
    public MinStack() {
        this.minStack = new ArrayDeque<>();
    }

    @Override
    public void push(E e) {
        //Add the new element to the min stack if it is less than or
        //equal to the min stack's top element.
        if (minStack.isEmpty() || e.compareTo(minStack.peek()) <= 0)
            minStack.push(e);

        //Add the element to the real underlying stack.
        super.push(e);
    }

    @Override
    public E pop() {

        //Get the top element from the real stack if there is one.
        if (super.isEmpty()) return null;
        E item = super.pop();

        //If the element we popped off the real stack was the top
        //element of the min stack, also pop it off of the min stack.
        if (item.compareTo(minStack.peek()) == 0) {
            minStack.pop();
        }

        //Return the popped item from the real stack.
        return item;
    }

    //Retrieve the minimum element off of the min stack in
    //constant time.
    public E min() {
        return minStack.peek();
    }

    public static void main(String[] args) {

        MinStack<Integer> ms = new MinStack<>();
        ms.push(4); ms.push(5); ms.push(3); ms.push(3);
        ms.push(7); ms.push(2); ms.push(1); ms.push(2);
        ms.push(2); ms.push(1);

        //Alternately report the current minimum and pop an element
        //until the stack is empty.
        while (!ms.isEmpty()) {
            System.out.println("Minimum element = " + ms.min());
            System.out.println("Popping off: " + ms.pop());
        }
    }
}

Garbage Collectors (HotSpot VM)

There are a variety of options regarding which garbage collector to use in Java. For the HotSpot VM, the set of garbage collectors which can be used includes:

  • Serial GC (default on client-class machines)
  • Parallel GC (throughput)
  • CMS – Concurrent Mark and Sweep GC (low latency)
  • G1 GC (next-generation replacement for CMS)

Determining which garbage collector is being used in your application is not necessarily straightforward; it depends on various settings and your machine hardware. You can pass the following command line parameter to your java invocation to ensure the selected GC is printed to the console though:

java -XX:+PrintCommandLineFlags -version

Terminology

Eden – Where new objects are allocated.
Survivor Spaces – Where objects are moved and aged if they survive a minor GC.
Young Generation – Eden + Survivor Spaces.
Old Generation – Where aged objects are promoted if they survive a threshold number of GCs.
Stop-The-World – An event that requires all application threads to pause during its execution.

The Serial GC

The serial garbage collector is the most basic one. Its minor and major collection stages are both stop-the-world events. After each cycle, the Serial GC compacts the old generation’s heap toward the bottom to prevent fragmentation. This makes new allocations (on object promotion to the old generation) very cheap; the collector just grabs the top-of-the-heap pointer and moves it forward by the size of the new object, since everything beyond that pointer is known to be free space. This GC does not parallelize its operation. It is good for client-side applications that can tolerate substantial pauses, or in situations where processors aren’t highly available (to reduce thread contention).
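The bump-the-pointer allocation that compaction enables can be sketched in a few lines (BumpAllocator is a hypothetical illustration, not actual HotSpot code):

```java
//Hypothetical sketch of bump-the-pointer allocation: after
//compaction, everything past `top` is known to be free, so an
//allocation is just a bounds check plus a pointer advance.
public class BumpAllocator {

    private final byte[] heap;
    private int top = 0;

    public BumpAllocator(int size) {
        heap = new byte[size];
    }

    //Returns the offset of the new object, or -1 if the heap is
    //exhausted (which is when a real VM would trigger a GC).
    public int allocate(int objectSize) {
        if (top + objectSize > heap.length) return -1;
        int offset = top;
        top += objectSize; //bump the pointer past the new object
        return offset;
    }
}
```

The cost of an allocation here is constant regardless of heap occupancy, which is exactly why compacting collectors pay for the compaction step.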

The Parallel GC

The parallel garbage collector works much like the serial garbage collector. The only significant difference is that both minor and full garbage collections take place in parallel to increase throughput. It is suitable for server machines with lots of resources, and is primarily used in batch processing or data crunching applications that care more about data throughput than pause/response time.

CMS – Concurrent Mark and Sweep GC

The CMS garbage collector has been the preferred GC for applications with rapid response time requirements for some time (though the new G1 GC is intended to replace it). It greatly shortens stop-the-world phases by doing a substantial part of its marking work while the application is still running. This design makes major GCs much faster, but minor ones a little slower. Its young generation works the same way as in the previous GCs, but its old generation is collected by:

  • Initial Marking – Stop the World – identifying objects immediately reachable from outside the old generation.
  • Concurrent Marking – Marks all live objects reachable from the initial set.
  • Pre-Cleaning – Revisits objects modified during concurrent marking, reducing work for remark.
  • Remark – Stop the World – Visits all objects modified during the Pre-Cleaning phase, ensuring none are missed.
  • Concurrent Sweep – With all live objects now marked, this phase sweeps the heap and de-allocates garbage objects.

The CMS does not compact its old generation heap during the concurrent sweep; instead it maintains “free lists” to track free memory. This reduces the pause time, but makes each promotion to the old generation take longer. Fragmentation can occur, causing new allocations to fail; in that case, a special stop-the-world compacting phase can be triggered for this GC, resulting in a longer than standard pause.

The CMS requires a bigger heap as new objects are still being created during concurrent marking. Object counts only decrease during the sweep phase. It should also be noted that while the CMS is guaranteed to mark all live objects, it may miss some garbage objects if they become garbage during the concurrent phases. These objects will then “float” to the next cycle before being reclaimed. The CMS GC is widely used in web services and similar applications.

G1 GC – Replacement for CMS

According to the Oracle site documentation at http://www.oracle.com/technetwork/tutorials/tutorials-1876574.html, the G1 garbage collector is a server style GC for high powered machines. It is designed to meet pause time goals with a high probability while achieving higher throughput than the CMS. It is a compacting GC which avoids CMS’s fragmentation issues.

The G1 collector takes a different stance than the previous HotSpot collectors; it does not partition the heap into fixed young, old, and permanent generations. Instead, equal-sized heap regions are maintained in contiguous ranges. Sets of regions are given the same roles (Eden, Survivor, Old), but no fixed size is assigned, thus increasing flexibility. When a GC is executed, a concurrent global marking phase is executed, and that information is used to locate the regions which are mostly empty. These regions are targeted first, having their contents evacuated (copied) in parallel to other regions in the heap as part of a compacting process. The time taken to evacuate regions is monitored to maintain a reasonable estimate of the pause times which will be encountered while collecting future regions. This estimate is used to meet the user-provided pause time expectation whenever possible.

Concurrent Collections in Java

This post is intended to tell you what concurrent collections are, and why they are an improvement over the synchronized collections which have been available since the first versions of the JDK. You will also be introduced to some basic multi-threaded terminology regarding check-then-act sequences and composite operations.

If you are aware of what concurrent collections are and just want information on which ones are available and what their benefits are, you should view [this post] instead.

Synchronized Collections

Before version 1.5, the “thread-safe” collections provided with Java were not very sophisticated. They were called synchronized collections, and they worked by simply wrapping an instance of a collection (e.g. a hash map), and putting the “synchronized” keyword on all of the wrapper methods.

This might sound okay at first, but if we think about it, there are a number of issues:

  • Contention; every method with the “synchronized” keyword requires the wrapper object’s monitor lock. This means that while multiple threads may be attempting to use the collection at once, only one at a time can actually do so. The rest perform a blocking wait.
  • Composite operations; many useful operations require a check-then-act sequence, or are otherwise composed of multiple smaller operations. Synchronized methods guarantee that each individual method executes without interference from other threads, but they do not help you compose operations. So, if you wanted two threads to (1) check if a value exists, and (2) add it if it didn’t already exist, then you would have to use additional client-side locking to make sure that those operations were executed together. If you didn’t use the additional locking, then the operations between the two threads could interleave, resulting in:
    • Thread 1 and thread 2 both check and find {x} does not exist.
    • Thread 1 adds {x}.
    • Thread 2 tries to add {x} and fails because it already exists.

An interesting thing to note here is that synchronized collections always synchronize on the reference to the underlying collection. This is actually a benefit in some ways, because it means that the client can lock that collection themselves to form their own composite operations. Since concurrent collections use various internal locking mechanisms, it is not possible to use client-side locking to form your own composite operations with them. So, you get built-in, better-performing common composite operations with concurrent collections, but you lose the ability to make your own composite operations.
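The client-side locking technique described above can be sketched as follows (ClientSideLocking and its putIfAbsent helper are hypothetical names for illustration). Because a synchronized wrapper locks on the wrapper object itself, the client can hold that same lock to make a check-then-act sequence atomic:

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

//Hypothetical sketch of client-side locking on a synchronized
//wrapper collection.
public class ClientSideLocking {

    public static Integer putIfAbsent(Map<String, Integer> syncMap,
                                      String key, Integer value) {
        //Hold the wrapper's lock across both steps so no other
        //thread can interleave between the check and the put.
        synchronized (syncMap) {
            if (!syncMap.containsKey(key)) {
                return syncMap.put(key, value); //returns null
            }
            return syncMap.get(key);
        }
    }

    public static void main(String[] args) {
        Map<String, Integer> map =
                Collections.synchronizedMap(new HashMap<>());
        putIfAbsent(map, "x", 1);
        putIfAbsent(map, "x", 2); //no effect; "x" already present
        System.out.println(map.get("x")); //prints 1
    }
}
```

Note that this only works because every other access also goes through the same wrapper lock; it is exactly this kind of manual discipline that concurrent collections eliminate.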

Concurrent Collections

In Java 1.5, concurrent collection classes were added to address the shortcomings of the synchronized collections. Unlike synchronized collections, concurrent collections are not implemented by serializing access to the underlying collection via synchronized methods. Instead, they were designed to minimize the scope of internal locking wherever possible. They allow multiple parallel readers and a limited number of parallel writers. They also may implement different interfaces to allow them to provide additional operations useful in a concurrent setting. For example, common composite operations like put-if-absent and replace are built right into ConcurrentHashMap, which implements the ConcurrentMap interface.

Let’s take a look at the JavaDoc for the ConcurrentMap class to see an example: https://docs.oracle.com/javase/8/docs/api/java/util/concurrent/ConcurrentMap.html.

We can see the following:

Methods inherited from java.util.Map:

clear, containsKey, containsValue, entrySet, equals, get, hashCode, isEmpty, keySet, put, putAll, remove, size, values

New Methods Specified:

putIfAbsent : If the specified key is not already associated with a value, associate it with the given value.
remove : Removes the entry for a key only if currently mapped to a given value.
replace : Replaces the entry for a key only if currently mapped to some value.

You can refer to the documentation link above for more information about these composite methods; the list above was copied from there. The purpose of this post is just to inform you that they exist and are better supported by concurrent collections than synchronized collections.

Sample Composite Operation

Now… let’s look at an example of a composite operation. A typical composite operation is a check-then-act sequence where you (1) check a condition and then (2) act on that condition. Whenever you do this, you open the door for threading issues due to interleaving of operations, as mentioned earlier. The putIfAbsent() operation in the ConcurrentMap interface is an example of a concurrent collection providing composite action support. It atomically performs the following operation to ensure no interleaving occurs:

V putIfAbsent(K key, V value):

              if (!map.containsKey(key))
                  return map.put(key, value);
              else
                  return map.get(key);
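With a concurrent collection, that composite operation comes built in. The following small demo (PutIfAbsentDemo is just an illustrative name) shows putIfAbsent() doing the check and the put as a single atomic step:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

//Small demo of the built-in putIfAbsent() composite operation.
public class PutIfAbsentDemo {

    public static void main(String[] args) {
        ConcurrentMap<String, Integer> map = new ConcurrentHashMap<>();

        //First call wins: the key is absent, so 1 is stored and
        //null is returned.
        Integer first = map.putIfAbsent("x", 1);

        //Second call loses: the key is present, so the map is
        //unchanged and the existing value is returned.
        Integer second = map.putIfAbsent("x", 2);

        System.out.println(first);        //null
        System.out.println(second);       //1
        System.out.println(map.get("x")); //1
    }
}
```

No client-side locking is required here; if two threads race on putIfAbsent() with the same key, exactly one of them stores its value and both observe the winner.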

Wrapping Up

Now that you’ve been exposed to the concepts of synchronized collections and concurrent collections, you should take a look at [next post] to see the various kinds of concurrent collections, and the new operations they support. Knowing what’s available in the Java libraries in advance can save you from implementing many of these features yourself (and having to test them) in the future.