Java Algorithm: Pascal’s Triangle

Pascal’s Triangle

Pascal’s triangle is a problem where you want to print a triangle of a certain height where each element is the sum of the 2 elements above it.  The first row is 1, the second row is 2 1’s, and then the pattern builds from there with 1 on the ends and the other elements being the sum of their parents.

For Example:

        1
       1 1
      1 2 1
     1 3 3 1
    1 4 6 4 1

Generalized Solution

It’s always good to (first) try to solve algorithms yourself without looking at other peoples’ solutions so that you truly learn how to work them out yourself in real scenarios.

So, there may be a more efficient solution than this; but here was my approach:

  • Set a list to hold the previous row (initially empty).
  • Loop up to the required depth from 1 to D inclusively.
    • Loop for each item that should be in that level (level 1 has 1 number, level N has N numbers).
    • If it’s an end-number add “1” to the new row, otherwise add the sum of parents.
  • Print the new row.
  • Store the new row as the previous row so it can be used for the next depth level’s parent calculations.

I’m sure you can do this without storing the previous row as well mathematically, but this is pretty elegant and will only take extra space equal to the sizeof(int) * level-number which is really nothing.

Java Solution

package john.humphreys;

import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

public class PascalsTriangle {

    public static void main(String[] args) {
        printToDepth(20);
    }

    private static void printToDepth(int d) {

        //Row 1 has value 1, anything less is invalid.
        if (d < 1) return;

        //Keep track of the previous row.
        List<Integer> previousRow = new ArrayList<>();

        //Loop from 1 to target depth inclusively.
        for (int i = 1; i <= d; ++i) {

            //Create a new row to populate with our solution.
            List<Integer> newRow = new ArrayList<>();

            //If this is a row-end (0 or max in row) add 0, otherwise add the parents' sum.
            for (int ri = 0; ri < i; ++ri) {
                newRow.add(ri == 0 || ri == i - 1 ? 1 : previousRow.get(ri - 1) + previousRow.get(ri));
            }

            //Print out the space-separated row.
            System.out.println(newRow.stream().map(Object::toString).collect(Collectors.joining(" ")));

            //Store this as the previous row.
            previousRow = newRow;
        }
    }
}

If we take out comments, gratuitous spacing, and imports, it’s quite lean:

public class PascalsTriangle {

    public static void main(String[] args) {
        printToDepth(20);
    }

    private static void printToDepth(int d) {
        if (d < 1) return;
        List<Integer> previousRow = new ArrayList<>();

        for (int i = 1; i <= d; ++i) {
            List<Integer> newRow = new ArrayList<>();
            for (int ri = 0; ri < i; ++ri) {
                newRow.add(ri == 0 || ri == i - 1 ? 1 : previousRow.get(ri - 1) + previousRow.get(ri));
            }
            System.out.println(newRow.stream().map(Object::toString).collect(Collectors.joining(" ")));
            previousRow = newRow;
        }
    }
}

Java Regex Capture/Extract Multiple Values

Use Case

When you’re trying to parse complex log lines or extract data from complex strings, regular expression capture groups are about the most useful tool you could possibly ask for.

This example is taken from work where I had to parse and analyze some logs for loading data to a database. A log sample would look like this:

/data/SXF_SX_4906_2019-04-13.01.43.24.143.log:2019-04-13 01:43:28,320 INFO com.x.dc.db.schemagen.batch.listener.JobResultListener [tx.id=IF-TX-ID-a23c195c-673a-47ab-ab0c-7b8591821169] [main] Inside sendEmailNotification method: subject is prod alert:DB copy job STARTED for the dataset:4906

The Code

The relevant part of the code is here:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

private static final String capturePattern =
"^/.*/SXF_SX_(\\d+)_(\\d{4}-\\d{2}-\\d{2}.\\d{2}.\\d{2}.\\d{2}.\\d{3}).log:(.*) INFO.*" +
"copy job (.*) for the dataset:.*"

//Leaving out rest of class, this is just the regex parsing portion.
//isValid, fulLLogEntry, dataSetId, fileTimestamp, logTimestamp, status are all
//member variables in a class where this function is a member.
public DbLoadLog(String line) {

    isValid = true;

    Pattern r = Pattern.compile(capturePattern);
    Matcher m = r.matcher(line);

    //If you wanted to run over a multi-line-string/file, you could put
    //m.find() in a while loop and keep going; but I'm just analyzing specific lines.
    if (m.find()) {
        fullLogEntry = line;
        dataSetId = Integer.valueOf(m.group(1));
        fileTimestamp = m.group(2);
        logTimestamp = m.group(3);
        status = m.group(4);
    }
    else {
        isValid = false;
    }
}

 

Java – Regular/Scheduled Task, One Run at a Time

This will be a very short post, and I’m mostly writing it just so it sticks in my head better.

Common Use Cases

There are many times when you may find that you need to regularly run a task in Java.  Here are a few common examples:

  • You have a cache you need to refresh every X minutes to power a dashboard or something similar.
  • You need to prune old files from a file system once an hour.
  • You need to regularly update stats counters for monitoring.

Coding Options

There are a lot of ways to do this, but the recommended approach would be to use a scheduled executor.  Now… this part is easy to remember, but what is sometimes hard to remember is that you have two options when scheduling a task.  I often find myself picking the wrong one as it pops up in Intelli-sense and I forget there are 2 options.

  1. Run the task every X seconds/minutes/etc no matter what.
  2. Run the task every X seconds/minutes/etc *after* the previous task completed.

These two things can be very different.  If you have a task that only takes a couple of seconds, it probably doesn’t matter much.  But if you have a task that takes 2 minutes and you’re running it every 1 minute, then with option 1 you will always be running at least 2 copies of the task, whereas with option 2 you’ll just be running one copy at a time with a minute of buffer in between each task.

For both options, you can create the scheduled executor service the same way:

ScheduledExecutorService se = Executors.newSingleThreadScheduledExecutor();

But for option #1 (run every interval regardless of previous tasks), you would use this function:

se.scheduleAtFixedRate(this::refreshCache, 10, 120, TimeUnit.SECONDS);

And for option #2 (start counting after previous task completes), you would use this function.

se.scheduleWithFixedDelay(this::refreshCache, 10, 120, TimeUnit.SECONDS);

 

Does Spring JdbcTemplate Close Connections? … Not Always.

Common Advice – Correct?

Decent developers usually know that they have to try/catch/finally to ensure they clean up connections, file handles, or any number of things.  But then, for Java, you hear “just use JdbcTemplate! it does all this boilerplate for you!”.

Uncommon Scenario

Normally when you’re writing an average app, you generally want lots of queries to be able to run in parallel, efficiently, using the same user and password.  In this case, you can easily just use a connection pool and “not worry about it”.  Spring JdbcTemplates will just grab connections from your data source and pool them appropriately based on the data source.  You don’t have to worry about if they are opened, closed, or whatever.

I ran into a scenario today where that was not true though.  I have an app where each user connects to each back-end data-source using their own personal account which is managed by the application itself.  So, each user needs his or her own connection.  So… pooling would not make much sense unless each user had to do parallel operations (which they don’t).

What Happens to the Connections?

So, here’s the fun part.  I had, for the longest time, assumed that JdbcTemplates would clean up connections in addition to results sets.  In fact, you’ll see this online a lot.  But be careful!  This does not appear to be the case, or if it is, it is at least data source dependent… and that actually makes sense if you think about their purose.

Here is how I verified this. I created a JdbcTemplate which is based on a new data source each time (which is needed as the user/password change).

private NamedParameterJdbcTemplate getJdbcTemplate(String email, String password) {
    SimpleDriverDataSource ds = new SimpleDriverDataSource();
    ds.setDriverClass(HiveDriver.class);
    ds.setUrl(url);
    ds.setUsername(email);
    ds.setPassword(password);
    return new NamedParameterJdbcTemplate(ds);
}

Then I used the template for a number of queries in a normal manner (like this):

getDirectHiveJdbcTemplate(email, catalog)
.queryForList("describe extended `mytable`.`mytable`",
new MapSqlParameterSource())

Then I took a heap dump of the process with this command (run it from your command line in your JDK bin folder in Program Files or the Linux install location with minor changes):

jmap.exe -F -dump:format=b,file=C:\temp\dump.bin your-pid

You can get the PID easily by looking at your running process from JVisualVM (which is also in the bin directory).

Once the dump is complete, load the file into JVisualVM (you need to use the 3rd option of file type to make it go in, I think its pattern is . or something.

Finally, go to the classes tab, go to the very bottom of the screen, and search for the class of interest (in my case HiveConnection). I can see as many instances as I have run queries as each query made a new connection from a new data source. They are definitely not being cleaned up.

This surprised me because even though creating a new template/data-source each time is not normal, I expected them to clean up the connections when they were garbage collected or as part of normal operations.  After thinking about it more, I realize operations in my case would not me “normal”, but the lack of clean up when out of scope still definitely is a surprise to me.