Azure LB Dropping Traffic Mysteriously – HAProxy / NGINX / Apache / etc.

Failure Overview

I lost a good portion of last week fighting dropped traffic and intermittent connection issues in a basic-tier Azure load balancer.  The project this was happening on had been up and running for 6 months without configuration changes and had not been restarted in 100 days.  Restarting it did not help, so clearly something had changed in the environment.  The problem also started happening in multiple deployments in different Azure subscriptions, implying that it was not an isolated or server-specific issue.

Solution

After doing a crazy amount of tests and eventually escalating to Azure support, who reviewed the problem for over 12 hours, Azure support pointed out this:

https://docs.microsoft.com/en-us/azure/load-balancer/load-balancer-custom-probe-overview#types

“Do not translate or proxy a health probe through the instance that receives the health probe to another instance in your VNet as this configuration can lead to cascading failures in your scenario. Consider the following scenario: a set of third-party appliances is deployed in the backend pool of a Load Balancer resource to provide scale and redundancy for the appliances and the health probe is configured to probe a port that the third-party appliance proxies or translates to other virtual machines behind the appliance. If you probe the same port you are using to translate or proxy requests to the other virtual machines behind the appliance, any probe response from a single virtual machine behind the appliance will mark the appliance itself dead. This configuration can lead to a cascading failure of the entire application scenario as a result of a single backend instance behind the appliance. The trigger can be an intermittent probe failure that will cause Load Balancer to mark down the original destination (the appliance instance) and in turn can disable your entire application scenario. Probe the health of the appliance itself instead.”

I was using a load balancer over a scale set, and the load balancer pointed at HAProxy, which was configured to route traffic to the “primary” server.  So, I wanted Azure’s load balancer to consider every server up as long as it could route to the “primary” server, even if other things on that specific server were down.

But having the health probe check HAProxy meant that the health probe was routed on to the “primary” server, which is exactly the configuration the documentation above warns against.

This seems like an Azure quirk to me… but they have it documented.  Once I switched the health probe to target something not routed by HAProxy, the LB stabilized and everything was ok.
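
For illustration, here is a minimal sketch of “something not routed by HAProxy”: a tiny HTTP listener that runs on the VM itself and answers the probe locally.  This is not what I actually deployed; the class name, the /health path, and port 8081 are all assumptions made up for the example, and it uses the HTTP server that ships with the JDK.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;

class LocalHealthEndpoint {

    /**
     * Starts an HTTP listener that answers /health from this machine itself,
     * so the probe result never depends on a proxied back end.
     * Passing port 0 lets the OS pick a free port.
     */
    static HttpServer start(int port) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress(port), 0);
        server.createContext("/health", exchange -> {
            byte[] body = "OK".getBytes(StandardCharsets.UTF_8);
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(body);
            }
        });
        server.start();
        return server;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical probe port; the load balancer health probe would target this
        // port instead of the port HAProxy forwards to the "primary" server.
        start(8081);
    }
}
```

The point is only that the probed port answers from the instance being probed.  What that endpoint should check before answering is a separate design decision.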

 

Spring Time out REST HTTP Calls With RestTemplate

No Timeouts By Default!

Spring’s RestTemplate is an extremely convenient way to make REST calls to web services.  But most people don’t realize initially that these calls have no timeout by default.  This means no connection timeout and no read timeout.  So, potentially, a call that should take 1 second can freeze your app for a very long time if the back end is behaving badly.
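
To see what a read timeout actually buys you, here is a small sketch that uses the plain JDK HttpURLConnection rather than RestTemplate (so it runs without Spring).  HttpURLConnection also defaults to a timeout of 0, meaning “wait forever”.  The class and method names are made up for the example, and the “back end” is just a local socket that accepts connections and never answers.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.InetAddress;
import java.net.Proxy;
import java.net.ServerSocket;
import java.net.SocketTimeoutException;
import java.net.URI;
import java.net.URL;

class ReadTimeoutDemo {

    /**
     * Calls a local listener that completes the TCP handshake but never replies,
     * and reports whether the read timeout aborted the call.
     */
    static boolean abortedByReadTimeout(int timeoutMs) throws IOException {
        // Stand-in for a misbehaving back end: the OS accepts the connection,
        // but nothing ever reads the request or writes a response.
        try (ServerSocket silent = new ServerSocket(0, 50, InetAddress.getByName("127.0.0.1"))) {
            URL url = URI.create("http://127.0.0.1:" + silent.getLocalPort() + "/").toURL();
            HttpURLConnection conn = (HttpURLConnection) url.openConnection(Proxy.NO_PROXY);
            conn.setConnectTimeout(timeoutMs);
            // With the default read timeout of 0, the call below would block indefinitely.
            conn.setReadTimeout(timeoutMs);
            try {
                conn.getResponseCode();
                return false;
            } catch (SocketTimeoutException e) {
                return true;
            } finally {
                conn.disconnect();
            }
        }
    }
}
```

Calling ReadTimeoutDemo.abortedByReadTimeout(300) gives up after roughly 300 ms instead of hanging, which is the behavior you want from every outbound REST call.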

Setting a Timeout

There are a lot of ways of doing this, but the best one I’ve seen recently (from this stackoverflow post) is to create the RestTemplate in an @Configuration class and then inject it into your services.  That way you know the RestTemplate you are using everywhere was configured properly with your desired timeouts.

Here is a full example.

package com.company.cloudops.config;

import org.springframework.beans.factory.annotation.Value;
import org.springframework.boot.web.client.RestTemplateBuilder;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.web.client.RestTemplate;
import java.time.Duration;

@Configuration
public class AppConfig {

    @Value("${rest.template.timeout}") private int restTemplateTimeoutMs;

    @Bean
    public RestTemplate restTemplate(RestTemplateBuilder builder) {
        return builder
                .setConnectTimeout(Duration.ofMillis(restTemplateTimeoutMs))
                .setReadTimeout(Duration.ofMillis(restTemplateTimeoutMs))
                .build();
    }
}

To use this RestTemplate in another Spring bean class, just pull it in with:

@Autowired private RestTemplate template;

Connection Pooling With Spring Boot 2.0 Hikari – Verify Idle Timeouts are Working

Use Case

I’ve been working on an odd API project where each user needs their own connection to various back-end databases/data-sources.  This is a break from the norm because in general, you set up a connection pool of, say, 10 connections and everyone shares it and you’re golden.

If you have 500 users throughout the day though and each one gets some connections, that would be a disaster.  So, in my case making sure the pool is of limited size and making sure the idle timeout works is pretty vital.  So, I started playing around to see how I can verify old connections are really being removed.

My Configuration

I had started with an Apache BasicDataSource (old habits die hard).  But then when I enabled debug I didn’t see connections being dropped, or info on them being logged at all for that matter.  Before bothering with trace, I started reading about Hikari which is a connection pool I see spring using a lot… and it looked pretty awesome! See some good performance and usage info right here.

Anyway! I switched to Hikari quickly, which was easy since it’s already in Spring Boot 2.X (which I habitually use for everything these days).

Here’s my Spring config class/code. I have it set to allow a minimum of 0 idle connections, to time out idle connections after 60 seconds, and to have a maximum of 4 connections. Connections are tested with “select 1” which is pretty safe on most databases.

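import com.zaxxer.hikari.HikariConfig;
import com.zaxxer.hikari.HikariDataSource;
import javax.sql.DataSource;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.jdbc.core.namedparam.NamedParameterJdbcTemplate;

@Configuration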
public class Config {

    //Configuration for our general audit data source.
    private @Value("${audit.ds.url}") String auditDsUrl;
    private @Value("${audit.ds.user}") String auditDsUser;
    private @Value("${audit.ds.password}") String auditDsPassword;

    @Bean
    public DataSource auditDataSource() {
        HikariConfig config = new HikariConfig();
        config.setJdbcUrl(auditDsUrl);
        config.setUsername(auditDsUser);
        config.setPassword(auditDsPassword);
        config.setMaximumPoolSize(4);
        config.setMinimumIdle(0);
        config.setIdleTimeout(60000);
        config.setConnectionTestQuery("select 1");
        config.setPoolName("Audit Pool");
        config.setValidationTimeout(10000);
        return new HikariDataSource(config);
    }

    @Bean
    public NamedParameterJdbcTemplate auditJdbcTemplate() {
        return new NamedParameterJdbcTemplate(auditDataSource());
    }
}

Verifying it Works

After sending a query to my API, where it uses a basic JDBC template to execute the query, I see the logs do this (note that I removed the date/time/class/etc for brevity).

Audit Pool - Before cleanup stats (total=0, active=0, idle=0, waiting=0)
Audit Pool - After cleanup stats (total=0, active=0, idle=0, waiting=0)
Audit Pool - Before cleanup stats (total=1, active=0, idle=1, waiting=0)
Audit Pool - After cleanup stats (total=1, active=0, idle=1, waiting=0)
Audit Pool - Before cleanup stats (total=1, active=0, idle=1, waiting=0)
Audit Pool - After cleanup stats (total=1, active=0, idle=1, waiting=0)
Audit Pool - After cleanup stats (total=0, active=0, idle=0, waiting=0)
Audit Pool - Closing connection PG...: (connection has passed idleTimeout)

So, we can see that it went from 0 connections total, to 1 connection total. The connection looks idle pretty quickly because it was a short query that finished before the next regular log output. Then after a minute, the connection gets closed and the total goes back to 0.

So, we’re correctly timing out idle connections using our settings. Also, we’re getting our pool name (Audit Pool) in the logs which is awesome too!
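
If it helps to picture what the cleanup pass in those log lines is doing, here is a toy model of idle-timeout eviction.  It is not Hikari’s implementation, just a sketch under my own assumptions: the class and method names are invented, “connections” are represented only by the time they went idle, and timestamps are passed in explicitly so the behavior can be checked without waiting a real minute.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Iterator;

class IdleEvictionSketch {

    private final long idleTimeoutMs;
    private final int minimumIdle;

    // Time (in ms) at which each idle connection was handed back, oldest first.
    private final Deque<Long> idleSince = new ArrayDeque<>();

    IdleEvictionSketch(long idleTimeoutMs, int minimumIdle) {
        this.idleTimeoutMs = idleTimeoutMs;
        this.minimumIdle = minimumIdle;
    }

    /** Records that a connection was returned to the pool and is now idle. */
    void release(long nowMs) {
        idleSince.addLast(nowMs);
    }

    /**
     * One cleanup pass: closes connections that have been idle for at least
     * idleTimeoutMs, but never shrinks the pool below minimumIdle.
     * Returns how many connections were closed.
     */
    int cleanup(long nowMs) {
        int closed = 0;
        Iterator<Long> it = idleSince.iterator();
        while (it.hasNext() && idleSince.size() > minimumIdle) {
            if (nowMs - it.next() >= idleTimeoutMs) {
                it.remove();
                closed++;
            }
        }
        return closed;
    }

    int idleCount() {
        return idleSince.size();
    }
}
```

With a 60000 ms timeout and a minimum of 0, a connection released at time 0 survives a cleanup pass at 30000 ms and is closed by the pass at 60000 ms, which matches the shape of the log output above.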

Java Algorithm: Pascal’s Triangle

Pascal’s Triangle

Pascal’s triangle is a problem where you want to print a triangle of a certain height where each element is the sum of the 2 elements above it.  The first row is 1, the second row is 2 1’s, and then the pattern builds from there with 1 on the ends and the other elements being the sum of their parents.

For Example:

        1
       1 1
      1 2 1
     1 3 3 1
    1 4 6 4 1

Generalized Solution

It’s always good to (first) try to solve algorithms yourself without looking at other peoples’ solutions so that you truly learn how to work them out yourself in real scenarios.

So, there may be a more efficient solution than this; but here was my approach:

  • Set a list to hold the previous row (initially empty).
  • Loop up to the required depth from 1 to D inclusively.
    • Loop for each item that should be in that level (level 1 has 1 number, level N has N numbers).
    • If it’s an end-number add “1” to the new row, otherwise add the sum of parents.
  • Print the new row.
  • Store the new row as the previous row so it can be used for the next depth level’s parent calculations.

I’m sure you can do this without storing the previous row as well mathematically, but this is pretty elegant and will only take extra space equal to the sizeof(int) * level-number which is really nothing.

Java Solution

package john.humphreys;

import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

public class PascalsTriangle {

    public static void main(String[] args) {
        printToDepth(20);
    }

    private static void printToDepth(int d) {

        //Row 1 has value 1, anything less is invalid.
        if (d < 1) return;

        //Keep track of the previous row.
        List<Integer> previousRow = new ArrayList<>();

        //Loop from 1 to target depth inclusively.
        for (int i = 1; i <= d; ++i) {

            //Create a new row to populate with our solution.
            List<Integer> newRow = new ArrayList<>();

            //If this is a row-end (0 or max in row) add 1, otherwise add the parents' sum.
            for (int ri = 0; ri < i; ++ri) {
                newRow.add(ri == 0 || ri == i - 1 ? 1 : previousRow.get(ri - 1) + previousRow.get(ri));
            }

            //Print out the space-separated row.
            System.out.println(newRow.stream().map(Object::toString).collect(Collectors.joining(" ")));

            //Store this as the previous row.
            previousRow = newRow;
        }
    }
}

If we take out comments, gratuitous spacing, and imports, it’s quite lean:

public class PascalsTriangle {

    public static void main(String[] args) {
        printToDepth(20);
    }

    private static void printToDepth(int d) {
        if (d < 1) return;
        List<Integer> previousRow = new ArrayList<>();

        for (int i = 1; i <= d; ++i) {
            List<Integer> newRow = new ArrayList<>();
            for (int ri = 0; ri < i; ++ri) {
                newRow.add(ri == 0 || ri == i - 1 ? 1 : previousRow.get(ri - 1) + previousRow.get(ri));
            }
            System.out.println(newRow.stream().map(Object::toString).collect(Collectors.joining(" ")));
            previousRow = newRow;
        }
    }
}
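
As for the idea mentioned earlier of doing this mathematically without storing the previous row, here is one possible sketch.  It relies on the fact that the entries of row n (counting from 0) are the binomial coefficients C(n, k), and that each one follows from its left neighbor via C(n, k+1) = C(n, k) * (n - k) / (k + 1).  The class name and the zero-based row index are my own choices for the example, and it uses long to leave more headroom than int.

```java
class PascalRowDirect {

    /** Builds row n of Pascal's triangle (n = 0 is the top row) as a space-separated string. */
    static String row(int n) {
        StringBuilder sb = new StringBuilder();
        long value = 1; // C(n, 0)
        for (int k = 0; k <= n; k++) {
            if (k > 0) {
                sb.append(' ');
            }
            sb.append(value);
            // Step from C(n, k) to C(n, k + 1); the division is always exact.
            value = value * (n - k) / (k + 1);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        for (int n = 0; n < 5; n++) {
            System.out.println(row(n));
        }
    }
}
```

Here row(4) produces “1 4 6 4 1”, the last line of the example triangle at the top of this post.  The trade-off is that the intermediate product can overflow for large rows, whereas the previous-row version only ever adds.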

SCP/SSH With Different Private Key

If you need to use SSH or SCP with a different private key file, just specify it with -i.  For example, to copy logs from a remote server using a specific private key file and user, do the following:

scp -i C:\Users\[your-user]\.ssh\pk_file [user]@[ip-addr]:/path/logs/* .

This -i will work regardless of OS, but the example copies from a Linux server to a Windows machine, assuming you store your private keys in your user .ssh directory.

Bash – Grep (or Run Other Command) Only On Files Created This Week, Day.

Use Case

I just ran into a simple problem where I had to grep files on a server, but the directory had TONS and TONS of files in it.  I just wanted to target files created within the last week or so.

Working Command

It turns out this find command is very handy for this occasion.  It was taken and lightly modified from this unix stack-exchange post after a fair bit of searching.

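find . -type f -mtime -7 -exec grep "my_search_string" {} \;

Basically, it finds every regular file under “.” (the current directory, including subdirectories) that was modified in the last 7 days (as in 24 hour periods, not from-this-morning days), and it executes the grep expression on each one.  Note that -mtime checks the modification time rather than the creation time, which is usually close enough for files that are written once.  The -type f keeps grep from being run on directories.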

You can modify the timing however you want with mtime as well as change the target directory or command to execute, and of course you can pipe the output to whatever you want :).
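
If you ever need the same filter from inside a Java program rather than a shell, it can be sketched roughly like this.  The class and method names are invented for the example, and it differs from the command above in a few ways: it matches a literal substring rather than a grep pattern, it returns the matching files instead of printing the matching lines, and it assumes the files are UTF-8 text.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.Duration;
import java.time.Instant;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class RecentFileGrep {

    /** Returns regular files under root, modified within maxAge, that contain needle. */
    static List<Path> filesContaining(Path root, String needle, Duration maxAge) throws IOException {
        Instant cutoff = Instant.now().minus(maxAge);
        try (Stream<Path> paths = Files.walk(root)) {
            return paths
                    .filter(Files::isRegularFile)            // like -type f
                    .filter(p -> modifiedAfter(p, cutoff))   // like -mtime -7 for a 7 day maxAge
                    .filter(p -> contains(p, needle))
                    .sorted()
                    .collect(Collectors.toList());
        }
    }

    private static boolean modifiedAfter(Path file, Instant cutoff) {
        try {
            return Files.getLastModifiedTime(file).toInstant().isAfter(cutoff);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    private static boolean contains(Path file, String needle) {
        // Reads line by line so large files are not loaded into memory at once.
        try (Stream<String> lines = Files.lines(file)) {
            return lines.anyMatch(line -> line.contains(needle));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

For example, RecentFileGrep.filesContaining(Path.of("."), "my_search_string", Duration.ofDays(7)) covers the same files as the find command.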

Angular 7 Material Modal

Overview

Getting modals working in Angular + Material took me a lot longer than I expected.  But I must confess that the documentation for them here -> https://material.angular.io/components/dialog/overview was spot on.  You just have to actually read all of it.

I’m going to provide a shorter crash course here showing 100% of what you need code-wise.  I suggest you refer to that main link to understand everything fully though.

Requirements Summary

To get a modal working, assuming you already have Angular + Material working, you need to do the following.  This assumes you are just using the root @NgModule in app.module.ts, but you can use other modules if you like.

  • import MatDialogModule at the top of your app.module.ts and add it to your imports array in the same file.
  • import {MatDialog, MatDialogRef, MAT_DIALOG_DATA} and {Inject} in your current page’s .ts file.
  • Create a new HTML file for your modal at the same level as your current page.
  • Add a dialog component into your typescript file.
  • Write code to trigger your dialog to open.
  • Import your dialog component back in your app.module.ts and register it as a declaration *and* as an entry component (you probably have to add the entryComponents array as it’s not there by default).
    • This is because dialogs are created on-the-fly and angular needs extra information to deal with ad-hoc components.

Detailed Code Example

app.module.ts

Again, if you have a multi @NgModule angular app, you can still refer to this but you may put the content in other modules.

//... (normal imports left out for brevity)
import { MatDialogModule } from '@angular/material';
import { DialogOverviewExampleDialog } from "./cs-job-monitor/cs-job-monitor.component";

@NgModule({
  declarations: [
    ...,
    DialogOverviewExampleDialog
  ],
  imports: [
    ...,
    MatDialogModule
  ],
  providers: [],
  bootstrap: [AppComponent],
  entryComponents: [
    DialogOverviewExampleDialog
  ],
})
export class AppModule { }

cs-job-monitor.component.ts

This is just one of the pages in my angular project as generated by the angular CLI. It just happens to be called cs-job-monitor but that isn’t important to you.

//... (normal imports left out for brevity)
import { Inject } from '@angular/core';
import {MatDialog, MatDialogRef, MAT_DIALOG_DATA} from '@angular/material';

//Your normal page component.
@Component({
  selector: 'app-cs-job-monitor',
  templateUrl: './cs-job-monitor.component.html',
  styleUrls: ['./cs-job-monitor.component.styl']
})
export class CsJobMonitorComponent {

  constructor(private http: HttpClient, public dialog: MatDialog) {
    //Normal work.
  }

  //In my case, I am opening the modal on the "on select row" event
  //of an angular grid (ag-grid).  But this is not important, just look
  //at how it opens.
  onSelectionChanged(event: Object) {
    const dialogRef = this.dialog.open(DialogOverviewExampleDialog, {
      data: event["api"].getSelectedRows()
    });
  }
}

//Here's your dialog component.  Mine is still named after the example one from
//angular's documentation page (I'll fix that!).  But it works fine.
@Component({
  selector: 'dialog-overview-example-dialog',
  templateUrl: 'dialog-overview-example-dialog.html'
})
export class DialogOverviewExampleDialog {

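  constructor(
    public dialogRef: MatDialogRef<DialogOverviewExampleDialog>,
    //The data is whatever was passed to dialog.open() (here, the selected grid rows).
    @Inject(MAT_DIALOG_DATA) public data: any) {}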

  onNoClick(): void {
    this.dialogRef.close();
  }
}

dialog-overview-example-dialog.html

Here is the HTML that appears in your dialog when it pops up. For now, I just have it displaying the object you gave it as data as JSON. In this case, it will display the selected rows from the ag-grid I was using to call onSelectionChanged(). But I’m not bothering to add the grid itself here.

<pre>
  {{data | json}}
</pre>