Add a New UDF SQL Function to Presto – Support unix_timestamp From Hive

Some Context

We deploy presto in-house and manage a fork of the code base so that we can introduce company-specific features for auditing, security, or integrations.

One of the main features we provided is support for hive views (which I now see is coming into presto-sql version 330 (but that’s not really relevant here).

https://prestosql.io/docs/current/release/release-330.html

Add experimental support for executing basic Hive views. To enable this feature, the hive.views-execution.enabled configuration property must be set to true. (#2715)

A recent feature request for our implementation was to add support for the unix_timestamp function so that this expression worked:

"unix_timestamp"(a.creation_datetime, 'yyyy-MM-dd')

The Code

There is some good information on making functions here: https://prestosql.github.io/docs.prestosql.io/current/develop/functions.html.  But I mostly just referenced other DateTime functions in the DateTimeFunctions class within the presto code base (which also happens to be where I added the new unix_timestamp function implementation shown below.

    @ScalarFunction("unix_timestamp")
    @LiteralParameters({"x", "y"})
    @SqlType(StandardTypes.DOUBLE)
    public static double unixTimestamp(@SqlType("varchar(x)") Slice datetime, @SqlType("varchar(y)") Slice formatString)
    {
        SimpleDateFormat simpleDateFormat = new SimpleDateFormat(formatString.toStringUtf8());
        simpleDateFormat.setTimeZone(TimeZone.getTimeZone("UTC"));
        try {
            Date parse = simpleDateFormat.parse(datetime.toStringUtf8());
            return parse.toInstant().toEpochMilli() / 1000d;
        }
        catch (ParseException e) {
            throw new PrestoException(INVALID_FUNCTION_ARGUMENT, "'" + datetime.toStringUtf8() +
                    " cannot be parsed by format string " + formatString.toStringUtf8() + ".");
        }
    }

This function takes a varchar date or column, and a varchar format string, and it parses out the date in UTC to the unix time (seconds since 1970). This should follow the hive behavior for this overload (hopefully!).

Now, you can call the function (for this overload) in presto-sql just like you can in hive. We still have to do more testing and we need to provide more overloads, but this is a working implementation, so I hope it helps you do the same.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s