Add a New UDF SQL Function to Presto – Support unix_timestamp From Hive

Some Context

We deploy presto in-house and manage a fork of the code base so that we can introduce company-specific features for auditing, security, or integrations.

One of the main features we provided is support for hive views (which I now see is coming into presto-sql version 330 (but that’s not really relevant here).

https://prestosql.io/docs/current/release/release-330.html

Add experimental support for executing basic Hive views. To enable this feature, the hive.views-execution.enabled configuration property must be set to true. (#2715)

A recent feature request for our implementation was to add support for the unix_timestamp function so that this expression worked:

"unix_timestamp"(a.creation_datetime, 'yyyy-MM-dd')

The Code

There is some good information on making functions here: https://prestosql.github.io/docs.prestosql.io/current/develop/functions.html.  But I mostly just referenced other DateTime functions in the DateTimeFunctions class within the presto code base (which also happens to be where I added the new unix_timestamp function implementation shown below.

    @ScalarFunction("unix_timestamp")
    @LiteralParameters({"x", "y"})
    @SqlType(StandardTypes.DOUBLE)
    public static double unixTimestamp(@SqlType("varchar(x)") Slice datetime, @SqlType("varchar(y)") Slice formatString)
    {
        SimpleDateFormat simpleDateFormat = new SimpleDateFormat(formatString.toStringUtf8());
        simpleDateFormat.setTimeZone(TimeZone.getTimeZone("UTC"));
        try {
            Date parse = simpleDateFormat.parse(datetime.toStringUtf8());
            return parse.toInstant().toEpochMilli() / 1000d;
        }
        catch (ParseException e) {
            throw new PrestoException(INVALID_FUNCTION_ARGUMENT, "'" + datetime.toStringUtf8() +
                    " cannot be parsed by format string " + formatString.toStringUtf8() + ".");
        }
    }

This function takes a varchar date or column, and a varchar format string, and it parses out the date in UTC to the unix time (seconds since 1970). This should follow the hive behavior for this overload (hopefully!).

Now, you can call the function (for this overload) in presto-sql just like you can in hive. We still have to do more testing and we need to provide more overloads, but this is a working implementation, so I hope it helps you do the same.

Snowflake SQL compilation error: Object does not exist – Schema / Case Sensitive

Recently I was having strange issues while trying to grant a role access to a database schema in snowflake.

The schema was manually created after a migration from another database, and its name was in lower-case – e.g. MYDATABASE.”dbo”, “dbo” being the schema name.

Auto Upper Case + Schema Case Sensitivity

What I realized after a short while was that all SQL identifiers you place into Snowflake SQL are automatically made upper case.  Snowflake cares about schema case sensitivity though.

So, unless you’ve been going around and adding double-quotes around all your database/schema/table names while creating them, almost everything you have will be in upper case.

When you do create things in lower-case manually with quoting, you have to go around adding quotes to them in every query to ensure they are actually given in lower-case to the database.  For example, SELECT * FROM mydatabase.dbo.mytable will implicitly become SELECT * FROM MYDATABASE.DBO.MYTABLE.  So, if “dbo” is the real name and not “DBO” for the schema, you actually need to do SELECT * FROM MYDATABASE.”dbo”.MYTABLE instead.

Note, this assumes MYDATABASE and MYTABLE were created in upper-case or without quoting.

Final Thoughts

I personally feel that you should avoid quoting and let everything be upper case.  If you did have to create things in lower-case, then I suggest always using quoting everywhere.  Anything in between the two will get confusing.