sql | Coding Stream of Consciousness

Some Context

We deploy presto in-house and manage a fork of the code base so that we can introduce company-specific features for auditing, security, or integrations.

One of the main features we provided is support for hive views (which I now see is coming into presto-sql version 330 (but that’s not really relevant here).

https://prestosql.io/docs/current/release/release-330.html

Add experimental support for executing basic Hive views. To enable this feature, the hive.views-execution.enabled configuration property must be set to true. (#2715)

A recent feature request for our implementation was to add support for the unix_timestamp function so that this expression worked:

"unix_timestamp"(a.creation_datetime, 'yyyy-MM-dd')

The Code

There is some good information on making functions here: https://prestosql.github.io/docs.prestosql.io/current/develop/functions.html. But I mostly just referenced other DateTime functions in the DateTimeFunctions class within the presto code base (which also happens to be where I added the new unix_timestamp function implementation shown below.

    @ScalarFunction("unix_timestamp")
    @LiteralParameters({"x", "y"})
    @SqlType(StandardTypes.DOUBLE)
    public static double unixTimestamp(@SqlType("varchar(x)") Slice datetime, @SqlType("varchar(y)") Slice formatString)
    {
        SimpleDateFormat simpleDateFormat = new SimpleDateFormat(formatString.toStringUtf8());
        simpleDateFormat.setTimeZone(TimeZone.getTimeZone("UTC"));
        try {
            Date parse = simpleDateFormat.parse(datetime.toStringUtf8());
            return parse.toInstant().toEpochMilli() / 1000d;
        }
        catch (ParseException e) {
            throw new PrestoException(INVALID_FUNCTION_ARGUMENT, "'" + datetime.toStringUtf8() +
                    " cannot be parsed by format string " + formatString.toStringUtf8() + ".");
        }
    }

This function takes a varchar date or column, and a varchar format string, and it parses out the date in UTC to the unix time (seconds since 1970). This should follow the hive behavior for this overload (hopefully!).

Now, you can call the function (for this overload) in presto-sql just like you can in hive. We still have to do more testing and we need to provide more overloads, but this is a working implementation, so I hope it helps you do the same.

Auto Upper Case + Schema Case Sensitivity

What I realized after a short while was that all SQL identifiers you place into Snowflake SQL are automatically made upper case. Snowflake cares about schema case sensitivity though.

So, unless you’ve been going around and adding double-quotes around all your database/schema/table names while creating them, almost everything you have will be in upper case.

When you do create things in lower-case manually with quoting, you have to go around adding quotes to them in every query to ensure they are actually given in lower-case to the database. For example, SELECT * FROM mydatabase.dbo.mytable will implicitly become SELECT * FROM MYDATABASE.DBO.MYTABLE. So, if “dbo” is the real name and not “DBO” for the schema, you actually need to do SELECT * FROM MYDATABASE.”dbo”.MYTABLE instead.

Note, this assumes MYDATABASE and MYTABLE were created in upper-case or without quoting.

Coding Stream of Consciousness

by John Humphreys – Random code from my life.

Tag Archives: sql

Add a New UDF SQL Function to Presto – Support unix_timestamp From Hive

Some Context

The Code

Snowflake SQL compilation error: Object does not exist – Schema / Case Sensitive

Auto Upper Case + Schema Case Sensitivity

Final Thoughts