Snowflake SQL compilation error: Object does not exist – Schema / Case Sensitive

Recently I was having strange issues while trying to grant a role access to a database schema in Snowflake.

The schema was manually created after a migration from another database, and its name was in lower case, e.g. MYDATABASE."dbo", with "dbo" being the schema name.

Auto Upper Case + Schema Case Sensitivity

What I realized after a short while was that any unquoted SQL identifier you give Snowflake is automatically resolved as upper case.  Identifiers wrapped in double quotes, on the other hand, are stored and matched exactly as written, so schema names are case sensitive.

So, unless you’ve been going around and adding double-quotes around all your database/schema/table names while creating them, almost everything you have will be in upper case.

When you do create things in lower case manually with quoting, you have to add quotes to them in every query to ensure they actually reach the database in lower case.  For example, SELECT * FROM mydatabase.dbo.mytable will implicitly become SELECT * FROM MYDATABASE.DBO.MYTABLE.  So, if the schema’s real name is "dbo" and not "DBO", you actually need to write SELECT * FROM MYDATABASE."dbo".MYTABLE instead.

Note, this assumes MYDATABASE and MYTABLE were created in upper-case or without quoting.
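To make the resolution rule concrete, here is a minimal sketch in Snowflake SQL.  MYDATABASE, "dbo", and MYTABLE follow the example above; MYROLE is a hypothetical role name I made up for illustration, and the exact error text you see may differ.

```sql
-- Quoted name: the schema is stored as lower-case dbo.
CREATE SCHEMA MYDATABASE."dbo";

-- Unquoted table name: stored as upper-case MYTABLE.
CREATE TABLE MYDATABASE."dbo".mytable (x INT);

-- List the schemas to see the names exactly as they are stored.
SHOW SCHEMAS IN DATABASE MYDATABASE;

-- Fails with a "does not exist" compilation error,
-- because dbo is resolved as DBO.
SELECT * FROM mydatabase.dbo.mytable;

-- Works: the quotes preserve the lower-case schema name.
SELECT * FROM mydatabase."dbo".mytable;

-- The grant needs the same quoting (MYROLE is hypothetical).
GRANT USAGE ON SCHEMA MYDATABASE."dbo" TO ROLE MYROLE;
```

The only thing that differs between the failing and working statements is the pair of double quotes around the schema name.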

Final Thoughts

I personally feel that you should avoid quoting and let everything be upper case.  If you did have to create things in lower-case, then I suggest always using quoting everywhere.  Anything in between the two will get confusing.
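If you inherit a lower-case schema and would rather go the all-upper-case route, one option is to rename it.  This is only a sketch, assuming no schema named DBO already exists in the database and that nothing else refers to the old quoted name:

```sql
USE DATABASE MYDATABASE;

-- Rename the quoted lower-case schema to an unquoted name,
-- which Snowflake stores as upper-case DBO.
ALTER SCHEMA "dbo" RENAME TO DBO;

-- After the rename, unquoted references resolve as expected.
SELECT * FROM mydatabase.dbo.mytable;
```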

Postgres Schema Creation

Historically, I have not worked with Postgres much. So, when I had to start using it, one of my first questions was how to create a schema, and how to use it for my new tables and such.

Creating a schema is exactly what you expect:

create schema myschema;

But using it is not quite what I expected.  Of course, you can do the standard thing when you’re managing your objects and qualify each name as schema.table, like this:

create table myschema.mytable (x int);

But what if you just want:

create table mytable (x int);

to go into myschema by default?  To do this in Postgres, you have to add the schema to your search path.  By default your search path is "$user", public, which in practice means the public schema unless a schema named after your user exists; you can view it like this:

SHOW search_path;

You can actually set it to one or more schemas.  The first schema in the list that contains the named table is the one your query takes it from.  The first schema in the list is also the default one for when you create new objects.  So, if you did this:

SET search_path TO myschema;
create table mytable (x int);

Then your table would in fact be created in the “myschema” schema properly.
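Here is a slightly longer sketch of the same idea with more than one schema on the path.  Keep in mind that a plain SET only lasts for the current session; myuser below is a hypothetical role name used purely for illustration.

```sql
-- Put myschema first, but keep public visible for lookups.
SET search_path TO myschema, public;

-- Created in myschema, the first schema on the path.
CREATE TABLE mytable (x int);

-- Confirm which schema the table actually landed in.
SELECT schemaname, tablename
FROM pg_tables
WHERE tablename = 'mytable';

-- To make the setting stick for future sessions of a role
-- (myuser is a hypothetical role name):
ALTER ROLE myuser SET search_path TO myschema, public;
```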

Database Star Schemas and Snowflake Schemas

Schema Confusion

A lot of people very regularly work with databases (even high end ones), but get thrown by terms like star-schema, snowflake-schema, etc. due to a lack of formal training or of experience with data warehousing technologies.

These same people will often be perfectly comfortable with indexing, query optimization, foreign keys, concepts of de-normalization and normal forms, etc.

I personally started working with the actual “Snowflake” database recently and had to review what a snowflake schema was when I started looking at it.

Useful Articles

I found an interesting article on Star schemas vs Snowflake schemas pretty quickly, and backtracked from it to the precursor articles digging into the Star and Snowflake schemas respectively.  I’m just going to paraphrase them below to give people a quick overview and/or refresher.

Star Schema

A star schema just means that your main table (usually called the “fact” table) has a primary key made out of multiple columns, each of which is a foreign key to a “dimension” table.  Then you have one or more “fact” columns, the actual measurements, in addition to the primary key.

The dimension tables hold all the relevant attributes you may want to aggregate and/or query the main table on.  For example, you might have a dimension table for the date which breaks out the year, month, day, and day-of-week so they can be directly used.  You may then have another dimension table for the geographical region with columns for the continent, country, and city, so you can aggregate on those.

Each dimension table is NOT normalized though.  So, if you have “New York City” as the city for 1 million rows, you are literally repeating that a million times.  This makes queries easy to write but has a penalty in terms of data storage (which can be bad if you’re, say, in the cloud and paying more for more storage over time).
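As a rough sketch of that layout in generic SQL: every table and column name here (dim_date, dim_region, fact_sales, amount) is one I invented for illustration, not something taken from the original articles.

```sql
-- Date dimension: attributes broken out so they can be used directly.
CREATE TABLE dim_date (
    date_id     INT PRIMARY KEY,
    year        INT,
    month       INT,
    day         INT,
    day_of_week INT
);

-- Region dimension: one flat table, so continent and country
-- names are repeated as plain text on every row.
CREATE TABLE dim_region (
    region_id INT PRIMARY KEY,
    continent VARCHAR(50),
    country   VARCHAR(50),
    city      VARCHAR(100)
);

-- Main table: the primary key is made of the dimension foreign keys,
-- plus one fact column.
CREATE TABLE fact_sales (
    date_id   INT REFERENCES dim_date (date_id),
    region_id INT REFERENCES dim_region (region_id),
    amount    DECIMAL(12, 2),
    PRIMARY KEY (date_id, region_id)
);

-- Aggregating is a single join per dimension.
SELECT d.year, r.country, SUM(f.amount) AS total_amount
FROM fact_sales f
JOIN dim_date d   ON d.date_id = f.date_id
JOIN dim_region r ON r.region_id = f.region_id
GROUP BY d.year, r.country;
```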

Snowflake Schema

Plain and simple: a snowflake schema is a star schema where the dimension tables are normalized.  This means that, for example, the geographical region dimension table itself would actually be turned into 4 tables (kind of its own star schema).  You would have one table for the continent, one for the country, one for the city, and one main table with the combination of the 3 as its primary key.

This makes queries more complex and possibly a little slower, but it means we have complete normalization and are not wasting any data storage.  Also, if, say, a city changed its name, we would have exactly one database cell to update, whereas in a star schema we would have to update potentially millions of rows with copies of that name.
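Here is a minimal sketch of the region dimension split up this way, again in generic SQL.  The table and column names (continent, country, city, dim_region, and the id columns) are hypothetical, as is the city_id value in the update.

```sql
-- Each name now lives in exactly one row of its own table.
CREATE TABLE continent (
    continent_id INT PRIMARY KEY,
    name         VARCHAR(50)
);

CREATE TABLE country (
    country_id INT PRIMARY KEY,
    name       VARCHAR(50)
);

CREATE TABLE city (
    city_id INT PRIMARY KEY,
    name    VARCHAR(100)
);

-- The region dimension becomes a combination of the three keys.
CREATE TABLE dim_region (
    continent_id INT REFERENCES continent (continent_id),
    country_id   INT REFERENCES country (country_id),
    city_id      INT REFERENCES city (city_id),
    PRIMARY KEY (continent_id, country_id, city_id)
);

-- Renaming a city touches exactly one cell.
UPDATE city SET name = 'New City Name' WHERE city_id = 1;

-- The cost: reading names back needs one extra join per level.
SELECT co.name AS country, ci.name AS city
FROM dim_region r
JOIN country co ON co.country_id = r.country_id
JOIN city ci    ON ci.city_id = r.city_id;
```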

Why the Names?

If you think of a “Star Schema”, picture a main table with, say, 5 extra dimension tables around it like the 5 points of a star.  Makes sense, right?

Now, for a snowflake, picture each point being 5 tables by itself… so each point is its own star.  This starts to branch out like a snowflake.  Just think of fractals if you don’t believe me :).