VectorWise DB (Actian) PDT Issues

Ingres Vector(wise) uses a PDT (positional delta tree) to store in-memory updates to tables which have not yet been flushed to disk. When executing incremental updates to tables using standard SQL operations, instead of the Actian-recommended combine operations, sometimes errors occur due to data in the PDT being used improperly. These errors are just due to bugs in VW which Actian tends to address quickly – but knowing how to detect them and narrow your problems down to them is a useful skill when working with this product. It allows you to create a useful bug report and move on to other problems so you don’t waste too much of your own time.

When does this matter?

Usually these problems manifest themselves when you are moving data (inserts, updates, or deletions) from one table with pending in memory updates to another table. So, for example, if you stage your incremental updates in table X and go to merge them to table Y, you may find that table X’s in-memory changes conflict with the data you are currently working with.

What are the visible effects?

PDT issues can cause a number of problems to occur as you might imagine. You may be executing an update, and it may tell you that the update is ambiguous because you have a record in your current data set and a record in the PDT which both match the given key. Again, this should not happen, but it does sometimes due to various bugs. Similarly, you may be told that a duplciate TID exists while doing a deletion if a similar deletion has occurred in the past in the same table and it’s still in the PDT. Finally, you may also be told that a duplicate insertion is occuring for the same reasons. This last one can result in a primary key violation or you literally ending up with two records in your target table when only one was in your current data set (the PDT version and the version in the staging table both get inserted).

How do I tell if PDTs currently exist?

Assuming I create two test tables and insert two records into table ‘test_one’, the command “vwinfo -M ” displays information as follows:

 |schema_name                     |table_name                      |pdt_mem             |pdt_inserts_count   |pdt_deletes_count   |pdt_modifies_count  |
 |$ingres                         |iiprimitives                    |                   0|                   0|                   0|                   0|
 |$ingres                         |iivwprof_expr                   |                   0|                   0|                   0|                   0|
 |$ingres                         |iivwprof_expr_global            |                   0|                   0|                   0|                   0|
 |$ingres                         |iivwprof_io                     |                   0|                   0|                   0|                   0|
 |$ingres                         |iivwprof_io_global              |                   0|                   0|                   0|                   0|
 |$ingres                         |iivwprof_op                     |                   0|                   0|                   0|                   0|
 |$ingres                         |iivwprof_op_global              |                   0|                   0|                   0|                   0|
 |$ingres                         |iivwprof_parse_tree             |                   0|                   0|                   0|                   0|
 |$ingres                         |iivwprof_parse_tree_global      |                   0|                   0|                   0|                   0|
 |$ingres                         |iivwprof_query                  |                   0|                   0|                   0|                   0|
 |$ingres                         |iivwprof_query_global           |                   0|                   0|                   0|                   0|
 |$ingres                         |iivwprof_stage                  |                   0|                   0|                   0|                   0|
 |$ingres                         |iivwprof_stage_global           |                   0|                   0|                   0|                   0|
 |ingres                          |test_one                        |                2542|                   2|                   0|                   0|
 |ingres                          |test_two                        |                   0|                   0|                   0|                   0|

Most of this is arbitrary output for system tables, bu the bottom two lines show the tables created for this sample. We can see both inserts accounted for as well as the overall memory used for the PDTs for each table. Table 2 uses none since it hasn’t been altered yet. If we had performed deletions or updates over multiple transactions, they would be represented here as well. If you cancel out operations in the same transaction (insert and delete the same record, for example), those operations will not show up in the PDTs when the transaction is committed.

How can I flush the PDTs?

Since PDTs can cause problems, the best thing you can do is to flush them to disk if you know there is an issue, after which all values in this “vwinfo” output will be 0. Two operations can flush PDTs; the first is ‘combine’ and the second is ‘truncate’. The combine command “call vectorwise(combine ‘tablename’);” can be executed from any SQL prompt and it will flush in memory updates to disk. You can also use it to do table arithmetic like table_1 – table_2 + table_2 to delete all of the values of table 2 from table one and then reinsert them. Ingres actually recommends that you use combine to do incremental updates in the first place rather than using straight SQL. The second command, truncate, is of the form: “modify to truncated;” and can also be executed from any SQL prompt. You should not use the truncate command on your data tableas as it will wipe them out. You can ue them on your staging tables after every transaction to ensure their PDTs are totally removed though, assuming you’re okay with the staging tables having no data between transactions.

A new article will be written to talk about the usage of combine in more detail, so please look for that if you are interested in learning more.

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s