
SQL Hacks

Most of my SQL is usually something along the lines of:

CREATE TABLE ….
DROP DATABASE ….

But recently I have been working on a report that has been causing some trouble. The report issues a large SQL query for each of our partners, and then a fancy Perl script filters, sorts, aggregates, and formats the data on the client. This report now runs for over 20 hours per day. This is a problem. But just from reviewing the algorithm used for report generation, it seems like there is some opportunity for improvement.

I have been working to convert this report into SQL so Postgres can handle the sorting, filtering, formatting, et al. It seemed like this strategy might run much faster than having all the logic done in Perl on a remote client. It’s too early to tell since I haven’t finished my port from Perl to SQL yet, but I did learn a few things along the way.

Temporary Tables
Temporary tables are awesome. Using a statement of the form:

CREATE TEMPORARY TABLE x AS SELECT …

you can save the results of a complicated SELECT statement and reuse it for future JOIN fun. This way you can accomplish a couple of nice things.

  1. Logically separate the SQL so the next poor sap who has to debug this monster has any chance at all to understand the basics of what is going on.
  2. Temporary tables can be created early during a session for use as a lookup by subsequent queries. This can save the DB some work since it only has to generate the lookup table once instead of n times.
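The pattern looks roughly like this (the partner/order schema here is hypothetical, just to illustrate):

```sql
-- Build the lookup once per session instead of once per query.
CREATE TEMPORARY TABLE active_partners AS
SELECT partner_id, partner_name
FROM partners
WHERE status = 'active';

-- Any later query in the same session can JOIN against it.
SELECT o.order_id, p.partner_name, o.total
FROM orders o
JOIN active_partners p USING (partner_id);
```

Temporary tables are private to the session and get dropped automatically when the session ends.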

Crosstab & Pivot
The format of the report I need to generate got a bit hairy once I compared it to our DB schema. I found that I needed to translate rows into columns. This is a PITA, but I RTFM and discovered the tablefunc module from postgres-contrib. The tablefunc module includes a handy crosstab function that translates rows into columns. This sounded perfect for my needs, but alas it didn’t work for me. You see, the crosstab function is very particular about the columns it returns: you have to declare the exact number and types of the output columns up front. Unfortunately, for my dataset the number of columns was not constant. It varied. This was demoralizing.
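For reference, a crosstab call looks roughly like this, assuming the tablefunc module is installed and using a hypothetical monthly_revenue table:

```sql
-- crosstab() comes from the tablefunc contrib module.
SELECT *
FROM crosstab(
  'SELECT partner_id, month, revenue
     FROM monthly_revenue
    ORDER BY 1, 2'
) AS ct(partner_id integer, jan numeric, feb numeric, mar numeric);
```

The AS clause is the sticking point: every output column must be declared by name and type, so the shape of the result cannot vary from run to run.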

But I discovered a workaround. Before the tablefunc module there were lots of clever folks who figured out how to hack up a crosstab using ANSI-SQL. The best example I found was here: http://www.paragoncorporation.com/ArticleDetail.aspx?ArticleID=25

So I stopped using the fancy new crosstab function and implemented my solution in ANSI SQL.
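The core of that ANSI-SQL trick is conditional aggregation: one CASE expression per output column, wrapped in an aggregate and grouped by the row key. A sketch against the same hypothetical table:

```sql
-- Each CASE picks out one category's value; the aggregate
-- collapses the group's rows into a single output row.
SELECT partner_id,
       SUM(CASE WHEN month = 1 THEN revenue ELSE 0 END) AS jan,
       SUM(CASE WHEN month = 2 THEN revenue ELSE 0 END) AS feb,
       SUM(CASE WHEN month = 3 THEN revenue ELSE 0 END) AS mar
FROM monthly_revenue
GROUP BY partner_id;
```

The column list is still spelled out, but since this is plain SQL you can generate the statement programmatically from a query over the distinct values, which is what makes a varying number of columns workable.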

I hope this Frankenstein performs better than our current Perl-based solution.
