This is an extension of my previous post (“Netezza Questions and Answers”).
What data type is most suited for zone maps
Zone maps are typically useful for integers, date and time, variations of this data type. Zone maps are useful for ordered that that are usually built into data that is loaded; for example, phone call logs. [Also check more on zonemaps in my original post below].
Similar to other databases, materialized views are defined against base tables. Just like in views, we can not insert, update or delete from MV. MV is automatically used by optimizer when appropriate. When base table data changes, MV gets automatically updated by Netezza. MV is based on a single base table, thus practically not as useful.
Mostly hash joins (note that Netezza may do hash join in memory or on the SPUs if memory is not sufficient for doing a hash join); in some cases sort merge or even some cases nested loops are performed (did not really come across much; In Netezza lingo it is called either function based join or something similar, not sure about the correct term). Of course, cross or product joins are possible like in any databases.
Imagine a situation in which a small table (us_states) is required for joining against a large table (creditcard_txn). Netezza may decide to read this table from all SPUs (remember, table data is spread on all SPUs) and “assemble” this table on host. This data is then broadcasted to all SPUs, resulting in a copy on each SPU. Note that an important db parameter “factrel_size_threshold” holds that triggering number; any table beyond these many number of rows is considered as a fact table. That is, if this parameter is set to 1 million, any table holding more than 1 million rows is considered by Netezza as a fact table and will not result in this replication or broadcast.
NO, unless all columns are used during a join process. Most likely, this results in a large amount redistribution of data during query execution.
YES, we will encounter performance problem as Netezza redistributes integer to varchar type and recreates the first table. This can be avoided using both tables distributed on order number as either integer or varchar.
Snippet is a small block of database operation, typically three to four operations, that are carried out on all SPUs where data is location. If a query results in these snippets: Snippet A, Snippet B, Snippet C, ….Snippet X; they are carried out in a sequential manner.
How can we find log (SQL) activity for a day.
We can find this under /nz/kit.x.y/log/postgres/pg.log file. Older files are named as pg.log.N (Where N starts from 1, after pg.log file this is the latest file). Assuming that we are looking for a week day within pg.log, we may run
$ cat pg.log | sed -n “/2010-02-01 00/,/2010-02-05 23:59/p” > pg.firstweekFeb2010.log
If this produces no data, look for corresponding log file based on the last update timestamp (ls -ltr sorts them in reverse time stemp order).
Please see my previous posting first. Here is a short list: load using nzload, SQL inserts (very slow), create table as command or inserts from other tables, and external tables. When insert is interrupted, all rows that are inserted are already committed unless we use transaction command.
Most Netezza commands come with a variety of sub commands. For nzsystem, we can list a number of parameters using “nzsystem showregistry”. To find quickly if Netezza is up and running, use “nzsystem showstate”. However, using these commands require you to set “NZ_USER” and “NZ_PASSWORD” at unix level. You can do this with ‘export NZ_USER=username’ and ‘export NZ_PASSWORD=userpassword’, and then run this command. If not, you can specify login and password combo at command level. I do not prefer this way, as any users can do “ps -ef” to check what commands you are using, which also gives login and password you entered at shell command level.
Two important bin locations for Netezza are: /nz/kit.x.y/sbin and /nz/kit.x.y/bin.
Some configurations are located in /nz/data directory.
How can we remove formatting with NZSQL
NZSQL by default shows text formatted. In cases in which we do not need white spaces, use “nzsql -A” option. For example, “nzsql -A -c “select count(*) from hertz.daily_bookings;” allows us to run SQL command direct without logging into nzsql interactively.
Is there a way to stop NZSQL command, if one of the SQL commands fail.
Yes. A similar feature exists in other databases too. In this case, ON_ERROR_STOP=true on nzsql command so that other commands do not get executed. For instance, we like to create a new table (CTAS) and later drop old table. If a new table creation fails, we certainly do not want to drop old table. In this case, this option is very useful.
Can we access data in other databases with the same NZSQL.
This depends on the version. Version 5 and upwards support selects against database.owner.table, provided that user has same login and password access. Inserts are allowed in the current database with data from other databases, not the other way (meaning, we cannot insert into another database from the database where we logged in).
How can we plan and corresponding CPP files.
We can just to ‘explain’ on a query to see how plan looks. At run time, plans are created under /nz/data/plans we will see corresponding plans generated during run time. Corresponding CPP code is located under /nz/data/cache/