Monday, May 28, 2007

Cheetah spot by spot: DIRECT_IO

Back to another article about the new features of IBM IDS 11.10 (code named Cheetah).
This will be a short article about a new feature that can end an old discussion. I'd bet that every Informix DBA has, at least once, had an argument about the advantages and disadvantages of raw devices vs cooked files.
Raw devices are hard disk partitions, or at least represent a physical device somewhere (either a local storage disk or a NAS pseudo-device). They are not used by any OS filesystem. The access is as direct and fast as possible. There is no filesystem layer or cache involved. The Informix instance requests the I/O operations directly from the underlying OS. Their advantages can be summarized as:

  • They're fast
  • You don't waste memory in FS cache (normally useless since IDS uses its own cache)
  • They're the most reliable way to store your data. Since there is no FS layer, it's not possible to have FS corruption
  • They use Kernel Asynchronous I/O (KAIO) by default. This allows the I/O operations to be executed in the kernel context. Simple AIO is asynchronous, but requires a user context that needs to be set up. IDS gains a further advantage from KAIO because it is done on CPU VP threads, so you avoid process context switching (from CPU VP to AIO VP)
Cooked files are files you create on a filesystem and you use as Informix chunks. Their advantages are more or less the disadvantages of the raw devices:
  • They're easy to use. Just "touch chunk;chown informix:informix chunk;chmod 660 chunk" and you're ready
  • You can't accidentally create a filesystem over one of your raw devices
  • Nowadays several filesystems have options to bypass the memory cache
These arguments have held more or less steady over time, so the decision has always depended largely on your own feelings. If you could live without top-of-the-line performance, you might consider cooked files. If you needed maximum performance, you'd probably choose raw devices.

Once again, the R&D engineers have been alert... The latest features available on the newest filesystems and operating systems led them to make IDS use a new feature. It's called DIRECT_IO, which is also the name of a new $ONCONFIG parameter. If you set it to 1, IDS will attempt to use the feature if it's available on the OS and the filesystem you're using, for the page size you choose.
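
In the configuration file this is a single line, DIRECT_IO 1. As a minimal sketch, the helper below prints the value a given file holds for a parameter, which is handy for checking what an instance will start with. The function name onconfig_value is made up for this illustration, and it assumes the usual layout of one "PARAMETER value" pair per line with # comments.

```shell
# Hypothetical helper (not part of IDS): print the value assigned to an
# ONCONFIG parameter in a configuration file.
#   $1 = parameter name, e.g. DIRECT_IO
#   $2 = path to the onconfig file
onconfig_value() {
    # Compare against the first field only, so commented-out lines such as
    # "# DIRECT_IO ..." or "#DIRECT_IO 1" never match.
    awk -v param="$1" '$1 == param { print $2; exit }' "$2"
}

# Typical use, assuming INFORMIXDIR and ONCONFIG are set in the environment:
#   onconfig_value DIRECT_IO "$INFORMIXDIR/etc/$ONCONFIG"
```

An empty result means the parameter is not set in that file.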

Whenever this is possible, it means you'll get the following for your chunk I/O:
  • Buffer bypassing: Reads/writes go directly to disk
  • KAIO is used: You'll get a serious performance boost
  • Concurrent I/O: Bypassing FS inode locking gives you another performance boost
If you try to use it, watch for any online.log messages you may get. I've tried this on a recently installed RHEL 4 system and it worked as expected: no messages in the online log, and I could see KAIO threads in the onstat -g ath output.
In a virtual machine running Fedora I got:

19:45:39 libaio.so.1: cannot open shared object file: No such file or directory
19:45:39 Please install libaio package. See machine notes.
19:45:39 Warning: Kernel Asynchronous I/O is disabled.


After a bit of yum'ing I installed libaio and tried again. This time there were no errors, and onstat -g ath shows:


cheetah@PacMan.domus.online.pt:informix-> onstat -g ath
Your evaluation license will expire on 2007-11-24 00:00:00

IBM Informix Dynamic Server Version 11.10.UB7TL -- On-Line -- Up 00:00:59 -- 88872 Kbytes

Threads:
tid  tcb      rstcb    prty status            vp-class name
*2   47e1dd98 0        1    IO Idle           3lio     lio vp 0
*3   47e32970 0        1    IO Idle           4pio     pio vp 0
*4   47e47970 0        1    IO Idle           5aio     aio vp 0
*5   47e5c970 0        1    IO Idle           6msc     msc vp 0
6    47e89bd8 478b2018 3    sleeping secs: 1  1cpu     main_loop()
*7   47e5cad0 0        1    running           1cpu     soctcppoll
*8   47f76550 0        2    sleeping forever  1cpu     soctcplst
9    47f769e8 478b25e0 1    sleeping secs: 1  1cpu     flush_sub(0)
*10  47f76b48 0        3    IO Idle           1cpu     kaio
11   4807d080 478b2ba8 2    sleeping secs: 1  1cpu     aslogflush
12   4807d828 478b3170 1    sleeping secs: 4  1cpu     btscanner_0
*28  4864b018 478b42c8 3    sleeping secs: 1  1cpu     onmode_mon
*33  485c6080 478b3d00 1    sleeping secs: 14 1cpu     dbScheduler
*34  485c58a0 478b4890 1    sleeping forever  1cpu     dbWorker1
*35  482be7d0 478b4e58 1    sleeping forever  1cpu     dbWorker2
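
To check for the kaio thread from a script rather than by eye, one option is to count the lines of the thread list whose name is kaio. This is only a sketch: count_kaio_threads is a made-up name, and it assumes the thread name is the last field of each line, as in the listing above.

```shell
# Hypothetical helper: count KAIO threads in "onstat -g ath" output
# read from standard input. The thread name is taken to be the last field.
count_kaio_threads() {
    awk '$NF == "kaio" { n++ } END { print n + 0 }'
}

# Typical use on a running instance:
#   onstat -g ath | count_kaio_threads
```

A result of 0 on an instance with DIRECT_IO enabled suggests KAIO is not in use and the message log deserves a look.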


So, this is good... What are the catches? There are a few, but most of them aren't related to IDS itself. They have more to do with the OS and the filesystem:
  • This only works on Unix/Linux (Windows has intrinsic differences)
  • The FS must support this functionality
  • It may depend on the page size you choose (some page sizes may not be supported by the FS)
  • It can't be used for temporary dbspaces
If you decide to use it, you should check for any error messages and also look for KAIO threads. If the system starts to use KAIO for your normal chunks, you'll probably want to reduce the number of AIO virtual processors. You may also want to use AUTO_AIOVPS (an ONCONFIG parameter that allows IDS to auto-configure the number of AIO virtual processors if it finds that the number you defined is not keeping up with the I/O needs).
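
The message check can be scripted too. The sketch below looks for the warning text seen in the Fedora example earlier. The function name kaio_disabled is hypothetical, and other platforms or versions may word the message differently.

```shell
# Hypothetical helper: exit 0 if the message log text on standard input
# contains the "KAIO is disabled" warning, non-zero otherwise.
kaio_disabled() {
    grep -q 'Kernel Asynchronous I/O is disabled'
}

# Typical use (onstat -m prints the most recent lines of the message log):
#   if onstat -m | kaio_disabled; then
#       echo "KAIO is off; check that libaio is installed"
#   fi
```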

So, this feature can really make a difference in the old argument of cooked vs raw chunks. You may decide that it's possible to get the best of both worlds: performance and manageability.

Tuesday, May 01, 2007

Cheetah spot by spot: Better update statistics

I think I will get many of you to agree with me if I say that the only thing we really know about statistics is that they're really important.

Everything else seems a bit unclear and remains more or less a mystery. This article won't clear your doubts about update statistics, since I'll focus only on Cheetah's new features.

If you want to learn more I advise you to check these resources:

I have to agree that something should be done about the documentation of the UPDATE STATISTICS command given its complexity and its importance for achieving maximum performance from the RDBMS.

The information gathered by UPDATE STATISTICS is the fuel that drives the query optimizer. You can't expect good choices for query plans without good, up-to-date statistics about your data.

I'll make a mental note for a future article about these issues. For now, and since this is an article in the Cheetah Spot by Spot series, let's just see what's new.

Most of you will certainly have some contact with your company's or clients' development teams, and if that's the case, you've probably found situations where they are confident they've created the correct indexes, but the query performance is nevertheless poor.

Many times they simply don't know, or forgot, that they have to UPDATE STATISTICS before the optimizer picks up the index.

If you've lived through this situation, you surely wondered why the engine doesn't do this on index creation... Well, that's a good question, and it led the IBM Informix R&D team to change this behavior.

Statistical information is stored in systables, sysfragments, sysindices, sysdistrib and syscolumns. Cheetah brings improvements in the following situations:

  • More information about statistics (more columns in the above tables)
  • More automatic statistical data collected when manipulating Indexes and temporary tables
  • More options for the UPDATE STATISTICS MEDIUM to define the sampling size
  • More information when using the SET EXPLAIN instruction

Let's see these points in detail.


More information about statistics

Cheetah has more columns for storing information about UPDATE STATISTICS, namely:

  • systables.ustlowts : Time stamp (DATETIME YEAR TO FRACTION(5)) when the last UPDATE STATISTICS LOW was run for this table
  • sysdistrib.constr_time : Time stamp (DATETIME YEAR TO FRACTION(5)) when this distribution was collected
  • sysdistrib.smplsize : real sample size used for this distribution's calculation
  • sysdistrib.rowssmpld : real number of rows sampled for this distribution
  • sysdistrib.ustnrows : number of rows in the table when the distribution was calculated
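
As a rough sketch of how these columns could be inspected, the function below prints a catalog query for one table. The name stats_age_sql and the stores_demo database in the usage line are assumptions for the example. The seqno = 1 filter is my own choice to get one row per column, since sysdistrib can hold several rows for each distribution.

```shell
# Hypothetical helper: print a catalog query that shows when statistics
# were last collected for one table.
#   $1 = table name
# Note: this is an inner join, so a table without any distributions
# returns no rows.
stats_age_sql() {
    cat <<EOF
SELECT t.tabname, t.ustlowts, d.colno, d.constr_time,
       d.smplsize, d.rowssmpld, d.ustnrows
  FROM systables t, sysdistrib d
 WHERE d.tabid = t.tabid
   AND d.seqno = 1
   AND t.tabname = '$1';
EOF
}

# Typical use, assuming a database called stores_demo:
#   stats_age_sql customer | dbaccess stores_demo -
```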

More automatic statistical data collected when manipulating Indexes and temporary tables

In Cheetah, the engine does the equivalent of running UPDATE STATISTICS LOW when an index is created. It also computes distributions (histograms) for the leading index column. The distributions will not be calculated (this does not affect the LOW mode part) if any of the following is true:

  • The leading column is a UDT (either built-in or not)
  • It's a functional Index
  • It's a VII (Virtual Index Interface) index

The distributions calculated are equivalent to what UPDATE STATISTICS generates in MEDIUM mode, with a resolution of 1% for tables of fewer than a million rows, and 0.5% for larger tables.

This means that, without having to do anything else after the index creation, the optimizer will have information about it, so the index can and will be used (if it's good for the query plans).

The following improvements were made to temporary table statistics:

  • The number of rows and pages will be maintained automatically
  • The creation of indexes will generate distributions and statistics for the table and index

More options for the UPDATE STATISTICS MEDIUM to define the sampling size

Another new feature is a new clause for UPDATE STATISTICS MEDIUM: SAMPLING SIZE. This clause can be used together with the resolution and confidence clauses and sets a minimum number of rows to be sampled. For more information about the sample size in MEDIUM mode, please consult the above reference to John Miller's article on UPDATE STATISTICS. Many people don't know that the sample size is predetermined by the resolution and confidence parameters. From now on, we can change this with the new SAMPLING SIZE option. The number given can have two meanings:

  • When it's less than or equal to 1: it's the fraction of the table rows to sample (for example, 0.1 means 10%)
  • When it's greater than 1: it's the actual number of rows to be used for sampling
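
To make the two meanings concrete, here is a small sketch that turns a SAMPLING SIZE value into the number of rows it asks for. The function name requested_sample_rows is invented for this example. Since the clause sets a minimum, the engine may end up sampling more, and the real figures are the ones stored in sysdistrib.smplsize and sysdistrib.rowssmpld. The statement in the comment is my reading of the syntax (customer is a placeholder table), so check it against the SQL syntax guide.

```shell
# Hypothetical helper: translate a SAMPLING SIZE value into the number of
# rows it requests.
#   $1 = number of rows in the table
#   $2 = value given to SAMPLING SIZE
requested_sample_rows() {
    awk -v nrows="$1" -v size="$2" 'BEGIN {
        if (size <= 1) {
            # fraction of the table, rounded to the nearest row
            printf "%d\n", nrows * size + 0.5
        } else {
            # absolute number of rows
            printf "%d\n", size
        }
    }'
}

requested_sample_rows 200000 0.25    # prints 50000
requested_sample_rows 200000 5000    # prints 5000

# The corresponding statement would look something like:
#   UPDATE STATISTICS MEDIUM FOR TABLE customer SAMPLING SIZE 0.25;
```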

More information when using the SET EXPLAIN instruction

Although this is not specific to the UPDATE STATISTICS statement, the SET EXPLAIN statement was also improved. In Cheetah you can define the pathname of the explain output file (sqexplain.out by default). This is achieved with the statement

SET EXPLAIN FILE TO "/your/favorite/path/with/fs/permissions"
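
As a sketch of how this might be used from the command line, the function below prints a short SQL script that sends the plan of a single query to a chosen file. The name explain_script, the sample query and the stores_demo database are all assumptions for the example. It also relies on SET EXPLAIN FILE TO switching the explain output on by itself, which is how I understand the statement to behave.

```shell
# Hypothetical helper: print a SQL script that writes the plan of one query
# to a chosen explain file and then turns explain off again.
#   $1 = explain output path (must be writable by the session's user)
#   $2 = the query, without a trailing semicolon
explain_script() {
    printf 'SET EXPLAIN FILE TO "%s";\n%s;\nSET EXPLAIN OFF;\n' "$1" "$2"
}

# Typical use, assuming a database called stores_demo:
#   explain_script /tmp/q1.out "SELECT * FROM customer WHERE customer_num = 101" \
#       | dbaccess stores_demo -
```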

Another improvement concerns a statement that already existed in v10 but was undocumented:

SET EXPLAIN STATISTICS

This will give you much more information about the engine's behavior when solving your queries. You can turn off this feature with an ONCONFIG parameter called EXPLAIN_STAT. If set to 0, it disables the enhanced EXPLAIN output. Like more and more ONCONFIG parameters, it can be changed dynamically, either in memory only (-wm) or in memory and in the ONCONFIG file (-wf):

onmode -wm EXPLAIN_STAT=1

onmode -wf EXPLAIN_STAT=1

This also has implications for the onmode -Y command, which takes the session ID followed by one of three values:

  • onmode -Y SESSION_ID 0
    disables EXPLAIN for session SESSION_ID
  • onmode -Y SESSION_ID 1
    enables EXPLAIN with STATISTICS for session SESSION_ID
  • onmode -Y SESSION_ID 2
    enables EXPLAIN without STATISTICS for session SESSION_ID

One final note: although I've written above that the automatic statistics are gathered for CREATE INDEX and temporary tables, in reality more statements benefit from this:

CREATE INDEX ...

ALTER FRAGMENT FOR TABLE ... INIT

ALTER FRAGMENT FOR INDEX ... INIT

ALTER TABLE ADD UNIQUE CONSTRAINT

ALTER TABLE ADD PRIMARY KEY...

ALTER TABLE ADD FOREIGN KEY...

That's all for now. But this is an area that will probably get new features in the future.

I'd like to end with two references for utilities for handling update statistics.
Art S. Kagel, a distinguished participant in the Informix community, has developed an extremely useful utility called dostats. It will probably handle everything you want to do related to Informix statistics and works with versions 5 (Online), 6 (does it exist?!), 7, 9, 10 and 11. He recently updated it to handle the SAMPLING SIZE feature of IDS 11.10.

You can get it on IIUG's software archive in a package called utils2_ak. It's written in ESQL/C, so you'll have to compile it for your platform using CSDK or c4gl (7.20+).

I have also made two SHELL scripts (dbs_updstats and tbl_updstats) that handle statistics for whole databases and for single tables. These scripts are based on older scripts that were made publicly available and modified by the Informix technical team in Portugal (namely António Lima, myself and possibly others). The current versions are very different, but I based my work on their efforts, so a great "thanks!" is due.

These scripts don't have all the functionality of Art Kagel's dostats, but being simple SHELL scripts makes them easy to pass along and maintain. The current versions have a feature you won't find in the current dostats. I called it a "quick low mode", and it allows the LOW mode to collect only systables data. This avoids the index cleaning that LOW mode does, which can take a very long time. Be aware that although this mode runs almost instantly, its speed is achieved by not doing the hardest part of the work.

If you're interested, you can find these scripts at my homepage.