Scaling FrontBase

Scaling WebObjects
Example Apps
Step 0
Step 1
Step 2
Step 3
Step 4
Step 5
Step 6
Step 7
Step 8
Summary
Postscript

by Geert B. Clemmensen, President, Frontline Software

In the very interesting article "How much does WebObjects cost to deploy", Pierce T. Wetter III, Twin Forces, Inc. starts out by deploying with FrontBase (G3) and then later on switching to Oracle on Solaris. In many cases this makes sense and is of course a recommendable strategy should the performance requirements necessitate it. It is however, amazing how much you can get out of FrontBase running on a G3. Maybe the upgrade to e.g. Oracle could be postponed or even avoided. As the figures at the bottom of this note shows, FrontBase running on a 300 MHz G3 is able to sustain up to 30,000 SELECTs per minute!

In Pierce's article, he deliberately ignored upgrading the database machine except for occasionally adding more RAM, and upgrading the entire machine to a Solaris machine. This article talks about ways to upgrade your database performance, without necessarily upgrading your RAM or your machine.

Why should you tune your database server?

Database server performance is dependent on three things: CPU, RAM, and disk subsystem speed.

When your database is small, i.e. up to a hundred thousand rows per table, the performance is mainly dependent on the CPU speed: how fast can the server move data around. As your database grows larger, less and less of it fits in memory. Many databases are too large to fit in RAM at all times. Caches, caching strategies, and caching options are then what makes a database server tick. Eventually, your database becomes many times larger than the size of your RAM. At that point, your speed becomes most dependent on your disk speed: How fast can the server get to the data.

So often, rather than upgrading your machine, you upgrade your RAM capacity, and you upgrade your disk subsystem. You also spend time giving the database server hints on how to cache your data, so it can use the available RAM more effectively.

What to do and how to speed up your database

Most of the SELECTs that comes out of a WebObjects application typically include values for a PRIMARY KEY or an alternate key (UNIQUE), i.e. a database server can make use of indexes and thus lookup the result very fast. The bottleneck then quickly becomes the disk subsystem and/or too small caches.

FrontBase offers an elaborate caching strategy with two major components "accessible" and tunable by the database developer:

Table caching
Raw Device Driver

Improving performance using Table Caching

Table caching allows the developer or administrator to adjust how many of the rows of a given table that should be cached:

Min. row count, the minimum number of rows to be cached
Max. row count, the maximum number or rows to be cached
Percentage, the percentage of the total row count to be cached
Persistent, keep the cache across transactions
Preload, cache the table the first time the table is referenced.

You may wonder why the Persistent setting is available, i.e. why isn't it just always set? In some scenarios you may want to control the actual memory usage, while allowing certain bursts of memory usage to occur. An example is report generation, where each transaction can span many SELECTs and probably reference some tables that normally aren't in use that much. If Persistent is set to OFF for such tables, the cache will get loaded along the way and then flushed when the transaction is COMMITted.

The crucial part of caches are of course that the integrity offered by transactions is maintained, i.e. if data in the cache is updated, the same data is also written to the disk upon executing a COMMIT.

With today's fast computers, most performance issues in a database server are related to how fast you can get data off the hard disk. One way to increase the performance in this area, is to bypass the host OS filesystem. In FrontBase this is done through the deployment of the Raw Device Driver (RDD) module. RDD even allows you to specify a raw partition to be used as datastore. Additionally, RDD allows you to specify the size of its combined write-through and read cache.

When should you should bother with table caching and RDD?

This is pretty simple to detect: If hitting the database server continuously with SELECTs, makes the CPU utilization percentage drop significantly, you'd want to tune the caching mechanisms (the database server is waiting for the disk subsystem to deliver the data). This may mean that you need to add more RAM to your machine, but then so be it.

When would you use table caching? Table caching is typically used on smaller "feeder tables" which can be cached 50% to 100% without an explosion in memory usage.

When would you use RDD? RDD is typically used in combination with table caching such that smaller tables are cached 100%, while larger tables are cached via the RRD cache. An example is an indexing solution. You would typically have two smaller tables: WORDS and DOCUMENTS, with the third larger table being the relation table: HIT. In FrontBase you would thus cache WORDS and DOCUMENTS 100%, while the HIT table is managed through the RDD (maybe with 100-300 MB of cache memory).

Both table caching and RDD can be managed from the FBDatatabaseManager application that is included in each FrontBase distribution.

Back to the timing results

Back to the timing results. I took an existing indexing database with approx. 300,000 rows in the WORDS table (the HIT table has close to 15 million rows). The WORDS table was cached 100%. My hardware was a 300 MHz PowerBook with 192 MB of RAM. A typical SELECT looks like:

SELECT * FROM WORDS WHERE WORD = '<some word>';

This SELECT was timed to 2ms, which is measured from when the server receives the SQL statement and until the result is written back to the client. If the client is running on another computer, this translates into approx. 30,000 SELECTs per minute. Since the WORDS table is cached 100% there is no disk access (once the rows are loaded into the cache), i.e. the timing is determined only by the CPU speed. On a 400 MHz G3 the timing result will show a 30% gain in performance.

Allowing for a bit more complex SELECT statements and other activities on the database server computer, FrontBase should easily yield the 6,000 "hits" per minute, Pierce T. Wetter's article uses as reference.

1300 S. Milton Rd, Suite 206, Flagstaff, AZ. (520) 779-4227, tf@twinforces.com
Comments to pierce@twinforces.com