You are currently browsing the monthly archive for October 2007.

We’re having an Indian Summer in Oregon this year. The trees are alive with vibrant reds and bright yellows against a backdrop of Douglas Fir green and a bright blue sky. Balmy, clear weather in Oregon, in October – Fantastic! As the sun climbs up over the Cascades and lights up the maples in the parking lot, I’m thinking back on the fact that I spent the entire summer working on the tools for GLASS and it looks like I’ll be spending my Indian Summer doing the same thing.

While I was playing with GemStone/S and Seaside, Isaiah Perumalla spent some time porting Magritte to GemStone/S. Isaiah (with some help from Lukas Renggli) got to the point where all but a few of the tests were passing. Some GemStone-specific work was needed and I’ve stepped in to finish off the port, but the bulk of the work was done by Isaiah and Lukas.

In the midst of working through the remaining Magritte tests, I decided that I would get breakpoints working in the OmniBrowser tools. I’ve got them working now, but I now have to add a Breakpoint Browser to the list of tools that need to be implemented.

Also, Otto Behrens and Liliana Ivan from Finworks found time to extend Shout to do syntax highlighting for GemStone/S Smalltalk, which is very nice.

If you have 64 bit hardware and the time to explore GLASS right now, visit our website and drop us an email to let us know that you’re ready to rumba.

Meanwhile, I’m going heads down on the tools.

Philippe Marschall has announced the final version of Seaside 2.8. You can find the final GemStone/S version on GemSource (Seaside2.8g1-dkh.522.mcz). Lukas Renggli has additional information.

Many of you have heard that there are GemStone/S applications running in production with thousands of vms and that other applications are doing thousands of commits per second. I bet you have been wondering what kind of performance you could get running Seaside on GemStone/S – I certainly was:)

Seriously, I spent most of the summer working on tools and with the exception of a few small tests, I didn’t take the time to focus on performance. When I came back from ESUG, I decided to run some scaling tests.

[For those of you who want to see the results and move on with your surfing, here are a couple of links: Goals, Results, Conclusions].

Background

From the tests that I had run over the summer it appeared that Apache was a limiting factor when trying to run at rates above 30 requests per second. I’d also seen some anomalies in the i/o on Linux, where we’d get flat spots in our performance graphs that appeared to be related to file system buffer flushing. If you have read the post on transactions, then you know that we write tranlog records on every commit (page request), so disk i/o can be a limiting factor.

With the help of our IS guy, we found that to get the best performance from Apache we needed to use the MPM worker module. With the MPM worker module turned on, Apache dropped off the radar as a bottleneck.

The issue with the i/o anomalies that we observed in Linux has not been as easy to resolve. I spent some time tuning GemStone/S to make sure that GemStone/S wasn’t the source of the anomaly. Finally our IS guy was able to reproduce the anomaly and he ran into a few other folks on the net that have observed similar anomalies.

At this writing we haven’t found a solution to the anomaly, but we are pretty optimistic that it is resolvable. We’ve seen different versions of Linux running on similar hardware that don’t show the anomaly, so it is either a function of the kernel version or the settings of some of the kernel parameters. As soon as we figure it out we’ll let you know.

For the purposes of these performance tests, I was able to work around the i/o anomaly by putting extents on raw partitions. In nearly all of the tests, the Shared Page Cache (SPC) is sized large enough to hold the entire working set for the test. Consequently there was very little read activity and the system was able to write dirty pages from the SPC fast enough so that random i/o to the raw extent partitions didn’t affect test results.

Goals

I had three goals in mind when I ran this set of tests.

  1. demonstrate anticipated performance of the GLASS appliance.
  2. demonstrate production performance for the Web Edition.
  3. demonstrate performance potential beyond the Web Edition.

For the GLASS appliance tests I wanted to illustrate what you could expect if you installed the VMWare image on a machine without paying particular attention to disk configurations. To simulate the GLASS appliance, I simply ran the tests using file-based tranlogs.

For the production-scale performance of the Web Edition, I wanted to illustrate what you could expect if you paid attention to the disk configuration (i.e., created some raw partitions and had a box with a minimum of 4 disk spindles).

I also wanted to illustrate what kind of performance improvements you could see if you were to add additional SPC or increase the number of CPUs available.

Finally, based on a comment where Ramon Leon suggested I include some ‘speed comparisons between GemStone and Squeak’, I’ve included runs against a Squeak vm.

Test Strategy

I decided to base the performance tests on the Seaside Counter example. Since the Counter has dead-simple render logic and no significant application state, it is the perfect application for measuring baseline Seaside performance. For GemStone that means that when running tests against the Counter, we’ll be getting performance numbers that include the overhead of persisting and sharing session state (about 250 objects or 50k bytes per request).

Over the course of the summer I ran across siege, “an http regression testing and benchmarking utility” that is very easy to use. It can smack the heck out of a Web Application without putting too much of a load on the system running the test. It also provides some very basic stats about how your app withstood the barrage. Response Time, Transaction rate, and Concurrency being the most interesting.

Siege basically arranges to fire a number of concurrent requests at a given URL. In benchmark mode, as soon as a response is received another request is launched in its place. In this mode you can force an application to its knees – very nice for finding bottlenecks. In internet mode, siege waits a random amount of time before firing off the follow-on request, making for a simulation of what the end users of your application may experience.

I ran my tests using the URL ‘http://penny:8000/seaside/examples/counter’, which, as many of you Seasiders out there will recognize, means that a new Seaside session is created on each hit. For benchmarking purposes, that’s just fine – a little extra load never hurts. In the future, I plan to run some scaling tests that measure intra-session performance.

For quite a while now, I have felt that it was important for folks to plan on running multiple vms when they go into production using GLASS. For the best overall response times, one should plan on running at least one vm per concurrent request, which in practice should be 10 or more. For the benchmark tests, the Counter application is cpu bound, so when siege goes about slamming the web server into the ground the cpus are pegged and cpu contention between the processes becomes a factor. It turns out that with 10 vms running on a single cpu, there is a whole lot of contention going on. So after playing around a bit, I settled on using 5 vms for the benchmark tests while running with 10 concurrent requests. This combo gave good performance numbers in the single core tests while in the 4 core test when all 4 cpus were redlined we would minimize the amount of contention. I also ran a couple of internet tests using a single CPU and 20 vms to confirm that in the wild you could afford to run with more than 5 vms without suffering a performance hit. Finally I ran a benchmark test with 100 concurrent requests to see how the system behaved when it was being slashdotted.

Test Setup

For the hardware I used 3 machines: foos, toronto, and penny.

Penny is a 2.6Ghz AMD Opteron, with 2 dual core cpus, running SUSE Enterprise 10, with 8Gb of ram. Penny was used to host Apache and Siege. We had a dedicated 1Gbs ethernet connection between penny and toronto.

Foos is a 2.4Ghz Intel, with 1 dual core cpu, running SUSE Enterprise 10, with 2Gb of ram and 2 disk drives (no raw partitions). Foos was used to simulate the GLASS appliance performance.

Toronto is a 2.2Ghz AMD Opteron with 2 dual core cpus, running SUSE Enterprise 10, with 8Gb of ram and 5 disk drives (raw partitions installed on 3 of the partitions). Toronto was used to simulate a typical production machine.

Siege was pointed at the Apache instance listening on port 8000. In addition to the mpm_worker_module, mod_proxy_balancer was used to round-robin requests to the various vms. GemStone was running with version Seaside2.8g1-dkh.490 of Seaside.

For the Squeak tests I used the latest development image from Damien Cassou (sq3.9-7067dev07.10.1), the 3.9 vm and loaded Seaside2.8a1-lr.492. I pointed siege directly at the Squeak image (both running on Toronto).

I did not enforce exclusive use on any of the machines (penny and toronto are shared by other folks in the company) during the tests. But between running most of the tests multiple times and keeping an eye out for anomalous events the numbers are good enough for government work.

Summary of Results

In all, I ran 15 different tests in 6 different categories (Squeak baseline, Squeak internet, Squeak benchmark, GemStone Web Edition, GemStone internet, and GemStone benchmark).

The following table is sorted by Req/Sec. Click on a Run number to jump to the category and a description of the test.

Run  Req/Sec  Core  Gem  VM  std  Siege      Machine  Notes
1    10       1     1    S   n/a  -b -c 10   Toronto  1.0 ART
2    15       1     5    G   5    -b -c 10   Foos     file-based, 1G SPC
3    16       1     1    S   n/a  -i         Toronto  0.3 ART
4    25       2     5    G   3    -b -c 10   Foos     file-based, 1G SPC
5    28       1     20   G   5    -i         Toronto  raw, 1G SPC
6    28       2     20   G   5    -i         Toronto  raw, 5G SPC
7    29       1     1    G   6    -i         Toronto  0.02 ART, raw, 1G SPC
8    32       1     1    S   n/a  -b -c 1    Toronto  0.03 ART
9    50       1     5    G   10   -b -c 10   Toronto  raw, 1G SPC
10   75       1     5    G   13   -b -c 10   Toronto  0.1 ART, raw, 5G SPC
11   87       1     5    G   6    -b -c 100  Toronto  1.3 ART, raw, 5G SPC
12   91       1     1    G   6    -b -c 10   Toronto  0.1 ART, raw, 1G SPC
13   140      2     5    G   20   -b -c 10   Toronto  raw, 5G SPC
14   185      3     5    G   37   -b -c 10   Toronto  raw, 5G SPC
15   230      4     5    G   40   -b -c 10   Toronto  raw, 5G SPC

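ART in the Notes column is shorthand for Average Response Time (in seconds), a stat from siege. Gem is the number of vms used in the run, VM is S for Squeak or G for GemStone, and std is the standard deviation of the request rate (siege doesn’t report it, so it is only filled in for the GemStone runs).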

Squeak Results

Run1, Run3, and Run8 are scaling tests against the Squeak image. Run7 and Run12 are comparable tests run against a GemStone vm. Note that there is only 1 GemStone vm being used in these tests.

Run  Req/Sec  Core  Gem  VM  std  Siege      Machine  Notes
1    10       1     1    S   n/a  -b -c 10   Toronto  1.0 ART
3    16       1     1    S   n/a  -i         Toronto  0.3 ART
7    29       1     1    G   6    -i         Toronto  0.02 ART, 1G SPC
8    32       1     1    S   n/a  -b -c 1    Toronto  0.03 ART
12   91       1     1    G   6    -b -c 10   Toronto  0.1 ART, raw, 1G SPC

Baseline

Run8 showed the best results for Squeak with a rate of 32 requests/second when hit with a siege from a single user.

Internet tests

For the internet test (Run3), the Squeak vm hit 16 requests/second (0.34 seconds average response time and an average of 6 concurrent requests). In a comparable test (Run7), the GemStone vm hit 29 requests/second (0.02 seconds average response time and an average of 0.6 concurrent requests).

Benchmark tests

When 10 concurrent users slammed the Squeak vm (Run1), performance dropped to 10 requests/second. Siege doesn’t collect stats on the standard deviation, but I observed a wide range of response times around the 1 second average response time. In the comparable GemStone test (Run12), the GemStone vm hit 91 requests/second, with a standard deviation of 6 and an average response time of 0.1 seconds.

Under load it appears that GemStone is about 10 times faster processing Seaside requests than Squeak (Run1 compared to Run12). While the GemStone vm is certainly faster than the Squeak vm, I don’t think that the GemStone vm is that much faster. I haven’t tried to analyze what might be going on, but my best guess is that under load, the Squeak garbage collector is siphoning cpu cycles away from the processing of requests.

GemStone Results

Web Edition tests

Run2, Run4, and Run9 are intended to illustrate the kind of performance you might expect when running a version of the Web Edition (i.e., 1 core, 1G SPC, and a 4G extent).

Run  Req/Sec  Core  Gem  VM  std  Siege      Machine  Notes
2    15       1     5    G   5    -b -c 10   Foos     file-based, 1G SPC
4    25       2     5    G   3    -b -c 10   Foos     file-based, 1G SPC
9    50       1     5    G   10   -b -c 10   Toronto  raw, 1G SPC

Run2 and Run4 used file-based tranlogs and extents and Run9 used raw tranlogs. You can see that Run9 is over 3 times faster than Run2. Using raw I/O for tranlogs makes a big difference.

Even with 2 cores, Run4 is still slower than Run9, confirming that with file-based tranlogs the test is i/o bound.

A sustained rate of 15 requests/second (24×7) is about the top rate that we’d recommend when using the Web Edition. In Run2 the disk-based garbage collector kept pace with the expiration of session state consuming roughly 1/5 of the 4G repository before it was garbage collected. During Run10 (at a rate of 75 requests/second) a 4G extent was consumed in 15 minutes!
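As a rough cross-check of those Run10 numbers, here is a back-of-the-envelope calculation of how much repository space each request consumed. It is only a sketch: it assumes that "4G" means 4 GiB and that the extent filled at a steady rate, and the variable names are mine.

```python
# Back-of-the-envelope estimate of repository growth per request in Run10.
# Assumptions (not taken from the test logs): "4G" means 4 GiB and the
# extent filled at a steady rate for the whole 15 minutes.
EXTENT_BYTES = 4 * 1024**3      # 4G extent
RATE_PER_SEC = 75               # Run10 request rate
ELAPSED_SEC = 15 * 60           # extent consumed in 15 minutes

requests_served = RATE_PER_SEC * ELAPSED_SEC
bytes_per_request = EXTENT_BYTES / requests_served

print(f"requests served: {requests_served}")
print(f"approx. bytes per request: {bytes_per_request:,.0f}")
```

That works out to a bit under 64k bytes per request, which is in the same ballpark as the roughly 50k bytes of session state per request mentioned in the Test Strategy section.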

Internet tests

Run5 and Run6 illustrate what you might expect for production performance with real world loads.

Run  Req/Sec  Core  Gem  VM  std  Siege      Machine  Notes
5    28       1     20   G   5    -i         Toronto  raw, 1G SPC
6    28       2     20   G   5    -i         Toronto  raw, 5G SPC

The difference between Run5 and Run6 is that I used 2 cores and a 5G SPC in Run6. It is clear that neither a larger SPC nor more cores are needed to sustain this rate.

These tests were run using 20 vms. In GemStone/S a vm handles a single http request at a time (each http request coming into the Seaside vm acquires the transaction mutex for the duration of a request), so in order to get concurrent handling of requests you need to have multiple vms running. A quick look at Run7, which runs at the same rate with a single vm, shows that you don’t sacrifice performance by spreading the load across 20 vms, while gaining the ability to handle up to 20 requests concurrently.

Benchmark tests

Run10, Run11, Run13, Run14, and Run15 are intended to illustrate how GemStone/S stands up to siege in benchmark mode and to illustrate the scaling performance as you add cores into the mix.

Run  Req/Sec  Core  Gem  VM  std  Siege      Machine  Notes
10   75       1     5    G   13   -b -c 10   Toronto  0.1 ART, raw, 5G SPC
11   87       1     5    G   6    -b -c 100  Toronto  1.3 ART, raw, 5G SPC
13   140      2     5    G   20   -b -c 10   Toronto  raw, 5G SPC
14   185      3     5    G   37   -b -c 10   Toronto  raw, 5G SPC
15   230      4     5    G   40   -b -c 10   Toronto  raw, 5G SPC

Run10, Run13, Run14, and Run15 show a nearly linear progression in performance as more cores are added.
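To put a number on "nearly linear", the following sketch computes the speedup over the single core run and the per-core efficiency, using the Req/Sec figures from Run10, Run13, Run14, and Run15. The helper names are mine and the calculation is just the usual speedup/efficiency arithmetic.

```python
# Req/Sec for the benchmark runs, keyed by number of cores
# (Run10, Run13, Run14, and Run15 from the results table).
rates = {1: 75, 2: 140, 3: 185, 4: 230}

def speedup(cores):
    """Request rate relative to the single core run."""
    return rates[cores] / rates[1]

def efficiency(cores):
    """Speedup divided by the number of cores (1.0 would be perfectly linear)."""
    return speedup(cores) / cores

for cores in sorted(rates):
    print(f"{cores} core(s): {rates[cores]} req/s, "
          f"speedup {speedup(cores):.2f}x, efficiency {efficiency(cores):.0%}")
```

By this measure the 4 core run delivers a little over 3 times the single core rate, or roughly 77% efficiency per core.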

With Run11 I cranked siege up to 100 concurrent requests to see what would happen. The fact that Run11 averaged a little more than Run10, even though they are running on the same configuration, means to me that the cpu wasn’t running flat out in Run10. With 100 concurrent requests, though, the cpu was truly hammered. If you look at the Average Response Time for the two runs (0.1 versus 1.3 seconds), you’ll see that with 10 times more concurrent requests, it took roughly 10 times longer to respond to a request, which makes sense since the extra concurrent requests ended up being queued up behind the 5 vms.
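One way to sanity check the queueing explanation is Little's law, a general queueing result which says that the average number of requests in flight equals the request rate multiplied by the average response time. The sketch below applies it to the rates and Average Response Times reported in this post; the function name is mine, and the response times are the rounded values from the text and tables, so the results are approximate.

```python
# Little's law: average concurrency = throughput * average response time.
# Each entry is (run label, requests/second, average response time in seconds),
# taken from the figures reported for those runs.
runs = [
    ("Run3  (Squeak, -i)", 16, 0.34),
    ("Run7  (GemStone, -i)", 29, 0.02),
    ("Run10 (GemStone, -b -c 10)", 75, 0.1),
    ("Run11 (GemStone, -b -c 100)", 87, 1.3),
]

def implied_concurrency(rate, response_time):
    """Average number of requests in flight implied by Little's law."""
    return rate * response_time

for label, rate, art in runs:
    print(f"{label}: ~{implied_concurrency(rate, art):.1f} requests in flight")
```

Run3 and Run7 come out near 5.4 and 0.6, in line with the average concurrency of 6 and 0.6 quoted for the internet tests. Run10 and Run11 land around 7.5 and 113, reasonably close to the 10 and 100 concurrent requests siege was asked to maintain given how coarsely the response times are rounded.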

At rates approaching 200 requests/second, I started seeing indications that the data structure used to store session state (RcKeyValueDictionary) was reaching the limit of its effectiveness. As we move forward I will be looking at data structures that are better suited to these rates.

Conclusions

You can expect very reasonable performance numbers with the Web Edition – pretty good value for your money:). Along with transparent persistence, the Web Edition will support rates up to 50 requests/second across a large number of vms. You will have to keep an eye on the size of your repository, if your sustained rate starts averaging near 15 requests/second. But hey, you can serve an awful lot of donuts at these rates.

If you start getting more traffic, it looks like GemStone/S can be scaled to some pretty respectable rates without having to change your application.

Check out a podcast that Monty Williams, James Foster and I had with the folks from Cincom (Michael, Arden, Dave Buck and James Robertson) where we talked about GLASS and Seaside.

This afternoon Mike Culbertson (our IS guy) and I got a version of GemSource running with lighttpd, with no changes required on the GemStone side (way to go James). The interesting thing here is that lighttpd comes with FastCGI built-in (no compiling as required for mod_fastcgi in Apache) and lighttpd automatically load balances the FastCGI requests.

Here’s the config file we used (no load balancing needed with GemSource at the moment). ss is the GemSource application and config is the Configuration Editor (requiring authorization):

server.modules   += ( "mod_fastcgi" )
fastcgi.server    = (
  # ss: the GemSource application
  "/ss" =>
    ((
      "host" => "10.80.250.190",
      "port" => 9765,
      "check-local" => "disable",
      "mode" => "responder",
    )),
  # config: the Configuration Editor (requires authorization)
  "/config" =>
    ((
      "host" => "10.80.250.190",
      "port" => 9765,
      "check-local" => "disable",
      "docroot" => "/htdocs",
      "mode" => "authorizer",
    ))
)

Another webserver option for GLASS. Does that mean we’ll have to call it GLlSS, or maybe GLLASS?

With this post I am starting a series of articles on some of the fundamental concepts that you should have at least a passing familiarity with if you are planning on working with GemStone/S. These articles aren’t intended to replace our most excellent documentation, but to serve as a tantalizing introduction to the wonderful delicacies one can enjoy if he or she decides to crack the shell of a fresh SAG(pdf), open a case of Topaz(pdf) and sit by a cheery fire some chilly evening this autumn.

First on the menu is the transaction. GemStone/S transactions are light on the nose and have a warm, smokey flavor with echoes of strawberries and glazed donuts that reverberates off the palate with the force of a Dandelion archene….ahem…one shouldn’t write a blog on an empty stomach.

GemStone/S is a full-fledged database complete with ACID properties. Not ACID as in Lucy in the Sky with Diamonds, but ACID as in a guarantee that the Smalltalk objects in your transaction have been completely and correctly written to disk: Atomicity, Consistency, Isolation, and Durability.

In GemStone/S transactions are used for persistence, but they are also useful for sharing objects between vms, via the Shared Page Cache (SPC).

I will warn you that I’ll get a little geeky during the following discussion and it will probably help if you take a peek at the Glossary to familiarize yourself with some of the terms I’ll be using, but I promise to try to avoid dragging you too far down the rabbit hole.

Abort

All transactions in GemStone start with an abort. When an abort is performed the following steps are performed in a gem:

  1. Invalidate objects
  2. Acquire new view

Invalidate objects

All of the dirty objects (i.e., persistent objects previously modified) in the vm are marked as invalid. The list of dirty objects in a vm is called a writeSet. All objects that have been changed by other transactions since the last time the gem updated its view (a set called the writeSetUnion) are also marked as invalid.

A subsequent reference to an invalid object will cause a fresh version of the object to be copied into the vm.

Each persistent object is identified by a unique id called an OOP. An OOP is a 61 bit value. The three extra bits in a 64 bit word are used as tag bits. Tag bits differentiate between regular objects (which need to be physically stored in the database) and special objects. The value of a special object is encoded in the 61 bits of the OOP itself. SmallIntegers, SmallDoubles, and Characters are examples of special objects.….Was that a rabbit????
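For the curious, here is a minimal sketch of the tagged-word idea: a 64 bit word that carries a 61 bit value plus 3 tag bits. The specific tag values, the decision to put the tags in the low three bits, and the restriction to non-negative values are all assumptions I made for illustration; this is not the actual GemStone/S encoding.

```python
# Toy model of a 64-bit word holding a 61-bit value plus 3 tag bits.
# The tag values and the use of the low 3 bits are hypothetical, chosen
# only to illustrate the idea; they are not GemStone/S's real encoding.
TAG_BITS = 3
TAG_MASK = (1 << TAG_BITS) - 1
VALUE_BITS = 61

TAG_OBJECT = 0b000      # hypothetical: regular object, value is an id to look up
TAG_SMALL_INT = 0b010   # hypothetical: special object, value is the integer itself

def encode(tag, value):
    """Pack a tag and a non-negative 61-bit value into one 64-bit word."""
    if not 0 <= value < (1 << VALUE_BITS):
        raise ValueError("value does not fit in 61 bits")
    return (value << TAG_BITS) | tag

def decode(word):
    """Return (tag, value) for a word produced by encode()."""
    return word & TAG_MASK, word >> TAG_BITS

def is_special(word):
    # Special objects carry their value in the word itself, so nothing
    # needs to be stored in (or fetched from) the database for them.
    return (word & TAG_MASK) != TAG_OBJECT

word = encode(TAG_SMALL_INT, 42)
print(decode(word), is_special(word))
```

The point of the design is that testing a few bits is enough to tell whether a reference needs a trip to the database at all.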

One or more objects are stored on a data page. Objects that are larger than a page are transparently broken up into page-sized chunks. Because of this chunking it is possible to reference objects in a million element array without having to load the entire million element array into memory.
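To see why chunking helps, consider this small sketch that maps an element index in a large array to the one chunk that holds it. The number of slots per chunk is a made-up figure (I am not quoting the real page size here), and the function names are hypothetical.

```python
# Hypothetical figure: how many element slots fit in one page-sized chunk.
SLOTS_PER_CHUNK = 2000

def locate(index):
    """Return (chunk number, slot within chunk) for a zero-based element index."""
    return divmod(index, SLOTS_PER_CHUNK)

def chunks_needed(size):
    """Number of chunks required to hold `size` elements (ceiling division)."""
    return -(-size // SLOTS_PER_CHUNK)

# Touching a single element of a million element array involves one chunk,
# not all of them.
print(chunks_needed(1_000_000), locate(999_999))
```

With these made-up numbers, a million element array spans 500 chunks, and reading its last element only needs the final one.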

Objects are read from and written to extents on disk in units of pages. Not surprisingly, pages are cached in memory in the Shared Page Cache.

The Object Table (OT) is a Btree that maps OOPs to pages. The OT is stored in a set of pages, cached in the SPC and written to an extent just like the data pages.

Acquire new view

The gem contacts the stone and gets a new view of the database. A view is a reference to the latest OT along with some other bookkeeping information.
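The two abort steps can be sketched as a toy model like the one below. The class, its attributes, and the choice to only track invalidation for objects already cached in the vm are my own simplifications, not the real gem implementation.

```python
class Gem:
    """Toy gem: a local object cache plus the bookkeeping an abort touches."""

    def __init__(self, view):
        self.view = view          # identifier of the view this gem currently holds
        self.cache = {}           # oop -> object state copied into the vm
        self.write_set = set()    # oops modified locally since the last abort
        self.invalid = set()      # cached oops that must be re-read on next access

    def modify(self, oop, state):
        self.cache[oop] = state
        self.write_set.add(oop)

    def abort(self, latest_view, write_set_union):
        # Step 1: invalidate local dirty objects plus anything changed elsewhere.
        # Simplification: only objects already cached in this vm are tracked.
        stale = self.write_set | set(write_set_union)
        self.invalid |= stale & set(self.cache)
        self.write_set.clear()
        # Step 2: acquire the new view.
        self.view = latest_view

gem = Gem(view=1)
gem.cache = {10: "a", 11: "b", 12: "c"}
gem.modify(10, "a2")
gem.abort(latest_view=2, write_set_union={11, 99})
print(sorted(gem.invalid), gem.view)
```

Object 12 stays valid because nobody touched it, which is why an abort is cheap when little has changed.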

Transaction Body

As the vm executes code following an abort, it starts keeping track of all of the objects that are modified during the transaction in the writeSet.

Object references are stored in the body of an object as an OOP, so when an instance variable is accessed in a persistent object and the object is not in the vm or the object has been invalidated, the OT is consulted, and the SPC is checked to see if the page containing the object is already present. If the page isn’t in the SPC, then the page is loaded from disk. Finally, the object is copied from the page into the vm.….A white rabbit????
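The lookup path described above (vm, then OT, then SPC, then disk) can be sketched as follows. Plain dictionaries stand in for the OT and the SPC, and every class and method name here is hypothetical; the real structures are considerably more involved.

```python
class Disk:
    """Stand-in for an extent: page id -> {oop: object state}."""

    def __init__(self, pages):
        self.pages = pages
        self.reads = 0            # count page reads so the caching effect is visible

    def read_page(self, page_id):
        self.reads += 1
        return dict(self.pages[page_id])


class ObjectLoader:
    """Toy vm-side object cache that faults objects in on demand."""

    def __init__(self, object_table, spc, disk):
        self.object_table = object_table  # oop -> page id (stands in for the OT)
        self.spc = spc                    # page id -> page, shared between vms
        self.disk = disk
        self.local = {}                   # objects already copied into this vm
        self.invalid = set()              # oops invalidated by an abort

    def fetch(self, oop):
        if oop in self.local and oop not in self.invalid:
            return self.local[oop]                # already in the vm and valid
        page_id = self.object_table[oop]          # consult the OT
        if page_id not in self.spc:               # page not in the SPC yet
            self.spc[page_id] = self.disk.read_page(page_id)
        self.local[oop] = self.spc[page_id][oop]  # copy the object into the vm
        self.invalid.discard(oop)
        return self.local[oop]


disk = Disk({7: {101: "counter state"}})
shared_cache = {}
table = {101: 7}
vm1 = ObjectLoader(table, shared_cache, disk)
vm2 = ObjectLoader(table, shared_cache, disk)
print(vm1.fetch(101), vm2.fetch(101), disk.reads)
```

In this toy run the second vm finds the page already sitting in the shared cache, so the disk is only read once.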

Commit

When a commit is performed the following steps are performed in the gem:

  1. Flush dirty objects
  2. Write transaction log entries
  3. Acquire commit token
  4. Check for conflicts
  5. Finalize commit

Flush dirty objects

During this step all of the objects in the writeSet and all of the newly created objects that are reachable from persistent, dirty objects are copied from the vm into new pages in the SPC.

The gem’s copy of the OT is then updated with the new OOP to page mapping. The OT data structure is designed to do a copy on write, so only the portion of the Btree that is changed needs to be written to new OT pages. In practice large portions of the OT are shared amongst multiple views.….Nope, it’s a Dormouse. I’m sure of it.
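Here is a minimal sketch of the copy on write idea. To keep it short I use a plain binary search tree rather than a Btree, so this is an illustration of path copying in general and not the OT's actual layout; the names are mine.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Node:
    """Immutable tree node mapping one oop to the page that holds it."""
    oop: int
    page: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def put(node, oop, page):
    """Return a new root; only the nodes on the path to `oop` are copied."""
    if node is None:
        return Node(oop, page)
    if oop < node.oop:
        return Node(node.oop, node.page, put(node.left, oop, page), node.right)
    if oop > node.oop:
        return Node(node.oop, node.page, node.left, put(node.right, oop, page))
    return Node(oop, page, node.left, node.right)


def get(node, oop):
    """Look up the page for `oop` in the tree rooted at `node`."""
    while node is not None:
        if oop == node.oop:
            return node.page
        node = node.left if oop < node.oop else node.right
    raise KeyError(oop)


old_view = None
for oop, page in [(50, 1), (20, 1), (80, 2)]:
    old_view = put(old_view, oop, page)

new_view = put(old_view, 20, 9)   # object 20 now lives on a new page
print(get(old_view, 20), get(new_view, 20), new_view.right is old_view.right)
```

The old root still describes the old mapping, the new root describes the new one, and the untouched subtree is shared between the two, which is what lets several views coexist cheaply.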

The new pages in the SPC, containing the latest state of the modified objects are not written directly to disk as part of the transaction. Doing so would take too much time, as disk writes are notoriously slow. A separate process (AIO pageServer) asynchronously writes the ‘dirty’ data pages to disk.

At periodic intervals a concerted effort is made to ensure that all ‘dirty’ pages written before a certain point in time are flushed to disk. This is called a checkpoint.

Write transaction log entries

In order to ensure Durability we do have to write something to disk as part of the transaction. It turns out that we can write a minimum amount of information about the changes to objects much faster than we can write the entire object to disk.

During this step, tranlog records are written by the stone for all of the changed and new objects. Asynchronous i/o is used (when available on the host os) to write transaction logs, so that commit processing can go on while the tranlog records make their way to disk. In a performance sensitive installation, the transaction logs are located on a raw partition or on optimized disk arrays for the fastest i/o possible.….Oh Oh, now there’s a wacky, little guy in a top hat.

In the event of a system crash, one can recover the database by replaying all tranlog records written since the last checkpoint.
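Recovery by replay can be sketched in a few lines. The record format below (a sequence number paired with a dictionary of changed objects) is invented for the example and is not the real tranlog format.

```python
def recover(checkpoint_state, tranlog, checkpoint_seq):
    """Rebuild object state by replaying tranlog records newer than the checkpoint.

    checkpoint_state: {oop: state} as of the last checkpoint
    tranlog: list of (sequence number, {oop: new state}) in commit order
    checkpoint_seq: sequence number of the last record covered by the checkpoint
    """
    state = dict(checkpoint_state)
    for seq, changes in tranlog:
        if seq > checkpoint_seq:
            state.update(changes)
    return state


on_disk = {1: "a0", 2: "b0"}                  # what the checkpoint captured
log = [
    (1, {1: "a0"}),                           # already covered by the checkpoint
    (2, {2: "b1"}),                           # committed after the checkpoint
    (3, {3: "c0", 1: "a1"}),
]
print(recover(on_disk, log, checkpoint_seq=1))
```

Records at or before the checkpoint are skipped because their effects are already on disk; everything after it is reapplied in commit order.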

Acquire commit token

Up to this point commit processing in multiple gems can occur in parallel, but in this final phase of the commit, only one gem can proceed at a time. The stone manages the queue of gems by handing out a commit token to one gem at a time.

Check for conflicts

When a gem gets the commit token, it begins checking for commit conflicts (i.e., validating the transaction). It does this by comparing the gem’s current writeSet with the writeSetUnion (the union of all writeSets from the transactions that occurred since the gem acquired its original view); if any OOP is in both sets, a conflict has occurred and the commit fails. If the commit fails, the gem gives up the commit token and either aborts or attempts to recover from the commit failure.….Little balls of ….. hedgehogs????
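The conflict check as described boils down to a set intersection. A minimal sketch, with hypothetical function names and plain integers standing in for OOPs:

```python
def find_conflicts(write_set, write_set_union):
    """Return the oops modified both by this gem and by transactions committed since its view."""
    return set(write_set) & set(write_set_union)


def can_commit(write_set, write_set_union):
    """True when there is no overlap, i.e. the commit may proceed."""
    return not find_conflicts(write_set, write_set_union)


mine = {101, 102}
others = {102, 200}            # union of writeSets committed since my view
print(can_commit(mine, others), find_conflicts(mine, others))
print(can_commit({101}, others))
```

The first check fails because object 102 was changed on both sides; the second succeeds because the sets are disjoint.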

Finalize commit

If there are no conflicts, then the gem returns the commit token to the stone along with a copy of its writeSet (for writeSetUnion processing in other gems) and a reference to the gem’s updated OT. The stone is then free to pass along the commit token, but the gem must still wait until the stone informs it that the asynchronous transaction log i/o has completed.

….a toothy smile appears floating in mid-air and the Cheshire Cat slowly materializes, hands you a precious stone and fades away completely….

Oh well, maybe I went farther down the rabbit hole than I originally planned, but I did warn you!

There’s an interesting story in Waters about Kapital(pdf) and The Value of Smalltalk: valuing and risk management in VisualWorks and GemStone(pdf), a financial risk management and pricing system that was implemented in the early 90’s using VisualWorks and GemStone/S at JPMorgan. The system has been in use for over 13 years at JPMorgan and will continue to be used ‘for the foreseeable future’, largely because the system was written in Smalltalk and is giving the developers at JPMorgan the ability to stay ahead of the competition.

As a Gemstoner, I like the following quote (emphasis mine):

A small PCS installation of a few hundred CPUs will remain to support the Smalltalk GemStone database and those processes that are closely linked to it. “This will maintain the flexibility that developers have found so valuable,” says Verdier. “It is the best of both worlds.”

Check it out.
