jcrawler is a good tool for load testing Seaside applications – in theory. Unfortunately, it takes a bit of work before jcrawler can actually be used to load test a Seaside application. So here’s my story…

The Story

In the last week I’ve started putting together some examples of persistence for GLASS (I’ll write a post about them when I’m happy with the examples). As I have mentioned before, you need different techniques to manage concurrency in GemStone/S: you’ll be serving web pages from multiple VMs, so Semaphores can’t be used. The examples illustrate several techniques for avoiding transaction conflicts, and as part of the exercise I needed a way to test for transaction conflicts.

To create a transaction conflict, you need two web requests hitting your web server at exactly the same time. I’ve used siege in the past for load testing, but siege pounds on a constant list of URLs – not too useful for banging on the arbitrary URLs buried in the depths of your Seaside application.
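To make the goal concrete, here’s a minimal Java sketch (my own illustration, not part of jcrawler) that uses a CountDownLatch to release two GET requests at the same instant. Note that against a constant URL like this one, each request just gets its own fresh session – which is exactly why a constant-URL tool can’t provoke the conflicts I’m after:

    import java.net.HttpURLConnection;
    import java.net.URL;
    import java.util.concurrent.CountDownLatch;

    // Fire two GET requests at (nearly) the same instant.
    public class ConflictProbe {
        public static void main(String[] args) {
            final String target = "http://172.16.172.138/seaside/examples/persistence/rcTally";
            final CountDownLatch gate = new CountDownLatch(1);
            Runnable hit = new Runnable() {
                public void run() {
                    try {
                        gate.await(); // both threads block here until the gate opens
                        HttpURLConnection c = (HttpURLConnection) new URL(target).openConnection();
                        System.out.println(Thread.currentThread().getName() + " -> " + c.getResponseCode());
                        c.disconnect();
                    } catch (Exception e) {
                        e.printStackTrace();
                    }
                }
            };
            new Thread(hit, "req-1").start();
            new Thread(hit, "req-2").start();
            gate.countDown(); // open the gate: both requests go out together
        }
    }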

To effectively beat on a Seaside application (especially if you want to expose concurrency bugs) you need a load tester that will crawl through your site, pick up the dynamically generated URLs and feed them back into the mix.
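In rough Java, the core of such a crawler is just a work queue that gets refilled from every page it fetches. This is a sketch of the idea, with my own regex and helper method, not jcrawler’s actual implementation:

    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import java.net.URL;
    import java.util.LinkedList;
    import java.util.Queue;
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    // Sketch of a crawling load tester: fetch a page, harvest its links,
    // and feed them back into the work queue.
    public class CrawlSketch {
        private static final Pattern HREF = Pattern.compile("href=\"([^\"]+)\"");

        public static void main(String[] args) throws Exception {
            Queue<String> work = new LinkedList<String>();
            work.add("http://172.16.172.138/seaside/examples/persistence/rcTally");
            while (!work.isEmpty()) {
                Matcher m = HREF.matcher(fetch(work.remove()));
                while (m.find()) {
                    String link = m.group(1);
                    // Note: href values come back HTML-encoded ("&" shows up as
                    // "&amp;"), a detail that turns out to matter below.
                    if (link.startsWith("http")) { // skip relative links in this sketch
                        work.add(link);
                    }
                }
            }
        }

        private static String fetch(String url) throws Exception {
            BufferedReader in = new BufferedReader(new InputStreamReader(new URL(url).openStream()));
            StringBuilder page = new StringBuilder();
            for (String line; (line = in.readLine()) != null;) page.append(line).append('\n');
            in.close();
            return page.toString();
        }
    }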

I knew that WAPT had been used by several folks for Seaside load testing, but I didn’t see site crawling mentioned in the feature list for WAPT. Besides that, I’m doing my work on Linux boxes, so a Windows-only tool would not be convenient.

Without trying too hard, I found a site that lists a ton of Web Test Tools, and near the top of the Load and Performance Test Tools section there was a listing for jcrawler:

An open-source stress-testing tool for web apps; includes crawling/exploratory features. User can give JCrawler a set of starting URLs and it will begin crawling from that point onwards, going through any URLs it can find on its way and generating load on the web application. Load parameters (hits/sec) are configurable via central XML file; fires up as many threads as needed to keep load constant; includes self-testing unit tests. Handles http redirects and cookies; platform independent.

Just the ticket, huh? Well, if it were that easy, I wouldn’t be writing a blog post, would I? Haha!

The Work

I grabbed the download from SourceForge and proceeded to build jcrawler.

You need ant, too. But that’s easily fixed.

The build completed and I was ready to slam my Seaside apps – and it had only been a couple of minutes! But run.sh failed:

Exception in thread "main" java.lang.UnsupportedClassVersionError: com/jcrawler/Main (Unsupported major.minor version 49.0)

It turns out that you must run jcrawler under JDK 5.0 – class file major version 49 is the Java 5 format, so my default JVM was too old. Another download and some monkey business with my environment variables:

export JAVA_HOME=/home/dhenrich/jdk1.5.0_14
export PATH=$JAVA_HOME/bin:$PATH

and we’re off to the races. I launched jcrawler against:

http://172.16.172.138/seaside/examples/persistence/rcTally

a variant of WACounter running on my copy of the appliance.

The Problem

Things appeared to be running okay. jcrawler was spinning away, dumping log entries like the following to stdout:

2568 [THREAD#93 CREATED 10:40:01::602] INFO com.jcrawler.UrlFetcher - Fetching URL http://172.16.172.138/seaside/examples/persistence/serial?_s=lKtxmydVdpOJAjpt&_k=ggcZGlQC&1

However, as I interactively poked at rcTally, I noticed that jcrawler wasn’t hitting the ++ or -- links, because the shared value was not getting updated.

After an excruciating amount of debugging, I noticed that the URLs extracted from the web page contained the sequence ‘&amp;’ instead of ‘&’…geez, it has been just about as hard to get WordPress to display the dang ‘&amp;’ string in my post (can’t use the rich editor) as it was to find the problem in jcrawler:

http://172.16.172.138/seaside/examples/persistence/serial?_s=lKtxmydVdpOJAjpt&amp;_k=ggcZGlQC&amp;1

The Fix

  1. Download the HTML Parser library from SourceForge.
  2. Copy the jars from the HTML Parser into jcrawler:

    cd /home/dhenrich/htmlparser1_5/lib
    cp *.jar /home/dhenrich/jcrawler/lib

  3. Edit the jcrawler source: insert the following line after line 120 of com/jcrawler/UrlFetcher.java (in the jcrawler src directory) to decode the HTML-encoded entities (see the sketch after this list for where it fits):

    content = org.htmlparser.util.Translate.decode(content);

  4. Add htmlparser.jar to the list of jars in run.sh (in the jcrawler misc directory).
  5. Rebuild jcrawler and you are off to the races!
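
For the curious, here’s roughly what that one line buys you. I haven’t reproduced jcrawler’s UrlFetcher here – this is a standalone sketch with my own sample markup – but it shows Translate.decode turning the HTML-encoded hrefs back into the URLs Seaside actually registered:

    import org.htmlparser.util.Translate;

    // Standalone demo of the fix: decode HTML entities in the fetched page
    // content *before* the URLs are extracted from it.
    public class DecodeDemo {
        public static void main(String[] args) {
            String content = "<a href=\"/seaside/examples/persistence/serial?"
                    + "_s=lKtxmydVdpOJAjpt&amp;_k=ggcZGlQC&amp;1\">++</a>";
            content = Translate.decode(content); // "&amp;" becomes "&"
            System.out.println(content);
            // prints: <a href="/seaside/examples/persistence/serial?_s=lKtxmydVdpOJAjpt&_k=ggcZGlQC&1">++</a>
        }
    }

Without the decode, the crawler fetches URLs containing a literal ‘&amp;’, so Seaside never sees the _s, _k, and callback parameters it generated, and the ++/-- callbacks never fire.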

The Payoff

At the end of the day, you’ve got yourself a version of jcrawler that can randomly poke around in the nooks and crannies of your Seaside application and give it a pretty thorough workout.

As I work on the GemStone examples, I’ll learn more about jcrawler’s quirks and features, but for now it does pretty much what I need.

If there’s another load tester out there that can crawl through a Seaside website, I’d appreciate hearing about it.