jcrawler is a good tool for load testing Seaside applications – in theory. Unfortunately, it takes a bit of work before jcrawler can actually be used to load test a Seaside application. So here’s my story…

The Story

In the last week I’ve started putting together some examples of persistence for GLASS (I’ll write a post about them when I’m happy with the examples). As I have mentioned before, you need different techniques to manage concurrency in GemStone/S: you’ll be serving web pages from multiple VMs, so Semaphores can’t be used. The examples illustrate several techniques for avoiding transaction conflicts, and as part of the exercise I needed a way to test for transaction conflicts.

To create a transaction conflict, you need two web requests hitting your web server at exactly the same time. I’ve used siege in the past for load testing, but siege pounds on a constant list of URLs – not too useful for banging on the arbitrary URLs buried in the depths of your Seaside application.
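To make the goal concrete, here’s a minimal Java sketch (my own illustration, not part of jcrawler) that uses a CountDownLatch to release two GET requests at the same instant. Note that against a constant URL like this one, each request just gets its own fresh session – which is exactly why a constant-URL tool can’t provoke the conflicts I’m after:

    import java.net.HttpURLConnection;
    import java.net.URL;
    import java.util.concurrent.CountDownLatch;

    // Fire two GET requests at (nearly) the same instant.
    public class ConflictProbe {
        public static void main(String[] args) {
            final String target = "http://172.16.172.138/seaside/examples/persistence/rcTally";
            final CountDownLatch gate = new CountDownLatch(1);
            Runnable hit = new Runnable() {
                public void run() {
                    try {
                        gate.await(); // both threads block here until the gate opens
                        HttpURLConnection c = (HttpURLConnection) new URL(target).openConnection();
                        System.out.println(Thread.currentThread().getName() + " -> " + c.getResponseCode());
                        c.disconnect();
                    } catch (Exception e) {
                        e.printStackTrace();
                    }
                }
            };
            new Thread(hit, "req-1").start();
            new Thread(hit, "req-2").start();
            gate.countDown(); // open the gate: both requests go out together
        }
    }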

To effectively beat on a Seaside application (especially if you want to expose concurrency bugs) you need a load tester that will crawl through your site, pick up the dynamically generated URLs and feed them back into the mix.
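In rough Java, the core of such a crawler is just a work queue that gets refilled from every page it fetches. This is a sketch of the idea, with my own regex and helper method, not jcrawler’s actual implementation:

    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import java.net.URL;
    import java.util.LinkedList;
    import java.util.Queue;
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    // Sketch of a crawling load tester: fetch a page, harvest its links,
    // and feed them back into the work queue.
    public class CrawlSketch {
        private static final Pattern HREF = Pattern.compile("href=\"([^\"]+)\"");

        public static void main(String[] args) throws Exception {
            Queue<String> work = new LinkedList<String>();
            work.add("http://172.16.172.138/seaside/examples/persistence/rcTally");
            while (!work.isEmpty()) {
                Matcher m = HREF.matcher(fetch(work.remove()));
                while (m.find()) {
                    String link = m.group(1);
                    // Note: href values come back HTML-encoded ("&" shows up as
                    // "&amp;"), a detail that turns out to matter below.
                    if (link.startsWith("http")) { // skip relative links in this sketch
                        work.add(link);
                    }
                }
            }
        }

        private static String fetch(String url) throws Exception {
            BufferedReader in = new BufferedReader(new InputStreamReader(new URL(url).openStream()));
            StringBuilder page = new StringBuilder();
            for (String line; (line = in.readLine()) != null;) page.append(line).append('\n');
            in.close();
            return page.toString();
        }
    }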

I knew that WAPT had been used by several folks for Seaside load testing, but I didn’t see site crawling mentioned in the feature list for WAPT. Besides that, I’m doing my work on Linux boxes, so a Windows-only tool would not be convenient.

Without trying too hard, I found a site that lists a ton of Web Test Tools, and near the top of the Load and Performance Test Tools section there was a listing for jcrawler:

An open-source stress-testing tool for web apps; includes crawling/exploratory features. User can give JCrawler a set of starting URLs and it will begin crawling from that point onwards, going through any URLs it can find on its way and generating load on the web application. Load parameters (hits/sec) are configurable via central XML file; fires up as many threads as needed to keep load constant; includes self-testing unit tests. Handles http redirects and cookies; platform independent.

Just the ticket, huh? Well, if it were that easy, I wouldn’t be writing a blog post, would I? Haha!

The Work

I grabbed the download from SourceForge and proceeded to build jcrawler.

You need ant, too. But that’s easily fixed.

The build completed and I was ready to slam my Seaside apps – and it had only been a couple of minutes! But run.sh failed:

Exception in thread "main" java.lang.UnsupportedClassVersionError: com/jcrawler/Main (Unsupported major.minor version 49.0)

It turns out that you must run jcrawler under JDK 5.0 – class file major version 49 is the Java 5 format, so my default JVM was too old. Another download and some monkey business with my environment variables:

export JAVA_HOME=/home/dhenrich/jdk1.5.0_14
export PATH=$JAVA_HOME/bin:$PATH

and we’re off to the races. I launched jcrawler against:

http://172.16.172.138/seaside/examples/persistence/rcTally

a variant of WACounter running on my copy of the appliance.

The Problem

Things appeared to be running okay. jcrawler was spinning away, dumping log entries like the following to stdout:

2568 [THREAD#93 CREATED 10:40:01::602] INFO com.jcrawler.UrlFetcher - Fetching URL http://172.16.172.138/seaside/examples/persistence/serial?_s=lKtxmydVdpOJAjpt&_k=ggcZGlQC&1

However, as I interactively poked at rcTally, I noticed that jcrawler wasn’t hitting the ++ or -- links, because the shared value was not getting updated.

After an excruciating amount of debugging, I noticed that the URLs extracted from the web page contained the sequence ‘&amp;’ instead of ‘&’…geez, it has been just about as hard to get WordPress to display the dang ‘&amp;’ string in my post (can’t use the rich editor) as it was to find the problem in jcrawler:

http://172.16.172.138/seaside/examples/persistence/serial?_s=lKtxmydVdpOJAjpt&amp;_k=ggcZGlQC&amp;1

The Fix

  1. Download the HTML Parser library from SourceForge.
  2. Copy the jars from the HTML Parser into jcrawler:

    cd /home/dhenrich/htmlparser1_5/lib
    cp *.jar /home/dhenrich/jcrawler/lib

  3. Edit the jcrawler source: insert the following line after line 120 of com/jcrawler/UrlFetcher.java (in the jcrawler src directory) to decode the HTML-encoded entities (see the sketch after this list for where it fits):

    content = org.htmlparser.util.Translate.decode(content);

  4. Add htmlparser.jar to the list of jars in run.sh (in the jcrawler misc directory).
  5. Rebuild jcrawler and you are off to the races!
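
For the curious, here’s roughly what that one line buys you. I haven’t reproduced jcrawler’s UrlFetcher here – this is a standalone sketch with my own sample markup – but it shows Translate.decode turning the HTML-encoded hrefs back into the URLs Seaside actually registered:

    import org.htmlparser.util.Translate;

    // Standalone demo of the fix: decode HTML entities in the fetched page
    // content *before* the URLs are extracted from it.
    public class DecodeDemo {
        public static void main(String[] args) {
            String content = "<a href=\"/seaside/examples/persistence/serial?"
                    + "_s=lKtxmydVdpOJAjpt&amp;_k=ggcZGlQC&amp;1\">++</a>";
            content = Translate.decode(content); // "&amp;" becomes "&"
            System.out.println(content);
            // prints: <a href="/seaside/examples/persistence/serial?_s=lKtxmydVdpOJAjpt&_k=ggcZGlQC&1">++</a>
        }
    }

Without the decode, the crawler fetches URLs containing a literal ‘&amp;’, so Seaside never sees the _s, _k, and callback parameters it generated, and the ++/-- callbacks never fire.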

The Payoff

At the end of the day, you’ve got yourself a version of jcrawler that can randomly poke around in the nooks and crannies of your Seaside application and give it a pretty thorough workout.

As I work on the GemStone examples, I’ll learn more about jcrawler’s quirks and features, but for now it does pretty much what I need.

If there’s another load tester out there that can crawl through a Seaside website, I’d appreciate hearing about it.