Last week Dario Trussardi submitted a bug report to the GLASS Beta mailing list (subscription required, but feel free to subscribe if you’re interested). The bug involved Magritte auto accessors – a feature whereby new instance variables (and accessor methods) are dynamically added to a class on demand. Dario was triggering the new instance variable creation during the rendering phase of a Seaside component and getting an Internal Server Error from Seaside:
InterpreterError 2403: Cannot commit, <‘a previous commit attempt failed with an error, this transaction must be aborted’>
This error doesn’t have the most intuitive explanation, and the stack dumped into the object log was rooted in the standard commit that occurs right before the HTTP response is returned to the browser – not much help.
I had tangled with the auto accessors before, in the Magritte unit tests, but Dario’s scenario wasn’t covered there. Dario provided a simple test case and I was off to the races. In the end I was able to fix the problem in Dario’s test case (at this writing I’m waiting for confirmation from Dario – when I get it I’ll publish a Beta Update), but I thought that it would be instructive to cover the issues involved in case anyone else wanders into this territory.
If you’re using Magritte auto accessors or are seeing 2403 errors or want to geek out on GemStone for a bit or have already settled down with a cup of coffee to “read another long post of mine” (yes I’m talking about you, Gera:), then by all means read on.
- #readUsing: and #writeObject:using:
- Magritte auto accessor and Seaside
#addInstVarName: is the instrument used by Magritte to dynamically add new instance variables to a class.
From a pure Smalltalk perspective nothing magical goes on: when you add an instance variable to a class, you expect that all instances of the class (and instances of all the subclasses) will be migrated to the new version of the class. In GemStone/S, this boils down to the following steps:
- Add an instance variable to a class and create a new version of the class.
- Collect the instances of the old version of the class (#allInstances).
- Migrate each instance to the new version of the class (#migrateInstances:to:). For each instance, this means:
  - Create an instance of the new version of the class.
  - Copy the instance variables in the old instance to the new instance (see Section 8.3 in the Programming Guide for the details – you can customize the instance variable mappings).
  - Use #become: to cause all references to the old instance to reference the new instance.
At first blush this doesn’t seem to be problematic; however, the devil is in the details.
GemStone/S is designed for dealing with huge repositories: a billion objects and more. Because #allInstances scans all objects in the repository, you can imagine that for a large repository we need to be as efficient as possible. For large repositories, the key to efficiency is to minimize disk I/O.
ObjectTable (OT) lookups for each object can be expensive (potentially multiple page faults per lookup). Data pages, however, can be read from disk in big chunks and processed in memory very quickly. For this technique to work, we have to know that all of the objects on a particular data page are valid (thus avoiding OT lookups). With the DPNSUnion we can tell which data pages can be scanned without an OT lookup. Finally, the DPNSUnion has to be calculated, and to do that we must get a fresh view of the OT, which requires a commit or abort.
If you attempt an #allInstances without doing an explicit commit or abort, you’ll get the following error message:
InterpreterError 2412: An attempt to execute a method that requires an abort or prevents further commits would result in lost data.
Since #allInstances requires an abort or a commit, the implication is that #addInstVarName: requires a commit or abort as well – this will become important later.
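To make the ordering concrete, here is a minimal sketch of the manual migration steps written as a helper method. The selector #migrateAllInstancesOf:to: and its argument names are hypothetical, chosen only for illustration. The sketch assumes both class versions are already in hand and that committing at this point is acceptable. It is not the code that #addInstVarName: actually runs.

```smalltalk
migrateAllInstancesOf: oldVersion to: newVersion
    "Hypothetical helper, for illustration only."
    | instances |
    "Commit first so that #allInstances gets a fresh view of the object table;
     otherwise error 2412 may be signalled."
    System commitTransaction
        ifFalse: [ ^ self error: 'Commit failed; abort and try again' ].
    "Collect the instances of the old class version."
    instances := oldVersion allInstances.
    "Migrate each collected instance to the new class version."
    oldVersion migrateInstances: instances to: newVersion.
    "Commit again so the migrated instances are persisted; answers a Boolean."
    ^ System commitTransaction
```

Note that the transaction boundary sits at the very start of the operation, before any instance is touched.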
Since GemStone/S uses an ObjectTable (OT), the implementation of #become: is essentially an OOP swap in the OT. Except (there had to be an except) when one of the objects involved in the #become: is on the stack as either a message receiver or as self in an ExecutableBlock.
This restriction for #become: isn’t a fundamental requirement; it’s more of an optimization. With this restriction, the interpreter can assume that the class of an object on the stack won’t change, leading to the elimination of a handful of extra instructions during method execution.
It is possible that this restriction will be eliminated for GemStone/S 64 3.0, but until then we have to live with this.
We are finally getting close to the meat of this post.
The restriction on #become: is problematic for Magritte (and by extension Pier), because the auto accessor feature (which may dynamically add instance variables) is triggered when a message (#readUsing: or #writeObject:using:) is sent to an object that is being managed via Magritte descriptions. If you happen to trip across this problem you’ll get an error message like the following:
InterpreterError 2322: The object <> is present on the GemStone Smalltalk stack as “self”, and cannot participate in a become.
In fixing Dario’s bug, I was able to get around this problem by taking advantage of the fact that #readUsing: is implemented by effectively doing a double dispatch to an MADescription as follows:
    readUsing: aDescription
        "Dispatch the read-access to the receiver using the accessor of aDescription."
        ^ aDescription accessor read: self

The implication is that the object being dynamically modified doesn’t have to be on the stack as self. We can arrange for an error handler to wrap the #readUsing: call and, if the #become: error is signalled (only possible when the initial modification occurs), make an in-line call equivalent to #readUsing:, something like the following:
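    readObject: anObject using: aDescription
        | obj retry |
        retry := false.
        "First attempt: anObject is on the stack as self, so a #become: triggered here fails."
        obj := [ anObject readUsing: aDescription ]
            on: ErrCantBecomeSelfOnStack
            do: [ :ex |
                retry := true.
                ex return ].
        "Retry with the body of #readUsing: in-lined, so anObject is not on the stack as self."
        retry ifTrue: [ obj := aDescription accessor read: anObject ].
        ^ obj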
A similar technique is used for #writeObject:using:. The exception ErrCantBecomeSelfOnStack is thrown for error number 2322.
Fortunately, in the Magritte/Pier infrastructure all of the direct calls to #readUsing: and #writeObject:using: are done by wrapper objects, so it was relatively easy to replace the calls to #readUsing: with calls to #readObject:using:.
If you happen to be directly calling either #readUsing: or #writeObject:using: in your own application and using the auto accessor feature of Magritte, then you may have to do a similar refactoring in your application.
Well, remember the bit about #allInstances requiring a commit or abort? You really don’t want to do any commits or aborts while processing an HTTP request in GLASS: at best you’ll get a failed commit, and at worst you’ll get undetected inconsistencies in your data structures (the session lock is dropped on the transaction boundaries).
So how does one do something like #allInstances that requires a commit or an abort while processing an HTTP request in Seaside/GLASS?
It turns out that the technique used when an object lock is denied will work, with a slight twist. When an object lock is denied, we abandon the current request and retry the HTTP request after a short delay. So it follows that if we absolutely have to do a commit or abort to (in this case) add an instance variable to a class and migrate its instances, we should be able to:
- abandon the current request
- do an abort
- execute #addInstVarName:, committing if successful
- retry the HTTP request
When the request is retried, the new instance variable will have been added so it won’t be necessary to do that operation again.
To facilitate this operation, I added a new Notification called SafelyPerformBlockRequiringAbort. This notification takes a block and, in its default action, does a commit and executes the block. In the GLASS/Seaside framework, I’ve arranged to catch SafelyPerformBlockRequiringAbort in a spot where it is safe to start a new transaction and, after evaluating the block, cause the HTTP request to be retried.
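A rough sketch of how these pieces might fit together follows. Only the class name SafelyPerformBlockRequiringAbort comes from the description above. The selectors #block, #handle: and #handleWithRetry: are hypothetical stand-ins, and the real GLASS framework code is likely structured differently (for one thing, it retries at the HTTP request level rather than in a simple loop).

```smalltalk
"SafelyPerformBlockRequiringAbort >> defaultAction (sketch; #block is a hypothetical accessor)"
defaultAction
    "No handler is active: commit to get a fresh view, then run the block."
    System commitTransaction.
    ^ self block value

"Hypothetical request wrapper on the framework side"
handleWithRetry: aRequest
    | pending response |
    [ pending := nil.
      response := [ self handle: aRequest ]
          on: SafelyPerformBlockRequiringAbort
          do: [ :notification |
              "Remember the block and abandon the current request."
              pending := notification block.
              notification return: nil ].
      pending notNil ]
        whileTrue: [
            "Safe spot: no request is in progress, so start a new transaction."
            System abortTransaction.
            "Run the deferred work, e.g. an #addInstVarName: call."
            pending value.
            System commitTransaction
                ifFalse: [ ^ self error: 'Commit failed after running the block' ]
            "Looping back re-runs the request, which no longer needs the block." ].
    ^ response
```

The loop mirrors the four steps listed earlier: the handler abandons the request, the loop body aborts, runs the block and commits, and the next iteration retries the request.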
The caveat is that you don’t really want to be doing a lot of #allInstances calls in your application – I’ve seen noticeable delays in even small repositories. However, in Magritte/Pier-based applications the number of times that the shape of a class is changed is relatively small, so the cost of the #allInstances call is more than offset by the advantage that end users gain by being able to customize their application.
I also think that SafelyPerformBlockRequiringAbort is general enough that it can be used in other cases where an #allInstances call just can’t be avoided.