Monday, May 05, 2008

SDE 9.2's ST_GEOMETRY: Part Two, The Empire Strikes Back

I've been investigating ESRI SDE's ST_GEOMETRY support and performance on some simple sample data sets. I'm embarassed to say I don't recall where they came from, but they are all US counties (about 3,000) and all US ZIP codes (about 30,000). My initial reactions were excitement at the speed of spatial joins on small datasets, disappointment at slow performance on large datasets, followed by theorizing about the cost of out-of-process calls.

Well, in these things it's not the journey, it's the destination. I thought I better compare to Oracle Spatial's SDO_GEOMETRY to make sure I was comparing apples to apples, as it were. The results are interesting.

I'm trying two simple tests of throughput, one a spatial join which selects all the ZIP codes which overlap a given (semi-random) set of counties, the other a raw select which forces all geometries to be converted to WKB format for consumption by a putative 3rd party tool.

For Oracle

select count(*) from ozip zi join ocounty co on (sdo_relate(zi.shape, co.shape, 'mask=ANYINTERACT') = 'TRUE') where co.objectid between 1200 and 1500; select sum(dbms_lob.getlength(sdo_util.to_wkbgeometry(shape))) from ozip;

For SDE

select count(*) from zip zi join county co on (st_intersects(zi.shape, co.shape)=1) where co.objectid between 1200 and 1500; select sum(dbms_lob.getlength(sde.st_asbinary(shape))) from zip;

What should the baseline be? I figure it's the existing geoprocessing and rendering tools from ESRI. So for the spatial join I just used the Intersect toolbox on the feature classes with the 'objectid between 1200 and 1500' as the definition query for counties, as above. For raw rendering, I simply averaged 3 successive refreshes of a map full of all 30,000 zip codes. This makes the baseline look even slower than it really is, as it also times rendering and context switches. SDE's fast, people; we knew that.

The somewhat surprising results are as follows (all times in seconds).

OperationSDE/SDOSDE/STST_GEOMETRYSDO_GEOMETRY
Join25256.0?
Scan15535

I got tired of waiting after 15 minutes for Oracle to convert its 30,000 zip codes to WKB, so I gave it a score of ∞. All queries appeared to be CPU bound, which makes sense as the entire data set fits into the memory of even this old laptop.

Update 2008-May-5: In amusing and ironic twist, Paul Ramsey has notified me that Oracle doesn't let you release benchmark results without making sure they've been dolled up and faked by their sales engineers, er, excuse me, cleared by their legal department. I've removed some of the actual figures from the charts above. Infinity's hard to hide in a closet, even with slick marketing.

The lessons are certainly mixed.

Joins. For the kinds of in-SQL-on-the-fly joins that make geometry data types tempting, SDO_GEOMETRY might be a clear winner (better ask Oracle legal) but not by a massive amount (?? seconds versus 6.0 seconds, making ST_GEOMETRY ??% slower). Perhaps this is because of the process switches endured by implementing ST_GEOMETRY in st_shapelib, perhaps not. Perhaps it is the difference in indexing schemes. More complex and varied tests would need to be done. For back-of-the-envelope estimates, they're roughly equivalent. For most applications, if 10 seconds is acceptable, so is ??. ST_GEOMETRY and SDO_GEOMETRY are both certainly far smarter than the Intersect toolbox, which spends most of its time querying more data than it needs.

Scans. If you want to grab data from these systems and process them using open standards, you lose either way. With ST_GEOMETRY, you can do the work using ST_ASBINARY, but the performance is unimpressive: roughly 1000 shapes per second. (Compare at 6000 per second for SDE querying ST_GEOMETRY itself and ArcMap rendering them.) SDO_GEOMETRY goes completely to lunch; they're not taking WKB very seriously. My 3:1 showing for SDE rendering of ST_GEOMETRY versus SDO_GEOMETRY layers throws a lot of these numbers into question. Would ESRI really let their Oracle Spatial implementation be that much slower?

My provisional conclusion is that ST_GEOMETRY holds promise for spatial SQL, as much as SDO_GEOMETRY, though it probably needs more tuning from ESRI's side. No one is going to be writing useful GIS tools which use the WKB/WKT forms of these geometries anytime soon. If you want fast scanning of data, you've got to get under the covers and read the data natively. That is, I suppose, the next experiment. I'd like to know how easy it will be to read ST_GEOMETRY data natively to .NET, where my particular bread is buttered. More to come.

Sunday, May 04, 2008

SDE 9.2's ST_GEOMETRY: Part One, perhaps Part Last

Last time, I played a little with SDE 9.2's new ST_GEOMETRY support and was glad I waited for SP5 (!). More experimentation revealed that writing spatial SQL was easy. I've waited a long time to use ESRI-sanctioned methods for asking things like

select * from zip zi join county co on (st_intersects(zi.shape, co.shape)=1) where co.population > 1000000; .

And it does work. The spatial indexing gets used intelligently, and you can throw (variants of the above) into ArcMap and render spatial queries on the fly. I even started experimenting with doing on-the-fly geoprocessing with things like shape intersections.

I soon noticed that doing anything with more than a few hundred shapes was slow. Of course at first I blamed my queries, the Oracle optimizer, the spatial indexes, anything. But this ain't my first rodeo. That stuff was all fine. No, I was noticing that no matter what I did, ST_GEOMETRY couldn't deal with more than about 1000 shapes per second. Fancy spatial indexes don't do much good when the final geometry-to-geometry filter maxes out on so few shapes.

Why oh why?

Part of the problem seems to be because all ST_GEOMETRY functionality is implemented with an external C DLL, st_shapelib. This is how Oracle wants you to do it. Every call to an ST_GEOMETRY function is made out of process from the Oracle process serving your connection (oracle.exe on Windows) to a spawned executable (extproc.exe on Windows) via pipes or sockets. When intercepting a CREATE TABLE command and jumping in to create an index, these two process switches are no big deal. When determining whether one polygon overlaps another during a join operation over tens of thousands of individual polygons, it is murder. Quantifying the cost of a context switch is tricky, but at the very least it is thousands of cycles. Figure four thousand cycles, plus a few thousand for copying the data back (and forth), plus the (perhaps greater) cost of killing cache locality and dumping all your registers, and it's not a pretty picture. Compare that to the cost of doing the few dozen or hundred operations needed for determining overlap of my typical polygons and it's likely this out-of-process trick is killing performance by a factor of ten or more.

To make sure this wasn't just a fancy theory, I pulled up trusty perfmon.exe and had it count context switches. Here it is ticking along for a while on my idle machine, then executing a very simple query for a couple of seconds.

Can you spot the query? The average number of context switches per second hovers around 850, then spikes to 20,000 during the query, then back.

For completeness, I wrote a test program which simply created two threads which did nothing but relinquish control and loop, trying to get an upper bound on the number of switches I could do. Trusty F# with its concision was useful here.

open System.Threading;; let rec cswitch () = Thread.Sleep(0); cswitch();; let makethread () = (new Thread(cswitch)).Start();; makethread();; makethread();;

And I was off to about a million context switches per second. That's an unrealistically high number, since the above code is just an infinite loop with no overhead whatsoever. But I would guess that 1 million no-overhead process switches translates to roughly the ~100,000 order of magnitude real-world process switches I was seeing with the st_shapelib calls. I also suspect it means st_shapelib is doing more than a trivial amount of work, which might mean there's room for performance enhancements, too. Just don't know.

But the context switching seems to be a fatal flaw. Oracle's EXTPROC isn't meant for this kind of fine-grained work. The next investigations involve checking out whether Oracle can be sweet talked into running these external processes in-process, but I'm willing to guess the answer is no. So, it was promising, but there's no way this architectural decision gives anyone the kind of speed they're looking for.

It appears you can't write code in C in Oracle without being sent off to EXTPROC, so perhaps Oracle Spatial will suffer from the same problems. That'll be the next set of investigations.

(This post was updated a few hours after being published with a few more thoughts on Oracle extensions.)