Friday, August 10, 2007

REST for Rasters: a RANGE of options

Hey, I know a dogpile when I see one. I'll jump in. (Seriously, follow that link first, then read the response.)

Sean's spot on. Resources can have subcomponents. I have even opined, while addressing a different but related point, that file-based formats do not form a sensible basis for REST rasters. Dividing rasters into interesting slices as Sean has suggested is the right big idea. As for suggestions, I've been meaning to write about the following because we've been working on them on one of my projects.

RANGEd (partial) GETs. HTTP GET supports ranged requests via the Range header. In other words, I can ask for specific bytes of a file. In this case, a smart raster client could indeed notice that a resource was a GIF, ask for only the first few hundred bytes, read the critical header information, and then use random access to obtain the actual pixel values. A naive approach would end up being awfully chatty, and it leaves the resource a bit more opaque than you might like (after all, the byte range is not part of the URL), but it is a practical solution.
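To make the partial GET idea concrete, here is a minimal Python sketch, not a finished client. The function names and the example URL are made up for illustration; the only facts it leans on are the standard HTTP Range header and the fixed 10-byte start of a GIF file (a 6-byte signature followed by little-endian width and height).

```python
import struct
import urllib.request


def ranged_request(url, first_byte, last_byte):
    """Build (but do not send) a GET request for an inclusive byte range."""
    request = urllib.request.Request(url)
    request.add_header("Range", f"bytes={first_byte}-{last_byte}")
    return request


def parse_gif_header(data):
    """Read the raster size from the first 10 bytes of a GIF file."""
    signature, width, height = struct.unpack("<6sHH", data[:10])
    if signature not in (b"GIF87a", b"GIF89a"):
        raise ValueError("not a GIF header")
    return width, height


# Hypothetical URL; a real client would pass this to urllib.request.urlopen()
# and expect a "206 Partial Content" response if the server honors ranges.
request = ranged_request("http://example.org/seafloor.gif", 0, 255)
```

A server that ignores the Range header may simply return the whole file with a 200, so a careful client would check the status code before assuming it got only the slice it asked for.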

Standard Raster Accessors. Libraries like GDAL or ESRI's make very similar assumptions when dealing with raster data. Everyone agrees that rasters have bands of data, might hold data of different types (ints, floats, bits, etc.), are conveniently arranged in tiles, etc. It is probably not too hard to agree on a standard header for raster information to be returned to a HEAD request. And then a fragment/anchor identifier language could be used to agree on raster chunks to be returned in an appropriate binary format. (That's the part after the '#' in the URL.)

  • http://example.org/seafloor.png. The whole shootin' match. Ask for a HEAD or do smart partial GET requests.
  • http://example.org/seafloor.png#*.0,0.100,100. Gets all bands (*) and all pixels between (0,0) and (100,100) pixel coordinates.

The actual syntax above is obviously only a stab in the dark, and perhaps there are already better examples in things like the OGC Web Coverage Service, but the fact is indexing into even very large rasters doesn't take a lot of data, and it'd fit nicely in the fragment/anchor. Which would be very RESTy, very URL chic.
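As a rough illustration of how little it takes to index into a raster this way, here is a Python sketch that parses the stab-in-the-dark fragment syntax from the example URL. The function name, the returned dictionary, and the comma-separated band list are assumptions of mine; only the "*.0,0.100,100" form comes from the example above.

```python
from urllib.parse import urlsplit


def parse_raster_fragment(url):
    """Parse '#bands.x0,y0.x1,y1' into a band list and a pixel window.

    Returns None when the URL has no fragment (the whole raster).
    A bands value of None means all bands ('*').
    """
    fragment = urlsplit(url).fragment
    if not fragment:
        return None
    bands_part, min_part, max_part = fragment.split(".")
    # Assumption: specific bands would be written as comma-separated integers.
    bands = None if bands_part == "*" else [int(b) for b in bands_part.split(",")]
    x0, y0 = (int(v) for v in min_part.split(","))
    x1, y1 = (int(v) for v in max_part.split(","))
    return {"bands": bands, "window": (x0, y0, x1, y1)}


chunk = parse_raster_fragment("http://example.org/seafloor.png#*.0,0.100,100")
```

One thing worth keeping in mind: HTTP clients do not send the fragment to the server, so in a sketch like this the client would have to do the parsing itself and turn the window into requests the server understands, such as ranged GETs.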

Thursday, August 09, 2007

spatialreference.org: Let the Sun Shine!

Every once in a while someone spends the weekend finishing something that should have just been taken care of years ago. A few weekends ago, Howard Butler and Christopher Schmidt did that with spatialreference.org. Someone needed to take the EPSG spatial reference database and web-enable it. OGC's solution was to demand a solution so obscured by me-too XML-ism, it could only be built over many many years by their favorite XML promulgators, Galdos. And in fact, it's still not there. Howard & Christopher realized what Steve Jones pointed out: in 2007, CRUD applications should be considered "dull, boring and uninteresting". Yes, it should be possible to CRUD-ify the EPSG database in a matter of hours and get on to the interesting stuff. Thank goodness they did.

Yes, they've taken the latest EPSG codes and made them available for download and upload. But hey, here's the extra cool stuff they can do now this CRUD's out of the way and trivially accessible via an easy to program REST service.

  • Get spatial references in a variety of formats! Yes, it's all well and good you & I agree we are talking about my favorite coordinate system, Aratu / UTM Zone 24S. But I love the fact that by just asking for the same URL postfixed by "ESRIWKT" I can serve my corporate overlords by downloading a PRJ file so I can do the math with the ESRI projection engine. Or by postfixing with "proj4", I can work instead with everyone's favorite open source engine. That's value add.
  • Those usage rectangles hiding in the EPSG database can be used to tell me where that projection is preferred or valid! No sooner did I predict to some colleagues that these crafty Python programmers would find it easy to make the usage areas available (Tuesday) than those crafty Python programmers just did it (Thursday!), complete with nifty map showing valid areas! Yep, Aratu is used off the coast of Brazil.
  • I can search for different coordinate systems without having a copy of MS Access on my machine.
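The "same URL, different postfix" trick in the first bullet is simple enough to sketch in a few lines of Python. The base URL below is a placeholder, not a real spatialreference.org address, and treating the postfix as a trailing path segment is my assumption; only the format names "ESRIWKT" and "proj4" come from the text above.

```python
def format_url(reference_url, fmt):
    """Append a format name to a spatial reference URL as a path segment."""
    # Assumption: the postfix is a trailing path segment ending in a slash.
    return reference_url.rstrip("/") + "/" + fmt + "/"


# Hypothetical address for the Aratu / UTM Zone 24S page.
aratu = "http://example.org/ref/aratu-utm-zone-24s/"
urls = {fmt: format_url(aratu, fmt) for fmt in ("ESRIWKT", "proj4")}
```

The point is that one resource identifies the coordinate system, and the representation you want is a small, guessable variation on its URL.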

And because they're programming simple REST in Python, they can churn out features in hours instead of months. Let's hear it for spatialreference.org.

I am sure that spatialreference.org will be seen as a dangerous influence. I have already seen stiff resistance to the concept. As a GIS programmer stitching lots of systems together, I've been waiting years for a standard authority that's actually user-editable and supports multiple formats. So please forgive the following rant. Here's what the geodetic elites will say: We can't have amateurs doing this, can we? How can we be sure these crazy Python people aren't secretly corrupting the EPSG database they're supplied with by, um, EPSG? How can we be sure they're even keeping it up to date and avoiding errors? (GDAL's enormous world of contributors and commercial support apparently notwithstanding. Sorry, Frank, I guess.) We recommend not using it; you must wait for the OGC/Galdos officially sanctioned SOAP monster from hell this September! What if all the EPSG database needed was a better technical solution from people with technical common sense, instead of years of hand-wringing by otherwise very smart people? Look, these standards improve because they're widely used. Locking up EPSG codes in an Access database begs them to be copied offline, abused and misused. Opening them up to the harsh daylight of global interoperability will get them cleaned up in a jiffy. Sunlight is the best disinfectant.

Things I'd love to help patch into this beauty:

  • Units of measure. The EPSG database defines a wealth of linear and angular units of measure, but a nice user-defined database of other units would sure be handy. Degrees Celsius per millisecond, anyone?
  • Coordinate Transformations. These are a bugbear for everyone. Hyperlinked lists of transforms appropriate for different projections and areas of the world are sorely needed.
  • Secure Endorsements & Selections. Companies often invent new reference systems and transforms. They should be able to upload them and sign them cryptographically so that on download they can be sure they have not been altered. EPSG might even sign their own with a public certificate as part of a formal review. Also, many users may wish to browse the database in a way that limits their view to only common systems in use in their area. A Brazilian oil company is going to be endlessly fascinated by Aratu and the Illustrious South American Datum 1969, but annoyed if they have to wade through four screens' worth of Xian 1980 / 3-degree Gauss-Kruger zones. Why not let that oil company create their own lists of preferred systems (and transforms, don't forget the transforms!) and sign those lists? Coolness all around.
  • More Authority Translations. What is the Blue Marble factory code for Aratu? Mentor? These take a bit of input from the companies themselves and the geodetically minded. spatialreference.org merely provides the clearing house and brings the discussion out into the open where it can be vetted.
  • Warnings of Incomplete Translations. Aratu is one of my favorite examples because it often highlights differences in projection engines. ESRI knows that the Aratu datum is based on the International 1924 ellipsoid. But it's a datum, not a pure ellipsoid. PROJ4 doesn't know Aratu; it just calls it "+ellps=intl". The PROJ4 codes given by spatialreference.org (by way of GDAL, if I understand correctly) for "Aratu" and "Unknown datum based upon the International 1924 ellipsoid" are the same ("+proj=longlat +ellps=intl +no_defs"). Yes, the latter is marked clearly "Not recommended", but the Aratu page's PROJ4 version merrily drops the Aratu-ness right out of Aratu. Icky. Again, this isn't an issue the Python boys can merrily solve; it's a thorny data issue. But by providing a framework for that to be annotated, good things can happen.
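For the incomplete-translation warnings in the last bullet, here is a small Python sketch of one possible check, not how spatialreference.org or GDAL actually does it. The function names and the warning text are invented for illustration. The check itself is deliberately crude: it flags any PROJ4 string that names an ellipsoid but carries none of the datum-related parameters (+datum, +towgs84, +nadgrids).

```python
def parse_proj4(definition):
    """Split a PROJ4 string into a dict; bare flags such as +no_defs map to True."""
    params = {}
    for token in definition.split():
        key, _, value = token.lstrip("+").partition("=")
        params[key] = value or True
    return params


def translation_warnings(name, definition):
    """Return warnings when a named datum has been reduced to its ellipsoid."""
    params = parse_proj4(definition)
    warnings = []
    has_datum_info = any(k in params for k in ("datum", "towgs84", "nadgrids"))
    if "ellps" in params and not has_datum_info:
        warnings.append(
            f"{name}: no datum parameters; only ellipsoid '{params['ellps']}' survives"
        )
    return warnings


# The PROJ4 string quoted above for both Aratu and the "unknown datum" entry.
aratu_warnings = translation_warnings("Aratu", "+proj=longlat +ellps=intl +no_defs")
```

A check like this would not fix the underlying data problem, but it would at least let a page say out loud that its PROJ4 version has lost the datum.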

Let the Sun Shine!