Geometry Handling in pygp versus arcgisscripting
A year or so ago we gave a presentation at a local Esri Developer Meetup that showcased some of the advanced functionality we have implemented in our pygp package. For those unaware of pygp, it stands for Python Geoprocessing and is a relatively large suite of api-like code (and documentation and class diagrams) that allows for the same code to operate Python versions 2.4 through to 2.7 and Esri ArcGIS 9.2 through to 10.x (yes, that includes the dozen or so service packs in between too).
pygp contains a large number of functions and classes that are hugely beneficial when developing applications that need to be more than a typical 5 lines of code exported from Model Builder. The main benefit of pygp is that we use it in our day-to-day business, this means it is constantly under development with new capabilities being added weekly. Not only that, we implement things in a Pythonic and natural manner i.e. as professionals tasked with automating complex workflows how would we prefer to work?
Here for example is an example adapted from the Esri help documentation circa 9.2 and 9.3. Recall that this is the vintage where lists were actually enumerations and cursors had to be stepped over using a while loop.
# -*- coding: ascii -*- """ Raises ValueError on GetObject for Null point in 9.2 and 9.3 and Returns None in 10. """ __author__ = 'Jason Humber - Integrated Informatics Inc.' __maintainer__ = '$LastChangedBy$' __vcs_id__ = '$HeadURL$' from os.path import join as osjoin from arcgisscripting import create from pygp.dataelement.extended import FeatureClass __copyright__ = 'Copyright (c) 2009-2012, Integrated Informatics Inc.' __license__ = 'LGPLv3' __version__ = '0.1.0' __email__ = 'email@example.com' POLYGON = r'sample_data\fgdb_with_data.gdb\prj_a' def example_geometry_help(path): """ Example from ESRI help documentation on accessing parts and points. Code adapted from ESRI Help for 9.3, reading geometries This approach avoids the ValueError :param path: Workspace Path :type path: str """ gp = create() shape_field_name = gp.Describe(osjoin(path, POLYGON)).ShapeFieldName srows = gp.SearchCursor(POLYGON) srow = srows.Next() while srow: geom = srow.GetValue(shape_field_name).GetPart() part_num = 0 part_count = range(geom.PartCount) while part_num < part_count: part = geom.GetPart(part_num) pnt = part.Next() pnt_count = 0 while pnt: print pnt.x, pnt.y pnt = part.Next() pnt_count += 1 if not pnt: pnt = part.Next() if pnt: print "Interior Ring:" part_num += 1 srow = srows.Next() del srow, srows # End example_geometry_help function
The help documentation spells out what this results in and this example is essentially just showing how to get the points and parts of a geometry while looping over the rows in a feature class. Nothing magical here other that to point out that this is a lot of code and that the approach does not raise a ValueError like this second approach (which should work by returning a null point).
def example_geometry_value_error(path): """ Example Showing how ValueError was raised when using GetObject on a Null Point (i.e. the separator for exterior/interior rings) in releases before ArcGIS 10. As of ArcGIS 10 the Null point is returned as None. :param path: Workspace Path :type path: str """ gp = create() shape_field_name = gp.Describe(osjoin(path, POLYGON)).ShapeFieldName srows = gp.SearchCursor(osjoin(path, POLYGON)) srow = srows.Next() # This approach will hit the ValueError in 9.2/9.3 while srow: geom = srow.GetValue(shape_field_name).GetPart() part_num = 0 part_count = range(geom.PartCount) while part_num < part_count: part = geom.GetPart(part_num) for i in range(part.count): try: pnt = part.GetObject(i) print pnt.x, pnt.y # This ValueError was reliable until 10 # Now it no longer gets raised, code broken!!! except ValueError: print "Interior Ring:" continue part_num += 1 srow = srows.Next() del srow, srows # End example_geometry_value_error function
So this is annoying. Accessing points in the geometry one way works fine and in another manner it bombs. For those of you wondering, this issue was around for all of 9.x but then finally (and unexpectedly) it got fixed in 10! That’s great! But what about code that needs to work in both 9.x and 10.x, that becomes the issue. Not only that, it is still a huge amount of code to write just to get the points out of a geometry. Part of me wishes for the simplicity of Avenue when geometries were fast and super easy to work with…so with a little motivation here is what we did:
def example_geometry(path): """ Example showing use of Geometry helper class that does the heavy lifting on the geometry object and returns something quite similar to WKT/GeoJSON Structure is simple, a tuple of tuple of Point objects, very similar to Avenue days of geometry and WKT MultiLineString etc. (((0, 498266, 6100519, None, None), (0, 499775, 6100281, None, None), (0, 500224, 6098694, None, None), (0, 499616, 6097662, None, None), (0, 498346, 6096789, None, None))) :param path: Workspace Path :type path: str """ feature_class = FeatureClass(osjoin(path, POLYGON)) for srow in feature_class.search(): print srow.get_value(feature_class.shape_field_name).as_tuple() # End example_geometry function
Let me step through the subtlties of this code. First of all, it is only 3 lines (could have made it one if the motivation was to distract). See how we have a nice FeatureClass object, this is quasi similar to a DEFeatureClass in ArcObjects but implemented for Python.
The next thing to see is that we can loop over the rows in the feature class using a search method (cool!). Not only that, we can loop over it using a for loop, now that is hot!
On the third line we are using get_value, a method on our SearchRow object that is smart enough to know how to return geometry in a nicely usable way, that is, by returning our Geometry object. Also see that when we want to return a special field the FeatureClass object we can get the name using a property of the FeatureClass object (for example, shape_field_name).
Last thing is the use of the as_tuple method, a beautiful method that knows how to return geometry as a set of nested tuples of Point objects that can just as easily be reassembled.