Archive

Archive for February, 2012

Inventorying Feature Classes

February 26, 2012 Leave a comment

Over on another blog we saw a post about getting a listing of feature classes by recursively navigating over the file structure.   Pretty good post but it was kind of like reading a snippet of code from how we used to do things back in early days of geoprocessing and Python and especially before pygp became what it is today.

What exactly were the notable items that made us cringe?

  1. Missing parameter documentation for descriptions and types
  2. Use of an argument named workspace where really a folder is the only legit input
  3. Storage of existing environment settings prior to overwriting them
  4. Multiple calls to listing functions (e.g. ListWorkspaces) with different arguments
  5. Listing feature classes on a Feature Dataset (aka slow)
  6. If statements to test if an iterable has content

So what do we suggest other than blatant promotion of pygp?  Roughly in order of the items above we would recommend:

    1. Documentation, docstrings are good but not the full story, learn to use epydoc or reStructuredText and your code becomes self documenting and readability through the roof.  Take it another level and use doctest (topic for another day).
    2. Choose your argument names carefully to allow fellow developers to easily recognize scope of use, i.e. folder is more precise than workspace.
    3. Context management, a beautiful thing that can be used for more than just file opening and da cursors.  The with statement can be your pal if you learn the pattern.
    4. Step it up and use objects that know their types, getting past use of strings for return values for almost everything (except of course where we want them)
    5. There are much faster ways to get the feature classes (and other elements) within a feature dataset than to list the feature classes.
    6. No need to test for content in the iterable, just iterate over it in the list comprehension, and if it is empty it will be a “do nothing” action

Here is roughly how we would code this type of recursive listing of feature using pygp these days in half as much code and three times as much documentation and a little bit of messaging to improve the user experience.

# -*- coding: ascii -*-
"""
Recursive Listing of Feature Classes
"""


__author__ = 'Jason Humber - Integrated Informatics Inc.'
__maintainer__ = '$LastChangedBy$'
__vcs_id__ = '$HeadURL$'


from os import walk
from pygp import gp
from pygp.environment import env, SwapContext
from pygp.dataelement.extended import FolderWorkspace


__copyright__ = 'Copyright (c) 2012, Integrated Informatics Inc.'
__license__ = 'LGPLv3'
__version__ = '0.1.0'
__email__ = 'gis@integrated-informatics.com'


def seek_feature_classes(folder):
    """
    Recursively searches a folder/directory for Feature Classes, returning a
    list of all Feature Classes found.

    :param folder: Root Folder from which the recursive seek commences
    :type folder: str
    :return: List of Feature Class objects
    :rtype: list
    """
    folder_workspaces = [folder]
    feature_classes = FolderWorkspace(folder).feature_classes
    with SwapContext('workspace', folder):
        for root, _, _ in walk(folder):
            if root not in folder_workspaces:
                continue
            gp.add_message('Seeking feature classes inside %s...' % root)
            env.workspace = root
            for name, workspace in gp.list_workspaces().items():
                if isinstance(workspace, FolderWorkspace):
                    folder_workspaces.append(name)
                feature_classes.update(workspace.feature_classes)
    return feature_classes.values()
# End seek_feature_classes function


if __name__ == '__main__':
    pass

In this example we’ve left out the feature type and wild card portions but if we were to include it would be done as post process on the feature class list.

Let’s talk a little more about this example. We begin by keeping track of folders that are in fact folder workspaces because we only want to recursively dive into folders and not all workspaces that are folder-like (an argument could be made about folders underneath coverage workspaces). At this time we give ourselves a container to store the feature classes based on the feature classes in the specified folder.

Now we use a context manager for environment settings. This way if there is an exception during the processing we will get back to our original workspace settings. Looping over the folders using a walk from the os module is a very nice way to recursively access directory names but it will return file geodatabase and coverage workspaces too, so we test the directory name and if it is not in the folder workspaces we skip onto the next folder. Follow this by a little love to the user to brighten their day.

At this point we have not done much so we set the workspace to be the current folder and then find the workspaces within it and in our case this returns nicely crafted objects that represent workspace objects of varying types. In the case of FolderWorkspaces we want to be sure to capture these names for subsequent tests in the loop (i.e. root not in folder_workspaces) and in all cases we want to pull the feature classes. After the recursive looping is done we exit the context manager and the environment settings are put back to what they were before we began (cool eh).

An important note here is that our feature_classes property on a workspace object knows that feature classes (and other objects for that matter) can reside in a feature dataset so we do not need to concern ourselves with manually diving in. The symmetry that this provides is the simplicity for which we strive.

Esri Python Community is Live!

February 21, 2012 Leave a comment

Today the Esri Python Community went live!  Check it out here and while you’re there also look for our (Integrated Informatics Inc) featured article about the Integrated Geodetics Toolkit for Seismic and Well Data.

Categories: ArcGIS, GIS, Python Tags: , , , , , ,

The Main, De Main, Duh Main, Doh Main, Domain

February 19, 2012 Leave a comment

In many of the workflows we automate much of the effort is around development of a good data model that encapsulates the workflow inputs and outputs and provides a level of consistency that (usually) is not already in place.  Domains within a geodatabase are awesome for ensuring consistency within data models and for easing the effort (and improving consistency) for data management and maintenance.  No shortage of good merits for domains both coded and range alike.

When you work with domains often enough through Python you quickly come across some limitations on usability, or perhaps better stated, you come across the ways that you want to use it that aren’t readily exposed to Python.  Here are some simple use cases we run into all the time.

    >>> from pygp.dataelement.extended import GeodatabaseWorkspace
    >>> # Open up a geodatabase in a case insensitive state
    >>> gdb = GeodatabaseWorkspace(GDB_PATH_NAME, nocase=True)
    >>> # Use case #1: Check if a domain exists in a geodatabase
    >>> domain_name = 'test_domain_name'
    >>> domain_name in gdb.domains
    True
    >>> # Use case #2: Get a domain from a field in a table
    >>> field = table.fields['MY_FIELD_NAME']
    >>> domain = field.domain
    >>> # Use Case #3: What type of domain is this?
    >>> # Only two types of domains, coded and range
    >>> domain.is_range
    True
    >>> # Use Case #4: Is the value ok for the Domain?
    >>> 1000 in domain
    False
    >>> 5 in domain
    True

Visualizing Data Source Inventory Across Layers and Maps using Data Inventory Diagrams (DID)

February 12, 2012 Leave a comment

This has been one of those nagging items that comes up every so often in conversations around inventorying datasets in an enterprise environment and understanding which datasets are actually in use and which ones are used a lot (i.e. referred to by many map documents or layer files).

When were upgrading our Integrated Repoint and Repath tool last year we consciously implemented and exposed some items that would allow for this to be fairly easy.  The first was implementation of a full suite of Layer types and with this we also sweetened the return values from the data_source property – instead of just returning a string we actually return a Data Element object and do it in a manner that works for data sources that are unreachable (i.e. broken layers and table views).  The second was to put together a function that would seek out layer files and map documents.  The third thing we did was abstract Map Documents and Layers into the concept of a MappingObject (ok, perhaps not the best name but it was intended to be generic) so that these could be handled in virtually the same manner from a file standpoint.  There of course were a few more things we did but that probably covers the major items that were done with inventorying in mind.

More recently we started thinking some more about how to generate meaningful outputs from an inventory that could highlight pinch points and orphans at the same time.  I mean you could just write an inventory to spreadsheet and then do a bunch of filters and sorts.  Or if you were more savvy perhaps put the data into database and run some summaries using GROUP BY’s and so forth.  While useful overall those approaches seem to be just one way of slicing the inventory and what we were looking for is something that tells us, like slaps us in the face with it, which datasets are the most used/critical datasets.

What we came up with is a Data Inventory Diagram (DID) that depicts the files (Map Documents and Layer Files) and the data sources they contain and color coded by usage (how many referrals).  For example, this is a Data Inventory Diagram for just a single file:

did_pygp_sum_root

The file reference is in the center of the image and shown as a grey ellipse and the data sources (in this case) radiate out from the file and are shown as rectangles.  Light blue is just normal one reference and the purplish blue is simply showing that these data sources are used more frequently than once. 

Running on one file is a pretty easy case and really the sole intention behind what we want to accomplish – what we want to do is inventory large networks.  Thankfully we code for the complex and it works for the simple cases (scales down nicely!) so when we want to run this on set of folders we just do so with no additional code.  This example is mildly larger than the previous and touches approximately 50 Layer Files and Map Documents.

did_pygp

Here you can see that by running this on a larger set of files you begin to immediately realize which datasets are being used the most just by glancing at the diagram (i.e. pink and purple boxes). 

Note that in this diagram we’ve increased the information on each of the data source boxes to indicate how many times that the dataset was found and to make the diagram even easier to read we color-code the arrows from the files (grey ellipses) to the data source.

did_pygp_crop

Can’t forget about the orphans either, you should be able to see that there are some data sources that are only used by one file, important information again if you want to understand impact or need to prioritize efforts.

Does this scale up?  Sure thing, remember above that we implemented the return value from data_source as a Data Element object.  What this means, well, what it means because pygp  handles it for us, is that we can readily segregate these Data Inventory Diagrams by Workspace (now that’s cool).  Man do I wish I had this years ago when doing a pile of infrastructure upgrades for ArcGIS Server and ArcSDE – read the network and tell me what is important. 

In case you’re wondering, this functionality is available via our Integrated Repoint and Repath tool and is also bundled into our Spatial Data Sniffer and for followers of the blog, yes, it is all built around our Python Geoprocessing module (pygp).

Class Location Analysis–Performance Metrics

February 8, 2012 Leave a comment

We’ve been revamping several tools over the last few months and one that we’ve revisited lately is our Class Location Analyst tool.  What we’ve mostly been working on is improvements to documentation and to link into the latest version of pygp (a year brings a lot of nice improvements). 

Really happy with the latest performance tests too – now when we (or our clients for that matter) perform a class location assessment using CSA Z662 or CFR 192 we are seeing about 8-10 miles per second when structures are approximately on average 5 per mile.  The 8-10 miles per second includes the time for performing the gathering area analysis as well as resolving the overlaps between class.

Given the amount of rigor built into the application we are really pleased with this performance, even the overhead we have included for caring about datums and projections (gasp) really does not degrade performance and instead improves overall quality.

The implementation details behind this would make for a nice presentation at Esri PUG someday when they allow vendors to talk about commercial matters.

Calculate a mean value from a field

February 8, 2012 Leave a comment

I saw a few nice posts over on another blog that I thought would be fun to re-implement using pygp, in this particular case, “Calculate a mean value from a field”. Modification are 1) checking the field type to ensure it is numeric (re-use of some validation and enumerations) 2) checking that the field exists, and 3) filtering out any None values (but keeping 0′s). This code runs on ArcGIS 9.2 to 10.1 beta.

# -*- coding: ascii -*-
"""
Calculate a mean value from a field
"""


__author__ = 'Jason Humber - Integrated Informatics Inc.'
__maintainer__ = '$LastChangedBy$'
__vcs_id__ = '$HeadURL$'


from numpy.ma.core import mean
from pygp.core.field import NUMBERS


__copyright__ = 'Copyright (c) 2012, Integrated Informatics Inc.'
__license__ = 'LGPLv3'
__version__ = '0.1.0'
__email__ = 'gis@integrated-informatics.com'


def calculate_mean_value(table, field):
    """
    Calculate the Mean Value from a field

    :param table: Table or Feature Class
    :type table: pygp.dataelement.extended.Table
    :param field: Field Object
    :type field: pygp.core.field.Field
    :return: Mean value 
    :rtype: float
    """
    if field.type_ in NUMBERS and field in table.fields:
        return mean([v for v, in table.as_list(field) if v is not None])
# End calculate_mean_value function


if __name__ == '__main__':
    pass

Non-rigorous Comparison of Cursors

February 6, 2012 1 comment

Following up to a comment left earlier today we decided to put together a quick comparison between a pure arcgisscripting cursor, one of our pygp cursors, and our pygp validated cursor.

The pygp cursor can be thought of as a wrapper around the arcgisscripting cursor but with some lightweight checks built into it, for example, if you send in a field name that does not exist in the underlying table or feature class you’ll get a sensible error (i.e. not RuntimeError). The validated cursor goes much deeper, checking values, types, fields, domain ranges, coded domain values, etc. and also allows for the validation messages to be returned (super helpful in testing and equally as beneficial when user input is expected). Note also that validation can be done on a per field basis and does not need to be done at the cursor level.

As you might expect each one of these has progressively more overhead because there is more “stuff” done in the validation. But the amount of overhead imposed is pretty small overall. I can only begin to tell you how much simpler coding on cursors has become by a) validation prior to inserting the row and b) removal of try/except blocks because validation is centralized. So nice!

# -*- coding: ascii -*-
"""
Non-rigorous Comparison of Cursors
"""


__author__ = 'Jason Humber - Integrated Informatics Inc.'
__maintainer__ = '$LastChangedBy$'
__vcs_id__ = '$HeadURL$'


from time import clock
from string import ascii_uppercase
from random import choice
from arcgisscripting import create
from pygp.toolboxes.management import delete_records
from pygp.dataelement.extended import Table


__copyright__ = 'Copyright (c) 2012, Integrated Informatics Inc.'
__license__ = 'LGPLv3'
__version__ = '0.1.0'
__email__ = 'gis@integrated-informatics.com'


TABLE = r'sample_data\fgdb.gdb\test'
FIELD = 'A_FIELD'  # This field is Text 50
CHARACTER_LENGTHS = range(30, 50)
ROW_COUNT = 50000


gp = create(10)


# Test out bare bones arcgisscripting cursors
start = clock()
irows = gp.InsertCursor(TABLE)
irow = irows.NewRow()
for i in xrange(ROW_COUNT):
    irow.SetValue(FIELD, ''.join(
        choice(ascii_uppercase) for x in range(choice(CHARACTER_LENGTHS))))
    irows.InsertRow(irow)
del irow, irows
print 'Pure arcgisscripting', clock() - start

# Remove records from the table before the next test
delete_records(TABLE)

start = clock()
irows = Table(TABLE).insert()
irow = irows.new()
for i in xrange(ROW_COUNT):
    irow.set_value(FIELD, ''.join(
        choice(ascii_uppercase) for x in range(choice(CHARACTER_LENGTHS))))
    irows.insert(irow)
irows.close()
print 'pygp cursor sans validation', clock() - start

# Remove records from the table before the next test
delete_records(TABLE)

# Now allow the length to be longer than the field so that
# validation has to do some checking
CHARACTER_LENGTHS = range(35, 55)

# Using a validating cursor, alternatively can set at the field level
start = clock()
irows = Table(TABLE).insert(validate=True, show_messages=True)
irow = irows.new()
for i in xrange(ROW_COUNT):
    irow.set_value(FIELD, ''.join(
        choice(ascii_uppercase) for x in range(choice(CHARACTER_LENGTHS))))
    irows.insert(irow)
irows.close()
print 'pygp cursor with validation', clock() - start

# Final cleanup
delete_records(TABLE)

The results look something like this (running the code above over through 20 loops of 50000 inserts on each test)

Pure arcgisscripting 5.18s
pygp cursor sans validation 5.62s
pygp cursor with validation 6.82s

Which basically says, if you don’t want to litter your code with try/excepts and pile of validation code then on 50,000 inserts you pay a penalty of 1.64 seconds or 32.8 microseconds per insert.

Of course the real magic behind all of this is the ability to get field names off of an insert row ;)

Unique Spatial References from a Workspace

February 6, 2012 Leave a comment

Some other examples we demonstrated at the meetup back in 2010 was some advanced functionality implemented on spatial reference objects.  What got is into this in the first place was a less than obvious result from a logic statement.  Here is the original example:

# -*- coding: ascii -*-
"""
Compare spatial reference objects
"""


__author__ = 'Jason Humber - Integrated Informatics Inc.'
__maintainer__ = '$LastChangedBy$'
__vcs_id__ = '$HeadURL$'


from os.path import join as osjoin
from arcgisscripting import create
from pygp import gp


__copyright__ = 'Copyright (c) 2012, Integrated Informatics Inc.'
__license__ = 'LGPLv3'
__version__ = '0.1.0'
__email__ = 'gis@integrated-informatics.com'


POLYGON = r'sample_data\fgdb_with_data.gdb\prj_a'


def example_spatial_reference(path):
    """
    Example showing Spatial Reference from a describe object.

    :param path: Workspace Path
    :type path: str
    """
    gp = create()
    srs = gp.Describe(osjoin(path, POLYGON)).SpatialReference
    srs_compare = gp.Describe(osjoin(path, POLYGON)).SpatialReference
    if srs == srs_compare:
        # Only get here if srs_compare has been set to srs.
        # Comparison is done on object address, not on the spatial
        # reference properties.
        pass
    else:
        # Non-intuitive to get here, must compare properties  and not just
        # projection name
        pass
# End example_spatial_reference function

Now it may not seem like a natural thing to compare the spatial reference against the same spatial reference from the same feature class but in our case it made sense – we had lists of feature classes coming from different sources and when we were testing to see if there were any repeats in the lists we had to check not just the name of the feature class but also the spatial reference.  On first implementation of the overall code that performed this element-wise comparison, getting a False return from the logic test screwed things up royally.

After some investigation we soon saw what was happening:

>>> srs = gp.Describe(osjoin(path, POLYGON)).SpatialReference
>>> srs
<geoprocessing spatial reference object object at 0x028F79F8>
>>> srs_compare = gp.Describe(osjoin(path, POLYGON)).SpatialReference
>>> srs_compare
<geoprocessing spatial reference object object at 0x028F78D8>

The comparison between spatial references is done entirely on the object address (id).  When you ask for the spatial reference twice from the same feature class and you get two objects and while this makes sense it does not help our cause.

The remedy to this situation took a little bit of effort but certainly worthwhile in the end. What we ended up doing is implementing comparison capability on the spatial references such that Projected and Geographic and Unknown spatial references (nothing for Vertical at this point in time) could be compared.  The original example now looks like this and behaves as desired:

def example_extended_objects(path):
    """
    Example showing a richer implementation of the spatial reference objects.

    :param path: Workspace Path
    :type path: str
    """
    srs = gp.describe(osjoin(path, POLYGON)).spatial_reference
    srs_compare = gp.describe(osjoin(path, POLYGON)).spatial_reference
    if srs == srs_compare:
        # Comparison implemented on the spatial reference classes by
        # testing the projection definition and underlying geographic
        # coordinate system (or just the geographic coordinate system properties
        # when working with a GCS)
        pass
    else:
        # In this example we do not get in here at all
        pass
# End example_extended_objects function

The side benefit of using comparison operators is that it lends itself nicely for further use and more advanced use for asking questions like “return the unique spatial references from workspace” and being able to do this using objects and not string comparisons.  Here is how we get unique spatial references from a workspace (i.e. a FolderWorkspace, GeodatabaseWorkspace, or InMemoryWorkspace):

def unique_spatial_references(workspace):
    """
    Generate a set of Unique Spatial References in a Workspace

    :param workspace: Workspace
    :type workspace: pygp.dataelement.extended.FolderWorkspace,
        pygp.dataelement.extended.GeodatabaseWorkspace,
        pygp.dataelement.extended.InMemoryWorkspace
    :return: Set of Spatial References in a workspace
    :rtype: frozenset
    """
    return frozenset(e for e in workspace.data_elements.values() 
                     if hasattr(e, 'spatial_reference'))
# End unique_spatial_references function

Results Across Versions

February 4, 2012 Leave a comment

Between versions 9.2 and 9.3 of ArcGIS the return values from tools went from strings to Geoprocessing Result objects.  This was mostly geared towards ArcGIS Server but manifested itself in the desktop product as well.  At the time it was a hassle and one that made code between 9.2 and 9.3 incompatible (so much for the promise for backwards compatibility). 

# -*- coding: ascii -*-
"""
Return a String in 9.2 and a Result Object for 9.3+.  Shows use of decorated
functions and wrapping with custom class
"""


__author__ = 'Jason Humber - Integrated Informatics Inc.'
__maintainer__ = '$LastChangedBy$'
__vcs_id__ = '$HeadURL$'


from operator import isNumberType
from types import StringTypes
from os.path import join as osjoin
from arcgisscripting import create
from pygp import gp
from pygp._decorators import Result
from pygp.dataelement.extended import FeatureClass


__copyright__ = 'Copyright (c) 2009-2012, Integrated Informatics Inc.'
__license__ = 'LGPLv3'
__version__ = '0.1.0'
__email__ = 'gis@integrated-informatics.com'


POLYGON = r'sample_data\fgdb_with_data.gdb\prj_a'

def example_mixed_return(path):
    """
    Example showing the mix of values returned from tool calls

    :param path: Workspace Path
    :type path: str
    """
    gp = create()
    gp.OverwriteOutput = True
    feature_class = gp.CopyFeatures_management(
        osjoin(path, POLYGON), osjoin(path, 'iii'))
    if isinstance(feature_class, StringTypes):
        # Get here for 9.2
        return feature_class
    if hasattr(feature_class, 'GetOutput'):
        # Get here for 9.3+
        return feature_class.GetOutput(0)
# End example_mixed_return function

Looking at the positive parts of this change (who am I kidding, at the time this change came out it caused so much code rework it was unbelievable) it gave us a really good reason to dive into using function decorators. For example, this one works by sensing the type of value returned from a function and then wraps the results accordingly to give the same return value regardless of the version of ArcGIS.

def geoprocessing_result(func):
    """
    Decorator function to add timing to a function
    """
    def decorator(*args, **kwargs):
        """
        Decorator function
        """
        fnc = func(*args, **kwargs)
        if isinstance(fnc, Result) or isNumberType(fnc):
            return fnc
        elif hasattr(fnc, 'GetOutput') or isinstance(fnc, StringTypes):
            return Result(fnc)
        else:
            raise TypeError('Results not understood')
    # End decorator function
    decorator.__name__ = func.__name__
    decorator.__doc__ = func.__doc__
    return decorator
# End geoprocessing_result function

And this is how this decorator is used:

@geoprocessing_result
def copy_features(in_features, out_feature_class, config_keyword=None,
                  spatial_grid_1=0., spatial_grid_2=0., spatial_grid_3=0.):
    """
    Copy features from a feature class, or layer.

    This is a decorated function where the decorator (geoprocessing_result)
    will test and wrap the result.

    :param in_features: Input Feature Class
    :type in_features: str
    :param out_feature_class: Output Feature Class
    :type out_feature_class: str
    :param config_keyword: Configuration Keyword
    :type config_keyword: str
    :param spatial_grid_1: Spatial Grid 1
    :type spatial_grid_1: float
    :param spatial_grid_2: Spatial Grid 2
    :type spatial_grid_2: float
    :param spatial_grid_3: Spatial Grid 3
    :type spatial_grid_3: float
    """
    gp = create()
    return gp.CopyFeatures_management(
        in_features, out_feature_class, config_keyword, spatial_grid_1,
        spatial_grid_2, spatial_grid_3)
# End copy_features function

Once the decorator has been implemented it then means we can write code that is version agnostic, allowing us to do things like this (see examples below) where the return type is always a Result object (no guessing on the return type of extra inspections to determine the type) andt then the second example shows how by standardizing the return type we can then implement additional standardization within our classes like FeatureClass.

def example_version_agnostic(path):
    """
    Example showing version agnostic return value

    * Overwrite Output is not set because the pygp package defaults to True
    * Create gp is not required since it occurs as part of the pygp package

    This will always return a Result object regardless of version

    :param path: Workspace Path
    :type path: str
    :return: Result
    :rtype: Result
    """
    return copy_features(osjoin(path, POLYGON), gp.create_fresh_name())
# End example_version_agnostic function

def example_extended(path):
    """
    Example showing extended capability by wrapping the result

    Feature Class class is instantiated and knows how to interpret a
    Result Object into something more useful.

    :param path: Workspace Path
    :type path: str
    :return: Copied Feature Class
    :rtype: FeatureClass
    """
    return FeatureClass(copy_features(
        osjoin(path, POLYGON), gp.create_fresh_name()))
# End example_extended function

Copyrights matter…

Geometry Handling in pygp versus arcgisscripting

February 4, 2012 5 comments

A year or so ago we gave a presentation at a local Esri Developer Meetup that showcased some of the advanced functionality we have implemented in our pygp package.  For those unaware of pygp, it stands for Python Geoprocessing and is a relatively large suite of api-like code (and documentation and class diagrams) that allows for the same code to operate Python versions 2.4 through to 2.7 and Esri ArcGIS 9.2 through to 10.x (yes, that includes the dozen or so service packs in between too).

pygp contains a large number of functions and classes that are hugely beneficial when developing applications that need to be more than a typical 5 lines of code exported from Model Builder.  The main benefit of pygp is that we use it in our day-to-day business, this means it is constantly under development with new capabilities being added weekly.  Not only that, we implement things in a Pythonic and natural manner i.e. as professionals tasked with automating complex workflows how would we prefer to work?

Here for example is an example adapted from the Esri help documentation circa 9.2 and 9.3.  Recall that this is the vintage where lists were actually enumerations and cursors had to be stepped over using a while loop.

# -*- coding: ascii -*-
"""
Raises ValueError on GetObject for Null point in 9.2 and 9.3 and
Returns None in 10.
"""

__author__ = 'Jason Humber - Integrated Informatics Inc.'
__maintainer__ = '$LastChangedBy$'
__vcs_id__ = '$HeadURL$'

from os.path import join as osjoin
from arcgisscripting import create
from pygp.dataelement.extended import FeatureClass

__copyright__ = 'Copyright (c) 2009-2012, Integrated Informatics Inc.'
__license__ = 'LGPLv3'
__version__ = '0.1.0'
__email__ = 'gis@integrated-informatics.com'

POLYGON = r'sample_data\fgdb_with_data.gdb\prj_a'

def example_geometry_help(path):
    """
    Example from ESRI help documentation on accessing parts and points.

    Code adapted from ESRI Help for 9.3, reading geometries
    This approach avoids the ValueError

    :param path: Workspace Path
    :type path: str
    """
    gp = create()
    shape_field_name = gp.Describe(osjoin(path, POLYGON)).ShapeFieldName
    srows = gp.SearchCursor(POLYGON)
    srow = srows.Next()
    while srow:
        geom = srow.GetValue(shape_field_name).GetPart()
        part_num = 0
        part_count = range(geom.PartCount)
        while part_num < part_count:
            part = geom.GetPart(part_num)
            pnt = part.Next()
            pnt_count = 0
            while pnt:
                print pnt.x, pnt.y
                pnt = part.Next()
                pnt_count += 1

                if not pnt:
                    pnt = part.Next()
                    if pnt:
                        print "Interior Ring:"
            part_num += 1
        srow = srows.Next()
    del srow, srows
# End example_geometry_help function

The help documentation spells out what this results in and this example is essentially just showing how to get the points and parts of a geometry while looping over the rows in a feature class.  Nothing magical here other that to point out that this is a lot of code and that the approach does not raise a ValueError like this second approach (which should work by returning a null point).

def example_geometry_value_error(path):
    """
    Example Showing how ValueError was raised when using GetObject on a
    Null Point (i.e. the separator for exterior/interior rings) in releases
    before ArcGIS 10.  As of ArcGIS 10 the Null point is returned as None.

    :param path: Workspace Path
    :type path: str
    """
    gp = create()
    shape_field_name = gp.Describe(osjoin(path, POLYGON)).ShapeFieldName
    srows = gp.SearchCursor(osjoin(path, POLYGON))
    srow = srows.Next()

    # This approach will hit the ValueError in 9.2/9.3
    while srow:
        geom = srow.GetValue(shape_field_name).GetPart()
        part_num = 0
        part_count = range(geom.PartCount)
        while part_num < part_count:
            part = geom.GetPart(part_num)
            for i in range(part.count):
                try:
                    pnt = part.GetObject(i)
                    print pnt.x, pnt.y
                # This ValueError was reliable until 10
                # Now it no longer gets raised, code broken!!!
                except ValueError:
                    print "Interior Ring:"
                    continue
            part_num += 1
        srow = srows.Next()
    del srow, srows
# End example_geometry_value_error function

So this is annoying.  Accessing points in the geometry one way works fine and in another manner it bombs.  For those of you wondering, this issue was around for all of 9.x but then finally (and unexpectedly) it got fixed in 10!  That’s great!  But what about code that needs to work in both 9.x and 10.x, that becomes the issue.  Not only that, it is still a huge amount of code to write just to get the points out of a geometry.  Part of me wishes for the simplicity of Avenue when geometries were fast and super easy to work with…so with a little motivation here is what we did:

def example_geometry(path):
    """
    Example showing use of Geometry helper class that does the heavy lifting
    on the geometry object and returns something quite similar to WKT/GeoJSON

    Structure is simple, a tuple of tuple of Point objects, very similar to
    Avenue days of geometry and WKT MultiLineString etc.
    (((0, 498266, 6100519, None, None), (0, 499775, 6100281, None, None),
      (0, 500224, 6098694, None, None), (0, 499616, 6097662, None, None),
      (0, 498346, 6096789, None, None)))

    :param path: Workspace Path
    :type path: str
    """
    feature_class = FeatureClass(osjoin(path, POLYGON))
    for srow in feature_class.search():
        print srow.get_value(feature_class.shape_field_name).as_tuple()
# End example_geometry function

Let me step through the subtlties of this code.  First of all, it is only 3 lines (could have made it one if the motivation was to distract).  See how we have a nice FeatureClass object, this is quasi similar to a DEFeatureClass in ArcObjects but implemented for Python.

The next thing to see is that we can loop over the rows in the feature class using a search method (cool!).  Not only that, we can loop over it using a for loop, now that is hot!

On the third line we are using get_value, a method on our SearchRow object that is smart enough to know how to return geometry in a nicely usable way, that is, by returning our Geometry object.  Also see that when we want to return a special field the FeatureClass object we can get the name using a property of the FeatureClass object (for example, shape_field_name).

Last thing is the use of the as_tuple method, a beautiful method that knows how to return geometry as a set of nested tuples of Point objects that can just as easily be reassembled.

Categories: ArcGIS, GIS, Python
Follow

Get every new post delivered to your Inbox.