When designing a spatial data model, we have many more concerns than those of a pure relational model. Sure, we can inherit some of the base modeling approaches, and in most cases this makes sense, but what concepts do we need to understand and write rules around when migrating from a relational model to a spatial model? Or, more specifically, how do we map concepts and objects from the relational realm into the spatial world?
Let’s begin by laying down some base assumptions rooted in relational database modeling:
- Normalization is your friend, the order is to be determined
- Super classes and classes can be useful when behavior or attribution differs
- Check constraints and restrictive vocabulary (look-up tables) are important for data integrity
- Primary and Foreign keys are an absolute must for referential integrity and data maintenance
- Naming conventions for objects and attributes simplify the implementation
- More…
Hard to argue with many, if any, of those (just good common sense when it comes to data modeling) and, for the most part, these are going to be the assumptions we use for spatial database modeling as well.
Instead of building an exhaustive list of assumptions (there’s a pile of great books available on these topics), our time is probably better invested in building up a set of assumptions that challenge the relational rules and/or patterns used in well-developed relational database models such as PPDM and PODS.
Over the years we’ve “spatialized” many databases, so we’ve seen a few patterns emerge and, remarkably, the more databases we see, the fewer new patterns emerge. Here’s a collection of what we normally encounter and need to handle when moving from relational into spatial. For now we’ll just identify the key items and elaborate mildly, just enough to give you a flavor. The details of each item will be handled in its own upcoming blog entry.
Musical interlude… If I haven’t already mentioned it, the goal here is to develop a series of processes or Python modules that can magically crawl through the metadata of your database (Oracle in my case) and spit out a Geodatabase structure with minimal intervention… elevator music concludes…
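To give a flavor of what that metadata crawl might look like, here is a minimal sketch and nothing more. It assumes the column metadata has already been pulled into plain tuples of (table name, column name, data type), for example from Oracle’s ALL_TAB_COLUMNS view, and it flags tables that look like point feature class candidates. The function name, the hint lists, and the sample table are all hypothetical; a real schema would need its own naming rules.

```python
from collections import defaultdict

# Hypothetical column-name hints; a real schema will need its own lists.
X_HINTS = {"LONGITUDE", "LON", "X_COORD", "EASTING"}
Y_HINTS = {"LATITUDE", "LAT", "Y_COORD", "NORTHING"}
NUMERIC_TYPES = {"NUMBER", "FLOAT"}


def find_point_candidates(column_rows):
    """Return {table: (x_column, y_column)} for tables that hold a
    numeric coordinate pair, judged by column name only.

    column_rows is an iterable of (table_name, column_name, data_type)
    tuples, the kind of rows a metadata query would hand back.
    """
    numeric_cols = defaultdict(set)
    for table, column, data_type in column_rows:
        if data_type.upper() in NUMERIC_TYPES:
            numeric_cols[table].add(column.upper())

    candidates = {}
    for table, cols in numeric_cols.items():
        xs = sorted(cols & X_HINTS)
        ys = sorted(cols & Y_HINTS)
        if xs and ys:
            # Take the first match of each; ambiguity needs a human.
            candidates[table] = (xs[0], ys[0])
    return candidates


# Made-up metadata rows for illustration.
sample = [
    ("WELL", "WELL_ID", "VARCHAR2"),
    ("WELL", "LATITUDE", "NUMBER"),
    ("WELL", "LONGITUDE", "NUMBER"),
    ("AREA", "AREA_ID", "VARCHAR2"),
]
print(find_point_candidates(sample))
```

Name matching alone will miss plenty and will occasionally guess wrong, so treat the output as a list of suggestions to review rather than a finished Geodatabase design.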
Relational Patterns
- Coordinate pairs stored as numbers with associated projection/datum code – normally used in cases where point geometry would represent the object location
- Listing of coordinate pairs with an associated grouping identifier: normally used for lines or polygons that have their full geometry defined as an ordered set of coordinates; could be stored as WKT or as a 1:M relationship
- No coordinate pairs: locations are referenced by a legally defined survey system, an address, a linear reference system, or even linear reference systems within linear reference systems (events of events)
- Minimal coordinates (e.g. bounding box) with offset definitions from a local orthogonal coordinate system
- Coordinate collections can have different spatial references defined, or may have an unknown spatial reference (how is this information even remotely useful? This links directly to quality)
- Multi-column primary keys: very common in PPDM, not so common in PODS. In either case, this is a bit of a limitation in the Geodatabase world, but thankfully the use of GUIDs can help.
- Super classes: a legacy approach to spatializing PPDM suggested that all polygons be managed in one feature class and then joined to the necessary business table using the key values (the same was suggested for points and lines). Makes sense in theory, but performance is a pain. Things have improved… Tied in directly with this item is the natural ask: what about sub-classes? Yup, that’s where subtypes come into the mix
- Versioning — ’nuff said
- Quality: capturing this at the element level versus the classic GIS approach of generalizing across the whole dataset. Another concept here is returning preferred data only, where there are multiple sources and hierarchical business rules decide which elements are the most valuable for use
- Constraints — when should these be migrated and how? As relational database constraints only? As relationship classes? Or maybe as domains?
- Lookup tables — classic approach in databases to ensure efficient storage and high fidelity data — do domains provide everything needed in the geodatabase world?
- Business objects with variable spatial location depending on aggregation level: the roll-up of information may make the underlying data applicable to a more precisely defined location or a more general one. Think summary statistics for census data as an example.
- More…
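Two of the patterns above lend themselves to a quick illustration: the 1:M listing of ordered coordinates, and multi-column primary keys. The sketch below is only a rough cut under stated assumptions. It assumes coordinate rows arrive as (group id, sequence, x, y) tuples and turns each group into WKT, and it derives a repeatable GUID from the parts of a composite key using a name-based UUID. The function names, the namespace string, and the sample values are all made up for the example, and the braces-and-upper-case GUID formatting is an assumption about what the target expects.

```python
import uuid
from collections import defaultdict

# Hypothetical namespace so the same key always yields the same GUID.
KEY_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_URL, "example.invalid/spatial-keys")


def rows_to_wkt(rows, closed=False):
    """Group (group_id, seq, x, y) rows into WKT strings.

    Each group is sorted by seq. With closed=True the ring is closed
    if needed and a single-ring POLYGON is emitted, else a LINESTRING.
    """
    groups = defaultdict(list)
    for group_id, seq, x, y in rows:
        groups[group_id].append((seq, x, y))

    result = {}
    for group_id, pts in groups.items():
        pts.sort(key=lambda p: p[0])
        coords = [(x, y) for _, x, y in pts]
        if closed and coords[0] != coords[-1]:
            coords.append(coords[0])  # repeat the first vertex to close
        text = ", ".join(f"{x} {y}" for x, y in coords)
        result[group_id] = (
            f"POLYGON (({text}))" if closed else f"LINESTRING ({text})"
        )
    return result


def composite_key_to_guid(*key_parts):
    """Derive a deterministic GUID string from composite key parts.

    Assumes "|" never appears inside a key part; pick another
    delimiter (or escape it) if that does not hold for your data.
    """
    joined = "|".join(str(p) for p in key_parts)
    return "{" + str(uuid.uuid5(KEY_NAMESPACE, joined)).upper() + "}"


# Rows deliberately out of order to show the sort on seq.
rows = [("A", 2, 1, 1), ("A", 1, 0, 0), ("A", 3, 2, 0)]
print(rows_to_wkt(rows))               # one LINESTRING for group "A"
print(rows_to_wkt(rows, closed=True))  # same vertices as a POLYGON
print(composite_key_to_guid("LAND_RIGHT", "LR-001", 3))
```

A name-based GUID has the nice property that re-running the migration produces the same identifier for the same key, which keeps joins back to the business tables stable. Whether that beats a generated GUID stored alongside the original key columns is a design choice we have not settled here.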
Alright, that should be enough to keep us busy for the next few months. Tune in for updates as we hack through these ideas and present our findings.