Data Management

Data management involves the theories and techniques for managing the entire data lifecycle, from data collection to data format conversion, from data storage to data sharing and retrieval, to data provenance, data quality control and data curation for long-term data archival and preservation.

Topics in this Knowledge Area are listed thematically below. Existing topics are in regular font and linked directly to their original entries (published in 2006; these contain only Learning Objectives). Entries that have been updated and expanded are in bold. Forthcoming topics are italicized.

 

Spatial Databases | Genealogical Relationships, Linkage, and Inheritance | Georeferencing Systems
Spatial Database Management Systems | Conflation & Related Spatial Data Integration Techniques | Approximating the Earth's Shape with Geoids
Use of Relational DBMSs | Standardization & Exchange Specifications | Geographic Coordinate Systems
Object-Oriented DBMSs | Spatial Access Methods | Planar Coordinate Systems
Extensions of the Relational DBMS | Data Retrieval Methods | Tessellated Referencing Systems
Topological Relationships | Spatial Indexing | Linear Referencing Systems
Database Administration | Space-driven Structures: Grid, linear quadtree, and z-ordering tree files | Vertical Datums
Conceptual Data Models | Data-driven structures: R-trees and cost models | Horizontal Datums
Logical Data Models | Modeling Unstructured Spatial Data | Georegistration
Physical Data Models | Modeling Semi-Structured Spatial Data | Spatial Data Infrastructures
NoSQL Databases | Query Processing | Metadata
Problems with Large Spatial Databases | Optimal I/O Algorithms | Content Standards
Representations of Spatial Objects | Spatial Joins | Data Warehouses
The Raster Data Model | Complex Queries | Spatial Data Infrastructures
Classical Vector Data Models | Spatial Data Quality | U.S. National Spatial Data Infrastructure
The Topological Model | Vagueness | Common Ontologies for Spatial Data & Their Applications
The Spaghetti Model | Mathematical Models of Vagueness: Fuzzy and Rough sets |
The Network Model | Error-based Uncertainty |
Modeling 3D Entities | Spatial Data Uncertainty |
Field-Based Models | |
Fuzzy Models | |
Triangulated Irregular Network Models | |

 

DM-69 - Exchange specifications
  • Describe the characteristics of the Geography Markup Language (GML)
  • Explain the purpose, history, and status of the Spatial Data Transfer Standard (SDTS)
  • Identify different levels of information integration
  • Identify the level of integration at which the Geography Markup Language (GML) operates
  • Describe the geospatial elements of Earth science data exchange specifications, such as the Ecological Metadata Language (EML), Earth Science Markup Language (ESML), and Climate Science Modeling Language (CSML)
  • Import data packaged in a standard transfer format to a GIS software package
  • Export data from a GIS program to a standard exchange format
DM-05 - Extensions of the relational model
  • Explain why early attempts to store geographic data in standard relational tables failed
  • Evaluate the adequacy of contemporary proprietary database schemes to manage geospatial data
  • Describe standards efforts relating to relational extensions, such as SQL:1999 and SQL-MM
  • Evaluate the degree to which an available object-relational database management system approximates a true object-oriented paradigm
  • Describe extensions of the relational model designed to represent geospatial and other semistructured data, such as stored procedures, Binary Large Objects (BLOBs), nested tables, abstract data types, and spatial data types
DM-23 - Fields in space and time
  • Define a field in terms of properties, space, and time
  • Formalize the notion of field using mathematical functions and calculus
  • Recognize the influences of scale on the perception and meaning of fields
  • Evaluate the field view’s description of “objects” as conceptual discretizations of continuous patterns
  • Identify applications and phenomena that are not adequately modeled by the field view
  • Identify examples of discrete and continuous change found in spatial, temporal, and spatio-temporal fields
  • Relate the notion of field in GIS to the mathematical notions of scalar and vector fields
  • Differentiate various sources of fields, such as substance properties (e.g., temperature), artificial constructs (e.g., population density), and fields of potential or influence (e.g., gravity)
DM-41 - Fuzzy logic
  • Describe how linear functions are used to fuzzify input data (i.e., mapping domain values to linguistic variables)
  • Support or refute the statement by Lotfi Zadeh, that “As complexity rises, precise statements lose meaning and meaningful statements lose precision,” as it relates to GIS&T
  • Explain why fuzzy logic, rather than Boolean algebra models, can be useful for representing real-world boundaries between different tree species
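
As a minimal sketch of the linear fuzzification named in the first objective above, the Python below maps a crisp value to membership grades with a piecewise-linear function. The input variable (slope in degrees), the class names "gentle" and "steep", and the breakpoints 5 and 25 are hypothetical and chosen only for illustration.

```python
def linear_membership(x, low, high):
    """Rise linearly from 0 at `low` to 1 at `high`; clamp outside that range."""
    if high == low:
        raise ValueError("low and high must differ")
    t = (x - low) / (high - low)
    return max(0.0, min(1.0, t))


def fuzzify_slope(slope_deg):
    """Map a slope in degrees to grades for two hypothetical linguistic classes."""
    # Breakpoints of 5 and 25 degrees are illustrative assumptions.
    steep = linear_membership(slope_deg, 5.0, 25.0)
    return {"gentle": 1.0 - steep, "steep": steep}
```

A slope of 15 degrees sits halfway between the breakpoints, so it belongs equally (0.5 each) to both classes, whereas a Boolean model would have to assign it to exactly one.
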
DM-27 - Genealogical relationships: lineage, inheritance
  • Describe ways in which a geographic entity can be created from one or more others
  • Discuss the effects of temporal scale on the modeling of genealogical structures
  • Describe the genealogy (as identity-based change or temporal relationships) of particular geographic phenomena
  • Determine whether it is important to represent the genealogy of entities for a particular application
DM-47 - Geographic coordinate system
  • Distinguish between various latitude definitions (e.g., geocentric, geodetic, astronomic latitudes)
  • Explain the angular measurements represented by latitude and longitude coordinates
  • Calculate the latitude and longitude coordinates of a given location on the map using the coordinate grid ticks in the collar of a topographic map and the appropriate interpolation formula
  • Mathematically express the relationship between Cartesian coordinates and polar coordinates
  • Calculate the uncertainty of a ground position defined by latitude and longitude coordinates specified in decimal degrees to a given number of decimal places
  • Use GIS software and base data encoded as geographic coordinates to geocode a list of address-referenced locations
  • Locate on a globe the positions represented by latitude and longitude coordinates
  • Write an algorithm that converts geographic coordinates from decimal degrees (DD) to degrees, minutes, seconds (DMS) format
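
One possible shape for the DD-to-DMS conversion in the last objective above is sketched below in Python. The function names and the output format are assumptions made for illustration. The value is first converted to rounded total seconds so that a result such as 59.9999999 seconds carries into the minutes instead of printing as 60.

```python
def dd_to_dms(dd):
    """Convert decimal degrees to (sign, degrees, minutes, seconds)."""
    sign = -1 if dd < 0 else 1
    # Work in total arc-seconds, rounded once, so carries propagate cleanly.
    total_seconds = round(abs(dd) * 3600, 6)
    degrees = int(total_seconds // 3600)
    remainder = total_seconds - degrees * 3600
    minutes = int(remainder // 60)
    seconds = round(remainder - minutes * 60, 6)
    return sign, degrees, minutes, seconds


def format_dms(dd, positive="N", negative="S"):
    """Format a coordinate with a hemisphere letter (use "E"/"W" for longitude)."""
    sign, d, m, s = dd_to_dms(dd)
    hemisphere = positive if sign > 0 else negative
    return f"{d}°{m:02d}'{s:05.2f}\"{hemisphere}"
```

For example, `format_dms(40.5)` yields `40°30'00.00"N`, and `format_dms(-77.25, "E", "W")` yields `77°15'00.00"W`.
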
DM-56 - Georegistration
  • Differentiate rectification and orthorectification
  • Identify and explain an equation used to perform image-to-map registration
  • Identify and explain an equation used to perform image-to-image registration
  • Use GIS software to transform a given dataset to a specified coordinate system, projection, and datum
  • Explain the role and selection criteria for “ground control points” (GCPs) in the georegistration of aerial imagery
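
One commonly used equation for image-to-map registration is the first-order (affine) transformation, X = a0 + a1·col + a2·row and Y = b0 + b1·col + b2·row. The Python sketch below solves for the six coefficients from exactly three ground control points. It is an illustration under that assumption, not the method of any particular GIS package; in practice more GCPs are normally used with a least-squares fit, and all function names here are hypothetical.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as three rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def solve3(m, v):
    """Solve the 3x3 linear system m @ x = v with Cramer's rule."""
    d = det3(m)
    if abs(d) < 1e-12:
        raise ValueError("control points are collinear")
    solution = []
    for k in range(3):
        # Replace column k of m with the right-hand side v.
        mk = [row[:k] + [v[j]] + row[k + 1:] for j, row in enumerate(m)]
        solution.append(det3(mk) / d)
    return solution


def fit_affine(gcps):
    """Fit the affine coefficients from three ((col, row), (X, Y)) pairs."""
    design = [[1.0, col, row] for (col, row), _ in gcps]
    a = solve3(design, [xy[0] for _, xy in gcps])
    b = solve3(design, [xy[1] for _, xy in gcps])
    return a, b


def apply_affine(coeffs, col, row):
    """Map an image (col, row) position to map coordinates (X, Y)."""
    a, b = coeffs
    return (a[0] + a[1] * col + a[2] * row, b[0] + b[1] * col + b[2] * row)
```

The collinearity check reflects one GCP selection criterion: control points that lie on a line cannot constrain the transformation, so they should be spread across the image.
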
DM-71 - Geospatial Data Conflation

Spatial data conflation is the process of combining overlapping spatial datasets to produce a better dataset with higher accuracy or more information. Conflation is needed in many fields that require multiple data sources, ranging from transportation planning to the analysis of historical datasets. Geospatial data conflation has become increasingly important with the advancement of GIS and the emergence of new sources of spatial data such as Volunteered Geographic Information.

Conceptually, conflation is a two-step process: identifying counterpart features that correspond to the same object in reality, and then merging the geometry and attributes of those counterpart features. In practice, conflation can be performed either manually or with the aid of GIS, with varying degrees of automation. Manual conflation is labor-intensive, time-consuming, and expensive. It is nonetheless often adopted in practice because reliable automatic conflation methods are lacking.

A main challenge of automatic conflation is matching corresponding features automatically, because map data vary in quality and in how they represent the same objects. Many (semi-)automatic feature-matching methods exist. They typically measure the dissimilarity (for example, the distance) between each candidate feature pair and then use a specially designed algorithm or model to match the pairs with the smallest dissimilarity. Fully automated conflation is still an active research field.
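
To make the matching step concrete, the following Python sketch pairs point features from two datasets by greedy nearest-neighbor matching on Euclidean distance. This is a deliberately simple stand-in for the specially designed algorithms or models mentioned above, not one of them. The feature identifiers, the `max_dist` threshold, and the assumption that both datasets share a planar coordinate system are all illustrative.

```python
import math


def match_features(source, target, max_dist):
    """Greedily pair points from two datasets, closest pairs first.

    `source` and `target` map hypothetical feature ids to (x, y) tuples.
    Pairs farther apart than `max_dist` are left unmatched.
    """
    candidates = sorted(
        (math.dist(p, q), sid, tid)
        for sid, p in source.items()
        for tid, q in target.items()
    )
    matches, used_source, used_target = {}, set(), set()
    for dist, sid, tid in candidates:
        if dist > max_dist:
            break  # remaining candidates are even farther apart
        if sid in used_source or tid in used_target:
            continue  # each feature may be matched at most once
        matches[sid] = tid
        used_source.add(sid)
        used_target.add(tid)
    return matches
```

With `source = {"a": (0, 0), "b": (10, 0)}`, `target = {"x": (0.5, 0), "y": (10, 1), "z": (50, 50)}`, and `max_dist=2`, the function returns `{"a": "x", "b": "y"}` and leaves `"z"` unmatched. Real datasets also call for attribute and shape similarity, which this sketch ignores.
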

DM-11 - Hierarchical data models
  • Illustrate the quadtree model
  • Describe the advantages and disadvantages of the quadtree model for geographic database representation and modeling
  • Describe alternatives to quadtrees for representing hierarchical tessellations (e.g., hextrees, R-trees, pyramids)
  • Explain how quadtrees and other hierarchical tessellations can be used to index large volumes of raster or vector data
  • Implement a format for encoding quadtrees in a data file
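
As one illustration of the last objective above, the Python sketch below encodes a region quadtree for a square 2^n × 2^n binary raster as a preorder string. The format is an assumption invented for this example, not a standard: a uniform block is written as its cell value (`0` or `1`), and a mixed block is written as `G` followed by its NW, NE, SW, and SE quadrants.

```python
def encode(grid, row=0, col=0, size=None):
    """Serialize a square 2^n x 2^n grid of 0/1 cells as a preorder string."""
    if size is None:
        size = len(grid)
    first = grid[row][col]
    uniform = all(
        grid[r][c] == first
        for r in range(row, row + size)
        for c in range(col, col + size)
    )
    if uniform:
        return str(first)
    half = size // 2
    # Mixed block: emit 'G', then recurse into NW, NE, SW, SE.
    return "G" + "".join(
        encode(grid, row + dr, col + dc, half)
        for dr, dc in ((0, 0), (0, half), (half, 0), (half, half))
    )


def decode(code, size):
    """Rebuild the grid from a string produced by `encode`."""
    grid = [[0] * size for _ in range(size)]
    pos = 0

    def fill(row, col, span):
        nonlocal pos
        symbol = code[pos]
        pos += 1
        if symbol == "G":
            half = span // 2
            for dr, dc in ((0, 0), (0, half), (half, 0), (half, half)):
                fill(row + dr, col + dc, half)
        else:
            for r in range(row, row + span):
                for c in range(col, col + span):
                    grid[r][c] = int(symbol)

    fill(0, 0, size)
    return grid
```

For the 4 × 4 grid `[[1,1,0,0],[1,1,0,0],[0,0,0,1],[0,0,1,1]]`, `encode` returns `"G100G0111"`: nine symbols instead of sixteen cells, which shows where the quadtree's storage advantage for homogeneous regions comes from.
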
DM-52 - Horizontal datums
  • Discuss appropriate applications of the various datum transformation options
  • Explain the difference between NAD 27 and NAD 83 in terms of ellipsoid parameters
  • Outline the historical development of horizontal datums
  • Explain the difference in coordinate specifications for the same position when referenced to NAD 27 and NAD 83
  • Explain the rationale for updating NAD 27 to NAD 83
  • Explain why all GPS data are originally referenced to the WGS 84 datum
  • Identify which datum transformation options are available and unavailable in a GIS software package
  • Define “horizontal datum” in terms of the relationship between a coordinate system and an approximation of the Earth’s surface
  • Describe the limitations of a Molodenski transformation and the circumstances in which a transformation with more parameters, such as a Helmert transformation, may be appropriate
  • Determine the impact of a datum transformation from NAD 27 to NAD 83 for a given location using a conversion routine maintained by the U.S. National Geodetic Survey
  • Explain the methodology employed by the U.S. National Geodetic Survey to transform control points from NAD 27 to NAD 83
  • Perform a Molodenski transformation manually
  • Use GIS software to perform a datum transformation
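
For readers working through the manual Molodenski calculation, the Python sketch below implements the abridged form of the transformation, which shifts latitude, longitude, and ellipsoidal height directly from the three origin shifts and the differences in ellipsoid parameters. The ellipsoid constants are the published Clarke 1866 and GRS 80 values. The origin shifts used in the usage note are commonly cited mean values for the conterminous United States and should be treated as illustrative only; appropriate values must come from an authoritative source for the area of interest. The function name is hypothetical.

```python
import math

# Ellipsoid parameters: (semi-major axis in metres, flattening).
CLARKE_1866 = (6378206.4, 1 / 294.9786982)   # ellipsoid of NAD 27
GRS_80 = (6378137.0, 1 / 298.257222101)      # ellipsoid of NAD 83


def abridged_molodensky(lat_deg, lon_deg, h, src, dst, dx, dy, dz):
    """Shift (lat, lon, ellipsoidal height) from ellipsoid `src` to `dst`.

    dx, dy, dz are the geocentric origin shifts in metres (target minus
    source). Returns the transformed (lat_deg, lon_deg, h).
    """
    a, f = src
    da = dst[0] - a
    df = dst[1] - f
    lat, lon = math.radians(lat_deg), math.radians(lon_deg)
    sin_lat, cos_lat = math.sin(lat), math.cos(lat)
    sin_lon, cos_lon = math.sin(lon), math.cos(lon)

    e2 = 2 * f - f * f                      # first eccentricity squared
    w = math.sqrt(1 - e2 * sin_lat ** 2)
    m = a * (1 - e2) / w ** 3               # meridian radius of curvature
    n = a / w                               # prime-vertical radius of curvature

    dlat = (-dx * sin_lat * cos_lon - dy * sin_lat * sin_lon + dz * cos_lat
            + (a * df + f * da) * math.sin(2 * lat)) / m
    dlon = (-dx * sin_lon + dy * cos_lon) / (n * cos_lat)
    dh = (dx * cos_lat * cos_lon + dy * cos_lat * sin_lon + dz * sin_lat
          + (a * df + f * da) * sin_lat ** 2 - da)
    return lat_deg + math.degrees(dlat), lon_deg + math.degrees(dlon), h + dh
```

A call such as `abridged_molodensky(40.0, -100.0, 0.0, CLARKE_1866, GRS_80, -8.0, 160.0, 176.0)` moves the point by a fraction of an arc-second in latitude and roughly 1.5 arc-seconds west in longitude. Because a single set of shifts is applied everywhere, the result only approximates the distortions within NAD 27, which is the limitation the objectives above ask readers to describe.
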
