Data Management

Data management involves the theories and techniques for managing the entire data lifecycle, from data collection to data format conversion, from data storage to data sharing and retrieval, to data provenance, data quality control and data curation for long-term data archival and preservation.

Topics in this Knowledge Area are listed thematically below. Existing topics are in regular font and linked directly to their original entries (published in 2006; these contain only Learning Objectives). Entries that have been updated and expanded are in bold. Forthcoming topics are italicized.


Spatial Databases | Genealogical Relationships, Linkage, and Inheritance | Georeferencing Systems
Spatial Database Management Systems | Conflation & Related Spatial Data Integration Techniques | Approximating the Earth's Shape with Geoids
Use of Relational DBMSs | Standardization & Exchange Specifications | Geographic Coordinate Systems
Object-Oriented DBMSs | Spatial Access Methods | Planar Coordinate Systems
Extensions of the Relational DBMS | Data Retrieval Methods | Tesselated Referencing Systems
Topological Relationships | Spatial Indexing | Linear Referencing Systems
Database Administration | Space-driven Structures: Grid, linear quadtree, and z-ordering tree files | Vertical Datums
Conceptual Data Models | Data-driven structures: R-trees and cost models | Horizontal Datums
Logical Data Models | Modeling Unstructured Spatial Data | Georegistration
Physical Data Models | Modeling Semi-Structured Spatial Data | Spatial Data Infrastructures
NoSQL Databases | Query Processing | Metadata
Problems with Large Spatial Databases | Optimal I/O Algorithms | Content Standards
Representations of Spatial Objects | Spatial Joins | Data Warehouses
The Raster Data Model | Complex Queries | Spatial Data Infrastructures
Classical Vector Data Models | Spatial Data Quality | U.S. National Spatial Data Infrastructure
The Topological Model | Vagueness | Common Ontologies for Spatial Data & Their Applications
The Spaghetti Model | Mathematical Models of Vagueness: Fuzzy and Rough sets |
The Network Model | Error-based Uncertainty |
Modeling 3D Entities | Spatial Data Uncertainty |
Field-Based Models | |
Fuzzy Models | |
Triangulated Irregular Network Models | |


DM-44 - Approximating the Earth's shape with geoids
  • Explain why gravity varies over the Earth’s surface
  • Explain how geoids are modeled
  • Explain the role that the U.S. National Geodetic Survey plays in maintaining and developing geoid models
  • Explain the concept of an equipotential gravity surface (i.e., a geoid)
DM-14 - Classic vector data models
  • Illustrate the GBF/DIME data model
  • Describe a Freeman-Huffman chain code
  • Describe the relationship of Freeman-Huffman chain codes to the raster model
  • Discuss the impact of early prototype data models (e.g., POLYVRT and GBF/DIME) on contemporary vector formats
  • Describe the relationship between the GBF/DIME and TIGER structures, the rationale for their design, and their intended primary uses, paying particular attention to the role of graph theory in establishing the difference between GBF/DIME and TIGER files
  • Discuss the advantages and disadvantages of POLYVRT
  • Explain what makes POLYVRT a hierarchical vector data model
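To make the chain-code objectives above more concrete, here is a minimal sketch of a Freeman-style 8-direction chain code. It assumes the common convention that code 0 points east and codes increase counterclockwise, with x growing eastward and y growing northward. The function names `encode_chain` and `decode_chain` are illustrative, not taken from any particular library. Because each code describes one step between neighboring grid cells, the sketch also hints at why chain codes are closely tied to the raster model.

```python
# Freeman-style 8-direction codes: 0 = east, numbered counterclockwise.
# Assumption: x grows to the east and y grows to the north.
DIRECTIONS = [(1, 0), (1, 1), (0, 1), (-1, 1), (-1, 0), (-1, -1), (0, -1), (1, -1)]


def encode_chain(cells):
    """Encode a path of 8-connected grid cells as a list of direction codes."""
    codes = []
    for (x0, y0), (x1, y1) in zip(cells, cells[1:]):
        step = (x1 - x0, y1 - y0)
        if step not in DIRECTIONS:
            raise ValueError(f"cells {(x0, y0)} and {(x1, y1)} are not neighbors")
        codes.append(DIRECTIONS.index(step))
    return codes


def decode_chain(start, codes):
    """Rebuild the list of grid cells from a start cell and direction codes."""
    cells = [start]
    x, y = start
    for code in codes:
        dx, dy = DIRECTIONS[code]
        x, y = x + dx, y + dy
        cells.append((x, y))
    return cells


# Hypothetical boundary fragment: east, northeast, north, then west.
path = [(0, 0), (1, 0), (2, 1), (2, 2), (1, 2)]
codes = encode_chain(path)
print(codes)                        # [0, 1, 2, 4]
print(decode_chain((0, 0), codes))  # the original path
```

Only the start cell needs full coordinates; every later cell costs a single code in the range 0 to 7, which is the storage saving chain codes are usually credited with.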
DM-34 - Conceptual Data Models

In the initial phase of database design, a conceptual data model is created as a technology-independent specification of the data to be stored within a database. This specification often takes the form of a formalized diagram. The process of conceptual data modeling is meant to foster shared understanding among data modelers and stakeholders when creating the specification. As such, a conceptual data model should be easily readable by people with little or no technical, computer-based expertise, because a comprehensive view of information is more important than a detailed view. In a conceptual data model, entity classes are categories of things (person, place, thing, etc.) that have attributes describing the characteristics of those things. Relationships can exist between the entity classes. Entity-relationship diagrams have been, and are likely to continue to be, a popular way of characterizing entity classes, attributes, and relationships. Various diagram notations have been used over the years. The main intent of a conceptual data model and its corresponding entity-relationship diagram is to highlight the content and meaning of data within stakeholder information contexts, while postponing the specification of logical structure to the second phase of database design, called logical data modeling.

DM-58 - Content standards
  • Differentiate between a controlled vocabulary and an ontology
  • Describe a domain ontology or vocabulary (e.g., land use classification systems, surveyor codes, data dictionaries, place names, or benthic habitat classification systems)
  • Describe how a domain ontology or vocabulary facilitates data sharing
  • Define “thesaurus” as it pertains to geospatial metadata
  • Describe the primary focus of the following content standards: FGDC, Dublin Core Metadata Initiative, and ISO 19115
  • Differentiate between a content standard and a profile
  • Describe some of the profiles created for the Content Standard for Digital Geospatial Metadata (CSDGM)
DM-02 - Data retrieval strategies
  • Analyze the relative performance of data retrieval strategies
  • Implement algorithms that retrieve geospatial data from a range of data structures
  • Describe the particular advantages of Morton addressing relative to geographic data representation
  • Discuss the advantages and disadvantages of different data structures (e.g., arrays, linked lists, binary trees, hash tables, indexes) for retrieving geospatial data
  • Compare and contrast direct and indirect access search and retrieval methods
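As an illustration of the Morton addressing objective above, the following is a minimal sketch that interleaves the bits of a cell's column and row into a single integer key. The function names and the 16-bit default are assumptions made for the example. The bit order chosen here (column in the even bits, row in the odd bits) is one of two equally valid conventions.

```python
def morton_encode(col, row, bits=16):
    """Interleave the bits of col (even positions) and row (odd positions)."""
    code = 0
    for i in range(bits):
        code |= ((col >> i) & 1) << (2 * i)
        code |= ((row >> i) & 1) << (2 * i + 1)
    return code


def morton_decode(code, bits=16):
    """Recover (col, row) from a Morton code built by morton_encode."""
    col = row = 0
    for i in range(bits):
        col |= ((code >> (2 * i)) & 1) << i
        row |= ((code >> (2 * i + 1)) & 1) << i
    return col, row


# The four cells of a 2 x 2 block receive consecutive keys (a "Z" pattern).
for r in range(2):
    for c in range(2):
        print((c, r), morton_encode(c, r))

print(morton_encode(3, 5))  # 39
print(morton_decode(39))    # (3, 5)
```

Cells that are close together on the grid tend to receive nearby keys, so a one-dimensional index or a sorted file keyed on the Morton code keeps many spatial neighbors close together in storage. That property is the advantage usually cited for Morton addressing of geographic data.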
DM-59 - Data warehouses
  • Differentiate between a data warehouse and a database
  • Describe the functions that gazetteers support
  • Differentiate the retrieval mechanisms of data warehouses and databases
  • Discuss the appropriate use of a data warehouse versus a database
DM-62 - Database administration
  • Describe how using standards can affect implementation of a GIS
  • Explain how validation and verification processes can be used to maintain database integrity
  • Summarize how data access processes can be a factor in development of an enterprise GIS implementation
  • Describe effective methods to get stakeholders to create, adopt, or develop and maintain metadata for shared datasets
DM-20 - Discrete entities
  • Discuss the human predilection to conceptualize geographic phenomena in terms of discrete entities
  • Compare and contrast differing epistemological and metaphysical viewpoints on the “reality” of geographic entities
  • Identify the types of features that need to be modeled in a particular GIS application or procedure
  • Identify phenomena that are difficult or impossible to conceptualize in terms of entities
  • Describe the difficulties in modeling entities with ill-defined edges
  • Describe the difficulties inherent in extending the “tabletop” metaphor of objects to the geographic environment
  • Evaluate the effectiveness of GIS data models for representing the identity, existence, and lifespan of entities
  • Justify or refute the conception of fields (e.g., temperature, density) as spatially-intensive attributes of (sometimes amorphous and anonymous) entities
  • Model “gray area” phenomena, such as categorical coverages (a.k.a. discrete fields), in terms of objects
  • Evaluate the influence of scale on the conceptualization of entities
  • Describe the perceptual processes (e.g., edge detection) that aid cognitive objectification
  • Describe particular entities in terms of space, time, and properties
DM-32 - Error-based uncertainty
  • Define uncertainty-related terms, such as error, accuracy, uncertainty, precision, stochastic, probabilistic, deterministic, and random
  • Recognize expressions of uncertainty in language
  • Evaluate the causes of uncertainty in geospatial data
  • Describe a stochastic error model for a natural phenomenon
  • Explain how the familiar concepts of geographic objects and fields affect the conceptualization of uncertainty
  • Recognize the degree to which the importance of uncertainty depends on scale and application
  • Differentiate uncertainty in geospatial situations from vagueness
DM-69 - Exchange specifications
  • Describe the characteristics of the Geography Markup Language (GML)
  • Explain the purpose, history, and status of the Spatial Data Transfer Standard (SDTS)
  • Identify different levels of information integration
  • Identify the level of integration at which the Geography Markup Language (GML) operates
  • Describe the geospatial elements of Earth science data exchange specifications, such as the Ecological Metadata Language (EML), Earth Science Markup Language (ESML), and Climate Science Modeling Language (CSML)
  • Import data packaged in a standard transfer format to a GIS software package
  • Export data from a GIS program to a standard exchange format
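To give a feel for what GML-encoded geometry looks like, here is a minimal sketch that reads a single hand-written GML 3.2 point using only the Python standard library. The sample fragment, its `gml:id`, and the helper name `read_point` are made up for illustration. A real import would normally go through a GIS package rather than hand parsing, and the sketch makes no attempt to interpret the axis order of the coordinates.

```python
import xml.etree.ElementTree as ET

# Namespace URI used by GML 3.2.
GML_NS = {"gml": "http://www.opengis.net/gml/3.2"}

# Hypothetical fragment written for this example, not a complete GML document.
SAMPLE = """<gml:Point xmlns:gml="http://www.opengis.net/gml/3.2"
    gml:id="p1" srsName="urn:ogc:def:crs:EPSG::4326">
  <gml:pos>45.0 -93.0</gml:pos>
</gml:Point>"""


def read_point(gml_text):
    """Return (srsName, coordinates) for a single gml:Point element."""
    root = ET.fromstring(gml_text)
    pos = root.find("gml:pos", GML_NS)
    if pos is None or pos.text is None:
        raise ValueError("no gml:pos element found")
    coords = tuple(float(value) for value in pos.text.split())
    return root.get("srsName"), coords


srs, coords = read_point(SAMPLE)
print(srs)     # urn:ogc:def:crs:EPSG::4326
print(coords)  # (45.0, -93.0)
```

The coordinate reference system travels with the geometry in the `srsName` attribute, which is part of what lets GML-encoded data be exchanged between systems without a separate side channel for georeferencing details.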