data mining | GIS&T Body of Knowledge

AM-36 - Data mining approaches

Describe how data mining can be used for geospatial intelligence
Explain how the analytical reasoning techniques, visual representations, and interaction techniques that make up the domain of visual analytics have a strong spatial component
Demonstrate how cluster analysis can be used as a data mining tool
Interpret patterns in space and time using Dorling and Openshaw’s geographical analysis machine (GAM) demonstration of disease incidence diffusion
Differentiate between data mining approaches used for spatial and non-spatial applications
Explain how spatial statistics techniques are used in spatial data mining
Compare and contrast the primary types of data mining: summarization/characterization, clustering/categorization, feature extraction, and rule/relationships extraction
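
As an illustration of the cluster-analysis objective above, the following is a minimal sketch of k-means clustering on 2D point coordinates. It is one of many clustering methods and is not prescribed by the Body of Knowledge; the function names `nearest_center` and `kmeans`, the fixed iteration count, and the sample coordinates are all assumptions made for this example.

```python
def nearest_center(point, centers):
    """Return the index of the center closest to point (squared Euclidean distance)."""
    return min(
        range(len(centers)),
        key=lambda i: (point[0] - centers[i][0]) ** 2 + (point[1] - centers[i][1]) ** 2,
    )


def kmeans(points, initial_centers, iterations=10):
    """Cluster 2D points with a basic k-means loop (hypothetical helper).

    initial_centers is supplied by the caller so the result is deterministic.
    """
    centers = [tuple(c) for c in initial_centers]
    for _ in range(iterations):
        # Assignment step: group each point with its nearest center.
        groups = [[] for _ in centers]
        for p in points:
            groups[nearest_center(p, centers)].append(p)
        # Update step: move each center to the mean of its group.
        # A center with no assigned points keeps its previous position.
        centers = [
            (sum(x for x, _ in g) / len(g), sum(y for _, y in g) / len(g)) if g else c
            for g, c in zip(groups, centers)
        ]
    labels = [nearest_center(p, centers) for p in points]
    return centers, labels


# Illustrative data: two well-separated groups of point locations.
sample_points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
sample_centers, sample_labels = kmeans(sample_points, [(0, 0), (10, 10)])
```

Here `sample_labels` assigns each location to one of two clusters. A spatial application would typically also consider projection, distance metric, and the choice of the number of clusters, none of which this sketch addresses.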

AM-38 - Pattern recognition

Differentiate among machine learning, data mining, and pattern recognition
Explain the principles of pattern recognition
Apply a simple spatial mean filter to an image as a means of recognizing patterns
Construct an edge-recognition filter
Design a simple spatial mean filter
Explain the outcome of an artificial intelligence analysis (e.g., edge recognition), including a discussion of what the human did not see that the computer identified and vice versa
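
The spatial mean filter and edge-recognition filter named in the objectives above can be sketched as 3x3 moving-window operations on a raster. This is a minimal illustration, not a reference implementation: the helper names (`apply_kernel`, `mean_filter`, `edge_magnitude`) are hypothetical, the Sobel operator is one common choice of edge filter assumed here, and border cells are simply left at zero.

```python
import math

# 3x3 kernels: an equal-weight mean, and the Sobel pair (assumed edge operator).
MEAN_KERNEL = [[1 / 9] * 3 for _ in range(3)]
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def apply_kernel(image, kernel):
    """Slide a 3x3 kernel over a 2D list of numbers; border cells stay 0."""
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            total = 0.0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    total += image[r + dr][c + dc] * kernel[dr + 1][dc + 1]
            out[r][c] = total
    return out


def mean_filter(image):
    """Smooth the image by replacing each interior cell with its 3x3 average."""
    return apply_kernel(image, MEAN_KERNEL)


def edge_magnitude(image):
    """Combine horizontal and vertical Sobel responses into a gradient magnitude."""
    gx = apply_kernel(image, SOBEL_X)
    gy = apply_kernel(image, SOBEL_Y)
    return [
        [math.sqrt(x * x + y * y) for x, y in zip(row_x, row_y)]
        for row_x, row_y in zip(gx, gy)
    ]


# Illustrative raster: a dark region (0) beside a bright region (10).
sample_image = [[0, 0, 0, 10, 10, 10] for _ in range(5)]
sample_smooth = mean_filter(sample_image)
sample_edges = edge_magnitude(sample_image)
```

In `sample_edges`, the large values fall along the boundary between the dark and bright columns, while cells inside either uniform region are zero. The mean filter blurs the same boundary instead of highlighting it.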

AM-37 - Knowledge discovery

Explain how spatial data mining techniques can be used for knowledge discovery
Explain how a Bayesian framework can incorporate expert knowledge in order to retrieve all relevant datasets given an initial user query
Explain how visual data exploration can be combined with data mining techniques as a means of discovering research hypotheses in large spatial datasets

DM-70 - Problems of Large Spatial Databases

Large spatial databases, often labeled geospatial big data, exceed the capacity of commonly used computing systems as a result of data volume, variety, velocity, and veracity. Additional problems, also labeled with V’s, are cited, but these four are the most problematic and are the focus of this chapter (Li et al., 2016; Panimalar et al., 2017). Sources include satellites, aircraft and drone platforms, vehicles, geosocial networking services, mobile devices, and cameras. The problems in processing these data to extract useful information include query, analysis, and visualization. Data mining techniques and machine learning algorithms, such as deep convolutional neural networks, are often used with geospatial big data. The most obvious problem is handling the large data volumes, particularly for input and output operations, which requires parallel reading and writing of the data as well as high-speed computers, disk services, and network transfer speeds. A further problem is the variety and heterogeneity of the data, which require advanced algorithms to handle different data types and characteristics and to integrate with other data. The velocity at which the data are acquired is also a challenge, especially with today’s advanced sensors and the Internet of Things, which includes millions of devices creating data on short temporal scales of microseconds to minutes. Finally, the veracity, or truthfulness, of large spatial databases is difficult to establish and validate, particularly for all data elements in the database.
