Search Page

DC-19 - Ground Verification and Accuracy Assessment

Spatial products such as maps of land cover, soil type, wildfire, glaciers, and surface water have become increasingly available and used in science and policy decisions. These maps are not without error, and it is critical that a description of quality accompany each product. In the case of a thematic map, one aspect of quality is obtained by conducting a spatially explicit accuracy assessment in which the map class and reference class are compared on a per spatial unit basis (e.g., per 30 m x 30 m pixel). The outcome of an accuracy assessment is a description of quality of the end-product map, in contrast to conducting an evaluation of map quality as part of the map production process. The accuracy results can be used to decide if the map is of adequate quality for an intended application, as input to uncertainty analyses, and as information to improve future map products.
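
The per-unit comparison of map class and reference class described above can be sketched as a small error (confusion) matrix computation. This is a minimal illustration, not the entry's own procedure: the function names and the six-unit sample are hypothetical, and the unweighted formulas assume a simple random sample in which every unit has the same inclusion probability. A stratified design would generally need area-weighted estimators instead.

```python
from collections import Counter


def confusion_matrix(map_labels, ref_labels):
    """Count (map class, reference class) pairs over the sampled units."""
    if len(map_labels) != len(ref_labels):
        raise ValueError("map and reference samples must have equal length")
    return Counter(zip(map_labels, ref_labels))


def overall_accuracy(matrix):
    """Proportion of sampled units where map and reference classes agree."""
    total = sum(matrix.values())
    agree = sum(n for (m, r), n in matrix.items() if m == r)
    return agree / total


def users_accuracy(matrix, cls):
    """Share of units mapped as `cls` whose reference class is also `cls`."""
    mapped = sum(n for (m, _), n in matrix.items() if m == cls)
    return matrix[(cls, cls)] / mapped if mapped else float("nan")


def producers_accuracy(matrix, cls):
    """Share of units whose reference class is `cls` that the map labels `cls`."""
    ref = sum(n for (_, r), n in matrix.items() if r == cls)
    return matrix[(cls, cls)] / ref if ref else float("nan")


# Hypothetical sample of six spatial units (assumed simple random sample).
map_labels = ["forest", "forest", "water", "urban", "forest", "water"]
ref_labels = ["forest", "urban", "water", "urban", "forest", "forest"]
cm = confusion_matrix(map_labels, ref_labels)

print("overall:", overall_accuracy(cm))
print("user's (forest):", users_accuracy(cm, "forest"))
print("producer's (forest):", producers_accuracy(cm, "forest"))
```

User's accuracy relates to commission error and producer's accuracy to omission error, so reporting both per class says more about fitness for a given application than overall accuracy alone.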

DC-04 - Social Media Platforms

Social media is a group of interactive Web 2.0 Internet-based applications that allow users to create and exchange user-generated content via virtual communities. Social media platforms have a large user population who generate massive amounts of digital footprints, which are valuable data sources for observing and analyzing human activities/behavior. This entry focuses on social media platforms that provide spatial information in different forms for Geographic Information Systems and Technology (GIS&T) research. These social media platforms can be grouped into six categories: microblogging sites, social networking sites, content sharing sites, product and service review sites, collaborative knowledge sharing sites, and others. Four methods are available for capturing data from social media platforms, including Web Application Programming Interfaces (Web APIs), Web scraping, digital participant recruitment, and direct data purchasing. This entry first overviews the history, opportunities, and challenges related to social media platforms. Each category of social media platforms is then introduced in detail, including platform features, well-known platform examples, and data capturing processes.

AM-09 - Classification and Clustering

Classification and clustering are often confused with each other, or used interchangeably. Clustering and classification are distinguished by whether the number and type of classes are known beforehand (classification), or if they are learned from the data (clustering). The overarching goal of classification and clustering is to place observations into groups that share similar characteristics while maximizing the separation of the groups that are dissimilar to each other. Clusters are found in environmental and social applications, and classification is a common way of organizing information. Both are used in many areas of GIS including spatial cluster detection, remote sensing classification, cartography, and spatial analysis. Cartographic classification methods present a simplified way to examine some classification and clustering methods, and these will be explored in more depth with example applications.
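
As a rough illustration of cartographic classification, the sketch below computes class breaks with two widely used schemes, equal interval and quantile, for a fixed number of classes chosen beforehand. The scheme choice, function names, and sample values are assumptions made for this example; the entry may discuss other methods.

```python
from bisect import bisect_right


def equal_interval_breaks(values, k):
    """Return k-1 interior breaks that split the data range into equal widths."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    return [lo + width * i for i in range(1, k)]


def quantile_breaks(values, k):
    """Return k-1 interior breaks so each class holds roughly equal counts."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[(n * i) // k] for i in range(1, k)]


def assign_class(value, breaks):
    """Class index in 0..k-1; a value equal to a break goes to the upper class."""
    return bisect_right(breaks, value)


# Hypothetical, strongly skewed attribute values.
values = [1, 2, 3, 4, 10, 20, 30, 100]

ei = equal_interval_breaks(values, 4)
qb = quantile_breaks(values, 4)
print("equal interval:", [assign_class(v, ei) for v in values])
print("quantile:      ", [assign_class(v, qb) for v in values])
```

On skewed data like this, equal interval places most observations in the lowest class, while quantile spreads them evenly, which is one reason the choice of scheme changes how a map reads.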

AM-78 - Genetic Algorithms and Evolutionary Computing

Genetic algorithms (GAs) are a family of search methods that have been shown to be effective in finding optimal or near-optimal solutions to a wide range of optimization problems. A GA maintains a population of solutions to the problem being solved and uses crossover, mutation, and selection operations to iteratively modify them. As the population evolves across generations, better solutions are created and inferior ones are selectively discarded. GAs usually run for a fixed number of iterations (generations) or until no further improvement is obtained. This contribution discusses the fundamental principles of genetic algorithms and uses Python code to illustrate how GAs can be developed for both numerical and spatial optimization problems. Computational experiments are used to demonstrate the effectiveness of GAs and to illustrate some nuances in GA design.
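
The loop of selection, crossover, and mutation over a population can be sketched as follows. This is not the code from the entry: the toy objective (counting the 1s in a bit string), tournament selection, one-point crossover, elitism, and all parameter values are assumptions chosen to keep the example short.

```python
import random


def one_max(bits):
    """Toy fitness: number of 1s in the bit string (higher is better)."""
    return sum(bits)


def tournament(pop, fitness, rng, size=3):
    """Selection: pick the fittest of `size` randomly drawn individuals."""
    return max(rng.sample(pop, size), key=fitness)


def crossover(a, b, rng):
    """One-point crossover: head of parent a joined to tail of parent b."""
    point = rng.randrange(1, len(a))
    return a[:point] + b[point:]


def mutate(bits, rng, rate):
    """Flip each bit independently with probability `rate`."""
    return [1 - b if rng.random() < rate else b for b in bits]


def run_ga(fitness, n_bits=30, pop_size=40, generations=60,
           mutation_rate=0.02, seed=0):
    """Run for a fixed number of generations; return best solution and history."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        elite = max(pop, key=fitness)
        history.append(fitness(elite))
        children = [elite]  # elitism: the current best survives unchanged
        while len(children) < pop_size:
            child = crossover(tournament(pop, fitness, rng),
                              tournament(pop, fitness, rng), rng)
            children.append(mutate(child, rng, mutation_rate))
        pop = children
    best = max(pop, key=fitness)
    history.append(fitness(best))
    return best, history


best, history = run_ga(one_max)
print("best fitness per generation (first 10):", history[:10])
print("final best fitness:", one_max(best))
```

Because the elite individual is copied into each new generation, the best fitness never decreases. Replacing the fixed generation count with a check that `history` has stopped improving would give the second stopping rule mentioned above.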

AM-94 - Machine Learning Approaches

Machine learning approaches are increasingly used across numerous applications to learn from data, generate new knowledge, advance scientific studies, and support automated decision making. In this knowledge entry, the fundamentals of Machine Learning (ML) are introduced, focusing on how feature spaces, models, and algorithms are being developed and applied in geospatial studies. An example of an ML workflow for supervised/unsupervised learning is also introduced. The main challenges in ML approaches and our vision for future work are discussed at the end.

AM-08 - Kernels and Density Estimation

Kernel density estimation is an important nonparametric technique to estimate density from point-based or line-based data. It has been widely used for various purposes, such as point or line data smoothing, risk mapping, and hot spot detection. It applies a kernel function on each observation (point or line) and spreads the observation over the kernel window. The kernel density estimate at a location is the sum of the fractions of all observations at that location. In a GIS environment, kernel density estimation usually results in a density surface where each cell is rendered based on the kernel density estimated at the cell center. The result of kernel density estimation can vary substantially depending on the choice of kernel function or kernel bandwidth, with the latter having a greater impact. When applying a fixed kernel bandwidth over all of the observations, undersmoothing of density may occur in areas with only sparse observations while oversmoothing may be found in other areas. To solve this issue, adaptive or variable bandwidth approaches have been suggested.
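
A fixed-bandwidth estimate for point data can be sketched as below, with the density evaluated at each cell center to form a surface. The Gaussian kernel, the function names, and the grid parameters are assumptions for illustration; GIS packages often use other kernels (for example, quartic) and their own scaling conventions.

```python
import math


def gaussian_kernel(u):
    """Bivariate standard Gaussian kernel evaluated at scaled distance u."""
    return math.exp(-0.5 * u * u) / (2.0 * math.pi)


def kde_at(x, y, points, bandwidth):
    """Density at (x, y): sum of each point's kernel contribution, normalized."""
    total = 0.0
    for px, py in points:
        d = math.hypot(x - px, y - py)
        total += gaussian_kernel(d / bandwidth)
    return total / (len(points) * bandwidth ** 2)


def kde_surface(points, bandwidth, xmin, ymin, cell, ncols, nrows):
    """Raster of densities, each cell evaluated at its center."""
    return [
        [kde_at(xmin + (c + 0.5) * cell, ymin + (r + 0.5) * cell,
                points, bandwidth)
         for c in range(ncols)]
        for r in range(nrows)
    ]


# Hypothetical observations: a tight cluster plus one isolated point.
points = [(1.0, 1.0), (1.2, 0.9), (0.9, 1.1), (4.0, 4.0)]
surface = kde_surface(points, bandwidth=0.5, xmin=0.0, ymin=0.0,
                      cell=0.5, ncols=10, nrows=10)
print("near cluster:", kde_at(1.0, 1.0, points, 0.5))
print("near isolated point:", kde_at(4.0, 4.0, points, 0.5))
```

Rerunning with a larger `bandwidth` flattens the cluster peak and widens the footprint of the isolated point, which is the sensitivity to bandwidth described above. An adaptive approach would replace the single `bandwidth` with a per-observation value.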

AM-107 - Spatial Data Uncertainty

Although spatial data users may not be aware of the inherent uncertainty in all the datasets they use, it is critical to evaluate data quality in order to understand the validity and limitations of any conclusions based on spatial data. Spatial data uncertainty is inevitable as all representations of the real world are imperfect. This topic presents the importance of understanding spatial data uncertainty and discusses major methods and models to communicate, represent, and quantify positional and attribute uncertainty in spatial data, including both analytical and simulation approaches. Geo-semantic uncertainty that involves vague geographic concepts and classes is also addressed from the perspectives of fuzzy-set approaches and cognitive experiments. Potential methods that can be implemented to assess the quality of large volumes of crowd-sourced geographic data are also discussed. Finally, this topic ends with future directions to further research on spatial data quality and uncertainty.

AM-97 - An Introduction to Spatial Data Mining

The goal of spatial data mining is to discover potentially useful, interesting, and non-trivial patterns from spatial datasets (e.g., GPS trajectories of smartphones). Spatial data mining is societally important, with applications in public health, public safety, climate science, etc. For example, in epidemiology, spatial data mining helps to find areas with a high concentration of disease incidents to manage disease outbreaks. Computational methods are needed to discover spatial patterns since the volume and velocity of spatial data exceed the ability of human experts to analyze them. Spatial data have unique characteristics, such as spatial autocorrelation and spatial heterogeneity, which violate the i.i.d. (independent and identically distributed) assumption of traditional statistical and data mining methods. Therefore, using traditional methods may miss patterns or may yield spurious patterns, which are costly in societal applications. Further, there are additional challenges such as the MAUP (Modifiable Areal Unit Problem), as illustrated by a recent court case debating gerrymandering in elections. In this article, we discuss tools and computational methods of spatial data mining, focusing on the primary spatial pattern families: hotspot detection, collocation detection, spatial prediction, and spatial outlier detection. Hotspot detection methods use domain information to accurately model more active and high-density areas. Collocation detection methods find objects whose instances are in proximity to each other in a location. Spatial prediction approaches explicitly model the neighborhood relationship of locations to predict target variables from input features. Spatial outlier detection methods find data that differ from their neighbors. Finally, we describe future research and trends in spatial data mining.
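
The idea of a spatial outlier as an observation that differs from its neighbors can be sketched with a simple neighborhood-difference test. This is one basic approach chosen for illustration, not necessarily a method covered in the article: the k-nearest-neighbor definition of neighborhood, the z-score threshold, and the grid data are all assumptions.

```python
import math
import statistics


def knn_indices(points, i, k):
    """Indices of the k points nearest to points[i] (ties broken by index)."""
    dists = sorted(
        (math.dist(points[i], p), j) for j, p in enumerate(points) if j != i
    )
    return [j for _, j in dists[:k]]


def spatial_outliers(points, values, k=3, threshold=2.0):
    """Flag locations whose value departs strongly from their neighbors' mean."""
    diffs = []
    for i in range(len(points)):
        nbr_values = [values[j] for j in knn_indices(points, i, k)]
        diffs.append(values[i] - statistics.fmean(nbr_values))
    mu = statistics.fmean(diffs)
    sigma = statistics.pstdev(diffs)
    if sigma == 0:
        return []  # every location matches its neighborhood equally well
    return [i for i, d in enumerate(diffs) if abs(d - mu) / sigma > threshold]


# Hypothetical 4 x 4 grid of sites with one anomalous reading at (1, 1).
points = [(x, y) for y in range(4) for x in range(4)]
values = [10.0] * len(points)
values[5] = 50.0  # index 5 corresponds to location (1, 1)

print(spatial_outliers(points, values))
```

The comparison is local: a value is judged against nearby values rather than against the global distribution, which is what distinguishes a spatial outlier from an ordinary statistical outlier.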

DC-30 - Georeferencing and Georectification

Georeferencing is the recording of the absolute location of a data point or data points. Georectification refers to the removal of geometric distortions between sets of data points, most often the removal of terrain, platform, and sensor induced distortions from remote sensing imagery. Georeferencing is a requisite task for all spatial data, as spatial data cannot be positioned in space or evaluated with respect to other data without being assigned a spatial coordinate within a defined coordinate system. Many data are implicitly georeferenced (i.e., are labeled with spatial reference information), such as points collected from a global navigation satellite system (GNSS). Data that are not labeled with spatial reference information can be georeferenced using a number of approaches, the most commonly applied of which are described in this article. The majority of approaches employ known reference locations (i.e., Ground Control Points) drawn from a reliable source (e.g., GNSS, orthophotography) to calibrate georeferencing models. Regardless of georeferencing approach, positional error is present. The accuracy of georeferencing (i.e., amount of positional error) should be quantified, typically by the root mean squared error between ground control points from a reference source and the georeferenced data product.
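
The root mean squared error between reference ground control points and their positions in the georeferenced product can be computed as in the sketch below. The function name and the coordinates are hypothetical, and both point sets are assumed to be in the same projected coordinate system with the same linear units.

```python
import math


def georeferencing_rmse(reference_pts, georeferenced_pts):
    """RMSE of horizontal position over paired ground control points."""
    if len(reference_pts) != len(georeferenced_pts):
        raise ValueError("control point lists must be paired one-to-one")
    squared_errors = [
        (xr - xg) ** 2 + (yr - yg) ** 2
        for (xr, yr), (xg, yg) in zip(reference_pts, georeferenced_pts)
    ]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical control points in map units (e.g., meters).
reference = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]
georeferenced = [(3.0, 4.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]

print(georeferencing_rmse(reference, georeferenced))
```

Computing the RMSE on control points that were withheld from model calibration generally gives a more honest estimate of positional error than reusing the calibration points.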

AM-43 - Location and Service Area Problems

Many facilities exist to provide essential services in a city or region. The service area of a facility refers to a geographical area where the intended service of the facility can be received effectively. Service area delineation varies with the particular service a facility provides. This topic examines two types of service areas, one that can be defined based on a predetermined range such as travel distance/time and another based on the nearest facility available. Relevant location models are introduced to identify the best location(s) of one or multiple facilities to maximize service provision or minimize the system-wide cost. The delineation of service areas and structuring of a location model draw extensively on existing functions in a GIS. The topic represents an important area of GIS&T.
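
The two kinds of service area, along with a very simple location choice, can be sketched as follows. Straight-line (Euclidean) distance stands in for travel distance or time, and the function names, coordinates, and brute-force site selection are assumptions for illustration; an applied analysis would typically use network travel costs from a GIS.

```python
import math


def within_range(facility, demand_points, max_dist):
    """Range-based service area: demand points within max_dist of the facility."""
    return [i for i, p in enumerate(demand_points)
            if math.dist(facility, p) <= max_dist]


def nearest_facility(facilities, demand_points):
    """Nearest-facility service areas: index of the closest facility per point."""
    return [
        min(range(len(facilities)), key=lambda j: math.dist(facilities[j], p))
        for p in demand_points
    ]


def best_single_site(candidates, demand_points, max_dist):
    """Pick the candidate site that covers the most demand points in range."""
    return max(candidates,
               key=lambda c: len(within_range(c, demand_points, max_dist)))


# Hypothetical demand points and candidate facility sites.
demand = [(0, 0), (1, 0), (5, 5), (6, 5), (6, 6)]
facilities = [(0, 0), (6, 5)]

print("allocation:", nearest_facility(facilities, demand))
print("covered by (6, 5):", within_range((6, 5), demand, 1.0))
print("best single site:", best_single_site(facilities, demand, 1.0))
```

Choosing one site by exhaustive comparison works only for small candidate sets. Siting several facilities at once is a combinatorial problem usually handled with integer programming or heuristics.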