hot spot analysis

AM-97 - An Introduction to Spatial Data Mining

The goal of spatial data mining is to discover potentially useful, interesting, and non-trivial patterns from spatial datasets (e.g., GPS trajectories of smartphones). Spatial data mining is societally important, with applications in public health, public safety, climate science, and other fields. For example, in epidemiology, spatial data mining helps to find areas with a high concentration of disease incidents to manage disease outbreaks. Computational methods are needed to discover spatial patterns since the volume and velocity of spatial data exceed the ability of human experts to analyze it. Spatial data has unique characteristics, such as spatial autocorrelation and spatial heterogeneity, which violate the i.i.d. (independent and identically distributed) assumption of traditional statistical and data mining methods. Therefore, traditional methods may miss patterns or yield spurious patterns, which are costly in societal applications. Further, there are additional challenges such as the MAUP (Modifiable Areal Unit Problem), as illustrated by a recent court case debating gerrymandering in elections. In this article, we discuss tools and computational methods of spatial data mining, focusing on the primary spatial pattern families: hotspot detection, collocation detection, spatial prediction, and spatial outlier detection. Hotspot detection methods use domain information to accurately model more active and high-density areas. Collocation detection methods find object types whose instances are frequently located in proximity to each other. Spatial prediction approaches explicitly model the neighborhood relationship of locations to predict target variables from input features. Spatial outlier detection methods find data that differ from their neighbors. We close by describing future research and trends in spatial data mining.
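As an illustration of the last pattern family, the idea of "data that differ from their neighbors" can be sketched with a simple neighborhood-difference test. This is only a minimal sketch under stated assumptions, not the method of the article: the function name `spatial_outliers`, the chain-shaped neighbor lists, the toy values, and the threshold of 2.0 are all hypothetical choices made for the example.

```python
import numpy as np

def spatial_outliers(values, neighbors, threshold=2.0):
    """Flag locations whose value differs strongly from the mean of their neighbors.

    values    : 1-D sequence of attribute values, one per location
    neighbors : list where neighbors[i] holds the indices of location i's neighbors
    threshold : assumed cut-off on the standardized difference (hypothetical default)
    """
    x = np.asarray(values, dtype=float)
    # Mean attribute value in each location's neighborhood.
    neigh_mean = np.array([x[list(nb)].mean() for nb in neighbors])
    # Difference between each location and its neighborhood, then standardized.
    diff = x - neigh_mean
    z = (diff - diff.mean()) / diff.std()
    return np.flatnonzero(np.abs(z) > threshold), z

# Toy data (made up): ten locations along a line with one spike at index 5.
values = [1, 2, 3, 4, 5, 20, 7, 8, 9, 10]
n = len(values)
# Each location's neighbors are the adjacent locations on the line.
neighbors = [[j for j in (i - 1, i + 1) if 0 <= j < n] for i in range(n)]

flagged, z_scores = spatial_outliers(values, neighbors)
print("flagged locations:", flagged.tolist())
```

The comparison is made against neighboring values rather than against the whole dataset, which is what distinguishes a spatial outlier from an ordinary global outlier.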

AM-23 - Local Measures of Spatial Association

Local measures of spatial association are statistics used to detect variations of a variable of interest across space when the spatial relationship of the variable is not constant across the study region, a condition known as spatial non-stationarity or spatial heterogeneity. Unlike global measures, which summarize the overall spatial autocorrelation of the study area in a single value, local measures of spatial association identify local clusters (nearby observations have similar attribute values) or spatial outliers (nearby observations have different attribute values). Like global measures, local indicators of spatial association (LISA), including local Moran’s I and local Geary’s C, incorporate both spatial proximity and attribute similarity. Getis-Ord Gi*, another popular local statistic, identifies spatial clusters at various significance levels, known as hot spots (unusually high values) and cold spots (unusually low values). This so-called “hot spot analysis” has been extended to examine spatiotemporal trends in data. Bivariate local Moran’s I describes the statistical relationship between one variable at a location and a spatially lagged second variable at neighboring locations, and geographically weighted regression (GWR) allows regression coefficients to vary at each observation location. Visualization of local measures of spatial association is critical, allowing researchers from various disciplines to easily identify local pockets of interest for future examination.
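To make the hot spot idea concrete, the following is a minimal sketch of the Getis-Ord Gi* statistic in its commonly cited standardized (z-score) form, in which each location is included in its own neighborhood. The function names `getis_ord_gi_star` and `rook_weights_with_self`, the binary rook-contiguity weights, and the 5 x 5 toy grid are assumptions made for illustration and do not come from the entry above.

```python
import numpy as np

def getis_ord_gi_star(values, weights):
    """Standardized Gi* for each location.

    values  : 1-D sequence of attribute values x_j
    weights : (n, n) spatial weights matrix w_ij with a non-zero diagonal,
              so that each location counts as its own neighbor
    """
    x = np.asarray(values, dtype=float)
    w = np.asarray(weights, dtype=float)
    n = x.size
    mean = x.mean()
    s = np.sqrt((x ** 2).mean() - mean ** 2)   # population standard deviation
    w_sum = w.sum(axis=1)                      # sum_j w_ij
    w_sq_sum = (w ** 2).sum(axis=1)            # sum_j w_ij^2
    numerator = w @ x - mean * w_sum
    denominator = s * np.sqrt((n * w_sq_sum - w_sum ** 2) / (n - 1))
    return numerator / denominator

def rook_weights_with_self(rows, cols):
    """Binary rook-contiguity weights on a regular grid, including self (assumed scheme)."""
    n = rows * cols
    w = np.zeros((n, n))
    for r in range(rows):
        for c in range(cols):
            i = r * cols + c
            w[i, i] = 1.0
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    w[i, rr * cols + cc] = 1.0
    return w

# Toy grid (made up): background value 1 with a block of high values in one corner.
grid = np.ones((5, 5))
grid[0:2, 0:2] = 10.0

z = getis_ord_gi_star(grid.ravel(), rook_weights_with_self(5, 5))
hot = np.flatnonzero(z > 1.96)   # candidate hot spots at roughly the 95% level
print("candidate hot spot cells:", hot.tolist())
```

Large positive values suggest a cluster of high values around a location and large negative values a cluster of low values. In practice the resulting z-scores are usually adjusted for multiple testing before being mapped, a step this sketch omits.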
