2015-05-04

Exploring Spatial Autocorrelation: Moran's I and Geary's Ratio

Autocorrelation statistics are commonly used for time series data, but evaluating spatial autocorrelation takes a bit more digging. Spatial autocorrelation can be a starting point for any analysis of spatial data: it gives a first impression of whether places that are close to each other are similar with regard to a variable of interest.

This piece shows how to use the Spatial Data Analysis Add-In (version 0.92) for JMP to calculate the two most commonly used measures of spatial autocorrelation: Moran's I and Geary's Ratio.




As an example, we will try to figure out whether ozone measurements at 32 different locations in Los Angeles are spatially correlated.
Ozone Data Set
To calculate Moran's I and Geary's Ratio we first need to create a weight matrix. This matrix contains values representing the spatial similarity between data points: the cell in the first row and second column represents how close data points one and two are. Here we simply calculate the Euclidean distances and use their inverse as the measure of similarity. The similarity of a data point with itself is typically set to zero, so the diagonal of the matrix is all zeros.

$$w_{i,j} = \begin{cases} \frac{1}{\sqrt{(x_i-x_j)^2 + (y_i-y_j)^2}} & \text{if } i \neq j \\ 0 & \text{if } i = j \end{cases}$$

Last but not least, we typically row-standardize the matrix, meaning that each value is divided by the sum of all values in its row. This ensures that every row sums to 1.
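The construction described above can be sketched outside of JMP as well. The following Python snippet is only an illustration using NumPy, not the add-in's implementation; the function name `inverse_distance_weights` and the three example coordinates are made up for this sketch. It assumes that no two data points share the same location, since that would mean dividing by a distance of zero.

```python
import numpy as np

def inverse_distance_weights(coords):
    """Row-standardized inverse-distance weights for an (n, 2) array of x/y coordinates."""
    coords = np.asarray(coords, dtype=float)
    # Pairwise Euclidean distances via broadcasting: result has shape (n, n).
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # Inverse distance off the diagonal, zero on the diagonal (i == j).
    w = np.zeros_like(dist)
    off_diag = ~np.eye(len(coords), dtype=bool)
    w[off_diag] = 1.0 / dist[off_diag]
    # Row standardization: divide each value by the sum of its row.
    return w / w.sum(axis=1, keepdims=True)

# Hypothetical coordinates, not the ozone data from this post.
pts = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0)]
W = inverse_distance_weights(pts)
print(W)
```

For the first point, the neighbors are at distances 1 and 2, so the raw weights 1 and 0.5 become 2/3 and 1/3 after row standardization.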

To ease this process I added a new feature to the add-in. Go to Add-Ins => Spatial Data Analysis => Euclidean Distance Matrix. This opens the following prompt:

By default it just creates a distance matrix for all rows in the data table, using Euclidean distances. Selecting the checkbox Similarity Weights makes it calculate the weights as described above instead. The result is a data table in JMP like the following:

Distance Matrix for Ozone Data
Of course this is not the only way to measure the spatial closeness of data points. A future blog post might address this topic and introduce additional features for generating different kinds of weight matrices with the JMP Add-In.

Moran's I and Geary's Ratio

Now that we have the similarities we might start to think about the coefficients of spatial autocorrelation. Without going into the details, here is their interpretation:

Moran's I
  • Moran's I is between -1 and 1 (as long as your weight matrix is row-standardized). 
  • Values close to 0 indicate no spatial autocorrelation*
  • Values close to 1 indicate strong positive spatial autocorrelation, i.e. regions close to each other behave similarly in terms of the variable of interest.
  • Values close to -1 indicate strong negative spatial autocorrelation.

* Strictly speaking, the expected value of Moran's I when there is no spatial autocorrelation is not exactly 0 but $E(I) = \frac{-1}{N-1}$, where $N$ is the number of locations. This is typically close to 0.
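For readers who want to see the details after all, here is a minimal sketch of the usual textbook definition of Moran's I. It is not taken from the add-in. Writing $v_i$ for the variable of interest at location $i$ (my notation, since $x_i$ and $y_i$ already denote coordinates above), the statistic is

$$I = \frac{N}{\sum_{i,j} w_{i,j}} \cdot \frac{\sum_{i,j} w_{i,j}(v_i - \bar{v})(v_j - \bar{v})}{\sum_i (v_i - \bar{v})^2}$$

The function names and the small four-point chain below are hypothetical and only serve as an illustration.

```python
import numpy as np

def morans_i(values, w):
    """Moran's I for a 1-D array of values and an (n, n) weight matrix."""
    v = np.asarray(values, dtype=float)
    w = np.asarray(w, dtype=float)
    dev = v - v.mean()                  # deviations from the mean
    cross = dev @ w @ dev               # sum_ij w_ij * dev_i * dev_j
    return (len(v) / w.sum()) * cross / (dev @ dev)

def expected_morans_i(n):
    """Expected value of Moran's I without spatial autocorrelation."""
    return -1.0 / (n - 1)

# Hypothetical example: four locations on a line, each weighted to its
# direct neighbors only, rows already standardized to sum to 1.
W_chain = np.array([
    [0.0, 1.0, 0.0, 0.0],
    [0.5, 0.0, 0.5, 0.0],
    [0.0, 0.5, 0.0, 0.5],
    [0.0, 0.0, 1.0, 0.0],
])

print(morans_i([1, 2, 3, 4], W_chain))   # smooth trend: positive value
print(morans_i([1, 4, 1, 4], W_chain))   # alternating values: negative value
print(expected_morans_i(32))             # roughly -0.0323 for 32 locations
```

The smooth trend gives 0.4 and the alternating pattern gives -1, which matches the interpretation in the list above.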

Geary's Ratio
  • Geary's ratio is between 0 and 2
  • Values close to 1 indicate no spatial autocorrelation
  • Values close to 0 indicate strong positive autocorrelation.
  • Values close to 2 indicate strong negative spatial autocorrelation.
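As a rough illustration of where these values come from, here is a sketch of the usual textbook definition of Geary's Ratio. It is not the add-in's code. With $v_i$ denoting the variable of interest at location $i$ (my notation) and $w_{i,j}$ the weights from above:

$$C = \frac{(N-1)\sum_{i,j} w_{i,j}(v_i - v_j)^2}{2\,\sum_{i,j} w_{i,j}\,\sum_i (v_i - \bar{v})^2}$$

The function name `gearys_c` and the four-point chain are hypothetical.

```python
import numpy as np

def gearys_c(values, w):
    """Geary's Ratio for a 1-D array of values and an (n, n) weight matrix."""
    v = np.asarray(values, dtype=float)
    w = np.asarray(w, dtype=float)
    # Squared differences between every pair of locations: shape (n, n).
    sq_diff = (v[:, None] - v[None, :]) ** 2
    dev = v - v.mean()
    return (len(v) - 1) * (w * sq_diff).sum() / (2.0 * w.sum() * (dev @ dev))

# Hypothetical example: four locations on a line, each weighted to its
# direct neighbors only, rows standardized to sum to 1.
W_chain = np.array([
    [0.0, 1.0, 0.0, 0.0],
    [0.5, 0.0, 0.5, 0.0],
    [0.0, 0.5, 0.0, 0.5],
    [0.0, 0.0, 1.0, 0.0],
])

print(gearys_c([1, 2, 3, 4], W_chain))   # smooth trend: below 1
print(gearys_c([1, 4, 1, 4], W_chain))   # alternating values: above 1
```

Because Geary's Ratio is built on differences between neighboring values, similar neighbors push it towards 0 and dissimilar neighbors push it above 1.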
To get those numbers for the given data go to Add-Ins => Spatial Data Analysis => Spatial Autocorrelation.

The dialog asks for:
  • The variable of interest. Here these are the ozone measurements stored in column Av8top.
  • Which measures to calculate: Moran's I and/or Geary's Ratio.
  • The previously calculated weight matrix as a JMP data table.

The report shows that the ozone measurements are somewhat positively correlated. At roughly 0.23, Moran's I is (significantly) larger than -0.0323, the value expected if there were no spatial autocorrelation (that is, -1/31 for the 32 locations).

With a value of 0.77, Geary's Ratio is (significantly) smaller than 1. This also indicates positive spatial autocorrelation.
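This post does not describe how the add-in decides whether a value is significant. One common way to judge significance, shown here purely as an assumption-laden sketch and not as the add-in's method, is a permutation test: shuffle the measurements across the locations many times and check how often the shuffled data yields a Moran's I at least as large as the observed one. All names and the ten-point example are hypothetical.

```python
import numpy as np

def morans_i(values, w):
    """Moran's I for a 1-D array of values and an (n, n) weight matrix."""
    dev = np.asarray(values, dtype=float)
    dev = dev - dev.mean()
    w = np.asarray(w, dtype=float)
    return (len(dev) / w.sum()) * (dev @ w @ dev) / (dev @ dev)

def permutation_p_value(values, w, n_perm=999, seed=0):
    """One-sided pseudo p-value for positive spatial autocorrelation."""
    rng = np.random.default_rng(seed)
    values = np.asarray(values, dtype=float)
    observed = morans_i(values, w)
    # Shuffling the values across locations destroys any spatial structure.
    simulated = np.array(
        [morans_i(rng.permutation(values), w) for _ in range(n_perm)]
    )
    # Share of shuffles at least as extreme as the observed statistic,
    # counting the observed arrangement itself once.
    p = (1 + np.sum(simulated >= observed)) / (n_perm + 1)
    return observed, p

# Hypothetical example: ten locations on a line with neighbor weights,
# row-standardized, and values that increase steadily along the line.
n = 10
W = np.zeros((n, n))
for i in range(n - 1):
    W[i, i + 1] = W[i + 1, i] = 1.0
W /= W.sum(axis=1, keepdims=True)

obs, p = permutation_p_value(np.arange(n), W)
print(obs, p)
```

A small pseudo p-value means that random arrangements of the same values rarely look as clustered as the observed data. The same idea works for Geary's Ratio by counting shuffles with a value at most as large as the observed one.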

Don't be confused by the three color maps in the bottom part of the dialog. They do not depend on your data. Their only purpose is to help interpret Moran's I and Geary's Ratio by showing what spatially correlated data looks like.
  • Positive Spatial Correlation (left hand side): The data is clustered in terms of the variable of interest.
  • No Spatial Correlation (middle): There is no spatial structure in the data.
  • Negative Spatial Correlation (right hand side): Regions that are close to each other tend to show different values for the variable of interest.

