Intro to GRASS workshop

From CUOSG
Jump to: navigation, search

Introduction to GRASS GIS

This document was first written to accompany a workshop to be given at Carleton University on 17 October, 2011, in the Department of Geography and Environmental Studies' main GIS lab (Loeb A200). Anyone is welcome to adapt it for other purposes, or to help improve and keep this tutorial up to date (see CC license conditions at bottom left of page).

The expected audience for this tutorial consists of people that are new to GRASS GIS, but not new to GIS in general - i.e. we do not explain basic GIS concepts, and when the software "behaves" very similarly to other common GIS software, this tutorial does not provide much in the way of detailed instructions (e.g. choosing map symbology).

In the following sections, text in bold indicates specific practical steps you should perform on your computer to follow along, whereas the rest of the text offers extra explanation or pointers to other resources.

Getting Started

If GRASS is not already installed on a machine for you, getting the software is probably your first step. This workshop will use the OSGEO Live project's bootable test environment, version 5.0, to showcase up to date versions of all of the software without requiring any changes to the machine it is running on. The Live DVD should already be booted for you when you arrive at the workshop, and you are welcome to take the disc with you when you leave. The software running when you boot the disc is based on an Ubuntu Linux distribution. The disc also contains installers for GRASS (and other software) for Windows and Mac OS environments.

Other ways to get GRASS:

In the OSGEO Live environment, launch GRASS by going to the Geospatial menu icon near the top left of the screen, then choose the Desktop GIS sub-menu, then choose GRASS GIS. This will launch a shell script terminal and a graphical interface to choose your GRASS database, location and mapset. These concepts tend to confuse new users, but once you understand the logic used here, file maintenance is taken care of for you to a large degree.

The GRASS project has a good introduction to this opening screen on its website, so to avoid reinventing the wheel, please click on this link to bring up the relevant web page. We'll go through this material step by step in the workshop. If you are using this tutorial outside of the interactive workshop, you should be able to piece together the material by reading the full web page online.

Although there are sample databases provided on the DVD, I think most people wanting to try out GRASS will soon want to get their own data into a GRASS database, so today we'll start with making a fresh database location from scratch. Confirm that the GIS Data Directory is set to /home/user/grassdata, then click on the Location wizard button in the Manage section of the window (identified by number 4 on the GRASS introductory web site).

In the resulting window, fill in Ottawa as the location, and something like City of Ottawa for the Location Title. Click Next.

In the next window, choose "Read projection and datum terms from a georeferenced data file" and click Next.

For the first steps of this tutorial, we are using data distributed by the City of Ottawa, under a license that allows free redistribution and use under the Terms of Use linked here. These files are already retrieved and made easily available for you in the lab (details provided in the workshop), but if you are performing this tutorial on your own, an archive of the data we are using is available here: GRASS workshop data archive.

In the window asking you to Select georeferenced file, press the Browse button and navigate to the directory where the data is stored (instructions provided in the lab if you are present for the workshop; wherever you extracted the data on your own machine if you are doing this independently). Choose the roadways.shp file. Click Next, then if everything looks right in the Summary screen, click Finish. You will be asked if you want to get the default region extents and resolution - answer Yes.

In the resulting screen, a default extent has been taken from the maximum extents of the roadways data. The resolution parameters do not apply to vector data, however, therefore they are filled with nonsense values. To make things easier later on when we introduce raster data, set both the N-S and E-W resolution to 30m, then press the Set region button.

GRASS uses the concept of a "Window" on your data to define the spatial extent and, for raster processing, the spatial resolution at which all processing is done. This window can be changed at any time, and any data layer can have a different spatial extent and resolution. The Window defines how input layers will be sampled (rasters are resampled at the current window's resolution when data layers have different resolutions). We have just set the default Window for this dataset, which provides an easy way to reset to starting conditions while we're doing GIS processing.

Now you'll be back at the initial GRASS welcome screen, and the new Ottawa location should show up and be selected in the left-most column, with the PERMANENT mapset selected under Accessible mapsets. All locations have at least one mapset, called PERMANENT. Other mapsets can be created to separate out maps belonging to different sub-projects, or to separate maps being created by multiple users sharing the same database. For now, we'll stick with the PERMANENT mapset.

Click the Start GRASS button. The terminal window that was already launched behind the welcome screen will display some introductory text and a GRASS shell prompt (for interactive command-line use), and the GRASS GUI will launch (on these and most modern installations of GRASS, this means the wxPython-based mapping interface).

Importing Vector Data

At this point, we have created an empty database - although you pointed at a shapefile to define its spatial attributes, there aren't any data loaded yet. We will import some shapefile from the City of Ottawa GIS data using the OGR-based vector import tool, v.in.ogr, which is accessed in the GUI by selecting the File menu in the GRASS GIS Layer Manager window, then choosing Import vector data | Common import formats.

In the resulting window, in the Source name section, click on the Browse button, navigate to where your City of Ottawa shapefiles are stored, select roadways.shp, and press OK. In the Import vector data window, press the Import button at the bottom. If all goes well, the shapefile will be imported into the GRASS database, and will show up as a layer in the mapping windows.

Now we'll import the rest of the City's vector data, but this time we'll use a shortcut. Bring up the Import vector data tool again (File|Import vector data|Common import formats), but this time under Source type, choose Directory. Now in the next section, navigate to the DIRECTORY where your files are stored (select the parent directory, then press Open). When you return to the Import vector data window, all of the OGR-readable data that it found in that directory should show up (in this case, all the shapefiles). Deselect the roadways layer (because we've already imported it), then press the Import button at the bottom of the window.

The data in those shapefiles are now all stored (separately) in the GRASS database, in GRASS's native data format. The attribute data is, by default, stored in a DBF-format file, just as is used with shapefiles. GRASS can optionally connect to external database servers to hold attribute data, and (with compatible databases) spatial data as well.

Controlling the map display

As you'll see in your map window, GRASS does not have very exciting default colours when you add layers to your map - everything is black and white or grey. You can adjust the symbology in the GRASS GIS Layer Manager by choosing the Map Layers window tab at the bottom of the window, selecting the layer you want to adjust with your mouse, and either right clicking on the layer, or clicking on the grey button at right edge of the layer listing, and choosing Properties. In the resulting dialogue box, use the Colors, Lines, Symbols and Labels tabs as necessary to get the kind of symbology you think is appropriate for each layer.

Some interesting characteristics of the map layer properties include the ability to limit the display of objects using SQL queries, and the ability to drive symbology using attribute columns for things like symbol sizes and colour schemes. Unfortunately, the City data we're using doesn't have that much in the way of interesting attributes to get very fancy in our map today.

Back in the Layer Manager window, note that right clicking on the layer while it is selected also lets one:

  • set opacity levels for each layer,
  • zoom the map to the extent of the selected layer,
  • change the current GRASS computational region (the "window" discussed above) to match the currently selected layer,
  • remove or rename the layer,
  • turn on editing mode, * rebuild topology for they layer, or
  • bring up the attribute data in a new window.

Importing raster data

Now we'll import some raster data, calling on the power of GDAL software to translate from a wide variety of raster formats.

In the File menu, choose Import raster data, then Common import formats. In the resulting window, change the Format to PCIDSK, then Browse to the file L5016028_20110610.pix. This is a LANDSAT 5 TM scene, from path 16 row 28, taken on 10 June, 2011. It includes most of the City of Ottawa, plus a large region to the north.

Note that in order to speed things up, I have done some preprocessing of the scene, and cropped it somewhat. As discussed in the introductory material about GRASS databases, all files in a given location in the GRASS database must share a common projection and coordinate system. The city data we've used so far and the Landsat image I've downloaded from the USGS archive both are in UTM coordinates, but the datum differs between the two. To fix this, I had to create a separate GRASS location to hold the satellite data at first, then I imported it from GeoTIFF using the same file import tools we've just used above, then I switched back to this GRASS location, and used r.proj (Raster|Develop raster map|Reproject raster map in the menus) to reproject the image files from the other location to this one, specifying a 30m resolution for the reprojection. I then exported all of the bands out to a single PCI .pix format file to speed up our import process.

When the file import is finished, you will get an error message if the "Add layer to display" checkbox was turned on, because the software doesn't realize that multiple bands of imagery were just imported, not a single layer. If a layer called L5016028_20110610 has been added to your Map Layers list, remove it now by selecting it, right clicking on it, and choosing Remove.

Click on the "Add various raster map layers" button on the Layer Manager toolbar (6th button in from the left, looks like a dark/light blue checkerboard pattern with a ... well, I'm not sure what that icon is supposed to be, but it's the one that ISN'T a green and white + sign... I'll try to find time to paste a screen shot in here really soon now! ;) Choose the Add RGB map layer option. Fill in the red, green and blue layers by clicking on the downward facing triangle button to the right of each field, and choosing the following layers for red, green and blue respectively: L5016028_20110610.3, L5016028_20110610.2, and L5016028_20110610.palette (I'm not sure why band one gets labelled .pallette instead of .1, but it does appear to be the blue band, so for now I'm ignoring this). Click OK, and a visible composite should appear in your map window. Feel free to zoom out and explore the image, or examine the other bands.

Analysis example: classification

To demonstrate image processing capabilities of GRASS, we'll quickly perform an unsupervised classification of this image. We'll first do cluster detection on a subset of the bands, then use maximum likelihood classification to assign each pixel a land cover class based on the detected spectral clusters.

Perform the following steps:

  • in the Map layers pane of the GRASS GIS Layer Manager window, right click on the satellite image layer, and choose Set computational region from selected map(s) (this sets the GRASS computational window on the full scene)
  • Under the Imagery menu, choose Develop images and groups|Create/edit group
  • On the required panel, type TM20110610 to name this group (or choose something different, but keep track of what you use)
  • Click on the Optional tab, and fill in Classify as the name of the imagery sub-group
  • Click multiple times on the downward facing triangle to the right of the "Name of raster map(s) to include in group" to choose, in turn each of the layers that end in .palette, .2, .3, .4, and .5 (we are adding the visible and near infrared bands to a "sub-group" to be analyzed for classification)
  • Click the Run button, then, when it has finished running, the Close button
  • Under the Imagery menu, choose Classify image|Clustering input for unsupervised classification
  • Choose TM20110610 as the imagery group, Classify as the sub-group, enter Mysigfile as the output signature file, and use an initial number of classes of 5 (as always, you can learn more about these options by clicking on the Manual tab)
  • Click Run, and once it successfully completes the task, click on Close

You have now completed the cluster analysis, and we're ready to classify the image using the detected clusters:

  • Under the Imagery menu, choose Classify image|Maximum likelihood classification (MLC)
  • for the imagery group, choose TM20110610, for the subgroup choose Classify, the file with signatures is Mysigfile, and enter Classes for the raster map to hold the classification results. Run the module and close the window when it's finished.

Examine the resulting map, and see what you think of the classification results. To be fair, we haven't exactly conducted a careful training procedure, and the image I downloaded has a fair bit of haze across the whole image, but at this number of classes it should have done a decent job of picking out at least a water class and some vegetation classes.

Analysis example: map algebra to calculate NDVI

NDVI (normalized difference vegetation index) calculation is a common task with Landsat (or other) imagery. To obtain a proper NDVI map, we would probably want to first apply atmospheric correction (and obtain an image that didn't have so much haze), but since we're in a hurry today and just demonstrating software capabilities, let's ignore rigour and proceed. The map calculator module in GRASS is an interface for the very powerful r.mapcalc program, which makes it easy to build a wide variety of computational models based on raster input layers. We'll use it here to input the formula for NDVI: NDVI = (NIR - Red) / (NIR + Red), where NIR is reflectance in the near infrared (Landsat TM band 4), and Red is red reflectance (Landsat TM band 3).

  • From the Raster menu in the Layer Manager, choose Raster map calculator
  • In the resulting window enter NDVI as the name of the new raster to create, then use the other buttons to create a formula of the form: float(L5016028_20110610.4 - L5016028_20110610.3) / float(L5016028_20110610.4 + L5016028_20110610.3) (if you use the dropdown menu to insert the layer names, it will also add "@PERMANENT" to indicate that these maps are in the PERMANENT mapset - that's fine, but is not needed so I've left them out here for simplicity)
  • Press the Run button to execute the formula

In the formula above, the float() function was used to force r.mapcalc to evaluate the addition and subtraction operations using floating point numbers; since all the input layers were integer maps, it would have otherwise defaulted to using integers for the output map, which leads to a pretty boring 3 category map because NDVI typically varies between 0-1 (with some exceptions).

Analysis example: vector query, and extracting raster values at vector locations

Now let's see what we can learn about the NDVI in Ottawa's parks. To run this analysis on the whole parks data file, however, takes too long for the purposes of tonight's workshop, so first we'll also subset the parks layer. Do the following steps:

  • In the layer manager window, choose Vector|Query with attributes
  • On the Required tab, for the input vector layer, choose x010_parks_all_parkland, and enter BigParks as the output vector map
  • Switch to the Selection tab, and clear all of the geometry type checkboxes except for area, then down in the WHERE condition field enter AREA_HA > 15 (this will query the existing map for all areas that have an attribute > 15 in the AREA_HA field)
  • Click Run, and then when it finishes, click Close.
  • In the layer manager, on the Map layers pane, select the x010parks_all_parkland layer, right click on it, and choose to set the computational region to the extent of this file (to avoid wasting time processing the entire region of the NDVI layer)
  • From the Vector menu, choose Update area attributes from raster In the resulting window, enter BigParks as the vector polygon map, NDVI as the raster map, and NDVI as the column prefix. Click Run (it may take a while to process - about 50 seconds on my test machine)
  • In the Map layers pane of the Layer Manager, select the BigParks layer, right click on it, and choose Show attribute data

If you scroll to the right in the attribute table, you will see several new fields have been added, describing the distribution of NDVI values that were found in each of the parks.

Analysis example: interpolation (and importing ASCII point data)

Now for a classic geography example: house prices! I've searched the locations of a bunch of recent real estate ads in central Ottawa, and placed them into prices.txt - unfortunately the file was not included on the files already downloaded to your machine, so you need to click here to get the data, and save the file into your data directory.

Import the prices data into your GRASS database using File|Import vector data|ASCII points/GRASS ASCII vector import; use prices as the new layer name, and in the Points tab, specify the third column as having z coordinate data.

Interpolation is actually a fairly complicated topic, and we're not doing it justice to just whiz through it here, but I'll plough ahead and give you a "teaser" look at the possibilities. We're going to try two different approaches, and you'll see that neither of them does very well at producing a cost surface of housing data with the input I've provided.

  • Change the computational region to match that of the prices layer, and then for even better performance, manually zoom into the core of the data in the centre of the city, and then use the Zoom Options button on the mapping window to set the computational region to match the display.
  • In the Layer Manager, choose Interpolate surfaces from the Raster menu, then choose Bilinear and bicubic from vector points.
  • In the v.surf.bspline window, under Required, chose the prices map. Under the Settings tab, set the layer number to 0. Under the Optional tab, set the output raster map to priceSurface. Run the module (it will take some time).
  • Now let's try something else, an IDW interpolation. From the Raster menu, choose Interpolate surfaces|IDW from vector points. Use prices as the input, priceSurfaceIDW as the output, and then under the Values tab, specify int_1 as the attribute table column with values to interpolate. Run the module and inspect the output.

I think you'll find that clearly we need more data to make a sensible map of housing prices, but this exercise at least introduced you to interpolations in GRASS. There are many more algorithms available, and the GEOM4008 class will be exploring this in detail in their next assignment.

Other resources

Important resources for more help on GRASS include: