Geographic Data Structures

In examples provided in prior chapters, geodata was in the form of individual variables. Mapping Toolbox™ software also provides an easy means of displaying, extracting, and manipulating collections of vector map features organized in geographic data structures.

A geographic data structure is a MATLAB® structure array that has one element per geographic feature. Each feature is represented by coordinates and attributes. A geographic data structure that holds geographic coordinates (latitude and longitude) is called a geostruct, and one that holds map coordinates (projected x and y) is called a mapstruct. Geographic data structures hold only vector features and cannot be used to hold raster data (regular or geolocated data grids or images).

Shapefiles

Geographic data structures most frequently originate when vector geodata is imported from a shapefile. The Environmental Systems Research Institute designed the shapefile format for vector geodata. Shapefiles encode coordinates for points, multipoints, lines, or polygons, along with non-geometrical attributes.

A shapefile stores attributes and coordinates in separate files; it consists of a main file, an index file, and an xBASE file. All three files have the same base name and are distinguished by the extensions .shp, .shx, and .dbf, respectively. (For example, given the base name 'concord_roads', the shapefile file names would be 'concord_roads.shp', 'concord_roads.shx', and 'concord_roads.dbf'.)
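Before reading a shapefile, it can be useful to confirm that all three component files are present. The following is a minimal sketch in base MATLAB; it assumes the files are in the current folder or on the MATLAB path, and the variable names are illustrative only.

```matlab
% Build the three expected file names from a shapefile base name.
baseName  = 'concord_roads';
exts      = {'.shp', '.shx', '.dbf'};   % main, index, and xBASE files
fileNames = strcat(baseName, exts);     % 1-by-3 cell array of file names

% exist(...,'file') returns 2 for a file in the current folder or on the path.
isPresent = cellfun(@(f) exist(f, 'file') == 2, fileNames);

for k = 1:numel(fileNames)
    fprintf('%-20s found: %d\n', fileNames{k}, isPresent(k));
end
```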

The Contents of Geographic Data Structures

The shaperead function reads vector features and attributes from a shapefile and returns a geographic data structure array. The shaperead function determines the names of the attribute fields at run-time from the shapefile xBASE table or from optional, user-specified parameters. If a shapefile attribute name cannot be directly used as a field name, shaperead assigns the field an appropriately modified name, usually by substituting underscores for spaces.
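The exact renaming rules that shaperead applies are not spelled out here. As an illustration only, the following base MATLAB sketch mimics the most common case (spaces replaced by underscores) for a hypothetical attribute name and checks that the result is a legal structure field name.

```matlab
% Hypothetical attribute name containing a space (not from a real shapefile).
rawName = 'STREET NAME';

% Mimic the usual fix: substitute underscores for spaces.
fieldName = strrep(rawName, ' ', '_');

% A legal field name must also be a valid MATLAB identifier.
wasLegal = isvarname(rawName);      % false, because of the space
isLegal  = isvarname(fieldName);    % true after the substitution

% Dynamic field syntax accepts the modified name.
feature.(fieldName) = 'WRIGHT FARM';
disp(feature)
```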

Fields in a Geographic Data Structure

| Field Name | Data Type | Description | Comments |
|---|---|---|---|
| Geometry | character vector | One of the following shape types: 'Point', 'MultiPoint', 'Line', or 'Polygon'. | For a 'PolyLine', the value of the Geometry field is simply 'Line'. |
| BoundingBox | 2-by-2 numerical array | Specifies the minimum and maximum feature coordinate values in each dimension in the following form: [min(X) min(Y); max(X) max(Y)] | Omitted for shape type 'Point'. |
| X, Y, Lon, or Lat | 1-by-N array of class double | Coordinate vector. | |
| Attr | character vector or scalar number | Attribute name, type, and value. | Optional. There are usually multiple attributes. |
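To make the field layout concrete, here is a minimal sketch that builds a one-element mapstruct by hand in base MATLAB. The coordinates and the Name attribute are made up for illustration; BoundingBox is computed directly from the coordinate vectors in the 2-by-2 form described above.

```matlab
% Made-up projected coordinates for a single line feature.
x = [10 20 30];
y = [ 5 15 10];

road.Geometry    = 'Line';
road.BoundingBox = [min(x) min(y); max(x) max(y)];  % 2-by-2: [mins; maxes]
road.X           = x;                               % 1-by-N double
road.Y           = y;                               % 1-by-N double
road.Name        = 'Example Road';                  % hypothetical attribute

disp(road.BoundingBox)
```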

The shaperead function does not support any 3-D or "measured" shape types: 'PointZ', 'PointM', 'MultipointZ', 'MultipointM', 'PolyLineZ', 'PolyLineM', 'PolygonZ', 'PolygonM', or 'Multipatch'. Also, although 'Null Shape' features can be present in a 'Point', 'Multipoint', 'PolyLine', or 'Polygon' shapefile, they are ignored.

PolyLine and Polygon Shapes

In geographic data structures with Line or Polygon geometries, individual features can have multiple parts—disconnected line segments and polygon rings. The parts can include counterclockwise inner rings that outline "holes." For an illustration of this, see Displaying a Polygon. Each disconnected part is separated from the next by a NaN within the X and Y (or Lat and Lon) vectors. You can use the isShapeMultipart function to determine if a feature has NaN-separated parts.
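The NaN-separator convention can be inspected with base MATLAB alone. The sketch below uses made-up coordinates for a feature with two disconnected parts and locates where each part starts and ends; it is an illustration of the convention, not a replacement for isShapeMultipart.

```matlab
% Made-up coordinates: two rings separated by a NaN.
x = [0 0 1 1 0 NaN 3 3 4 4 3];
y = [0 1 1 0 0 NaN 3 4 4 3 3];

isVertex = double(~isnan(x));

% A part starts where a vertex follows a NaN (or begins the vector)
% and ends where a vertex precedes a NaN (or ends the vector).
partStarts = find(diff([0 isVertex]) ==  1);
partEnds   = find(diff([isVertex 0]) == -1);
numParts   = numel(partStarts);

fprintf('Feature has %d part(s).\n', numParts);
```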

Each multipoint or NaN-separated multipart line or polygon entity constitutes a single feature and thus has one character vector or scalar double value per attribute field. It is not possible to assign distinct attributes to the different parts of such a feature; any character vector or numeric attribute imported with (or subsequently added to) the geostruct or mapstruct applies to all the feature's parts in combination.

Mapstructs and Geostructs

By default, shaperead returns a mapstruct containing X and Y fields. This is appropriate if the data set coordinates are already projected (in a map coordinate system). Otherwise, if the data set coordinates are unprojected (in a geographic coordinate system), use the parameter-value pair 'UseGeoCoords',true to make shaperead return a geostruct having Lon and Lat fields.
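As a short sketch of the two cases, the following reads one sample shapefile each way. It assumes Mapping Toolbox is installed and that the sample files concord_roads.shp (projected coordinates) and landareas.shp (geographic coordinates) are on the MATLAB path.

```matlab
% Default call: returns a mapstruct with X and Y fields.
roads = shaperead('concord_roads.shp');

% 'UseGeoCoords',true: returns a geostruct with Lon and Lat fields.
land = shaperead('landareas.shp', 'UseGeoCoords', true);

disp(fieldnames(roads)')
disp(fieldnames(land)')
```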

Coordinate Types. If you do not know whether a shapefile uses geographic coordinates or map coordinates, here are some things you can try:

  • Ask your data provider.

  • Use shapeinfo to obtain the BoundingBox. By looking at the ranges of coordinates, you may be able to tell what kind of coordinates you have.

  • Examine the optional .prj file, if one has been provided. The .prj file is written in well-known text, a text mark-up language. If your .prj file contains the term PROJCS, you have map coordinates. If your .prj file contains the term GEOGCS, but not the term PROJCS, you have geographic coordinates.
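The .prj test in the last item can be sketched in base MATLAB. The well-known text string below is a hypothetical stand-in for a file's contents (the variable names are illustrative); in practice you would read the text from the .prj file itself.

```matlab
% Hypothetical .prj contents describing a geographic coordinate system.
% In practice, read the text with: wkt = fileread('yourfile.prj');
wkt = ['GEOGCS["GCS_WGS_1984",DATUM["D_WGS_1984",' ...
       'SPHEROID["WGS_1984",6378137.0,298.257223563]],' ...
       'PRIMEM["Greenwich",0.0],UNIT["Degree",0.0174532925199433]]'];

% PROJCS takes precedence, because a projected definition also
% contains a nested GEOGCS term.
if ~isempty(strfind(wkt, 'PROJCS'))
    coordType = 'map';          % projected x-y: read as a mapstruct
elseif ~isempty(strfind(wkt, 'GEOGCS'))
    coordType = 'geographic';   % latitude-longitude: use 'UseGeoCoords',true
else
    coordType = 'unknown';
end
disp(coordType)
```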

The geoshow function displays geographic features stored in geostructs, and the mapshow function displays geographic features stored in mapstructs. If you try to display a mapstruct with geoshow, the function issues a warning and calls mapshow. If you try to display a geostruct with mapshow, the function projects the coordinates with a Plate Carrée projection and issues a warning.

Examining a Geographic Data Structure

Here is an example of an unfiltered mapstruct returned by shaperead:

S = shaperead('concord_roads.shp')

The output appears as follows:

S = 
609x1 struct array with fields:
    Geometry
    BoundingBox
    X
    Y
    STREETNAME
    RT_NUMBER
    CLASS
    ADMIN_TYPE
    LENGTH

The shapefile contains 609 features. In addition to the Geometry, BoundingBox, and coordinate fields (X and Y), there are five attribute fields: STREETNAME, RT_NUMBER, CLASS, ADMIN_TYPE, and LENGTH.

Look at the 10th element:

S(10)

The output appears as follows:

ans = 
       Geometry: 'Line'
    BoundingBox: [2x2 double]
              X: [1x9 double]
              Y: [1x9 double]
     STREETNAME: 'WRIGHT FARM'
      RT_NUMBER: ''
          CLASS: 5
     ADMIN_TYPE: 0
         LENGTH: 79.0347

This mapstruct contains 'Line' features. The tenth line has nine vertices. The values of the first two attributes are character vectors. The second happens to be an empty character vector. The final three attributes are numeric. Across the elements of S, X and Y can have various lengths, but STREETNAME and RT_NUMBER must always contain character vectors, and CLASS, ADMIN_TYPE and LENGTH must always contain scalar doubles.
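Because every element carries the same fields with the same types, standard struct-array idioms can gather an attribute across all features. The sketch below assumes Mapping Toolbox and the concord_roads.shp sample file are available; the 200-unit length threshold is an arbitrary value chosen for illustration.

```matlab
S = shaperead('concord_roads.shp');

% Character attributes collect into a cell array, numeric ones into a vector.
streetNames = {S.STREETNAME};
roadLengths = [S.LENGTH];

% Coordinate vectors vary in length, so count vertices per feature.
numVertices = arrayfun(@(f) numel(f.X), S);

% Logical indexing selects a subset of features (arbitrary threshold).
isLong    = roadLengths > 200;
longRoads = S(isLong);
fprintf('%d of %d features exceed the threshold.\n', ...
    numel(longRoads), numel(S));
```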

In this example, shaperead returns an unfiltered mapstruct. If you want to filter out some attributes, see Select Shapefile Data to Read for more information.
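As a brief sketch of such filtering, the call below uses the 'Attributes' and 'Selector' parameters of shaperead to keep two attribute fields and only those features whose CLASS is at least 4. The threshold and the choice of fields are arbitrary illustrations, and the sample file is assumed to be on the path.

```matlab
% Keep only STREETNAME and CLASS, and only features with CLASS >= 4.
% The selector is a cell array: a function handle followed by the
% names of the attributes passed to it.
S = shaperead('concord_roads.shp', ...
    'Selector',   {@(roadClass) roadClass >= 4, 'CLASS'}, ...
    'Attributes', {'STREETNAME', 'CLASS'});

disp(fieldnames(S)')
fprintf('%d features selected.\n', numel(S));
```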

How to Construct Geographic Data Structures

Functions such as shaperead or gshhs return geographic data structures when importing vector geodata. However, you might want to create geostructs or mapstructs yourself in some circumstances. For example, you might import vector geodata that is not stored in a shapefile (for example, from a MAT-file, from a Microsoft® Excel® spreadsheet, or by reading in a delimited text file). You also might compute vector geodata and attributes by calling various MATLAB or Mapping Toolbox functions. In both cases, the coordinates and other data are typically vectors or matrices in the workspace. Packaging variables into a geostruct or mapstruct can make mapping and exporting them easier, because geographic data structures provide several advantages over coordinate arrays:

  • All associated geodata variables are packaged in one container, a structure array.

  • The structure is self-documenting through its field names.

  • You can vary map symbology for points, lines, and polygons according to their attribute values by constructing a symbolspec for displaying the geostruct or mapstruct.

  • A one-to-one correspondence exists between structure elements and geographic features, which extends to the children of hggroup objects constructed by mapshow and geoshow.
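The symbology point can be illustrated with a short sketch that colors the Concord roads by their CLASS attribute. The class ranges, colors, and line width are arbitrary choices for illustration, and the sample file is assumed to be on the path.

```matlab
roads = shaperead('concord_roads.shp');

% Each rule is {attribute, value or range, property, value, ...}.
% The 'Default' rule applies to features that match no other rule.
roadSpec = makesymbolspec('Line', ...
    {'CLASS', [1 3], 'Color', 'r', 'LineWidth', 2}, ...
    {'CLASS', [4 6], 'Color', 'k'}, ...
    {'Default', 'Color', 'b'});

figure
mapshow(roads, 'SymbolSpec', roadSpec);
```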

Achieving these benefits is not difficult. Use the following example as a guide to packaging vector geodata you import or create into geographic data structures.

Making Point and Line Geostructs

The following example first creates a point geostruct containing three cities on different continents and plots it with geoshow. Then it creates a line geostruct containing data for great circle navigational tracks connecting these cities. Finally, it plots these lines using a symbolspec.

  1. Begin with a small set of point data, approximate latitudes and longitudes for three cities on three continents:

    latparis =  48.87084; lonparis =   2.41306;   % Paris coords
    latsant  = -33.36907; lonsant  = -70.82851;   % Santiago
    latnyc   =  40.69746; lonnyc   = -73.93008;   % New York City

  2. Build a point geostruct; it needs to have the following required fields:

    • Geometry (in this case 'Point')

    • Lat (for points, this is a scalar double)

    • Lon (for points, this is a scalar double)

    % The first field by convention is Geometry (dimensionality).
    % As Geometry is the same for all elements, assign it with deal:
    [Cities(1:3).Geometry] = deal('Point');
    
    % Add the latitudes and longitudes to the geostruct:
    Cities(1).Lat = latparis; Cities(1).Lon = lonparis;
    Cities(2).Lat = latsant;  Cities(2).Lon = lonsant;
    Cities(3).Lat = latnyc;   Cities(3).Lon = lonnyc;
    
    % Add city names as Name fields. You can name optional fields
    % anything you like other than Geometry, Lat, Lon, X, or Y.
    Cities(1).Name = 'Paris';
    Cities(2).Name = 'Santiago';
    Cities(3).Name = 'New York';
    % Inspect your completed geostruct and its first member
    Cities    
    
    Cities = 
    1x3 struct array with fields:
        Geometry
        Lat
        Lon
        Name
    
    Cities(1)
    
    ans = 
        Geometry: 'Point'
             Lat: 48.8708
             Lon: 2.4131
            Name: 'Paris'

  3. Display the geostruct on a Mercator projection of the Earth's land masses stored in the landareas.shp shapefile, setting map limits to exclude polar regions:

    axesm('mercator','grid','on','MapLatLimit',[-75 75]); tightmap; 
    % Map the geostruct with the continent outlines
    geoshow('landareas.shp')
    
    % Map the City locations with filled circular markers
    geoshow(Cities,'Marker','o',...
        'MarkerFaceColor','c','MarkerEdgeColor','k');
    
    % Display the city names using data in the geostruct field Name.
    % Note that you must pass the names to textm as a cell array.
    textm([Cities(:).Lat],[Cities(:).Lon],...
        {Cities(:).Name},'FontWeight','bold');

  4. Next, build a Line geostruct to package great circle navigational tracks between the three cities:

    % Call the new geostruct Tracks and give it a line geometry:
    [Tracks(1:3).Geometry] = deal('Line');
    
    % Create a text field identifying kind of track each entry is.
    % Here they all will be great circles, identified as 'gc'
    % (character vector used by certain functions to signify great circle arcs)
    trackType = 'gc';
    [Tracks.Type] = deal(trackType);
    
    % Give each track an identifying name
    Tracks(1).Name = 'Paris-Santiago';
    [Tracks(1).Lat, Tracks(1).Lon] = ...
            track2(trackType,latparis,lonparis,latsant,lonsant);
    
    Tracks(2).Name = 'Santiago-New York';
    [Tracks(2).Lat, Tracks(2).Lon] = ...
            track2(trackType,latsant,lonsant,latnyc,lonnyc);
    
    Tracks(3).Name = 'New York-Paris';
    [Tracks(3).Lat, Tracks(3).Lon] = ...
            track2(trackType,latnyc,lonnyc,latparis,lonparis);

  5. Compute lengths of the great circle tracks:

    % The distance function computes distance and azimuth between
    % given points, in degrees. Store both in the geostruct.
    for j = 1:numel(Tracks)
        [dist, az] = ...
            distance(trackType,Tracks(j).Lat(1),...
                               Tracks(j).Lon(1),...
                               Tracks(j).Lat(end),...
                               Tracks(j).Lon(end));
        Tracks(j).Length = dist;
        Tracks(j).Azimuth = az;
    end
    % Inspect the first member of the completed geostruct
    Tracks(1)
    
    ans = 
        Geometry: 'Line'
            Type: 'gc'
            Name: 'Paris-Santiago'
             Lat: [100x1 double]
             Lon: [100x1 double]
          Length: 104.8274
         Azimuth: 235.8143

  6. Map the three tracks in the line geostruct:

    % On cylindrical projections like Mercator, great circle tracks
    % are curved except those that follow the Equator or a meridian.
    
    % Graphically differentiate the tracks by creating a symbolspec;
    % key line color to track length, using the 'winter' colormap.
    % Symbolspecs make it easy to vary color and linetype by
    % attribute values. You can also specify default symbologies.
    
    colorRange = makesymbolspec('Line',...
                {'Length',[min([Tracks.Length]) ...
                  max([Tracks.Length])],...
                 'Color',winter(3)});
    geoshow(Tracks,'SymbolSpec',colorRange);
    

    You can save the geostructs you just created as shapefiles by calling shapewrite with a file name of your choice, for example:

    shapewrite(Cities,'citylocs');
    shapewrite(Tracks,'citytracks');
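To check that an export round-trips, you can read the shapefile back in. The sketch below is self-contained: it rebuilds the three-city point geostruct in a single struct call (an alternative to the deal-based construction above), writes it to the temporary folder rather than the current folder (an assumption made here to avoid cluttering your working directory), and reads it back as a geostruct.

```matlab
% Rebuild the point geostruct; a cell array argument to struct
% creates one element per cell, and 'Point' is shared by all three.
Cities = struct('Geometry', 'Point', ...
    'Lat',  {48.87084, -33.36907, 40.69746}, ...
    'Lon',  {2.41306, -70.82851, -73.93008}, ...
    'Name', {'Paris', 'Santiago', 'New York'});

% Write citylocs.shp, .shx, and .dbf to the temporary folder.
outBase = fullfile(tempdir, 'citylocs');
shapewrite(Cities, outBase);

% Read the file back with geographic coordinates (Lat and Lon fields).
citiesBack = shaperead(outBase, 'UseGeoCoords', true);
disp({citiesBack.Name})
```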

Making Polygon Geostructs

Creating a geostruct or mapstruct for polygon data is similar to building one for point or line data. However, if your polygons include multiple, NaN-separated parts, recall that they can have only one value per attribute, not one value per part. Each attribute you place in a structure element for such a polygon pertains to all its parts. This means that if you define a group of islands, for example with a single NaN-separated list for each coordinate, all attributes for that element describe the islands as a group, not particular islands. If you want to associate attributes with a particular island, you must provide a distinct structure element for that island.
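The following base MATLAB sketch shows one way to give each island its own structure element. It uses made-up coordinates for two square "islands" stored as a single NaN-separated list, splits them at the NaN, and assigns a hypothetical Name attribute to each resulting element.

```matlab
% Two made-up islands in one NaN-separated list (clockwise outer rings).
lat = [0 1 1 0 0 NaN 5 6 6 5 5];
lon = [0 0 1 1 0 NaN 5 5 6 6 5];

% Positions just outside each part: 0, every NaN, and one past the end.
breaks = [0 find(isnan(lat)) numel(lat) + 1];

% One structure element per part, each with its own attribute value.
for k = 1:numel(breaks) - 1
    idx = breaks(k) + 1 : breaks(k + 1) - 1;
    Islands(k).Geometry = 'Polygon';
    Islands(k).Lat      = lat(idx);
    Islands(k).Lon      = lon(idx);
    Islands(k).Name     = sprintf('Island %d', k);   % hypothetical attribute
end
disp({Islands.Name})
```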

Be aware that the ordering of polygon vertices matters. When you map polygon data, the direction in which polygons are traversed has significance for how they are rendered by functions such as geoshow, mapshow, and mapview. Proper directionality is particularly important if polygons contain holes. The Mapping Toolbox convention encodes the coordinates of outer rings (e.g., continent and island outlines) in clockwise order; counterclockwise ordering is used for inner rings (e.g., lakes and inland seas). Within the coordinate array, each ring is separated from the one preceding it by a NaN.

When plotted by mapshow or geoshow, clockwise rings are filled. Counterclockwise rings are unfilled; any underlying symbology shows through such holes. To ensure that outer and inner rings are correctly coded according to the above convention, you can invoke the following functions:

  • ispolycw — True if vertices of polygonal contour are clockwise ordered

  • poly2cw — Convert polygonal contour to clockwise ordering

  • poly2ccw — Convert polygonal contour to counterclockwise ordering

  • poly2fv — Convert polygonal region to face-vertex form for use with patch in order to properly render polygons containing holes

Three of these functions check or change the ordering of vertices that define a polygon, and the fourth one converts polygons with holes to a completely different representation.
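As a minimal sketch of how these functions fit together, the example below builds a made-up square with a square hole, corrects the winding of the inner ring, and renders the result with patch. The coordinates and colors are illustrative only, and Mapping Toolbox is assumed to be installed.

```matlab
% Outer ring, already clockwise.
xOuter = [0 0 4 4 0];
yOuter = [0 4 4 0 0];

% Inner ring, deliberately entered clockwise (wrong for a hole).
xInner = [1 1 3 3 1];
yInner = [1 3 3 1 1];

outerIsCW = ispolycw(xOuter, yOuter);          % true
[xInner, yInner] = poly2ccw(xInner, yInner);   % make the hole counterclockwise
innerIsCW = ispolycw(xInner, yInner);          % false after conversion

% Join the rings with a NaN and convert to face-vertex form for patch.
x = [xOuter NaN xInner];
y = [yOuter NaN yInner];
[f, v] = poly2fv(x, y);

figure
patch('Faces', f, 'Vertices', v, 'FaceColor', 'c', 'EdgeColor', 'none');
axis equal
```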

For an example of working with polygon geostructs, see Converting Coastline Data (GSHHG) to Shapefile Format.

Mapping Toolbox Version 1 Display Structures

Before Version 2 introduced geostructs and mapstructs, the toolbox used a different data structure, the display structure, to encapsulate geodata imported from certain external formats for map display functions. Display structures accommodated both raster and vector map data and other kinds of objects, but they lacked the generality of current geostructs and mapstructs for representing vector features and are being phased out of the toolbox. However, you can convert display structures that contain vector geodata to geostruct form using updategeostruct. For more information about Version 1 display structures and their usage, see Version 1 Display Structures in the reference page for displaym. Additional information is located in reference pages for updategeostruct, extractm, and mlayers.