From PostGIS in Action, Third Edition by Leo S. Hsu and Regina O. Obe
In this article, you’ll learn what a topology is, how to build a topology from scratch, and how to use commonly available geometry data.
Take 40% off PostGIS in Action, Third Edition by entering fccobe into the discount code box at checkout at manning.com.
Topological representation recognizes that, in reality, geometric features rarely exist independently of each other. When you gaze down on large metropolises from a plane, you see a maze of streets outlining blocks, interlocked. With a simple geometry model, you could use linestrings to represent the streets and polygons to represent the blocks. But once you lay out the streets, you know right where your blocks will be. Having to create polygons for the blocks is an exercise in redundancy. Congratulations, you’ve discovered topology.
We’ll be working with one set of examples. It will be very simple, created without loading data. This set of examples will give you a feel for how topologies are created and organized.
We’ve packaged the ch13_staging schema as part of this chapter’s download (www.postgis.us/chapter_13_edition_3) and loaded in the tables cityboundary, neighbourhoods, and streetcentrelines. The ch13 schema houses the topogeometry tables for this chapter.
What topology is
The surface of the earth is finite. We have about 196.9 million square miles (510.1 million square kilometers) to play on, water included. Humans are territorial, so we’ve divided up all the land into countries, big and small. Excluding Antarctica and a few disputed zones, moving a country’s border involves at least two countries. The iron law of geography dictates that when one country gains land, another must cede land. This zero sum land game is the result of humans having created countries that are collectively exhaustive and mutually exclusive over the earth.
This collectively exhaustive and mutually exclusive division of area is a requirement of topology. With this premise, you don’t need to restate the obvious when creating geometries. For example, if you have a plot of land in the country and decide to use the northern half for farming and the other half for non-farming uses, then it follows that some kind of demarcation must be present, dividing the two halves. By creating one polygon for the farmable area, you create another polygon for the non-farmable area, and a linestring to part the two.
Consider another example. In 1790, the U.S. Congress created Washington D.C. from land ceded by adjoining states. The district is divided into four quadrants with two perpendicular axes radiating from the capitol buildings. The quadrants are appropriately named: Northwest, Northeast, Southwest, and Southeast. Suppose that in the spirit of equality, Congress decided to make all quadrants the same area. This would mean moving the center of the axes to the north and west. If you used a topology in the model, this reorganization would amount to nothing more than moving one point. By shifting this point, your linestrings (the axes) would follow along and the polygons forming the quadrants would either deflate or inflate. You’d achieve all this by simply moving a point!
This is the power of topology: by defining a set of rules for how geometries are interrelated, you save yourself the effort of having to survey the entire landscape anew whenever you make the slightest alteration.
In PostGIS there are three kinds of representations for vector data. There is the more standard geometry model, where each geometry stands as a separate unit. In the geometry model, things that are shared, such as borders of land masses, are duplicated in each geometry. There is the geography model which much like the geometry model treats each piece of space as separate units and borders are duplicated, but views these units in spheroidal space. Then there is the topology model, which borrows the 2D view of the world from geometry, but with one key difference. In the topology model shared borders and areas are stored once in the database and linked to the geometries that share the border. These geometries with linked edges are called topogeoms.
This has a couple of benefits:
- If you simplify an object for distribution, the edges that are simplified are still shared, so that you don’t end up with overlaps or gaps where you had none before.
- If you have a set of objects such as buildings, neighborhoods, or land parcels that shouldn’t overlap, it’s easier to detect and prevent these problems in a topology model.
Now that you have the concept of topologies fresh in your mind, we’ll move on to building topologies with PostGIS topology functions.
Using topologies
Topology is an entirely different take on spatial features than geometries. Think back to your first geometry course: In Euclidean geometry, points, lines, and polygons didn’t have coordinate systems as a backdrop. You didn’t care about the absolute measurement of things but rather the relationships between them. The topology model, in a way, reverts back to classical geometry, where you describe how two free geometries interact without any regard to coordinate systems.
Because GIS topology is an outgrowth of graph theory, it subscribes to a different set of terminology. For all intents and purposes, you can think of a point in geometry as a node in topology, a linestring as an edge, and a polygon as a face. Collectively, nodes, edges, and faces are topological primitives, used instead of geometries.
NOTE: We use the term topology to refer to both the topology field of study as well as to refer to a topology network.
Installing the topology extension
Before you can create a topology, you must make sure you’ve installed the topology extension. If you’re not sure, look for a schema named topology in your database. This schema contains functions used to create topologies as well as the topology catalog table. If it’s missing, you haven’t yet installed the extension. Extensions must be installed on a database-by-database basis.
CREATE EXTENSION postgis_topology;
As part of the topology installation, PostGIS adds the topology schema to your database’s search_path. This means you can reference the topology functions without explicitly prepending topology.
In some cases, the role you’re logged in as may have its own custom search_path setting that will override the database’s search_path. Before you continue, verify that topology is part of your search_path by running this SQL statement:
SHOW search_path;
If you don’t see the topology schema listed in the search_path, disconnect from your database and reconnect.
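If you’d like to check both conditions in one go, a minimal sketch using the standard PostgreSQL catalogs might look like the following. The final SET statement is only an assumption about how you might patch the current session; it presumes your other objects live in the public schema.

```sql
-- Is the topology extension installed in the current database?
SELECT extname, extversion
FROM pg_extension
WHERE extname = 'postgis_topology';

-- Does the topology schema exist?
SELECT nspname
FROM pg_namespace
WHERE nspname = 'topology';

-- Assumption: if topology is still missing from SHOW search_path,
-- add it for the current session only.
SET search_path = "$user", public, topology;
```

An empty result from either of the first two queries suggests the extension hasn’t been installed in this database.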
Once that’s done, you can create a topology.
Creating a topology
In this section you’ll create a stylized topology based on the rectangular state of Colorado with an SRID of 4326. The following listing shows how you can create the topology.
SELECT CreateTopology('ch13a_topology',4326);
After you execute the aforementioned SQL, you’ll notice a new schema named ch13a_topology. A new entry will appear in the topology.topology catalog table registering the new topology. When you peek inside the ch13a_topology schema, you’ll see four new tables awaiting data: node, edge_data, face, and relation.
PostGIS uses a separate schema to house each topology network, in this case ch13a_topology. The chosen SRID applies to all tables within the schema and all topogeometry columns that will make use of the ch13a_topology schema. Because topology is about relationships between geometries, having differing SRIDs makes no sense.
Within each topology, you’ll always find four tables: node, edge_data, face, and relation. The first three are just topo-speak for point, linestring, and polygon. Of these three tables for storing the primitives, edge_data is the one that holds all the information for building the network. When you start to build spatial objects from topology primitives, the relationships of each of these spatial objects with the topology will reside in the relation table.
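To confirm the registration and see what was created, you could run something like the following sketch. It relies only on the topology.topology catalog table and the standard information_schema; the column selection is our choice for illustration.

```sql
-- The catalog entry registering the new topology and its SRID
SELECT id, name, srid
FROM topology.topology
WHERE name = 'ch13a_topology';

-- The tables (and any views) PostGIS created in the topology's schema
SELECT table_name, table_type
FROM information_schema.tables
WHERE table_schema = 'ch13a_topology'
ORDER BY table_name;
```

The id returned by the first query is the topology_id that shows up later inside every topogeometry built on this topology.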
For Colorado, you can start by adding the linestrings that form the state’s four boundaries using the function TopoGeo_AddLineString, as shown in listing 1.
Listing 1. Building the Colorado topology network
SELECT TopoGeo_AddLineString(
    'ch13a_topology',
    ST_GeomFromText(
        'LINESTRING( -109.05304 39.195013, -109.05304 41.000889, -104.897461 40.996484 )',
        4326
    )
);
SELECT TopoGeo_AddLineString(
    'ch13a_topology',
    ST_GeomFromText(
        'LINESTRING( -104.897461 40.996484, -102.051744 40.996484, -102.051744 40.003029 )',
        4326
    )
);
SELECT TopoGeo_AddLineString(
    'ch13a_topology',
    ST_GeomFromText(
        'LINESTRING( -102.051744 40.003029, -102.04874 36.992682, -104.48204 36.992682 )',
        4326
    )
);
SELECT TopoGeo_AddLineString(
    'ch13a_topology',
    ST_GeomFromText(
        'LINESTRING( -104.48204 36.992682, -109.045226 36.999077, -109.05304 39.195013 )',
        4326
    )
);
To make sure you’ve typed or copied everything correctly, execute the following SQL:
SELECT ST_GetFaceGeometry('ch13a_topology',1);
If you look at the output on the Geometry viewer in pgAdmin, you should see what is shown in Figure 1.
Figure 1. Colorado as seen in pgAdmin4
The entire state of Colorado is one big face that is a perfectly rectangular polygon geometry.
Look inside the tables after running the code in listing 1 and you’ll see four new edges, four new nodes, and one new face. The TopoGeo_AddLineString function automatically generates the topology network using the edge data and fills in the nodes and faces. You now have a topology of the rectangular outline of Colorado.
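Rather than opening each table by hand, you could count the primitives with a query along these lines. This is a sketch; TopologySummary and ValidateTopology are PostGIS topology functions, schema-qualified here in case topology isn’t on your search_path.

```sql
-- Count the primitives, leaving out the universal face (face_id = 0)
SELECT
    (SELECT count(*) FROM ch13a_topology.node)      AS nodes,
    (SELECT count(*) FROM ch13a_topology.edge_data) AS edges,
    (SELECT count(*) FROM ch13a_topology.face
      WHERE face_id > 0)                            AS faces;

-- A text summary of the topology
SELECT topology.TopologySummary('ch13a_topology');

-- Returns one row per problem found; no rows means no errors were detected
SELECT * FROM topology.ValidateTopology('ch13a_topology');
```

Running the count again after each later listing is an easy way to watch the network grow.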
Two major interstate highways crisscross the state from boundary to boundary: I-25 runs north/south and I-70 runs west/east. You can add I-70 with the following code.
Listing 2. Adding highway I-70
SELECT TopoGeo_AddLineString(
    'ch13a_topology',
    ST_GeomFromText(
        'LINESTRING( -109.05304 39.195013, -108.555908 39.108751, -105.021057 39.717751, -102.051744 40.003029 )',
        4326
    )
);
Upon successfully adding I-70, the SELECT will return the ID number of the new edge. You should see the number 5 in the output.
Next, add I-25.
Listing 3. Adding highway I-25
SELECT TopoGeo_AddLineString(
    'ch13a_topology',
    ST_GeomFromText(
        'LINESTRING( -104.897461 40.996484, -105.021057 39.717751, -104.798584 38.814031, -104.48204 36.992682 )',
        4326
    )
);
Because you added I-70 first and then I-25, the latter will bisect I-70, creating two edges for itself and breaking I-70 into two edges. The output will return the ID numbers of the two new edges for I-25: 7 and 8.
A diagram will be helpful at this point. We used the QGIS PostGIS Topology Viewer to produce figure 2, which shows the four face IDs, eight edge IDs, and five nodes (each using a different style of numbers).
Figure 2. Colorado topology network
I-25 (edges 8, 7 with nodes 2, 5, 4) runs north/south. I-70 (edges 5, 6 with nodes 1, 5, 3) runs west to east. The two highways intersect at the state capital, Denver (node 5).
The addition of the highways split the original single-face Colorado into four faces. Look carefully at the tables again: PostGIS automatically reorganized your topology. The corner points are no longer nodes, just vertices outlining the edge. PostGIS added a node for Denver where the two highway edges intersect.
We modeled the highways with kinks. For I-25, the kink is at Colorado Springs. For I-70, the kink is at Grand Junction. These kinks are merely vertices used to refine the geometry; they play no part in relationships. As such, they aren’t nodes. Edges only intersect at nodes.
You now have a total of eight edges. The two highways slice Colorado into four distinct polygons or faces. The addition of highway I-25 split our original one-edge I-70 (edge 5) into two edges (5 and 6).
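To see for yourself which points became nodes, you could list the node table and look up the node at the highway intersection. This is a sketch: the coordinates are the shared vertex from listings 2 and 3, and the tolerance of 0.000001 degrees is an arbitrary small value we chose for illustration.

```sql
-- Every node with its location as text
SELECT node_id, ST_AsText(geom) AS location
FROM ch13a_topology.node
ORDER BY node_id;

-- Look up the node at the point where I-70 and I-25 share a vertex
SELECT topology.GetNodeByPoint(
    'ch13a_topology',
    ST_SetSRID(ST_Point(-105.021057, 39.717751), 4326),
    0.000001  -- assumed tolerance, in the units of the SRID (degrees)
) AS denver_node_id;
```

The kinks at Colorado Springs and Grand Junction should not appear in the first result, because they are vertices rather than nodes.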
If you look in the face table using the query below, you’ll see each of the faces listed as well as their mbr (minimum bounding rectangle), which is just the bounding box of the face. The face table doesn’t store the actual polygons because all the data necessary to derive them can be found in the edge_data table. This storage methodology abides by the database principle of keeping data in only one place. As before, you use the ST_GetFaceGeometry function to view the actual face geometry.
SELECT
    face_id,
    mbr,
    ST_GetFaceGeometry('ch13a_topology', face_id) AS geom ❶
FROM ch13a_topology.face
WHERE face_id > 0; ❷
❶ Return the face geometry
❷ Exclude the universal face which has no geometry
The output of the geom column in the above query is shown in Figure 3. The query only considers non-universal faces. The universal face always has a face ID of 0 and represents everything that lies outside the topology, so it has no geometry.
Figure 3. Colorado quadsected at Denver as seen in pgAdmin4
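Because faces are derived from edges, it can help to see which edges bound each face. The following sketch uses the PostGIS function ST_GetFaceEdges, which returns an ordered set of signed edge IDs; the LATERAL join is simply our way of calling it once per face.

```sql
SELECT f.face_id, fe.sequence, fe.edge
FROM ch13a_topology.face AS f
CROSS JOIN LATERAL
    topology.ST_GetFaceEdges('ch13a_topology', f.face_id) AS fe
WHERE f.face_id > 0          -- skip the universal face
ORDER BY f.face_id, fe.sequence;
```

A negative edge value means the edge is traversed against its stored direction while walking around the face.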
edge view and edge_data
The edge view is a view that contains a subset of the columns of the edge_data table. The edge_data table contains additional columns not defined in the OGC topology spec, but that are used internally by PostGIS topology. For general uses and to keep in line with the OGC topology standards, the edge view should be used instead of directly querying the edge_data table.
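As a brief illustration of that advice, a query against the edge view might look like this sketch, which picks out the columns describing how each edge connects nodes and separates faces:

```sql
SELECT edge_id, start_node, end_node, left_face, right_face
FROM ch13a_topology.edge
ORDER BY edge_id;
```

Each row ties an edge to the two nodes it joins and to the faces on either side of it, which is the essence of the network.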
Remember that topology isn’t concerned with describing geometries but how they’re related. Removing all the superfluous vertices in Colorado creates a skeletal network diagram that you can see in figure 4.
Figure 4. Simplified network topology
You can see the edges that make up the topology by querying the edge_data table using listing 4.
Listing 4. Query edge_data
SELECT * FROM ch13a_topology.edge_data ORDER BY edge_id;
The output of listing 4, shown in figure 5, gives the contents of the edge_data table based on the simplified topology.
Don’t be alarmed if your output is a little different from the figure. You should, however, have 8 edges at this point.
Figure 5. Simplified network topology
The topogeometry type
Once you’ve constructed your topology, you can group your primitives to constitute topogeometries (layers in topo-speak).
Let’s say you want to collect the four edges making up the highways in the Colorado model. You could start by creating a new table to store topogeometries, as shown in listing 5. To this table you could add a topogeometry column using the PostGIS AddTopoGeometryColumn function. You should always use the AddTopoGeometryColumn function to create new columns because it takes care of registering the new topogeometry column in the topology.layer table.
Listing 5. Creating a table to store highways and defining a topogeometry column
CREATE TABLE ch13.highways_topo (highway varchar(20) PRIMARY KEY);
SELECT AddTopoGeometryColumn(
    'ch13a_topology',
    'ch13',
    'highways_topo',
    'topo',
    'LINESTRING'
);
After running the preceding code, you should see a new entry in the topology.layer table. AddTopoGeometryColumn will return the auto-assigned ID of the new layer. Keep in mind that a topogeometry is always tied to a layer.
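If you missed the returned layer ID, you could look it up again with a query like this sketch against the topology.layer catalog table:

```sql
SELECT l.layer_id, l.schema_name, l.table_name,
       l.feature_column, l.feature_type
FROM topology.layer AS l
JOIN topology.topology AS t ON t.id = l.topology_id
WHERE t.name = 'ch13a_topology'
ORDER BY l.layer_id;
```

The layer_id shown for ch13.highways_topo is the value to pass wherever a later function asks for a layer.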
Once you have your topogeometry column, you can add the I-70 highway using the CreateTopoGeom function as shown in listing 6.
Listing 6. Defining I-70 topogeometry using CreateTopoGeom
INSERT INTO ch13.highways_topo (highway, topo)
VALUES (
    'I70',
    CreateTopoGeom( ❶
        'ch13a_topology',
        2, ❷
        1, ❸
        '{{5,2},{6,2}}'::topoelementarray ❹
    )
);
❶ Define entry for I-70 where the topology elements are formed from ch13a_topology
❷ The type of topogeom: 2 = lineal
❸ The ID of the layer this topogeom belongs to. This is the number returned when you defined the topogeom column.
❹ The elements that make up this topogeom. Each element in the array is composed of the element ID and the element type (1 = node, 2 = edge, 3 = face). In this example, all elements are edges.
When creating a topogeometry with CreateTopoGeom, you need to denote the topogeometry type by one of the following numbers: 1 = point, 2 = lineal, 3 = areal. In the case of topogeometries, polygons and multipolygons are lumped together under the areal type, points and multipoints are the point type, and linestrings and multilinestrings are the lineal type.
A topogeometry is implemented as a database composite type made up of four fields. You can see its makeup with the following query:
SELECT highway, topo, (topo).* FROM ch13.highways_topo WHERE highway = 'I70';
The output of the preceding query is:
 highway |   topo    | topology_id | layer_id | id | type
---------+-----------+-------------+----------+----+------
 I70     | (1,1,1,2) |           1 |        1 |  1 |    2
(1 row)
As you can see, expanding (topo).* lists as columns the parts that make up the topogeometry. The first item is the ID of the topology the topogeometry belongs to. The second item is the layer the topogeometry belongs to, which is the layer ID assigned to the ch13.highways_topo.topo column when you added the column. The third item is the ID of the topogeometry, found in the topogeo_id column of the relation table in the ch13a_topology schema. Finally, the fourth item is the type, in this case 2 = lineal.
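A topogeometry stores references rather than coordinates, but you can still get an ordinary geometry back by casting. A minimal sketch, assuming the ch13.highways_topo table from listings 5 and 6:

```sql
SELECT highway,
       GeometryType(topo::geometry) AS geom_type,
       ST_AsText(topo::geometry)    AS wkt
FROM ch13.highways_topo
ORDER BY highway;
```

The cast assembles the geometry from the edges the topogeometry references, so the result reflects whatever the underlying network currently looks like.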
If you have geometries to start with, you can use the powerful toTopoGeom function to convert geometries to topogeometries and add the newly formed topogeometries to your table in one step, as demonstrated in listing 7.
Listing 7. Defining a topogeometry using toTopoGeom
INSERT INTO ch13.highways_topo (highway, topo)
SELECT
    'I25',
    toTopoGeom( ❶
        ST_GeomFromText(
            'LINESTRING( -104.897461 40.996484, -105.021057 39.717751, -104.798584 38.814031, -104.48204 36.992682 )',
            4326
        ), ❷
        'ch13a_topology', ❸
        1 ❹
    );
❶ Define I-25 using toTopoGeom
❷ The geometry; any edges or nodes needed to form the geometry will be created if not present
❸ The topology
❹ The layer
In listing 7 you add the topogeometry of I-25 using the toTopoGeom function. The risk and benefit of using this function is that it will, by default, create new primitive edges, nodes, and faces as needed, if primitives don’t exist to form the new topogeometry.
In this example, you already added the primitive edges in listing 3, so toTopoGeom shouldn’t introduce new edges. You include the name of the topology you’re adding to, as well as the layer that this new topogeometry will be associated with. This layer ID must be the same as the one returned when you created the topogeometry column in listing 5.
If a node or edge needed to form the new topogeometry doesn’t exist, the toTopoGeom function will automatically apply a tolerance to find matching nodes or edges before resorting to creating them. In other words, if an existing node is within the snap distance of the linestring geometry, toTopoGeom will shift the linestring to incorporate the node as a vertex instead of creating a new node. If you want to override the default tolerance, you can pass in an additional final argument to toTopoGeom to apply a tolerance. The default tolerance that toTopoGeom uses is a function of the bounding box of the input geometry. This default tolerance is computed internally using the function topology._ST_MinTolerance.
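The following sketch shows what passing an explicit tolerance might look like. It reuses the I-25 linestring, assumes the layer ID is 1 as in listing 7, and uses 0.0001 degrees as an arbitrary tolerance chosen purely for illustration. Because toTopoGeom changes the topology even inside a plain SELECT, the experiment is wrapped in a transaction that is rolled back. Note that topology._ST_MinTolerance is an internal function, so treat calling it directly as exploratory.

```sql
BEGIN;

-- Peek at the default tolerance PostGIS would derive for this geometry
SELECT topology._ST_MinTolerance(
    ST_GeomFromText(
        'LINESTRING( -104.897461 40.996484, -105.021057 39.717751, -104.798584 38.814031, -104.48204 36.992682 )',
        4326
    )
) AS default_tolerance;

-- Same conversion as listing 7, but with an explicit tolerance
SELECT toTopoGeom(
    ST_GeomFromText(
        'LINESTRING( -104.897461 40.996484, -105.021057 39.717751, -104.798584 38.814031, -104.48204 36.992682 )',
        4326
    ),
    'ch13a_topology',
    1,       -- layer ID (assumed to match listing 5)
    0.0001   -- assumed tolerance, in the units of the SRID (degrees)
);

-- Discard any changes this experiment made to the topology
ROLLBACK;
```

A larger tolerance snaps more aggressively to existing nodes and edges, which keeps the network tidy at the cost of shifting your input geometry.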
To confirm the composition of your new topogeometries, you can use the GetTopoGeomElements function, as in the next listing.
Listing 8. Querying primitive elements of Colorado highways
SELECT highway, (topo).*, GetTopoGeomElements(topo) As el FROM ch13.highways_topo ORDER BY highway;
This listing outputs the four topogeometry fields, accessed with (topo).*, and a set of topoelements returned by the GetTopoGeomElements function.
 highway | topology_id | layer_id | id | type |  el
---------+-------------+----------+----+------+-------
 I25     |           1 |        1 |  2 |    2 | {7,2}
 I25     |           1 |        1 |  2 |    2 | {8,2}
 I70     |           1 |        1 |  1 |    2 | {5,2}
 I70     |           1 |        1 |  1 |    2 | {6,2}
The code in listing 8 returns a set of objects called topoelements for each topogeometry. Although you only have two rows in the highways_topo table, you get back four rows because GetTopoGeomElements returns a row for each edge of each highway.
The TopoElement object is an integer array domain type with two elements. The first is the ID of the element in the corresponding table. Because edges make up the highways, the IDs are edge_ids in ch13a_topology.edge. The second element of a topoelement denotes the layer/class type (1 = node, 2 = edge, 3 = face, and higher numbers are the IDs of layers).
Establish a naming convention
PostGIS doesn’t make a clear distinction between database objects that describe the topological networks versus your own use of topologies in topogeometry columns. We advise you to establish a naming convention. The myriad of schemas and tables supporting topologies can be overwhelming, especially for those charged with maintaining the underlying network.
Recap of using topologies
The PostGIS topology model provides the following features for working with topologies:
- Enabling the topology extension immediately creates the topology schema and functions.
- The topology.topology table records all topologies in your database.
- The topology.layer table records all topogeometry columns (layers) in your database.
- Each topological network has its own network schema.
- Primitives (edges, nodes, faces) have their respective tables in the network schema.
- The relation table in a specific topology network schema (in this case, ch13a_topology.relation) records which topology primitives and layer elements belong in which topogeometry.
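To see how these pieces connect, you could join your own table to the relation table of the network schema. This is a sketch, assuming the ch13.highways_topo table built earlier:

```sql
SELECT h.highway,
       r.topogeo_id,
       r.layer_id,
       r.element_id,
       r.element_type   -- 1 = node, 2 = edge, 3 = face
FROM ch13.highways_topo AS h
JOIN ch13a_topology.relation AS r
  ON  r.topogeo_id = (h.topo).id
  AND r.layer_id   = (h.topo).layer_id
ORDER BY h.highway, r.element_id;
```

This returns much the same information as GetTopoGeomElements, but read directly from the table where PostGIS records it.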
Once you’ve built your topologies, you’re free to use them anywhere within the database by building topogeometries from them. The process follows:
- Add topogeometry columns (layers) to your own tables.
- Create topogeometries from primitives or other layers, and add them to your topogeometry column.
- Add topogeometries from geometries and change your underlying network in one step using the toTopoGeom function. Keep in mind, though, that once you do this, edges, faces, and nodes are automatically added and existing ones are split. Once your topology is changed this way, simply removing the introduced topogeometry is not sufficient to revert the changes to the topology.
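One way to spot leftovers after removing a topogeometry is to list the edges that no topogeometry references. The query below is a sketch under the assumption that edge references are stored in the relation table with element_type 2; the abs() call is a defensive assumption in case element IDs are stored signed.

```sql
SELECT e.edge_id
FROM ch13a_topology.edge AS e
LEFT JOIN ch13a_topology.relation AS r
  ON  r.element_type = 2
  AND abs(r.element_id) = e.edge_id
WHERE r.element_id IS NULL
ORDER BY e.edge_id;
```

In the Colorado model, the boundary edges from listing 1 would be expected to show up here, because only the highway edges were assigned to topogeometries. Unreferenced edges aren’t necessarily wrong; they simply aren’t part of any layer.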
That’s all for this article. If you want to learn more about the book, you can check it out on Manning’s browser-based liveBook reader here.