How To :: Unix :: Convert Google Earth KML files for use with Generic Mapping Tools (GMT) style XY files

Many content developers now publish Google Earth (GE) KML files as the default format for public distribution, often exported from third-party applications such as ESRI's ArcInfo. This is great if you only use GE, but many digital cartographers (myself included) still use Generic Mapping Tools (GMT), an open-source suite of UNIX programs. I had to create a map of the December 2008 earthquake swarm in the Yellowstone region, and the most recent fault databases for the region, provided by the United States Geological Survey (USGS), were only available in GE KML format. So, using some Vi regex- and macro-magic, here is how to convert KML files to GMT-friendly XY data files...

Note: This may not be the most efficient method out there, and I am all ears if you have a better way! - RLN

The KML you start with typically looks like this:

<?xml version="1.0" encoding="UTF-8"?>
<kml xmlns="http://earth.google.com/kml/2.2" xmlns:atom="http://www.w3.org/2005/Atom">
    <Document>
        <open>1</open>
        <name><![CDATA[LateQuaternary]]></name>
        <Folder>
            <name>Legend</name>
            <visibility>0</visibility>
            <ScreenOverlay>
                <name>Legend</name>
                <visibility>0</visibility>
                <overlayXY x="0" y="1" xunits="fraction" yunits="fraction" />
                <screenXY x="0" y="1" xunits="fraction" yunits="fraction" />
                <Icon>
                    <href>legend.png</href>
                </Icon>
            </ScreenOverlay>
        </Folder>
        <Folder>
            <name><![CDATA[LateQuaternary]]></name>
            <visibility>1</visibility>
            <open>1</open>
            <Placemark>
                <name><![CDATA[]]></name>
                <Style id="LateQuaternary_def_">
                    <IconStyle>
                        <scale>1</scale>
                        <Icon>
                            <w>16</w>
                            <h>16</h>
                            <href>LateQuaternary_def_.png</href>
                        </Icon>
                    </IconStyle>
                    <PolyStyle>
                        <color>ff00ffff</color>
                        <fill>0</fill>
                    </PolyStyle>
                    <LineStyle>
                        <color>ff00ffff</color>
                        <width>2</width>
                    </LineStyle>
                    <LabelStyle>
                        <scale>0</scale>
                    </LabelStyle>
                </Style>
            </Placemark>
            <Folder>
                <name>Data</name>
                <visibility>1</visibility>
                <Placemark id="pm5695" >
                    <Snippet maxLines="0">empty</Snippet>
                    <name><![CDATA[Agai Pah Hills fault zone]]></name>
                    <description><![CDATA[<table cellpadding="1" cellspacing="1">
                        <tr>
                            <td>NAME:</td>
                            <td>Agai Pah Hills fault zone</td>
                        </tr>
                        <tr>
                            <td>NUM:</td>
                            <td>1308</td>
                        </tr>
                    </table>]]>
                    </description>
                    <styleUrl>Style_37</styleUrl>
                    <MultiGeometry>
                        <Point id="g49314">
                            <altitudeMode>clampedToGround</altitudeMode>
                            <coordinates>-118.523687878657,38.834129815241</coordinates>
                        </Point>
                        <MultiGeometry>
                            <tessellate>1</tessellate>
                            <LineString id="g49315">
                                <altitudeMode>clampedToGround</altitudeMode>
                                <coordinates>-118.523967939659,38.7999376440516,1
                                    -118.524547956503,38.7989493008326,1 
                                    -118.525211303626,38.7970381695745,1 
                                    -118.525615747629,38.7947875892166,1 
                                    -118.52551684289,38.7932047950355,1 
                                </coordinates>
                            </LineString>
                            <LineString id="g49316">
                                <altitudeMode>clampedToGround</altitudeMode>
                                <coordinates>-118.523952406569,38.8030615668241,1 
                                    -118.52374683954,38.8025582279755,1 
                                    -118.523739047814,38.8009026543606,1 
                                    -118.523792382107,38.8007843197693,1 
                                </coordinates>
                                ....
                                ... lots of coordinates ...
                                ....
                            </LineString>
                        </MultiGeometry>
                    </MultiGeometry>
                </Placemark>
            </Folder>
        </Folder>
    </Document>
</kml>

We need to convert this into something that GMT's psxy command can understand, in the following format:
(Note: a > sign on its own line tells GMT that a separate segment is to be plotted)

>
-118.523687878657 38.834129815241
>
-118.523967939659 38.7999376440516
-118.524547956503 38.7989493008326
-118.525211303626 38.7970381695745
-118.525615747629 38.7947875892166
-118.52551684289 38.7932047950355
>
-118.523952406569 38.8030615668241
-118.52374683954 38.8025582279755
-118.523739047814 38.8009026543606
-118.523792382107 38.8007843197693
>
-118.52341853304 38.8080249547814
-118.523296299888 38.8072693902778
-118.523258506779 38.8054727040433
-118.523281276713 38.8043476917548
>
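
A quick way to confirm that a finished file really has this layout is to count its segments and points. The following is only a sketch: the function name check_xy is made up for illustration, and the number test is deliberately simple (plain decimals, no exponents), which is an assumption that holds for coordinates like the ones above.

```shell
# check_xy (hypothetical helper): summarise a GMT multi-segment XY file.
# Prints "segments=N points=M" and exits non-zero if any data line is not
# exactly two plain decimal numbers.
check_xy() {
    awk '
        /^>/    { segments++; next }     # segment marker
        NF == 0 { next }                 # tolerate blank lines
        NF != 2 || $1 !~ /^-?[0-9.]+$/ || $2 !~ /^-?[0-9.]+$/ { printf "bad line %d: %s\n", NR, $0; bad = 1; next }
                { points++ }
        END     { printf "segments=%d points=%d\n", segments, points; exit bad + 0 }
    ' "$@"
}
```

Usage: check_xy gmt_coords.xy (or pipe data into it). A line that still contains commas, a third column, or a leftover tag is reported with its line number.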

1. Delete the header and footer lines (the KML, Document, Folder tags, etc.) with 'dd'.

2. Replace every opening <coordinates> tag with a '>' on its own line, which is the marker GMT uses to split the XY data into segments:

:%s/<coordinates>/\r>\r/g

3. Remove every closing </coordinates> tag, along with the rest of its line:

:%s/<\/coordinates>.*//g

4. Remove all empty lines:

:g/^$/ d
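
If you would rather not do steps 2 to 4 interactively, the same three edits can be approximated with awk. This is a sketch rather than an exact equivalent: the function name kml_mark_segments is hypothetical, and unlike the Vim commands it also trims leading and trailing whitespace, so lines left holding only indentation are dropped too.

```shell
# kml_mark_segments (hypothetical helper): non-interactive take on steps 2-4.
kml_mark_segments() {
    awk '
        {
            # Step 2: put a ">" on its own line where <coordinates> opened.
            gsub(/<coordinates>/, "\n>\n")
            n = split($0, parts, "\n")
            for (i = 1; i <= n; i++) {
                p = parts[i]
                sub(/<\/coordinates>.*/, "", p)   # Step 3: drop the closing tag and the rest of the line
                sub(/^[ \t]+/, "", p)             # extra: trim indentation
                sub(/[ \t]+$/, "", p)             # extra: trim trailing blanks
                if (p != "") print p              # Step 4: skip empty lines
            }
        }
    ' "$@"
}
```

Usage: kml_mark_segments faults.kml > faults_marked.txt, where both file names are placeholders. Other tags (for example altitudeMode or LineString) are passed through unchanged, just as they are by the Vim substitutions.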

5. Write a macro to remove all the <tr> table tags and their content. In my examples, after all my regexes, these blocks are always nine lines long, so the macro only needs to find the start of a block and delete nine lines.

6. Replace all commas (',') with spaces (' '):

:%s/,/ /g

7. Use awk to print just the first two columns (lon and lat, not elevation):

awk '{print $1,$2}' google_coords.kml > gmt_coords.xy
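
One thing to be aware of: on a line that holds only '>', the second field is empty, so print $1,$2 writes '> ' with a trailing space. A possible variant, sketched below under the hypothetical name kml_coords_to_xy, folds steps 6 and 7 into one pass, writes segment markers as a bare '>', and keeps only lines whose first two fields are plain decimal numbers. That last filter is my own addition, not part of the steps above; it also discards any leftover tag lines.

```shell
# kml_coords_to_xy (hypothetical helper): steps 6 and 7 in a single awk pass.
kml_coords_to_xy() {
    awk '
        /^[ \t]*>/ { print ">"; next }    # segment marker: emit a bare ">"
        { gsub(/,/, " ") }                # step 6: commas to spaces (awk re-splits the fields)
        NF >= 2 && $1 ~ /^-?[0-9.]+$/ && $2 ~ /^-?[0-9.]+$/ { print $1, $2 }   # step 7: lon lat only
    ' "$@"
}
```

Usage: kml_coords_to_xy google_coords.kml > gmt_coords.xy, using the same file names as the awk command above.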

8. Remove the stray carriage-return characters (shown as ^M in Vim) left at the end of lines, such as the lines with >:

:%s/^M//g

(where ^M is entered by holding down the ctrl key and typing v, then m; ctrl-v is the escape that lets you insert a literal control character)
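
The same cleanup can be done outside the editor. A minimal sketch, assuming the standard tr and sed utilities are available; the function name clean_xy is hypothetical, and it strips trailing whitespace as well as carriage returns:

```shell
# clean_xy (hypothetical helper): remove carriage returns and trailing
# whitespace from every line of the given files (or of standard input).
clean_xy() {
    cat "$@" | tr -d '\r' | sed 's/[[:space:]]*$//'
}
```

Usage: clean_xy gmt_coords.xy > gmt_coords_clean.xy, where the output file name is a placeholder. Write to a new file rather than redirecting onto the input, since the shell would truncate the input before it is read.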

Click here to see the resulting map created from these fault databases

Resources used