I love xmlformat

Many times I've captured a machine-generated XML document for debugging purposes. Usually these documents omit all line breaks, indentation, and other niceties that make the text more readable. Of course no real XML parser cares about such things, but the very-incomplete XML parser in my head really appreciates them. I've spent plenty of time manually re-formatting various documents to get them into a state where I could understand the structure and make some sense of it.

But, wow, that is tedious work. Click, enter, space, space, click, enter, space, space, etc, etc, etc... I went looking for a better solution, and found xmlformat. You can also use tidy, which is included on OS X by default. I prefer xmlformat because it is easier to configure.

alex@rutabaga:~$ xmlformat --show-config
*DEFAULT
  format = block
  entry-break = 1
  element-break = 1
  exit-break = 1
  subindent = 2
  normalize = no
  wrap-length = 0

*DOCUMENT
  format = block
  entry-break = 0
  element-break = 1
  exit-break = 1
  subindent = 0
  normalize = no
  wrap-length = 0

This is pretty much the default configuration. I keep my configuration in ~/.xmlformat.conf, and point xmlformat at it by putting export XMLFORMAT_CONF=~/.xmlformat.conf in my ~/.profile script.

Why is this good?

With a single command:

alex@rutabaga:~$ xmlformat wfs_getfeature.xml

This mess:

<wfs:GetFeature xmlns:wfs="http://www.opengis.net/wfs" service="WFS" version="1.1.0" xsi:schemaLocation="http://www.opengis.net/wfs http://schemas.opengis.net/wfs/1.1.0/wfs.xsd" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"><wfs:Query typeName="kodos:" srsName="EPSG:4326" xmlns:kodos="http://www.regionsproject.org/kodos"><ogc:Filter xmlns:ogc="http://www.opengis.net/ogc"><ogc:And><ogc:PropertyIsLessThan><ogc:PropertyName>map_id</ogc:PropertyName><ogc:Literal>0</ogc:Literal></ogc:PropertyIsLessThan><ogc:BBOX><ogc:PropertyName>area</ogc:PropertyName><gml:Envelope xmlns:gml="http://www.opengis.net/gml" srsName="EPSG:4326"><gml:lowerCorner>-227.08007812014 0.922841119201</gml:lowerCorner><gml:upperCorner>17.08007812012 68.341004876129</gml:upperCorner></gml:Envelope></ogc:BBOX></ogc:And></ogc:Filter></wfs:Query></wfs:GetFeature>

is transformed into this:

<wfs:GetFeature xmlns:wfs="http://www.opengis.net/wfs" service="WFS" version="1.1.0" xsi:schemaLocation="http://www.opengis.net/wfs http://schemas.opengis.net/wfs/1.1.0/wfs.xsd" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <wfs:Query typeName="kodos:" srsName="EPSG:4326" xmlns:kodos="http://www.regionsproject.org/kodos">
    <ogc:Filter xmlns:ogc="http://www.opengis.net/ogc">
      <ogc:And>
        <ogc:PropertyIsLessThan>
          <ogc:PropertyName>map_id</ogc:PropertyName>
          <ogc:Literal>0</ogc:Literal>
        </ogc:PropertyIsLessThan>
        <ogc:BBOX>
          <ogc:PropertyName>area</ogc:PropertyName>
          <gml:Envelope xmlns:gml="http://www.opengis.net/gml" srsName="EPSG:4326">
            <gml:lowerCorner>-227.08007812014 0.922841119201</gml:lowerCorner>
            <gml:upperCorner>17.08007812012 68.341004876129</gml:upperCorner>
          </gml:Envelope>
        </ogc:BBOX>
      </ogc:And>
    </ogc:Filter>
  </wfs:Query>
</wfs:GetFeature>

It's now easy to see which elements are nested inside which. This makes all kinds of debugging and troubleshooting tasks immensely easier.
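
If you're curious what this kind of re-indentation involves, here is a rough sketch using Python's standard xml.dom.minidom module. This is not how xmlformat works internally, and it has none of xmlformat's configuration options; the function name pretty_xml and the sample document are made up for illustration. It also assumes the input has no whitespace between tags, as in the one-line mess above; input that is already indented will pick up extra blank lines.

```python
import xml.dom.minidom


def pretty_xml(text: str, indent: str = "  ") -> str:
    """Re-indent a compact XML string (hypothetical helper, not part of xmlformat)."""
    dom = xml.dom.minidom.parseString(text)
    # Serialize from the root element so no <?xml ...?> declaration is added.
    pretty = dom.documentElement.toprettyxml(indent=indent)
    # toprettyxml ends with a newline; drop it so the caller decides.
    return pretty.rstrip("\n")


if __name__ == "__main__":
    # Made-up sample with everything on one line, like captured machine output.
    compact = "<a><b>map_id</b><c><d>0</d></c></a>"
    print(pretty_xml(compact))
```

Each nesting level is pushed in by two spaces, the same as subindent = 2 in the configuration above, and elements that contain only text stay on one line.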

A co-worker pointed out that xmllint --format somefile accomplishes basically the same thing. xmllint ships by default on most Linux distributions and on OS X Snow Leopard. Thanks!
