XML query in Bash with XMLStarlet

I need to extract a few values from an XML file, and I stumbled onto XMLStarlet, which seems pretty powerful. Knowing little about XML, I'm overwhelmed by the tool and likely need only a tiny part of it. I have a file like the one below, and I want to get, say, the following address:

<es:ipAddress>123_Westbrook</es:ipAddress>

How would I type that?

What is the significance of these extra fields like es? I assume this brackets this particular object (the ipAddress value, 123 Westbrook), but what part of the path is actually given to XMLStarlet? The brackets? The parameter name? Separated by slashes?

Perhaps:

xmlstarlet sel '<bulkCmConfigDataFile xmlns:gn="JOE.xsd"> < configDat dnPrefix="Undefined"> < xn:Subnetwork id="Oz"><xn:MeContext id="BANANS"><xn:attributes><es:vsDataMeContext><es:ipAddress>

Which should point to the value 123_Westbrook? Insert slashes? Something else?

The original file is very large, so here's the first part of the XML (many of the closing tags are missing because I'm posting only part of it):

<?xml version="1.0" encoding="UTF-8"?>
<bulkCmConfigDataFile xmlns:un="utranNrm.xsd"
    xmlns:es="FRED.99.88.xsd"
    xmlns:xn="JIM.xsd" xmlns:gn="JOE.xsd" xmlns="CARL.xsd">
    <fileHeader fileFormatVersion="THE_GOOD_OND" vendorName="Mr. Softie"/>
    <configData dnPrefix="Undefined">
        <xn:SubNetwork id="ROOM_4_MORE">
            <xn:SubNetwork id="Oz">
                <xn:attributes>
                    <xn:userDefinedNetworkType>SECRET_SERVICE</xn:userDefinedNetworkType>
                    <xn:userLabel>OZ</xn:userLabel>
                </xn:attributes>
                <xn:MeContext id="BANANAS">
                    <xn:VsDataContainer id="BANANAS">
                        <xn:attributes>
                            <xn:vsDataType>SECRET_SQUIRREL</xn:vsDataType>
                            <xn:vsDataFormatVersion>GOOD_HUMOR</xn:vsDataFormatVersion>
                            <es:vsDataMeContext>
                                <es:userLabel>ORANGE</es:userLabel>
                                <es:ipAddress>123_Westbrook</es:ipAddress>
                                <es:neMIMversion>S-11</es:neMIMversion>
                                <es:lostSynchronisation>SYNCHRONISED</es:lostSynchronisation>
                                <es:bcrLastChange>LAST_DATE</es:bcrLastChange>
                                <es:bctLastChange>LAST_DATE</es:bctLastChange>
                                <es:multiStandardRbs6k>uh-uh</es:multiStandardRbs6k>

gmark

Posted 2016-01-15T00:09:23.467

Answers

What is the significance of these extra fields like es?

es is a namespace prefix: it says that ipAddress belongs to the namespace identified by FRED.99.88.xsd. Look at the xmlns:es="FRED.99.88.xsd" namespace declaration, which is one of the attributes of the bulkCmConfigDataFile root element.
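
To make the prefix-to-namespace mapping concrete, here is a minimal sketch in Python's standard library (not XMLStarlet itself). The SAMPLE string is a trimmed, well-formed stand-in for the file in the question; its closing tags are an assumption, since the original excerpt is truncated. ElementTree shows each element name as "{namespace}local-name", so the prefix disappears and only the URI it stands for remains.

```python
import xml.etree.ElementTree as ET

# Trimmed stand-in for the question's file (closing tags are assumed).
SAMPLE = """<bulkCmConfigDataFile xmlns:es="FRED.99.88.xsd"
    xmlns:xn="JIM.xsd" xmlns="CARL.xsd">
  <configData dnPrefix="Undefined">
    <xn:attributes>
      <es:vsDataMeContext>
        <es:ipAddress>123_Westbrook</es:ipAddress>
      </es:vsDataMeContext>
    </xn:attributes>
  </configData>
</bulkCmConfigDataFile>"""

root = ET.fromstring(SAMPLE)

# Each tag is reported as "{namespace-URI}local-name": the prefix (es, xn)
# is only shorthand for the URI declared with xmlns:<prefix>="...".
tags = [el.tag for el in root.iter()]
for tag in tags:
    print(tag)
```

Elements without a prefix, such as configData, are not namespace-free here: they fall into the default namespace declared with xmlns="CARL.xsd".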


I assume this brackets this particular object (the ipAddress value, 123 Westbrook), but what part of the path is actually given to XMLStarlet?

According to XMLStarlet documentation:

sel (or select) - Select data or query XML document(s) (XPATH, etc)

and the output of xmlstarlet sel --help:

XMLStarlet Toolkit: Select from XML document(s)
Usage: xmlstarlet sel <global-options> {<template>} [ <xml-file> ... ]
where
  <global-options> - global options for selecting
  <xml-file> - input XML document file name/uri (stdin is used if missing)
  <template> - template for querying XML document with following syntax:

<global-options> are:
  -Q or --quiet             - do not write anything to standard output.
  -C or --comp              - display generated XSLT
  -R or --root              - print root element <xsl-select>
  -T or --text              - output is text (default is XML)
  -I or --indent            - indent output
  -D or --xml-decl          - do not omit xml declaration line
  -B or --noblanks          - remove insignificant spaces from XML tree
  -E or --encode <encoding> - output in the given encoding (utf-8, unicode...)
  -N <name>=<value>         - predefine namespaces (name without 'xmlns:')
                              ex: xsql=urn:oracle-xsql
                              Multiple -N options are allowed.
  --net                     - allow fetch DTDs or entities over network
  --help                    - display help

Syntax for templates: -t|--template <options>
where <options>
  -c or --copy-of <xpath>   - print copy of XPATH expression
  -v or --value-of <xpath>  - print value of XPATH expression
  -o or --output <string>   - output string literal
  -n or --nl                - print new line
  -f or --inp-name          - print input file name (or URL)
  -m or --match <xpath>     - match XPATH expression
  --var <name> <value> --break or
  --var <name>=<value>      - declare a variable (referenced by $name)
  -i or --if <test-xpath>   - check condition <xsl:if test="test-xpath">
  --elif <test-xpath>       - check condition if previous conditions failed
  --else                    - check if previous conditions failed
  -e or --elem <name>       - print out element <xsl:element name="name">
  -a or --attr <name>       - add attribute <xsl:attribute name="name">
  -b or --break             - break nesting
  -s or --sort op xpath     - sort in order (used after -m) where
  op is X:Y:Z, 
      X is A - for order="ascending"
      X is D - for order="descending"
      Y is N - for data-type="numeric"
      Y is T - for data-type="text"
      Z is U - for case-order="upper-first"
      Z is L - for case-order="lower-first"
...    

So the path you give to XMLStarlet is an XPath expression that selects the element you want.


Which should point to the value 123_Westbrook? Insert slashes? Something else?

As your question looks like homework to me, I will just give you some pointers:

  • XMLStarlet syntax tip:
    xmlstarlet sel -t <template option> <XPath to es:ipAddress tag> -n <filename.xml>
    Use one of the template options that takes an XPath expression, such as -v.
  • XPath examples and sandbox
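
As a sketch of what a full, slash-separated XPath to es:ipAddress looks like, the following uses Python's standard library rather than XMLStarlet. The SAMPLE document reproduces the nesting from the question with closing tags filled in as an assumption, and the prefix c is a hypothetical name chosen here for the default namespace CARL.xsd. The point it illustrates: every step of the path is an element name with its prefix, steps are separated by slashes, and attribute tests such as [@id='Oz'] narrow a step down.

```python
import xml.etree.ElementTree as ET

# Same nesting as the question's excerpt; closing tags are assumed.
SAMPLE = """<bulkCmConfigDataFile xmlns:es="FRED.99.88.xsd"
    xmlns:xn="JIM.xsd" xmlns="CARL.xsd">
  <configData dnPrefix="Undefined">
    <xn:SubNetwork id="ROOM_4_MORE">
      <xn:SubNetwork id="Oz">
        <xn:MeContext id="BANANAS">
          <xn:VsDataContainer id="BANANAS">
            <xn:attributes>
              <es:vsDataMeContext>
                <es:ipAddress>123_Westbrook</es:ipAddress>
              </es:vsDataMeContext>
            </xn:attributes>
          </xn:VsDataContainer>
        </xn:MeContext>
      </xn:SubNetwork>
    </xn:SubNetwork>
  </configData>
</bulkCmConfigDataFile>"""

# Prefix -> namespace URI. "c" is a made-up prefix for the default namespace.
NS = {"c": "CARL.xsd", "xn": "JIM.xsd", "es": "FRED.99.88.xsd"}

root = ET.fromstring(SAMPLE)

# Path relative to the root element, one step per nesting level.
FULL_PATH = (
    "c:configData/xn:SubNetwork/xn:SubNetwork[@id='Oz']"
    "/xn:MeContext/xn:VsDataContainer/xn:attributes"
    "/es:vsDataMeContext/es:ipAddress"
)
ip_element = root.find(FULL_PATH, NS)
print(ip_element.text)

# Without a prefix, the step looks for an element in no namespace at all.
unprefixed = root.find("configData", NS)
print(unprefixed)
```

The last lookup comes back empty because configData lives in the default namespace. The same XPath rule applies in XMLStarlet, where a prefix for that namespace can be bound with the -N option listed in the help text.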

g2mk

Posted 2016-01-15T00:09:23.467

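To get the value of the es:ipAddress element with xmlstarlet:

xmlstarlet sel -t -v '//es:ipAddress' thefilename.xml

which prints 123_Westbrook. The leading // matches the element at any depth, so you do not have to spell out the full path. The es prefix has to be known to XMLStarlet: recent releases pick up the namespaces declared on the root element automatically; if yours does not, bind the prefix yourself by adding -N es=FRED.99.88.xsd before -t.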
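
To show why the prefix binding matters for a query like //es:ipAddress, here is a small sketch using Python's standard library instead of XMLStarlet. SAMPLE is a shortened, well-formed stand-in for the question's file, and ip_addresses is a hypothetical helper written for this illustration. The dictionary passed to it plays the role of the -N name=value option from the help text: what counts is the namespace URI, not the spelling of the prefix.

```python
import xml.etree.ElementTree as ET

# Shortened stand-in for the question's file (structure simplified).
SAMPLE = """<bulkCmConfigDataFile xmlns:es="FRED.99.88.xsd" xmlns="CARL.xsd">
  <configData dnPrefix="Undefined">
    <es:vsDataMeContext>
      <es:userLabel>ORANGE</es:userLabel>
      <es:ipAddress>123_Westbrook</es:ipAddress>
    </es:vsDataMeContext>
  </configData>
</bulkCmConfigDataFile>"""

root = ET.fromstring(SAMPLE)


def ip_addresses(prefix_map, xpath):
    """Return the text of every element matched by xpath under the root."""
    return [el.text for el in root.findall(xpath, prefix_map)]


# ".//" searches at any depth, like "//" in the xmlstarlet command.
with_es = ip_addresses({"es": "FRED.99.88.xsd"}, ".//es:ipAddress")

# Any prefix works as long as it is bound to the same namespace URI.
with_other = ip_addresses({"fred": "FRED.99.88.xsd"}, ".//fred:ipAddress")

# No prefix means "no namespace", so nothing matches.
unprefixed = ip_addresses({}, ".//ipAddress")

print(with_es, with_other, unprefixed)
```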

PBI

Posted 2016-01-15T00:09:23.467
