Iterative Find (from list) of parent tag to replace child tag value

0

I have an XML file with a simple tag hierarchy like this:

<Parent Numbrt="X1">
    <Namedchild>Yes</Namedchild>
    ....more children...
<Parent Number="X2">
    <Namedchild>Yes</Namedchild>
    ....more children...
x10000 lines

I need to find each parent tag by its attribute value and replace the child tag's value with a new value Z, based on a list (CSV or otherwise) like this:

Parent  New-Child-Value
X1      No
X4      No
X5      No
etc

for about 800 matches in the table.

The result would be that every parent listed in the source list has "No" as its child tag value in the original XML.

I'm most familiar with Python.

MartinE

Posted 2019-10-08T18:14:30.390

Reputation: 1

NEVER parse XML with regex, see: https://stackoverflow.com/a/1732454/372239 – Toto – 2019-10-08T18:21:47.077

@Toto what tool/method would you suggest to get the desired output? – MartinE – 2019-10-08T18:40:06.517

Use a programming language that can parse XML, for example, with PHP: https://stackoverflow.com/q/3577641/372239

– Toto – 2019-10-08T18:45:32.750

With Python, use https://stackoverflow.com/q/11709079/372239

– Toto – 2019-10-09T10:00:02.343

Answers

0

Use XSLT. In your case XSLT-2.0 would be sufficient, and it can be executed from PHP.

So, you can use the following XSLT-2.0 code to achieve your goal.
Use this mapping file called a.txt (adjust the RegEx if you're using a different format):

Parent  New-Child-Value
X1      No
X4      No
X5      No

and this XSLT-2.0 file in the same directory:

<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="xml" indent="yes" omit-xml-declaration="yes" />
    <xsl:variable name="fileName" select="'a.txt'" />
    <xsl:variable name="csv" select="unparsed-text($fileName)" />
    <xsl:variable name="input" select="tokenize($csv, '\r?\n')[normalize-space()][position() > 1]"/>
    <xsl:variable name="replace">
      <xsl:for-each select="$input">
        <xsl:analyze-string select="." regex='(.+?)\s+(.+)'>
            <xsl:matching-substring>
                <replace>
                    <this><xsl:value-of select="regex-group(1)" /></this>
                    <that><xsl:value-of select="regex-group(2)" /></that>
                </replace>
            </xsl:matching-substring>
            <xsl:non-matching-substring>
                <xsl:message terminate="yes">REPLACEMENT FILE HEADER is not well-formed!</xsl:message>
            </xsl:non-matching-substring>
        </xsl:analyze-string>
      </xsl:for-each>
    </xsl:variable>

    <!-- Identity template -->
    <xsl:template match="node()|@*">
        <xsl:copy>
            <xsl:apply-templates select="node()|@*" />
        </xsl:copy>
    </xsl:template>  

    <!-- Replace matching values -->
    <xsl:template match="Namedchild[$replace/replace[this = current()/../@Number]]">
        <xsl:copy>
            <xsl:value-of select="$replace/replace[this = current()/../@Number]/that" />
        </xsl:copy>
    </xsl:template>

</xsl:stylesheet>

with this XML file (compared to your sample code, I corrected the misspelled attribute name to Number, closed the Parent elements, and added a root element):

<root>
    <Parent Number="X1">
        <Namedchild>Yes</Namedchild>
        ....more children...
    </Parent>
    <Parent Number="X2">
        <Namedchild>Yes</Namedchild>
        ....more children...
        x10000 lines
    </Parent>
    <Parent Number="X3">
        <Namedchild>Yes</Namedchild>
        ....more children...
        x10000 lines
    </Parent>
    <Parent Number="X4">
        <Namedchild>Yes</Namedchild>
        ....more children...
        x10000 lines
    </Parent>
    <Parent Number="X5">
        <Namedchild>Yes</Namedchild>
        ....more children...
        x10000 lines
    </Parent>
</root>

Its output (with an XSLT-2.0 processor) is:

<root>
   <Parent Number="X1">
      <Namedchild>No</Namedchild>
        ....more children...
    </Parent>
   <Parent Number="X2">
      <Namedchild>Yes</Namedchild>
        ....more children...
        x10000 lines
    </Parent>
   <Parent Number="X3">
      <Namedchild>Yes</Namedchild>
        ....more children...
        x10000 lines
    </Parent>
   <Parent Number="X4">
      <Namedchild>No</Namedchild>
        ....more children...
        x10000 lines
    </Parent>
   <Parent Number="X5">
      <Namedchild>No</Namedchild>
        ....more children...
        x10000 lines
    </Parent>
</root>

This is the way to go. Use an XSLT-2.0 processor for Python to achieve this result.

This approach is standards-compliant and robust, because it works on the parsed XML tree rather than on raw text.
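
Since the question mentions Python, here is a minimal sketch of the same lookup-and-replace idea using only the standard library's xml.etree.ElementTree. Note that this is not the XSLT approach above but a plain-Python alternative that also avoids regex on the XML itself. The function names load_mapping and apply_mapping are made up for illustration, the sample data is inlined and shortened, and the sketch assumes the mapping file is whitespace-separated with one header row and that each Parent has at most one Namedchild.

```python
import xml.etree.ElementTree as ET


def load_mapping(text):
    """Parse 'Parent  New-Child-Value' rows into a dict, skipping the header row."""
    lines = [line for line in text.splitlines() if line.strip()]
    mapping = {}
    for line in lines[1:]:  # first non-blank line is assumed to be the header
        parts = line.split(None, 1)
        if len(parts) != 2:
            raise ValueError(f"Malformed mapping row: {line!r}")
        mapping[parts[0]] = parts[1].strip()
    return mapping


def apply_mapping(root, mapping, parent_tag="Parent",
                  key_attr="Number", child_tag="Namedchild"):
    """Overwrite the child's text for each parent whose key attribute is mapped.

    Returns the number of child elements that were updated.
    """
    changed = 0
    for parent in root.iter(parent_tag):
        new_value = mapping.get(parent.get(key_attr))
        child = parent.find(child_tag)
        if new_value is not None and child is not None:
            child.text = new_value
            changed += 1
    return changed


# Inline sample data (an assumption; real input would come from files).
MAPPING_TEXT = """Parent  New-Child-Value
X1      No
X4      No
X5      No
"""

XML_TEXT = """<root>
    <Parent Number="X1"><Namedchild>Yes</Namedchild></Parent>
    <Parent Number="X2"><Namedchild>Yes</Namedchild></Parent>
    <Parent Number="X4"><Namedchild>Yes</Namedchild></Parent>
</root>"""

root = ET.fromstring(XML_TEXT)
changed = apply_mapping(root, load_mapping(MAPPING_TEXT))
result = ET.tostring(root, encoding="unicode")
print(changed)
print(result)
```

For real files, ET.parse("input.xml") returns a tree whose getroot() can be passed to apply_mapping, and tree.write("output.xml", encoding="utf-8", xml_declaration=True) saves the result; the file names here are placeholders. Building the dict once keeps each lookup cheap, so about 800 mapping rows against a large document should not be a problem.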

zx485

Posted 2019-10-08T18:14:30.390

Reputation: 2 008