XML text transformation

0

I have a large file, part of which looks like this:

<DataGroup xsi:type="ReportDataGroup">
<SmartReportTemplate DescriptionContentType="text/plain"
IsActive="true">
<Name ns1:translate="yes">Agent Summary</Name>
<Defaults type="defaults">
<Title ns1:translate="yes">Agent Summary Report</Title>
<Description ns1:translate="yes"></Description>

I need to find every line matching the pattern .*ns1:translate="yes">(.*)</.* and, for each match, add a string from an array on a new line below it. The string must be wrapped in <Name xml:lang="ja"> and </Name> if the matched line ends with </Name>, and in <Title xml:lang="ja"> and </Title> if the matched line ends with </Title>.

The final output should look like:

<DataGroup xsi:type="ReportDataGroup">
<SmartReportTemplate DescriptionContentType="text/plain"
IsActive="true">
<Name ns1:translate="yes">EM - perc</Name>
<Name xml:lang="ja">\u886815wEM - perce ~~~~~~~~~ ~~~~~~~~~ ~~\u5834</Name>
<Defaults type="defaults">
<Title ns1:translate="yes">AG - Rep</Title>
<Title xml:lang="ja">\u886815wAG - Rep ~~~~~~~~~ ~~~~~~~~~ ~~\u5834</Title>
<Description ns1:translate="yes"></Description>

where the strings "\u886815wEM - perce ~~~~~~~~~ ~~~~~~~~~ ~~\u5834" etc. are stored in an array.

Any idea how to script this? I tried sed inside a while loop that reads the file line by line, but it takes a very long time. I also tried awk, but I could not access the array of special-character strings from inside awk.

sameera

Posted 2012-06-15T08:50:57.177

Reputation: 1

See http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454 before anyone gets angry with you for wanting to modify XML through regular expressions :-)

– Daniel Andersson – 2012-06-15T08:58:23.960

How are array contents mapped to replacements? Sequentially (i.e. one array item per match, counting upwards?). – kopischke – 2012-06-17T11:01:16.520

@DanielAndersson: humorous value of the linked answer aside, there are situations where a regex can be enough, especially as regular expressions are perfectly able to match (albeit not parse) well-formed XML. It is the difference between a robust architecture for this kind of task and a quick hack. As long as you are aware that a hack is just that (at best brittle and at worst erratic), it might be the easier solution to pull a quick one, compared to, say, XSLT or custom XML processing in a script.

– kopischke – 2012-06-17T11:07:30.387

Answers

0

If a partial solution in vim is acceptable:

:%s/\(.*\(Name\|Title\).*ns1:translate="yes">.*<\/.*\)/\1\r<\2 xml:lang="ja">\\u886815wEM - perce \~\~\~\~\~\~\~\~\~ \~\~\~\~\~\~\~\~\~ \~\~\\u5834<\/\2>/

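In the command, / is escaped as \/, \ as \\ and ~ as \~. It is only a partial solution because it inserts the same fixed string for every match instead of taking successive strings from an array.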

\(Name\|Title\) matches either tag name and captures it as the second group, so \2 reuses the matched name in the inserted opening and closing tags.
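
For the array part, the same idea can be sketched in a single pass outside vim. This is only a sketch under some assumptions the question does not confirm: the strings are used sequentially, one per matching line; each Name or Title element sits on one line; and the function name add_translations is made up for illustration. It is written in Python rather than sed or awk so the strings with backslashes and tildes need no shell escaping.

```python
import re

# Matches a one-line <Name ...> or <Title ...> element that carries
# ns1:translate="yes"; group 1 is the tag name, reused for the new line.
PATTERN = re.compile(r'<(Name|Title)\b[^>]*\bns1:translate="yes"[^>]*>.*</\1>')


def add_translations(lines, translations, lang="ja"):
    """Return the lines with a translated element added after each match.

    Assumption: translations are consumed in order, one per matching line.
    If they run out, the remaining matches are left without an added line.
    """
    pending = iter(translations)
    out = []
    for line in lines:
        out.append(line)
        match = PATTERN.search(line)
        if match:
            text = next(pending, None)
            if text is not None:
                tag = match.group(1)
                out.append(f'<{tag} xml:lang="{lang}">{text}</{tag}>')
    return out


sample = [
    '<Name ns1:translate="yes">Agent Summary</Name>',
    '<Defaults type="defaults">',
    '<Title ns1:translate="yes">Agent Summary Report</Title>',
    '<Description ns1:translate="yes"></Description>',
]
# Raw strings keep \u8868 and \u5834 as literal backslash sequences.
translations = [
    r'\u886815wEM - perce ~~~~~~~~~ ~~~~~~~~~ ~~\u5834',
    r'\u886815wAG - Rep ~~~~~~~~~ ~~~~~~~~~ ~~\u5834',
]
result = add_translations(sample, translations)
print("\n".join(result))
```

For a large file, the list of lines could be replaced by iterating over the open file (with the trailing newline stripped from each line) and writing each output line as it is produced, so the file is read only once. As with the vim command, this matches text rather than parsing XML, so an element split across several lines would be missed.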

Sébastien Guarnay

Posted 2012-06-15T08:50:57.177

Reputation: 21