I have tried various combinations and checked similar posts, but can't find an answer. I have a .config file and I need to exclude its comments when I search it, something like this:

(Get-Content C:\Path\File.config -Raw) | Select-String  '(<!--((?!-->)(.|\n))*-->)'  -AllMatches

I have tried the below regex as well:

(?smi)^\<!--.*?--\>?

Both of these work on regex101 and regex.net, but neither works in PowerShell. This is what my .config file looks like:

Test

<!--<add name=                                />
    <add name=                                />
    <add name=                                />-->
    <add name=                                />

<!--<add name=                                />
    <add name=                                />-->

Test
Test

I have made sure I am using -Raw with Get-Content, and I also tried Out-String. These regexes work everywhere except PowerShell. Any help is appreciated.

nav
  • The above command **must select** the comment so it's in the output. Please try my RegEx with exactly this command: `(Get-Content C:\Path\File.config -raw) -replace '(?smi)^\ – LotPings Aug 15 '17 at 22:19
  • I tried this, but it outputs the entire file as it is. Wouldn't replace be used when you are trying to replace certain text? I am just trying to display and then use this data to create a csv. I don't wanna modify the existing file at all. Thanks for your help and clarification. – nav Aug 16 '17 at 14:01
  • That's right, I use replace to cut the match off the output, replacing with nothing. Here on my Win10Pro with PSversion 5.1.15063.502 it works with my dummy file you saw on RegEx101. – LotPings Aug 16 '17 at 14:10
  • This is super bizarre. When I run this, it outputs the entire file as it is. I am also on Win10Pro using PS ISE. – nav Aug 16 '17 at 14:18

2 Answers

Why are you trying to parse XML manually with regex when PowerShell has a perfectly good XML parser built in?

To remove comments from an existing XML file, parse the file, find all the comments with an XPath expression, remove them, and save the file back out like this:

$xml = [xml](Get-Content C:\Path\File.config)
# You might need to tweak the XPath expression for your file,
# but this works for me on a random .NET app.config
$comments = $xml.SelectNodes('descendant::comment()')
$comments | %{ $_.ParentNode.RemoveChild($_) | out-null }
$xml.Save('C:\Path\File-output.config')

But it sounds like even that is overkill for what you're trying to do.

"I am trying to output the file without any comments. Then, I will play with this output without comments and create a csv using this data."

In that case, why not work with the parsed XML directly and simply ignore the comments? Once you've parsed the file using the first line of the example above, you have an XML object with all the data, which you can query, manipulate, and output as CSV. You shouldn't need to export the file without comments first.
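
As a rough sketch of that idea, the snippet below parses a small stand-in config, picks out the `add` elements, and converts them to CSV. The sample XML, the `connectionString` attribute, and the column names are assumptions made for illustration; the real file will likely need a different XPath expression and attribute list.

```powershell
# Hypothetical config content; element and attribute names are assumptions.
[xml]$xml = @'
<configuration>
  <connectionStrings>
    <!--<add name="Old" connectionString="Server=old" />-->
    <add name="Main" connectionString="Server=main" />
    <add name="Reports" connectionString="Server=reports" />
  </connectionStrings>
</configuration>
'@

# '//add' selects element nodes only, so the commented-out entry is skipped.
$rows = $xml.SelectNodes('//add') | ForEach-Object {
    [pscustomobject]@{
        Name             = $_.GetAttribute('name')
        ConnectionString = $_.GetAttribute('connectionString')
    }
}

# ConvertTo-Csv writes CSV text to the pipeline; Export-Csv would write a file.
$rows | ConvertTo-Csv -NoTypeInformation
```

Because the comment is a separate node type in the parsed document, nothing has to be stripped from the text before querying it.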

If you need help figuring out how to work with XML data in PowerShell, there are literally thousands of articles online that can help. Google is your friend here.

Ryan Bolger

(<!--((?!-->)(.|\r\n))*-->) worked in Notepad++ on Windows for me.

I believe \n is required on certain OSes and most web sites but \r\n is required on Windows. Apparently, each OS handles newlines slightly differently.
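
For PowerShell specifically, the line-ending difference may matter less than it does in other tools: in the .NET regex engine that PowerShell uses, `.` matches every character except `\n`, so `(.|\n)` already covers a Windows `\r\n` pair. A quick check, using a made-up sample string:

```powershell
# Hypothetical comment text containing a Windows line ending (CR LF).
$crlf = "<!--a`r`nb-->"

# '.' matches the carriage return and '\n' is listed explicitly.
$viaAlternation = $crlf -match '^<!--(.|\n)*-->$'

# (?s) makes '.' match the line feed as well, so no alternation is needed.
$viaSingleline = $crlf -match '(?s)^<!--.*-->$'

$viaAlternation, $viaSingleline
```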

 

Update: 2017/08/16 12:39

This seemed to work for me: (Get-Content C:\Path\File.config -Raw) | Select-String '(<!--((?!-->)(.|\n))*-->)' -AllMatches | ForEach { $_.Matches.Value }
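
To get the opposite result, the text with the comments stripped out, one option is to pass a comment pattern to `-replace` with an empty replacement, as suggested in the comments on the question. This is a minimal sketch that uses an inline sample in place of `C:\Path\File.config`; the sample content and the `name` values are made up, and the pattern is a simplified one that assumes comments are never nested.

```powershell
# Sample text standing in for the .config file (hypothetical content).
$text = @'
Test
<!--<add name="a" />
    <add name="b" />-->
    <add name="c" />
Test
'@

# (?s) lets '.' match newlines; the lazy '.*?' stops at the first closing '-->'.
$withoutComments = $text -replace '(?s)<!--.*?-->', ''

# Nothing is written back to disk; the result only exists in this variable.
$withoutComments
```

With a real file, `$text` would come from `Get-Content C:\Path\File.config -Raw`, and the original file stays unmodified.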

mythofechelon
  • Thanks for your feedback, but this didn't work :( It doesn't work for me on Windows 10, Powershell v5 via PS ISE – nav Aug 15 '17 at 19:59
  • Refer to my update on 2017/08/16 at 12:39. – mythofechelon Aug 16 '17 at 11:39
  • WOW! this is the closest I have come so far. I tried this and it outputs all the comments, but I am looking for the opposite. I need to display everything except the comments, but this gives me all the comments. I tried following, but no luck: (Get-Content C:\File.config -Raw) | Select-String '()(.|\n))*-->)' -NotMatch | ForEach { $_.Matches.value} Can we solve the final mystery? – nav Aug 16 '17 at 13:57
  • What are you trying to achieve? Are you trying to remove all lines that contain ` – mythofechelon Aug 16 '17 at 14:09
  • No, I am trying to output the file without any comments. Then, I will play with this output without comments and create a csv using this data. I am trying to get output of my file without any comments in it. This one gives me the output of the config file with just the comments, but I want the reverse of it. Display the file content without any comments included. Thanks for your help. – nav Aug 16 '17 at 14:15