Using PowerShell to replace individual bytes in a document

1

We're in the process of converting an HTML help system to a SharePoint document library. We have about 3000 individual HTML documents that we're converting to Word. About 20% of the documents include hyperlinks to related documents, and they're all relative links.

We're trying to automate the process of parsing each document, and in any instance where we have a hyperlink, replacing the last three bytes of the string--"htm"--with "doc".

I've seen some PowerShell samples where people are parsing documents (usually server logs) looking for specific bits of info, but have not been able to find anything about replacing specific characters in a file before closing/saving it.

Does anyone have tips for achieving this with PowerShell, or ideas about more appropriate tools?

dwwilson66

Posted 2012-11-14T15:26:38.690

Reputation: 1 519

Answers

2

Read each file, replace "htm" with "doc" in the links, then write the result back to the same file:

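# Rewrite links ending in .htm so they point at the converted .doc files.
Get-ChildItem -Path . -Recurse -Filter *.htm |
Where-Object { -not $_.PSIsContainer } |
ForEach-Object {
    # Parentheses force the whole file to be read before Set-Content reopens it.
    # The lookahead keeps "html" and ordinary words containing "htm" untouched.
    (Get-Content $_.FullName) -replace '\.htm(?=["''#?])', '.doc' |
        Set-Content $_.FullName
}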
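
A bare "htm" pattern also matches inside "html" and inside ordinary words, so it may be worth trying a narrower pattern on a few sample lines before running anything against 3000 files. The following is only a sketch: it assumes the links look like href="name.htm", optionally followed by "#" or "?", and the sample strings and the $pattern and $results variables are made up for illustration.

```powershell
# Hypothetical sample lines; real documents will differ.
$samples = @(
    '<a href="install.htm">Install</a>',
    '<a href="setup.htm#step2">Step 2</a>',
    '<p>See the html spec</p>'
)

# Match ".htm" only when a quote, "#" or "?" follows (assumed link endings).
# Inside a single-quoted string, '' stands for one literal single quote.
$pattern = '\.htm(?=["''#?])'

# -replace is case-insensitive by default, so ".HTM" would be rewritten too.
$results = $samples | ForEach-Object { $_ -replace $pattern, '.doc' }
$results
```

The first two lines should come back with ".doc" in place of ".htm", and the third should be unchanged. If the real links use a different form, the lookahead needs adjusting to match.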

thane

Posted 2012-11-14T15:26:38.690

Reputation: 1 627