How can I count the fields in a flat file using Notepad++

1

I work with a lot of flat files that are pipe delimited, and look something like:

John|Smith|01|01|1901|123 Road

Is there a way to find out that '1901' is the 5th field of this record based on the pipes ('|')?

This would be so incredibly useful and save me from having to count fields on massive files.

LikestoLearn

Posted 2017-05-12T19:13:48.013

Reputation: 13

You can use PowerShell for this, give it a try and do some research with doing this with PowerShell and RegEx. Look into Get-Content and look at the -Replace "<RegEx>", "<RegEx>" and pipe that over to a Set-Content. You'd want the regex to get everything after the 4th pipe delimiter but before the 5th so everything between those but not including. Just a quick idea without research and based on vague memory with a scripted process I've written in the past. – Pimp Juice IT – 2017-05-12T19:52:36.670

No problem, did you find a suitable solution? – Pimp Juice IT – 2017-05-15T15:32:31.760

Answers

0

To find the field number based on pipe delimited fields using Notepad++ 32-bit.

  1. Install the python plugin. To do so:

    1. From the menu, select Plugins→Plugin Manager→Show Plugin Manager
    2. Check the box for Python Script and click on Install
  2. Create a new script: From the menu, select Plugins→Python Script→New Script

  3. Name the script Pipe Position and select Save
  4. Paste the text below into the editor and save the file
  5. From the menu, select Plugins→Python Script→Scripts→Pipe Position

"""
Notepad++ python script to count field based on delimiter.
Displays a message box based on the cursor position when invoked 
"""
from Npp import *
import re

field_delimiter = '|'
current_pos = editor.getCurrentPos()
line_number = editor.lineFromPosition(current_pos)
line_start = editor.positionFromLine(line_number)
line_end = editor.getLineEndPosition(line_number)
line = editor.getTextRange(line_start, line_end)
index = current_pos - line_start
field_number = line.count(field_delimiter, 0, index) 

notepad.messageBox("'%s' is field # %d" % (
    line.split(field_delimiter)[field_number], field_number + 1))

Stephen Rauch

Posted 2017-05-12T19:13:48.013

Reputation: 2 455

Thank you for your response. After following your instructions, I get an 'Unknown exception' message in the messagebox. I don't believe I missed any points in your instructions, but it's possible that I did. Do you know why I would receive this error when executing your script? – LikestoLearn – 2017-05-15T15:07:38.007

https://www.python.org/ftp/python/2.7.13/python-2.7.13.msi – Stephen Rauch – 2017-05-15T15:47:48.850

1

Is it possible for you to use more suitable tool for the job?

Instead of the N++, just grab some fine CSV reader and check columns nicely in the grid. Some readers are free. The recommended reCsvEditor can not only read, but also edit CSV files, if you need to. The key step in that editor is to select options on file opening screen according to your CSV file: what is the delimiter, whether there are quotes around the values, whether there are line breaks inside the values. Then your CSV file is open correctly. Then the editor will also indirectly count the fields, you can easily see which field is the rightmost. There is not much documentation about the editor, the autor suggests to post a ticket or post to forums.

Also, as an easier alternative, you can try to reach for other CSV editors, if reCsvEditor seems complex. For example, I am just testing UniCsvEd 1.1 which is basically for no money.

miroxlav

Posted 2017-05-12T19:13:48.013

Reputation: 9 376

Thank you for your response. This actually might be my best option as I had not explored options outside of N++. Thanks for your feedback. Do you know where I could get a better understanding of fully utilizing reCsvEditor? Thanks again. – LikestoLearn – 2017-05-15T15:11:44.703

I have adjusted the answer according to your additional questions. – miroxlav – 2017-05-15T19:07:58.223