28
7
I have a list of about 10,000 name/addresses that I have to manually cleanse in Excel (fun!). As you can imagine, having to double click to enter almost every cell to edit it is a drag. And it's frustrating that if you are even slightly near the top/bottom of the cell, you end up at the top/bottom of the worksheet!
Is there a way of telling Excel to stay in 'edit mode', so that when I click on a cell (or better yet, scroll down to it), it automatically allows me to edit the content? I can't just start typing, as that overwrite the whole cell, when often times I just need to edit 1 or 2 characters. Thanks.
1It's a dirty workaround, but what about converting it to CSV, splitting with commas and editing with Notepad or any other text editor you like? – gronostaj – 2013-05-20T14:58:38.523
9Are you sure there is no script or function that can do this "cleanse" for you? When faced with a similar situation, I took all the data from excel and successfully ran it through a small regex script. – Moses – 2013-05-20T15:00:14.587
19Definitely look in to what Moses says. If you're doing anything 10,000 times manually then you're doing it wrong. – Tanner Faulkner – 2013-05-20T15:03:15.557
@Moses - Unfortunately no. These entries come from a database, and it's mainly the 'Address Name' field that is mucked up, with the data being very inconsistent. Some say things like 'Smith John Mr & Smith Joan Mrs', which should be 'Mr & Mrs J Smith', while others are blank, etc. This, basically, is the start of a very big project. – David Gard – 2013-05-20T16:09:58.747
Can you tell us what the manual cleanse entails? It may be that some of that could be automated instead – datatoo – 2013-05-20T16:21:30.520
1@datatoo - Names need to be but in a readable way, so that they can be used on addresses. Due to multiple databases being merged in to one over the years (and people being lazy), the data is inconsistent within the cell. I have no reference cell to go by, and once this part of the project is complete, I will be splitting the data in to Salutation, Initials, Surname, etc. It's at this point that I can start to use automation. – David Gard – 2013-05-20T16:25:14.173
A python script for the logic + python NLTK should be able to handle all of the human variations of name entry. Use that to get things into a syntax-free canonical form, and then use the python script to generate the name column in whatever format you want. Or if you know another scripting language, use that instead of python. But python's NLTK does a great job at chunking text. – Clayton Stanley – 2013-05-20T19:50:51.080
@David Gard There are things you can do like Make all Names Uppercase, do countifs on those name fields or concatenated fields to find duplicates, then reconvert to Proper names again, then Advanced filter for uniques. You may eliminate a huge portion of the work. Using formulas or vba – datatoo – 2013-05-22T00:19:13.133