How can I reformat an incoming dynamic table into a standard format?

0

I need to reformat a connected-data table in one Excel worksheet into a standard table layout in another worksheet. The problem is that the data-in worksheet will have volatile schema, so I can not count on the same cells always containing the same data. For example, when linking to one data-input, I get cell C2 as a total landbase area number, but if I bring in a different data-input, cell C2 is landbase name and C3 is landbase area.

So I need to identify certain cells by the value they contain, then take the next cell over from that and bring it into worksheet2 in a specific place. To make things trickier, I actually have to evaluate two cell values, and when the two values match certain criteria, then take the third cell in that row and use it in the reformatted worksheet.

For example, on the first dataset worksheet one might be:

A1: "Area D"      B1: "1234ha"      C1: "protected"   D1: "1970"
A2: "Area B"      B2: "1876ha"      C2: "unprotected" D2: "1986"
A3: "Area C"      B3: "654ha"       C3: "protected"   D3: "1965"
A4: "Area B"      B4: "5556ha"      C4: "protected"   D4: "2011"

I need the reformatted sheet, regardless of the input schema, to be:

A1: "Area B"      B1: "protected"   C1: "2011"        D1: "5556ha"
A2: "Area B"      B2: "unprotected" C2: "1986"        D2: "1876ha"
A3: "Area C"      B3: "protected"   C3: "1965"        D3: "654ha"
A4: "Area D"      B4: "protected"   C4: "2011"        D4: "1234ha"

or

The second dataset connected to might be:

A1: "2652ha"      B1: "protected"   C1: "Area A"      D1: "1970"
A2: "767ha"       B2: "protected"   C2: "Area E"      D2: "1966"

I need the reformatted sheet, regardless of the input schema, to be:

A1: "Area A"      B1: "protected"   C1: "1970"        D1: "2652ha"
A2: "Area E"      B2: "protected"   C2: "1966"        D1: "767ha"

So in this example I have to identify when "Area B" is valid but then sort out "protected" vs. "unprotected" and apply the proper area cell ("5556ha" or "1876ha") to that row in the reformatted worksheet.

How can I achieve this cell-shuffling when the incoming data structure is varied?

user25644

Posted 2017-09-20T21:55:14.617

Reputation: 85

Is there a first row on the sheet containing the column names? Otherwise it will be difficult. You could add one manually. If you manually add a row with column names to each data source, you should be able to import the values into database. Does it have to imported to spreadsheet? – cybernard – 2017-09-20T22:51:14.133

How many different data sources are there? Just 2? Is the data arrangement different per row or per data source? – cybernard – 2017-09-20T22:56:57.367

With no column names, you could use a VBA solution wherein you define specific patterns that can be used to unambiguously differentiate the four characteristics. Then it is fairly simple to collect and standardize how they should be presented. Depending on how complex the patterns are, you might be able to use the Like operator, or you may have to delve into Regular Expressions. – Ron Rosenfeld – 2017-09-21T01:39:25.527

I find this question baffling. (1) Is my edit right? Because, if it isn’t, I have no idea what you’re saying. (2) If my edit is right, can you please (2a) explain what you mean by “I … have to evaluate two cell values, and when the two values match certain criteria, then take the third cell …”, and (2b) explicitly state what criteria you’re looking for?  Sure, "protected" vs. "unprotected" is fairly obvious, but is A always Area plus one letter?  Is C always a four-digit number?  Is there any pattern to the D values? – Scott – 2017-09-21T03:03:09.050

Yes, there are column names, but these names will move around depending on the schema of each data set. They all should be fairly standardized (ie. "Name", "Area", "Date", "Status") but in one dataset column A might be "Name" but in another it might be "Date". – user25644 – 2017-09-22T17:49:24.617

There are multiple datasets, in the hundreds. Data arrangement difference is per dataset NOT per row. – user25644 – 2017-09-22T17:50:45.613

Has to be added into a spreadsheet in order to be reformatted into a single row per dataset, then each of these single-row datasets are joined into ESRI's ArcMap – user25644 – 2017-09-22T17:51:59.047

to explain the evaluation further: each dataset will need two fields to create a key that then defines the third data that should go with them. So in the example above we have two "Area B" values for a 'Name' field, so we also have to evaluate 'Status' to delineate "protected" from "unprotected" and in the resulting reformatting put the proper 'Area' with the proper 'Status'. So the resulting single-row reformat would have field and values: 'name_unprotected' = "Area B", 'unprotected_area' = "1876ha", 'name_protected' = "Area B", 'protected_area' = "5556ha" – user25644 – 2017-09-22T17:58:09.877

A is not always "area" plus one letter, that was just for this example. Actual values are operational area names like "Ghost Creek", "Martin Canyon", etc. C is not always a four digit number, and since the schema of the columns changes from dataset to dataset it could be 'Name', 'Area', 'Status', etc. No pattern to the D values, as again the schema might change – user25644 – 2017-09-22T18:10:42.720

No answers