Counting frequency of comma-delimited tags in Excel

2

I have an excel spreadsheet with a list of seminars and classes that students have taken that (when abridged) looks like:

(A)___ClassName________|(B)_________Tags____________|
Astrobiology           | astro, bio, sci            |
Extremophiles          | chem, bio, sci             |
Human Space Habitation | astro, bio, med, engi, sci |  etc.

I would like to be able to extract individual tags from this and get an associated count for each tag on another sheet, such that for the above:

__Tag__|_Frequency_|
astro  |     2     |
bio    |     3     |
sci    |     3     |
chem   |     1     |  etc.

I have been trying to do this only using functions, and I can get a unique list of tag strings (eg. "astro, bio, sci") using

{=INDEX(User1!Tags,MATCH(0,COUNTIF($A$1:A1,User1!Tags),0))}

but I haven't been able to successfully extract the tags themselves. I would like to keep the file "macro warning"-less, if possible, but I am new to excel, so if I'm going about this the wrong way please let me know!

Is this possible?

rcrdcsnv

Posted 2013-04-01T04:04:32.410

Reputation: 21

Answers

0

I am not completly sure that is what you are asking but i will give it a try. Make different columns near to column B with each column named as "astro" "bio".... And use this formula below each column =FIND($C$1;astro) ex if the word "astro" is in the text it will show a number if not an error and at the end of each column use the =COUNT() it will show you the times astro is sawn at column B.

Hope this helped!!

rexxar

Posted 2013-04-01T04:04:32.410

Reputation: 45

This would work, but I would need a column for every unique tag, right? So far there are 47 unique tags, and that number will increase as users add more entries. Can the process of column adding be automated? – rcrdcsnv – 2013-04-01T20:11:28.230

I am not sure that you can do this without macros and i just started to use them so i can not provide you with a solution – rexxar – 2013-04-02T06:33:03.073

0

Copy your tags column to a new sheet then:

  1. Home > Editing -Find & Select, Replace [space] ('one 'character'), Replace All, OK, Close.
  2. Data > Data Tools - Text to Columns, select Delimited, Next, check Comma, Finish.
  3. Insert two new columns, say A & B.
  4. Create your list of 47 uniques (either from above or 'manually') - say in newly inserted ColumnA, starting Row1.
  5. =COUNTIF(range,A1) in B1 and copy down as required, where range is the array containing all the individual tag entries (probably starting in C1) and is defined with fixed references throughout (ie '$' signs).
  6. Select entire sheet, Copy/Paste Special/Values and delete columns C and to the right, as required.
  7. Save.

The first step may not be required but is intended to remove spaces that would otherwise interfere with the counting.

pnuts

Posted 2013-04-01T04:04:32.410

Reputation: 5 716

0

So I got intrigued by this question and wanted to figure out how to solve it with a macro. I know you said you would like to avoid a macro but I don't believe this can be done with a function alone.

The code below loops over the cell B2:B25 (this can be edited or changed to a parameter or the selected cells but this seemed easiest for now). It uses a custom class named KeyValue to aggregate the tag name and the number of occurrences. This could be improved with the use of a Dictionary object but that requires other add-ins. It separates the comma-delimited tags from each cell and counts the frequency. Then it outputs this list to the first two columns on the second worksheet.

To add the code you must do the following. First on the Workbook you need to enable the Developer toolbar and from there click on the Visual Basic button. Then add a new Class Module and name it KeyValue. Paste in the following code:

Public Key As String
Public Value As Integer

Public Sub Init(k As String, v As Integer)
    Key = k
    Value = v
End Sub

Then on Sheet1 add the following code:

Public Sub CountTags()

    Dim kv As KeyValue
    Dim count As Integer
    Dim tag As String
    Dim tags As New Collection
    Dim splitTags As Variant

    For Each Cell In Sheet1.Range("B2:B25")
        ' Split the comma separated list and process each tag
        splitTags = Split(Cell.Value, ", ")
        For tagIndex = LBound(splitTags) To UBound(splitTags)
            tag = splitTags(tagIndex)

            ' If tag is in collection get new count otherwise start at 1.
            If Contains(tags, tag) Then
                Set kv = tags(tag)
                count = kv.Value + 1
                tags.Remove tag
            Else
                count = 1
            End If

            ' Add tag to the collection with its count.
            Set kv = New KeyValue
            kv.Init tag, count
            tags.Add kv, tag
        Next
    Next Cell

    Dim rowIndex As Integer
    rowIndex = 1

    For Each pair In tags
        Set kv = pair
        Sheet2.Cells(rowIndex, 1) = kv.Key
        Sheet2.Cells(rowIndex, 2) = kv.Value
        rowIndex = rowIndex + 1
    Next pair

End Sub

Private Function Contains(col As Collection, Key As Variant) As Boolean
    Dim obj As Variant
    On Error GoTo err
    Contains = True
    Set obj = col(Key)
    Exit Function
err:
    Contains = False
End Function

Click the Run button to have it count the tags.

Brad Patton

Posted 2013-04-01T04:04:32.410

Reputation: 9 939