Using Access to analyse verbatim text

simoran2

Board Regular
Joined
Jan 20, 2009
Messages
62
Hello

I have a large amount of customer feedback which I would like to apply some high level categorisation to.

I would like to set up lookup tables that include words specific to a particular theme. I would then like access to search the customer comment field and if it contained any of the words from the table apply the categorisation.

For example i may set up a 'Financial'table that included words such as bill, cost, expensive etc... A given record in the db may contain a comment from the customer saying "The bills are too expensive. I am looking for a lower cost provider" Access would see the words bill, expensive and cost and categorise this as a Financial reason.

I assume I would just build my lookup tables like any other table (I would probably import some predefined lists of words and categories) but what is the expression to get access to check the contents of the text field and cross reference it with the lookup tables?

Hope this makes sense - thanks in advance
 

Excel Facts

Workdays for a market open Mon, Wed, Friday?
Yes! Use "0101011" for the weekend argument in NETWORKDAYS.INTL or WORKDAY.INTL. The 7 digits start on Monday. 1 means it is a weekend.
Well sorry I can't really help.

Like I said this can probably be done with queries, but how I'm not 100% sure.

Even with queries a little bit of code would help a lot.
 
Upvote 0
My approach to this would be to use code also.

When you get customer feedback text, it must be processed to see if it contains any/several
keywords/categories. One feedback text could contain several keywords and could relate to one or more categories.

You'll have to determine which keywords and categories for each feedback text.

There may also be texts where there is a "new word" that should be added to a specific category. Some review process to ensure concepts aren't being missed because of insufficient keywords.

An important point on the set up of this whole thing is to make sure your keyword relates to a specific category. You really don't want keywords to show up in multiple categories. You will also have to deal with spelling errors/typos as well. You may find some "strange keywords" that need to be accommodated (typos) in your keyword to category table.

I think you'll also spend time on developing an approach. Parse the text data for words of 3 or more characters (pick a number), then compare them to your existing keyword/category list....

I really don't think queries only is practical, but they may go a long way to get what you need.

I think this would be a good exercise for learning some vba, looping etc.
 
Upvote 0
Right, I think I've over complicated things a bit.

The first part is quite straightforward and I came up with some queries.

I based them on 2 small tables:

<TABLE border=1 cellSpacing=0 bgColor=#ffffff><CAPTION>tblFreeText</CAPTION><THEAD><TR><TH bgColor=#c0c0c0 borderColor=#000000>Response</TH></TR></THEAD><TBODY><TR vAlign=top><TD borderColor=#eeece1>The bills are too expensive. I am looking for a lower cost provider</TD></TR><TR vAlign=top><TD borderColor=#eeece1>My bill is to expensive and my handset is broken</TD></TR><TR vAlign=top><TD borderColor=#eeece1>My keyboard doesn't work</TD></TR><TR vAlign=top><TD borderColor=#eeece1>I need a lower cost provider</TD></TR><TR vAlign=top><TD borderColor=#eeece1>Cannot connect phone to internet</TD></TR><TR vAlign=top><TD borderColor=#eeece1>Messages and calls not being sent</TD></TR><TR vAlign=top><TD borderColor=#eeece1>Billing too erratic</TD></TR></TBODY><TFOOT></TFOOT></TABLE>

and

<TABLE border=1 cellSpacing=0 bgColor=#ffffff><CAPTION>tblCategoryTerms</CAPTION><THEAD><TR><TH bgColor=#c0c0c0 borderColor=#000000>Category</TH><TH bgColor=#c0c0c0 borderColor=#000000>Term</TH></TR></THEAD><TBODY><TR vAlign=top><TD borderColor=#eeece1>Financial</TD><TD borderColor=#eeece1>bill</TD></TR><TR vAlign=top><TD borderColor=#eeece1>Financial</TD><TD borderColor=#eeece1>billing</TD></TR><TR vAlign=top><TD borderColor=#eeece1>Technical</TD><TD borderColor=#eeece1>handset</TD></TR><TR vAlign=top><TD borderColor=#eeece1>Technical</TD><TD borderColor=#eeece1>screen</TD></TR><TR vAlign=top><TD borderColor=#eeece1>Technical</TD><TD borderColor=#eeece1>keyboard</TD></TR><TR vAlign=top><TD borderColor=#eeece1>Technical</TD><TD borderColor=#eeece1>connect</TD></TR><TR vAlign=top><TD borderColor=#eeece1>Financial</TD><TD borderColor=#eeece1>cost</TD></TR><TR vAlign=top><TD borderColor=#eeece1>Financial</TD><TD borderColor=#eeece1>expensive</TD></TR><TR vAlign=top><TD borderColor=#eeece1>Financial</TD><TD borderColor=#eeece1>Provider</TD></TR></TBODY><TFOOT></TFOOT></TABLE>

And the one of the queries:

SELECT tblFreeText.Response, IIf([response] Like '*' & [term] & '*',[Category],"") AS InCategory, Count(tblCategoryTerms.Category) AS CountOfCategory
FROM tblCategoryTerms, tblFreeText
GROUP BY tblFreeText.Response, IIf([response] Like '*' & [term] & '*',[Category],"")
HAVING (((IIf([response] Like '*' & [term] & '*',[Category],""))<>""))
ORDER BY Count(tblCategoryTerms.Category);


which gives you this.

<TABLE border=1 cellSpacing=0 bgColor=#ffffff><CAPTION>TermCount</CAPTION><THEAD><TR><TH bgColor=#c0c0c0 borderColor=#000000>Response</TH><TH bgColor=#c0c0c0 borderColor=#000000>InCategory</TH><TH bgColor=#c0c0c0 borderColor=#000000>CountOfCategory</TH></TR></THEAD><TBODY><TR vAlign=top><TD borderColor=#d0d7e5>My keyboard doesn't work</TD><TD borderColor=#d0d7e5>Technical</TD><TD borderColor=#d0d7e5 align=right>1</TD></TR><TR vAlign=top><TD borderColor=#d0d7e5>My bill is to expensive and my handset is broken</TD><TD borderColor=#d0d7e5>Technical</TD><TD borderColor=#d0d7e5 align=right>1</TD></TR><TR vAlign=top><TD borderColor=#d0d7e5>Cannot connect phone to internet</TD><TD borderColor=#d0d7e5>Technical</TD><TD borderColor=#d0d7e5 align=right>1</TD></TR><TR vAlign=top><TD borderColor=#d0d7e5>My bill is to expensive and my handset is broken</TD><TD borderColor=#d0d7e5>Financial</TD><TD borderColor=#d0d7e5 align=right>2</TD></TR><TR vAlign=top><TD borderColor=#d0d7e5>I need a lower cost provider</TD><TD borderColor=#d0d7e5>Financial</TD><TD borderColor=#d0d7e5 align=right>2</TD></TR><TR vAlign=top><TD borderColor=#d0d7e5>Billing too erratic</TD><TD borderColor=#d0d7e5>Financial</TD><TD borderColor=#d0d7e5 align=right>2</TD></TR><TR vAlign=top><TD borderColor=#d0d7e5>The bills are too expensive. I am looking for a lower cost provider</TD><TD borderColor=#d0d7e5>Financial</TD><TD borderColor=#d0d7e5 align=right>4</TD></TR></TBODY><TFOOT></TFOOT></TABLE>

Now that gives you the basic data you want but the next part is to make decisions based on that.

eg if there are 2 matched in the Financial category and 2 in the Technical category which applies?

Another consideration with this is the no of records involved.

There were only 7 records in the response table and 9 in the terms table.

In the query I posted if there was no grouping or criteria then 63 (7*9) records would be produced.

So for X responses and Y terms X*Y records would be produced.

The grouping etc obviously reduces that but it's also pretty expensive resource wise.

This is where code could come into it as a decision making tool.
 
Upvote 0
Again guys - appreciate the help with this!

I will give the query ago first and see how I get on but i'll certainly use this as an excuse to start learning some VBA (i have started a thread in the Excel forum asking for good starting points)

Ill come back with my results later
 
Upvote 0
I actually meant VBA I'm Access.

Mind you I suppose you could use both, Access for the 'heavy lifting' and Excel for the analysis.
 
Upvote 0
I actually meant VBA I'm Access.

Mind you I suppose you could use both, Access for the 'heavy lifting' and Excel for the analysis.

Is VBA not the same for both?

Anyway I tried the above using queries but it is becoming apparent this will not really be feasible. Although using a small list of financial terms I did get results, it seemed to duplicate the records (as you mentioned it would...)

Where would I start with a VBA solution on this?
 
Upvote 0
Well VBA is basically the same but both have different object models.

In Excel you have workbooks, worksheets, ranges etc.

In Access you have tables, queries etc.

If you aren't going to be using any of those objects then theoretically you could write the VBA in Word.

(You could actually use the objects in Word via automation but I won't go into that.)

Anyway, back to the query.

I found one way to get rid of the duplicates and display the data.

It was a crosstab query, sort of like a pivot table you would create in Excel.

I don't have it to hand right now but had the text as row headers/items, the categories as column header/items and the count as the data item.

You could try creating it yourself using the query wizard.

Fire that up, selet the crosstab option, select the query that returns the counts and then add the fields as I've described above.

I can post the SQL for it later if you need it, and if I can find it.:)

PS You could actually create a pivot table in Excel with your count query as an external data source.

Not sure how Excel would handle it if there are a vast amount of records though, but pivot tables do have more functionality than crosstab queries.
 
Upvote 0

Forum statistics

Threads
1,224,603
Messages
6,179,855
Members
452,948
Latest member
UsmanAli786

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top