Fraud Prevention using....?

jaxisdex

New Member
Joined
May 26, 2015
Messages
35
Hi guys, I am trying to create a way to check for fraud in our system. We have sales reps that go around different parts of the country and they turn in info about their sales. The problem is some reps have found a way to submit multiple entries that count as sales, when in actuality, these people do not exist. For instance in the customer name field, the reps can put in a "-" or a "Jr." "Sr." or a II/III and the system will accept it. Im looking for a way to check just that one column to see if names are similar or close enough that they look like fraud. I can group the UserIDs to get only the customer names they did, but can not figure out a way to check them. I have tried the Levenshtein Distance only to realize that i must plug in a string text which would make this utterly time consuming. We have almost 15,000 new entries daily. I am still fairly new to Excel and am unsure on how to go about this. Then I tried my hand at a fuzzy lookup and that also was not what i needed.

The second method to detect fraud is the time stamp for each agent. Once again i can group the agents just to see their Create Date Time. I need a way to find if the difference between two of the times is less than 4 minutes. This would also be a alert to something fishy occurring. Does Excel 2013 not have Datediff? I also tried doing =TEXT(E3-E2),"mm") but this returned number values that were under .0008^n. or along those lines of small numbers that couldn't mean minutes. The TransactionID are unique numbers.

This is a sample of how the info looks:Just imagine another 14993 lines of information. If the question is why do we not use some other method of storing and examining the information, the development team is working on something but we need something right now.

[TABLE="width: 774"]
<tbody>[TR]
[TD]User ID[/TD]
[TD]Transaction ID[/TD]
[TD]Application ID[/TD]
[TD]Status ID[/TD]
[TD]Create Date Time[/TD]
[TD]Customer Name[/TD]
[/TR]
[TR]
[TD]PCRE5181[/TD]
[TD]104620811[/TD]
[TD]198042515[/TD]
[TD]Completed[/TD]
[TD]2015/05/22 23:53:29[/TD]
[TD]jessica adams[/TD]
[/TR]
[TR]
[TD]PCRE5181[/TD]
[TD]104620808[/TD]
[TD]198042516[/TD]
[TD]Completed[/TD]
[TD]2015/05/22 23:51:33[/TD]
[TD]walter white[/TD]
[/TR]
[TR]
[TD]PCRE70D3[/TD]
[TD]104620798[/TD]
[TD]198042500[/TD]
[TD]Completed[/TD]
[TD]2015/05/22 23:45:04[/TD]
[TD]robert jordan[/TD]
[/TR]
[TR]
[TD]PCRE5619[/TD]
[TD]104620795[/TD]
[TD]198042505[/TD]
[TD]Completed[/TD]
[TD]2015/05/22 23:43:29[/TD]
[TD]guangming liu[/TD]
[/TR]
[TR]
[TD]PCRE5181[/TD]
[TD]104620792[/TD]
[TD]198042497[/TD]
[TD]Completed[/TD]
[TD]2015/05/22 23:41:32[/TD]
[TD]richard nixon[/TD]
[/TR]
[TR]
[TD]PCRE70D3[/TD]
[TD]104620789[/TD]
[TD]198042492[/TD]
[TD]Completed[/TD]
[TD]2015/05/22 23:40:21[/TD]
[TD]tommy pickles[/TD]
[/TR]
[TR]
[TD]PCRE70C4[/TD]
[TD]104620785[/TD]
[TD]198042489[/TD]
[TD]Completed[/TD]
[TD]2015/05/22 23:38:32[/TD]
[TD]angelo mayo[/TD]
[/TR]
</tbody>[/TABLE]
[TABLE="width: 774"]
<tbody>[TR]
[TD="class: xl65, width: 84"]PCRE5181[/TD]
[TD="class: xl65, width: 131"]104620776[/TD]
[TD="class: xl65, width: 129"]198042472[/TD]
[TD="class: xl66, width: 95"]Completed[/TD]
[TD="class: xl67, width: 151"]2015/05/22 23:33:32[/TD]
[TD="class: xl65, width: 184"]jessica-adams II[/TD]
[/TR]
</tbody>[/TABLE]

Thank for any and all help, and sort of advice/direction is greatly appreciated and I will love you forever!!!
 

Excel Facts

How can you turn a range sideways?
Copy the range. Select a blank cell. Right-click, Paste Special, then choose Transpose.
Hi jaxisdex

If you deduct one entry from 'Create Date Time' from another and multiply the result by 1,440 (60 minutes in an hour x 24 hours in a day), you will get the result expressed in minutes (to a decimal place). Then compare this result to four minutes ...

Hope this helps.

cheers

pvr928
 
Upvote 0
Thank you! That worked perfectly. Wow, i wasn't even considering something that simple. Seriously thank you for your help.

Any idea for the naming portion? Or if there is a way to loop the Levenshtein function to go through each record for the entirety of the column? If not, its cool. Im sure something will pop up! Thank you once again!:)
 
Upvote 0
No worries - glad I could help.

I think the way Excel approaches dates can be quite confusing - all you have to remember is underneath all that formatting is a number. Excel's date series seems to start at 1 January 1990 (number 0) with every date simply part of the number series since that time (ie today, 28 May 2015 is day 42,152). To create hours, minutes and seconds, they are just fractions of that particular date. It took me a while before I intuitively began to understand this concept.

So far as your efforts relating to determining whether a name is fraudulent, I'm not going to pretend I know what a Levenshtein function is.

Have you thought more broadly about how such fraudulent behavior would be manifested? ie the perpetrator is likely to repeat their actions and over time and get lazy (certainly if they think they are getting away with it). How about conducting analysis on the average sales made per rep, using perhaps days of the week, region & season to focus on. You might discover there is a pattern, with any fraudulent behavior falling outside these patterns, ie a very consistent/smooth number of sales for fraudulent reps, whereas they fluctuate for honest ones. Once you've identified anomalies you then go deeper to then compare names, etc.

Excel's formulae and macros could come up with some pretty effective and fast tools to do all that.

Hope that helps.

Cheers

pvr928
 
Upvote 0
Haha, yeah ive been trying to morph it for my needs. Ill be having my friend, who is a developer, look at my code because I can't seem to figure how to loop it to the whole array correctly.

Ah, that time makes a lot more sense now. I've only been using Excel for maybe a month now so still trying to learn all the bells and whistles! The issue about other ways to prevent fraud is the fact we don't have access to all that info yet lol. The client side is building a more detailed BI tool for us to use, so until then im stuck with this. For now ill just use the time difference to build a report on reoccurring culprits.

Thank you so much for your help once again!!!
 
Upvote 0

Forum statistics

Threads
1,223,275
Messages
6,171,121
Members
452,381
Latest member
Nova88

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top