Raw data--cleansing macro?

tc88

Board Regular
Joined
Jul 6, 2011
Messages
80
Hello all-<?xml:namespace prefix = o ns = "urn:schemas-microsoft-com:office:office" /><o:p></o:p>
This is my first post here, but I have been browsing for the past couple of weeks at my new job. I am an intern in the accounting/finance department at a publicly traded company here in the US and we are currently working on reducing waste in business processes. One of those processes revolves around exporting data from SAP into Excel, fixing some of the data, and then removing irrelevant data. I am using Excel 2010 on Windows XP SP 3.
The first step I have set up is to check if any of the data needs to be reformatted (it is missing a “0” in the middle of the number). I have the formula as follows, and it works correctly (Column M in dummy data below)
Code:
=IF(OR(LEFT(C2,6)="104001",LEFT(C2,6)="104003",LEFT(C2,6)="104004"),CONCATENATE("CTR ",LEFT(C2,2),"0",MID(C2,3,4)),D2)
<o:p></o:p>

From there, I use a nested IF based on the column for the above formula. That formula is as follows (Column N in dummy data below)
Code:
=IF(OR(M2="CTR 1004001",M2="CTR 1004003",M2="CTR 1004004"),M2,IF(AND(RIGHT(D2,5)>="12475",RIGHT(D2,5)<="12739"),D2,"IRRELEVANT"))
<o:p></o:p>

I have that column conditionally formatted to show “IRRELEVANT” with a red fill, and up to this point I have absolutely no problems and the formulas appear to work correctly. However, from there, I must manually delete all “IRRELEVANT” rows to get out the data for our specific facility. If this were a few rows it wouldn’t be a big deal but the raw data is ~2000 rows and my manually cleansed data is ~600 rows, so as you can see I am manually clearing out quite a bit of data.<o:p></o:p>
This is a process that is done monthly, and while I believe a macro is the best way to do this, I don’t have any experience writing macros/using VBA. I consider myself a strong Excel user, but I am not an expert.<o:p></o:p>

Below is some dummy data so you have a better idea of what I am talking about:

Excel Workbook
BCDEFGHIJKLMN
1Cost ElementNameAuxAcctAs1Name of offsetting accountOffst.acctOrderOffActCoCdVal.in rep.cur.Postg DateCreated onFormatting FixRelevant?
2Facilities10357E EN DUMMY DATA13967010009,956.0005/31/201106/02/2011IRRELEVANT
3Facilities104004-PRJ DUMMY DATA15629510002,645.0005/31/201106/01/2011CTR 1004004CTR 1004004
4Fringe10384 BD DUMMY DATA115746100021.5605/31/201106/02/2011IRRELEVANT
5Fringe10436 RD DUMMY DATA1157461000545.2105/31/201106/02/2011IRRELEVANT
6GeneralJCTE000204193CTR 12708A/P AMCL210036154710S10005.8905/04/201105/05/2011CTR 12708CTR 12708
7GeneralJCTE000204193CTR 1004003A/P AMCL210036156295S100016.6705/04/201105/05/2011CTR 1004003CTR 1004003
8GeneralJCTE000204193CTR 1004003A/P AMCL210036157010S10006.2805/04/201105/05/2011CTR 1004003CTR 1004003
9Other104003_DSTDUMMYDATA1564301000-2,215.6505/31/201106/02/2011CTR 1004003CTR 1004003
10Other12335_DST DISTRDUMMY DATA1569501000-312.4505/31/201106/02/2011IRRELEVANT
Sheet1
Excel 2010
Cell Formulas
RangeFormula
M2=IF(OR(LEFT(C2,6)="104001",LEFT(C2,6)="104003",LEFT(C2,6)="104004"),CONCATENATE("CTR ",LEFT(C2,2),"0",MID(C2,3,4)),D2)
N2=IF(OR(M2="CTR 1004001",M2="CTR 1004003",M2="CTR 1004004"),M2,IF(AND(RIGHT(D2,5)>="12475",RIGHT(D2,5)<="12739"),D2,"IRRELEVANT"))



Thanks in advance for any assistance!
<o:p></o:p>
 
Last edited:
No problem. I wouldn't put it in a sheet, I'd put it in a module.

And the code you'd put on the button click event would be:

Call Foo

Ok, I have my raw data in a Sheet2 in a new workbook. I go into VBA by hitting Alt+F11, and Insert-->Module. I pasted the code in, and when I go back into Excel and select "Macros" under the Developer tab, it shows "Foo" as the only macro.

I assigned that macro to a button, but when I click the button, it says:

"Run Time Error '1004':

Select method of Range class failed"

and gives me the option to End, Help, or Debug.

:confused:
 
Upvote 0

Excel Facts

Test for Multiple Conditions in IF?
Use AND(test, test, test, test) or OR(test, test, test, ...) as the logical_test argument of IF.
hit debeug and post back which line is highlighted.

Code:
Sheet1.Cells.FormatConditions.Delete
Sheet1.Range("N2").Select

The second line "Sheet1.Range("N2").Select" is highlighted, first line is for reference.
 
Upvote 0
bah...error was because of my own laziness to not code the conditional formatting bit correctly. Give me a second to get that sorted and I'll post the fix.
 
Upvote 0
Try this:

Code:
Option Explicit

Sub Foo()
 
Dim lr As Long
Dim rng As Range
 
Application.ScreenUpdating = False
 
Sheet2.Cells.Copy Destination:=Sheet1.Range("A1")
 
lr = Sheet1.Cells.Find(what:="*", searchorder:=xlByRows, searchdirection:=xlPrevious).Row
Sheet1.Range("M1").Value = "Formatting Fix"
Sheet1.Range("N1").Value = "Relevant"
Sheet1.Range("M2:M" & lr).Formula = "=IF(OR(LEFT(C2,6)=""104001"",LEFT(C2,6)=""104003"",LEFT(C2,6)=""104004""),CONCATENATE(""CTR "",LEFT(C2,2),""0"",MID(C2,3,4)),IF(OR(LEFT(C2,6)=""100404"",LEFT(C2,6)=""100403""),CONCATENATE(""CTR "",LEFT(C2,5),""0"",MID(C2,6,1)),D2))"
Sheet1.Range("N2:N" & lr).Formula = "=IF(OR(M2=""CTR 1004001"",M2=""CTR 1004003"",M2=""CTR 1004004""),M2,IF(OR(RIGHT(D2,7)=""1004001"",RIGHT(D2,7)=""1004003"",RIGHT(D2,7)=""1004004""),D2,IF(AND(RIGHT(D2,5)>=""12475"",RIGHT(D2,5)<=""12739""),D2,IF(OR(A2=""5050"",RIGHT(C2,7)=""charges""),""Check CTR"",""IRRELEVANT""))))"
 
 
Set rng = Sheet1.Range("B1:N" & lr)
 
With rng
    .AutoFilter Field:=13, Criteria1:= _
        "IRRELEVANT"
            On Error Resume Next
    .Offset(1, 0).Resize(.Rows.Count - 1, 1). _
            SpecialCells(xlCellTypeVisible).EntireRow.Delete
        On Error GoTo 0
 
    .AutoFilter
End With
lr = Sheet1.Cells.Find(what:="*", searchorder:=xlByRows, searchdirection:=xlPrevious).Row
 
Sheet1.Range("D2:D" & lr).Value = Sheet1.Range("M2:M" & lr).Value
 
Sheet1.Cells.FormatConditions.Delete
Sheet1.Range("$N$2:$N$" & lr).FormatConditions.Add Type:=xlExpression, Formula1:= _
        "=IF(R[0]C14=""Check CTR"",TRUE)"
Sheet1.Range("$N2:$N" & lr).FormatConditions(Sheet1.Range("$N2:$N" & lr).FormatConditions.Count).SetFirstPriority
    With Sheet1.Range("$N2:$N" & lr).FormatConditions(1).Interior
        .PatternColorIndex = xlAutomatic
        .Color = 65535
        .TintAndShade = 0
    End With
Sheet1.Range("$N2:$N" & lr).FormatConditions(1).StopIfTrue = False
 
Application.ScreenUpdating = True
End Sub
 
Upvote 0
Try this:

Code:
Option Explicit
 
Sub Foo()
 
Dim lr As Long
Dim rng As Range
 
Application.ScreenUpdating = False
 
Sheet2.Cells.Copy Destination:=Sheet1.Range("A1")
 
lr = Sheet1.Cells.Find(what:="*", searchorder:=xlByRows, searchdirection:=xlPrevious).Row
Sheet1.Range("M1").Value = "Formatting Fix"
Sheet1.Range("N1").Value = "Relevant"
Sheet1.Range("M2:M" & lr).Formula = "=IF(OR(LEFT(C2,6)=""104001"",LEFT(C2,6)=""104003"",LEFT(C2,6)=""104004""),CONCATENATE(""CTR "",LEFT(C2,2),""0"",MID(C2,3,4)),IF(OR(LEFT(C2,6)=""100404"",LEFT(C2,6)=""100403""),CONCATENATE(""CTR "",LEFT(C2,5),""0"",MID(C2,6,1)),D2))"
Sheet1.Range("N2:N" & lr).Formula = "=IF(OR(M2=""CTR 1004001"",M2=""CTR 1004003"",M2=""CTR 1004004""),M2,IF(OR(RIGHT(D2,7)=""1004001"",RIGHT(D2,7)=""1004003"",RIGHT(D2,7)=""1004004""),D2,IF(AND(RIGHT(D2,5)>=""12475"",RIGHT(D2,5)<=""12739""),D2,IF(OR(A2=""5050"",RIGHT(C2,7)=""charges""),""Check CTR"",""IRRELEVANT""))))"
 
 
Set rng = Sheet1.Range("B1:N" & lr)
 
With rng
    .AutoFilter Field:=13, Criteria1:= _
        "IRRELEVANT"
            On Error Resume Next
    .Offset(1, 0).Resize(.Rows.Count - 1, 1). _
            SpecialCells(xlCellTypeVisible).EntireRow.Delete
        On Error GoTo 0
 
    .AutoFilter
End With
lr = Sheet1.Cells.Find(what:="*", searchorder:=xlByRows, searchdirection:=xlPrevious).Row
 
Sheet1.Range("D2:D" & lr).Value = Sheet1.Range("M2:M" & lr).Value
 
Sheet1.Cells.FormatConditions.Delete
Sheet1.Range("$N$2:$N$" & lr).FormatConditions.Add Type:=xlExpression, Formula1:= _
        "=IF(R[0]C14=""Check CTR"",TRUE)"
Sheet1.Range("$N2:$N" & lr).FormatConditions(Sheet1.Range("$N2:$N" & lr).FormatConditions.Count).SetFirstPriority
    With Sheet1.Range("$N2:$N" & lr).FormatConditions(1).Interior
        .PatternColorIndex = xlAutomatic
        .Color = 65535
        .TintAndShade = 0
    End With
Sheet1.Range("$N2:$N" & lr).FormatConditions(1).StopIfTrue = False
 
Application.ScreenUpdating = True
End Sub

You da man, thanks! Got a new workbook set up with an Instructions tab. Do I need to keep the worksheets named "Sheet1" and "Sheet2"? I would like to make this as foolproof as possible and have a "Raw Data" worksheet and "Cleansed Data" worksheet.
 
Upvote 0
You can change the sheet name, shouldn't be a problem. It all has to do with how you reference the sheet in the code. I prefer to use the sheet's
"code name", which I tend to leave as the default Sheet1, Sheet2, Sheet3, etc. This is different than the sheet's name, which is displayed on the tab. Not sure if that makes sense...have a look at the second paragraph under #10 here:

http://www.techrepublic.com/blog/10...ence-excel-workbooks-and-sheets-using-vba/967

so what you want to do is change the sheet's name to "Raw Data" and "Cleansed Data", which should have no impact on the code. By that I mean, the line:

Code:
Sheet2.Cells.Copy Destination:=Sheet1.Range("A1")

does not need to be changed to:

Code:
Raw Data.Cells.Copy Destination:=Cleansed Data.Range("A1")
 
Upvote 0
You can change the sheet name, shouldn't be a problem. It all has to do with how you reference the sheet in the code. I prefer to use the sheet's
"code name", which I tend to leave as the default Sheet1, Sheet2, Sheet3, etc. This is different than the sheet's name, which is displayed on the tab. Not sure if that makes sense...have a look at the second paragraph under #10 here:

http://www.techrepublic.com/blog/10...ence-excel-workbooks-and-sheets-using-vba/967

so what you want to do is change the sheet's name to "Raw Data" and "Cleansed Data", which should have no impact on the code. By that I mean, the line:

Code:
Sheet2.Cells.Copy Destination:=Sheet1.Range("A1")

does not need to be changed to:

Code:
Raw Data.Cells.Copy Destination:=Cleansed Data.Range("A1")

I did notice that in VBA that it had (Sheet1) or something else in parentheses next to what it was named in Excel. Thanks for the clarification, you have been extremely helpful!
 
Upvote 0

Forum statistics

Threads
1,225,155
Messages
6,183,214
Members
453,151
Latest member
Lizamaison

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top