Raw data--cleansing macro?

tc88

Board Regular
Joined
Jul 6, 2011
Messages
80
Hello all-<?xml:namespace prefix = o ns = "urn:schemas-microsoft-com:office:office" /><o:p></o:p>
This is my first post here, but I have been browsing for the past couple of weeks at my new job. I am an intern in the accounting/finance department at a publicly traded company here in the US and we are currently working on reducing waste in business processes. One of those processes revolves around exporting data from SAP into Excel, fixing some of the data, and then removing irrelevant data. I am using Excel 2010 on Windows XP SP 3.
The first step I have set up is to check if any of the data needs to be reformatted (it is missing a “0” in the middle of the number). I have the formula as follows, and it works correctly (Column M in dummy data below)
Code:
=IF(OR(LEFT(C2,6)="104001",LEFT(C2,6)="104003",LEFT(C2,6)="104004"),CONCATENATE("CTR ",LEFT(C2,2),"0",MID(C2,3,4)),D2)
<o:p></o:p>

From there, I use a nested IF based on the column for the above formula. That formula is as follows (Column N in dummy data below)
Code:
=IF(OR(M2="CTR 1004001",M2="CTR 1004003",M2="CTR 1004004"),M2,IF(AND(RIGHT(D2,5)>="12475",RIGHT(D2,5)<="12739"),D2,"IRRELEVANT"))
<o:p></o:p>

I have that column conditionally formatted to show “IRRELEVANT” with a red fill, and up to this point I have absolutely no problems and the formulas appear to work correctly. However, from there, I must manually delete all “IRRELEVANT” rows to get out the data for our specific facility. If this were a few rows it wouldn’t be a big deal but the raw data is ~2000 rows and my manually cleansed data is ~600 rows, so as you can see I am manually clearing out quite a bit of data.<o:p></o:p>
This is a process that is done monthly, and while I believe a macro is the best way to do this, I don’t have any experience writing macros/using VBA. I consider myself a strong Excel user, but I am not an expert.<o:p></o:p>

Below is some dummy data so you have a better idea of what I am talking about:

Excel Workbook
BCDEFGHIJKLMN
1Cost ElementNameAuxAcctAs1Name of offsetting accountOffst.acctOrderOffActCoCdVal.in rep.cur.Postg DateCreated onFormatting FixRelevant?
2Facilities10357E EN DUMMY DATA13967010009,956.0005/31/201106/02/2011IRRELEVANT
3Facilities104004-PRJ DUMMY DATA15629510002,645.0005/31/201106/01/2011CTR 1004004CTR 1004004
4Fringe10384 BD DUMMY DATA115746100021.5605/31/201106/02/2011IRRELEVANT
5Fringe10436 RD DUMMY DATA1157461000545.2105/31/201106/02/2011IRRELEVANT
6GeneralJCTE000204193CTR 12708A/P AMCL210036154710S10005.8905/04/201105/05/2011CTR 12708CTR 12708
7GeneralJCTE000204193CTR 1004003A/P AMCL210036156295S100016.6705/04/201105/05/2011CTR 1004003CTR 1004003
8GeneralJCTE000204193CTR 1004003A/P AMCL210036157010S10006.2805/04/201105/05/2011CTR 1004003CTR 1004003
9Other104003_DSTDUMMYDATA1564301000-2,215.6505/31/201106/02/2011CTR 1004003CTR 1004003
10Other12335_DST DISTRDUMMY DATA1569501000-312.4505/31/201106/02/2011IRRELEVANT
Sheet1
Excel 2010
Cell Formulas
RangeFormula
M2=IF(OR(LEFT(C2,6)="104001",LEFT(C2,6)="104003",LEFT(C2,6)="104004"),CONCATENATE("CTR ",LEFT(C2,2),"0",MID(C2,3,4)),D2)
N2=IF(OR(M2="CTR 1004001",M2="CTR 1004003",M2="CTR 1004004"),M2,IF(AND(RIGHT(D2,5)>="12475",RIGHT(D2,5)<="12739"),D2,"IRRELEVANT"))



Thanks in advance for any assistance!
<o:p></o:p>
 
Last edited:
Great, glad I could help! Feel free to post back should you need anything else.

I'm back! After testing the macro out on a couple of previous month's data, I found I needed to add another IF statement. Apologies, it's a long function.

I have rewritten the macro code as follows, and it runs OK, but is it possible to set the macro to conditionally format anything with "Check CTR" as a bright yellow fill or something? Fill or text color, doesn't matter, just something so that it sticks out clearly.

Code:
Sub Foo()
 
Dim lr As Long
Dim rng As Range
 
Application.ScreenUpdating = False
 
lr = Sheet1.Cells.Find(what:="*", searchorder:=xlByRows, searchdirection:=xlPrevious).Row
Sheet1.Range("M1").Value = "Formatting Fix"
Sheet1.Range("N1").Value = "Relevant"
Sheet1.Range("M2:M" & lr).Formula = "=IF(OR(LEFT(C2,6)=""104001"",LEFT(C2,6)=""104003"",LEFT(C2,6)=""104004""),CONCATENATE(""CTR "",LEFT(C2,2),""0"",MID(C2,3,4)),IF(OR(LEFT(C2,6)=""100404"",LEFT(C2,6)=""100403""),CONCATENATE(""CTR "",LEFT(C2,5),""0"",MID(C2,6,1)),D2))"
Sheet1.Range("N2:N" & lr).Formula = "=IF(OR(M2=""CTR 1004001"",M2=""CTR 1004003"",M2=""CTR 1004004""),M2,IF(OR(RIGHT(D2,7)=""1004001"",RIGHT(D2,7)=""1004003"",RIGHT(D2,7)=""1004004""),D2,IF(AND(RIGHT(D2,5)>=""12475"",RIGHT(D2,5)<=""12739""),D2,IF(OR(A2=""5050"",RIGHT(C2,7)=""charges""),""Check CTR"",""IRRELEVANT""))))"
 
 
Set rng = Sheet1.Range("B1:N" & lr)
 
With rng
    .AutoFilter Field:=13, Criteria1:= _
        "IRRELEVANT"
            On Error Resume Next
    .Offset(1, 0).Resize(.Rows.Count - 1, 1). _
            SpecialCells(xlCellTypeVisible).EntireRow.Delete
        On Error GoTo 0
 
    .AutoFilter
End With
lr = Sheet1.Cells.Find(what:="*", searchorder:=xlByRows, searchdirection:=xlPrevious).Row
Sheet1.Range("D2:D" & lr).Value = Sheet1.Range("M2:M" & lr).Value
Application.ScreenUpdating = True
End Sub
 
Upvote 0

Excel Facts

Fastest way to copy a worksheet?
Hold down the Ctrl key while dragging tab for Sheet1 to the right. Excel will make a copy of the worksheet.
So the "Check CRT" shows up in column 'N' correct? And do you want conditional formatting?
 
Upvote 0
See if this works:

Code:
Option Explicit

Sub Foo()
 
Dim lr As Long
Dim rng As Range
 
Application.ScreenUpdating = False
 
lr = Sheet1.Cells.Find(what:="*", searchorder:=xlByRows, searchdirection:=xlPrevious).Row
Sheet1.Range("M1").Value = "Formatting Fix"
Sheet1.Range("N1").Value = "Relevant"
Sheet1.Range("M2:M" & lr).Formula = "=IF(OR(LEFT(C2,6)=""104001"",LEFT(C2,6)=""104003"",LEFT(C2,6)=""104004""),CONCATENATE(""CTR "",LEFT(C2,2),""0"",MID(C2,3,4)),IF(OR(LEFT(C2,6)=""100404"",LEFT(C2,6)=""100403""),CONCATENATE(""CTR "",LEFT(C2,5),""0"",MID(C2,6,1)),D2))"
Sheet1.Range("N2:N" & lr).Formula = "=IF(OR(M2=""CTR 1004001"",M2=""CTR 1004003"",M2=""CTR 1004004""),M2,IF(OR(RIGHT(D2,7)=""1004001"",RIGHT(D2,7)=""1004003"",RIGHT(D2,7)=""1004004""),D2,IF(AND(RIGHT(D2,5)>=""12475"",RIGHT(D2,5)<=""12739""),D2,IF(OR(A2=""5050"",RIGHT(C2,7)=""charges""),""Check CTR"",""IRRELEVANT""))))"
 
 
Set rng = Sheet1.Range("B1:N" & lr)
 
With rng
    .AutoFilter Field:=13, Criteria1:= _
        "IRRELEVANT"
            On Error Resume Next
    .Offset(1, 0).Resize(.Rows.Count - 1, 1). _
            SpecialCells(xlCellTypeVisible).EntireRow.Delete
        On Error GoTo 0
 
    .AutoFilter
End With
lr = Sheet1.Cells.Find(what:="*", searchorder:=xlByRows, searchdirection:=xlPrevious).Row
Sheet1.Range("D2:D" & lr).Value = Sheet1.Range("M2:M" & lr).Value

Sheet1.Range("N1:N" & lr).FormatConditions.Add Type:=xlExpression, Formula1:= _
        "=""Check CTR"""
Sheet1.Range("N1:N" & lr).FormatConditions(Selection.FormatConditions.Count).SetFirstPriority
    With Sheet1.Range("N1:N" & lr).FormatConditions(1).Interior
        .PatternColorIndex = xlAutomatic
        .Color = 65535
        .TintAndShade = 0
    End With
Sheet1.Range("N1:N" & lr).FormatConditions(1).StopIfTrue = False

Application.ScreenUpdating = True
End Sub

Conditional formatting isn't my strong suit, but it should work...
 
Upvote 0
See if this works:

Code:
Option Explicit
 
Sub Foo()
 
Dim lr As Long
Dim rng As Range
 
Application.ScreenUpdating = False
 
lr = Sheet1.Cells.Find(what:="*", searchorder:=xlByRows, searchdirection:=xlPrevious).Row
Sheet1.Range("M1").Value = "Formatting Fix"
Sheet1.Range("N1").Value = "Relevant"
Sheet1.Range("M2:M" & lr).Formula = "=IF(OR(LEFT(C2,6)=""104001"",LEFT(C2,6)=""104003"",LEFT(C2,6)=""104004""),CONCATENATE(""CTR "",LEFT(C2,2),""0"",MID(C2,3,4)),IF(OR(LEFT(C2,6)=""100404"",LEFT(C2,6)=""100403""),CONCATENATE(""CTR "",LEFT(C2,5),""0"",MID(C2,6,1)),D2))"
Sheet1.Range("N2:N" & lr).Formula = "=IF(OR(M2=""CTR 1004001"",M2=""CTR 1004003"",M2=""CTR 1004004""),M2,IF(OR(RIGHT(D2,7)=""1004001"",RIGHT(D2,7)=""1004003"",RIGHT(D2,7)=""1004004""),D2,IF(AND(RIGHT(D2,5)>=""12475"",RIGHT(D2,5)<=""12739""),D2,IF(OR(A2=""5050"",RIGHT(C2,7)=""charges""),""Check CTR"",""IRRELEVANT""))))"
 
 
Set rng = Sheet1.Range("B1:N" & lr)
 
With rng
    .AutoFilter Field:=13, Criteria1:= _
        "IRRELEVANT"
            On Error Resume Next
    .Offset(1, 0).Resize(.Rows.Count - 1, 1). _
            SpecialCells(xlCellTypeVisible).EntireRow.Delete
        On Error GoTo 0
 
    .AutoFilter
End With
lr = Sheet1.Cells.Find(what:="*", searchorder:=xlByRows, searchdirection:=xlPrevious).Row
Sheet1.Range("D2:D" & lr).Value = Sheet1.Range("M2:M" & lr).Value
 
Sheet1.Range("N1:N" & lr).FormatConditions.Add Type:=xlExpression, Formula1:= _
        "=""Check CTR"""
Sheet1.Range("N1:N" & lr).FormatConditions(Selection.FormatConditions.Count).SetFirstPriority
    With Sheet1.Range("N1:N" & lr).FormatConditions(1).Interior
        .PatternColorIndex = xlAutomatic
        .Color = 65535
        .TintAndShade = 0
    End With
Sheet1.Range("N1:N" & lr).FormatConditions(1).StopIfTrue = False
 
Application.ScreenUpdating = True
End Sub

Conditional formatting isn't my strong suit, but it should work...

Sorry for the delay, I have a couple of meetings this morning but will be able to give it a try before lunch and give you an update.
 
Upvote 0
Sorry for the delay, I have a couple of meetings this morning but will be able to give it a try before lunch and give you an update.

It ran the macro, but came back with a "Subscript out of range", and the conditional formatting for "Check CTR" did not work.
 
Upvote 0
what line is highlighted when you click "debug"?

When I get down to:

Code:
Sheet1.Range("N1:N" & lr).FormatConditions(Selection.FormatConditions.Count).SetFirstPriority
    With Sheet1.Range("N1:N" & lr).FormatConditions(1).Interior
        .PatternColorIndex = xlAutomatic
        .Color = 65535
        .TintAndShade = 0
    End With

It highlights that first line that ends with "SetFirstPriority", and when I hit F8 again it says:

Code:
Run-time Error '9': Subscript out of range
 
Last edited:
Upvote 0
OK, try this:

Code:
Option Explicit
 
Sub Foo()
 
Dim lr As Long
Dim rng As Range
 
Application.ScreenUpdating = False
 
lr = Sheet1.Cells.Find(what:="*", searchorder:=xlByRows, searchdirection:=xlPrevious).Row
Sheet1.Range("M1").Value = "Formatting Fix"
Sheet1.Range("N1").Value = "Relevant"
Sheet1.Range("M2:M" & lr).Formula = "=IF(OR(LEFT(C2,6)=""104001"",LEFT(C2,6)=""104003"",LEFT(C2,6)=""104004""),CONCATENATE(""CTR "",LEFT(C2,2),""0"",MID(C2,3,4)),IF(OR(LEFT(C2,6)=""100404"",LEFT(C2,6)=""100403""),CONCATENATE(""CTR "",LEFT(C2,5),""0"",MID(C2,6,1)),D2))"
Sheet1.Range("N2:N" & lr).Formula = "=IF(OR(M2=""CTR 1004001"",M2=""CTR 1004003"",M2=""CTR 1004004""),M2,IF(OR(RIGHT(D2,7)=""1004001"",RIGHT(D2,7)=""1004003"",RIGHT(D2,7)=""1004004""),D2,IF(AND(RIGHT(D2,5)>=""12475"",RIGHT(D2,5)<=""12739""),D2,IF(OR(A2=""5050"",RIGHT(C2,7)=""charges""),""Check CTR"",""IRRELEVANT""))))"
 
 
Set rng = Sheet1.Range("B1:N" & lr)
 
With rng
    .AutoFilter Field:=13, Criteria1:= _
        "IRRELEVANT"
            On Error Resume Next
    .Offset(1, 0).Resize(.Rows.Count - 1, 1). _
            SpecialCells(xlCellTypeVisible).EntireRow.Delete
        On Error GoTo 0
 
    .AutoFilter
End With
lr = Sheet1.Cells.Find(what:="*", searchorder:=xlByRows, searchdirection:=xlPrevious).Row

Sheet1.Range("D2:D" & lr).Value = Sheet1.Range("M2:M" & lr).Value

Sheet1.Cells.FormatConditions.Delete
Sheet1.Range("N2").Select
Sheet1.Range("$N$2:$N$" & lr).FormatConditions.Add Type:=xlExpression, Formula1:= _
        "=IF($N2=""Check CTR"",TRUE)"
Sheet1.Range("$N2:$N" & lr).FormatConditions(Sheet1.Range("$N2:$N" & lr).FormatConditions.Count).SetFirstPriority
    With Sheet1.Range("$N2:$N" & lr).FormatConditions(1).Interior
        .PatternColorIndex = xlAutomatic
        .Color = 65535
        .TintAndShade = 0
    End With
Sheet1.Range("$N2:$N" & lr).FormatConditions(1).StopIfTrue = False
 
Application.ScreenUpdating = True
End Sub

I'm aware that the "Sheet1.Range("N2").Select" isn't the best practice, but for the life of me, I can't figure out how to apply conditional formatting reliably without selecting the first cell.

If anyone else is still following this thread and can chime in with a non-Sheet1.Range("N2").Select solution, I'd appreciate it for my own education.
 
Upvote 0

Forum statistics

Threads
1,225,155
Messages
6,183,217
Members
453,151
Latest member
Lizamaison

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top