Raw data--cleansing macro?

tc88

Board Regular
Joined
Jul 6, 2011
Messages
80
Hello all-<?xml:namespace prefix = o ns = "urn:schemas-microsoft-com:office:office" /><o:p></o:p>
This is my first post here, but I have been browsing for the past couple of weeks at my new job. I am an intern in the accounting/finance department at a publicly traded company here in the US and we are currently working on reducing waste in business processes. One of those processes revolves around exporting data from SAP into Excel, fixing some of the data, and then removing irrelevant data. I am using Excel 2010 on Windows XP SP 3.
The first step I have set up is to check if any of the data needs to be reformatted (it is missing a “0” in the middle of the number). I have the formula as follows, and it works correctly (Column M in dummy data below)
Code:
=IF(OR(LEFT(C2,6)="104001",LEFT(C2,6)="104003",LEFT(C2,6)="104004"),CONCATENATE("CTR ",LEFT(C2,2),"0",MID(C2,3,4)),D2)
<o:p></o:p>

From there, I use a nested IF based on the column for the above formula. That formula is as follows (Column N in dummy data below)
Code:
=IF(OR(M2="CTR 1004001",M2="CTR 1004003",M2="CTR 1004004"),M2,IF(AND(RIGHT(D2,5)>="12475",RIGHT(D2,5)<="12739"),D2,"IRRELEVANT"))
<o:p></o:p>

I have that column conditionally formatted to show “IRRELEVANT” with a red fill, and up to this point I have absolutely no problems and the formulas appear to work correctly. However, from there, I must manually delete all “IRRELEVANT” rows to get out the data for our specific facility. If this were a few rows it wouldn’t be a big deal but the raw data is ~2000 rows and my manually cleansed data is ~600 rows, so as you can see I am manually clearing out quite a bit of data.<o:p></o:p>
This is a process that is done monthly, and while I believe a macro is the best way to do this, I don’t have any experience writing macros/using VBA. I consider myself a strong Excel user, but I am not an expert.<o:p></o:p>

Below is some dummy data so you have a better idea of what I am talking about:

Excel Workbook
BCDEFGHIJKLMN
1Cost ElementNameAuxAcctAs1Name of offsetting accountOffst.acctOrderOffActCoCdVal.in rep.cur.Postg DateCreated onFormatting FixRelevant?
2Facilities10357E EN DUMMY DATA13967010009,956.0005/31/201106/02/2011IRRELEVANT
3Facilities104004-PRJ DUMMY DATA15629510002,645.0005/31/201106/01/2011CTR 1004004CTR 1004004
4Fringe10384 BD DUMMY DATA115746100021.5605/31/201106/02/2011IRRELEVANT
5Fringe10436 RD DUMMY DATA1157461000545.2105/31/201106/02/2011IRRELEVANT
6GeneralJCTE000204193CTR 12708A/P AMCL210036154710S10005.8905/04/201105/05/2011CTR 12708CTR 12708
7GeneralJCTE000204193CTR 1004003A/P AMCL210036156295S100016.6705/04/201105/05/2011CTR 1004003CTR 1004003
8GeneralJCTE000204193CTR 1004003A/P AMCL210036157010S10006.2805/04/201105/05/2011CTR 1004003CTR 1004003
9Other104003_DSTDUMMYDATA1564301000-2,215.6505/31/201106/02/2011CTR 1004003CTR 1004003
10Other12335_DST DISTRDUMMY DATA1569501000-312.4505/31/201106/02/2011IRRELEVANT
Sheet1
Excel 2010
Cell Formulas
RangeFormula
M2=IF(OR(LEFT(C2,6)="104001",LEFT(C2,6)="104003",LEFT(C2,6)="104004"),CONCATENATE("CTR ",LEFT(C2,2),"0",MID(C2,3,4)),D2)
N2=IF(OR(M2="CTR 1004001",M2="CTR 1004003",M2="CTR 1004004"),M2,IF(AND(RIGHT(D2,5)>="12475",RIGHT(D2,5)<="12739"),D2,"IRRELEVANT"))



Thanks in advance for any assistance!
<o:p></o:p>
 
Last edited:
Yep..try this one:

Code:
Sub Foo()

Dim lr As Long
Dim rng As Range

Application.ScreenUpdating = False

lr = Sheet1.Cells.Find(what:="*", searchorder:=xlByRows, searchdirection:=xlPrevious).Row
Sheet1.Range("M1").Value = "Formatting Fix"
Sheet1.Range("N1").Value = "Relevant"
Sheet1.Range("M2:M" & lr).Formula = "=IF(OR(LEFT(C2,6)=""104001"",LEFT(C2,6)=""104003"",LEFT(C2,6)=""104004""),CONCATENATE(""CTR "",LEFT(C2,2),""0"",MID(C2,3,4)),D2)"
Sheet1.Range("N2:N" & lr).Formula = "=IF(OR(M2=""CTR 1004001"",M2=""CTR 1004003"",M2=""CTR 1004004""),M2,IF(AND(RIGHT(D2,5)>=""12475"",RIGHT(D2,5)<=""12739""),D2,""IRRELEVANT""))"


Set rng = Sheet1.Range("B1:N" & lr)

With rng
    .AutoFilter Field:=13, Criteria1:= _
        "IRRELEVANT"
            On Error Resume Next
    .Offset(1, 0).Resize(.Rows.Count - 1, 1). _
            SpecialCells(xlCellTypeVisible).EntireRow.Delete
        On Error GoTo 0

    .AutoFilter
End With
Application.ScreenUpdating = True
End Sub

Wasn't sure if you needed the headers put in or not for columns M and N, I put them in.
 
Upvote 0

Excel Facts

Square and cube roots
The =SQRT(25) is a square root. For a cube root, use =125^(1/3). For a fourth root, use =625^(1/4).
Yep..try this one:

Code:
Sub Foo()
 
Dim lr As Long
Dim rng As Range
 
Application.ScreenUpdating = False
 
lr = Sheet1.Cells.Find(what:="*", searchorder:=xlByRows, searchdirection:=xlPrevious).Row
Sheet1.Range("M1").Value = "Formatting Fix"
Sheet1.Range("N1").Value = "Relevant"
Sheet1.Range("M2:M" & lr).Formula = "=IF(OR(LEFT(C2,6)=""104001"",LEFT(C2,6)=""104003"",LEFT(C2,6)=""104004""),CONCATENATE(""CTR "",LEFT(C2,2),""0"",MID(C2,3,4)),D2)"
Sheet1.Range("N2:N" & lr).Formula = "=IF(OR(M2=""CTR 1004001"",M2=""CTR 1004003"",M2=""CTR 1004004""),M2,IF(AND(RIGHT(D2,5)>=""12475"",RIGHT(D2,5)<=""12739""),D2,""IRRELEVANT""))"
 
 
Set rng = Sheet1.Range("B1:N" & lr)
 
With rng
    .AutoFilter Field:=13, Criteria1:= _
        "IRRELEVANT"
            On Error Resume Next
    .Offset(1, 0).Resize(.Rows.Count - 1, 1). _
            SpecialCells(xlCellTypeVisible).EntireRow.Delete
        On Error GoTo 0
 
    .AutoFilter
End With
Application.ScreenUpdating = True
End Sub

Wasn't sure if you needed the headers put in or not for columns M and N, I put them in.

Worked like a charm, thank you again for all of your help!!!

One more question: each month this data is pulled out of SAP. As far as running the macro and cleansing the data, would the best way be to create a macro workbook to copy/paste the data in? AFAIK they copy/paste the cleansed data into another worksheet, so it could be copied from the macro workbook.
 
Upvote 0
Yes, I think that having a macro-enabled workbook w/ a single sheet named "SAP Output" would be a perfectly fine way to do it.
 
Upvote 0
Yes, I think that having a macro-enabled workbook w/ a single sheet named "SAP Output" would be a perfectly fine way to do it.

I have just worked with the person who receives the data from me, is it possible once the "IRRELEVANT" data has been deleted to cut and paste the remaining values into column D? Thanks again, you've been a huge help.
 
Upvote 0
BA-BOOM!


Code:
Sub Foo()
 
Dim lr As Long
Dim rng As Range
 
Application.ScreenUpdating = False
 
lr = Sheet1.Cells.Find(what:="*", searchorder:=xlByRows, searchdirection:=xlPrevious).Row
Sheet1.Range("M1").Value = "Formatting Fix"
Sheet1.Range("N1").Value = "Relevant"
Sheet1.Range("M2:M" & lr).Formula = "=IF(OR(LEFT(C2,6)=""104001"",LEFT(C2,6)=""104003"",LEFT(C2,6)=""104004""),CONCATENATE(""CTR "",LEFT(C2,2),""0"",MID(C2,3,4)),D2)"
Sheet1.Range("N2:N" & lr).Formula = "=IF(OR(M2=""CTR 1004001"",M2=""CTR 1004003"",M2=""CTR 1004004""),M2,IF(AND(RIGHT(D2,5)>=""12475"",RIGHT(D2,5)<=""12739""),D2,""IRRELEVANT""))"
 
 
Set rng = Sheet1.Range("B1:N" & lr)
 
With rng
    .AutoFilter Field:=13, Criteria1:= _
        "IRRELEVANT"
            On Error Resume Next
    .Offset(1, 0).Resize(.Rows.Count - 1, 1). _
            SpecialCells(xlCellTypeVisible).EntireRow.Delete
        On Error GoTo 0
 
    .AutoFilter
End With

lr = Sheet1.Cells.Find(what:="*", searchorder:=xlByRows, searchdirection:=xlPrevious).Row

Sheet1.Range("D2:D" & lr).Value = Sheet1.Range("M2:M" & lr).Value
Application.ScreenUpdating = True
End Sub
 
Upvote 0
BA-BOOM!


Code:
Sub Foo()
 
Dim lr As Long
Dim rng As Range
 
Application.ScreenUpdating = False
 
lr = Sheet1.Cells.Find(what:="*", searchorder:=xlByRows, searchdirection:=xlPrevious).Row
Sheet1.Range("M1").Value = "Formatting Fix"
Sheet1.Range("N1").Value = "Relevant"
Sheet1.Range("M2:M" & lr).Formula = "=IF(OR(LEFT(C2,6)=""104001"",LEFT(C2,6)=""104003"",LEFT(C2,6)=""104004""),CONCATENATE(""CTR "",LEFT(C2,2),""0"",MID(C2,3,4)),D2)"
Sheet1.Range("N2:N" & lr).Formula = "=IF(OR(M2=""CTR 1004001"",M2=""CTR 1004003"",M2=""CTR 1004004""),M2,IF(AND(RIGHT(D2,5)>=""12475"",RIGHT(D2,5)<=""12739""),D2,""IRRELEVANT""))"
 
 
Set rng = Sheet1.Range("B1:N" & lr)
 
With rng
    .AutoFilter Field:=13, Criteria1:= _
        "IRRELEVANT"
            On Error Resume Next
    .Offset(1, 0).Resize(.Rows.Count - 1, 1). _
            SpecialCells(xlCellTypeVisible).EntireRow.Delete
        On Error GoTo 0
 
    .AutoFilter
End With
 
lr = Sheet1.Cells.Find(what:="*", searchorder:=xlByRows, searchdirection:=xlPrevious).Row
 
Sheet1.Range("D2:D" & lr).Value = Sheet1.Range("M2:M" & lr).Value
Application.ScreenUpdating = True
End Sub

OK, I ended up needing to change the formula around as it wasn't catching everything that is related to our facility based on the raw data.

The new formula for column M is
Code:
=IF(OR(LEFT(C2,6)="104001",LEFT(C2,6)="104003",LEFT(C2,6)="104004"),CONCATENATE("CTR ",LEFT(C2,2),"0",MID(C2,3,4)),IF(OR(LEFT(C2,6)="100404",LEFT(C2,6)="100403"),CONCATENATE("CTR ",LEFT(C2,5),"0",MID(C2,6,1)),D2))

Column N is the same
Code:
=IF(OR(M2="CTR 1004001",M2="CTR 1004003",M2="CTR 1004004"),M2,IF(AND(RIGHT(D2,5)>="12475",RIGHT(D2,5)<="12739"),D2,"NON-SPENCER"))

I tried using your macro to change the code as follows:

Code:
Sub Foo()
 
Dim lr As Long
Dim rng As Range
 
Application.ScreenUpdating = False
 
lr = Sheet1.Cells.Find(what:="*", searchorder:=xlByRows, searchdirection:=xlPrevious).Row
Sheet1.Range("M1").Value = "Formatting Fix"
Sheet1.Range("N1").Value = "Relevant"
Sheet1.Range("M2:M" & lr).Formula = "=IF(OR(LEFT(C2,6)=""104001"",LEFT(C2,6)=""104003"",LEFT(C2,6)=""104004""),CONCATENATE(""CTR "",LEFT(C2,2),""0"",MID(C2,3,4)),IF(OR(LEFT(C2,6)=""100404"",LEFT(C2,6)=""100403""),CONCATENATE(""CTR "",LEFT(C2,5),""0"",MID(C2,6,1)),D2)"
Sheet1.Range("N2:N" & lr).Formula = "=IF(OR(M2=""CTR 1004001"",M2=""CTR 1004003"",M2=""CTR 1004004""),M2,IF(AND(RIGHT(D2,5)>=""12475"",RIGHT(D2,5)<=""12739""),D2,""IRRELEVANT""))"
 
 
Set rng = Sheet1.Range("B1:N" & lr)
 
With rng
    .AutoFilter Field:=13, Criteria1:= _
        "IRRELEVANT"
            On Error Resume Next
    .Offset(1, 0).Resize(.Rows.Count - 1, 1). _
            SpecialCells(xlCellTypeVisible).EntireRow.Delete
        On Error GoTo 0
 
    .AutoFilter
End With
lr = Sheet1.Cells.Find(what:="*", searchorder:=xlByRows, searchdirection:=xlPrevious).Row
Sheet1.Range("D2:D" & lr).Value = Sheet1.Range("M2:M" & lr).Value
Application.ScreenUpdating = True
End Sub

However, it comes up as a red "X" error under Microsoft Visual Basic for Applications and has the error message "400" and nothing else.
 
Upvote 0
Interesting...I don't see anything offhand that would cause an error. Did my macro work before you made any changes? And when the error comes up, do you have the option to "debug"?
 
Upvote 0
Also, you were missing a closing ) on your first if formula. This ran without issue on my machine using the test data you posted in your first post:

Code:
Sub Foo()
 
Dim lr As Long
Dim rng As Range
 
Application.ScreenUpdating = False
 
lr = Sheet1.Cells.Find(what:="*", searchorder:=xlByRows, searchdirection:=xlPrevious).Row
Sheet1.Range("M1").Value = "Formatting Fix"
Sheet1.Range("N1").Value = "Relevant"
Sheet1.Range("M2:M" & lr).Formula = "=IF(OR(LEFT(C2,6)=""104001"",LEFT(C2,6)=""104003"",LEFT(C2,6)=""104004""),CONCATENATE(""CTR "",LEFT(C2,2),""0"",MID(C2,3,4)),IF(OR(LEFT(C2,6)=""100404"",LEFT(C2,6)=""100403""),CONCATENATE(""CTR "",LEFT(C2,5),""0"",MID(C2,6,1)),D2))"
Sheet1.Range("N2:N" & lr).Formula = "=IF(OR(M2=""CTR 1004001"",M2=""CTR 1004003"",M2=""CTR 1004004""),M2,IF(AND(RIGHT(D2,5)>=""12475"",RIGHT(D2,5)<=""12739""),D2,""IRRELEVANT""))"
 
 
Set rng = Sheet1.Range("B1:N" & lr)
 
With rng
    .AutoFilter Field:=13, Criteria1:= _
        "IRRELEVANT"
            On Error Resume Next
    .Offset(1, 0).Resize(.Rows.Count - 1, 1). _
            SpecialCells(xlCellTypeVisible).EntireRow.Delete
        On Error GoTo 0
 
    .AutoFilter
End With
lr = Sheet1.Cells.Find(what:="*", searchorder:=xlByRows, searchdirection:=xlPrevious).Row
Sheet1.Range("D2:D" & lr).Value = Sheet1.Range("M2:M" & lr).Value
Application.ScreenUpdating = True
End Sub
 
Upvote 0
Interesting...I don't see anything offhand that would cause an error. Did my macro work before you made any changes? And when the error comes up, do you have the option to "debug"?

I didn't get to try it yesterday before I left the office, but I didn't have the option to debug either.

It looks like macro in the post above me works and got the same row count as when my co-worker completed the process manually. I just need to verify the totals are correct for column J and I should be set for the meeting this afternoon. Thanks again for all of your help, I really appreciate it.
 
Upvote 0

Forum statistics

Threads
1,225,155
Messages
6,183,217
Members
453,151
Latest member
Lizamaison

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top