VBA to Clean Up CSV with Large Blank Spaces and Extra Text Rows

mharper90 · Nov 14, 2023

I have several CSV files that are exported from an air quality monitor. I am trying to extract only the useful values, which to me is any row with either column C, D, or E containing a non-zero number. All of the other rows can be deleted.

Additionally, there are new header rows inserted every 500 rows of data. All of the data rows have a sample number in column A, so that seems like the logical filter criteria for me; if A is a number, keep it, if not, delete the row.

My biggest obstacle right now is that every cell has a large space after the text or number, making the cell not cooperate with normal criteria and functions.

How do I remove all of these large blank spaces (almost like a tab)? And I'd appreciate any help with coming to the end product described (deleting all rows except those which contain a non-zero number in columns C, D, or E). Thanks!

Snapshot below, and a sample of one of my CSVs is at this link: NO - Sample File.csv

alansidman · Nov 14, 2023

Cannot open or view your file in Dropbox. Comes up empty. I have a Dropbox account and it still is not letting me see your file.

Fluff · Nov 14, 2023

I can download the file no problem, although it's 10.5MB

alansidman · Nov 14, 2023

Was able to finally open. Power Query solution below

Power Query:

let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    #"Filtered Rows" = Table.SelectRows(Source, each ([Column3] <> null)),
    #"Promoted Headers" = Table.PromoteHeaders(#"Filtered Rows", [PromoteAllScalars=true]),
    #"Filtered Rows1" = Table.SelectRows(#"Promoted Headers", each ([#"Index DataLog - Reading"] <> "Index DataLog - Reading")),
    #"Changed Type" = Table.TransformColumnTypes(#"Filtered Rows1",{{"NO Reading#(tab)", type number}, {"NO STEL Reading#(tab)", type number}, {"NO TWA Reading", type number}}),
    #"Filtered Rows2" = Table.SelectRows(#"Changed Type", each [#"NO Reading#(tab)"] > 0)
in
    #"Filtered Rows2"

Link to file-->NO - Sample File (1).csv | Powered by Box

mharper90 · Nov 14, 2023

alansidman said:

Was able to finally open. Power Query solution below

Power Query:

let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    #"Filtered Rows" = Table.SelectRows(Source, each ([Column3] <> null)),
    #"Promoted Headers" = Table.PromoteHeaders(#"Filtered Rows", [PromoteAllScalars=true]),
    #"Filtered Rows1" = Table.SelectRows(#"Promoted Headers", each ([#"Index DataLog - Reading"] <> "Index DataLog - Reading")),
    #"Changed Type" = Table.TransformColumnTypes(#"Filtered Rows1",{{"NO Reading#(tab)", type number}, {"NO STEL Reading#(tab)", type number}, {"NO TWA Reading", type number}}),
    #"Filtered Rows2" = Table.SelectRows(#"Changed Type", each [#"NO Reading#(tab)"] > 0)
in
    #"Filtered Rows2"

Link to file-->NO - Sample File (1).csv | Powered by Box

Thank you! I'll try this when I get into the office tomorrow and report back. I've never used Power Query, but just did a quick Google search about it. I'm excited to give it a shot and see if I can successfully add this to my project!

kevin9999 · Nov 15, 2023

Not 100% certain what you're looking for as a final product, but please try the following on a copy of your workbook. Just change the sheet name to suit.

VBA Code:

Option Explicit
Sub mharper90()
    Application.ScreenUpdating = False
    Dim ws As Worksheet
    Set ws = Worksheets("NO - Sample File")     '<-- *** Change the sheet name to suit ***
    Dim LRow As Long, LCol As Long, i As Long, r As Range
    LRow = ws.Cells.Find("*", , xlFormulas, , xlByRows, xlPrevious).Row
    LCol = ws.Cells.Find("*", , xlFormulas, , xlByColumns, xlPrevious).Column + 1
    
    Set r = ws.Range(ws.Cells(2, 3), ws.Cells(LRow, 5))
    With r
        .Value = Evaluate("Clean(" & .Address(, , , 1) & ")")
    End With
    
    Dim a, b
    a = r
    ReDim b(1 To UBound(a, 1), 1 To 1)
    For i = 1 To UBound(a, 1)
        If Not IsNumeric(a(i, 1)) Or a(i, 1) + a(i, 2) + a(i, 3) = 0 Then b(i, 1) = 1
    Next i
    
    ws.Cells(2, LCol).Resize(UBound(b, 1)).Value = b
    i = WorksheetFunction.Sum(ws.Columns(LCol))
    If i > 0 Then
        ws.Range(ws.Cells(2, 1), ws.Cells(LRow, LCol)).Sort Key1:=ws.Cells(2, LCol), _
        order1:=xlAscending, Header:=xlNo
        ws.Cells(2, LCol).Resize(i).EntireRow.Delete
    End If
    
    With ws
        .Range("A1").Resize(1, 5).Value = Array("Index DataLog - Reading", "Date/Time", "NO Reading", "NO STEL Reading", "NO TWA Reading")
        .Columns("A:E").AutoFit
    End With
    Application.ScreenUpdating = True
End Sub

VBA to Clean Up CSV with Large Blank Spaces and Extra Text Rows

mharper90

Board Regular

alansidman

Well-known Member

Fluff

MrExcel MVP, Moderator

alansidman

Well-known Member

mharper90

Board Regular

kevin9999

Well-known Member

Similar threads

Share this page

VBA to Clean Up CSV with Large Blank Spaces and Extra Text Rows

mharper90

Board Regular

alansidman

Well-known Member

Fluff

MrExcel MVP, Moderator

alansidman

Well-known Member

mharper90

Board Regular

kevin9999

Well-known Member

Similar threads

Share this page

We've detected that you are using an adblocker.

Which adblocker are you using?

Disable AdBlock

Disable AdBlock Plus

Disable uBlock Origin

Disable uBlock