VBA to normalize files in a folder

BBalazs

New Member
Joined
Apr 6, 2013
Messages
11
First of all, thank you for all the support I got here during my previous question. I much appreciate it.

I am still learning VBA, slowly getting the hang of it, but now I've ran across a complex problem which I have just no idea how to handle. Here is how it goes.

I have my workbook stored in "myDir" and a subfolder, myDir & "\CSV" contains source files which I will be using later. The problem is that the source files are updated, but not overwritten automatically. Therefore, the folder contains files like this:
Apples_Report.csv
Oranges_Report.csv
Oranges_Report-1.csv
Pears_Report.csv
Pears_Report-1.csv
Pears_Report-2.csv

I will have to clean this folder using VBA. Here is the logic of what I'd like to do (as this is not real code, I'm not sure if I need to wrap it into CODE tags, sorry about that):

Code:
Iterate for All Files

Get CurrentFilename
id = 1
If (CurrentFilename & "-" & id & ".csv") is Found
Then Delete CurrentFilename
Else Next File

mainiter:
id = id+1
If (CurrentFilename & "-" & id & ".csv") is Found
Then Delete CurrentFilename & "-" & id-1 & ".csv"; GoTo mainiter
Else Rename File CurrentFilename & "-" & id & ".csv" To CurrentFilename & ".csv"; Next File



Something like that... Of course that riddle is obviously ambiguous and has logical loopholes but technically:

For files that have 1 instance only, leave them intact
For files that have more recent instances, leave only the file with the largest suffix, delete the others and rename it to a suffix-free version.
Do this for all files in the CSV subfolder.


I know I should perhaps be using the FileSystemObject, but I have little experience with that. Do you thing you can help me out?
 
Last edited:

Excel Facts

What does custom number format of ;;; mean?
Three semi-colons will hide the value in the cell. Although most people use white font instead.
I myself would tend to use FSO, and I can probably come up with a script that'll do what you need, but a few points I'd need to know in order to code the script properly:


  • Is there a standard nomenclature for the file names created? Such as they will all end with "Report" or some other identifier?
  • Will the file names have dashes and numbers in them?
  • Does the user have the ability to delete from that folder? In other words, could we potentially have a series of files that look like "Oranges_Report.csv", and then "Oranges_Report-2.csv" with out the "-1" version?

Give me a little bit and I can probably come up with majority of a shell for the process and then it can be tweaked once I receive the answers to the questions above.
 
Upvote 0
Thank you for your help. I will try to give you as much info as I can, and I hope this example will help me familiarise myself with the FSO as well.

The nomenclature for the files will read like the following:
[Name]_Report[-x].csv
Where [name] is a variable element (short company name), "_Report" is fixed, and [-x] appears when the automated process downloads the file to the location and there already exists one with the same name. 'x' represents an incrementing number.

File names will NOT have dashes or underscores (save the ones you can see above in _Report and -x), but may contain extra numbers in the [name] field. For example, we already have a file called 123_Report.csv (as that is the name of the company) and so we will have to cope with 123_Report-1.csv, 123_Report-2.csv etc.

The folder is not locked, but there will not be user intervention. The case you mentioned will not happen, and there is a strict rule for the following increment: No suffix, -1, -2, -3, etc.

Please do ask whatever information you require, and I'll try to be of as much assistance as possible.

Thanks!
 
Upvote 0
Great! That nomenclature will be perfect as there will be a hard rule of having "_Report" before the dot and the file extension (for the "normal" files and not the incrementals). Makes life a lot easier!

You mentioned the folder is wide open, but does not have direct user intervention. So, does that mean also that we won't have a situation where you'd have a "Oranges_Report-1.csv" and "Oranges_Report-2.csv" file without the "main" "Oranges_Report.csv" file in the folder (assuming the maintenance process runs correctly)

I don't mean to by picky, but the algorithm will be written in such a way that it'll have to make certain assumptions and rules around the format of the files in the folder.

Another question -- does the user (or some type of background process) open the "main" files in that folder? I was wondering what problems we may run into if we try to delete a "main" file and it's open or otherwise in use. That may have to be accounted for in the script. Perhaps we could just skip that file and move on, but that just adds another layer of complexity...

I ended up getting tied up yesterday afternoon but should have some time to write up something later this morning...
 
Upvote 0
Okay, so here's a starting point...it doesn't have any "file in use" checking when running the deletes, but it's really close.

Let me know how this looks so far:

Code:
Public FSO As New Scripting.FileSystemObject
Const ReportFolder = "C:\Temp\SubFolder\" ' list folder containing report CSV files


Public Sub CleanupReportFiles()
    Dim oFolder As Scripting.Folder, oFile As Scripting.File


    Set oFolder = FSO.GetFolder(ReportFolder) ' intialize report folder variable
    
    ' loop through each file in report folder and identify "main" report file
    For Each oFile In oFolder.Files
        If Right(oFile.Name, 11) = "_Report.csv" Then
            ' found "main" file -- process it
            DeleteIncrementalFiles oFile.Name
        End If
    Next oFile
    
    ' object cleanup
    Set oFile = Nothing
    Set oFolder = Nothing
End Sub


Private Sub DeleteIncrementalFiles(FullReportName As String)
    Dim oFolder As Scripting.Folder, oFile As Scripting.File, ReportName As String
    Dim DashPosition As Long, DotPosition As Long
    Dim CurrentIncremental As Long, MaxIncremental As Long
    
    ReportName = Left(FullReportName, Len(FullReportName) - 4) ' obtain just the report name without dot/file ext


    Set oFolder = FSO.GetFolder(ReportFolder) ' intialize report folder variable


    ' loop through each file in report folder and identify "main" report file
    For Each oFile In oFolder.Files
        If Left(oFile.Name, Len(ReportName)) = ReportName Then ' check to see if we are focused on a related report
            If oFile.Name <> FullReportName Then ' skip the main report file itself
                DashPosition = InStrRev(oFile.Name, "-") ' determine dash position is for incremental after report name
                DotPosition = InStrRev(oFile.Name, ".") ' determine dot position for beginning of file extention
                CurrentIncremental = Mid(oFile.Name, DashPosition + 1, DotPosition - DashPosition - 1) ' extract incremental valoue from file name
                
                ' Determine if the incremental value for the current file being processed is greater than the
                ' max (last) incremental value for all report files in that group.  This is done since we cannot
                ' depend on the sort order from the population of the oFolder object
                If CurrentIncremental > MaxIncremental Then
                    MaxIncremental = CurrentIncremental
                End If
            End If
        End If
    Next oFile
    
    ' only cleanup files which have at least one incremental
    If MaxIncremental > 0 Then
        FSO.DeleteFile ReportFolder & FullReportName, True ' delete main report file
        
        ' loop through all incrementals with the exception of the most recent / max incremental
        For CurrentIncremental = 1 To MaxIncremental - 1
            FSO.DeleteFile ReportFolder & ReportName & "-" & CurrentIncremental & ".csv", True
        Next CurrentIncremental
        
        ' move (rename) max incremental file name to main report name
        FSO.MoveFile ReportFolder & ReportName & "-" & MaxIncremental & ".csv", ReportFolder & FullReportName
    End If
    
        ' object cleanup
    Set oFile = Nothing
    Set oFolder = Nothing
End Sub
 
Upvote 0
The code itself looks great, and don't worry about those assumptions - the naming convention should be rigid and people won't be tampering with those files.

I've tried to install the macro into the workbook, but unfortunately I ran into some problems. Here's what I did:
- Referenced the Scripting Runtime
- Copied your code into a new module, Module3 for now
- Added Call CleanupReportFiles to where it is appropriate (in ThisWorkbook, see code)
Code:
Private Sub Workbook_Open()

MSG1 = MsgBox("Update data from source files?", vbYesNo, "Data Update")

If MSG1 = vbYes Then
Application.ScreenUpdating = False
Call Clear
Call CleanupReportFiles
Call ConvertAllToUTF8
Call ImportCSV
Call RefreshActiveDataset
Call AddMeasures
Call ResetCharts
Application.StatusBar = False
Application.ScreenUpdating = True
MsgBox "Done."
Else
  MsgBox "Data not updated."
End If

End Sub

- And opened the spreadsheet, clicked yes.

Everything was fine except that nothing happened, almost as if your script was ignored/skipped.
Now please note that I actually created a C:\Temp\SubFolder\ folder for testing purposes and copied the files over there. Nothing happened to them. I tried going a step further by calling CleanupReportFiles from the Macros panel. I got a busy mouse cursor for a quarter of a second, but nothing else happened, the files in the temp folder were left intact.

Can you please help me figure out why didn't the code execute? I got no errors, almost as if the process just "skipped" your code.
Oh, and one last thing - I've had trouble passing ThisWorkbook.Path & "\CSV\" to the Const ReportFolder - as it is a Const, it makes sense. Is there a way to make a dynamic reference however? As the workbook might be moved along with the source data folder, this would be an important addition.

Thanks for all your help, I much appreciate it. Just by reading your code I think I'm slowly getting a hang of the very basics of FSO.
 
Upvote 0
Please disregard my last post, I replied too soon. I made a silly mistake while inserting the code, that is why it did not execute. Can an administrator/moderator alter my previous post so as not to be confusing?

I will return with proper feedback soon.
 
Upvote 0
Yes, first move is to add Microsoft Scripting Runtime to your Project References. Otherwise, you would have had a compile error.

So, do you know how to debug scripts that are executing? I would suggest putting a break point (F9) at the "Call CleanupReportFiles" line. Then, the execution should go into the CleanupReportFiles() Public Sub once you hit F8 to step into the routine. Keeping hitting F8 and see what happens and perhaps something would jump out at you as the execution runs through the various loops. You can hover over the variables and object names to get a little tooltip of the current value...or alternatively, you can turn on the Locals window to see what's going on too.

Off hand, I don't see anything wrong with how you implemented or are calling the routine. Maybe stepping through the code will point you in the right direction.

Concerning assigning the constant, you cannot dynamically assign the value. However, if you are needed the path to be dynamic based on the path of the current workbook, then I would move that constant into the main CleanupReportFiles() routine and change it from a constant to just a normal string variable. Then, you can assign the value dynamically...
 
Upvote 0
Please disregard my last post, I replied too soon. I made a silly mistake while inserting the code, that is why it did not execute. Can an administrator/moderator alter my previous post so as not to be confusing?

I will return with proper feedback soon.
Cool -- I was scratching my head trying to figure out what was going on! :laugh:
 
Upvote 0
So now for some proper feedback. I'd be very happy if someone with appropriate privileges could please delete my previous two posts to maintain thread transparency.

I ran into some case sensitivity issues (with report being spelled with a non-capital r), but having corrected that, the code checks out and works fine. This won't be an issue in the future, the code can remain case sensitive, it's not an issue from now on.

I would like to pose one small feature request, if feasible. Instead of a Const ReportFolder it would be important to have a dynamic reference to ThisWorkbook.Path & "\CSV\". However being a Const I can't add a variable like that. Is there a way to turn that into a non-absolute reference?

Edit: So we cross-posted one-another bigtime :D Thanks for answering my question, I will fiddle around with that and come back if I fail miserably :) Thanks for your help!
 
Last edited:
Upvote 0

Forum statistics

Threads
1,223,229
Messages
6,170,881
Members
452,364
Latest member
springate

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top