Seeking suggestions for fastest approach to re-organise data

JackDanIce

Well-known Member
Joined
Feb 3, 2010
Messages
9,922
Office Version
  1. 365
Platform
  1. Windows
Hi,


I have data in > 2,500 CSV files to aggregate, clean and reduce to 2 columns.

Each file contains at mininum an address and restricted key word e.g. 1 Abbey Road, License Submitted (key word in bold)

Some files have more than 1 address and key word, e.g. 1 Abbey Road, License Submitted, 2 Abbey Road, London, License Approved

Comma's are used as delimiting character (unfortunately means some addresses span 1 cell, others more than 1 cell); key words pre-exist and can be placed into a dictionary.


Task is to create a 2 column list where column A is the address and column B is the key word


Using a nested loop seems inefficient, to test each cell value if it exists in the dictionary and process accordingly; key words needed for final stats (probably pivot the data once 2 column list is created).

Any suggestions or code design for this?

TIA,
Jack
 

Excel Facts

Fastest way to copy a worksheet?
Hold down the Ctrl key while dragging tab for Sheet1 to the right. Excel will make a copy of the worksheet.
I would be tempted to try and import all the csvs, and process once, not sure if that is feasible, maybe add/gatehr a log of the files imported
 
Upvote 0
Hi Mole, thanks for the reply:
Code:
Sub Main()


    Application.ScreenUpdating = False
    
    Read_Data
    Clean_Data
    Rearrange_Data
    
    Application.ScreenUpdating = True
    
End Sub
I'm trying to avoid listing all my code, however, Read_Data reads in all the data from the CSV files first - each row can extend upto column I or J

Clean_Data clears unecessary characters ("#EANF#" and oddChars(1) = 187: oddChars(2) = 191: oddChars(3) = 239

Rearrange_Data is where I will be reducing the data into 2 columns. I didn't state, each "pair" of address and keyword needs to be in the next empty row in the output 2 columns.

e.g. with initial example, output:

1 Abbey Road, License Submitted
1 Abbey Road, License Submitted
2 Abbey Road, License Approved

Some csv files may have repeating addresses, this is not a problem.

Best,
Jack
 
Upvote 0
When I've wanted to do something quick with a bunch of .csv's, I've concatenated them with a DOS command first.
If you open a command window in the folder where the files are you can type:
Code:
type *.csv > x.txt
x.txt would have all the .csv files in it (used .txt to prevent including itself)
 
Upvote 0
Hey didn’t know of that, thanks GR00007 :) I’ll try and reply back
 
Upvote 0
Glad that worked for you.
The ">" below is a redirect, outputting the results of "Type *.csv" into the named file x.txt
Code:
Type *.csv > x.txt
 
Upvote 0
Thanks, barely can remember any DOS programming pre Windows days.. seem to recall things like pipes and stuff!
 
Upvote 0

Forum statistics

Threads
1,223,900
Messages
6,175,276
Members
452,629
Latest member
SahilPolekar

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top