Please help on data Cleaning, lookups and merging

Mangolili

Board Regular
Joined
Jun 21, 2011
Messages
54
Hi All,

I have been working with large excel files ranging from 500,000 to 800,000 plus rows of data.
Cleaning, matching and merging data is time consuming. I am wondering if you have an good approach of dealing with large data files?
Any good data cleaning and matching tools to recommend?

How I work with the large data files currently:
1. Put the shifted columns back to the appropriate columns
2. Clean out data that's not useful
3. Match the customers from the original file to other files that's different.
4. Since there's no IDs in the original file, I look up the data using customer_name, address, email address and city to match other data files, if there's a match, the customerID will show up via vlookup.
5. At first, I started with vlooking up the email, then the address
6. After vlooking up like that, I have two different columns with different IDS.
7. I wanted to merge them together on as one column. Currently, I use a filter on ID using Address to get rid of the blanks and N/A.
And for the ID using email column, I use = to the values of the " ID using address" column

For example:
[TABLE="class: grid, width: 500"]
<tbody>[TR]
[TD]CustomerName[/TD]
[TD]ID using email[/TD]
[TD]ID using address[/TD]
[/TR]
[TR]
[TD]ABC company[/TD]
[TD]123[/TD]
[TD]N/A[/TD]
[/TR]
[TR]
[TD]HairSalon556[/TD]
[TD]N/A[/TD]
[TD]4351[/TD]
[/TR]
[TR]
[TD]BeautySupply[/TD]
[TD]N/A[/TD]
[TD]235325[/TD]
[/TR]
[TR]
[TD]Makeupforever[/TD]
[TD]789[/TD]
[TD]789[/TD]
[/TR]
[TR]
[TD]HairExtension[/TD]
[TD]5437[/TD]
[TD]N/A[/TD]
[/TR]
[TR]
[TD]BeautySupply[/TD]
[TD]N/A[/TD]
[TD]345567[/TD]
[/TR]
</tbody>[/TABLE]

[TABLE="class: grid, width: 500"]
<tbody>[TR]
[TD]CustomerName[/TD]
[TD]ID using email[/TD]
[/TR]
[TR]
[TD]ABC company[/TD]
[TD]123[/TD]
[/TR]
[TR]
[TD]HairSalon556[/TD]
[TD]4351[/TD]
[/TR]
[TR]
[TD]BeautySupply[/TD]
[TD]235325[/TD]
[/TR]
[TR]
[TD]Makeupforever[/TD]
[TD]789[/TD]
[/TR]
[TR]
[TD]HairExtension[/TD]
[TD]5437[/TD]
[/TR]
[TR]
[TD]BeautySupply[/TD]
[TD]34556[/TD]
[/TR]
</tbody>[/TABLE]

I think this method works well for small data files. However, for large data files, it is very time consuming.
It took me almost a week to do all the data cleaning, ID Vlookups and data merging. If would be great if you have any idea of how to do this more efficiently. Thank you so much.
 

Excel Facts

Create a chart in one keystroke
Select the data and press Alt+F1 to insert a default chart. You can change the default chart to any chart type

Forum statistics

Threads
1,223,227
Messages
6,170,848
Members
452,361
Latest member
d3ad3y3

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top