Merging huge amount of data from two different workbooks into Master book based on multiple condition VBA

Denin Srmic

New Member
Joined
Apr 28, 2020
Messages
19
Office Version
  1. 365
Platform
  1. Windows
Hello,

I have two workbooks Master and Search, and within workbooks there is one spreadsheet in each book. Master book contains over 220 000 rows, whereas Search book has 35 000 rows. Both Sheets have the same number of Columns (216) ordered the same way and with the same Header names.

My task is to pull all the data from the Search Sheet based on condition matching Name, Surname, Date of Birth, Screening Day, and Event in both Sheets, into Master spreadsheet. I do not need entire row to be pulled into Master Sheet but only data starting from column BX to HH NOT entire row.

I have completed the code (please see bellow) that would do the job, but it is tremendously slow for vast amounts of data.
I would appreciate help of VBA Wizards here if they could help in creating a code that would do this job much faster, as this take too long.

Any suggestions, improvements, or critiques on my code are very welcome.

Many Thanks

VBA Code:
Option Explicit

Sub MergeData()
'We want to merge data from our ThisWorkbook(Search Sheet) with MasterDb(Master Sheet) based on multiple conditions
    
    
    Dim oThisWb As Workbook
    Dim oThisWs As Worksheet
    
    Dim oMasterWb As Workbook
    Dim oMasterWs As Worksheet
    
    
    Dim rT As Range 'whole Range in Search Wb
    Dim rThdr As Range 'Header Range in Search Wb
    Dim rM As Range 'whole Range in Search Wb
    Dim rMhdr As Range 'Header Range in Search Wb
    
    Dim lr As Long
    Dim lc As Long
    Dim lrT As Long
    Dim lcT As Long
    Dim iM As Long
    Dim iT As Long
    Dim lCounter As Long
    
    Dim sMasterDbFolderPath As String
    Dim sFirstName As String
    Dim sSurname As String
    Dim sPdetailsVenue As String
    Dim dDateOfScreening As Date
    Dim dPdetailsDOB As Date
    
    
    Application.DisplayAlerts = False
    Application.AskToUpdateLinks = False
    Application.EnableEvents = False
    Application.Calculation = xlCalculationManual
    Application.ScreenUpdating = False
    
    
    sMasterDbFolderPath = Environ("UserProfile") & "\Desktop\H\Master DatabaseFinal25112022.xlsx"
    If Dir(sMasterDbFolderPath) = vbNullString Then
        MsgBox Prompt:="Folder Path for Master Db does not exist!", Buttons:=vbCritical, Title:="Folder Does not exist!"
        Exit Sub
    End If
    
    Set oThisWb = ThisWorkbook
    Set oThisWs = oThisWb.Worksheets("HData")
    With oThisWs
        .Activate
        lrT = .Cells.Find(What:="*", LookIn:=xlValues, LookAt:=xlPart, SearchOrder:=xlByRows, SearchDirection:=xlPrevious, MatchCase:=False).Row
        lcT = .Cells.Find(What:="*", LookIn:=xlValues, LookAt:=xlPart, SearchOrder:=xlByColumns, SearchDirection:=xlPrevious, MatchCase:=False).Column
        Set rThdr = .Range(Range("A1"), .Cells(1, lcT))
        Set rT = .Range("A1").Resize(lrT, lcT)
    End With
    
    On Error Resume Next
    Set oMasterWb = Workbooks.Open(sMasterDbFolderPath)
    On Error GoTo 0
    Set oMasterWs = oMasterWb.Worksheets("RevisedMasterDB")
    With oMasterWs
        lr = .Cells.Find(What:="*", LookIn:=xlValues, LookAt:=xlPart, SearchOrder:=xlByRows, SearchDirection:=xlPrevious, MatchCase:=False).Row
        lc = .Cells.Find(What:="*", LookIn:=xlValues, LookAt:=xlPart, SearchOrder:=xlByColumns, SearchDirection:=xlPrevious, MatchCase:=False).Column
        Set rMhdr = .Range(Range("A1"), .Cells(1, lc))
        Set rM = .Range("A1").Resize(lr, lc)
    End With
    
    'first loop read name and surname and other conditions in H's Spreadsheet
    oThisWs.Activate
    For iT = 2 To lrT
            sFirstName = oThisWs.Range("F" & iT).Value
                sSurname = oThisWs.Range("G" & iT).Value
                    sPdetailsVenue = oThisWs.Range("C" & iT).Value
                        dDateOfScreening = oThisWs.Range("A" & iT).Value
                            dPdetailsDOB = oThisWs.Range("M" & iT).Value
                            
        'second loop to search for names, firstname and surname and other conditions in master Db and performe extraction
        oMasterWs.Activate
        For iM = 2 To lr
        On Error Resume Next
        If oMasterWs.Range("F" & iM).Value = sFirstName And oMasterWs.Range("G" & iM).Value = sSurname _
            And oMasterWs.Range("C" & iM).Value = sPdetailsVenue And oMasterWs.Range("A" & iM).Value = dDateOfScreening _
                And oMasterWs.Range("M" & iM).Value = dPdetailsDOB Then
        On Error GoTo 0
                oThisWs.Activate
                oThisWs.Range(Cells(iT, "BX"), Cells(iT, "HH")).Copy
                oMasterWs.Activate
                oMasterWs.Range(Cells(iM, "BX"), Cells(iM, "HH")).Select
                oMasterWs.Range(Cells(iM, "BX"), Cells(iM, "HH")).PasteSpecial Paste:=xlPasteValues
                oMasterWs.Range("O" & iM).Value = oThisWs.Range("O" & iT).Value
                oMasterWs.Range("P" & iM).Value = oThisWs.Range("P" & iT).Value
                oMasterWs.Range(Cells(iM, "BX"), Cells(iM, "HH")).Interior.Color = vbYellow
                
        End If
        Next iM
        Application.CutCopyMode = False
        lCounter = lCounter + 1
        Debug.Print lCounter
    Next iT
    oMasterWs.Activate
    oMasterWb.Save
    
    
    Application.DisplayAlerts = True
    Application.AskToUpdateLinks = True
    Application.EnableEvents = True
    Application.Calculation = xlCalculationAutomatic
    Application.ScreenUpdating = True
    
    
    
    Set rT = Nothing
    Set rThdr = Nothing
    Set rM = Nothing
    Set rMhdr = Nothing
    lr = 0
    lc = 0
    lrT = 0
    lcT = 0
    iM = 0
    iT = 0
    lCounter = 0
    sMasterDbFolderPath = vbNullString
    sFirstName = vbNullString
    sSurname = vbNullString
    sPdetailsVenue = vbNullString
    dDateOfScreening = 0
    dPdetailsDOB = 0
    
    End Sub
 
In addition to my previous reply,

this line of code is highlighted

'sh2.Range("A2").Resize(UBound(b, 1), UBound(b, 2)).Value = b'

and it says Runtime error 7, Out of Memory.
Hi DanteAmor,

sorry for late reply, I was engulfed in work.

I have checked most of the Data in both records, they are appear to be identical, that is to say their columns which are set as condition in order to transfer data appear to be identical. I am not sure why it did spit out Error 7 first time when I run the code, but now code works fine save that it transferred only few rows of data.

For example instead of thousands of matching rows it transfers only hundred. I have run the code multiple times with the same result.

With my old code that process matching data and copying them over into Master db seems to be promising but it is very very slow, too slow for such huge amount of data. During that process I was able to see that it finds more than hundred records and highlights them. I am not sure what it could be with the new code.

Data are highly sensitive, therefore not sure if I am allowed to send them.

Many Thanks
 
Upvote 0

Excel Facts

Quick Sum
Select a range of cells. The total appears in bottom right of Excel screen. Right-click total to add Max, Min, Count, Average.
Remember that all 5 columns must match.

Check which records in book1 "HData" you think match in book2 "RevisedMasterDB".
Take those records (3 or 4 records) from book1 and copy them into the new book1_a.
Now from book2, take the records that you think match and put them in the new book2_b.
Share book1_a and book2_b books.

You could upload books to a free site such www.dropbox.com or google drive.

Replace sensitive data with generic data.
 
Upvote 0
In addition to my previous reply,

this line of code is highlighted

'sh2.Range("A2").Resize(UBound(b, 1), UBound(b, 2)).Value = b'

and it says Runtime error 7, Out of Memory.
Hi DantAmor,

Over these days I have gone through my data assiduously and have realized that Columns F G and M do have matches in both Workbooks that is to say both sheets in my sample data of 1500 rows, whereas Column A does have partial match as some of the rows are missing in either in both sheets or in just one, and Column C in Master Db is completely empty save Headers. Therefore, I Have tried to run your code excluding one of the columns, or run it even with condition based on three columns(F,G,M) (Please see bellow) but it would throw the same error message Runtime error 7 error, out of memory. It wont allow me to end procedure due to the error, except if all five columns are stated as condition.

Working with all five columns given as condition, the code either doesn't complete the task at all or it appears it transfers only very small amount of data. Does this has to do with the fact that in some columns there are positioned empty rows with no data especially in column A in either of Workbooks. It also can happen that in other columns F, G, M there are some empty rows. I didn't realize that before I spent hours ploughing through data.

I would want to test the code based on three conditions columns F,G,M,. Could you please just have a look at the code and advise me why it doesn't work with three condition only.
many thanks

Sorry for all the trouble

VBA Code:
Sub MergeDataWithAnArray()

  Dim wb1 As Workbook, wb2 As Workbook
  Dim sh1 As Worksheet, sh2 As Worksheet
  Dim i As Long, j As Long, k As Long, lr1 As Long, lc As Long, lr2 As Long, w As Long
  Dim MasterFile As String, ky As String
  Dim a As Variant, b As Variant, nRows As Variant
  Dim rg As Range, rng As Range
  Dim dic As Object
  Dim t As Double
  
  
    
  Application.ScreenUpdating = False
  t = Timer
  Set dic = CreateObject("Scripting.Dictionary")
  
  'Search sheet
  Set wb1 = ThisWorkbook
  Set sh1 = wb1.Worksheets("HarshilsData")
  lr1 = sh1.Cells.Find("*", , xlValues, xlPart, xlByRows, xlPrevious, False).Row
  lc = sh1.Cells.Find("*", , xlValues, xlPart, xlByColumns, xlPrevious, False).Column
  a = sh1.Range("A2", sh1.Cells(lr1, lc)).Value
  'Master sheet
'  MasterFile = Environ("UserProfile") & "\Desktop\H\Master DatabaseFinal25112022.xlsx"
'  If Dir(MasterFile) = vbNullString Then
'    MsgBox "Folder Path for Master Db does not exist!", vbCritical, "Folder Does not exist!"
'    Exit Sub
'  End If
'  Set wb2 = Workbooks.Open(MasterFile)
    
  Set wb2 = Workbooks("Master DatabaseFinal25112022 - Copy.xlsx")
  Set sh2 = wb2.Worksheets("RevisedMasterDB")
  lr2 = sh2.Cells.Find("*", , xlValues, xlPart, xlByRows, xlPrevious, False).Row
  b = sh2.Range("A2", sh2.Cells(lr2, lc)).Value
  
  
  'read matrix 'b' Master sheet. key columns: F, G, C, A, M
  For i = 1 To UBound(b, 1)
    ky = b(i, 6) & "|" & b(i, 7) & "|" & b(i, 3) & "|" & b(i, 1) & "|" & b(i, 13)
    'ky = b(i, 6) & "|" & b(i, 7) & "|" & b(i, 1) & "|" & b(i, 13) 'my attempt to reduce columns given as condition
    'ky = b(i, 6) & "|" & b(i, 7) & "|" & b(i, 13) 'my attempt to reduce columns given as condition
    dic(ky) = dic(ky) & "|" & i
  Next
  
  Stop
  
  'read matrix 'a' Search sheet
  Set rng = sh2.Range("BX" & lr2 + 1).Resize(1, 141)
  For i = 1 To UBound(a, 1)
    ky = a(i, 6) & "|" & a(i, 7) & "|" & a(i, 3) & "|" & a(i, 1) & "|" & a(i, 13)
    'ky = a(i, 6) & "|" & a(i, 7) & "|" & a(i, 1) & "|" & a(i, 13)
    'ky = a(i, 6) & "|" & a(i, 7) & "|" & a(i, 13)
    If dic.Exists(ky) Then
      nRows = Split(dic(ky), "|")
      'Stop
      For k = 1 To UBound(nRows)
        w = nRows(k)
        For j = Columns("O").Column To Columns("P").Column
          b(w, j) = a(i, j)
        Next
        For j = Columns("BX").Column To Columns("HH").Column
          b(w, j) = a(i, j)
        Next
        Set rng = Union(rng, sh2.Range("BX" & w + 1).Resize(1, 141))
      Next
    End If
  Next
  
  
  Set dic = Nothing 'trying to save some memory
  Set rng = Nothing
  Set rg = Nothing
  
  If w > 0 Then 'if all (all 5) columns are given as condition w is at this line 0, it jumps to End if. I have only realized this after testing the code many times.
    wb2.Activate
    sh2.Range("A2").Resize(UBound(b, 1), UBound(b, 2)).Value = b 'here it throes the error, Run time error 7, when I try to run the code with condition based on three columns.
    rng.Interior.Color = vbYellow
    sh2.Range("BX" & lr2 + 1).Resize(1, 141).Interior.Color = xlNone
  End If
  wb1.Activate
  Application.ScreenUpdating = True
  MsgBox "Time : " & Timer - t & " sec"
End Sub
 
Upvote 0
In post #6 and #12 I recommended you to do a test with a few records and that you share those records and show me which one is not completing the information, but you haven't.
In my tests it works with 5 columns, now I tell you that my test works with 3 columns.
I attach my test files so you can see how it works. The 3 columns match and the information is completed in the master book.

Try my files, the macro goes in the "HData with macro" file
Then select 5 records of your data and put it in my files (controlled test) and try the macro again.

Hdata
Master
 
Upvote 0
I prepared another code to update the master file in groups of 10,000 records, and in that way, try to avoid the out of memory error.
Try the following code. With 40,000 records on the master sheet, the process took 14 seconds. Test on a controlled file for you to analyze the results.
Note: The key is 3 columns.

VBA Code:
Sub MergeData_with_array_v2()
  Dim wb1 As Workbook, wb2 As Workbook
  Dim sh1 As Worksheet, sh2 As Worksheet
  Dim i As Long, j As Long, k As Long, g As Long, gg As Long, qqq As Long, gq As Long
  Dim lr1 As Long, lc As Long, lr2 As Long, w As Long
  
  Dim MasterFile As String, ky As String
  Dim a As Variant, b As Variant, nRows As Variant
  Dim rg As Range, rng As Range
  Dim dic As Object
  Dim t As Double
    
  Application.ScreenUpdating = False
  t = Timer
  Set dic = CreateObject("Scripting.Dictionary")
  
  'Search sheet
  Set wb1 = ThisWorkbook
  Set sh1 = wb1.Worksheets("HarshilsData")
  lr1 = sh1.Cells.Find("*", , xlValues, xlPart, xlByRows, xlPrevious, False).Row
  lc = sh1.Cells.Find("*", , xlValues, xlPart, xlByColumns, xlPrevious, False).Column
  a = sh1.Range("A2", sh1.Cells(lr1, lc)).Value2
    
  'Master sheet
'  MasterFile = Environ("UserProfile") & "\Desktop\H\Master DatabaseFinal25112022.xlsx"
'  If Dir(MasterFile) = vbNullString Then
'    MsgBox "Folder Path for Master Db does not exist!", vbCritical, "Folder Does not exist!"
'    Exit Sub
'  End If
'  Set wb2 = Workbooks.Open(MasterFile)
  Set wb2 = Workbooks("Master DatabaseFinal25112022")
  Set sh2 = wb2.Worksheets("RevisedMasterDB")
  lr2 = sh2.Cells.Find("*", , xlValues, xlPart, xlByRows, xlPrevious, False).Row
  b = sh2.Range("A2", sh2.Cells(lr2, lc)).Value2

  'read matrix 'a' Search sheet. key columns: F, G, M
  For i = 1 To UBound(a, 1)
    ky = a(i, 6) & "|" & a(i, 7) & "|" & a(i, 13)
    If ky <> "" Then dic(ky) = dic(ky) & "|" & i
  Next
  
  'read matrix 'b' Master sheet
  'groups 10,000 records
  gg = 2
  qqq = 10000
  For g = 1 To UBound(b, 1) Step qqq
    If g + qqq > UBound(b, 1) Then gq = UBound(b, 1) Else gq = g + qqq
    For i = g To gq
      ky = b(i, 6) & "|" & b(i, 7) & "|" & b(i, 13)
      If dic.exists(ky) Then
        w = Split(dic(ky), "|")(1)
        For j = Columns("O").Column To Columns("P").Column
          b(i, j) = a(w, j)
        Next
        For j = Columns("BX").Column To Columns("HH").Column
          b(i, j) = a(w, j)
        Next
      End If
    Next
    
    If w > 0 Then
      sh2.Range("A" & gg).Resize(qqq, UBound(b, 2)).Value = b
      'rng.Interior.Color = vbYellow
      'sh2.Range("BX" & lr2 + 1).Resize(1, 141).Interior.Color = xlNone
      gg = gg + qqq
      DoEvents
    End If
  Next
  
  Application.ScreenUpdating = True
  MsgBox "Time : " & Timer - t & " sec"
End Sub
 
Upvote 0
Solution
I prepared another code to update the master file in groups of 10,000 records, and in that way, try to avoid the out of memory error.
Try the following code. With 40,000 records on the master sheet, the process took 14 seconds. Test on a controlled file for you to analyze the results.
Note: The key is 3 columns.

VBA Code:
Sub MergeData_with_array_v2()
  Dim wb1 As Workbook, wb2 As Workbook
  Dim sh1 As Worksheet, sh2 As Worksheet
  Dim i As Long, j As Long, k As Long, g As Long, gg As Long, qqq As Long, gq As Long
  Dim lr1 As Long, lc As Long, lr2 As Long, w As Long
 
  Dim MasterFile As String, ky As String
  Dim a As Variant, b As Variant, nRows As Variant
  Dim rg As Range, rng As Range
  Dim dic As Object
  Dim t As Double
   
  Application.ScreenUpdating = False
  t = Timer
  Set dic = CreateObject("Scripting.Dictionary")
 
  'Search sheet
  Set wb1 = ThisWorkbook
  Set sh1 = wb1.Worksheets("HarshilsData")
  lr1 = sh1.Cells.Find("*", , xlValues, xlPart, xlByRows, xlPrevious, False).Row
  lc = sh1.Cells.Find("*", , xlValues, xlPart, xlByColumns, xlPrevious, False).Column
  a = sh1.Range("A2", sh1.Cells(lr1, lc)).Value2
   
  'Master sheet
'  MasterFile = Environ("UserProfile") & "\Desktop\H\Master DatabaseFinal25112022.xlsx"
'  If Dir(MasterFile) = vbNullString Then
'    MsgBox "Folder Path for Master Db does not exist!", vbCritical, "Folder Does not exist!"
'    Exit Sub
'  End If
'  Set wb2 = Workbooks.Open(MasterFile)
  Set wb2 = Workbooks("Master DatabaseFinal25112022")
  Set sh2 = wb2.Worksheets("RevisedMasterDB")
  lr2 = sh2.Cells.Find("*", , xlValues, xlPart, xlByRows, xlPrevious, False).Row
  b = sh2.Range("A2", sh2.Cells(lr2, lc)).Value2

  'read matrix 'a' Search sheet. key columns: F, G, M
  For i = 1 To UBound(a, 1)
    ky = a(i, 6) & "|" & a(i, 7) & "|" & a(i, 13)
    If ky <> "" Then dic(ky) = dic(ky) & "|" & i
  Next
 
  'read matrix 'b' Master sheet
  'groups 10,000 records
  gg = 2
  qqq = 10000
  For g = 1 To UBound(b, 1) Step qqq
    If g + qqq > UBound(b, 1) Then gq = UBound(b, 1) Else gq = g + qqq
    For i = g To gq
      ky = b(i, 6) & "|" & b(i, 7) & "|" & b(i, 13)
      If dic.exists(ky) Then
        w = Split(dic(ky), "|")(1)
        For j = Columns("O").Column To Columns("P").Column
          b(i, j) = a(w, j)
        Next
        For j = Columns("BX").Column To Columns("HH").Column
          b(i, j) = a(w, j)
        Next
      End If
    Next
   
    If w > 0 Then
      sh2.Range("A" & gg).Resize(qqq, UBound(b, 2)).Value = b
      'rng.Interior.Color = vbYellow
      'sh2.Range("BX" & lr2 + 1).Resize(1, 141).Interior.Color = xlNone
      gg = gg + qqq
      DoEvents
    End If
  Next
 
  Application.ScreenUpdating = True
  MsgBox "Time : " & Timer - t & " sec"
End Sub
Code works brilliantly,

Many thanks for all the help you provided for my troublemaking spreadsheets. I am going through your videos on VBA Dictionary, do please keep posting them! I must get acquainted with this object and get behind the logic, especially with arrays.
Was it for you difficult to learn VBA Dictionary Object at the beginning?

All the best
 
Upvote 0
Code works brilliantly
Im glad to help you, thanks for the feed back.

Could you comment how long the macro takes with your complete files?


Was it for you difficult to learn VBA Dictionary Object at the beginning?
Yes, at first it was complicated, I had to constantly go to Google to review examples to know how to use it. But with practice I already know the properties and methods, that's why I prepared the videos on youtube so that others can understand it easier.
 
Upvote 0
Im glad to help you, thanks for the feed back.

Could you comment how long the macro takes with your complete files?



Yes, at first it was complicated, I had to constantly go to Google to review examples to know how to use it. But with practice I already know the properties and methods, that's why I prepared the videos on youtube so that others can understand it easier.
Hi,

I think it was not longer than fifteen seconds, I haven't look at the Timer always as I was going though you master piece trying to understand logic behind it, as the former being of secondary interest as opposed to the code functionality itself, but It was fast and end result is for me more than satisfyingly, all data there were they are supposed to be, really amazing.
Do please post more videos on your you tube channel regarding VBA, especially arrays and Scripting.Dictionary, they are very educational, and very informative, and one can learn Spanish language too as additional beneficial add-in.

Thank you.

All the best.

Denin
 
Upvote 0

Forum statistics

Threads
1,223,214
Messages
6,170,771
Members
452,353
Latest member
strainu

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top