Looking for suggestions with efficient comparing of two ranges

JeffGrant

Well-known Member
Joined
Apr 7, 2021
Messages
558
Office Version
  1. 365
Platform
  1. Windows
Hi All,

Because there are 4 times zones across Australia, I need to download data from an API every 10 minutes. This frequent download ensures that I have the most current data in my model.
The downloaded data is 12 columns wide and can easily be up to 200 rows deep.
At the moment, I simply append the latest data download to the previous downloads to create a master data set and all is good.
However, this very simplistic approach can easily be 10,000 rows+ once we start getting into late afternoon.
There are many examples of how to compare ranges looking for similarities and differences.
However, none of the examples I can find seem to suit my situation.
What I would like to do is compare the two ranges and then either ignore the current download if the data is the same or append to the master file if the data download is different....

I am look for some suggestions from your good selves on how best to handle this situation because my model is quite large and this type of comparison will chew up valuable time.
Because of the tight time frames attached to my model, using PQ is not really an option here, however, using an array would be.

Thanks for your help.
 

Excel Facts

Fastest way to copy a worksheet?
Hold down the Ctrl key while dragging tab for Sheet1 to the right. Excel will make a copy of the worksheet.
if all you are trying to detect is whether the two data sets are identical you could use the Checksum approach, which should be much faster to calculate then actually detecting the differences
 
Upvote 0
Solution
Could this be developed into the feature you need?

Neither of the functions (EncodeBase64 and Base64_HMACSHA256) are made by me, but unfortunately I don't remember their original author

0. Inside the VBE, Go to Tools -> References, then Select Microsoft XML, v6.0
1. The macro reads the data from sheet(1) a1.currentRegion
2. The macro does not compare it to anything. It just prints Hash to the immediate window, so...
2.1 It is worthwhile to make a list where the hash values are stored. In that case, only the value of the new data needs to be calculated.

You could probably skip the EncodeBase64 step, but I need it myself and I didn't start testing the problems that might come from skipping EncodeBase64.

VBA Code:
Option Explicit

Public Function Base64_HMACSHA256(ByVal sTextToHash As String, ByVal sSharedSecretKey As String)
    Dim asc As Object, enc As Object
    Dim TextToHash() As Byte
    Dim SharedSecretKey() As Byte
    Set asc = CreateObject("System.Text.UTF8Encoding")
    Set enc = CreateObject("System.Security.Cryptography.HMACSHA256")
 
    TextToHash = asc.Getbytes_4(sTextToHash)
    SharedSecretKey = asc.Getbytes_4(sSharedSecretKey)
    enc.Key = SharedSecretKey
 
    Dim bytes() As Byte
    bytes = enc.ComputeHash_2((TextToHash))
    Base64_HMACSHA256 = EncodeBase64(bytes)
    Set asc = Nothing
    Set enc = Nothing
End Function

Private Function EncodeBase64(ByRef arrData() As Byte) As String
 
    'Inside the VBE, Go to Tools -> References, then Select Microsoft XML, v6.0
    '(or whatever your latest is. This will give you access to the XML Object Library.)
 
    Dim objXML As MSXML2.DOMDocument60
    Dim objNode As MSXML2.IXMLDOMElement
 
    Set objXML = New MSXML2.DOMDocument60
 
    ' byte array to base64
    Set objNode = objXML.createElement("b64")
    objNode.DataType = "bin.base64"
    objNode.nodeTypedValue = arrData
    EncodeBase64 = objNode.Text
 
    Set objNode = Nothing
    Set objXML = Nothing
 
End Function

Sub TS_Hashing_Range()

Dim ws As Worksheet
Dim ReadDataRNG As Range

' ***** Here are the variables to define *****
Set ws = ThisWorkbook.Worksheets(1)
Set ReadDataRNG = ws.Range("a1").CurrentRegion
Dim Salt As String: Salt = "Some long text" ' Salt for hashing

Dim coT As Single: coT = Timer()

Dim DataARR As Variant
DataARR = ReadDataRNG.Value2

Dim HashRowSTR As String
Dim HashRowHASH As String

Dim DataARRRows As Long: DataARRRows = UBound(DataARR, 1)
Dim DataARRCols As Long: DataARRCols = UBound(DataARR, 2)

Dim HashARR As Variant
ReDim HashARR(1 To DataARRRows, 1 To 1)

Dim iR As Long
Dim iC As Long

Dim HashFinalSTR As String
Dim HashFinalHASH As String

' ***** Calculation of row Hash *****
For iR = 1 To DataARRRows
    HashRowSTR = ""
    
        For iC = 1 To DataARRCols
            HashRowSTR = HashRowSTR & DataARR(iR, iC)
        Next iC
        
    HashRowHASH = Base64_HMACSHA256(HashRowSTR, Salt)
    HashARR(iR, 1) = HashRowHASH
Next iR


' ***** Joining of DATA String from Row Hash *****
' This can be done at the same time as the rows are Hashed, but for testing I prefer to do them separately.
Dim i As Variant

For i = 1 To DataARRRows
    HashFinalSTR = HashFinalSTR & HashARR(i, 1)
Next i

' ***** Calculation of DATA Hash from HashFinalSTR *****
HashFinalHASH = Base64_HMACSHA256(HashFinalSTR, Salt)

' ***** This is the checksum of the range *****
Debug.Print HashFinalHASH
Debug.Print Timer() - coT ' Time spent calculating hash values.
End Sub



My apologies for any quirks, English is not my native language. "So I blame Google translate for all my goofs." :devilish:
 
Upvote 0
I forgot something, that was to tell the purpose of the macro...
🫣


The macro calculates the hash from each row in range into the array and from array it calculates hash for the entire range.
 
Upvote 0
My thoughts/suggestions are:
1. You have a workbook with a "master data set" sheet and a "Hash" sheet with hashes calculated during the same day.
2. via the API, you import the data to the "Temp" sheet, for example.
3. You run the macro on the Temp sheet.
3.1 You compare the hash value you get with the values of the Hash sheet.
3.1.1 If a corresponding hash is found, then the Temp sheet is cleared.
3.1.2 If a corresponding hash value is not found then:
- copy the contents of the Temp sheet to the "master data set" sheet
- Adding a Hash value to the Hash sheet
- Clearing the Temp sheet

Is this a proposal at all according to your plan?
 
Upvote 0
Here is a newer version that makes it easier to test the approach I suggested in post #5.

The macro TS_Hashing_Range is designed to workbook with: Data, hash and Temp sheets

Temp sheet is an empty sheet where Api data is imported. (Col A to L)
- No headers (data start at row 1)
- The macro always clears the Temp sheet.

The data sheet has headers. (Col A to L)
- New data is added after the older data.

the hash sheet has headers. (HashValue,DateTime)
- The hash value is added to the first free row. (Col A)
- The creation time of the hash value is stored next to the hash value. (Col B)


VBA Code:
Option Explicit

Sub TS_Hashing_Range()

Dim wsTemp As Worksheet, wsData As Worksheet
Dim ReadDataRNG As Range

On Error GoTo ErrHand: Call TurnOffFeatures

' ***** Here are the variables to define *****
Set wsTemp = ThisWorkbook.Worksheets("Temp")
Set wsData = ThisWorkbook.Worksheets("Data")
Set ReadDataRNG = wsTemp.Range("a1").CurrentRegion
Dim Salt As String: Salt = "Some long text" ' Salt for hashing

Dim coT As Single: coT = Timer()

Dim DataARR As Variant
DataARR = ReadDataRNG.Value2

Dim HashRowSTR As String
Dim HashRowHASH As String

Dim DataARRRows As Long: DataARRRows = UBound(DataARR, 1)
Dim DataARRCols As Long: DataARRCols = UBound(DataARR, 2)

Dim HashARR As Variant
ReDim HashARR(1 To DataARRRows, 1 To 1)

Dim iR As Long
Dim iC As Long

Dim HashFinalSTR As String
Dim HashFinalHASH As String

' ***** Calculation of row Hash *****
For iR = 1 To DataARRRows
    HashRowSTR = ""
    
        For iC = 1 To DataARRCols
            HashRowSTR = HashRowSTR & DataARR(iR, iC)
        Next iC
        
    HashRowHASH = Base64_HMACSHA256(HashRowSTR, Salt)
    HashARR(iR, 1) = HashRowHASH
Next iR


' ***** Joining of DATA String from Row Hash *****
' This can be done at the same time as the rows are Hashed, but for testing I prefer to do them separately.
Dim i As Variant

For i = 1 To DataARRRows
    HashFinalSTR = HashFinalSTR & HashARR(i, 1)
Next i

' ***** Calculation of DATA Hash from HashFinalSTR *****
HashFinalHASH = Base64_HMACSHA256(HashFinalSTR, Salt)

' ***** Checking if Hash value exists *****
Dim Hash_Exists As Boolean
Hash_Exists = TS_Fu_CheckHash(HashFinalHASH)

Dim AddNewDataRNG As Range

    If Hash_Exists Then     ' The data has already been added
        wsTemp.Range("a1").CurrentRegion.Clear
    Else                    ' Data not exists
        Set AddNewDataRNG = wsData.Range("A" & wsData.Cells(Rows.Count, "A").End(xlUp).Row + 1).Resize(UBound(DataARR, 1), UBound(DataARR, 2))
        AddNewDataRNG.Value2 = DataARR
        wsTemp.Range("a1").CurrentRegion.Clear
    End If

' ***** This is the checksum of the range *****
Debug.Print HashFinalHASH
Debug.Print Timer() - coT ' Time spent calculating hash values.

ErrHand:
    Call TurnOnFeatures
    If Err.Number <> 0 Then MsgBox "Something went badly wrong!" & vbCrLf & vbCrLf & "Error number: " & Err.Number & " " & Err.Description

End Sub

Function TS_Fu_CheckHash(HashSTR As String)
Dim ws As Worksheet: Set ws = Worksheets("hash")
Dim dict As Object: Set dict = CreateObject("Scripting.Dictionary")
Dim HashRNG As Range, nextRowRNG As Range
Dim HashARR As Variant

Set HashRNG = ws.Range("A2:B" & ws.Cells(Rows.Count, "A").End(xlUp).Row)
HashARR = HashRNG.Value2

Dim iR As Long

    For iR = 1 To UBound(HashARR, 1)
            dict.Add HashARR(iR, 1), HashARR(iR, 2)
    Next iR

    If dict.Exists(HashSTR) Then
        TS_Fu_CheckHash = True
    Else
        Set nextRowRNG = ws.Range("A" & ws.Cells(Rows.Count, "A").End(xlUp).Row + 1).Resize(1, 2)
        nextRowRNG.Cells(1).Value = HashSTR
        nextRowRNG.Cells(2).Value = Now()
        TS_Fu_CheckHash = False
    End If

End Function

Public Function TurnOffFeatures()
    Application.Calculation = xlManual
    Application.EnableEvents = False
    Application.ScreenUpdating = False
    Application.DisplayStatusBar = False
End Function
Public Function TurnOnFeatures()
    Application.Calculation = xlAutomatic
    Application.EnableEvents = True
    Application.ScreenUpdating = True
    Application.DisplayStatusBar = True
End Function

Public Function Base64_HMACSHA256(ByVal sTextToHash As String, ByVal sSharedSecretKey As String)
    Dim asc As Object, enc As Object
    Dim TextToHash() As Byte
    Dim SharedSecretKey() As Byte
    Set asc = CreateObject("System.Text.UTF8Encoding")
    Set enc = CreateObject("System.Security.Cryptography.HMACSHA256")
 
    TextToHash = asc.Getbytes_4(sTextToHash)
    SharedSecretKey = asc.Getbytes_4(sSharedSecretKey)
    enc.Key = SharedSecretKey
 
    Dim bytes() As Byte
    bytes = enc.ComputeHash_2((TextToHash))
    Base64_HMACSHA256 = EncodeBase64(bytes)
    Set asc = Nothing
    Set enc = Nothing
End Function
Private Function EncodeBase64(ByRef arrData() As Byte) As String
 
    'Inside the VBE, Go to Tools -> References, then Select Microsoft XML, v6.0
    '(or whatever your latest is. This will give you access to the XML Object Library.)
 
    Dim objXML As MSXML2.DOMDocument60
    Dim objNode As MSXML2.IXMLDOMElement
 
    Set objXML = New MSXML2.DOMDocument60
 
    ' byte array to base64
    Set objNode = objXML.createElement("b64")
    objNode.DataType = "bin.base64"
    objNode.nodeTypedValue = arrData
    EncodeBase64 = objNode.Text
 
    Set objNode = Nothing
    Set objXML = Nothing
 
End Function


My apologies for any quirks, English is not my native language. "So I blame Google translate for all my goofs." :devilish:
 
Upvote 0

Forum statistics

Threads
1,224,828
Messages
6,181,201
Members
453,022
Latest member
RobertV1609

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top