Remove duplicates using SQL DISTINCT

tiredofit

Well-known Member
Joined
Apr 11, 2013
Messages
1,935
Office Version
  1. 365
  2. 2019
Platform
  1. Windows
This code is taken from here:

Code:
https://www.ozgrid.com/forum/forum/tip-tricks-code-no-questions/advanced-excel-integration/7960-create-a-unique-list-with-ado-sql-from-closed-workbooks-xl

It removes duplicates using the SQL command DISTINCT.

The problem is it works for a small column of data, say 10 values but for something like 300,000 values, it returns only a few thousand records, when the true value should be 150,000.

What is wrong?

Code:
Option Explicit
Sub Create_Unique_List()
Dim cnt As ADODB.Connection
Dim rst As ADODB.Recordset
Dim stCon As String, stSQL As String
Dim i As Long, lnMode As Long
Dim wbBook As Workbook
Dim wsSheet As Worksheet
Set wbBook = ThisWorkbook
Set wsSheet = wbBook.Worksheets("Sheet1")
stCon = "Provider=Microsoft.Jet.OLEDB.4.0;" _
        & "Data Source=" & wbBook.FullName & ";" _
        & "Extended Properties=""Excel 8.0;HDR=YES"";"
'The keyword DISTINCT generate a unique list in the SQL-statement."
'All the column in the worksheet Sheet1 have fieldnames in the first row
'which we all use here.
stSQL = "SELECT DISTINCT Data FROM [Sheet1$]"
Set cnt = New ADODB.Connection
Set rst = New ADODB.Recordset
cnt.Open stCon
rst.Open stSQL, cnt, adOpenStatic, adLockOptimistic
'A check to see that records actually exist.
If Not rst.EOF Then
With Application
.ScreenUpdating = False
'Collect the present calculation-mode.
lnMode = .Calculation
.Calculation = xlCalculationManual
'Add a new worksheet
Worksheets.Add Before:=wsSheet
'Copy the records to the added worksheet.
ActiveSheet.Cells(2, 1).CopyFromRecordset rst
'Reset the calculation-mode.
.Calculation = lnMode
.ScreenUpdating = True
End With
Else
MsgBox "No records could be find!", vbCritical
End If
'Cleaning up
If CBool(rst.State And adStateOpen) Then rst.Close
Set rst = Nothing
If CBool(cnt.State And adStateOpen) Then cnt.Close
Set cnt = Nothing
End Sub
 

Excel Facts

Which Excel functions can ignore hidden rows?
The SUBTOTAL and AGGREGATE functions ignore hidden rows. AGGREGATE can also exclude error cells and more.
You shouldn't be doing it this way, there's a known bug that leads to a memory leak when querying the open workbook with ADO.

Is there a reason you can't use the remove duplicates excel feature?
 
Upvote 0
Thanks, I was aware that you shouldn't query open workbooks but would the code above work (or maybe adapted) for closed workbooks?

The built in remove duplicates function in Excel does not work for a large set of data, eg 300,000 values.

I've written code using dictionary or collection, which works but was curious to see if using sql distinct would work also.

So if I understand you correctly, sql distinct should work (certainly does in access) but not here because the workbook is open?
 
Upvote 0
It should work but you shouldn't be using the Jet 4 driver or specifying Excel 8 which is the old xls format, which was limited to 65536 rows.
 
Upvote 0
It should work but you shouldn't be using the Jet 4 driver or specifying Excel 8 which is the old xls format, which was limited to 65536 rows.

Should it be ACE and Excel 12?
 
Upvote 0
What format is the workbook?
 
Upvote 0
Excel 2016 .xlsm.

Tried this:

Rich (BB code):
Provider=Microsoft.ACE.OLEDB.12.0;Data Source=c:\myFolder\myfile.xlsm; Extended Properties="Excel 12.0 Macro;HDR=YES";

from:

Rich (BB code):
https://www.connectionstrings.com/excel/




but got an error message:

Rich (BB code):
[Microsoft][ODBC Driver Manager] Data source name not found and no default driver specified.
 
Last edited:
Upvote 0
Yes.

Even tried this to be safe:

Code:
stCon = " Microsoft.ACE.OLEDB.12.0;" _
        & "Data Source=" & wbBook.FullName & ";" _
        & "Extended Properties=""Excel 12.0;HDR=YES"";"
 
Upvote 0
You're missing the Provider= part.
 
Upvote 0

Forum statistics

Threads
1,225,750
Messages
6,186,808
Members
453,373
Latest member
Ereha

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top