Remove duplicates using SQL DISTINCT

tiredofit

Well-known Member
Joined
Apr 11, 2013
Messages
1,926
Office Version
  1. 365
  2. 2019
Platform
  1. Windows
This code is taken from here:

Code:
https://www.ozgrid.com/forum/forum/tip-tricks-code-no-questions/advanced-excel-integration/7960-create-a-unique-list-with-ado-sql-from-closed-workbooks-xl

It removes duplicates using the SQL command DISTINCT.

The problem is it works for a small column of data, say 10 values but for something like 300,000 values, it returns only a few thousand records, when the true value should be 150,000.

What is wrong?

Code:
Option Explicit
Sub Create_Unique_List()
Dim cnt As ADODB.Connection
Dim rst As ADODB.Recordset
Dim stCon As String, stSQL As String
Dim i As Long, lnMode As Long
Dim wbBook As Workbook
Dim wsSheet As Worksheet
Set wbBook = ThisWorkbook
Set wsSheet = wbBook.Worksheets("Sheet1")
stCon = "Provider=Microsoft.Jet.OLEDB.4.0;" _
        & "Data Source=" & wbBook.FullName & ";" _
        & "Extended Properties=""Excel 8.0;HDR=YES"";"
'The keyword DISTINCT generate a unique list in the SQL-statement."
'All the column in the worksheet Sheet1 have fieldnames in the first row
'which we all use here.
stSQL = "SELECT DISTINCT Data FROM [Sheet1$]"
Set cnt = New ADODB.Connection
Set rst = New ADODB.Recordset
cnt.Open stCon
rst.Open stSQL, cnt, adOpenStatic, adLockOptimistic
'A check to see that records actually exist.
If Not rst.EOF Then
With Application
.ScreenUpdating = False
'Collect the present calculation-mode.
lnMode = .Calculation
.Calculation = xlCalculationManual
'Add a new worksheet
Worksheets.Add Before:=wsSheet
'Copy the records to the added worksheet.
ActiveSheet.Cells(2, 1).CopyFromRecordset rst
'Reset the calculation-mode.
.Calculation = lnMode
.ScreenUpdating = True
End With
Else
MsgBox "No records could be find!", vbCritical
End If
'Cleaning up
If CBool(rst.State And adStateOpen) Then rst.Close
Set rst = Nothing
If CBool(cnt.State And adStateOpen) Then cnt.Close
Set cnt = Nothing
End Sub
 
Thanks.

Added it in, re-ran.

It didn't crash but still only returned 40,000 records, instead of 150,000.
 
Upvote 0

Excel Facts

Formula for Yesterday
Name Manager, New Name. Yesterday =TODAY()-1. OK. Then, use =YESTERDAY in any cell. Tomorrow could be =TODAY()+1.
Daft question, are you sure your data is actually distinct? There are no extra columns or something
 
Upvote 0
The original data, with 300,000 values all in column A, contain duplicates.

Using dictionary / collection, it returns 150,000 unique values.

Using sql, it only returns 40,000 values.

Have modified my code to point to a closed workbook.

Still only returning 40,000 records.
 
Last edited:
Upvote 0
if you run it without DISTINCT, how many records are returned
 
Upvote 0
So are there rogue columns? Presumably your collection/dictionary is looking only at column A, your SQL may not be
 
Upvote 0
if you run it without DISTINCT, how many records are returned

It crashes on this line:

Code:
ActiveSheet.Cells(2, 1).CopyFromRecordset rst

with the message:

Code:
method of copyfromrecordset of object range failed

So are there rogue columns? Presumably your collection/dictionary is looking only at column A, your SQL may not be

no rogue columns.
 
Last edited:
Upvote 0
Are you expecting say aaa and AAA to be distinct?
 
Upvote 0
You've got mixed data types in the same column, which really don't play well with ADO. Because the first several thousand rows are numbers, it then treats all the later non-numeric data as Null.
 
Upvote 0

Forum statistics

Threads
1,224,824
Messages
6,181,186
Members
453,020
Latest member
Mohamed Magdi Tawfiq Emam

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top