Remove duplicates using SQL DISTINCT

tiredofit

Well-known Member
Joined
Apr 11, 2013
Messages
1,926
Office Version
  1. 365
  2. 2019
Platform
  1. Windows
This code is taken from here:

Code:
https://www.ozgrid.com/forum/forum/tip-tricks-code-no-questions/advanced-excel-integration/7960-create-a-unique-list-with-ado-sql-from-closed-workbooks-xl

It removes duplicates using the SQL command DISTINCT.

The problem is it works for a small column of data, say 10 values but for something like 300,000 values, it returns only a few thousand records, when the true value should be 150,000.

What is wrong?

Code:
Option Explicit
Sub Create_Unique_List()
Dim cnt As ADODB.Connection
Dim rst As ADODB.Recordset
Dim stCon As String, stSQL As String
Dim i As Long, lnMode As Long
Dim wbBook As Workbook
Dim wsSheet As Worksheet
Set wbBook = ThisWorkbook
Set wsSheet = wbBook.Worksheets("Sheet1")
stCon = "Provider=Microsoft.Jet.OLEDB.4.0;" _
        & "Data Source=" & wbBook.FullName & ";" _
        & "Extended Properties=""Excel 8.0;HDR=YES"";"
'The keyword DISTINCT generate a unique list in the SQL-statement."
'All the column in the worksheet Sheet1 have fieldnames in the first row
'which we all use here.
stSQL = "SELECT DISTINCT Data FROM [Sheet1$]"
Set cnt = New ADODB.Connection
Set rst = New ADODB.Recordset
cnt.Open stCon
rst.Open stSQL, cnt, adOpenStatic, adLockOptimistic
'A check to see that records actually exist.
If Not rst.EOF Then
With Application
.ScreenUpdating = False
'Collect the present calculation-mode.
lnMode = .Calculation
.Calculation = xlCalculationManual
'Add a new worksheet
Worksheets.Add Before:=wsSheet
'Copy the records to the added worksheet.
ActiveSheet.Cells(2, 1).CopyFromRecordset rst
'Reset the calculation-mode.
.Calculation = lnMode
.ScreenUpdating = True
End With
Else
MsgBox "No records could be find!", vbCritical
End If
'Cleaning up
If CBool(rst.State And adStateOpen) Then rst.Close
Set rst = Nothing
If CBool(cnt.State And adStateOpen) Then cnt.Close
Set cnt = Nothing
End Sub
 
You've got mixed data types in the same column, which really don't play well with ADO. Because the first several thousand rows are numbers, it then treats all the later non-numeric data as Null.

Thanks for the feedback.

So I suppose using a dictionary (or collection) is the preferable method.
 
Upvote 0

Excel Facts

Format cells as time
Select range and press Ctrl+Shift+2 to format cells as time. (Shift 2 is the @ sign).
Yes. To be honest I'd expect it to be faster anyway.
 
Upvote 0
could it be forced
SELECT
'''&ColA try to add a text leader to every one of those cells to prevent numbers being the default ?
From
Datatable
WHER
 
Upvote 0
No, that won't work. It will already be Null by that point.
 
Upvote 0
so this isn't pulling the records
stSQL = "SELECT DISTINCT Data FROM [Sheet1$]"

I use it, but not familiar with all the wrinkles
 
Upvote 0
Nope. It can't "see" any of the non-numeric records.
 
Upvote 0

Forum statistics

Threads
1,224,824
Messages
6,181,186
Members
453,020
Latest member
Mohamed Magdi Tawfiq Emam

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top