Import external data - 126 million rows...

rdt

New Member
Joined
Feb 14, 2012
Messages
1
Is there a way of selectively importing data - have a csv file that is around 126 million rows and 3 columns. The third column will be a number between 0 up to around 1,000.

Is it possible to request that a row is only imported if, say, the value in the 3rd column for that row was less than 20? This should help ensure that the data selected fits onto one tab.

Alternatively is there a way of importing data to automatically chunk it into seperate tabs?

Usually like an Excel challenge but this one has me stumped.
 

Excel Facts

Copy a format multiple times
Select a formatted range. Double-click the Format Painter (left side of Home tab). You can paste formatting multiple times. Esc to stop
hi, you can treat the text file as a database and query it using SQL to return the rows, try the following, does your data have field names at the top, cos we might need to tweak the code below.

also what are the data types of the 3 fields, text, integer, date ???

Code:
'****
' Declare some global variables
'****

Public strfilename As String
Public StrPathToTextFile As String

Sub ParseDataFile()

'****
' Setup constants for the SQL query
'****

Const adOpenStatic = 3
Const adLockOptimistic = 3
Const adCmdText = &H1

'****
' Declare stuff
'****

Dim StrSQL As String

'****
' create objects that we need
'****

Set objconnection = CreateObject("ADODB.Connection")
Set objrecordset = CreateObject("ADODB.Recordset")

StrPathToTextFile = "C:\mydata\"
Strfilename = "file1.csv"

Call CreateSchema

'****
' All preparatory work done, lets begin the main process
'****

'****
' Open JET connection to our data file
'****

objconnection.Open "Provider=Microsoft.Jet.OLEDB.4.0;" & _
    "Data Source=" & StrPathToTextFile & ";" & _
        "Extended Properties=""text;HDR=NO;FMT=Delimited"""

'****
' Build SQL strings to get unique categories in no order
'****

StrSQL = "SELECT * FROM " & StrFileName
StrSQL = StrSQL & " WHERE F3 < 20"

'****
' Execute the SQL and get the results and store for next stage
'****

objrecordset.Open StrSQL, objconnection, adOpenStatic, adLockOptimistic, adCmdText

Range("A1").CopyFromRecordset objrecordset
Columns("A:I").EntireColumn.AutoFit

objrecordset.Close

End Sub
Sub CreateSchema()

Open StrPathToTextFile & "schema.ini" For Output As #1

Print #1, "[" & StrFileName & "]"
Print #1, "Format=Delimited(,)"
Print #1, "ColNameHeader = True"
Print #1, "MaxScanRows=0"
Print #1, "Col1=F1 Text"
Print #1, "Col2=F2 Text"
Print #1, "Col3=F3 Text"
Close #1

End Sub
 
Upvote 0

Forum statistics

Threads
1,221,310
Messages
6,159,173
Members
451,543
Latest member
cesymcox

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top