Access UNION query won't run, data limits with a small file and almost nothing in it???

Maggie Barr

Board Regular
Joined
Jan 28, 2014
Messages
188
Greetings, and thank you in advance if you can assist.
I am using Microsoft Office Pro 2019, so Access and Excel Power Query 2019, on a PC, 64 bit, 16 gigs ram, windows 10.

I have a database that choked when I tried to run a UNION query. I saved a copy, deleted any other information other than the two linked files I wanted to UNION, and tried to run the query and it still choked. The databse size is showing 512 KB, the two txt files that are linked (not physically loaded) are 1,245,342 KB (or 3,504,398 records) and 797 KB (or 1717 records). I know there is a database size cap of 2 GB, but the database itself isn't even close to that, and the files in it are only linked, so it seems to me that another query to run a UNION of the two (not loading them to a physical table) should still run. I am pursuing other routes to try to merge these two txt files, but I am having no luck with that. I was able to merge them fine in Excel's Power Query, but there is no way to export that query as a txt file like in Access, which is what I want to do so I can get the data into my next database. I have a series of databases I have had to split as the dataset has grown. This just seems like it should not be a problem considering the data being used is only linked.

Can anyone tell my if my scenario should actually be causing a data cap limit, or if there could be something else wrong. I already have compact on close, and clear cache on close, enabled. I have done the compact and repair database multiple times. If I am really at capacity, can anyone recommend how to merge two txt files, as I am past the notepad limit for such.

Again, thank you in advance if you can help, I am in a bad spot trying to jump this hurdle.
Sincerely,
Maggie Barr
 

Excel Facts

Whats the difference between CONCAT and CONCATENATE?
The newer CONCAT function can reference a range of cells. =CONCATENATE(A1,A2,A3,A4,A5) becomes =CONCAT(A1:A5)
why link them ?
just import them
then once you have two tables in access make sure the columns in each table are the same type
then edit the sql to UNION ALL instead of UNION

UNION checks for duplicate rows in the two tables, so that takes time
UNION ALL doesn't
 
Upvote 0
Could be a memory issue (run out of memory for processing, or evening storing the result - all that execution has to happen somewhere and that is a lot of data to manage). If the goal is to get the data into the next database, I'm not sure you need Union at all. Perhaps just load the first file, then load the next one? With 3.5M records I hope the "next database" isn't Access...

Agree with the above, that if the data can fit into Access then importing might be better than working with linked text files, and Union All is always better when you don't need to exclude duplicates.
 
Upvote 0
Thank you both for reaching out!
I cannot load the files because they are too big and result in 1.8 GB of data, so my workout around to handle this has always been to link the files (keeps the database file size small), split databases when needed, and do what I can with the limitations of the programs (file size) and memory capacity (16 GB) on the computer. I know there aren't duplicates in my data, but I hadn't thought about the UNION doing a check for duplicates with the union, and I used the UNION ALL, and it worked! Thank you for the advice on that.
Xenou, my next databse is an Access database. If you have advice as to other programs, I would appreciate it as I may indeed reach a point that splitting databases just won't suffice.
Thank you again!
Maggie
 
Upvote 0
How much data do you have overall? Are your linked files text files or linked Access databases?
 
Upvote 0
How much data do you have overall? Are your linked files text files or linked Access databases?
My files come to me as txt files, and they are "large", about 3 1/2 million records. I use quotations because that is small for some folks I would guess. That is the starting file, and then there are the files and fields I extract, join to ArcMap for spatial joins, bring back in via another txt file where I can bring in the spatial attributes back to the original records. There are a lot of data processes and manipulations, but whenever I reach capacity in an Access database, I find a location within it to split it, then export the output from that new database as a txt file to build and finish the subsequent needs in the next. I suppose I could try linking the output queries to the subsequent database, but I hadn't tried that. Then I take (actually link) output from those final Access databases into Excel Power Query to do further manipulations and analyses. It might not sound pretty or efficient, but having learned "on the go" with time constraints, I have managed thus far. I looked into SQL momentarily when I started, but I did not have the resources or time needed to go there. I will be investigating the R program soon enough as I worry my file will just get beyond a database splitting fix.
 
Upvote 0
There are many many databases (and other data munging tools), of which many don't have inherent limits (but may have practical or feasible limits). Are you using Access db's mainly for storage? How much processing (querying, updates, inserts, deletes) are taking place in Access? I see something about joining to ArcMap - is that the main job for Access, besides basic "storage" and "retrieval", or is there more?
 
Upvote 0
Upvote 0
There are many many databases (and other data munging tools), of which many don't have inherent limits (but may have practical or feasible limits). Are you using Access db's mainly for storage? How much processing (querying, updates, inserts, deletes) are taking place in Access? I see something about joining to ArcMap - is that the main job for Access, besides basic "storage" and "retrieval", or is there more?
There is a lot more. I have to run the data through a series of tests for quality parameters (acceptable observation date, breeding codes, species documented, what eBird portal it is entered into etc.) and accept or reject records to the final dataset according to the output, remove some data based on group Id's for shared records from different users, meaning identical records except that they are recorded by different people, but I also have to test that there aren't differences, like bird breeding codes were changed for one user to another, or that one user didn't record the record in a different location despite the "shared" record. On the other set of databases for the data (I call it a flagging database), I then extract all the "invalid" records for review as some might be accepted despite the parameters set (I produce the file, someone else makes that call). In this, I have to compare those records to the max breeding code value for each species within a block from the valid record databse (hence the ArcMap), and determine if the record has enough "value" to increase the breeding code for the species for the block so we are not reviewing all the records, but only the geographically relevant/important ones. I have another database that codes those results and allows for acceptance of invalid records into the first/valid record database mentioned and rejection of others. I do this through using that output as a lookup table within that database. As well, I keep track of the changes from year to year (Or monthly download to monthly download) for the "invalid" datasets (Atlas Portal & Non-Atlas Portal) as the records can change portal location in eBird which then modifies if the record is still invalid or if it is now accepted. There are a lot of manipulations and tests. It is very difficult to explain the process, but I've managed and it works. When I started the process, the dataset wasn't large, so I managed to accomplish it all in excel believe it or not, but the dataset grew...hence all the different programs and split databases. The nice thing was that I was able to use the first part done in excel to proof my process as I developed the Access and Power Query databases. I know that there are probably programs that would be better suited or more powerful, but without the skillsets or time to dive into those under the time constraints of an ongoing project, I haven't attempted to branch out.
 
Upvote 0
If cost is an issue, maybe look at something like MySQL: MySQL :: Why MySQL?
You can use it to house the data, and still use Access as the front-end (see here: MySQL :: MySQL Connector/ODBC Developer Guide :: 6.4.3 Using Microsoft Access as a Front-end to MySQL).
I had glanced at MySQL, and even downloaded it and tried a little, but the language and process is not something I am familiar with, and the learning curve would have been too great considering the time constraints, so I found my way....
 
Upvote 0

Forum statistics

Threads
1,225,761
Messages
6,186,893
Members
453,383
Latest member
SSXP

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top