Maggie Barr
Board Regular
- Joined
- Jan 28, 2014
- Messages
- 188
Hello, and thank you in advance for providing input or advice if you can.
I have a data set of almost 800,000 rows with multiple columns. I currently have to split the large file into smaller files to get it into ArcMap so I can spatially join the records, export the results back to Excel, and then rejoin the data by bringing the newly acquired variables into the original set using VLOOKUP.
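For anyone unfamiliar with the workflow, the VLOOKUP rejoin step is essentially a key-based merge. A minimal sketch of the same idea in Python's pandas (the column names here are made up for illustration, not my real fields), which has no worksheet row limit:

```python
import pandas as pd

# Stand-ins for the original records (real data is ~800,000 rows)
records = pd.DataFrame({
    "record_id": [1, 2, 3],
    "species": ["A", "B", "C"],
})

# Stand-ins for the variables exported back out of ArcMap
# after the spatial join
arcmap_vars = pd.DataFrame({
    "record_id": [1, 2, 3],
    "county": ["Knox", "Waldo", "Knox"],
})

# A left merge on the shared key plays the role of VLOOKUP:
# every original record keeps its row and picks up the new columns
joined = records.merge(arcmap_vars, on="record_id", how="left")
print(joined)
```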
I then run this data through a series of procedures, combining some of the variables and creating new ones according to certain parameters, so I can flag and subsequently filter the records down to only what we consider valid by our criteria. I have done this in Excel 2013 with a series of formulas, VLOOKUPs, and pivots. I am pretty happy with the process, but the number of records in the next download will exceed what Excel can handle, and that data will then be added to the initial data set as well. I would prefer not to have to open the data in Access just to split it into smaller fragments and run each piece through my Excel procedures, some of which need the whole data set to work. I am using Excel 2013, and unfortunately the tutorial I was following at Lynda.com covered Power Query in Excel 2016; I cannot find the same features in the 2013 version to follow along, nor do I know whether that program will actually work for my circumstances.
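The flag-and-filter procedures described above follow the same shape regardless of tool: compute a validity flag from several fields, then keep only the flagged rows. A hedged sketch in pandas, with entirely made-up criteria and column names standing in for the real text, date, and locality fields:

```python
import pandas as pd

# Stand-in data; the real criteria combine text, date, and locality fields
df = pd.DataFrame({
    "record_id": [1, 2, 3, 4],
    "obs_date": pd.to_datetime(
        ["2013-05-01", "2013-06-15", "2014-01-10", "2013-07-04"]
    ),
    "locality": ["Knox", "Unknown", "Waldo", "Knox"],
})

# Flag records meeting the (hypothetical) validity criteria:
# observed in 2013 and with a known locality
df["valid"] = (df["obs_date"].dt.year == 2013) & (df["locality"] != "Unknown")

# Filter down to just the valid records
valid_records = df[df["valid"]]
print(len(valid_records))
```

Because the flag is computed vectorized over the whole table, the full data set stays together instead of being split into Excel-sized fragments.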
I have never really worked in Access, nor used Power Query, but I am looking for advice on which program may work best for what I need to do with an already large and continually growing data set. It would be nice if I could use one of these programs to run the processes I have already figured out, but I would rather put my time into learning the one that best fits my needs. This is scientific data, not business data, so I am not running accounting functions; I use a combination of text and numeric fields (dates, etc.), along with geographic localities, to determine the validity of the records. I am in this for the long haul, and I know I need to branch out and learn a program that can handle the amount of data I will be processing. The question is: which one?
Any thoughts and advice would be appreciated.
Thank you!
Maggie