Help with schema for a gains roster

OasisUnknown

New Member
Joined
Oct 15, 2015
Messages
46
Hello All,

This is going to be more of a theoretical question this time but it is for a schema I am trying to build.

In the end what I think I need the system to do.
I have people incoming to my work at various times so I am building a gains roster.
I need to compare my last gains roster to the new data that I have received to
1. see if there are duplicates. (I can base my duplicate check off Social security number)
2. see if their gain date has changed and if it has then update the data to the new gain date. if not then its a pure duplicate record and I don't need it because I already have it.

after that check is done I need to compare it to my already existing alpha roster to see if that person has already arrived.
so basically check again on the socials but this time compare the data to my alpha roster rather than the already existing gains roster.

If they already exist on the alpha roster they need to be removed from the gains.

To throw one more wrench into the mix I also need to assign the people who are gains to a workforce within the company.
I don't want to loose this data in the mix.

so to recap here is what I believe my workflow will be.
1. upload my alpha roster into the database
2. upload my new gains roster into the database (the raw data)
(checks get done as described above)
3. I am left with a scrubbed gains roster.
4. I will then take that scrubbed gains roster and assign a workforce to the new people who are on the roster. (the ones without a workforce already assigned)


The parts that I am confused on how to tackle is how to not loose my workforce data and how to update just the gaining date.

(also note I have to do things in this round about way because I get all the raw data from different systems and just take that unchanged raw data and import it into my database for analysis. I have no access or control over how the raw data is presented. I just export it to excel and import that excel file into my database.)

Any thoughts or suggestions would be appreciated.
 

Excel Facts

Best way to learn Power Query?
Read M is for (Data) Monkey book by Ken Puls and Miguel Escobar. It is the complete guide to Power Query.
Generally speaking, you will need to make a master table that has the data that 'lives on'. In this would be the "gaining date", SSN and whatever other fields are 'master' data.

All other data would be transient. You just import it, clean it, apply any updates to the master, then throw it away.

This is very high level....so post back with the parts of this you need clarification on.

Mark
 
Upvote 0
Thanks for the reply mark,

I was on the same line of thinking and actually already have the tables built to hold the persistent data.
The part that I am getting most hung up on is comparing the dates from the old record to the new record.

I know I can use a duplicate data query to find duplicates but I am not sure how to say if this date is different on the new record then replace it on the old record.

I could probably accomplish this through VBA and looping through each record set and comparing the dates but I am trying to avoid vba as much as possible and let the database do the work.
 
Upvote 0
Hmmm. You may need to give an example.

You don't want to have duplicate SSNs on your persistent data, so there should not be any looking for dups there.

If you initially have the persistent data (and a date), and you have added your transient data (with SSN and possibly a new date), you would write an update query that joins the two tables using SSN. You would compare the old/new dates and replace if different.

Or, if you always want to take the transient date, if present, just add a criteria to the update looking for a valid date in the transient data. This will 'filter' your update query to only update the records that need updated (or possibly update with the same data...which will be OK in this case).

OK?
 
Upvote 0
Hmmm. You may need to give an example.

(I don't know how to insert images in this forum so I did my best with table layouts.)

In the table below I am showing mock data for the Persistant table it has elements for the persons Social, Name, the date they are gaining, and the workforce that they are assigned to.
SSNNAMEGain DateWorkforce
123456789Smith, John A.20160805A
234567891Jones, Tlyer20160807B
345678912Doe, John20160805B

This table below is the Raw data that I get from my separate database. It shows the Social, Name, and gain date. It has no concept of the workforce because that database does not care where they work. But the dates may change as shown in the example for Doe, John. Also additions may appear such as in the case of Thompson, Don.
SSNNAMEGain Date
123456789Smith, John A.20160805
234567891Jones, Tlyer20160807
345678912Doe, John20160907
456789123Thompson, Don20160915


This is the cleaned persistent Table after the raw data has been analyzed. You can see that I did not loose any workforce data, The dates have updated accordingly and the new soldier has been add and is waiting input from me to assign him a workforce.
SSNNAMEGain DateWorkforce
123456789Smith, John A.20160805A
234567891Jones, Tlyer20160807B
345678912Doe, John20160907B
456789123Thompson, Don20160915
 
Last edited:
Upvote 0
The next step would also be to check to see if the person already has arrived and then remove them off the roster because they have arrived but if I can figure out this first part that next part will be similar.
 
Upvote 0
If you initially have the persistent data (and a date), and you have added your transient data (with SSN and possibly a new date), you would write an update query that joins the two tables using SSN. You would compare the old/new dates and replace if different.

Also this sounds like what I am trying to do so I guess I am confused on how to write the update query.
I will do some research on my own two because I just realized in my mind I was getting confused between a union query and an update query.
 
Upvote 0
To get from point A to B, try this....

1) Use the Query Wizard and make a "Find Unmatched Query Wizard". Match SSNs, showing the records in your 'temp' that are not in the master. Save this query. These are the records that need to be added.

2) Make an append query using the query you just saved. Now you have added all the missing SSNs.

3) If the date data is ALWAYS good in the temp data, make an update query and overlay the master date with the data from the temp data.

Important to note that not all the master dates will be updated. Only the one where you have a match. Remember, you are matching on SSN.

Good luck!

Mark
 
Upvote 0
If you want to you can write an insert query, then an update query without worrying about existing records since (in principle) they are updated to what they already are so no worry - just insert away and update away and its done. If your table has unique keys on the key fields, then you can even insert with no worries also, because duplicates will be discarded automatically. Otherwise, as shown, you just query first for the new records, insert them, then update the rest. This just means separating the raw data into two groups and treating them differently - ones to insert, and ones to update. It should be easy to do and easy to verify it's done correctly since the result can be counted to make sure group 1 + group 2 = total number of records. From an audit standpoint, you may want to save the raw data just in case, and also you probably need to test the data for accuracy (are names changing for instance - do you want to keep those changes or examine them, or ignore them).
 
Last edited:
Upvote 0

Forum statistics

Threads
1,221,810
Messages
6,162,108
Members
451,743
Latest member
matt3388

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top