Hi all,
I have a large data set of training records, with one row per completion. The problem is that a person can have multiple rows, because they may have attempted the training more than once.
What I'm after, if possible, is some code that keeps only one row per person, based on the rules below. The Unique Person ID is in column W, the date is in column B, and the training status is in column T (either passed or failed).
1) If the person has any row with 'training status' = passed, keep only the most recent of those passed rows and delete all of that person's other rows
2) If the person has no row with 'training status' = passed, keep only the most recent row with 'training status' = failed and delete the rest
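To make the intended logic concrete, here is a rough sketch of the two rules in Python. It is not Excel code, only an illustration of the result I'm after. The function name and the keys `person_id`, `date` and `status` are made up for the example (they stand in for columns W, B and T), and it assumes the dates are real date values rather than text.

```python
from datetime import date


def keep_one_row_per_person(rows):
    """Return one row per person: the latest 'passed' row if any, else the latest 'failed' row.

    `rows` is a list of dicts with the hypothetical keys
    'person_id' (column W), 'date' (column B) and 'status' (column T).
    """
    best = {}  # person_id -> (ranking key, row)
    for row in rows:
        # Ranking key: a passed row always outranks a failed row (True > False),
        # and within the same status the later date wins.
        is_passed = row["status"].strip().lower() == "passed"
        key = (is_passed, row["date"])
        current = best.get(row["person_id"])
        if current is None or key > current[0]:
            best[row["person_id"]] = (key, row)
    return [row for _, row in best.values()]


# Made-up sample data: A1 passed once and failed again later, B2 never passed.
rows = [
    {"person_id": "A1", "date": date(2023, 1, 5), "status": "failed"},
    {"person_id": "A1", "date": date(2023, 3, 1), "status": "passed"},
    {"person_id": "A1", "date": date(2023, 6, 9), "status": "failed"},
    {"person_id": "B2", "date": date(2023, 2, 2), "status": "failed"},
    {"person_id": "B2", "date": date(2023, 4, 4), "status": "failed"},
]

kept = keep_one_row_per_person(rows)
for row in kept:
    print(row["person_id"], row["date"], row["status"])
```

In this sample, A1 keeps the passed row even though a failed attempt came later, and B2 keeps only the latest failed row. If two rows for the same person share the same status and date, the sketch keeps whichever appears first.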
Thanks for any advice you can provide.