Hi - I have a list of a about 150K line items, each with 5 cols as shown below. The Struct_ID (Col. A) represents a pipe and the rest of the cols are records of past inspections that have been done on that particular pipe. I'm looking to narrow down the list to include only the entries with the lastest inspection data and that include a Validated Condition Score. This would mean that, in my final list, there is only one entry for each Struct_ID. You'll note that currently, one Struct_ID could have multiple entries. Ex COM10046 (ln 32, 33 & 34) has three entries. These three lines represent inspections that were done in years 2019, 2014 and 2013. I'm hoping to extract the line with the lastest year, and that has an actual number in Col E "Validated Condition Score". In the case of COM10046, the "chosen" line would be L32 bc it is the latest year (2019) with an actual "Validated Condition Score".
There are probably several thousand occurences of Struct_ID's with multiple entries, but for all of them, there are no more than 10 entries for each 'multiple'. Many Struct_ID's do not have multiple entries. Single entry Struct_IDs (with a Validated Construction Score) are desired to be included in the final list. For multiple entries or single entries, I am only interested in those entries with a Validated Condition Score. Entries, single or multiple, with no Validated Condition Score do not need to be included. I envision the list to go from 150k entries to under 100k entries...
If it helps, it would be easy to construct another column and have a field generated 'toggle' ( "chosen" / "not chosen") for the chosen line item, and I can sort & extract the final list from that. It does not need to be dynamic...it is a one time operation. The final list should include all 5 original column data. The intent is for the final list to be used as a Vlookup table in another spreadsheet.
In the case of a multiple line entry of Struct ID that has the same Inspection Year, then the larger "Validated Condition Score" would get "chosen'.
Validated Condition Score include: 0,1,2,3,4,5. The Col includes "blanks", but these are nota valid score.
I believe I have covered everything, but I'll be online to provide additional clarity if required. Any help you could provide would be appreciated!
Thanks again!
Pete
There are probably several thousand occurences of Struct_ID's with multiple entries, but for all of them, there are no more than 10 entries for each 'multiple'. Many Struct_ID's do not have multiple entries. Single entry Struct_IDs (with a Validated Construction Score) are desired to be included in the final list. For multiple entries or single entries, I am only interested in those entries with a Validated Condition Score. Entries, single or multiple, with no Validated Condition Score do not need to be included. I envision the list to go from 150k entries to under 100k entries...
If it helps, it would be easy to construct another column and have a field generated 'toggle' ( "chosen" / "not chosen") for the chosen line item, and I can sort & extract the final list from that. It does not need to be dynamic...it is a one time operation. The final list should include all 5 original column data. The intent is for the final list to be used as a Vlookup table in another spreadsheet.
In the case of a multiple line entry of Struct ID that has the same Inspection Year, then the larger "Validated Condition Score" would get "chosen'.
Validated Condition Score include: 0,1,2,3,4,5. The Col includes "blanks", but these are nota valid score.
I believe I have covered everything, but I'll be online to provide additional clarity if required. Any help you could provide would be appreciated!
Thanks again!
Pete