I have a dataset of 100,000+ rows of claim values. I'm trying to find the top 10 modes for each state and print them on a subsequent tab with the value of the mode and frequency count. Here's some dummy data:
My output would ideally look like this for each state:
Happy to answer any questions, thanks in advance!
Sequence Number | State | Original Year | Sum of Total Payment |
1343306 | KY | 2014 | 17500 |
1343307 | VA | 2014 | 395000 |
1343310 | WV | 2014 | 995000 |
1343311 | FL | 2014 | 395000 |
1343312 | TX | 2014 | 175000 |
1343320 | NY | 2014 | 1250000 |
1343324 | CA | 2014 | 97500 |
1343332 | FL | 2014 | 125000 |
1343336 | FL | 2014 | 125000 |
1343337 | TN | 2014 | 625000 |
1343349 | FL | 2014 | 495000 |
1343351 | PA | 2014 | 145000 |
1343354 | FL | 2014 | 495000 |
My output would ideally look like this for each state:
State: | KY | ||
Most Frequent | Total Payment | Frequency | |
1 | 995,000 | 22 | |
2 | 250,000 | 19 | |
3 | 1,995,000 | 16 | |
4 | 300,000 | 13 | |
5 | 700,000 | 10 | |
6 | 900,000 | 8 | |
7 | 50,000 | 7 | |
8 | 36,000 | 5 | |
9 | 750,000 | 3 | |
10 | 655,000 | 2 |
Happy to answer any questions, thanks in advance!