Finding the most frequent combination

excelNewbie22

Well-known Member
Joined
Aug 4, 2021
Messages
531
Office Version
  1. 365
Platform
  1. Windows
hi!
how to find the most frequent combination of 1's and 0's (or just 1's), which bigger then 30,
could be 31 or 40, and etc....
it can skip columns, for instance, 30 and 39 and 40 to 42, or 1-9 and etc,
now, the data below is just a sample, the real data is way more big, like 2000 rows and 160 columns,
(which in this case i'll lookup to find combo which bigger then 120)
the possible combinations is soooooo biggggg, like hundres of millions,
i find it hard for excel to handle, but i thought maybe there's a clever way to approach it?

test
ABCDEFGHIJKLMNOPQRSTUVWXYZAAABACADAEAFAGAHAIAJAKALAMANAOAPAQARAS
1123456789101112131415161718192021222324252627282930313233343536373839404142434445
2111111111111111111111111111111111111111111111
3111111111111011111111111111111111111111111110
4111111111111111111111111111111111111111111111
5111111111111111111111111111111111110111111111
6111111111111111111111111111111111111111111111
7111111111111111111111111111111111111111111110
8111111111101111111111111110111111111111111111
9111111111111111111111111111111111111111111111
10111111111110111111111111111111111111111111111
11111111111111111110111111111111111011111111111
12111111111111111111111111111111111111111111111
13111111111101111111111111111111111110111111111
14111111111111111111111111111111111111111111111
15111111111101111111111111111111111111111111111
16111111111111111111111111111111111111111111111
17111111111111111111111111111111111111111111111
test
 
I asked the same question. I think that whatever they're trying to accomplish... There must be another way to better analyze the data. But, who knows?

Well... I left it running for several hours and it crashed. And that was with sample data of only 16 rows. So... yeah. Sorry. Don't think it's gonna happen.
 
Upvote 0

Excel Facts

Why are there 1,048,576 rows in Excel?
The Excel team increased the size of the grid in 2007. There are 2^20 rows and 2^14 columns for a total of 17 billion cells.
lrobbo314 - completely understand, from the beginning i thought it was a long shot, but i had to ask,
and really appreciate you trying in python too!
thanks for all the time and effort!
 
Upvote 0
Well... I ran it again and it worked.

I used the first 41 columns from the sample data you provided.

This is the Python code.

Python:
import pandas as pd
from itertools import combinations
df = pd.read_csv("data.csv", header=None)
dict = {}
for index, row in df.iterrows():
    lst = list(row)
    combos = list(combinations(lst, 34))
    for combo_tup in combos:
        combo = "".join([str(e) for e in combo_tup])
        if not combo in dict:
            dict[combo] = 1
        else:
            dict[combo] += 1
output = zip(dict.keys(),dict.values())
odf = pd.DataFrame(output)
odf.to_csv('output.csv')

And these were the results.

output.csv
AB
1CombinationFrequency
21111111111111111111111111111111111219,418,251
3111111111011111111111111111111111119,348,875
4111111111101111111111111111111111114,697,540
5111111110111111111111111111111111114,090,895
611111111111111111111111111111011119,680,680
711111111111111111111111111110111118,824,584
811111111111011111111111111111111116,094,036
911111110111111111111111111111111115,670,756
1011111111111111111111111111111101113,727,440
1111111111011111111111111111111011112,390,850
1211111111111111011111111111101111112,165,800
1311111111101111111111111111111011112,125,200
1411111111011111111111111111110111111,912,680
1511111111011111111111101111111111111,863,225
1611111111111110111111111111101111111,749,300
1711111111011111111111110111111111111,719,900
1811111111101111111111110111111111111,656,200
1911111111111111011111111111110111111,499,400
2011111111111111111111111111101111111,409,632
2111111111101111111111111111110111111,345,960
2211111111111101111111111111111111111,320,176
2311111111111111101111111111110111111,299,480
2411111111111111101111111111101111111,299,480
2511111110111111111111111111110111111,275,120
2611111111101111111111101111111111111,242,150
2711111101111111111111111111111111111,236,564
2811111110111111111111111111111011111,214,400
2911111111111111111111101111111111111,151,150
3011111110111111111111101111111111111,146,600
3111111111111110111111111111011111111,082,900
3211111111101111111111111111111101111,062,600
3311111111101111111111111011111111111,051,050
3411111111111111011111111111111111111,047,200
351111111111111110111111111111111111994,840
361111111111111101111111111101111111928,200
371111111101111111111111111111110111910,800
381111111111111111111111111101111111906,192
391111111101111111111101111111111111859,950
401111111111111111111111011111111111837,200
411111111011111111111101111111111111764,400
421111111111111011111111111111011111749,700
431111111111111111111101111111111111743,820
441111111101111111111111101111111111675,675
451111111111011111111111111111101111672,980
461111111011111111111111011111111111655,200
471111111111110111111111111110111111649,740
481111111111110111111111111101111111649,740
491111111111111111111111111111111011584,112
501111111111111011111111111111111111549,780
511111111111111110111111111111101111499,800
521111111111011111111111011111111111496,860
531111111111111111011111111111011111487,305
541111111111011111111111101111111111455,455
551111111111111111011111111111111111447,678
561111111111011111111111111111110111425,040
571111110111111111111111111111011111425,040
581111111110111111111101111111111111420,420
591111111111111110111111111101111111408,408
601111111111111111011111111110111111357,357
611111111111111101111111111111101111357,000
621111111111011111111111111111011111346,104
631111111011111111111111111111110111331,200
641111110111111111111101111111111111308,700
651111111111111111111111101111111111300,300
661111111110111111111111110111111111300,300
671111110111111111111111111111101111289,800
681111110111111111111110111111111111286,650
691111111111011111111110111111111111273,273
701111111111111111011111111111101111270,725
711111111111011111111111110111111111210,210
721111111110111111111111111111111011202,400
731111111111101111111111111101111111185,640
741111111111111111111011111111111111177,100
751111111011111111111011111111111111163,800
761111101111111111111111111111111111138,426
771111111101111111111011111111111111135,135
781111111111110111111111111111011111129,948
791111111101111111111111111111111011124,200
801111111011111111111111101111111111120,120
811111111111011111111111111111111011106,260
82111111011111111111101111111111111195,550
83111111110111111111111111011111111190,090
84111111111110111111111111111011111186,632
85111111111111111101111111110111111185,085
86111111111111101111111111111110111183,300
87111111011111111111111101111111111176,440
88111111111111111110111111111111111174,613
89111111111111111011111111111111011171,400
90111111111101111111110111111111111170,070
91111110111111111111111111111101111169,552
92111111111111111110111111111101111163,063
93111111111111111101111111111111011162,475
94111110111111111111110111111111111152,920
95111111011111111111111111111111011150,400
96111111111111111111111111011111111150,050
97111111111011111111101111111111111150,050
98111111111111111110111111111110111147,775
99111111111101111111111111101111111145,045
100111111111111111111111111111111110135,537
101111111111111111110111111111011111135,035
102111110111111111111111111111110111130,240
103111111111011111111111111101111111130,030
104111111101111111111111111111111101128,800
105111110111111111111101111111111111126,460
106111111111111110111111111111111011123,800
107111110111111111111111011111111111122,932
108111111111101111111111111110111111122,880
109111111111111111110111111111111011115,925
110111111111011111111111111111111110113,800
111111111111101111111111111111111110110,120
11211110111111111111111111111111111117,050
11311111111110111111110111111111111116,435
11411111111111111111011111111011111116,435
11511111111011111111111111111111111015,400
11611111111111111110111111111111110115,355
11711110111111111111111111111110111115,040
11811110111111111111110111111111111113,150
11911111111111111111111111110111111113,003
12011110111111111111111011111111111112,940
12111111111111111101111111111111110112,856
12211111011111111111111111111111101112,520
12311111111111111111011111111111110112,205
12411111101111111111111111111111110112,100
12511110111111111111111111111111011111,050
1261111111111111111111111111111111110629
1271111111011111111111111111111111101600
1281111111111011111111111111111111110276
1291111111110111111111111111111111110240
1301110111111111111111011111111111111120
1311110111111111111111111111111011111120
1321110111111111111111111111111111111120
1331111111111111111011111111111111101119
1341111111111111111101111111111111101105
135111111110111111111111111111111111045
13611111111111111111011111111111111101
output
 
Upvote 0
wow! truly!

yet, while still really appreciate all your effort,
the result aren't what i meant,

taking the second result for example: 1111111110111111111111111111111111
i need the know which corresponding columns making this combination, it could be
a2:d2 & f2:ah2 = 34 or a2 & d2 & f2 & g2:aj2 = 34 or any of the 22,000,000 combinations,

and then, only keep this combination (along with his corresponding columns) if the combination repeats more then 10% from entire range (in above sample, only 16 rows),
if need to i can provide larger sample for testing

for me it'll take some time to understand how to test and use it in python since i know nothing

again, many thanks for keep trying to help!
 
Upvote 0

Forum statistics

Threads
1,223,967
Messages
6,175,672
Members
452,666
Latest member
AllexDee

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top