I'm looking at a consortium of databases which are all representative of the country (little to no skew in each database). The adult population of the country is 48m. The databases, in order of size:
19m
12m
4m
4m
4m
3m
1m
0.5m
0.5m
I'm trying to calculate the net number of records I'm likely to achieve by matching these datasets together. Clearly it will be larger than 19m. What formula can I use to create a prediction?
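One candidate I can think of, as a rough sketch only: if each database is treated as an independent random sample of the 48m adults (an assumption on my part; in practice the databases are probably correlated and matching is imperfect), then the chance that a given adult is missing from every database is the product of (1 - size/population) over the databases, and the expected net count is the population times one minus that product. The function name `expected_union` is just my own label for illustration.

```python
from math import prod

def expected_union(sizes, population):
    """Expected number of distinct people across all databases.

    Assumes each database is an independent random sample of the same
    population (an assumption, not a verified property of the data).
    """
    # Probability that a given person appears in none of the databases
    p_missing_everywhere = prod(1 - s / population for s in sizes)
    return population * (1 - p_missing_everywhere)

# Database sizes and adult population, both in millions (from the question)
sizes_m = [19, 12, 4, 4, 4, 3, 1, 0.5, 0.5]
net_m = expected_union(sizes_m, 48)
print(f"Expected net records: {net_m:.2f}m")
```

Under the independence assumption this comes out at roughly 33m. If the databases tend to cover the same people (positive correlation), the true figure would be lower, so I'd read it as an optimistic estimate rather than a prediction.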