Hi friends,
Long time lurker, seldom poster here.
Was wondering if someone could help me with my data management problems.
Unfortunately, I have two limited sets of data that reflect the same thing but the data when compared to one another aren't consistent. I wish to find similarities between Data 1 to Data 2 and match Data 1's unique value to Data 2..
Background/More information
I have two data sets. Each data set has a street name, dimensions of a pipe and each pipe has their own unique ID.
Data 1 has 'Street name', 'Dimensions' and 'Unique ID 1'
Data 2 has 'Street name', 'Dimensions' and 'Unique ID 2'
(note data 2 is slightly more accurate)
Data 1
[TABLE="class: grid, width: 500"]
<tbody>[TR]
[TD]Street_names[/TD]
[TD]Dimensions_1[/TD]
[TD]Unique_ID_1[/TD]
[/TR]
[TR]
[TD]1 Acton St[/TD]
[TD]8.4888[/TD]
[TD]101[/TD]
[/TR]
[TR]
[TD]1 Acton St[/TD]
[TD]26.0242[/TD]
[TD]102[/TD]
[/TR]
[TR]
[TD]1 Acton St[/TD]
[TD]4.4035[/TD]
[TD]103[/TD]
[/TR]
[TR]
[TD]13 Acton St[/TD]
[TD]9.3643[/TD]
[TD]104[/TD]
[/TR]
[TR]
[TD]13 Acton St[/TD]
[TD]65.9077[/TD]
[TD]105[/TD]
[/TR]
[TR]
[TD]14 Acton St[/TD]
[TD]18.4123[/TD]
[TD]106[/TD]
[/TR]
[TR]
[TD]20 Acton St[/TD]
[TD]58.6933[/TD]
[TD]107[/TD]
[/TR]
[TR]
[TD]23 Acton St[/TD]
[TD]9.0825[/TD]
[TD]108[/TD]
[/TR]
[TR]
[TD]23 Acton St[/TD]
[TD]5.6296[/TD]
[TD]109[/TD]
[/TR]
</tbody>[/TABLE]
Data 2
[TABLE="class: grid, width: 500"]
<tbody>[TR]
[TD]Street_names[/TD]
[TD]Dimensions_2[/TD]
[TD]Unique_ID_2[/TD]
[/TR]
[TR]
[TD]1 Acton St[/TD]
[TD]4.39[/TD]
[TD]1[/TD]
[/TR]
[TR]
[TD]1 Acton St[/TD]
[TD]7.99[/TD]
[TD]2[/TD]
[/TR]
[TR]
[TD]11 Acton St[/TD]
[TD]7.87[/TD]
[TD]3[/TD]
[/TR]
[TR]
[TD]11 Acton St[/TD]
[TD]52.58[/TD]
[TD]4[/TD]
[/TR]
[TR]
[TD]14 Acton St[/TD]
[TD]18.21[/TD]
[TD]5[/TD]
[/TR]
[TR]
[TD]22 Acton St[/TD]
[TD]31.71[/TD]
[TD]6[/TD]
[/TR]
[TR]
[TD]22 Acton St[/TD]
[TD]39.43[/TD]
[TD]7[/TD]
[/TR]
</tbody>[/TABLE]
My desired outcome (integrate data)
[TABLE="class: grid, width: 500"]
<tbody>[TR]
[TD]Street_names[/TD]
[TD]Dimensions_2[/TD]
[TD]Unique_ID_2[/TD]
[TD]Unique_ID_1[/TD]
[/TR]
[TR]
[TD]1 Acton St[/TD]
[TD]4.39[/TD]
[TD]1[/TD]
[TD]103[/TD]
[/TR]
[TR]
[TD]1 Acton St[/TD]
[TD]7.99[/TD]
[TD]2[/TD]
[TD]101[/TD]
[/TR]
[TR]
[TD]11 Acton St[/TD]
[TD]7.87[/TD]
[TD]3[/TD]
[TD]#N/A[/TD]
[/TR]
[TR]
[TD]11 Acton St[/TD]
[TD]52.58[/TD]
[TD]4[/TD]
[TD]105[/TD]
[/TR]
[TR]
[TD]14 Acton St[/TD]
[TD]18.21[/TD]
[TD]5[/TD]
[TD]106[/TD]
[/TR]
[TR]
[TD]22 Acton St[/TD]
[TD]31.71[/TD]
[TD]6[/TD]
[TD]#N/A[/TD]
[/TR]
[TR]
[TD]22 Acton St[/TD]
[TD]39.43[/TD]
[TD]7[/TD]
[TD]#N/A[/TD]
[/TR]
[TR]
[TD]23 Acton St[/TD]
[TD]9.4[/TD]
[TD]8[/TD]
[TD]108[/TD]
[/TR]
[TR]
[TD]23 Acton St[/TD]
[TD]9.47[/TD]
[TD]9[/TD]
[TD]#N/A[/TD]
[/TR]
[TR]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[/TR]
</tbody>[/TABLE]
Below is a snippet of the actual data set. Columns A to C is Data 1 and Column H to J is Data 2.
Columns E & F are the columns I've been using Index Match functions to attempt to match data 2 to data 1 (but it's not quite correct). The formulas I have used is shown below and it only matches similarities between the dimension columns of the two data sets. I wish I could make it more accurate by incorporating street addresses into the formula aswell.
Cell E2 has formula, {=INDEX($B$2:$B$23,MATCH(SMALL(ABS($B$2:$B$23-I2),ROW($B$1)),IF(COUNTIF($E$1:E1,$B$2:$B$23)<countif($b$2:$b$23,$b$2:$b$23),abs($b$2:$b$23-i2),"a"),0))}
Cell F2 has formula, {=INDEX($C$2:$C$24,MIN(IF((E2=$B$2:$B$24)*(COUNTIFS($E$1:E1,$B$2:$B$24,$F$1:F1,$C$2:$C$24)<countifs($b$2:$b$24,$b$2:$b$24,$c$2:$c$24,$c$2:$c$24)),match(row($b$2:$b$24),row($b$2:$b$24)),"a")))}
[TABLE="class: grid, width: 1232"]
<tbody>[TR]
[TD="align: center"]A[/TD]
[TD="align: center"]B[/TD]
[TD="align: center"]C[/TD]
[TD="align: center"]D[/TD]
[TD="align: center"]E[/TD]
[TD="align: center"]F[/TD]
[TD="align: center"]G[/TD]
[TD="align: center"]H[/TD]
[TD="align: center"]I[/TD]
[TD="align: center"]J[/TD]
[/TR]
[TR]
[TD]Street name 1[/TD]
[TD] Dimensions 1[/TD]
[TD] ID 1[/TD]
[TD][/TD]
[TD]Closest values to Data 2 from Data 1:[/TD]
[TD] ID 1 Adjacent[/TD]
[TD][/TD]
[TD]GIS Search value:[/TD]
[TD]Dimensions 2[/TD]
[TD]Street name 2[/TD]
[/TR]
[TR]
[TD]1 Acton Street[/TD]
[TD="align: right"]8.488800[/TD]
[TD]10710436[/TD]
[TD][/TD]
[TD="align: right"]4.2809[/TD]
[TD]10712536[/TD]
[TD][/TD]
[TD="align: right"]2800[/TD]
[TD="align: right"]4.31[/TD]
[TD]1 Acton ST[/TD]
[/TR]
[TR]
[TD]1 Acton Street[/TD]
[TD="align: right"]26.024200[/TD]
[TD]10710536[/TD]
[TD][/TD]
[TD="align: right"]8.4888[/TD]
[TD]10710436[/TD]
[TD][/TD]
[TD="align: right"]2814[/TD]
[TD="align: right"]7.99[/TD]
[TD]1 Acton ST[/TD]
[/TR]
[TR]
[TD]1 Acton Street[/TD]
[TD="align: right"]4.403500[/TD]
[TD]10710636[/TD]
[TD][/TD]
[TD="align: center"]#N/A[/TD]
[TD="align: center"]#N/A[/TD]
[TD][/TD]
[TD="align: right"]2805[/TD]
[TD="align: right"]7.87[/TD]
[TD]11 Acton ST[/TD]
[/TR]
[TR]
[TD]13 Acton Street[/TD]
[TD="align: right"]9.364300[/TD]
[TD]10710736[/TD]
[TD][/TD]
[TD="align: right"]54.0157[/TD]
[TD]10712436[/TD]
[TD][/TD]
[TD="align: right"]2804[/TD]
[TD="align: right"]52.58[/TD]
[TD]11 Acton ST[/TD]
[/TR]
[TR]
[TD]13 Acton Street[/TD]
[TD="align: right"]65.907700[/TD]
[TD]10710836[/TD]
[TD][/TD]
[TD="align: right"]18.4123[/TD]
[TD]10710936[/TD]
[TD][/TD]
[TD="align: right"]2821[/TD]
[TD="align: right"]18.21[/TD]
[TD]14 Acton ST[/TD]
[/TR]
[TR]
[TD]14 Acton Street[/TD]
[TD="align: right"]18.412300[/TD]
[TD]10710936[/TD]
[TD][/TD]
[TD="align: right"]32.819[/TD]
[TD]10711736[/TD]
[TD][/TD]
[TD="align: right"]2822[/TD]
[TD="align: right"]31.71[/TD]
[TD]22 Acton ST[/TD]
[/TR]
[TR]
[TD]20 Acton Street[/TD]
[TD="align: right"]12.346800[/TD]
[TD]10711036[/TD]
[TD][/TD]
[TD="align: center"]#N/A[/TD]
[TD="align: center"]#N/A[/TD]
[TD][/TD]
[TD="align: right"]2823[/TD]
[TD="align: right"]39.43[/TD]
[TD]22 Acton ST[/TD]
[/TR]
[TR]
[TD]20 Acton Street[/TD]
[TD="align: right"]58.693300[/TD]
[TD]10711136[/TD]
[TD][/TD]
[TD="align: right"]4.4035[/TD]
[TD]10710636[/TD]
[TD][/TD]
[TD="align: right"]2811[/TD]
[TD="align: right"]4.45[/TD]
[TD]23 Acton ST[/TD]
[/TR]
[TR]
[TD]23 Acton Street[/TD]
[TD="align: right"]9.082500[/TD]
[TD]10711436[/TD]
[TD][/TD]
[TD="align: right"]9.3643[/TD]
[TD]10710736[/TD]
[TD][/TD]
[TD="align: right"]2810[/TD]
[TD="align: right"]9.4[/TD]
[TD]23 Acton ST[/TD]
[/TR]
[TR]
[TD]23 Acton Street[/TD]
[TD="align: right"]5.629600[/TD]
[TD]10711236[/TD]
[TD][/TD]
[TD="align: center"]#N/A[/TD]
[TD="align: center"]#N/A[/TD]
[TD][/TD]
[TD="align: right"]2809[/TD]
[TD="align: right"]9.47[/TD]
[TD]23 Acton ST[/TD]
[/TR]
[TR]
[TD]23 Acton Street[/TD]
[TD="align: right"]10.573400[/TD]
[TD]10711336[/TD]
[TD][/TD]
[TD="align: right"]65.9077[/TD]
[TD]10710836[/TD]
[TD][/TD]
[TD="align: right"]2806[/TD]
[TD="align: right"]65.96[/TD]
[TD]23 Acton ST[/TD]
[/TR]
</tbody>[/TABLE]
Many thanks for reading this long post! Would really appreciate some help.
Thanks!
Andrew</countifs($b$2:$b$24,$b$2:$b$24,$c$2:$c$24,$c$2:$c$24)),match(row($b$2:$b$24),row($b$2:$b$24)),"a")))}
</countif($b$2:$b$23,$b$2:$b$23),abs($b$2:$b$23-i2),"a"),0))}
Long time lurker, seldom poster here.
Was wondering if someone could help me with my data management problems.
Unfortunately, I have two limited sets of data that reflect the same thing but the data when compared to one another aren't consistent. I wish to find similarities between Data 1 to Data 2 and match Data 1's unique value to Data 2..
Background/More information
I have two data sets. Each data set has a street name, dimensions of a pipe and each pipe has their own unique ID.
Data 1 has 'Street name', 'Dimensions' and 'Unique ID 1'
Data 2 has 'Street name', 'Dimensions' and 'Unique ID 2'
(note data 2 is slightly more accurate)
Data 1
[TABLE="class: grid, width: 500"]
<tbody>[TR]
[TD]Street_names[/TD]
[TD]Dimensions_1[/TD]
[TD]Unique_ID_1[/TD]
[/TR]
[TR]
[TD]1 Acton St[/TD]
[TD]8.4888[/TD]
[TD]101[/TD]
[/TR]
[TR]
[TD]1 Acton St[/TD]
[TD]26.0242[/TD]
[TD]102[/TD]
[/TR]
[TR]
[TD]1 Acton St[/TD]
[TD]4.4035[/TD]
[TD]103[/TD]
[/TR]
[TR]
[TD]13 Acton St[/TD]
[TD]9.3643[/TD]
[TD]104[/TD]
[/TR]
[TR]
[TD]13 Acton St[/TD]
[TD]65.9077[/TD]
[TD]105[/TD]
[/TR]
[TR]
[TD]14 Acton St[/TD]
[TD]18.4123[/TD]
[TD]106[/TD]
[/TR]
[TR]
[TD]20 Acton St[/TD]
[TD]58.6933[/TD]
[TD]107[/TD]
[/TR]
[TR]
[TD]23 Acton St[/TD]
[TD]9.0825[/TD]
[TD]108[/TD]
[/TR]
[TR]
[TD]23 Acton St[/TD]
[TD]5.6296[/TD]
[TD]109[/TD]
[/TR]
</tbody>[/TABLE]
Data 2
[TABLE="class: grid, width: 500"]
<tbody>[TR]
[TD]Street_names[/TD]
[TD]Dimensions_2[/TD]
[TD]Unique_ID_2[/TD]
[/TR]
[TR]
[TD]1 Acton St[/TD]
[TD]4.39[/TD]
[TD]1[/TD]
[/TR]
[TR]
[TD]1 Acton St[/TD]
[TD]7.99[/TD]
[TD]2[/TD]
[/TR]
[TR]
[TD]11 Acton St[/TD]
[TD]7.87[/TD]
[TD]3[/TD]
[/TR]
[TR]
[TD]11 Acton St[/TD]
[TD]52.58[/TD]
[TD]4[/TD]
[/TR]
[TR]
[TD]14 Acton St[/TD]
[TD]18.21[/TD]
[TD]5[/TD]
[/TR]
[TR]
[TD]22 Acton St[/TD]
[TD]31.71[/TD]
[TD]6[/TD]
[/TR]
[TR]
[TD]22 Acton St[/TD]
[TD]39.43[/TD]
[TD]7[/TD]
[/TR]
</tbody>[/TABLE]
My desired outcome (integrate data)
[TABLE="class: grid, width: 500"]
<tbody>[TR]
[TD]Street_names[/TD]
[TD]Dimensions_2[/TD]
[TD]Unique_ID_2[/TD]
[TD]Unique_ID_1[/TD]
[/TR]
[TR]
[TD]1 Acton St[/TD]
[TD]4.39[/TD]
[TD]1[/TD]
[TD]103[/TD]
[/TR]
[TR]
[TD]1 Acton St[/TD]
[TD]7.99[/TD]
[TD]2[/TD]
[TD]101[/TD]
[/TR]
[TR]
[TD]11 Acton St[/TD]
[TD]7.87[/TD]
[TD]3[/TD]
[TD]#N/A[/TD]
[/TR]
[TR]
[TD]11 Acton St[/TD]
[TD]52.58[/TD]
[TD]4[/TD]
[TD]105[/TD]
[/TR]
[TR]
[TD]14 Acton St[/TD]
[TD]18.21[/TD]
[TD]5[/TD]
[TD]106[/TD]
[/TR]
[TR]
[TD]22 Acton St[/TD]
[TD]31.71[/TD]
[TD]6[/TD]
[TD]#N/A[/TD]
[/TR]
[TR]
[TD]22 Acton St[/TD]
[TD]39.43[/TD]
[TD]7[/TD]
[TD]#N/A[/TD]
[/TR]
[TR]
[TD]23 Acton St[/TD]
[TD]9.4[/TD]
[TD]8[/TD]
[TD]108[/TD]
[/TR]
[TR]
[TD]23 Acton St[/TD]
[TD]9.47[/TD]
[TD]9[/TD]
[TD]#N/A[/TD]
[/TR]
[TR]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[/TR]
</tbody>[/TABLE]
Below is a snippet of the actual data set. Columns A to C is Data 1 and Column H to J is Data 2.
Columns E & F are the columns I've been using Index Match functions to attempt to match data 2 to data 1 (but it's not quite correct). The formulas I have used is shown below and it only matches similarities between the dimension columns of the two data sets. I wish I could make it more accurate by incorporating street addresses into the formula aswell.
Cell E2 has formula, {=INDEX($B$2:$B$23,MATCH(SMALL(ABS($B$2:$B$23-I2),ROW($B$1)),IF(COUNTIF($E$1:E1,$B$2:$B$23)<countif($b$2:$b$23,$b$2:$b$23),abs($b$2:$b$23-i2),"a"),0))}
Cell F2 has formula, {=INDEX($C$2:$C$24,MIN(IF((E2=$B$2:$B$24)*(COUNTIFS($E$1:E1,$B$2:$B$24,$F$1:F1,$C$2:$C$24)<countifs($b$2:$b$24,$b$2:$b$24,$c$2:$c$24,$c$2:$c$24)),match(row($b$2:$b$24),row($b$2:$b$24)),"a")))}
[TABLE="class: grid, width: 1232"]
<tbody>[TR]
[TD="align: center"]A[/TD]
[TD="align: center"]B[/TD]
[TD="align: center"]C[/TD]
[TD="align: center"]D[/TD]
[TD="align: center"]E[/TD]
[TD="align: center"]F[/TD]
[TD="align: center"]G[/TD]
[TD="align: center"]H[/TD]
[TD="align: center"]I[/TD]
[TD="align: center"]J[/TD]
[/TR]
[TR]
[TD]Street name 1[/TD]
[TD] Dimensions 1[/TD]
[TD] ID 1[/TD]
[TD][/TD]
[TD]Closest values to Data 2 from Data 1:[/TD]
[TD] ID 1 Adjacent[/TD]
[TD][/TD]
[TD]GIS Search value:[/TD]
[TD]Dimensions 2[/TD]
[TD]Street name 2[/TD]
[/TR]
[TR]
[TD]1 Acton Street[/TD]
[TD="align: right"]8.488800[/TD]
[TD]10710436[/TD]
[TD][/TD]
[TD="align: right"]4.2809[/TD]
[TD]10712536[/TD]
[TD][/TD]
[TD="align: right"]2800[/TD]
[TD="align: right"]4.31[/TD]
[TD]1 Acton ST[/TD]
[/TR]
[TR]
[TD]1 Acton Street[/TD]
[TD="align: right"]26.024200[/TD]
[TD]10710536[/TD]
[TD][/TD]
[TD="align: right"]8.4888[/TD]
[TD]10710436[/TD]
[TD][/TD]
[TD="align: right"]2814[/TD]
[TD="align: right"]7.99[/TD]
[TD]1 Acton ST[/TD]
[/TR]
[TR]
[TD]1 Acton Street[/TD]
[TD="align: right"]4.403500[/TD]
[TD]10710636[/TD]
[TD][/TD]
[TD="align: center"]#N/A[/TD]
[TD="align: center"]#N/A[/TD]
[TD][/TD]
[TD="align: right"]2805[/TD]
[TD="align: right"]7.87[/TD]
[TD]11 Acton ST[/TD]
[/TR]
[TR]
[TD]13 Acton Street[/TD]
[TD="align: right"]9.364300[/TD]
[TD]10710736[/TD]
[TD][/TD]
[TD="align: right"]54.0157[/TD]
[TD]10712436[/TD]
[TD][/TD]
[TD="align: right"]2804[/TD]
[TD="align: right"]52.58[/TD]
[TD]11 Acton ST[/TD]
[/TR]
[TR]
[TD]13 Acton Street[/TD]
[TD="align: right"]65.907700[/TD]
[TD]10710836[/TD]
[TD][/TD]
[TD="align: right"]18.4123[/TD]
[TD]10710936[/TD]
[TD][/TD]
[TD="align: right"]2821[/TD]
[TD="align: right"]18.21[/TD]
[TD]14 Acton ST[/TD]
[/TR]
[TR]
[TD]14 Acton Street[/TD]
[TD="align: right"]18.412300[/TD]
[TD]10710936[/TD]
[TD][/TD]
[TD="align: right"]32.819[/TD]
[TD]10711736[/TD]
[TD][/TD]
[TD="align: right"]2822[/TD]
[TD="align: right"]31.71[/TD]
[TD]22 Acton ST[/TD]
[/TR]
[TR]
[TD]20 Acton Street[/TD]
[TD="align: right"]12.346800[/TD]
[TD]10711036[/TD]
[TD][/TD]
[TD="align: center"]#N/A[/TD]
[TD="align: center"]#N/A[/TD]
[TD][/TD]
[TD="align: right"]2823[/TD]
[TD="align: right"]39.43[/TD]
[TD]22 Acton ST[/TD]
[/TR]
[TR]
[TD]20 Acton Street[/TD]
[TD="align: right"]58.693300[/TD]
[TD]10711136[/TD]
[TD][/TD]
[TD="align: right"]4.4035[/TD]
[TD]10710636[/TD]
[TD][/TD]
[TD="align: right"]2811[/TD]
[TD="align: right"]4.45[/TD]
[TD]23 Acton ST[/TD]
[/TR]
[TR]
[TD]23 Acton Street[/TD]
[TD="align: right"]9.082500[/TD]
[TD]10711436[/TD]
[TD][/TD]
[TD="align: right"]9.3643[/TD]
[TD]10710736[/TD]
[TD][/TD]
[TD="align: right"]2810[/TD]
[TD="align: right"]9.4[/TD]
[TD]23 Acton ST[/TD]
[/TR]
[TR]
[TD]23 Acton Street[/TD]
[TD="align: right"]5.629600[/TD]
[TD]10711236[/TD]
[TD][/TD]
[TD="align: center"]#N/A[/TD]
[TD="align: center"]#N/A[/TD]
[TD][/TD]
[TD="align: right"]2809[/TD]
[TD="align: right"]9.47[/TD]
[TD]23 Acton ST[/TD]
[/TR]
[TR]
[TD]23 Acton Street[/TD]
[TD="align: right"]10.573400[/TD]
[TD]10711336[/TD]
[TD][/TD]
[TD="align: right"]65.9077[/TD]
[TD]10710836[/TD]
[TD][/TD]
[TD="align: right"]2806[/TD]
[TD="align: right"]65.96[/TD]
[TD]23 Acton ST[/TD]
[/TR]
</tbody>[/TABLE]
Many thanks for reading this long post! Would really appreciate some help.
Thanks!
Andrew</countifs($b$2:$b$24,$b$2:$b$24,$c$2:$c$24,$c$2:$c$24)),match(row($b$2:$b$24),row($b$2:$b$24)),"a")))}
</countif($b$2:$b$23,$b$2:$b$23),abs($b$2:$b$23-i2),"a"),0))}