bagadiamohit (New Member, joined May 11, 2013, 5 messages)
Hi guys,
I've got a serious problem here. Any kind of help is much appreciated!
I have two huge text files (130 MB each), each with thousands of records. I need to compare the two files, using VBA or any other means, and generate a spreadsheet that includes the header plus two additional columns. The first additional column should hold the file name, and the second should show which column(s) contain the mismatch. A record can have multiple discrepancies. One file can also contain records that cannot be found in the other file, and this condition should be recorded in the spreadsheet as well.
Example:
File 1 (taking one record from each file):
00000018063|112295|000|0005|0009|0013|1| | |Y| | |106| | |1| | | | | | | | | | | | |000822090|99996|000|112295|C| | | | |000000|00000|0|1264|112295|000003883|N|000|1272|00| |00000018063|N///
File 2:
00000018063|112295|000|0005|0013|0017|1| | |Y| | |106| | |1| | | | | | | | | | | | |000822090|99996|000|112295|C| | | | |000000|00000|0|1260|112295|000003883|N|000|1272|00| |00000018063|N///
In the example above, the two records come from the two different files. They differ in the 5th, 6th and 41st fields (DURATION_IN_MINUTES, MV_END_MOP and NY_ST_TIME). The output should look like this:
[TABLE="class: grid, width: 4780"]
<tbody>[TR]
[TD]HH_NUMBER[/TD]
[TD]CLASSIFICATION_DATE[/TD]
[TD]PERSON_CODE[/TD]
[TD]MV_MIN_OF_PGM[/TD]
[TD]DURATION_IN_MINUTES[/TD]
[TD]MV_END_MOP[/TD]
[TD]HH_TIME_ZONE[/TD]
[TD]DATE_CLOSED[/TD]
[TD]LIVE_PLAY_IND[/TD]
[TD]ACPM_SAMPLE_IND[/TD]
[TD]LAN[/TD]
[TD]ORIGIN[/TD]
[TD]MM_MKT_CODE[/TD]
[TD]MM_METRO_IND[/TD]
[TD]PERSON_SEX[/TD]
[TD]USAGE_IND[/TD]
[TD]HISP_SAMPLE_IND[/TD]
[TD]VISITOR_IND[/TD]
[TD]HOH_IND[/TD]
[TD]LOH_IND[/TD]
[TD]PARTTIME_WW_IND[/TD]
[TD]FULLTIME_WW_IND[/TD]
[TD]WW_IND[/TD]
[TD]ACN_JOB_CODE[/TD]
[TD]CENSUS_JOB_CODE[/TD]
[TD]EDUCATION[/TD]
[TD]LONG_TERM_VISITOR[/TD]
[TD]AGE[/TD]
[TD]ACN_PROG_CODE[/TD]
[TD]RPRT_TCAST_NO[/TD]
[TD]COMPLEX_PROG_NO[/TD]
[TD]RPRT_DATE[/TD]
[TD]NET_CODE_AL[/TD]
[TD]PRIOR_VIEW_IND[/TD]
[TD]PRIOR_EVEN_IND[/TD]
[TD]PRIOR_RECORD_IND[/TD]
[TD]LIVE_VIEW_IND[/TD]
[TD]ACN_EVENT_DATE[/TD]
[TD]PLAY_DELAY[/TD]
[TD]SDP_IND[/TD]
[TD]NY_ST_TIME[/TD]
[TD]NY_DATE[/TD]
[TD]VIEWING_WEIGHT[/TD]
[TD]PRIOR_LIVE_REC_IND[/TD]
[TD]DMA_CODE[/TD]
[TD]NY_END_TIME[/TD]
[TD]PLAYBACK_SOURCE[/TD]
[TD]EXTENDED_HOME_IND[/TD]
[TD]PRIMARY_HHLD_ID[/TD]
[TD]NPMH_SAMPLE_IND[/TD]
[TD]File Mismatch[/TD]
[TD]Mismatch Reason[/TD]
[/TR]
[TR]
[TD]00000012596[/TD]
[TD]112295[/TD]
[TD]000[/TD]
[TD]0010[/TD]
[TD]0006[/TD]
[TD]0015[/TD]
[TD]1[/TD]
[TD][/TD]
[TD][/TD]
[TD]Y[/TD]
[TD][/TD]
[TD][/TD]
[TD]106[/TD]
[TD][/TD]
[TD][/TD]
[TD]1[/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD]000199085[/TD]
[TD]99875[/TD]
[TD]000[/TD]
[TD]112295[/TD]
[TD]C[/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD]000000[/TD]
[TD]00000[/TD]
[TD]0[/TD]
[TD]1329[/TD]
[TD]112295[/TD]
[TD]000003492[/TD]
[TD]N[/TD]
[TD]000[/TD]
[TD]1334[/TD]
[TD]00[/TD]
[TD][/TD]
[TD]00000012596[/TD]
[TD]N///[/TD]
[TD]Media Events[/TD]
[TD]Mismatches in MV_MIN_OF_PGM AND MV_END_MOP[/TD]
[/TR]
[TR]
[TD]00000012596[/TD]
[TD]112295[/TD]
[TD]000[/TD]
[TD]0014[/TD]
[TD]0006[/TD]
[TD]0019[/TD]
[TD]1[/TD]
[TD][/TD]
[TD][/TD]
[TD]Y[/TD]
[TD][/TD]
[TD][/TD]
[TD]106[/TD]
[TD][/TD]
[TD][/TD]
[TD]1[/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD]000199085[/TD]
[TD]99875[/TD]
[TD]000[/TD]
[TD]112295[/TD]
[TD]C[/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD]000000[/TD]
[TD]00000[/TD]
[TD]0[/TD]
[TD]1329[/TD]
[TD]112295[/TD]
[TD]000003492[/TD]
[TD]N[/TD]
[TD]000[/TD]
[TD]1334[/TD]
[TD]00[/TD]
[TD][/TD]
[TD]00000012596[/TD]
[TD]N///[/TD]
[TD]PROL[/TD]
[TD]Mismatches in MV_MIN_OF_PGM AND MV_END_MOP[/TD]
[/TR]
[TR]
[TD]00000011861[/TD]
[TD]112295[/TD]
[TD]002[/TD]
[TD]0126[/TD]
[TD]0001[/TD]
[TD]0126[/TD]
[TD]1[/TD]
[TD][/TD]
[TD][/TD]
[TD]Y[/TD]
[TD][/TD]
[TD][/TD]
[TD]106[/TD]
[TD][/TD]
[TD]M[/TD]
[TD]1[/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD]032[/TD]
[TD]000092153[/TD]
[TD]99873[/TD]
[TD]002[/TD]
[TD]112295[/TD]
[TD]C[/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD][/TD]
[TD]000000[/TD]
[TD]00000[/TD]
[TD]0[/TD]
[TD]1110[/TD]
[TD]112295[/TD]
[TD]000003905[/TD]
[TD]N[/TD]
[TD]000[/TD]
[TD]1110[/TD]
[TD]00[/TD]
[TD][/TD]
[TD]00000011861[/TD]
[TD]N///[/TD]
[TD]Media Events[/TD]
[TD]Records present in Media Events file but missing in PROL file[/TD]
[/TR]
</tbody>[/TABLE]
The last two columns show which file the record comes from and the reason for the mismatch.
Any help is highly appreciated! Please try to help me out.
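A minimal sketch of one way the comparison described above could be done, written in Python rather than VBA and offered only as an illustration. It assumes that records can be matched on a set of key columns that is unique within each file (the post does not say which columns form the key, so `key_columns` is a hypothetical parameter), that fields are compared as exact strings, and that a CSV file opened in Excel is acceptable as the "spreadsheet". The function names `load_records` and `compare_files` are made up for this example.

```python
import csv


def load_records(path, key_idx):
    """Read a pipe-delimited file into a dict of {key tuple: list of fields}."""
    records = {}
    with open(path, newline="") as fh:
        for line in fh:
            line = line.rstrip("\r\n")
            if not line:
                continue  # skip blank lines
            fields = line.split("|")
            key = tuple(fields[i] for i in key_idx)
            # ASSUMPTION: keys are unique per file; a repeated key overwrites
            # the earlier record.
            records[key] = fields
    return records


def compare_files(path_a, path_b, out_path, header, key_columns,
                  name_a="Media Events", name_b="PROL"):
    """Write mismatched and unmatched records to a CSV; return the row count."""
    key_idx = [header.index(col) for col in key_columns]
    recs_a = load_records(path_a, key_idx)
    recs_b = load_records(path_b, key_idx)
    written = 0

    with open(out_path, "w", newline="") as out:
        writer = csv.writer(out)
        writer.writerow(header + ["File Mismatch", "Mismatch Reason"])

        for key, fields_a in recs_a.items():
            fields_b = recs_b.get(key)
            if fields_b is None:
                reason = (f"Records present in {name_a} file "
                          f"but missing in {name_b} file")
                writer.writerow(fields_a + [name_a, reason])
                written += 1
                continue
            # Names of the columns whose values differ between the two records.
            diff = [col for col, a, b in zip(header, fields_a, fields_b)
                    if a != b]
            if diff:
                reason = "Mismatches in " + " AND ".join(diff)
                writer.writerow(fields_a + [name_a, reason])
                writer.writerow(fields_b + [name_b, reason])
                written += 2

        # Records that exist only in the second file.
        for key, fields_b in recs_b.items():
            if key not in recs_a:
                reason = (f"Records present in {name_b} file "
                          f"but missing in {name_a} file")
                writer.writerow(fields_b + [name_b, reason])
                written += 1

    return written
```

It would be called with the 50 column names from the header row above as `header` and whichever columns identify a record as `key_columns`. Both files are held in memory as dictionaries, which should be workable at roughly 130 MB each. Note that Excel tends to drop leading zeros (for example in `00000018063`) when it opens a CSV directly, so the columns may need to be imported as text.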