In E:\Test folder, I have multiple INPUT text files. Each text file contains multiple datasets. Each dataset is a record that begins with the ID tag and ends with the ENDID tag. I want to extract the VALUES of each MAIN TAG in the file and put them into an EXCEL sheet for easier viewing. The SHEET will have a column each for the MAIN tags and the values for each tag in the succeeding rows.
To illustrate better, please find below a sample of TWO datasets for illustration. The datasets are separated from each other by a SPACE (like the one below)
The OUTPUT Excel Sheet would contain the following columns:
The values for these columns would be the values for these specific tags in the document. Each ID tag is UNIQUE for each dataset. The ENDID has the same value as the ID tag and has no specific use except that it indicates the end of the dataset.
Can anyone help get started with this? I have little knowledge with VBA. Any help would be greatly appreciated.
HTML:
<ID> TAG VALUE:
Value between <ID>[space] and [NEXT LINE]<AGE>
<AGE> TAG VALUE:
Value between <AGE>[space] and [NEXT LINE]<PUBDATE>
<PUBDATE> TAG VALUE:
Value between <PUBDATE>[space] and [NEXT LINE]<WRITER>
<WRITER> TAG VALUE:
Value between <WRITER>[space] and [NEXT LINE]<LANGUAGE>
<LANGUAGE> TAG VALUE
Value between <LANGUAGE>[space] and [NEXT LINE]<PAGE>
<PAGE> TAG VALUE:
Value between <PAGE>[space] and [NEXT LINE]<DRAWINGS>
<DRAWINGS> TAG VALUE:
Value between <DRAWINGS>[space] and [NEXT LINE]<TITLE>
<TITLE> TAG VALUE:
Value between <TITLE>[space] and [NEXT LINE]<INVENTION>
<INVENTION> TAG VALUE:
Value between <INVENTION>[space] and [NEXT LINE]<DESCRIPTION>
<DESCRIPTION> TAG VALUE:
Value between <DESCRIPTION>[space] and [NEXT LINE]<USES>
<USES> TAG VALUE:
Value between <USES>[space] and [NEXT LINE]<ADDITIONAL>
<ADDITIONAL> TAG VALUE:
Value between <ADDITIONAL>[space] and [NEXT LINE]<FIGURE>
<FIGURE> TAG VALUE:
Value between <FIGURE>[space] and [NEXT LINE]<ENDID>
To illustrate better, please find below a sample of TWO datasets for illustration. The datasets are separated from each other by a SPACE (like the one below)
HTML:
<ID> 1132234-122X3
<AGE> 23
<PUBDATE> 20101022
<WRITER> 6996
<LANGUAGE> GER
<PAGE> 31
<DRAWINGS> 1/5
<TITLE> Morphometric device for determining necessity of food ingestion
<INVENTION> This device includes <BOLD>housing</BOLD>, <BOLD>transparent tubes</BOLD>, <BOLD>light source</BOLD>, <BOLD>diffusing screen</BOLD>, photodetectors, video camera, interface connected to computer for processing information, and software specific of data processing.
<DESCRIPTION> A method of designing feeding strategy for marine and continental fish species.
<USES> For determining necessity of food ingestion of fish species.
<ADDITIONAL> The device allows designing feeding strategy for fish species by evaluating nutritional state and growth potential.
<FIGURE> The drawing is a side view of morphometric device.
<ENDID> 1132234-122X3
<ID> 1222235-122X1
<AGE> 23
<PUBDATE> 20101022
<WRITER> 6996
<LANGUAGE> GER
<PAGE> 31
<DRAWINGS> 1/5
<TITLE> Thermal solar collector for absorbing heat from atmosphere
<INVENTION> A thermal solar collector comprises a coil that circulates a <BOLD>water/glycol mixture</BOLD>, and a <BOLD>thermodynamic panel<BOLD> coupled to solar collector via separators, <BOLD>polyurethane panel or insulator</BOLD>. Two circuits are provided to optimize the solar energy collecting panel.
<DESCRIPTION> A thermal solar collector for absorbing heat from the atmosphere for air conditioning, and for producing hot water for swimming pools.
<USES> A thermal solar collector for absorbing heat from the atmosphere
<ADDITIONAL> The thermal solar collector has a great capacity to absorb solar heat, saves cost, and requires only little space for installation.
<FIGURE> The drawings are perspective views of the thermal solar collector.
<ENDID> 1222235-122X1
The OUTPUT Excel Sheet would contain the following columns:
HTML:
Column A: All <ID> values
Column B: All <AGE> values
Column C: All <PUBDATE> values
Column D: All <WRITER> values
Column E: All <LANGUAGE> values
Column F: All <PAGE> values
Column G: All <DRAWINGS> values
Column H: All <TITLE> values
Column I: All <INVENTION> values
Column J: All <DESCRIPTION> values
Column K: All <USES> values
Column L: All <ADDITIONAL> values
Column M: All <FIGURE> values
The values for these columns would be the values for these specific tags in the document. Each ID tag is UNIQUE for each dataset. The ENDID has the same value as the ID tag and has no specific use except that it indicates the end of the dataset.
Can anyone help get started with this? I have little knowledge with VBA. Any help would be greatly appreciated.