How to extract XML from HTML, I think?

bubbapost

Board Regular
Joined
Mar 11, 2009
Messages
116
Hello,

I have a friend who is receiving a data dump in a format that is not easily imported into Excel. I think it's a mixture of HTML and XML. I would like to create a macro that will extract the XML from the other data.

Here is a sample:

<BODY style="BORDER-RIGHT: 0px; BORDER-TOP: 0px; BORDER-LEFT: 0px; BORDER-BOTTOM: 0px"><test this is the body of the text/BODY></BODY>'
'<BODY source_object_id= source_object_type=""><HEAD>
<META content="MSHTML 6.00.2900.5764" name=GENERATOR></HEAD>
<BODY style="BORDER-RIGHT: 0px; BORDER-TOP: 0px; BORDER-LEFT: 0px; BORDER-BOTTOM: 0px">.&nbsp;test this is text.</BODY></BODY>'
.1,'<BODY source_object_id="194713" source_object_type=""><HEAD>
<META content="MSHTML 6.00.2900.5803" name=GENERATOR></HEAD>
<BODY style="BORDER-RIGHT: 0px; BORDER-TOP: 0px; BORDER-LEFT: 0px; BORDER-BOTTOM: 0px">

Will anyone help me?

Thank you.
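
For illustration, here is a minimal sketch of one way the records in a dump like this might be separated. It is written in Python rather than as an Excel macro, and it rests on assumptions read off the sample alone: each record opens with an outer `<BODY source_object_id=...>` tag, the wanted text sits inside it, and the record ends at its last `</BODY>`. The names `extract_records` and `to_csv` are hypothetical helpers, not part of any library.

```python
from __future__ import annotations

import csv
import html
import io
import re

# Assumption (inferred from the sample only): every record opens with an
# outer <BODY source_object_id=...> tag and ends at its last </BODY>.
OUTER_RE = re.compile(r'<BODY\s+source_object_id=(?:"(?P<id>[^"]*)")?', re.IGNORECASE)
SPLIT_RE = re.compile(r'(?=<BODY\s+source_object_id=)', re.IGNORECASE)
CLOSE_RE = re.compile(r'</BODY>', re.IGNORECASE)
TAG_RE = re.compile(r'<[^>]*>')


def extract_records(dump: str) -> list[tuple[str, str]]:
    """Return (source_object_id, plain text) pairs found in the dump."""
    records = []
    # Split just before each outer tag so that each chunk holds one record.
    for chunk in SPLIT_RE.split(dump):
        outer = OUTER_RE.match(chunk)
        if outer is None:
            continue  # fragment before the first record, e.g. a stray quote
        # Drop whatever follows the last closing tag (quotes, separators).
        end = max((m.start() for m in CLOSE_RE.finditer(chunk)), default=len(chunk))
        # Replace every tag with a space, then decode entities such as &nbsp;.
        text = TAG_RE.sub(" ", chunk[:end])
        text = html.unescape(text).replace("\xa0", " ")
        # Collapse runs of whitespace; a blank id becomes an empty string.
        records.append((outer.group("id") or "", " ".join(text.split())))
    return records


def to_csv(records: list[tuple[str, str]]) -> str:
    """Render the records as CSV text that Excel can open."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["source_object_id", "text"])
    writer.writerows(records)
    return buffer.getvalue()
```

Reading the dump file into a string, passing it to `extract_records`, and saving the result of `to_csv` with a `.csv` extension should give a file Excel can open. One limitation: anything shaped like `<...>` is treated as a tag and discarded, so text that is itself wrapped in angle brackets would be lost.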
 
