Remove specific HTML part between two tags.

strooman

I want to completely remove everything between <header data-component="header"> and </header>, including the two tags themselves. The amount of text between the tags can vary.
How can I do that in VBA?

HTML:
<header data-component="header">
    <div>
        <h1>Example Domain</h1>
        <p>This domain is for use in illustrative examples in documents. You may use this
        domain in literature without prior coordination or asking for permission.</p>
        <p><a href="https://www.iana.org/domains/example">More information...</a></p>
    </div>
</header>

Below is a sample of the complete HTML page:

HTML:
<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>FancyBox Lightbox Example</title>
    <link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/fancybox/3.5.7/jquery.fancybox.min.css" />
</head>

<body>
    
    
    <header data-component="header">
    <div>
        <h1>Example Domain</h1>
        <p>This domain is for use in illustrative examples in documents. You may use this
        domain in literature without prior coordination or asking for permission.</p>
        <p><a href="https://www.iana.org/domains/example">More information...</a></p>
    </div>
    </header>
    
    
    <div class="gallery">
        <a href="2560.webp" data-fancybox="gallery" data-caption="Caption for image 1">
            <img src="2560.webp" alt="Image 1" width="250">
        </a>
    </div>

    <script src="https://cdnjs.cloudflare.com/ajax/libs/jquery/3.6.0/jquery.min.js"></script>
    <script src="https://cdnjs.cloudflare.com/ajax/libs/fancybox/3.5.7/jquery.fancybox.min.js"></script>
    <script>
        $(document).ready(function() {
            $('[data-fancybox="gallery"]').fancybox({
                // Options can be added here
            });
        });
    </script>
</body>
</html>
 

Try this:
VBA Code:
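Sub del_tag()
    Dim xHtml As String, x1 As String, x2 As String, Result As String
    xHtml = Range("A1").Value ' Change to your HTML source

    ' Split raises "Subscript out of range" if the closing tag is missing, so check first
    If InStr(xHtml, "</header>") = 0 Then Exit Sub

    ' Limit each Split to 2 parts so only the first occurrence of each tag is used
    x1 = Split(xHtml, "<header data-component=""header"">", 2)(0) ' the text before <header ...>
    x2 = Split(xHtml, "</header>", 2)(1) ' the text after </header>

    Result = x1 & x2
    Debug.Print Result ' show the cleaned HTML in the Immediate window
End Sub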
 
Solution
You're welcome.
I'm glad it works.

Note that my code only works when the HTML contains a single pair of <header data-component="header"> and </header> tags.
Also, avoid storing the HTML in an Excel cell (e.g. xHtml = Range("A1")), because a cell can hold at most 32,767 characters.
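
If either limitation matters, something along these lines may help. It is only a sketch, not part of the solution above: it loops with InStr so that every matching block is removed, and it reads the HTML from a file instead of a cell. The names RemoveTagBlocks, ReadTextFile and DemoRemoveHeader, and the path C:\temp\page.html, are made up for illustration. It also assumes the file is UTF-8 and that the header blocks are not nested inside each other.

```vba
Option Explicit

' Removes every block that starts with openTag and ends with closeTag (tags included).
' Assumes the blocks are not nested.
Function RemoveTagBlocks(ByVal html As String, ByVal openTag As String, ByVal closeTag As String) As String
    Dim p1 As Long, p2 As Long

    p1 = InStr(1, html, openTag, vbTextCompare)
    Do While p1 > 0
        p2 = InStr(p1 + Len(openTag), html, closeTag, vbTextCompare)
        If p2 = 0 Then Exit Do ' no closing tag after this opening tag: leave the rest untouched
        ' Keep the text before the opening tag and the text after the closing tag
        html = Left$(html, p1 - 1) & Mid$(html, p2 + Len(closeTag))
        p1 = InStr(p1, html, openTag, vbTextCompare) ' look for the next block from the cut point
    Loop

    RemoveTagBlocks = html
End Function

' Reads a whole text file into a string (assumes UTF-8), so the 32,767-character cell limit does not apply.
Function ReadTextFile(ByVal filePath As String) As String
    Dim stm As Object
    Set stm = CreateObject("ADODB.Stream") ' late binding, no reference needed
    With stm
        .Type = 2            ' adTypeText
        .Charset = "utf-8"
        .Open
        .LoadFromFile filePath
        ReadTextFile = .ReadText
        .Close
    End With
End Function

Sub DemoRemoveHeader()
    Dim html As String
    html = ReadTextFile("C:\temp\page.html") ' hypothetical path, change to your file
    html = RemoveTagBlocks(html, "<header data-component=""header"">", "</header>")
    Debug.Print Left$(html, 500) ' first part of the cleaned HTML in the Immediate window
End Sub
```

For a page with a single header block this should give the same result as del_tag. The difference only shows up when there are several blocks or when the HTML is too long for a cell.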
 