Automatically extracting data from a standard format Word file into Excel

lokaris2

New Member
Joined
Apr 13, 2015
Messages
2
Hello,

I'd like to understand how to extract specific elements from a Word file with a standardized format into an Excel spreadsheet.

Here's the Word file I've uploaded to Box... https://app.box.com/s/s262fcw0nv4jz8bjyij9ik8dzsqmpive

However, in case someone doesn't want to download the file for fear of a virus, I'll try to explain the content here:

- 1st page: a basic table of contents, generated by Word itself, listing the article titles.
- Next pages: full articles like this:

---

Huawei and China Unicom Sign Smart Home Gateway Development Agreement

405 words
24 March 2015
M2 Presswire
MTPW
English
© 2015, M2 Communications. All rights reserved.

Shenzhen, China - Huawei today announced it has signed a Smart Home Gateway Joint Development Framework Agreement with the Research Institute of China United Network Communications Limited (China Unicom). This marks the joint effort of both parties to innovate new techniques and standards for smart home networks, further enhancing their cooperation in this field.

Fiber-to-the-home (FTTH) networks bring greater service experience for end users. The China Unicom “Smart WO Home” business will make use of the FTTH high speed network and intelligent platform, a key component in the smart home network. The smart home gateway is the entry point to the household internet and provides users with a full range of smart home services, including home security, home appliance control, health monitoring and home entertainment. Through this partnership, China Unicom Research Institute and Huawei will jointly promote the innovation of the FTTH smart home gateway as the standard for smart services, providing “Smart WO Home” for users.

[...more content...]

Document MTPW000020150325eb3o003jy


---

That is, the elements of a typical article are:
1) Title (in bold font): Verizon hires Belkin WeMo exec to head smart home ops
2) Publication date: 26 March 2015
3) Source: Telecompaper Americas
4) Article first paragraph: Shenzhen, China - Huawei today announced it has signed a Smart Home Gateway Joint Development Framework Agreement with the Research Institute of China United Network Communications Limited (China Unicom). This marks the joint effort of both parties to innovate new techniques and standards for smart home networks, further enhancing their cooperation in this field.
5) Article further paragraphs / further body (can stretch to several pages)
6) Document ID number: Document TELAM00020150326eb3q0008d
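
To make the target concrete, here is a rough sketch of the parsing logic in Python (not the Excel macro itself). It assumes the Word file has first been saved as plain text, that every article ends with a "Document <ID>" line, that the title is the single line directly above the "NNN words" line, and that the body starts after the "©" copyright line. The names `parse_articles` and `to_csv` are made up for illustration, and the layout rules are inferred only from the sample article above, so they may not hold for every article.

```python
import csv
import io
import re

# Assumed layout markers, inferred from the sample article above.
WORDS_RE = re.compile(r"^\d[\d,]*\s+words$")   # e.g. "405 words"
DOC_ID_RE = re.compile(r"^Document\s+(\S+)$")  # e.g. "Document MTPW0000..."
FIELDS = ["title", "date", "source", "first_paragraph", "further_body", "doc_id"]


def parse_articles(text):
    """Split exported plain text into one dict per article (hypothetical helper)."""
    articles, block = [], []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # blank lines carry no information here
        match = DOC_ID_RE.match(line)
        if match is None:
            block.append(line)
            continue
        # A "Document <ID>" line closes the current article.
        idx = next((i for i, l in enumerate(block) if WORDS_RE.match(l)), None)
        if idx is not None and idx >= 1 and len(block) > idx + 2:
            # Body starts after the copyright line if there is one,
            # otherwise right after the source line.
            body_start = next(
                (i + 1 for i in range(idx + 1, len(block))
                 if block[i].startswith("©")),
                idx + 3,
            )
            body = block[body_start:]
            articles.append({
                # Taking only the line above "NNN words" skips any
                # table-of-contents lines that precede the first article.
                "title": block[idx - 1],
                "date": block[idx + 1],
                "source": block[idx + 2],
                "first_paragraph": body[0] if body else "",
                "further_body": "\n".join(body[1:]),
                "doc_id": match.group(1),
            })
        block = []
    return articles


def to_csv(articles):
    """Render the parsed articles as CSV text that Excel can open."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(articles)
    return buffer.getvalue()
```

Each article would become one spreadsheet row, with one column per element in the list above.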


I'd love to extract these elements into an Excel spreadsheet by using a macro - any thoughts welcome!

Thank you,
Lokaris
 
