Automate data export from multiple word tables into excel

andrewhoddie

Board Regular
Joined
Dec 21, 2008
Messages
118
Hi

I am looking to automate a process as best I can. I have a word document which has a number of different tables in it. These tables are used to collect data and can vary in size and have automated numbering which continues over the different tables.

The current process involves me completing the tables in word and then manually copying a number of rows of data from word into excel which is very time consuming. The data does need to be in multiple tables as we break the report down into different headings for the end user and we can present it so ideally the section does not go over a page break.

The word document is made up of a mix of text and tables and it is only certain tables I need to extract. Is it possible to take the data from each of these different tables and automatically populate one excel sheet?

Just to complicate things further I ideally only want to bring through data from Columns 1,3,4,5
Column 1 = auto numbering (number)
Column 3 = Recommendation (text)
Column 4 = rating (this is currently an icon)
Column 5 = rating size (text)

These would go into the following columns in excel

Column 1 - A
Column 3 - B
Column 4 - C
Column 5 - E

I dont know if this would also be possible to do but if the text in column 3 said something like "No Action" this row is excluded from the export that would also be amazing but that would be an ideal world.

If there are changes I can make to the word document I am willing to consider this but I cant guarantee the boss will agree.

Any help gratefully accepted.

many thanks
 

Excel Facts

What is =ROMAN(40) in Excel?
The Roman numeral for 40 is XL. Bill "MrExcel" Jelen's 40th book was called MrExcel XL.
Hi andrewhoddie. This code assumes that you want data from every table and that the only pics in the document are in the table(s). You will need to adjust the file path for the document. Save your wb before testing. HTH. Dave
VBA Code:
Sub test2()
Dim WordApp As Object, TableTot As Integer, TableStart As Integer
Dim irow As Integer, icol As Integer, TableColArr As Variant
Dim ws As Worksheet, ColCnt As Integer, RowCnt As Integer, ishp As Variant

'create Word app
On Error Resume Next
Set WordApp = GetObject(, "Word.Application")
If Err.Number <> 0 Then
Set WordApp = CreateObject("Word.Application")
End If
On Error GoTo 0
'show Word doc and leave open
WordApp.Visible = True
'********adjust file path to suit
WordApp.Documents.Open ("C:\Testfolder\testdoc.docx")

TableColArr = Array(1, 3, 4, 5) 'table columns
Set ws = ActiveWorkbook.Sheets("Sheet1")
RowCnt = 1 'sheet start row
With WordApp.ActiveDocument
    TableTot = .Tables.Count
'loop tables
    For TableStart = 1 To TableTot
        'output tbl by row contents
        With .Tables(TableStart)
            For irow = 1 To .Rows.Count ' table rows
                'skip table rows with 3rd col "No Action"
                If Application.WorksheetFunction.Clean(.cell(irow, 3).Range.Text) <> "No Action" Then
                ColCnt = 0 'sheet columns
                For icol = LBound(TableColArr) To UBound(TableColArr) 'table columns
                ColCnt = ColCnt + 1
                If ColCnt <> 3 Then 'move text to sheet
                ws.Cells(RowCnt, ColCnt) = Application.WorksheetFunction.Clean(.cell(irow, TableColArr(icol)).Range.Text)
                Else 'copy pic to sheet
                If Application.WorksheetFunction.Clean(.cell(irow, TableColArr(icol)).Range.Text) <> vbNullString Then
                Set ishp = WordApp.ActiveDocument.InlineShapes(irow)
                ishp.Range.CopyAsPicture
                ws.Cells(RowCnt, ColCnt).PasteSpecial
                Application.CutCopyMode = False
                End If
                End If
                Next icol
                End If
            RowCnt = RowCnt + 1 'sheet rows
            Next irow
        End With
    Next TableStart
End With
'unselect pic
ws.Cells(1, 1).Select
End Sub
 
Upvote 0
Hi @NdNoviceHlp

Thanks very much for your help, I am having issues with some of this. When I run this VBA script I get a Microsoft Visual Basic for Applications 400 error. It appears to be when it comes to the end of the first table or after it pastes the icon from the 4th column.

If I remove the icon I get the same 400 error, also when the icon copies it does not paste into a specific cell - not sure if this is the issue but ideally the icon would be in the cell as this will be used for sorting.

I have got the data from the first table upto the data after the first icon.

Is there anything that you could suggest that would help me get past this issue?

many thanks
Andrew
 
Upvote 0
Hi Andrew. It likely has to do with how your doc is set up. Are you transferring data from all tables in the doc? it's difficult to get the inline shape index of the pics if there are either non table pics, or only some tables are accessed. My trials pasted the pic to the cell so I'm not sure what's up there. It would be very helpful if you could post a sample mock doc that doesn't contain any confidential/personal info. Dave
 
Upvote 0
Andrew I found out how to define the index of an inlineshape in a document here....
'Determine index number for inline shape
Here's some adjusted code. I changed the table row transfer to exclude table headers. You will have to adjust the file path to suit. Dave
VBA Code:
Sub test2()
Dim WordApp As Object, TableTot As Integer, TableStart As Integer
Dim irow As Integer, icol As Integer, TableColArr As Variant, Rng As Variant
Dim ws As Worksheet, ColCnt As Integer, RowCnt As Integer, ishp As Variant

'create Word app
On Error Resume Next
Set WordApp = GetObject(, "Word.Application")
If Err.Number <> 0 Then
Set WordApp = CreateObject("Word.Application")
End If
On Error GoTo 0
'show Word doc and leave open
WordApp.Visible = True
'********adjust file path to suit
WordApp.Documents.Open ("C:\Testfolder\testdoc.docx")

TableColArr = Array(1, 3, 4, 5) 'table columns
Set ws = ActiveWorkbook.Sheets("Sheet1")
RowCnt = 1 'sheet start row
With WordApp.activedocument
    TableTot = .Tables.Count
'loop tables
    For TableStart = 1 To TableTot
        'output tbl by row contents
        With .Tables(TableStart)
            'exclude table header
            For irow = 2 To .Rows.Count ' table rows
                'skip table rows with 3rd col "No action."
                If Application.WorksheetFunction.Clean(.cell(irow, 3).Range.Text) <> "No action." Then
                ColCnt = 0 'sheet columns
                For icol = LBound(TableColArr) To UBound(TableColArr) 'table columns
                ColCnt = ColCnt + 1
                If ColCnt <> 3 Then 'move text to sheet
                ws.Cells(RowCnt, ColCnt) = Application.WorksheetFunction.Clean(.cell(irow, TableColArr(icol)).Range.Text)
                Else 'copy pic to sheet
                'check for pic
                If Application.WorksheetFunction.Clean(.cell(irow, TableColArr(icol)).Range.Text) <> vbNullString Then
                'get inlineshape index
                Set Rng = WordApp.activedocument.Content
                Rng.End = .cell(irow, TableColArr(icol)).Range.End
                Set ishp = WordApp.activedocument.inlineShapes(Rng.inlineShapes.Count)
                ishp.Range.CopyAsPicture
                ws.Cells(RowCnt, ColCnt).PasteSpecial
                Application.CutCopyMode = False
                End If
                End If
                Next icol
                RowCnt = RowCnt + 1 'sheet rows
                End If
            Next irow
        End With
    Next TableStart
End With
'unselect pic
ws.Cells(1, 1).Select
End Sub
 
Upvote 0
After a PM exchange, it was determined that the contents of column 1 of the tables were actually bulleted numbering. So the previous code was producing erroneous results for table 1 columns. This code will give the bulleted number of table column 1 of the extracted row (if column 1 is bulleted otherwise it just returns the contents of table column 1). Again, adjust the file path (and sheet number) as necessary. HTH. Dave
VBA Code:
Sub test2()
Dim WordApp As Object, TableTot As Integer, TableStart As Integer, BulletCount As Integer
Dim irow As Integer, icol As Integer, TableColArr As Variant, Rng As Variant
Dim ws As Worksheet, ColCnt As Integer, RowCnt As Integer, ishp As Variant

'create Word app
On Error Resume Next
Set WordApp = GetObject(, "Word.Application")
If Err.Number <> 0 Then
Set WordApp = CreateObject("Word.Application")
End If
On Error GoTo 0
'show Word doc and leave open
WordApp.Visible = True
'********adjust file path to suit
WordApp.Documents.Open ("C:\Testfolder\testdoc.docx")

TableColArr = Array(1, 3, 4, 5) 'table columns
Set ws = ActiveWorkbook.Sheets("Sheet1")
RowCnt = 1 'sheet start row

With WordApp.activedocument
TableTot = .Tables.count
'loop tables
For TableStart = 1 To TableTot
With .Tables(TableStart)
    'output tbl by row contents
    'exclude table header (row1)
    For irow = 2 To .Rows.count ' table rows
        'count 1st column table bullets if numbered
        If .cell(irow, 1).Range.ListFormat.ListType = 3 Then ' 3 is Simple numeric list
        BulletCount = BulletCount + 1
        End If
        'skip table rows with 3rd col "No action."
        If Application.WorksheetFunction.Clean(.cell(irow, 3).Range.Text) <> "No action." Then
        ColCnt = 0 'sheet columns
            For icol = LBound(TableColArr) To UBound(TableColArr) 'table columns
            ColCnt = ColCnt + 1
            If ColCnt <> 3 Then 'move text to sheet
                'if table bullets in Column 1 output bullet number to sheet
                If BulletCount > 0 And ColCnt = 1 Then
                ws.Cells(RowCnt, ColCnt) = BulletCount
                Else
                ws.Cells(RowCnt, ColCnt) = Application.WorksheetFunction.Clean(.cell(irow, TableColArr(icol)).Range.Text)
                End If
            Else 'copy pic to sheet
            'check for pic
            If Application.WorksheetFunction.Clean(.cell(irow, TableColArr(icol)).Range.Text) <> vbNullString Then
            'get inlineshape index
            Set Rng = WordApp.activedocument.Content
            Rng.End = .cell(irow, TableColArr(icol)).Range.End
            Set ishp = WordApp.activedocument.inlineShapes(Rng.inlineShapes.count)
            ishp.Range.CopyAsPicture
            ws.Cells(RowCnt, ColCnt).PasteSpecial
            Application.CutCopyMode = False
            End If
            End If
            Next icol
        RowCnt = RowCnt + 1 'sheet rows
        End If
    Next irow
End With
Next TableStart
End With
'unselect pic
ws.Cells(1, 1).Select
End Sub
 
Upvote 0
Solution
You are welcome Andrew. It was an interesting project. I learned a couple of things along the way. To get the index number of a document's inlineshapes, you need to set a range from the start of the doc to the end range that includes the inline shape. The index number can then be inferred by doc.inlinesshapes.count. Also, when a table has bulleted numbers in Word, you can't directly retrieve the bulleted number. Word stores some kind of hidden location numbers in bulleted table cells that are retrievable but are not the bullet numbers. For example, if a Word doc that has 2 tables with bulleted numbers each with 3 rows and 6 total number bullets, the cells in the first table will indicate that 1, 2 & 3 are stored in the bullet cells, but instead of returning 4, 5, & 6 in the 2nd table, the bullet cells again return 1,2 & 3. So it seems that you need to count each bullet if you want to identify the number of the current bullet. Anyways, glad that you were able to adjust the code for your needs. Thanks for posting your outcome. Dave
 
Upvote 0

Forum statistics

Threads
1,224,884
Messages
6,181,558
Members
453,053
Latest member
Kiranm13

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top