Extract data from tables in Word to Excel

NielsChristiansen

New Member
Joined
May 23, 2016
Messages
2
Dear all,

I'm trying to extract data from Word to Excel. The data is contained in tables and subtables within these tables.
The main tables are divided into "sections".

In these sections I need to extract the "chapter" and "title" given that the section has a subtable.
The "chapter" and "title" can have the form "A.1" and "Title title title". The title is placed 3 cells above the subtable in the Word table. See image for better description.

I could give the subtables a name in Word, which is the way it is done in the attached code.
The problem is that the title changes depending on the project, so ideally I would like to offset from the location of the subtable to get the chapter and title.

This requires (I think) knowing the cell location of the subtable (nested table) in the parent table. I've not been able to find a way to obtain the cell of the current subtable.

I've attached a picture of the Word table and the current code in Excel VBA (sorry for Danish commenting). The code is probably not relevant for the question.

Thank you so much! :)

Forum.png




Code:
Sub Importer_KS_Tabeller()


Dim wdDoc As Object
Dim wdFileName As Variant
Dim iRow As Long 'row index in Excel
Dim iCol As Integer 'column index in Excel
Dim resultRow As Long
Dim tableRows As Integer
Dim tableStart As Integer
Dim tableTot As Integer
Dim subtableStart As Integer
Dim subtableTot As Integer
Dim cursubTable As Integer
Dim HovedTitel As String
Dim Undertabel_titel As String
Dim lSpace As Long
Dim rSpace As Long
Dim SubtableCounter As Integer
Dim MissingTitle As Integer


On Error Resume Next


'Ryd Excel arket
ActiveSheet.Range("A:AZ").ClearContents
ActiveSheet.Range("A:AZ").ClearFormats


wdFileName = Application.GetOpenFilename("Word files (*.docx),*.docx", , _
"Vælg standdardbeskrivelse hvor KS-tabeller skal udtrækkes fra")


If wdFileName = False Then Exit Sub '(user cancelled import file browser)


Set wdDoc = GetObject(wdFileName) 'open Word file


'With wdDoc
    tableTot = wdDoc.Tables.Count
    If tableTot = 0 Then
       MsgBox "Dokumentet indeholder ingen tabeller.", _
       vbExclamation, "Importer Word Tabel"
   'ElseIf tableNo > 1 Then
   ' tableNo = InputBox("Standardbeskrivelsen indeholder " & tableNo & " tabeller." & vbCrLf & _
    '   "Vælg hvilken der skal startes fra", "Importer Word tabel", "1")
   End If


 'Indsæt hovedoverskrift (fx 01 - Tag)
    'Indsæt nummer og titel
    HovedTitel = WorksheetFunction.Clean(wdDoc.Tables(1).Cell(1, 1).Range.Text)
    
    lSpace = InStr(HovedTitel, " ")
    
    rSpace = InStrRev(HovedTitel, " ")
            
    Cells(2, 1) = "'" & Trim(Left(HovedTitel, lSpace))
    
    Cells(2, 2) = "'" & Trim(Right(HovedTitel, rSpace))
        
    'Tilføj formatering på hovedoverskrift
    Range(Cells(2, 1), Cells(2, 2)).Font.Name = "Melior LT Std"
    Range(Cells(2, 1), Cells(2, 2)).Font.Size = 18
    
'Vælg linje hvor de kopierede tabeller skal indsættes
   resultRow = 4


'With .Tables(tableStart)


'START LOOP MED HOVEDTABELLER
        For tableStart = 2 To tableTot
            'Indsæt underoverskrift (fx. 01.A Udskiftning af tegl...)
                      
            'Indsæt nummering
            Cells(resultRow, 1) = WorksheetFunction.Clean(wdDoc.Tables(tableStart).Cell(1, 1).Range.Text)
            'Indsæt titel
            Cells(resultRow, 2) = WorksheetFunction.Clean(wdDoc.Tables(tableStart).Cell(1, 2).Range.Text)
            
            'Tilføj formatering
             Range(Cells(resultRow, 1), Cells(resultRow, 2)).Font.Name = "Calibri"
             Range(Cells(resultRow, 1), Cells(resultRow, 2)).Font.Size = 13
             Range(Cells(resultRow, 1), Cells(resultRow, 2)).Font.Bold = True
                       
            resultRow = resultRow + 2
            
'Stop
'START LOOP MED UNDERTABELLER
            subtableTot = wdDoc.Tables(tableStart).Tables.Count
                    For cursubTable = 1 To 2 'subtableTot
                            'Kig kun i cellen (1,1)
                                   If WorksheetFunction.Clean(wdDoc.Tables(tableStart).Tables(cursubTable).Cell(1, 1).Range.Text) = "Emne" Then
                                       
                                        'Indsæt nummer og overskrift der gælder for aktuel KS-tabel
                                        'Indsæt nummering
                                        
                                        'Tjek om tabeltitel er ens ift. sidste undertabel?
                                        
                                            Undertabel_titel = wdDoc.Tables(tableStart).Tables(cursubTable).Title
                                            
                                                If Undertabel_titel = "" Then
                                                Undertabel_titel = "OBS Mangler_Titel!"
                                                MissingTitle = MissingTitle + 1
                                                End If
                                                                                     
                                            lSpace = InStr(Undertabel_titel, " ")
    
                                            rSpace = InStrRev(Undertabel_titel, " ")
            
                                            Cells(resultRow, 1) = "'" & Trim(Left(Undertabel_titel, lSpace))
    
                                            Cells(resultRow, 2) = "'" & Trim(Right(Undertabel_titel, Len(Undertabel_titel) - lSpace))
                                            
                                            'Tilføj formatering for undertabel titel
                                            Range(Cells(resultRow, 1), Cells(resultRow, 2)).Font.Name = "Calibri"
                                            Range(Cells(resultRow, 1), Cells(resultRow, 2)).Font.Size = 11
                                            Range(Cells(resultRow, 1), Cells(resultRow, 2)).Font.Bold = True
                                            
                                            resultRow = resultRow + 2
                                    
                                    'Indsæt indhold fra undertabel
                                                                                    
                                                For iRow = 1 To wdDoc.Tables(tableStart).Tables(cursubTable).Rows.Count
                                                 For iCol = 1 To wdDoc.Tables(tableStart).Tables(cursubTable).Columns.Count
                                                        Cells(resultRow, iCol) = WorksheetFunction.Clean(wdDoc.Tables(tableStart).Tables(cursubTable).Cell(iRow, iCol).Range.Text)
                                                    Next iCol
                                                    resultRow = resultRow + 1
                                                Next iRow
                                    
                                                'Tilføj formattering for undertabel
                                                iRow = iRow - 1
                                                Range(Cells(resultRow - iRow, 1), Cells(resultRow - 1, 4)).Font.Name = "Calibri"
                                                Range(Cells(resultRow - iRow, 1), Cells(resultRow - 1, 4)).Font.Size = 10
                                                Range(Cells(resultRow - iRow, 1), Cells(resultRow - 1, 4)).Borders.LineStyle = xlContinous
                                                Range(Cells(resultRow - iRow, 1), Cells(resultRow - 1, 4)).Borders.Color = RGB(217, 217, 217)
                                                
                                                
                                                Range(Cells(resultRow - iRow, 1), Cells(resultRow - iRow, 4)).Font.Bold = True
                                                Range(Cells(resultRow - iRow, 1), Cells(resultRow - iRow, 4)).Interior.Color = RGB(217, 217, 217)
                                                
                                        'Stop
                                                
                                  SubtableCounter = SubtableCounter + 1
                                  
                                  Else
                                  
                                  'Do nothing


                                  End If
                        'Afstand mellem indsatte KS-tabeller? Hvis ikke fjernes denne linje
                        resultRow = resultRow + 1
                        
                    Next cursubTable
                '**** LOOP MED UNDERTABELLER
        Next tableStart
    '**** LOOP MED HOVEDTABELLER


'Formater alt: Topjuster, Ombryd tekst
Range(Cells(1, 1), Cells(resultRow, 4)).VerticalAlignment = xlTop
Range(Cells(1, 1), Cells(resultRow, 4)).WrapText = True


'End With
        
'End With
        
MsgBox ("Færdig! Der blev udtrukket i alt " & SubtableCounter & " KS-tabeller.")


If MissingTitle > 0 Then
MsgBox ("OBS - Der mangler titel på i alt " & MissingTitle & " KS-tabeller.")
End If


End Sub
 

Excel Facts

What does custom number format of ;;; mean?
Three semi-colons will hide the value in the cell. Although most people use white font instead.
Assuming the nested table is in the 2nd column on the 4th row, to get the nested table's content you could use code like:
Code:
With wdDoc.Tables(1).Cell(4, 2).Range
  MsgBox Split(.Cells(1).Tables(1).Cell(1, 2).Range.Text, vbCr)(0)
End With
 
Last edited:
Upvote 0
Thank you for your reply Macropod.

I'm able to get the content of the subtables through a loop (attached), its only the chapter and heading missing.

Your suggestion is great, but the position of the chapter+title varies since the parent table contains many subtables (e.g. picture in OP has only one parent table, thus chapter is in cell (1,1) and (5,1).

For iRow = 1 To wdDoc.Tables(tableStart).Tables(cursubTable).Rows.Count
For iCol = 1 To wdDoc.Tables(tableStart).Tables(cursubTable).Columns.Count
Cells(resultRow, iCol) = WorksheetFunction.Clean(wdDoc.Tables(tableStart).Tables(cursubTable).Cell(iRow, iCol).Range.Text)
Next iCol
resultRow = resultRow + 1
Next iRow
 
Upvote 0

Forum statistics

Threads
1,223,240
Messages
6,170,951
Members
452,368
Latest member
jayp2104

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top