Alternative approach to fix those problematic excel files when excel sees 1048576 as the last row # & you can't fix it.

johnnyL

Well-known Member
Joined
Nov 7, 2011
Messages
4,546
Office Version
  1. 2007
Platform
  1. Windows
I have created this thread to take a different approach than in my previous thread.

I was inspired by a post by @Dan_W in that thread that suggested to look at the xml code in the excel file.

This thread has the same goal to attempt to alter those problematic excel files that you have tried to erase the extra blank rows below the range of data in a sheet that seemingly can't be deleted. IE. after you attempt to delete those rows and then save the file, reopen the file, those extra 'blank' rows are still there & you remain with a seemingly way too big file.


I have seen some steps suggested on the internet to manually change the extension of an excel file to a zip entension, unzip that file, perform some alterations to an internal file in the zip file, rezip the files, rename the zip file back to the original file extension. I have NOT seen a script that performs all of those steps via a script for you.

I decided to create a script that will perform all of those manual steps I found, as well as a few others that will actually achieve the objective of correcting those problematic excel files that seem to have the lastcell row 'locked' to the maximum row # allowed in excel in a sheet.

When I say 'locked' I am referring to excel file sheets that you have tried to erase the extra blank rows below the range of data in a sheet that seemingly can't be deleted. IE. after you attempt to delete them and then save the file, reopen the file, those extra 'blank' rows are still there & you remain with a seemingly way too big file.

This script will ask you for the excel file that has the problematic sheet of 'ghost' blank rows that seeming can't be deleted. It will then ask you for a row # that should be the row # to start deleting all the way to the end of the file, which in these cases will be row # 1048576. ;)

The script will then run and produce a new excel file that has '_shortened' added to the original excel file name. In other words, the script will not alter the original excel file that is selected by the user at the beginning of the script.

When the script is completed, you should see a newly created excel file that is noticeably smaller in file size than the original problematic file. Open up the new excel file that was created, in excel and see if those problematic 'blank' rows are gone.

VBA Code:
Sub ZipAndUnzipExcelFileV_DeleteRows()
'
    Dim DialogBox                       As FileDialog
    Dim StartingRowToDelete             As Long
    Dim SourceFullName                  As String
'
    Set DialogBox = Application.FileDialog(msoFileDialogFilePicker)                 ' Create DialogBox to ask for file to use
    DialogBox.AllowMultiSelect = False                                              ' Allow only one file to be selected
    DialogBox.Title = "                                                                   "
    DialogBox.Title = DialogBox.Title & _
            "Select (.xlsx, .xlsm, .xlam) file to remove excess rows from"          ' Establish DialogBox.Title
'
    If DialogBox.Show <> -1 Then                                                    ' If no file selected then ...
        Exit Sub                                                                    '   Exit the sub
    Else                                                                            ' Else
        SourceFullName = DialogBox.SelectedItems(1)                                 '   Save the Path with file name & extension to SourceFullName
    End If                                                                          '
'
    StartingRowToDelete = Application.InputBox("Select the start row # to " & _
            "start deleting from", "Row #", Type:=1)                                ' Ask user what row to start deleting rows from
'
    Dim startTime                               As Single
    startTime = Timer                                                               ' Start the stopwatch
'
    Dim XML_EndRowDeleteCode                    As Double, XML_StartRowDeleteCode       As Double
    Dim XML_File                                As Long
    Dim FSO                                     As Object, ZipFile_Builder_Extracter    As Object
    Dim AdditionToFileName                      As String
    Dim SourceFileExtension                     As String, SourceFileName               As String, SourceFilePath   As String
    Dim XML_ReplacementSheetDataString          As String, XML_RowDeleteString          As String
    Dim XML_FileContents                        As String, XML_SheetFile                As String
    Dim TemporaryZipFullFilePathNameExtension   As Variant, TemporaryZipFilePath        As Variant
'
    With CreateObject("Scripting.FileSystemObject")                                 '
        SourceFilePath = .GetParentFolderName(SourceFullName) & "\"                 '   Get the path of the file selected
        SourceFileName = .GetBaseName(SourceFullName)                               '   Get the file name of the file selected
        SourceFileExtension = .GetExtensionName(SourceFullName)                     '   Get the extension of the file selected
    End With                                                                        '
'
    AdditionToFileName = "Temp" & Format(Now, " dd-mmm-yy h-mm-ss")                 ' Use date/time to create unique file name to add to selected file name
    TemporaryZipFullFilePathNameExtension = SourceFilePath & _
            AdditionToFileName & ".zip"                                             ' Save unique file name to TemporaryZipFullFilePathNameExtension
'
    On Error Resume Next                                                            ' Ignore errors and continue
    FileCopy SourceFullName, TemporaryZipFullFilePathNameExtension                  ' Copy and rename original file to a zip file with a unique name
'
    If Err.Number <> 0 Then                                                         ' If error detected then ...
        MsgBox "Unable to copy " & SourceFullName & vbNewLine _
        & "Verify the selected file is closed and try again"                        '   Alert the user
        Exit Sub                                                                    '   Exit the sub
    End If                                                                          '
    On Error GoTo 0                                                                 ' Enable Excel error handling
'
    TemporaryZipFilePath = SourceFilePath & AdditionToFileName & "\"                ' Save folder path for TemporaryZipFilePath
    MkDir TemporaryZipFilePath                                                      ' Create the temporary folder to unzip to
'
    Set ZipFile_Builder_Extracter = CreateObject("Shell.Application")               ' Create ZipFile_Builder_Extracter
    ZipFile_Builder_Extracter.Namespace(TemporaryZipFilePath).CopyHere _
            ZipFile_Builder_Extracter.Namespace(TemporaryZipFullFilePathNameExtension).items    ' Extract the files into the temporary folder
'
'-----------------------------------------------------------------------------------
'
'   loop through each file in the \xl\worksheets folder of the unzipped file
'
'       How will the problematic sheet be determined that needs the 'last row' corrected? ... Would have to enter the Codename to check for in each file ???
'
    XML_SheetFile = Dir(TemporaryZipFilePath & "\xl\worksheets\*.xml*")             ' Save xml file name into XML_SheetFile
'
    Do While XML_SheetFile <> ""                                                    ' Loop while an xml file is found
'       Read text of the xl\worksheets file to a variable
        XML_File = FreeFile                                                         '
        Open TemporaryZipFilePath & "xl\worksheets\" & _
                XML_SheetFile For Input As XML_File                                 '   Open the found xml file
        XML_FileContents = Input(LOF(XML_File), XML_File)                           '   Save the contents of the file into XML_FileContents
        Close XML_File                                                              '   Close the XML_File
'
'-----------------------------------------------------------------------------------
'
'       Do your code here ...
        XML_StartRowDeleteCode = 0
        XML_StartRowDeleteCode = InStr(1, XML_FileContents, "</row><row r=" & """" & _
                StartingRowToDelete & """")                                         '   Look through XML_FileContents for "</row><row r=" &
'                                                                                   '           """" & StartingRowToDelete & """" ... 3280
        If XML_StartRowDeleteCode > 0 Then
            XML_EndRowDeleteCode = InStr(XML_StartRowDeleteCode, _
                    XML_FileContents, "</sheetData>")                               '   Look through XML_FileContents for "</sheetData>" ... 41883802
'
            XML_RowDeleteString = Mid(XML_FileContents, XML_StartRowDeleteCode, _
                    XML_EndRowDeleteCode - XML_StartRowDeleteCode)                  '   Load the entire range of XML rows to delete into XML_RowDeleteString
'
            XML_FileContents = Replace(XML_FileContents, XML_RowDeleteString, "")   '   Erase the string that pertains to the 'Blank' rows range
'
            XML_FileContents = Replace(XML_FileContents, "</sheetData>", _
                    "</row></sheetData>")                                           '   Correct the end of the last row line remaining in the file
'
            XML_FileContents = Replace(XML_FileContents, " zeroHeight=" & _
                    """" & "1" & """", "")                                          '   Delete the hidden rows flag
        End If
'
'-----------------------------------------------------------------------------------
'
'       Output final changes to the file
        XML_File = FreeFile                                                         '
        Open TemporaryZipFilePath & "xl\worksheets\" & _
                XML_SheetFile For Output As XML_File                                '
        Print #XML_File, XML_FileContents                                           '   save data back to xml file
        Close XML_File                                                              '   Close the xml file
'
        XML_SheetFile = Dir                                                         '   Look for additional Excel file names
    Loop                                                                            ' Loop back check for next XML_File in directory
'
'   *********************************************
'
'   Create empty Zip File
    If Len(Dir(SourceFilePath & AdditionToFileName & ".zip")) > 0 Then Kill _
            SourceFilePath & AdditionToFileName & ".zip"                            ' If zip file already exists then delete it
    Open SourceFilePath & AdditionToFileName & ".zip" For Output As #1              '
    Print #1, Chr$(80) & Chr$(75) & Chr$(5) & Chr$(6) & String(18, 0)               ' Write signature for an empty Zip file ...
'                                                                                   '       "PK|-                  "   18 spaces(null) at the end
    Close #1                                                                        '
'
'
'   Copy files into the zip file
    ZipFile_Builder_Extracter.Namespace(SourceFilePath & AdditionToFileName & _
            ".zip").CopyHere _
            ZipFile_Builder_Extracter.Namespace(TemporaryZipFilePath).items         ' Copy items to zip file
'
    On Error Resume Next                                                            ' Ignore errors and continue
    Do Until ZipFile_Builder_Extracter.Namespace(SourceFilePath & _
        AdditionToFileName & ".zip").items.Count = _
        ZipFile_Builder_Extracter.Namespace(TemporaryZipFilePath).items.Count       ' Loop until all items have been copied to the zip file
        Application.Wait (Now + TimeValue("0:00:01"))                               '   Wait 1 second to allow more compressing of the files being added
    Loop                                                                            ' Loop back
    On Error GoTo 0                                                                 ' Enable Excel error handling
'
'   Delete the temporary files & folders created earlier
    Set FSO = CreateObject("scripting.filesystemobject")                            ' Create filesystemobject
    FSO.deletefolder SourceFilePath & AdditionToFileName                            ' Delete all temporary files & folders that were created
'
'   Rename the newly created file to the original file extention
    Name SourceFilePath & AdditionToFileName & ".zip" As SourceFilePath & _
            SourceFileName & "_Shortened" & "." & SourceFileExtension               ' Set name of new 'shortened' file
'
Debug.Print "Completed in " & Timer - startTime & " Seconds."                       ' Display total run time to 'Immediate window' ... CTRL+G
MsgBox "Completed in " & Timer - startTime & " Seconds."                            ' Display pop up box that alerts user that script is complete
End Sub

This code appears to work at least twice as fast as the code in my previous thread mentioned.

Also let me know if the link to the test file in the previously mentioned thread is no longer valid.

Let me know how it works on those problematic files & anything that needs to be changed.

Right now the code is set up just for excel files with one sheet. If files with with multiple sheets are desired, additional code will be needed.
 

Excel Facts

How can you turn a range sideways?
Copy the range. Select a blank cell. Right-click, Paste Special, then choose Transpose.
Also let me know if the link to the test file in the previously mentioned thread is no longer valid.
It says "File doesn't exist".
Could you upload the test file again?
 
Upvote 0
So having gone over this a dozen times now, I think I'm relatively happy with the code below (save for one point, which I discuss below).
Basically, I've opted to take a 'surgical' approach to the exercise and have used a method to pinpoint and extract the relevant files from the ZIP archive rather than
unzip the entire archive (purely for speed/efficiency reasons). When I originally suggested editing the XML, what I didn't appreciate was that there would result a chain
reaction of issues in other files the moment you amend one of them! Having happily repaired the Sheet1.xml file and thinking that was the end of the matter, enter calcChain.xml.
Long (and painful) story short, have had to edit two files from the ZIP archive.

I should point out that I also opted not to use FileSystemObject - it is staggeringly slow compared to the native VBA options, so I have also included in the code below the 'helper'
functions that I used to replace FSO with.

Two final points:- (1) the approach below requires the user to stipulated from which row they want the deletion to start from; and (2) the code below is not its final form. I have hardcoded two edits that are particular to the sample file you provided. They shouldn't be too difficult to fix, but I just wanted to post something for you sooner rather than later so you had an idea of the gamut of possible solutions. I'll let you know when I've dealt with the hardcoded points, but do please let me know your thoughts in the interim.

VBA Code:
    Dim ShellApp As Object
    Dim ZipFilename As Variant, WBFilename As Variant, FilenameParts As Variant
    Dim DeleteStart As Variant, TempDirectory As Variant
    Dim WSDir As Object, WSFile As Object, FileObject As Object
    
    Sub PhantomRows()
        
        ' Create a copy of the source workbook, and save it
        ' as a copy of the origiinal renamed with a ZIP extension
        WBFilename = Application.GetOpenFilename
        If WBFilename = False Then Exit Sub
        If Len(WBFilename) = 0 Or Dir(WBFilename, vbArchive) = vbNullString Then Exit Sub
        DeleteStart = VBA.InputBox("Please provide the row for the start of the deletion range.", "Identify the start row", Application.ActiveCell.ROW)
        If DeleteStart = False Then Exit Sub

        Dim StartTimer As Single, EndTimer As Single
        
        StartTimer = Timer
        FilenameParts = Split(WBFilename, ".")
        ZipFilename = Replace(WBFilename, FilenameParts(1), "zip")
        NewName = FilenameParts(0) & "(CORRECTED)" & Format(Now, "ddmmyy-hhnnss") & "." & FilenameParts(1)
        TempDirectory = MakeTempDirectory("XMLEDIT")
        
        FileCopy WBFilename, ZipFilename
                            
        Set ShellApp = CreateObject("Shell.Application")
                
        calcChainFix
    
        WorksheetFix
                
        ' Rename the file back into a workbook
        Name ZipFilename As NewName
        
        EndTimer = Timer
        Debug.Print "Time taken: " & EndTimer - StartTimer
        
    End Sub
    
    Sub calcChainFix()
        
        ' Move calcChainFix.xml from the ZIP container to the temporary directory
        Dim TargetFile As String, XMLCode As String, FinalXML As String
        Dim TempObject As Object, CCFile As Object, XLDIR As Object
        Set XLDIR = ShellApp.Namespace(ZipFilename & "\xl\")
        
        Set CCFile = XLDIR.items.item("calcChain.xml")
        Set TempObject = ShellApp.Namespace(TempDirectory)
        TempObject.MoveHere CCFile.Path, 20
                
        TargetFile = TempDirectory & "\" & CCFile
        XMLCode = ReadFile(TargetFile)
        
        ' Start editing the XML code
        startchar = InStr(XMLCode, "><")
        nextchar = InStr(startchar, XMLCode, DeleteStart - 1) - 10
        nextchar = InStr(CLng(nextchar), XMLCode, "<")
        firstsection = left(XMLCode, startchar)
        secondsection = Mid(XMLCode, nextchar)
        FinalXML = firstsection & secondsection
        tmpchar = InStr(startchar, FinalXML, "/>") - 1
        If InStr(tmpchar, FinalXML, "i=") = 0 Then
            FinalXML = left(FinalXML, tmpchar) & " i=""3" & Mid(FinalXML, tmpchar)
        End If
        
        ' Replace the existing file and move the new file into the ZIP container
        Kill TargetFile
        CreateFile CStr(TargetFile), FinalXML
                
        XLDIR.MoveHere TempObject.items.item(0), 20
        Do Until TempObject.items.count = 0
            DoEvents
        Loop
    End Sub
    
    
    Sub WorksheetFix()
        
        ' Get the filesize of the workbook containing the problematic worksheet and
        ' compare it with the filesizes of each worksheet's uncomopressed XML file.
        
        ' If the XML file is larger than the container XLSM/XLSX/ZIP file, then that is
        ' strong indication that the worksheet contains the phantom rows.
        
        Dim ZipFileSize As Long
        ZipFileSize = FileLen(ZipFilename)
        
        Set WSDir = ShellApp.Namespace(ZipFilename & "\xl\worksheets\")
        
        For Each FileObject In WSDir.items
            If FileObject.Size > ZipFileSize Then
                Set WSFile = FileObject
                Exit For
            End If
        Next
        
        ' Move worksheet file from the ZIP container to the temporary directory
        
        Dim TargetFile As String, XMLCode As String, FinalXML As String
        Dim TempObject As Object
        Set TempObject = ShellApp.Namespace(TempDirectory)
        TempObject.MoveHere WSFile.Path, 20
        
        TargetFile = TempDirectory & "\" & WSFile
        XMLCode = ReadFile(TargetFile)
    
        ' Start editing the XML code
        
        StartXML = "<row r=" & Chr(34) & CLng(DeleteStart) & Chr(34)
        startchar = InStr(XMLCode, StartXML) - 1
        StopXML = "</sheetData>"
        StopChar = InStr(XMLCode, StopXML)
        
        firstsection = left(XMLCode, CLng(startchar))
        firstsection = Replace(firstsection, "zeroHeight=""1""", vbNullString)
        firstsection = Replace(firstsection, ":G130", ":G122")
        firstsection = Replace(firstsection, ":M130", ":M122")
        firstsection = Replace(firstsection, "1048576", DeleteStart - 1)
        secondsection = Mid(XMLCode, StopChar)
        FinalXML = firstsection & secondsection
        
        ' Replace the existing file and move the new file into the ZIP container
        
        Kill TargetFile
        CreateFile CStr(TargetFile), FinalXML
        
        WSDir.MoveHere TempObject.items.item(0), 20
        
        Do Until TempObject.items.count = 0
            DoEvents
        Loop
        
    End Sub

And these are the helper functions:

VBA Code:
Function MakeTempDirectory(DirName As String) As String
        
        ' Note that this routine assumes that the
        ' directory C:\TEMP already exists.
        
        Dim TempDir As String
        TempDir = "C:\TEMP\" & DirName
        
        If Len(Dir(TempDir, vbDirectory)) Then
            If Dir(TempDir & "\*") <> vbNullString Then
                Kill TempDir & "\*.*"
            End If
        Else
            MkDir TempDir
        End If
        
        MakeTempDirectory = TempDir
    
    End Function
    
    Sub CreateFile(FileName As String, Contents As String)
        
        Dim FileHandled As Long
        FileHandled = FreeFile
        Open FileName For Output As #FileHandled
            Print #FileHandled, Contents
        Close #FileHandled
    
    End Sub
    
    Function ReadFile(FileName As String) As String
        
        ' The ReadFile routine is faster than FileSystemObject
        ' for reading the contents of a file.
        
        Dim EOFile As Long
        Dim FileHandled As Integer
        Dim Code As String
        FileHandled = FreeFile
        
        Open FileName For Binary As #FileHandled
            EOFile = LOF(FileHandled)
            Code = Space$(EOFile)
            Get #FileHandled, , Code
        Close #FileHandled
        
        ReadFile = Code
    
    End Function
 
Upvote 0
And for your reference, I ran the above over you Paper2_only.xlsm file, setting the row for deletion from Row 123. The most recent results were:
Time taken: 6.073975
Time taken: 5.95459
Time taken: 5.636719
 
Upvote 0
Sweet! Thank you @Dan_W. I am going to start looking at it now. Sounds pretty sweet.
 
Upvote 0
Let me just start off by saying "Wow!" Thank you @Dan_W for making the code I was trying to use look like a 3rd grader homework project. :(

Your approach you used in some sections was way better than the code I used.

With that being said, I looked at your code, added some more 'What if' coding, and threw in an alternative approach to the handling of the 'calcChain.xml' file. I probably changed a few other things, but I don't recall off hand.

Anyways, the code from post #4 here averages out to 5.8593752 seconds on my computer.

The following code that I came up with averages out to 4.7093754 seconds:

VBA Code:
Option Explicit
'
    Dim FileObject                              As Object
    Dim ZipFile_Builder_Extracter               As Object
    Dim WorksheetsFile                          As Object, WorksheetsFolder As Object
    Dim TemporaryZipFullFilePathNameExtension   As String
    Dim FilenameParts                           As Variant
    Dim StartingRowToDelete                     As Variant
    Dim TemporaryZipFilePath                    As Variant

Sub PhantomRowsEditV2()
'
' Create a copy of the source workbook, and save it as a copy of the origiinal renamed with a ZIP extension
'
    Dim SourceFullName                          As Variant
'
    SourceFullName = Application.GetOpenFilename("Excel Files (*.xls*), *xls*", _
            , "Select (.xlsx, .xlsm, .xlam) file to remove excess rows from")           ' Ask the user to select an excel file
'
    If SourceFullName = False Then Exit Sub                                             ' If no file selected then Exit Sub
'
    StartingRowToDelete = VBA.InputBox("Please provide the row # of the file to start deleting from.", _
            "Enter the start row of the problematic file to start deleting lines from.")
'
    If StartingRowToDelete = vbNullString Then Exit Sub
'
    Dim StartTime                               As Single
    StartTime = Timer                                                                   ' Start the stopwatch
'
    Dim EndTimer                                As Single
    Dim AdditionToFileName                      As String
    Dim FullFileNameWithoutExtension            As String
    Dim NewName                                 As String
'
    FilenameParts = Split(SourceFullName, ".")                                          ' Split the selected file name into sections separated by '.'
    TemporaryZipFullFilePathNameExtension = Replace(SourceFullName, _
            FilenameParts(UBound(FilenameParts)), "zip")                                ' Change extension of file selected to a zip extension
'
    AdditionToFileName = Format(Now, " dd-mmm-yyyy h-mm-ss")                            ' Use date/time to create unique name to add to selected file name
'
    FullFileNameWithoutExtension = Left$(SourceFullName, InStrRev(SourceFullName, ".") - 1) ' Get the current path name & File name of file
    NewName = FullFileNameWithoutExtension & "_Shortened" & AdditionToFileName & _
            "." & FilenameParts(UBound(FilenameParts))                                  ' Create a new name for the file including the extension
'
    TemporaryZipFilePath = MakeTempDirectory("XMLEDIT")
'
    FileCopy SourceFullName, TemporaryZipFullFilePathNameExtension                      ' Copy and rename original file to a zip file with a unique name
'
    Set ZipFile_Builder_Extracter = CreateObject("Shell.Application")                   ' Create ZipFile_Builder_Extracter
'
''    Call calcChainFix                                                                   ' Call the calcChainFix Subroutine
'
    Call calcChainDelete                                                                ' Call the calcChainDelete Subroutine
'
    Call WorksheetFix                                                                   ' Call the WorksheetFix Subroutine
'
    Name TemporaryZipFullFilePathNameExtension As NewName                               ' Change the newly created file extension back to
'                                                                                       '       the original file extention
'
    EndTimer = Timer - StartTime                                                        ' Stop the stopwatch
Debug.Print "Completed in " & EndTimer & " Seconds."                                    ' Display results to the 'Immediate' window (CTRL+G)
MsgBox "Completed in " & EndTimer & " Seconds."                                         ' Display pop up box that alerts user that script is complete
End Sub


Sub calcChainDelete()
'
'   Delete the 'calcChain.xml' from the zip file & let Excel calculate it when the new file is loaded into Excel.
'
    Dim CCFile                  As Object, TempObject           As Object, XLDIR    As Object
    Dim Original_XML_File       As String
'
    Set XLDIR = ZipFile_Builder_Extracter.Namespace(TemporaryZipFullFilePathNameExtension & _
            "\xl\")                                                                     ' "xl"
    Set CCFile = XLDIR.items.Item("calcChain.xml")                                      ' "calcChain.xml"
    Set TempObject = ZipFile_Builder_Extracter.Namespace(TemporaryZipFilePath)          ' "XMLEDIT"
'
    TempObject.MoveHere CCFile.Path, 20                                                 ' Remove 'calcChain.xml' from the zip file
'
    Original_XML_File = TemporaryZipFilePath & "\" & CCFile                             ' Save Full path, File name & Extension of the 'calcChain.xml' file
'
' Delete the existing file
    Kill Original_XML_File                                                              ' Delete the original XML file from the temp folder
End Sub


Sub calcChainFix()
'
'   Move calcChainFix.xml from the ZIP container to the temporary directory
'
    Dim XML_StartRowDeleteCode  As Long
    Dim First_XML_Section       As String, Second_XML_Section   As String
    Dim CCFile                  As Object, TempObject           As Object, XLDIR    As Object
    Dim FinalXML                As String, Original_XML_File    As String, XMLCode  As String
    Dim nextchar
    Dim tmpchar
'
    Set XLDIR = ZipFile_Builder_Extracter.Namespace(TemporaryZipFullFilePathNameExtension & _
            "\xl\")                                                                     ' "xl"
    Set CCFile = XLDIR.items.Item("calcChain.xml")                                      ' "calcChain.xml"
    Set TempObject = ZipFile_Builder_Extracter.Namespace(TemporaryZipFilePath)          ' "XMLEDIT"
'
    TempObject.MoveHere CCFile.Path, 20                                                 ' Move 'calcChain.xml' to the temp folder
'
    Original_XML_File = TemporaryZipFilePath & "\" & CCFile                             ' Save Full path, File name & Extension of the 'calcChain.xml' file
    XMLCode = ReadFile(Original_XML_File)
'
' Start editing the XML code
    XML_StartRowDeleteCode = InStr(XMLCode, "><")
    nextchar = InStr(XML_StartRowDeleteCode, XMLCode, StartingRowToDelete - 1) - 10
    nextchar = InStr(CLng(nextchar), XMLCode, "<")
    First_XML_Section = Left(XMLCode, XML_StartRowDeleteCode)
    Second_XML_Section = Mid(XMLCode, nextchar)
    FinalXML = First_XML_Section & Second_XML_Section
    tmpchar = InStr(XML_StartRowDeleteCode, FinalXML, "/>") - 1
'
    If InStr(tmpchar, FinalXML, "i=") = 0 Then
        FinalXML = Left(FinalXML, tmpchar) & " i=""3" & Mid(FinalXML, tmpchar)
    End If
'
' Replace the existing file and move the new file into the ZIP container
    Kill Original_XML_File                                                              ' Delete the original XML file from the temp folder
    CreateFile CStr(Original_XML_File), FinalXML                                        ' Place the editted file version in the temp folder
'
    XLDIR.MoveHere TempObject.items.Item(0), 20                                         ' Put editted XML file back into zip file
'
    Do Until TempObject.items.Count = 0                                                 ' Wait till file has finished being placed back into zip file
        DoEvents
    Loop
End Sub

Sub WorksheetFix()
'
'   Get the filesize of the workbook containing the problematic worksheet and compare it with the filesizes of each worksheet's uncomopressed XML file.
'
'   If the XML file is larger than the container XLSM/XLSX/ZIP file, then that is strong indication that the worksheet contains the phantom rows.
'
    Dim ZipFileSize As Long
'
    ZipFileSize = FileLen(TemporaryZipFullFilePathNameExtension)                        ' Get the size of the file ... 2650947
'
    Set WorksheetsFolder = ZipFile_Builder_Extracter.Namespace(TemporaryZipFullFilePathNameExtension & _
            "\xl\worksheets\")                                                          ' "worksheets"
'
    For Each FileObject In WorksheetsFolder.items                                       ' Loop through each item in the 'worksheets' folder
        If FileObject.Size > ZipFileSize Then                                           '   If a very large file is found then ...
            Set WorksheetsFile = FileObject                                             '       Save the name of the file
            Exit For                                                                    '       Exit the For loop
        End If
    Next                                                                                ' Loop back
'
    If WorksheetsFile = vbNullString Then                                               ' If a problematic file was not found then ...
        MsgBox "Abnormally large file was not found"                                    '   Alert user that a problematic file wasn't found
        Exit Sub                                                                        '   Exit Sub
    End If
'
' Move worksheet file from the ZIP container to the temporary directory
    Dim XML_EndRowDeleteCode    As Long, XML_StartRowDeleteCode     As Long
    Dim TempObject              As Object
    Dim FinalXML                As String, Original_XML_File        As String, XMLCode  As String
    Dim First_XML_Section       As String, Second_XML_Section       As String
    Dim XML_EndSearchString     As String, XML_StartSearchString    As String
'
    Set TempObject = ZipFile_Builder_Extracter.Namespace(TemporaryZipFilePath)
    TempObject.MoveHere WorksheetsFile.Path, 20                                         ' Move problematic Large file to the temp folder
'
    Original_XML_File = TemporaryZipFilePath & "\" & WorksheetsFile                     ' Save Full path, File name & Extension of the problematic file
    XMLCode = ReadFile(Original_XML_File)
'
' Start editing the XML code
    XML_StartSearchString = "<row r=" & Chr(34) & CLng(StartingRowToDelete) & Chr(34)   ' <row r=" CLng(StartingRowToDelete) "
'
    XML_StartRowDeleteCode = InStr(XMLCode, XML_StartSearchString) - 1                  ' Find starting position of XML_StartRowDeleteCode ... 92679
'
    If XML_StartRowDeleteCode > 0 Then                                                  ' If XML_StartSearchString is found in the XML file then ...
        XML_EndSearchString = "</sheetData>"
'
        XML_EndRowDeleteCode = InStr(XMLCode, XML_EndSearchString)                      '   Find ending position of XML_StartRowDeleteCode ... 17851904
'
        First_XML_Section = Left(XMLCode, CLng(XML_StartRowDeleteCode))                 '   Get all XML data up to the rows to be deleted
        First_XML_Section = Replace(First_XML_Section, "1048576", StartingRowToDelete - 1)  '   Replace the last row flag
        First_XML_Section = Replace(First_XML_Section, "zeroHeight=""1""", vbNullString)    '   Delete the Hidden Rows flag
'
        Second_XML_Section = Mid(XMLCode, XML_EndRowDeleteCode)                         '   Get data from end of XML file
'
        FinalXML = First_XML_Section & Second_XML_Section                               '   Combine the two sections and save to FinalXML
    Else                                                                                ' Else
        MsgBox "The row chosen to start deleting from does not exist in the XML file."  '   Alert the user that the row number doesn't exist in the file
        Exit Sub                                                                        '   Exit the sub
    End If
'
' Replace the existing file and move the new file into the ZIP container
    Kill Original_XML_File                                                              ' Delete the original XML file from the temp folder
    CreateFile CStr(Original_XML_File), FinalXML                                        ' Place the editted file version in the temp folder
'
    WorksheetsFolder.MoveHere TempObject.items.Item(0), 20                              ' Put editted XML file back into zip file
'
    Do Until TempObject.items.Count = 0                                                 ' Wait till file has finished being placed back into zip file
        DoEvents
    Loop
End Sub
    
Sub CreateFile(FileName As String, XML_FileContents As String)
'
    Dim XML_File As Long
'
    XML_File = FreeFile
'
    Open FileName For Output As #XML_File                                               ' Open the XML file
        Print #XML_File, XML_FileContents                                               '   save data back to xml file
    Close #XML_File                                                                     ' Close the XML file
End Sub
    
Function ReadFile(FileName As String) As String
'
'   The ReadFile routine is faster than FileSystemObject for reading the contents of a file.
'
    Dim XML_File            As Long
    Dim XML_FileLength      As Long
    Dim XML_FileContents    As String
'
    XML_File = FreeFile                                                                 '
'
    Open FileName For Binary As #XML_File                                               ' Open the XML file
        XML_FileLength = LOF(XML_File)                                                  '   Get XML_FileLength
        XML_FileContents = Space$(XML_FileLength)                                       '   Create blank XML_FileContents string
        Get #XML_File, , XML_FileContents                                               '   Load contents of XML file into XML_FileContents string
    Close #XML_File                                                                     ' Close the XML file
'
    ReadFile = XML_FileContents                                                         ' Pass XML_FileContents back to the calling Sub
End Function
    
Function MakeTempDirectory(TempFolder As String) As String
'
'   Note that this routine assumes that the directory C:\TEMP already exists.
'
    Dim FolderExists    As String
    Dim TempDir         As String
'
    FolderExists = Dir("C:\TEMP\", vbDirectory)                                         ' Check to see if folder already exists
'
    If FolderExists = "" Then MkDir "C:\TEMP\"                                          ' If folder does not exist then create it
'
    TempDir = "C:\TEMP\" & TempFolder                                                   ' Save full path of folder & Temporary Folder name to TempDir
'
    If Len(Dir(TempDir, vbDirectory)) Then                                              ' If the TempDir exists then ...
        If Dir(TempDir & "\*") <> vbNullString Then                                     '   If there are any files/folders in the TempDir then ...
            Kill TempDir & "\*.*"                                                       '       Delete them
        End If
    Else                                                                                ' Else
        MkDir TempDir                                                                   '   Create the TempDir
    End If
'
    MakeTempDirectory = TempDir
'
End Function


Test the results to make sure I didn't take too many shortcuts. Please let me know.
 
Upvote 0
Those are some amazing results. I had a quick look at your revisions (and the very helpful commenting) - and I see what you've done. You've definitely improved what I had, thank you.

One of your revisions jumped out at me - the part where if it doesn't identify a problematic worksheet. I had failed to account for that possibility, so it's good that you picked it up, but I think we need to take it a little bit further. As it is this check doesn't take place until AFTER the calcChainFix routine, and arguably should be the first thing we do. I will need to check the code, but I don't think that there is any really problem flipping the order of the calcChainFix and the WorksheetFix. Also, I wonder if just exiting the subroutine is enough. I will take a look when I get a chance later today.
 
Upvote 0
I don't see a problem with running the 'WorksheetFix' first, and like you said, it makes more sense to do it that way anyways.

I don't know if you noticed, with regards to the 'calcChain.xml' file, I chose to do a 'calcChainDelete' as opposed to the 'calcChainFix'.
I am not confident in the way the 'calcChainFix' works. It seems to be too specific to the test file. Example, I have seen "i=1" in a few files as well as some other things such as "l=1".
The deletion of the 'calcChain.xml' file seems to work, Excel regenerates it when the excel file is loaded into Excel. It doesn't display an error, it just regenerates it. Excel will not prompt you to save the file either, so you have to save the Excel file in order to save the regenerated 'calcChain.xml'.

Edit: Forgot to mention, In regards to files with multiple sheets, I am not quite sure how the 'calcChainFix' would handle that, considering there is only one 'calcChain.xml' file that covers all of the sheets.
 
Last edited:
Upvote 0

Forum statistics

Threads
1,225,738
Messages
6,186,734
Members
453,369
Latest member
juliewar

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top