Search, concatenate, move, then delete

dguenther

Board Regular
Joined
Jun 15, 2011
Messages
75
Hi,

I am trouble doing the following in a macro:


OUTCOME: The court reversed the judgment of the district court. The court
remanded the question of the amount of costs to the district court for a further
remand to the arbitrators to make an appropriate award.
JUDGES:
*** [...***1] Joseph F. Baca, Justice. Dan Sosa, Jr., Chief Justice, Richard E.
Ransom, Justice, Kenneth B. Wilson, Justice, concur. Seth D. Montgomery,
Justice, Concurring in part, Dissenting in part. ...
2 of 27 DOCUMENTS
State v. Ybarra
No. 18,506
SUPREME COURT OF NEW MEXICO
111 N.M. 234; 804 P.2d 1053; 1990 N.M. LEXIS 415
November 28, 1990, Filed​

I need excel to:

(1) search for judges in the first row
(2) concatenate everything in the row(s) after judges and before the next line that starts with a number.
(3) move concatenated rows to G1
(4) delete "judges" and everything before it EXCEPT row 1



Thanks if you can help.
 
Last edited:
First off, yes, I want to continue working on your problem. You need to understand that I this volunteering more for me than for the person asking the question. I have always liked doing puzzles and the questions that get raised in forums and newsgroups provide a ready supply of them (some, like your question, more challenging than others).

I'd rather patch my code to handle the Record No issue in code instead of having you manually delete the word Record everywhere (it might get deleted from some location where it shouldn't be removed... you never know). And this issue points out one of the problems with trying to parse a semi-controlled data structure... code needs specific things to lock on to in order to work properly and the anomolies I pointed out earlier, along with this Record No one, show what can happen when the structure varies from what is expected.

I think I will be going away from my computer for awhile, so I'll look at your new files when I get back and see if I can figure out what's what.
 
Upvote 0

Excel Facts

Formula for Yesterday
Name Manager, New Name. Yesterday =TODAY()-1. OK. Then, use =YESTERDAY in any cell. Tomorrow could be =TODAY()+1.
Glad to provide you with some challenges then :)

It's your choice whether you want to just add in the delete."record".macro into your code. If adjusting the code is easier then by all means. But unless a judge's name was "Record" or some derivative, it is not going to be critical to the game plan. Actually the only purpose the docket number is going to have is to be the unique identifier each case gets. If anyone wants to see the case they can use the citation.

--
Another thing is this other kind fellow offered me up a code to try and solve my problem. I emailed him to tell him someone was working on it, but I guess he didn't get it or something. Anyways, here is his code that attempts to do the same thing. It only works on the one example sheet he sent and I am still trying to figure out why that is. It's not like it doesn't parse well on other examples, it just doesn't run or something.

Here's a link to the file, and the code posted underneath.
https://my.syncplicity.com/share/peaanec4k7/raw_inputfromeugene.xlsm



Code:
Sub ExtractData()

    Dim re As New RegExp
    Dim i As Long, j As Long, f As Long
    Dim sh As Worksheet, this As Worksheet
    Dim sColA As String, sColB As String, sColC As String, sColD As String, sColE As String
    
    Application.ScreenUpdating = False
    
    Set this = ActiveSheet
    Set sh = CreateReportSheet
    
    With this
        
        For i = 1 To .Cells(.Rows.Count, "A").End(xlUp).Row
        
            re.Pattern = "\s*\d+\s*of\s*\d+\s*\w*"
        
            If re.Test(.Cells(i, "A")) Then
                
                j = i + 2 'Offset from head of section
                f = f + 1 'Row offset for Report sheet
                
                ' Column A.
                sColA = vbNullString
                While Not IsEmpty(.Cells(j, "A"))
                    sColA = sColA & .Cells(j, "A") & " "
                    j = j + 1
                Wend
                
                ' Column B.
                j = j + 1
                sColB = vbNullString
                While Not IsEmpty(.Cells(j, "A"))
                    sColB = sColB & .Cells(j, "A") & " "
                    j = j + 1
                Wend
                
                ' Columns C.
                j = j + 3
                sColC = vbNullString
                While Not IsEmpty(.Cells(j, "A"))
                    sColC = sColC & .Cells(j, "A") & " "
                    j = j + 1
                Wend
                
                ' Column D.
                sColD = Trim$(.Cells(j + 2, "A"))
                
                ' Column E.
                re.Pattern = "\s*\[\s*\.+\s*\*+\s*\d+\]\s*"
                While Not re.Test(.Cells(j, "A"))
                    j = j + 1
                Wend
                
                sColE = vbNullString
                While Not IsEmpty(.Cells(j, "A"))
                    sColE = sColE & .Cells(j, "A") & " "
                    j = j + 1
                Wend
                
                ' Transfer all data to Report sheet.
                With sh
                    .Cells(f, "A") = Trim$(sColA)
                    .Cells(f, "B") = Trim$(sColB)
                    .Cells(f, "C") = Trim$(sColC)
                    .Cells(f, "D") = Trim$(sColD)
                    sColE = Trim$(re.Replace(sColE, vbNullString))
                    sColE = Right$(sColE, Len(sColE) - 3)
                    .Cells(f, "E") = sColE
                End With
        
            End If
        
        Next
    
    End With

    Application.ScreenUpdating = True
    
End Sub


Private Function CreateReportSheet() As Worksheet

    Dim sh As Worksheet
    
    Application.DisplayAlerts = False
    
    On Error Resume Next
    Set sh = Sheets("Report")
    
    If Err.Number = 0 Then
        Sheets("Report").Delete
    End If
    On Error GoTo 0
    
    Set sh = Sheets.Add(After:=Sheets(Sheets.Count))
    With sh: .Name = "Report": .Tab.ColorIndex = 6: End With

    Application.DisplayAlerts = True
    Set CreateReportSheet = sh

End Function
 
Upvote 0
I already tested your "non-working" file. All works perfect. No error occurs.
The best solution, as I wrote you, is to create one folder with all your lots of files and process each of them. :)
 
Upvote 0
Apparently, there is a time limit to how long your files can remain on the syncplicity site you posted them to... all four of your example files posted yesterday are no longer available. Can you upload them again and give new links to them? Sorry, I didn't think I had to download them immediately yesterday, so I figured I would pick them up today... but they are not there any more.
 
Upvote 0
Okay, I got them and I will look at them. Remind me though, I don't remember what the issue was with those last two (the ones with the spaces)?
 
Upvote 0
Also Eugene,

When I try your code, for example, on this sheet with about 170 cases on it, and it ran well until the end of the "crunching" when excel spit out a
au7gad.png



error and the debugging looked like
15gxac6.png


yet the parser still ran. The problem was it stopped about 3 cases short of getting them all done. Thoughts? Could it just struggle with long amounts of cases?

File in question:
https://my.syncplicity.com/share/op3o4ycv3x/ny_state_cases,_combinedmanycases.XLSM
 
Upvote 0
Okay, I got them and I will look at them. Remind me though, I don't remember what the issue was with those last two (the ones with the spaces)?



Sorry, those ones didn't have problems. Those were just examples of a way that the files could be downloaded from my end. I thought maybe having the spaces in there would alleviate coding issues. But if it makes more, you can just ignore them.

Thanks,
Danny
 
Upvote 0
Okay, I figured out what the problem was... it was the formatting of the data. The files with the blank rows screwed things up. so I patched the code to remove them. Column A, which was formatted as Text in prior files was not formatted that way in the files that failed, so I added a command to convert them to text format. The reason this matter was because when they were not text, dates became real Excel dates early on in my parsing... the problem here is that I needed the dates as text so I could find the commas in order to distinguish actual date from citations that looked like dates (real Excel dates are numbers which don't have commas). In the file that mostly worked, there was a record that did not have a CASE SUMMARY. I looked for those words during my date parsing... I used that text to reduce some loops, but since it is apparently possible to have records without summaries, I used the JUDGES keyword instead. Here is the macro which, to my eyes (which are not trained to look at this kind of data) appears to process your files correctly. As I cautioned earlier, test it out against some known manually processed resulting files.
Code:
Sub ParseCases()
  Dim X As Long, LastRow As Long, NO As Long, NOS As Long, FIRSTDATE As Long, LASTDATE As Long, JUDGES As Long, DOCs As Long, NextDOCs As Long, Cell As Range
  '  Get last row of file
  LastRow = Cells(Rows.Count, "A").End(xlUp).Row
  '  Make sure Column A is formatted as text
  Range("A1:A" & LastRow).NumberFormat = "@"
  '  Delete all blank rows
  Range("A1:A" & LastRow).SpecialCells(xlCellTypeBlanks).EntireRow.Delete
  '  I noted some non-breaking spaces in one of the files (ASCII-160), so let's convert them to real spaces
  Range("A1:A" & LastRow).Replace Chr$(160), " ", xlPart
  '  Get rid of leading/trailing spaces
  For Each Cell In Range("A1:A" & LastRow)
    If Left(Cell.Value, 1) = " " Then Cell.Value = Trim(Cell.Value)
  Next
  '  Put an ending marker for the Do..Loop to find (the Nos. marker is needed in order for the docket numbers check code to be able to work correctly)
  Cells(LastRow + 1, "A").Value = "Nos. Ending DOCUMENTS"
  '  Combine multiple "to be saved" lines into single lines
  DOCs = 1
  Do
    '  Get docket number (there are two ways that cell can start.... No. or Nos.)
    NO = Cells(DOCs, "A").Resize(LastRow).Find("No. *", LookAt:=xlWhole, SearchDirection:=xlNext, MatchCase:=True).Row
    On Error Resume Next
    NOS = Cells(DOCs, "A").Resize(LastRow).Find("Nos. *", LookAt:=xlWhole, SearchDirection:=xlNext, MatchCase:=True).Row
    On Error GoTo 0
    '  Use the minimum one
    If NOS < NO Then NO = NOS
    '  Get JUDGES row number
    JUDGES = Cells(DOCs, "A").Resize(LastRow).Find("JUDGES:", LookAt:=xlWhole, SearchDirection:=xlNext, MatchCase:=True).Row
    '  Get first date line
    For X = NO + 3 To JUDGES
      If IsDate(Left(Cells(X, "A").Value, InStr(Cells(X, "A").Value, ",")) & " 2000") Then
        FIRSTDATE = X
        Exit For
      End If
    Next
    '  Get last date line
    For LASTDATE = JUDGES To NO Step -1
      If IsDate(Left(Cells(LASTDATE, "A").Value, InStr(Cells(LASTDATE, "A").Value, ",")) & " 2000") Then Exit For
    Next
    '  Combine multiple date lines
    If LASTDATE > FIRSTDATE Then
      Cells(FIRSTDATE, "A").Value = Join(WorksheetFunction.Transpose(Range(Cells(FIRSTDATE, "A"), Cells(LASTDATE, "A"))), "; ")
      Range("A" & (FIRSTDATE + 1) & ":A" & LASTDATE).Clear
    End If
    '  Combine multiple lines for the "case names"
    If NO - DOCs > 2 Then
      Cells(DOCs + 1, "A").Value = Join(WorksheetFunction.Transpose(Range(Cells(DOCs + 1, "A"), Cells(NO - 1, "A"))))
      Range("A" & (DOCs + 2) & ":A" & (NO - 1)).Clear
    End If
    '  Combine multiple lines for the "citations"
    If FIRSTDATE - NO > 3 Then
      Cells(NO + 2, "A").Value = Join(WorksheetFunction.Transpose(Range(Cells(NO + 2, "A"), Cells(FIRSTDATE - 1, "A"))))
      Range("A" & (NO + 3) & ":A" & (FIRSTDATE - 1)).Clear
    End If
    '  Get next DOCUMENT row number
    NextDOCs = Cells(DOCs, "A").Resize(LastRow).Find(" DOCUMENTS", LookAt:=xlPart, SearchDirection:=xlNext, MatchCase:=True).Row
    '  Combine multiple lines for the JUDGES
    If NextDOCs - JUDGES > 2 Then
      Cells(JUDGES + 1, "A").Value = Join(WorksheetFunction.Transpose(Range(Cells(JUDGES + 1, "A"), Cells(NextDOCs - 1, "A"))))
      Range("A" & (JUDGES + 2) & ":A" & (NextDOCs - 1)).Clear
    End If
    '  Check to see if we are at the end of file
    NextDOCs = Cells(DOCs, "A").Resize(LastRow).Find(" DOCUMENTS", LookAt:=xlPart, SearchDirection:=xlNext, MatchCase:=True).Row
    If NextDOCs > LastRow Then Exit Do
    '  Update the DOCs variable and run next loop
    DOCs = NextDOCs
  Loop
  Columns("A").SpecialCells(xlCellTypeBlanks).EntireRow.Delete
  '  The file has been normalized as to its structure, so let's run the code I originally posted to parse it
  '  ========================================================================================================
  '  Get updated LastRow (remember, we might have deleted some rows earlier)... the minus one excludes the "Ending DOCUMENTS" flag we placed after the last row
  LastRow = Cells(Rows.Count, "A").End(xlUp).Row - 1
  '  Now that each DOCUMENTS page is structured the same, lets move the data into the correct columns
  DOCs = 1
  Do
    '  Fill Columns A thru E with data
    Cells(DOCs, "A").Resize(, 5) = WorksheetFunction.Transpose(Cells(DOCs + 1, "A").Resize(5))
    '  Fill Column F with the judges
    JUDGES = Cells(DOCs, "A").Resize(LastRow).Find("JUDGES:", LookAt:=xlWhole, SearchDirection:=xlNext, MatchCase:=True).Row
    Cells(DOCs, "F").Value = Cells(JUDGES + 1, "A").Value
    '  Get the next DOCUMENTS cells
    NextDOCs = Cells(DOCs, "A").Resize(LastRow).Find(" DOCUMENTS", LookAt:=xlPart, SearchDirection:=xlNext, MatchCase:=True).Row
    '  Exit the loop if it is greater than the last row (if it is, then it is the "Nos. Ending DOCUMENTS" text we put in the blank cell under the last data row to mark the end of the file.
    If NextDOCs > LastRow Then Exit Do
    '  If we didn't exit the loop, then update the DOCs variable for the next loop
    DOCs = NextDOCs
  Loop
  '  The only thing that will be in Column B is text we placed there as part of the parsing process; all other cells in the column are blanks and we use SpecialCells to locate them in order to delete the rows we already parsed
  Range(Cells(2, "B"), Cells(LastRow, "B")).SpecialCells(xlCellTypeBlanks).EntireRow.Delete
  '  Find out the current last row (now that all the deletions are done
  LastRow = Cells(Rows.Count, "A").End(xlUp).Row
  '  And delete the end of file flag we put there
  Cells(LastRow, "A").Clear
End Sub
Oh... I did not patch my code to handle the "Record No" problem you mentioned and I'll tell you why. That may not be the only variation that could occur. Record could be abbreviated, mixed case or it could be the word "Docket" or who knows what else. If you come across files that do not use No. or Nos., then change them manually as you said you did already. The best way to make the change is to include the word No (no period) in your "Find what" field; that is, if you see "Record No. ##" or "Record Nos. ####", put "Record No" (without the quotes) in the "Find what" field and put "No" in the "Replace with" field (always end up with capital N small o).
 
Last edited:
Upvote 0

Forum statistics

Threads
1,224,531
Messages
6,179,379
Members
452,907
Latest member
Roland Deschain

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top