Expert Coders: Is this possible or conceivable? Coding votes into binary

dguenther

Board Regular
Joined
Jun 15, 2011
Messages
75
Excel 2007 on Windows 7

Hi,
I don't know if what I am about to ask is even possible. I kind of think not. But, Mr Excel people have done amazing things for me so far.

Here is the situation:

I have hundreds of files like this one:
https://my.syncplicity.com/share/bbxisyzjvs/WV_macro_applied.xlsm

The first sheet is a bunch of legal case information, spread out over hundreds of rows in column A.

The second sheet is always called "Report". It has taken the information from the first sheet and nicely put every case into its own row, separating data into it proper column. Super kudos to Rick Rothstein and Sektor for those codes.

One further transformation is necessary. I have resigned myself to thinking I will have to do this by hand, but maybe someone can surprise me. The last column of the second sheet contains the voting records of the judges on the case. Those need to be coded into 1,0, and 9.

A 1 means that the judge voted in the majority (or concurred)in the case.
A 0 means that the judge voted in the minority (dissented) in the case.
A 9 means that the judge did not vote or did not participate.

In a separate sheet, needs to be the following:

The list of docket numbers (always going to be column B from the "report" sheet) transposed to go on row A of the new "Binary" sheet.

Then magic occurs. The judges’ votes (always in column E of the Report sheet) in each case need to be put into the 0,1,9 category. There can be no other information in the "binary" sheet (per the requirements of the statistical program later used to process this stuff).

Problem is, the formatting is not beautiful. And they vary state by state. Here are three example files where I translated by hand, the judges’ votes from the “Report” sheet to the “Binary” sheet I created. (On two of them, I only did the first 5)

https://my.syncplicity.com/share/6xwkn0vp31/FL_cases_macro_applied.xlsm
https://my.syncplicity.com/share/gbe5eiem8e/VT_cases_macro_applied.xlsm
https://my.syncplicity.com/share/bbxisyzjvs/WV_macro_applied.xlsm

They can look like anything from the typical (I would venture to guess 75% of them look like this):

Adkins, C.J., and Roberts, Overton and Hatchett, JJ., concur. Boyd, J., dissents....

{So, obviously Adkins, Roberts, Overton, and Hatchett would all get a 1. Boyd would get a 0.

- to a little more complex

England, Justice. Boyd, Overton and Sundberg, JJ., concur. Adkins, C.J., dissents with an opinion, with which Chappell, Circuit Judge, concurs. ...


{Here, Boyd, Overton, and Sundberg get 1s, while Adkins and Chappell get 0s.}

- to really complicated

Adkins, Chief Justice. McCain, J., and McCrary and Lee, Circuit Court Judges, concur. Overton, J., concurring in part and dissenting in part with opinion with which Drew (Retired), J., concurs.

{Here, Adkins, McCain, McCrary and Lee get 1s. Overton and Drew get 0s. Even a partial dissent is to be coded as dissent.}

=====

So from those example files you can see what an end product needs to be. 1,0, and 9s. Deceivingly simple sounding.

I will update this post with my thoughts and experiences as I try to apply a good solution to save me weeks, possibly months! of time.

Even anything that can automate or help part of this problem would be extremely appreciated. Even just thoughts about how to go about approaching it in the most efficient way would help.


Questions, thoughts, encouragement:
Reply to this post and/or
dsguenth [at] wustl.edu
 
OK- I am going to press pause for a second. Let me finish heuristics in a better fashion. Better complete than half cooked.

-D
 
Upvote 0

Excel Facts

Can Excel fill bagel flavors?
You can teach Excel a new custom list. Type the list in cells, File, Options, Advanced, Edit Custom Lists, Import, OK
Hi Danny,

OK, I've been looking at the processing of the voting record. I've produced version 2 of the VotingResults function which simply splits the record into sentences, and for each sentence checks for the decision keywords ('Concurs' 'dissents', 'Took no part' and the various aliases), and for the required Judge name within the same sentence,
If the judge name is absent, it returns 9 and checks the next sentence
If the Judge name is present and 'dissent' (or aliases) is present, it returns 0
If the judge name is present and 'concur' (or aliases) is present, it returns 1
If the Judge nameis present and 'Took no part' (or aliases) is present, it returns 9
If the judge name is present and no decision keywords are present it returns 9
If the Judge name is present, and more than 1 decision keyword is present, it returns either 10, 11 or 19, depending on the first decision keyword found.

I'll await your completion of the heuristics before doing any more.

I see that you spotted my deliberate mistake re the 0 and 1 assignment - there are two constants miConcurValue and miDissentValue which I got the wrong way round, as you say, an easy fix :)

Should we incorporate the judge initials (where available) to reduce the issue of duplicate names?
 
Upvote 0
Hello,

I hope I didn't sound demanding or anything when I said I wanted to wait to finish the heuristics. I just want to make sure I am using your time as efficiently as possible! I hope for you this process is still intriguing and interesting, not purely laborious.

Anyways, I there are some unanswered questions from you I need to answer.

Will get to those quickly!
 
Upvote 0
Hi Danny, not at all!

I only posted back to try to let you know what direction I was going, for you to bear in mind when checking out the huristics, maybe saying 'Well that approach works except for record types xxx, yyy, where we need to use approach zzz'

If we can articulate how the code can recognise the patterns any record type, that would be great.

Best wishes

Alan
 
Upvote 0
So, here are "the heuristics".

https://my.syncplicity.com/share/evrcyfnti0/heuristics-permalinked.xlsx

Well not precisely. For now, it is a nonrandom sampling from each state. I picked one that seemed either weird, long, hard, really easy, really common, or something stuck out about it. In the morning, I will look through the sheet with fresh eyes and start coming out with some useful if then quasi code statements. For instance,

if
$judgename$ concur* in part and dissent* in part

then code 0

which means, if we come across any verified judges name, and then you run across concur or any of it's derivatives and also dissent or any of it's derivatives you know it's a dissent. or simply, a concur and a dissent is a dissent. this appears a lot.

Alrighty then! Until the morning...
 
Upvote 0
Hi Danny, That sounds great!

I'll have a look too, and see if I can find any exceptions to my approach of 'For each sentence, find the judge name and 'concur', 'Dissent', 'took no Part' (and aliases) in the same sentence & return accordingly'. I'm sure there will be - life's never that simple :¬/

Maybe we go for a pattern approach (we need to define how we recognise the pattern), and if no pattern match, go for the above as a fall back.
 
Upvote 0
Haha, prepare yourself for an ugly look into the inconsistencies of 50 state court reporters. But, I expect lots of trial and error and running various versions of the code. What really helped that one time was me hand coding a small section, running the code, and then seeing where the code went wrong.

On that note, could the next version of code possibly add in highlighting of any cell it processes out as a 11,10, or 19? It would just help my eyes not glaze over the digits. Then I know immediately which votes to look at.

-Danny
 
Upvote 0
:)

I thought of that, but then remembered conditional formatting.

I think my biggest issue will be recognising patterns in code, and recognising ends of sentences.
One thing I realised is that 'J.J' and 'C.J.' are NOT judge initials, but shortened form of his/her title (CJ = County Judge or something?)
 
Upvote 0
Alright, I have looked at 100+ examples. Good news: The code as is can handle tons of them. By far the most common format is

SUNDBERG, C.J., ADKINS, OVERTON, ENGLAND, ALDERMAN and McDONALD, JJ., Concur. BOYD, J., Dissents....

I typed up a bunch of stuff about them which is now the up to date

https://my.syncplicity.com/share/00usdwpxem/h_notes.doc file.

Let me know if what I did was helpful, not helpful, would have been good if you could, etc etc


-Danny
 
Upvote 0

Forum statistics

Threads
1,224,550
Messages
6,179,459
Members
452,915
Latest member
hannnahheileen

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top