The Supreme Court of the United States has essentially unconstrained discretion to set its own docket. For a case to earn review, four of the nine Justices simply have to decide that it is sufficiently "important." But what makes a case important enough to merit the Supreme Court's attention? This project analyzes the corpus of the Supreme Court's decisions, and more specifically its descriptions of its decisions to grant review, in order to discern what counts as "important" when it comes to Supreme Court review.
Problem
The Supreme Court of the United States has essentially unconstrained discretion to set its own docket. For a case to earn review, four of the nine Justices have to decide that it is sufficiently "important." But what qualities make a case important enough to deserve the Supreme Court's attention? Our project aims to answer this question by analyzing a large subset of the Court’s decisions dating back to 1925. Within the broader fields of legal analysis and natural language processing, the project adds to our understanding of the often-vague justifications the Supreme Court gives for taking on cases. By applying word vectors to a vocabulary of legal jargon, the project also evaluates how well word vectors perform on this type of dataset. Understanding the Court’s decision making matters because the few cases that reach the Supreme Court have a significant impact on laws, culture, and the course of history. Furthermore, over the course of the semester, the project has laid the foundation for future analysis of how the Supreme Court's criteria for taking cases have evolved across different eras.
Goal
Our work this semester can be viewed as a search for a methodology for analyzing the Supreme Court's decisions to grant certiorari in discrete periods of time, such as the Warren Court and the Taft Court. Our ultimate goal was to see changes over time. Our methodology consisted of first computing term frequencies, then training word vectors, then using expert evaluation to validate the quality of those word vectors. We hope to reach a result that lets us pinpoint what influenced the Supreme Court to grant certiorari in a case. In particular, we hope to answer the following question: have the guidelines for granting certiorari remained constant since 1925, or have they changed over time as the membership of the Court slowly changes every few years? Ideally, we would be able to identify the kinds of cases or decisions that were common in each time period.
Data
The corpus comes from Washington University Law’s Supreme Court Database. The Database contains over two hundred pieces of information about each case decided by the Court between the 1791 and 2018 terms. The details recorded for each case include the identity of the court whose decision the Supreme Court reviewed, the parties to the suit, the legal provisions considered in the case, and the votes of the Justices. The dataset also includes more subjective categories, most relevantly the certiorari reason. Granting a “writ of certiorari” means that at least four Justices have determined that the circumstances of the case in question are sufficient to warrant review by the Supreme Court. The certiorari reason was coded manually using expert judgment, which may introduce some bias into the corpus.
Solution/Model
We observed that the cases were stored as HTML files, with most case details in the main body paragraphs enclosed in <p> </p> tags. We determined that a regular expression was the most effective way to extract the information within these tags. First, we ran a regular-expression match on the data to collect every case paragraph containing “certiorari” or “cert”. These paragraphs were then loaded into a Pandas dataframe under the column name “cert_paragraph”, so that each set of paragraphs could be associated with its corresponding case.
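The extraction step can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the project's actual code: the patterns, the function name `extract_cert_paragraphs`, the sample HTML, and the `case_id` key are all assumptions made for the example. In particular, matching “cert” only as a whole word (so that “certain” is not picked up) is our assumption about the intended behavior.

```python
import html
import re

# Body of each <p>...</p> element; DOTALL lets a paragraph span several lines.
PARAGRAPH_RE = re.compile(r"<p\b[^>]*>(.*?)</p>", re.IGNORECASE | re.DOTALL)
# "cert" or "certiorari" as a whole word (assumption: "certain" should not match).
CERT_RE = re.compile(r"\bcert(?:iorari)?\b", re.IGNORECASE)
# Any remaining inline tag such as <i> or <a href=...>.
TAG_RE = re.compile(r"<[^>]+>")


def extract_cert_paragraphs(html_text):
    """Return the plain text of every <p> paragraph that mentions cert/certiorari."""
    paragraphs = []
    for body in PARAGRAPH_RE.findall(html_text):
        text = html.unescape(TAG_RE.sub("", body))  # drop inline tags, decode entities
        text = " ".join(text.split())               # collapse whitespace and newlines
        if CERT_RE.search(text):
            paragraphs.append(text)
    return paragraphs


# Hypothetical sample document, invented for illustration only.
sample = """<html><body>
<p>We granted <i>certiorari</i> to resolve the conflict.</p>
<p>The judgment is affirmed.</p>
<p class="x">Cert. was denied in the
companion case.</p>
</body></html>"""

# One row per case; a list of such rows could be passed to pandas.DataFrame
# to obtain the "cert_paragraph" column described above.
rows = [{"case_id": "example-case", "cert_paragraph": extract_cert_paragraphs(sample)}]
print(rows[0]["cert_paragraph"])
```

Regular expressions are brittle on irregular HTML, so a sketch like this only suits files with simple, non-nested paragraph markup.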
After extracting the words, we calculated word frequencies and created a word bank, excluding stop words like “because” and “the”. On further analysis, words with similar meanings showed similar trends in the word bank. From this word bank, we selected a list of words that Professor Narechania deemed important, based on his domain knowledge. Using this list, we applied Word2Vec, a word embedding algorithm.
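A word-frequency count of this kind might look like the sketch below. The stop-word set, the tokenization rule (lowercase runs of letters), and the two sample paragraphs are assumptions chosen for illustration; the project's actual stop-word list is not given in this write-up.

```python
import re
from collections import Counter

# Small illustrative stop-word set (assumption; a real list would be much longer).
STOP_WORDS = {"the", "because", "a", "an", "of", "to", "in", "and", "is", "we", "that", "this"}


def word_frequencies(paragraphs, stop_words=STOP_WORDS):
    """Count lowercase alphabetic tokens across paragraphs, skipping stop words."""
    counts = Counter()
    for paragraph in paragraphs:
        tokens = re.findall(r"[a-z]+", paragraph.lower())
        counts.update(token for token in tokens if token not in stop_words)
    return counts


# Hypothetical paragraphs, invented for illustration only.
paragraphs = [
    "We granted certiorari because of the importance of the question.",
    "Certiorari was granted to resolve the conflict.",
]
freqs = word_frequencies(paragraphs)
print(freqs.most_common(2))
```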
The corpus was divided into time periods corresponding to the presiding Chief Justice. For the Roberts Court, however, there are only two non-null certiorari decisions. Because of insufficient data, the Roberts and Vinson Courts were excluded. Initially, each “block” showed very similar topics. This was expected, since certiorari decisions overall are very similar to one another in content, replete with the same names and jargon for the various legal actors and processes. After we removed "generic" terms (the list is in the code and includes united states, decision, judgement, trial, question), some notable differences between the courts appeared, particularly the presence and absence of “New York” and “California.” The topics “federal” or “state” appear in all examined courts, which is consistent with our previous understanding: the Supreme Court often takes cases in order to resolve conflicts between states’ interpretations and the federal interpretation of the law. Notably, “habeas” is present in both the Burger and Warren Courts and absent from the rest. The Hughes, Taft, and Stone Courts all include some mention of tax, property, income, or bankruptcy, a financial theme not present in the Burger and Warren Courts.
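One way to assign cases to a court era is sketched below. The boundary years are the calendar years in which each Chief Justice took office; treating a case's year as a simple calendar-year lookup is a simplifying assumption, so cases decided in a changeover year may be misassigned. The function names and the `(year, paragraph)` input shape are hypothetical.

```python
from bisect import bisect_right
from collections import defaultdict

# (year the Chief Justice took office, court name); approximate to the calendar year.
COURT_STARTS = [
    (1921, "Taft"),
    (1930, "Hughes"),
    (1941, "Stone"),
    (1946, "Vinson"),
    (1953, "Warren"),
    (1969, "Burger"),
    (1986, "Rehnquist"),
    (2005, "Roberts"),
]
_START_YEARS = [year for year, _ in COURT_STARTS]


def court_for_year(year):
    """Return the court whose start year is the latest one not after `year`."""
    index = bisect_right(_START_YEARS, year) - 1
    if index < 0:
        raise ValueError(f"year {year} precedes the first court in the table")
    return COURT_STARTS[index][1]


def group_by_court(cases):
    """Group (year, paragraph) pairs into a dict keyed by court name."""
    blocks = defaultdict(list)
    for year, paragraph in cases:
        blocks[court_for_year(year)].append(paragraph)
    return dict(blocks)


# Hypothetical input, invented for illustration only.
blocks = group_by_court([(1925, "paragraph A"), (1960, "paragraph B"), (1962, "paragraph C")])
print(blocks)
```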
Word embeddings give words a numerical representation. Both the Continuous Bag of Words (CBOW) model and the Skip-gram model learn word representations using neural networks. In the CBOW architecture, the model predicts a word from a window of surrounding context words, and the order of the context words does not affect the prediction. The Skip-gram architecture, by contrast, uses the current word to predict the surrounding window of context words, and it weighs nearby words more heavily than distant words. In order to fully understand the corpus, we used both architectures with the following different cuts:
CBOW, small window (3), whole corpus, .5 cos similarity cutoff
CBOW, medium window (6), whole corpus, .5 cos similarity cutoff
CBOW, large window (9), whole corpus, .5 cos similarity cutoff
Skipgram, small window (3), whole corpus, .5 cos similarity cutoff
Skipgram, medium window (6), whole corpus, .5 cos similarity cutoff
Skipgram, large window (9), whole corpus, .5 cos similarity cutoff
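The six configurations above, and the difference between the two architectures, can be illustrated with the following sketch. It does not train a model; it only enumerates the settings and shows which (input, target) examples each architecture would derive from a sentence for a given window. The key names `sg` and `window` mirror the parameters of gensim's `Word2Vec` class (`sg=0` for CBOW, `sg=1` for Skip-gram); that the project used gensim is an assumption, since the write-up does not name the library. The helper names and the sample sentence are hypothetical.

```python
from itertools import product

ARCHITECTURES = {"CBOW": 0, "Skipgram": 1}       # value follows gensim's `sg` flag
WINDOWS = {"small": 3, "medium": 6, "large": 9}  # context window sizes from the list
COS_SIM_CUTOFF = 0.5

# One settings dict per line of the list above (2 architectures x 3 windows).
configs = [
    {"name": f"{arch}-{size}", "sg": sg, "window": window, "cutoff": COS_SIM_CUTOFF}
    for (arch, sg), (size, window) in product(ARCHITECTURES.items(), WINDOWS.items())
]


def context_window(tokens, i, window):
    """Tokens within `window` positions of index i, excluding tokens[i] itself."""
    return tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window]


def training_examples(tokens, window, sg):
    """CBOW (sg=0): (context words, target). Skip-gram (sg=1): (target, one context word)."""
    if sg == 0:
        return [(context_window(tokens, i, window), tokens[i]) for i in range(len(tokens))]
    return [
        (tokens[i], context)
        for i in range(len(tokens))
        for context in context_window(tokens, i, window)
    ]


# Hypothetical tokenized sentence, with a window of 1 to keep the output short.
tokens = ["court", "granted", "certiorari", "today"]
print(training_examples(tokens, window=1, sg=0))
print(training_examples(tokens, window=1, sg=1))
```

Larger windows pull in more distant context words, which tends to capture topical rather than purely syntactic similarity; that is one reason to compare several window sizes.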
The resulting word vectors provide a mathematical relationship between different words and phrases in the corpus. By mapping each word or phrase to a high-dimensional vector space, we are able to calculate the cosine similarity between any two word vectors and obtain a measurable relationship between the words. Our approach finds the word vectors with the highest cosine similarity to several indicator words, including “important”, “questions_presented”, “certiorari”, “importance”, and “reasoning”, in order to understand what the Court truly considers important when deciding to accept a case. For example, we found that “revenue_laws” had high cosine similarity with many of these indicator words, suggesting that the Supreme Court has perhaps historically accepted cases because it believed the issue of revenue laws to be particularly important.
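The similarity lookup can be sketched as follows. Cosine similarity is the dot product of two vectors divided by the product of their lengths. The three-dimensional vectors below are toy values invented purely to make the example runnable; they are not results from the project, and real embeddings would have many more dimensions. The function names are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Dot product of u and v divided by the product of their Euclidean norms."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_u * norm_v)


def neighbors_above_cutoff(query, vectors, cutoff=0.5):
    """Words whose similarity to `query` is at least `cutoff`, most similar first."""
    query_vector = vectors[query]
    scored = [
        (word, cosine_similarity(query_vector, vector))
        for word, vector in vectors.items()
        if word != query
    ]
    kept = [(word, score) for word, score in scored if score >= cutoff]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


# Toy vectors, invented for illustration only (not project output).
toy_vectors = {
    "important": [1.0, 0.0, 0.0],
    "revenue_laws": [0.9, 0.1, 0.0],
    "conflict": [0.6, 0.8, 0.0],
    "petitioner": [0.0, 0.0, 1.0],
}
for word, score in neighbors_above_cutoff("important", toy_vectors):
    print(f"{word}: {score:.3f}")
```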
Going forward, we plan to build word vector models for different subsets of the decisions over time, to see how the Court’s criteria for importance have changed over the years.
We relied heavily on Professor Narechania’s expertise to gain insights and to confirm which word vector parameters worked best. In the future, we will explore more windows and cuts to gain a more comprehensive picture. With that picture in hand, we will focus on determining which factors are most influential in persuading the Supreme Court to grant certiorari. When extracting our certiorari-reason paragraphs, we also found that some cases were not in HTML format and thus could not be extracted. We will work on extracting these cases and using them to calculate new word embeddings, to see whether they affect our final result.
Ankur Jain
Kat de Jesus Chua
Natasha Batra
Anwen Wu