California allocated $6.87 million to UC Berkeley to develop a database on police misconduct and use-of-force records. (Photo/ Ekaterina Bolovtsova, Pexels)

California allocated $6.87 million in its 2023-24 budget to UC Berkeley to develop the Police Records Access Project, a first-of-its-kind, state-wide database of police misconduct and use-of-force records.

Berkeley’s Institute for Data Science, Graduate School of Journalism and partners will collect, curate and make accessible records that a 2019 state law unlocked for the public. The database will help communities, journalists, public defenders, prosecutors, and police departments develop a deeper understanding of California policing.

“There is an information gap getting in the way of protecting people,” said Saul Perlmutter, the faculty director of the Berkeley Institute for Data Science (BIDS), who initiated this project at Berkeley. “Using data science and artificial intelligence to make that connection offers a classic example of the promises of modern information technologies.”

California researchers, students, public defenders, journalists and advocates have been working for two years to develop this infrastructure. They’ve filed information requests with all 700 state police departments, secured more than 175,000 files, and identified ways technology could enable and accelerate making sense of these records. 

So far, reporters have used this information to reveal inequitable enforcement, physical harm inflicted by officers, and a lack of consequences for police who break the law. These investigations highlight the need, but there’s more to do, a project leader said.

“This is a validation of our work,” said David Barstow, chair of Berkeley’s Investigative Reporting Program, who is helping lead the project. The funding will “transform that effort and give us an opportunity to make enormous headway towards solving a really big problem.”

This work has so far been organized by the Community Law Enforcement Accountability Network (CLEAN). The CLEAN initiative was launched by a consortium that includes BIDS, the National Association of Criminal Defense Lawyers, the California Reporting Project, Stanford University’s Big Local News and groups across Berkeley. Berkeley participants include the Investigative Reporting Program, the Data Science Discovery program and the Effective Programming, Interaction, and Computation with Data Lab. Many of these entities are part of the College of Computing, Data Science, and Society, which recently became the first new college at Berkeley in more than 50 years.

The new state funds are available for three years.

Data science tools, police records ‘enormously consequential’

Project leaders expect it will be years before the public-facing database goes live. Initial work has been painstaking. Up to three people review each file – whether it’s a video, a handwritten note or typed documentation – to extract data and ensure information shared publicly is correct.

Computer scientists are developing cutting-edge data ingestion and data science tools to help organize records and automate and accelerate this process. BIDS is also leading efforts to identify users and uses of the database, so it can ultimately meet their separate needs.

"This is the epitome of why data science as a field can play such an important role in our lives," said Saul Perlmutter, BIDS faculty director. "Data science is supposed to make it possible to use data that's available to accomplish goals and to make the world a better place."

Artificial intelligence is turning previously unusable raw police data into an information resource, Perlmutter said.

Records and resources will continue to be available to reporters through the California Reporting Project before the database is complete. When it becomes public, it could also be used in a number of ways to understand and address systemic and individual issues in state policing.

For example, police captains hiring officers could assess past conduct of candidates from other departments in California. Prosecutors and public defenders could scrutinize a police officer’s credibility for court cases. Communities and families impacted by policing could gain insight into specific incidents and officers. Public policy experts could execute new kinds of research.

This project could also serve as a model for other states seeking to create similar police misconduct resources and for journalists using data tools to report on large records releases quickly and thoroughly.

“We are making and we've made enormous progress. We're extremely optimistic and we think that this funding from the state is exactly what we're going to need to get this thing to the finish line,” said Barstow. “If we can solve this problem here, it could be enormously consequential.”

Read journalists’ findings based on CLEAN collaborations