Tens of Thousands of Pacer Documents Could Have Failed Redactions, Study Suggests
Those black boxes used to hide redacted information in court documents filed with Pacer don’t always work, according to a study that suggests tens of thousands of the online filings may have failed redactions.
Timothy Lee, a PhD candidate in computer science at Princeton, conducted the study. He found 194 documents with not-so-hidden information that included trade secrets; personal information; and names of witnesses, jurors or plaintiffs. Lee wrote about his study at Freedom to Tinker, a blog hosted by Princeton’s Center for Information Technology Policy.
Lee explains that PDF documents have multiple layers, so the original text may still be present beneath a blacked-out rectangle. Extracting the information can be as easy as copying and pasting.
Lee found the 194 documents by writing a computer program to detect redaction boxes in the 1.8 million Pacer documents in Princeton’s collection. The software identified about 2,000 documents with redactions. Of those, 194 had redactions that didn’t work.
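The article does not describe how Lee's program works internally, so the following is only a minimal sketch of the general idea, not his method. It assumes a PDF parser has already produced, for each page, the text spans with their bounding boxes and the filled black rectangles; the names `Box`, `TextSpan`, `find_exposed_text` and the `min_covered` threshold are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned rectangle in page coordinates, from (x0, y0) to (x1, y1)."""
    x0: float
    y0: float
    x1: float
    y1: float

    def area(self) -> float:
        return max(0.0, self.x1 - self.x0) * max(0.0, self.y1 - self.y0)

    def overlap_area(self, other: "Box") -> float:
        # Width and height of the intersection; negative means no overlap.
        width = min(self.x1, other.x1) - max(self.x0, other.x0)
        height = min(self.y1, other.y1) - max(self.y0, other.y0)
        return max(0.0, width) * max(0.0, height)


@dataclass(frozen=True)
class TextSpan:
    """A run of extractable text and where it sits on the page."""
    text: str
    box: Box


def find_exposed_text(spans, redaction_boxes, min_covered=0.9):
    """Return text that is still extractable although a black box covers it.

    A span counts as covered when at least `min_covered` of its area lies
    under a single rectangle. The 0.9 default is an illustrative assumption.
    """
    exposed = []
    for span in spans:
        span_area = span.box.area()
        if span_area == 0:
            continue  # skip degenerate spans to avoid dividing by zero
        for rect in redaction_boxes:
            if span.box.overlap_area(rect) / span_area >= min_covered:
                exposed.append(span.text)
                break  # one covering rectangle is enough
    return exposed


# Made-up page: three text spans on one line, one black box over the name.
spans = [
    TextSpan("The witness,", Box(72, 700, 140, 712)),
    TextSpan("Jane Roe,", Box(145, 700, 200, 712)),
    TextSpan("testified on Tuesday.", Box(205, 700, 320, 712)),
]
redaction_boxes = [Box(143, 698, 202, 714)]

print(find_exposed_text(spans, redaction_boxes))  # ['Jane Roe,']
```

A properly redacted document would have no text span left under the rectangle at all, so any non-empty result signals a redaction that hides the text visually without removing it.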
Lee notes that Pacer reportedly has about 500 million documents. He cautioned that Princeton’s Pacer documents aren’t a random sample, so it’s difficult to estimate just how many Pacer documents have similar problems. Still, he writes, it’s safe to say there are thousands, and probably tens of thousands, of documents in Pacer with failed redactions.
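For a rough sense of scale, the figures reported above can be plugged into a naive proportional estimate. This is a back-of-the-envelope sketch only: it assumes Princeton's collection is representative of Pacer as a whole, which is exactly the assumption Lee cautions against, and the function name `naive_extrapolation` is hypothetical.

```python
def naive_extrapolation(failed_in_sample, sample_size, population_size):
    """Scale the sample's failure rate up to the whole population.

    Only meaningful if the sample is representative, which the
    Princeton collection is not known to be.
    """
    return failed_in_sample / sample_size * population_size


# Figures as reported in the article.
sample_size = 1_800_000      # Pacer documents in Princeton's collection
with_redactions = 2_000      # approximate number with redaction boxes
failed = 194                 # documents whose redactions did not work
pacer_total = 500_000_000    # reported size of Pacer

estimate = naive_extrapolation(failed, sample_size, pacer_total)
failure_share = failed / with_redactions

print(f"Share of redacted documents that failed: {failure_share:.1%}")
print(f"Naive estimate for all of Pacer: about {estimate:,.0f} documents")
```

The naive figure comes out a little under 54,000 documents, which falls within the "tens of thousands" range Lee describes, though the non-random sample means the true number could differ substantially.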
Lee recommends that the federal judiciary use software like the program he developed to scan Pacer documents for failed redactions. He advises lawyers who want to avoid the problem to consult the redaction recommendations developed by the National Security Agency.