Predictive coding: the future of electronic discovery?

(This article was originally published 6/20/12.)

If you keep up with news of the legal technology world, you’ve already heard about something called predictive coding, and about why it’s a game-changer in the field of eDiscovery (electronic discovery). And with recent legal cases both showing federal support of the technology and attempting to regulate its use, the judicial system seems to assume it’s here to stay.

And why shouldn’t it?

Why We Love It

Let’s say ne’er-do-well John E. Guilt got caught embezzling company funds and is being brought to court for it. He doesn’t much like the idea of jail time and is claiming innocence, the greedy rascal. Prosecutors are now faced with the task of sifting through all his personal and company emails from the last five years to look for evidence, which wouldn’t be so bad if there weren’t 3 million of them (a relatively normal figure) to go through before the case against him can be fully prepared. And, with recent legal events like the rulings of Judges Peck and Carter in da Silva Moore v. Publicis Groupe (which supported a preference for current predictive coding software over manual review techniques) and the US v. Metter et al ruling (which limits the amount of time the prosecution can take to analyze and present electronic evidence), the prosecutors handling Mr. Guilt’s case are most likely going to turn to predictive coding to help them churn out their evidence on time.

Mr. Guilt’s prosecutors use a well-known predictive coding package like Recommind’s Axcelerate, plug in Mr. Guilt’s emails, babysit it for the first few trial runs, then sit back and wait for their results to pop out. The software gets through the emails in a few days (rather than the months a team of poorly-equipped manual reviewers might have taken), organizes the results for efficient access, cross-lists pieces of related information, avoids the false positives and negatives that generally come from manual review, automatically prioritizes documents by importance, and does it all 60-90% faster and cheaper than the team of unmotivated, underpaid interns who would have done the job using clumsy keyword-based searches in years past. The cherry on top? Axcelerate does it all with higher consistency and quality than any manual review team armed with a notepad and a Google-type search engine ever could. What’s not to love?
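For the curious, here is a toy sketch of the general idea behind predictive coding: a human labels a small seed set of documents, the program learns which words lean "relevant," and then it ranks the unreviewed pile. This is only an illustration under our own assumptions. It is not how Axcelerate or any commercial product works, and the function names (`train`, `rank`) and the sample emails are made up for the example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def train(seed_docs):
    """Learn a per-word weight from (text, is_relevant) pairs labeled by a reviewer.

    The weight is a smoothed log-odds: positive means the word shows up
    more in relevant documents, negative means the opposite.
    """
    counts = {True: Counter(), False: Counter()}
    for text, is_relevant in seed_docs:
        counts[is_relevant].update(tokenize(text))
    vocab = set(counts[True]) | set(counts[False])
    totals = {label: sum(c.values()) for label, c in counts.items()}
    weights = {}
    for word in vocab:
        # Add-one smoothing so unseen words never produce log(0).
        p_rel = (counts[True][word] + 1) / (totals[True] + len(vocab))
        p_not = (counts[False][word] + 1) / (totals[False] + len(vocab))
        weights[word] = math.log(p_rel) - math.log(p_not)
    return weights


def rank(weights, docs):
    """Return docs sorted from most to least likely relevant."""
    scored = [(sum(weights.get(t, 0.0) for t in tokenize(d)), d) for d in docs]
    return [d for _, d in sorted(scored, key=lambda pair: pair[0], reverse=True)]


# Hypothetical seed set: a reviewer has tagged four emails by hand.
seed = [
    ("wire the funds to the offshore account", True),
    ("transfer funds offshore quietly", True),
    ("lunch menu for friday", False),
    ("team lunch on friday at noon", False),
]
weights = train(seed)
ranked = rank(weights, ["friday lunch plans", "move the funds offshore"])
```

Notice that the ranking is only as good as the text the program can actually read, which matters for everything that follows.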

Why We’re Not Pinning Our Hopes and Dreams On It

Your much-abused interns (and, especially, the third-party computer-forensic investigator that you’ve hired to help nail Johnny Guilt) have more going for them than you may realize. While companies like Recommind are quick to point out that manual review misses 25-50% of documents, they don’t claim predictive coding is perfect, either. In fact, as Recommind’s Craig Carpenter puts it, “perfection is not the goal”; improvement over manual review is. And the aforementioned court rulings aren’t wholehearted endorsements of the technology, either. Judge Carter from the da Silva appeal wrote, “There simply is no review tool that guarantees perfection…. [t]here are risks inherent in any method of reviewing electronic documents.” We tend to agree, and for a couple of important reasons.

First of all, predictive coding is absolutely perfect…for the honest criminal who knows he should go to jail, feels really really bad about what he did, and wants to make it up to society by gift-wrapping all the incriminating evidence for the prosecution. (We’d really like to meet one of those, but we’re also still holding out for proof of unicorns and leprechauns.) More than likely, your tech-savvy criminal is going to want to hide or destroy (spoliate) electronic evidence if he knows he’s been caught, so there’s a good chance he’s going to try to get rid of it or, barring that, to encrypt it. Encrypting electronic evidence is unexpectedly successful against predictive coding, because the software often can’t read encrypted files and won’t list them in search results. The software might have noticed something unreadable was there, but it’s probably not going to tell you about it. And sometimes, your really tech-savvy criminal will be able to remove evidence and leave only an indicator that something was deleted. Unfortunately, your predictive coding software isn’t going to find that, either.
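To make the encryption problem concrete, here is a minimal sketch of the kind of check a human examiner might run on the side: flag any file whose bytes look statistically random, since encrypted data tends to look that way. The 7.5 bits-per-byte threshold and the function names are our own assumptions for illustration, and compressed files (ZIPs, JPEGs) will trip the same alarm, so treat a hit as "a person should look at this" rather than proof of anything.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of the data in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def looks_unreadable(data: bytes, threshold: float = 7.5) -> bool:
    """Guess whether the data is encrypted or compressed.

    The threshold is an assumed cut-off for illustration, not a standard.
    """
    return shannon_entropy(data) >= threshold


def flag_for_human_review(files):
    """Given a mapping of file name to bytes, list the names worth a manual look."""
    return sorted(name for name, data in files.items() if looks_unreadable(data))
```

Ordinary English text usually scores well under the threshold, while well-encrypted data sits very close to 8. The point is not the exact number; it is that the unreadable file gets reported instead of silently skipped.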

In addition to encryption and deletion, there’s also the option to simply hide the stuff you don’t want the lawyers to find, and predictive coding software won’t always see it. For instance, there’s something called alternate data streams, a feature of the NTFS file system that lets you attach a hidden stream of data to an ordinary file. Your software might find the visible “shell” document, which is a flier for the homeless shelter where you’ve been volunteering twice a week, but it won’t see the document tucked away behind it, creatively titled “My Scheme to Take Over the World.” For the especially devious, there’s also the option of hiding documents inside completely unrelated file formats (steganography), like hiding a document in an image file. Once again, predictive coding will find the picture, but not what’s hidden within it.
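As a small illustration of the hidden-in-an-image problem, here is a sketch that checks a PNG for one of the crudest hiding tricks: tacking extra bytes onto the end of the file, after the chunk that officially closes the image. Image viewers ignore those bytes, so the picture looks normal. This is an assumption-laden example, not a steganography detector. It does not verify checksums, and it will not catch data hidden inside the pixel values themselves.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def trailing_bytes_after_png(data: bytes) -> bytes:
    """Return whatever bytes follow the PNG's closing IEND chunk.

    A PNG is a fixed 8-byte signature followed by chunks, each laid out as
    4-byte big-endian length, 4-byte type, payload, and a 4-byte CRC.
    """
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    pos = len(PNG_SIGNATURE)
    while pos + 8 <= len(data):
        length, chunk_type = struct.unpack(">I4s", data[pos:pos + 8])
        pos += 8 + length + 4  # skip chunk header, payload, and CRC
        if chunk_type == b"IEND":
            return data[pos:]  # empty for a clean file
    raise ValueError("no IEND chunk found; file is truncated or malformed")
```

An empty result means nothing was appended. Anything else is a reason for an examiner to open the file in a hex editor and see what came along for the ride.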

And, last but not least, there’s the issue that some criminals are intimately familiar with predictive coding software, and they know how to defend themselves against it using anti-forensic techniques. It’s the reason why you may not want to put one of those “Protected by ADT” signs in your front yard if you have an ADT home security system: if you’re targeted by a criminal who used to work for ADT and knows how to get around it, there’s a good chance he’ll rob you blind, expensive security system or no. If predictive coding technology is ruled legally sufficient for all methods of electronic discovery, criminals will be able to accurately predict the methods which will likely incriminate them, and they can learn how to avoid them. It’s much more difficult for a criminal to know the methods of examination and analysis that, say, a forensic investigator would use, because the investigator will use a wider range of tools (some of which use predictive coding, and some of which don’t).

Are we trying to start a blood-feud with all advocates of predictive coding technology? Not at all. We think predictive coding programs are great tools, but people are often quick to assume that they can replace the whole toolbox. So what method has the efficiency of predictive coding without losing the intelligence and problem-solving abilities of a human examiner? As you’ve probably guessed, we say that nothing can beat a forensic computer investigator. The right investigator has experience, certifications, the “imagination” to think of outside-the-box solutions, a thorough knowledge of the capabilities of hardware and software, expertise in a wide range of popular and lesser-known investigation tools, and the ability to put himself in the shoes of another computer expert. Best of all, you never have to pay to download his newest update. You can find the one we recommend here.

