By Brendan Clemente
This past March, Duke Law’s Professor Brandon Garrett released his newest book, Autopsy of a Crime Lab: Exposing the Flaws in Forensics. Professor Garrett founded the Wilson Center for Science and Justice and studies the use of forensic evidence in criminal cases. Brendan Clemente, Duke Law & Technology Review’s (DLTR) Managing Editor, sat down with Professor Garrett to discuss the book.
Thank you for joining DLTR to discuss your new book, Autopsy of a Crime Lab: Exposing the Flaws in Forensics. What made you want to delve into this topic in this book?
My introduction to forensics came after law school. I took evidence in law school, which I am glad of now that I am teaching it. We did not cover expert evidence. I did not take law and science classes, and I went to law school having turned away from math and science, like most of us lawyers do.
When I was in practice, I worked at a civil rights firm where there were two types of cases one could gravitate toward: police brutality cases and wrongful conviction cases. I told the partners I wanted to work on the police brutality cases. The wrongful convictions heavily involved science, including DNA testing, and that sounded intimidating. I then discovered that policing cases also involved scientific evidence issues, mostly regarding ballistics and autopsy reports, but also police practices expert testimony. One just cannot get away from science as a lawyer.
Despite my interest in police use of force cases, one of my first trials involved a man exonerated by DNA testing who had been convicted in the first place based on bite-mark testimony, as well as eyewitness testimony. I learned a lot about DNA in that case. The state, for whatever reason, did not want to concede the chain of custody relating to DNA. There was no question as to the integrity of the results, but at trial, we had to put on witnesses to discuss every step of the DNA collection process. We had a whole tour of the medical examiner’s lab since we were putting the medical examiner’s DNA analyst on the stand. So, I really got to know the entire DNA testing process.
And there was bite mark evidence in the case, so I was exposed to how forensic evidence can go terribly wrong. In practice, I worked on a series of other wrongful conviction cases, including representing one of the Exonerated Five. People have understandably focused on the false confessions in the case. But digging into the transcript, there was also hair comparison testimony, there was soil comparison testimony, and there had been DNA testing at the time that had excluded each of the five youths.
When I became a law professor, I was really interested in better understanding the role that unreliable forensics played in criminal cases and whether this was something systematic. That general interest crystallized when the National Academy of Sciences contacted me and asked that I report to them regarding the role of forensic expert trial testimony in DNA exoneration cases. By that time, I had made a name for myself with an early study looking at appellate litigation and habeas litigation by DNA exonerees. It is time-consuming to study trial transcripts because it is not easy to even get them. My former boss and friend Peter Neufeld from the Innocence Project had started scanning files of exoneree trial transcripts. We decided to collaborate on the project. That report provided the kernel for a more comprehensive study of DNA exonerees’ trials: my first book, Convicting the Innocent. In that work, I carefully detailed what went wrong at trial, including with the forensics. Yet the trials did not tell the whole story. Sometimes there were errors in the lab that were not apparent at trial. And I knew that forensics could go wrong at the crime scene, or due to bias, or in other hidden ways.
Over time, I started to realize that there were a range of serious issues with forensics. There are issues when police arrive at the crime scene; there are issues with quality control in the lab, and, as troubling as my initial foray into forensics was, there were a lot of other windows that I needed to open. So over time, I decided that I should tackle a separate book looking at how forensics can go wrong and how to solve this complex problem.
What does the typical crime lab look like? The book discusses how the typical crime lab doesn’t look like it does on CSI. How rigorous are crime labs’ processes? How are forensics used in criminal cases?
Crime labs are kind of like policing agencies: they vary widely. There are thousands of police agencies in this country. There are really big professional agencies. There are medium-sized, highly professional agencies. But then there are small police departments which might have a police chief and one other police officer––think Mayberry. There are a huge number of police departments that have fewer than 20 sworn officers, and it is the same way with crime labs. There are a lot of “cop shop” labs where there may just be a handful of people who are part of the police department.
Today, we have hundreds of large, accredited labs required to have certain protocols and standards. They may have to satisfy federal rules to be part of the federal DNA databank. Sometimes states have a large, central state lab, or there are regional labs. In other states, like North Carolina, there may be a state lab, but there are also any number of small police labs. So, it’s really, really varied how much professionalism, training, and resources there are. And there may be people like dental experts who testify about bite marks who are not part of labs but are routinely called by prosecutors as expert witnesses. There are some private labs, especially in the DNA area.
In many jurisdictions, the police officers collect evidence from crime scenes without major involvement of scientists, while in some places, the crime lab is more involved in making sure the evidence is collected properly.
Accreditation is voluntary and is about written protocols but not about quality of work. No one is actually auditing the quality of the work. Crime labs are not like scientific labs or clinical labs, where there is spot-checking, quality assurance, and regulators doing site visits on the labs to make sure that work is being done accurately. Any factory or facility that has emissions will have onsite inspectors that check air quality and emissions. None of that kind of regulation exists for crime labs. So, they’re called labs, but they’re not regulated like labs would be in a medical setting or in a manufacturing setting.
Can you discuss the Brandon Mayfield case, where Mr. Mayfield was wrongly arrested in connection with the 2004 Madrid train bombing based on shaky fingerprint evidence? In that case, there were significant reasons to think the fingerprints from the scene did not match Mayfield’s prints. But the FBI ended up interpreting the results as a “match.” And the identification was presented in court as essentially foolproof scientific evidence from examiners with worlds of experience. How does all that happen?
The Brandon Mayfield case is powerful and cautionary for a number of reasons. Reason number one is that prior to that case, the claim was often made by examiners that fingerprinting was infallible or that, if there were mistakes, it was due to some ill-intentioned, corrupt police examiner, someone who was incompetent or had not been well-trained. This could not have been a more high-profile case: a federal terrorism prosecution. The FBI knew this was a really serious case, and they brought in top people. These were three different FBI or former FBI analysts who were all extremely experienced. So, you couldn’t say that these were people who were incompetent, or who hadn’t been minimally trained, or were at a small lab without standards.
On TV shows, you see multiple people working on a case. In real life, you don’t usually have multiple people working on a case unless it’s a really big case like this one. So that makes the case really compelling––the fact that so many experienced people made errors.
In a more specific way, what makes the case so interesting is the role that different cognitive biases played. And you know, there’s a lot more to be said about the reliability of fingerprinting in general.
But what was really interesting about the case was how the errors came in part from the biases that these experienced analysts had. They may have been biased towards a match because of other information they had about Mr. Mayfield. They knew that he was Muslim and that he was a lawyer in Oregon. It was suspected that he had some connection to Spain, and yet he had never been there. There really were some similarities between his print and the print at the scene, but there were some real differences, and they glossed over the differences. You can see from their process how they glossed over the differences through circular reasoning. They may have been biased in part by stereotypes about who is a terrorist.
They also may have been biased by each other. They may have been reinforcing each other’s confidence. The first one may have been uncertain, but he sees the other two agree and then becomes confident in a match.
Other aspects of the process that they used may have biased them. They were looking at the suspect print and the latent print from the crime scene at the same time. In looking back and forth, one has a tendency to see similarities. It’s like those kids’ puzzles where you’re trying to spot the differences between two playground scenes. It takes a long time to see that the kid on the swing set in one scene is wearing his hat forward, and the kid on the swing set in the other scene is wearing the hat backward. You see the two playgrounds next to each other, and it all looks the same. It’s the same phenomenon as thinking dogs look like their owners and vice versa. When you see two things together a lot, you tend to focus on similarities.
So, there are multiple ways that the process risked cognitive bias: they had task-irrelevant information that may have exacerbated stereotypes, the way they were working with each other, and the way that they were looking at the patterns. These all may have contributed to this high-profile mismatch.
It’s interesting to see that the analysts look at the prints next to each other on a screen and make a human comparison.
It shocks people to hear how low-tech these methods remain. People think that firearms evidence, fingerprint evidence, and other pattern evidence involves technology, statistics, and, well, science. When people learn that the analysts are still just eyeballing it, some find it somewhat terrifying given how many important cases revolve around pattern evidence.
Another thing you stressed throughout the book is the problem with some of the phrases that experts use in court––for example, “match,” “100% certainty,” or “individualization.” How does a juror acquit after hearing them?
I went into this research assuming that kind of exaggerated language and overstated conclusions must really impact jurors, especially after reading cases where experts used exaggerated, forceful language. But over the course of conducting a series of studies, I learned that maybe because people have been hearing about 100% fingerprint certainty––it’s been in our culture for decades now––actually the key word that causes people to place a lot of weight on fingerprint evidence is just “fingerprint.”
When you hear there is some kind of fingerprint match, it doesn’t matter whether you hear 100% certainty, or beyond a reasonable scientific certainty, or just, “I’ve compared them and found an identification.” We tried in these studies all sorts of formulations, and generally, jurors assume the evidence is incredibly powerful because it’s a fingerprint.
When you actually tell them information about how fingerprint evidence can go wrong, then jurors step back, but that’s really contrary to people’s assumptions. People assume that if an expert comes in and tells them something about a fingerprint, that is rock solid evidence just as powerful as DNA.
One of the big policy prescriptions you offer is that judges should tell jurors the percentage likelihood of a fingerprint match. You give the example in the book of computerized fingerprint analysis that provides a confidence percentage instead of using a similarity threshold to display a “match” or “non-match.” Do you think that this could make fingerprint evidence more reliable?
The most important thing is that jurors hear about the limitations of fingerprint evidence. It’s a good thing to move from words to numbers, but jurors can have a hard time with numbers, and the numbers need to be explained clearly. If jurors hear about the strength of a match, and they hear it’s a 1000-to-1 match or a 10,000-to-1 match, they might not know what that means, and for good reason. What they still may be hearing is “fingerprint” and “match,” and whatever the number is, it’s a fingerprint. So, they really need to understand what makes some fingerprint comparisons more reliable than others. The judicial system has let us down by permitting experts to testify without critically examining the limitations of the methods. If an expert is taking the stand, it has to be very clear to the jury what the limitations of the method are. The expert has to be asked and answer questions about it, and in addition, there needs to be a defense expert admitted to explain those limitations.
That relates to how Daubert, the Supreme Court case outlining the standards to admit expert testimony, has tended to work in these cases. For anyone who hasn’t read Daubert, one key factor is whether the expert’s method is scientifically reliable. The reliability of some of the methods we’ve talked about, like fingerprinting, may depend on the situation. Still, other methods, like bite marks, are almost inherently unreliable. Can you speak about why testimony like bite mark testimony has continued to come into courtrooms years after we learn how unreliable it can be?
Even with fingerprint evidence, only comparatively recent studies have been done that can tell us how reliable the method is. We know that there are significant error rates now. And we also know that fingerprint evidence is probably more reliable than some other types of forensics. But it is shocking that we have been using fingerprints for over 100 years, and we have allowed it in court for that long without any careful studies being done about its reliability.
That state of affairs, with no research, continued post-Daubert when there was a supposed reliability test, where judges were supposed to be gatekeepers about the reliability of evidence. If no study has been done, why would we say it’s reliable because it’s been used in the past? The whole point should be that if reliability is unknown, for all we know, this method is astrology. The burden is on the party seeking the introduction of the evidence. If that party doesn’t have scientific evidence through scientific studies of reliability, then the baseline should be that the method is astrology, that it should never get in.
And that is obviously not how the courts have viewed things. What little evidence we have about bite mark evidence suggests dramatic error rates. Nevertheless, courts continue to let the evidence in. There haven’t been adequate studies done, but the insufficiently rigorous error-rate studies that have been done on bite mark comparisons suggest massive error rates. We’re not sure that experts can reliably say that a bite mark is a human bite mark, much less that a bite mark belonged to a particular human. It is shocking to many scholars that courts have been so lackadaisical about bite mark evidence. There was an expectation that Daubert would cause judges to be more rigorous about expert evidence used in criminal cases, and that expectation turned out to be overly optimistic.
While you may not directly address it in the book, how should courts apply stare decisis in cases involving forensics? Do you think that Daubert requires reapplying the reliability standard to potential expert testimony without as much regard for precedent but instead with more sensitivity to updated science?
It does. Daubert does not suggest that one of the factors is, “what did judges say about this ten years ago?” The Court talked about general acceptance in the relevant scientific community, but not general acceptance among judges, who aren’t experts in the scientific discipline. We see many state courts where the old case law predates Daubert, and yet you have courts now relying on law that is no longer good law in order to let in forensic experts. We trust that judges will keep abreast of changes in the law, but unfortunately, in the area of evidence law, it is so hard to be reversed on appeal, and as a result, errors can continue unabated.
There is a potential argument that adopting all of your ideas offered in the book, from audits to new procedures, will be too costly. How should we balance the costs and benefits of new forensics procedures, and are there any low-cost, high-yield solutions you have in mind?
To be sure, crime labs have backlogs. And quality control is not free. But if we want reliable forensics, we do have to pay for it. One reason, though, that labs have backlogs is poor quality control. When you have police officers who aren’t trained in submitting the evidence, labs spend an enormous amount of time testing evidence that isn’t really suitable for testing. So having scientific oversight of the evidence submission process can generate enormous efficiencies.
And there are enormous backlogs, for example, in untested rape kits. One of the challenges there is that often rape kits are submitted in cases where identity isn’t in dispute, or where there are reasons to think that it won’t be valuable evidence, and crime labs aren’t given any guidance because they weren’t involved in the evidence submission process.
Scientific and enforcement priorities are both important. Law enforcement should be generating priorities based on crime patterns and community interests in crime-solving, but we need scientists to be involved to set priorities related to preserving evidence, focusing on evidence that can actually be tested instead of evidence that won’t lead to anything probative. We often have labs re-testing evidence that is unsuitable for any kind of testing.
There are also enormous costs from lacking quality control once things go wrong. Inevitably, they do, and they have. States are spending tens of millions of dollars because their labs were shut down, and thousands of cases have needed to be reopened. Just recently, the metropolitan lab in DC was shut down. That has come at enormous expense. They are now outsourcing everything to private labs. No one knows how many years of cases will have to be reopened and reexamined having to do with misconduct in their firearms unit. Two years ago, it may have seemed that rigorous quality controls in that lab would cost too much. But I would hope that all would agree now that if they had only spent some money on quality control, they could have avoided disaster.
Given the number of scandals and audits at crime labs, I do not understand why we can’t sit down and draft comprehensive regulations for crime labs. A big part of my work in the upcoming years will be to draft model legislation to regulate crime labs.
Finally, what are some of the institutional incentives in play here, particularly when crime labs aren’t independent?
Crime labs may not have independent scientific oversight because the labs are not independent or scientific. A lab may have been an arm of law enforcement and may report to law enforcement or to prosecutors. It may be part of the state department of public safety. Sometimes it may have an independent board, but that board does not meet often and does not meaningfully regulate the lab. There are very few independent labs in the country. Some of the ones that are independent also don’t have quality control. There are some labs with budgetary independence, like the metropolitan lab, but not independent oversight or quality control. That doesn’t mean that making a lab independent is a bad idea. But with independence has to come accountability.
We can require scientific oversight and quality control. In clinical labs, federal legislation requires the adoption of quality control standards, testing protocols, and site visits. Law enforcement all around the country is dependent on federal grants––these federal grant programs could be tied to quality controls just as for clinical labs. And states could get involved, like the Forensic Science Commission in Texas that has exercised some oversight. Outside of Texas, we really haven’t seen that kind of oversight. More states could be like Texas. Texas has been the pioneer in forensics reform, in part because of hard lessons learned in wrongful conviction cases. I hope other states will follow Texas’s lead.
Thank you, Professor. We really appreciate your time.