Technology

When queried by judge, chatbot had less faith in its output than expert who used it



(Photo illustration by Sara Wadford/ABA Journal)

When confronted with evidence that an expert witness used a chatbot to cross-check lost-value calculations, a judge in Saratoga County, New York, decided to query the source.

“Are your calculations reliable enough for use in court?” New York Judge Jonathan Schopf asked Microsoft Copilot, a generative artificial intelligence chatbot.

“When it comes to legal matters,” the chatbot replied, “any calculations or data need to meet strict standards. I can provide accurate info, but it should always be verified by experts and accompanied by professional evaluations before being used in court.”

Schopf also asked Microsoft Copilot to calculate the value of $250,000 invested in the Vanguard Balanced Index Fund from Dec. 31, 2004, through Jan. 31, 2021. He posed the question on three different computers and got three different answers: $949,070, $948,209 and $951,000.
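To put those three figures in perspective, here is a minimal sketch that measures how far apart they are and what annual growth rate each one would imply. It uses only the numbers and dates reported above. The function names and the 365.25-day year are assumptions made for illustration, and the sketch does not check any answer against the fund's actual returns.

```python
from datetime import date

# Figures reported in the article: the principal, the dates and the three answers.
PRINCIPAL = 250_000
ANSWERS = [949_070, 948_209, 951_000]
START, END = date(2004, 12, 31), date(2021, 1, 31)


def spread(values):
    """Return (absolute gap, gap as a fraction of the smallest value)."""
    gap = max(values) - min(values)
    return gap, gap / min(values)


def implied_annual_return(principal, final_value, start, end):
    """Annualized growth rate implied by a start and end value.

    Assumes a 365.25-day year; this is an illustrative convention,
    not the method used by the court, the expert or the chatbot.
    """
    years = (end - start).days / 365.25
    return (final_value / principal) ** (1 / years) - 1


gap, rel = spread(ANSWERS)
print(f"Gap between highest and lowest answer: ${gap:,} ({rel:.2%})")
for value in ANSWERS:
    rate = implied_annual_return(PRINCIPAL, value, START, END)
    print(f"${value:,} implies roughly {rate:.2%} a year")
```

A deterministic calculation like this returns the same result on every run and every machine, which is the property the judge found missing when the same question produced three different totals.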

Ars Technica covered Schopf’s Oct. 10 decision.

Schopf “found that Copilot had less faith in its outputs than [the expert witness] seemingly did,” Ars Technica concluded.

The expert testified for a trust beneficiary who said a Bahamas rental property should have been sold as an estate asset in 2004, rather than by the trustee in January 2022. In the intervening years, the trustee, who is the beneficiary’s aunt, had sometimes traveled to the property, combining upkeep with vacation use.

The property sold for $485,000 in 2022, netting $323,721 after operating losses. The son, who is the trust beneficiary, had contended that he could have invested the sales proceeds if the property had been sold earlier.

Citing the inherent reliability problems of AI, Schopf concluded that lawyers have a duty to disclose its use and any evidence that it generated. Courts should then hold a hearing to determine whether the evidence can be admitted based on general acceptance in the relevant field.

Schopf commented on the chatbot use even though he found that the son had failed to prove that his aunt breached her fiduciary duties. Even if she had breached her duties, the son failed to prove damages, the judge said.

Ars Technica spoke with Eric Goldman, an internet law expert, who told the publication that attorneys retain expert witnesses for their specialized expertise.

“It doesn’t make any sense for an expert witness to essentially outsource that expertise to generative AI,” Goldman said.

The case is Matter of Weber.
