Mata v. Avianca
Two lawyers filed a brief with six fake cases. The judge noticed.
What They Said
Steven Schwartz, a 30-year veteran attorney at the New York firm Levidow, Levidow & Oberman, was handling a routine personal injury claim against Avianca Airlines on behalf of his client Roberto Mata. Avianca moved to dismiss on statute of limitations grounds. Schwartz needed case law to push back.
Schwartz had heard of ChatGPT. He used it to research supporting precedents and, by his own later sworn testimony, “did not comprehend that it could fabricate cases.” When ChatGPT returned citations, he asked the chatbot whether the cases were real. The chatbot said they were. He filed the brief.
What Actually Happened
Opposing counsel could not find the cases. Neither could the court’s clerks. Six of the citations in the brief filed with the U.S. District Court for the Southern District of New York, including Varghese v. China Southern Airlines, Shaboon v. Egyptair, and Petersen v. Iran Air, did not exist. ChatGPT had invented the case names, the docket numbers, and the holdings, and had attributed the opinions to real judges. It had even generated plausible-looking quoted passages.
Judge P. Kevin Castel ordered Schwartz and his colleague Peter LoDuca to appear at a sanctions hearing in June 2023. Schwartz testified under oath that he had assumed ChatGPT was a “super search engine” and that he had no idea the model could hallucinate. The hearing transcript became required reading in legal-tech circles. Castel imposed a $5,000 sanction jointly on the two attorneys and the firm and ordered them to send written notice of the sanction to every judge falsely identified as the author of one of the fictitious opinions.
The story dominated legal news coverage for weeks. The firm’s reputation took years to repair. The case became a standard citation in the bar association ethics opinions on generative AI that followed. By 2024, at least seven other lawyers across the U.S. and Canada had been sanctioned for the same error, and judges in multiple jurisdictions began requiring attorneys to certify whether AI was used in their filings.
The Root Cause
Schwartz treated a generative model as a retrieval system. ChatGPT was trained to produce plausible-sounding text; it was not connected to a legal research database. The model had no way to distinguish between citing a real opinion and writing one. Schwartz’s verification step — asking the chatbot to confirm the cases were real — was structurally meaningless because it was the same system that had invented them.
The second failure was professional. A 30-year litigator used a tool he did not understand on a federal filing without running a single Westlaw or LexisNexis search to verify the output. The firm had no policy on AI use, no training, and no sampling check on attorney work product. Peter LoDuca, who signed the brief as attorney of record, filed it without independent review.
The Pattern to Watch For
Generative models are persuasive in a way that retrieval systems are not. They do not return “no results found.” They return a confident, fluent answer to any question, which is exactly the failure mode in regulated work. If your professionals — lawyers, doctors, auditors, advisors — are using a generative tool where their license depends on verifiable citations, the output must be cross-checked in a system of record. There is no other safe configuration.
What You Should Steal
Levidow’s mistake was the absence of a verification step that any first-year associate would have performed before AI existed. Codify the rule: any external citation produced by an AI tool must be opened in the system of record — Westlaw, PubMed, Bloomberg, your CRM — before it appears in a deliverable. Make the verification log auditable. The check takes three minutes and would have prevented every sanction case that has followed Mata.
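One way the rule above might be enforced in a drafting pipeline is sketched below in Python. This is a minimal illustration under stated assumptions, not a description of any real product: the names Citation, VerificationLog, unverified_citations, and SYSTEM_OF_RECORD are all hypothetical, and the set lookup stands in for whatever independent check your organization uses (for example, a person opening the case in Westlaw and recording the result). The one requirement carried over from the text is that the lookup must not be the AI tool that produced the citation.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Citation:
    text: str           # the citation exactly as it appears in the draft
    ai_generated: bool  # True if an AI tool produced or suggested it


@dataclass
class VerificationLog:
    """Append-only record of who checked which citation, and the result."""
    entries: List[Dict[str, object]] = field(default_factory=list)

    def record(self, citation: Citation, reviewer: str, found: bool) -> None:
        self.entries.append({
            "citation": citation.text,
            "reviewer": reviewer,
            "found_in_system_of_record": found,
            "checked_at": datetime.now(timezone.utc).isoformat(),
        })


def unverified_citations(
    citations: List[Citation],
    lookup: Callable[[str], bool],
    reviewer: str,
    log: VerificationLog,
) -> List[Citation]:
    """Check every AI-generated citation; return those that must block release."""
    blocked = []
    for citation in citations:
        if not citation.ai_generated:
            continue  # outside the scope of this particular rule
        found = lookup(citation.text)  # must be independent of the AI tool
        log.record(citation, reviewer, found)
        if not found:
            blocked.append(citation)
    return blocked


# Hypothetical stand-in for a real system of record (assumption for illustration).
SYSTEM_OF_RECORD = {"Example v. Placeholder Air (hypothetical entry)"}

draft = [
    Citation("Example v. Placeholder Air (hypothetical entry)", ai_generated=True),
    Citation("Varghese v. China Southern Airlines", ai_generated=True),
]
log = VerificationLog()
blocked = unverified_citations(
    draft, lambda text: text in SYSTEM_OF_RECORD, "reviewer-1", log
)
for citation in blocked:
    print("BLOCKED:", citation.text)
```

The design choice worth noting is that the log records every check, including the ones that pass, so an auditor can later confirm that each AI-generated citation in a deliverable was looked up by a named reviewer. A deliverable would be released only when the returned list is empty.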