[SystemSafety] Bugs in LLM generated proofs

Wed Jan 14 17:40:27 CET 2026

On 14/01/2026 16:23, Derek M Jones wrote:

> You have probably heard about how LLMs have been solving
> mathematical problems, and how the LLMs have used the Lean
> proof assistant to verify the correctness of the proof.
>
> Guess what?  Not only do LLMs generate buggy code, they also
> generate buggy proofs.  A particularly insidious bug occurs
> when the LLM rewrites the original to create something that is
> easier to prove, but does not prove that the rewritten
> question is the same as the original one (sometimes it isn't).
>
> Lots of discussion and links here
> https://www.lesswrong.com/posts/rhAPh3YzhPoBNpgHg/lies-damned-lies-and-proofs-formal-methods-are-not-slopless 
>
>
> ...

Alan Turing raised the interesting question of whether human 
intelligence requires  a random element that can also lead to errors. In 
a 1947 lecture to the London Mathematical Society, Turing said “if a 
machine is expected to be infallible, it cannot also be intelligent. 
There are several mathematical theorems that say almost exactly that. 
But these theorems say nothing about how much intelligence may be 
displayed if a machine makes no pretence at infallibility.”

Martyn

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.techfak.uni-bielefeld.de/pipermail/systemsafety/attachments/20260114/9daedddc/attachment.html>