Show HN: Formally verified polygon intersection – Opus 4.8 oneshots, prev failed
This project presents the first formally verified implementation of a polygon intersection algorithm using Lean 4. The verification ensures correctness for all possible polygon configurations, with AI agents (Claude Opus 4.8) autonomously writing proofs and code. Human review is limited to a 87-line specification. The article discusses algorithmic challenges, the role of formal verification, and the evolution of AI agent capabilities.
Article intelligence
Key points
- First formally verified polygon intersection algorithm, built with Lean 4 proof assistant.
- AI agents (Claude Opus 4.8) autonomously generated proofs and implementation; human reviews only 87 lines of specification.
- Formal verification guarantees correctness for infinitely many input configurations.
- Opus 4.8 demonstrated improved proof strategy handling and error detection compared to earlier models.
Why it matters
This matters because first formally verified polygon intersection algorithm, built with Lean 4 proof assistant.
Technical impact
May affect model selection, inference cost, product capability, and evaluation benchmarks.
Notifications You must be signed in to change notification settings
Fork 0
Star 0
BranchesTags
Open more actions menu
Folders and files
NameName
Last commit message
Last commit date
Latest commit
History
2 Commits
2 Commits
.claude
.claude
Polygons
Polygons
docs
docs
readme-assets
readme-assets
scripts
scripts
.gitignore
.gitignore
CLAUDE.md
CLAUDE.md
LICENSE
LICENSE
README.md
README.md
Web.lean
Web.lean
build.sh
build.sh
coi-serviceworker.min.js
coi-serviceworker.min.js
emcc-wasm.sh
emcc-wasm.sh
index.html
index.html
lake-manifest.json
lake-manifest.json
lakefile.lean
lakefile.lean
lean-toolchain
lean-toolchain
serve.py
serve.py
setup_packages.sh
setup_packages.sh
wrapper.c
wrapper.c
Repository files navigation
To my knowledge, this is the first formally verified implementation of an intersection algorithm for polygons. (Also compare with Related work)
The experience of working with AI agents on this project changed a lot with recent model releases. Latest models are able to provide algorithm implementation with formal proof in one shot, whereas previous models required me to provide proof strategies in multiple steps. (Capabilities of AI agents)
Trust in the correctness comes entirely from the Lean checker and human review of a small specification, not from the LLM. (See Use of AI agents)
Try it out
Try out the web demo built around the verified core, where you can draw and intersect multipolygons.
Background
Multipolygon intersection is a standard feature of many vector graphic editors.
A multipolygon is defined by a list of polygon components and polygonal holes, each defined by a list of vertices. It describes a two-dimensional area: the set of interior points. This set can be formally defined by counting the parity of the number of intersections of the polygon with rays cast from each point on the plane. Given two multipolygons, we construct a new multipolygon, whose interior set is the intersection of the two interior sets of the input multipolygons.
There are infinitely many configurations of input polygons, so without formal verification no property can be exhaustively verified for every configuration by classical testing. Furthermore, for each polygon the set of interior points is infinite, so without formal verification interior sets and their intersection are just an interpretation that cannot be represented in the code.
In this development the intersection specification is formally described and fully verified with the Lean 4 proof assistant. So we can guarantee that these infinite sets of interior points actually satisfy the intersection equality, for any configuration of input polygons.
Implementations of computational geometry algorithms like this are notoriously hard to verify by classical testing, because of rare special configurations of inputs that may make up much of the complexity of the algorithm. Consider for example the following example where we intersect a cross (blue polygon in picture below) with a square with a hole (yellow polygons in the picture). To produce a multipolygon that describes the intersection, the algorithm must choose closed boundary components and order the vertices. This choice of which green segments belong together is not unique in this case (for example it could be 4 squares or a cross with a square hole), but it would be unique if the yellow hole were a tiny bit smaller or larger. It is a non-trivial fact (related to Eulerian cycles) that it is possible to partition and order the segments into closed boundary components in all cases.
The length of the formal verification in this repository mostly does not come from the algorithm, but from the fact that many seemingly obvious geometrical facts have no short rigorous proof. Only the proof that definition of "inside" described above is independent from the direction of the ray took thousands of lines of lean.
Use of AI agents
The setup of this repository aims to minimize human review needed to verify correctness of the implementation of the polygon intersection algorithm. A human reviewer just needs to read the 3 files DataStructures.lean, Defs.lean and MultipolygonIntersectionAlgorithmWithPreconditionCheck.lean and run the Lean checker. These are 87 lines of simple-to-understand Lean specification, mostly setting up basic geometrical definitions for polygons. The unoptimized code implementing the algorithm is already more than twice that and more complicated to understand. The code will grow a lot once we add optimizations. The specification that humans have to read to review correctness, on the other hand, will stay the same size.
It is not necessary to read other files to review correctness. I also read and directed the content of other files that don't end in ...Proofs.lean and ...Impl.lean to steer the strategy (pre Opus 4.8, see Capabilities of AI agents). The main theorems in these other files served as checkpoints, so I could use the Lean checker to determine when the agent actually succeeded at a task. The implementation and formal proof of its correctness in the ...Proofs.lean and ...Impl.lean files was autonomously written by AI agents and never reviewed by me or any other human, but thanks to the Lean checker, neither I nor any human reviewer needs to trust any LLM in this process.
The way this separation is structured is described in CLAUDE.md and may not be Lean-idiomatic.
A drawback I observed from forcing AI agents to formally verify their implementation is that this tends to produce code that is slower or disregards other practical considerations that are not captured in the specification. This probably stems from the difficulty of the formal verification pushing for simpler code and from the lack of formally verified practical software in the training data.
Capabilities of AI agents
The experience of working with AI agents on this project changed a lot with recent model releases:
My first attempt at this project was at the beginning of the year, using Claude Opus 4.5 and 4.6. In my experience Opus 4.5 was the first model that could handle non-trivial lean proofs. But it was at the level, that only when I was able to sketch out a perfectly rigorous proof, it would translate it to Lean. For example I needed to split the proof that the definition of the interior set using the ray intersection is independent of the ray directions, into many small steps. I was then stuck at this project until Claude Opus 4.7 came out.
Claude Opus 4.7 enabled to take larger steps. It could prove that for every two polygons, there exists an intersection, but I still needed to provide the idea to use Eulerian circuits and in separate steps give it hints how to handle some tricky special cases. And from this theorem it was then able to autonomously extract a formally verified algorithm.
The day after the release, I tried Claude Opus 4.8 in ultracode mode. I started two sessions in parallel:
One to reprove the main polygon intersection theorem, which is almost all of the previously done work, from scratch with no hints (in an isolated container to avoid peeking at the solution). See github.com/schildep/challenge-verified-polygon-intersection-Opus-4.8.
Another to extend the algorithm to handle special cases regarding overlapping segments, where previous Opus 4.7 sessions failed. To my surprise, both of these sessions succeeded after a few hours of autonomous work.
It was able to formulate and execute large proof strategies. One thing that I observed from looking at the intermediate output of Opus 4.8, is that it actually seems to much more accurately handle the risk of wrong intermediate theorems: To prove a large theorem you have to take risk of loosing time trying to prove helper theorems that a priori might be incorrect. Where previous models got stuck at trying to prove an incorrect intermediate theorem, Opus 4.8 became suspicious, and autonomously pivoted to a different strategy or at another occasion it decided to run parallel subagents to try multiple strategies.
Building and checking
Requires elan. The Lean version is pinned in lean-toolchain (currently leanprover/lean4:v4.15.0) to simplify the WebAssembly build.
Check all proofs:
lake build
Inspect the axioms the theorems of interest depend on (e.g. here correctness theorem of the algorithm). This is important since agents could have introduced unwanted axioms into the proofs. We only depend on trusted axioms [propext, Classical.choice, Quot.sound].
printf 'import Polygons.MultipolygonIntersectionAlgorithmWithPreconditionCheck\n#print axioms multipolygonIntersectionAlgorithmWithPreconditionCheck_interior_eq\n#print axioms multipolygonIntersectionAlgorithmWithPreconditionCheck_complete\n' | lake env lean --stdin
Build the WebAssembly bundle served by the web app (requires emscripten, zstd, wasm-opt):
./build.sh
Next steps
Measure and improve the performance of the implementation
Simplify proofs, that take unnecessary detours, using latest models (as described in Capabilities of AI agents, the current development contains many detours that can be avoided as we saw in github.com/schildep/challenge-verified-polygon-intersection-Opus-4.8)
SVG import/export
Related work
Di Vito and Hocking (NASA Formal Methods 2021) verified a polygon merge algorithm in PVS, combining two overlapping simple polygons, computing a single outer boundary without holes.
About
Formally verified polygon intersection
schildep.github.io/verified-polygon-intersection/
Topics
computational-geometry
formal-verification
lean4
Resources
Readme
License
MIT license
Uh oh!
There was an error while loading. Please reload this page.
Activity
Stars
0 stars
Watchers
0 watching
Forks
0 forks
Report repository
Releases
No releases published
Packages 0
Uh oh!
There was an error while loading. Please reload this page.
Contributors
Uh oh!
There was an error while loading. Please reload this page.
Languages
Lean 98.5%
Other 1.5%