Program Synthesis — Making Code Write Itself
Last Updated on November 5, 2023 by Editorial Team
Author(s): Saar Barak
Originally published on Towards AI.
One term you may have heard but not fully understood is ‘program synthesis,’ often framed as AI’s foray into writing code. This article is the first in a three-part series designed to demystify this concept. Here, we focus on understanding the semantic essence of program synthesis. While I strive for clarity and simplicity, be advised that some technical details might be glossed over for the sake of readability.
To get a handle on Program Synthesis, we first must tackle the term Program. So, let’s start by asking what exactly is a Program.
Program
Basically, programming is the act of telling a computer what to do. Scheduled a reminder on your phone, customize your social media photos, optimize traffic, or set up this Medium article, all of them are programming. To ensure we’re in sync, let’s look at the Programming Triplet:
- A programmer turns a task into a
program
. - A program is a thing that the
programmer
gives, that theinterpreter
understands - An interpreter takes in the
program
and does stuff.
So why do we spend time learning to program? Because computers offer abilities, we can’t match. We do not need to manually stitch together a panoramic photo from a dozen individual pics, I certainly can’t remember all my friends’ birthdays without a little nudge from my phone. I cannot deliver the blog to all of you in person.
Program Synthesis
Program synthesis is a system that makes programming easier. GCC frees you from having to write assembly, your smartphone operation system takes a dozen individual pics and produces a panoramic photograph. The small green ‘publish’ button on the top of this page will send you all this article. With program synthesis, you can work in an enhanced version of a programming language — let’s call it “Program++”. This is more straightforward than the original Program.
The synthesizer acts as a middleman, converting your Program++ back into the original program. When you pair this synthesizer with an interpreter, you get an advanced interpreter — let’s call it “Interpreter++”. This upgraded version takes in your Program++ (commonly referred to as a spec or specification) and carries out the tasks. The process is recursive — you can keep adding layers of synthesizers and interpreters to make programming easier and easier. Thus, program synthesis is simply “easier programming”.
Nowadays Program Synthesis
The art of program synthesis is making program++/spec
as simple as possible. Nowadays, program synthesis techniques have become increasingly sophisticated, like ChatGPT or CoPilot, which generates code from high-level requirements, reducing the need for “hand-written” code and thereby minimizing human error.
After we define what a Program and Program Synthesis are, we need to define precisely what a problem is.
Problem
Let's start by asking: How does a programmer write a program so that the interpreter does the right thing for a task?
Task as Spec
Tasks can be specified in various ways — imagine the different recipes you could use to make a salad. Verifying a program’s correctness also has multiple angles — consider the different criteria to judge whether the salad is tasty, healthy, or visually appealing. However, the nuances of many real-world tasks often escape computational understanding.
Spec as Problem
A specification or spec
is a way of stating the task so that both humans and computers can agree on what needs to be done. One of the rare and special forms of communication is Input-Output (or just I/O) which is readily understood by both humans and computers.
There are other forms of specifications, for instance, it could be an objective function to be maximized, a set of constraints on the computer’s behaviors, or a natural language sentence. But there’s a lot of latitude in how you might write that spec
. Different specs
can describe the same problem from different angles, syntax, or levels of granularity. For instance, if you want to sort a list(L) of n numbers, we could provide:
- A series of I/O pairs: In₁[
3,2,1
] Out₁[1,2,3
] U+007C In₂[0,-1,2
] Out₂[-1,0,2
] U+007C In₃[5,8,100
] Out₃[5,8,100
] U+007C In₄[9
] Out₄[9
] … - A single sentence: “Sort a list of integers in ascending order.”
- A mathematical function: f( X ) = x₁≤ x₂≤ x₃ … ≤ xₙ
- A recursive definition: Sorted(L) ⇔ L[1] ≤ min( L[2:n] ) ∧ Sorted( L[2:n] )
Similarly, for making a salad, the spec could look like:
- I/O Pairs: In₁[
lettuce
,tomatoes
,cucumber
] Out₁[salad
] U+007C In₂[cucumber, caesar
,tomatoes
] Out₂[not salad
] - Natural Language:
Combine lettuce, tomatoes, and cucumber
- Constraint-based: Must include[
at least one green vegetable and one source of protein.
]
correctness
With a spec
, the computer can automatically decide if a program “did the right stuff” by checking if executing the program on the inputs produces the corresponding outputs. we name this process a Verifier.
Now, a simple Verifier gives you a straightforward answer: True or False. is_sorted( [1,2,4] ) =
True. Are the combined elements [tomatoes
, caesar
] are salad ingredients? False. But we can dial it up a notch — some Verifiers go beyond simple binary answers, diving into metrics or scales that measure how well the program did.
From Logic to Learning
Traditionally, program synthesis was a field dominated by strict logical rules and algorithmic reasoning. However, the landscape has been changing rapidly, thanks to the infusion of machine learning techniques. This shift toward learning-based approaches is revolutionizing the speed, accuracy, and efficiency of program synthesis.
Diverse Approaches: The Taxonomy of Program Synthesis Program synthesis is not a one-size-fits-all solution. Depending on the needs of the task, different approaches or types of synthesis can be more advantageous. Let’s delve into some of these types.
- Deductive Synthesis — The Traditionalist. Relies on formal methods, logical frameworks, and algorithmic principles. Highly accurate but may require substantial computational resources. Ideal for critical systems where reliability is paramount.
- Inductive Synthesis — The Modernist. Primarily driven by machine learning techniques and heuristic methods.Excels in speed and adaptability but may lack the stringent accuracy of deductive methods. Best suited for rapid development cycles and less mission-critical applications.
- Stochastic Synthesis — The Realist. Utilizes probabilistic models to deal with the uncertainties in specifications or environments. Offers a balanced approach but may require fine-tuning to achieve desired reliability. It is well-suited for applications where some level of uncertainty or variation is expected.
Let me finish with some thoughts …
Program Synthesis — Beyond Human Capabilities
We do not need to limit ourselves. We aim to build a synthesizer that can go even further beyond!! The aim is not just to replicate human capabilities but to exceed them since Program Synthesis discovers new solutions that humanity currently cannot tackle. We’re talking about potentially breaking new ground in medicine, finance, and maybe even decoding the cosmos…
Further Reading & Reference
- Armando’s lecture notes on deductive synthesis ↩
- A 2021 survey on neuro-symbolic program synthesis ↩
- Competitive programming with AlphaCode ↩
- Evaluating Large Language Models Trained on Code ↩
- Bringing machine learning and compositional semantics together by Percy Liang ↩
- ACL 2018 tutorial on neural semantic parsing ↩
- The Child as Hacker ↩
- Probabilistic Models of Cognition ↩
- Armando’s lecture notes on definition, history of program synthesis ↩
- a minimalist guide to program synthesis
- A 2017 survey on program synthesis by Microsoft folks ↩
MIT License
Copyright (c) 2018 Chester How
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
Join thousands of data leaders on the AI newsletter. Join over 80,000 subscribers and keep up to date with the latest developments in AI. From research to projects and ideas. If you are building an AI startup, an AI-related product, or a service, we invite you to consider becoming a sponsor.
Published via Towards AI