Program Synthesis

Last Updated on November 5, 2023 by Editorial Team

Author(s): Saar Barak

Originally published on Towards AI.

One term you may have heard but not fully understood is ‘program synthesis,’ often framed as AI’s foray into writing code. This article is the first in a three-part series designed to demystify this concept. Here, we focus on understanding the semantic essence of program synthesis. While I strive for clarity and simplicity, be advised that some technical details might be glossed over for the sake of readability.

To get a handle on Program Synthesis, we first must tackle the term Program. So, let’s start by asking what exactly is a Program.

Program

Basically, programming is the act of telling a computer what to do. Scheduled a reminder on your phone, customize your social media photos, optimize traffic, or set up this Medium article, all of them are programming. To ensure we’re in sync, let’s look at the Programming Triplet:

A programmer turns a task into a program.
A program is a thing that the programmer gives, that the interpreter understands
An interpreter takes in the program and does stuff.

So why do we spend time learning to program? Because computers offer abilities, we can’t match. We do not need to manually stitch together a panoramic photo from a dozen individual pics, I certainly can’t remember all my friends’ birthdays without a little nudge from my phone. I cannot deliver the blog to all of you in person.

Program synthesis is a system that makes programming easier. GCC frees you from having to write assembly, your smartphone operation system takes a dozen individual pics and produces a panoramic photograph. The small green ‘publish’ button on the top of this page will send you all this article. With program synthesis, you can work in an enhanced version of a programming language — let’s call it “Program++”. This is more straightforward than the original Program.

The synthesizer acts as a middleman, converting your Program++ back into the original program. When you pair this synthesizer with an interpreter, you get an advanced interpreter — let’s call it “Interpreter++”. This upgraded version takes in your Program++ (commonly referred to as a spec or specification) and carries out the tasks. The process is recursive — you can keep adding layers of synthesizers and interpreters to make programming easier and easier. Thus, program synthesis is simply “easier programming”.

Nowadays Program Synthesis

The art of program synthesis is making program++/spec as simple as possible. Nowadays, program synthesis techniques have become increasingly sophisticated, like ChatGPT or CoPilot, which generates code from high-level requirements, reducing the need for “hand-written” code and thereby minimizing human error.

After we define what a Program and Program Synthesis are, we need to define precisely what a problem is.

Problem

Let's start by asking: How does a programmer write a program so that the interpreter does the right thing for a task?

Task as Spec

Tasks can be specified in various ways — imagine the different recipes you could use to make a salad. Verifying a program’s correctness also has multiple angles — consider the different criteria to judge whether the salad is tasty, healthy, or visually appealing. However, the nuances of many real-world tasks often escape computational understanding.

Spec as Problem

A specification or spec is a way of stating the task so that both humans and computers can agree on what needs to be done. One of the rare and special forms of communication is Input-Output (or just I/O) which is readily understood by both humans and computers.

There are other forms of specifications, for instance, it could be an objective function to be maximized, a set of constraints on the computer’s behaviors, or a natural language sentence. But there’s a lot of latitude in how you might write that spec. Different specs can describe the same problem from different angles, syntax, or levels of granularity. For instance, if you want to sort a list(L) of n numbers, we could provide:

A series of I/O pairs: In₁[3,2,1] Out₁[1,2,3] U+007C In₂[0,-1,2] Out₂[-1,0,2] U+007C In₃[5,8,100] Out₃[5,8,100] U+007C In₄[9] Out₄[9] …
A single sentence: “Sort a list of integers in ascending order.”
A mathematical function: f( X ) = x₁≤ x₂≤ x₃ … ≤ xₙ
A recursive definition: Sorted(L) ⇔ L[1] ≤ min⁡( L[2:n] ) ∧ Sorted( L[2:n] )

Similarly, for making a salad, the spec could look like:

I/O Pairs: In₁[lettuce, tomatoes, cucumber] Out₁[salad] U+007C In₂[cucumber, caesar, tomatoes] Out₂[not salad]
Natural Language: Combine lettuce, tomatoes, and cucumber
Constraint-based: Must include[ at least one green vegetable and one source of protein.]

correctness

With a spec, the computer can automatically decide if a program “did the right stuff” by checking if executing the program on the inputs produces the corresponding outputs. we name this process a Verifier.

Now, a simple Verifier gives you a straightforward answer: True or False. is_sorted( [1,2,4] ) = True. Are the combined elements [tomatoes, caesar] are salad ingredients? False. But we can dial it up a notch — some Verifiers go beyond simple binary answers, diving into metrics or scales that measure how well the program did.

From Logic to Learning

Traditionally, program synthesis was a field dominated by strict logical rules and algorithmic reasoning. However, the landscape has been changing rapidly, thanks to the infusion of machine learning techniques. This shift toward learning-based approaches is revolutionizing the speed, accuracy, and efficiency of program synthesis.

Diverse Approaches: The Taxonomy of Program Synthesis Program synthesis is not a one-size-fits-all solution. Depending on the needs of the task, different approaches or types of synthesis can be more advantageous. Let’s delve into some of these types.

Deductive Synthesis — The Traditionalist. Relies on formal methods, logical frameworks, and algorithmic principles. Highly accurate but may require substantial computational resources. Ideal for critical systems where reliability is paramount.
Inductive Synthesis — The Modernist. Primarily driven by machine learning techniques and heuristic methods.Excels in speed and adaptability but may lack the stringent accuracy of deductive methods. Best suited for rapid development cycles and less mission-critical applications.
Stochastic Synthesis — The Realist. Utilizes probabilistic models to deal with the uncertainties in specifications or environments. Offers a balanced approach but may require fine-tuning to achieve desired reliability. It is well-suited for applications where some level of uncertainty or variation is expected.

Let me finish with some thoughts …