Compilation Stages

This notebook describes the different compilation stages in Pulla.

import os
from math import pi
from pprint import pprint
from IPython.core.display import HTML

from qiskit import QuantumCircuit, visualization

from iqm.qiskit_iqm import transpile_to_IQM
from iqm.qiskit_iqm import IQMProvider
from iqm.pulla.pulla import Pulla
from iqm.pulla.utils_qiskit import sweep_job_to_qiskit, qiskit_to_pulla, get_qiskit_compiler
from iqm.pulse.playlist.visualisation.base import inspect_playlist
from iqm.pulse.builder import ScheduleBuilder
from iqm.pulse.playlist.instructions import VirtualRZ
from iqm.pulse.playlist.schedule import Schedule, Segment

iqm_server_url = os.environ['PULLA_IQM_SERVER_URL']  # or set the URL directly here
os.environ["IQM_TOKEN"] = os.environ.get("IQM_TOKEN")  # or set the token directly here

pulla = Pulla(iqm_server_url)
provider = IQMProvider(iqm_server_url)
backend = provider.get_backend()

qc = QuantumCircuit(3, 3)
qc.h(0)
qc.cx(0, 1)
qc.cx(0, 2)
qc.measure_all()

# Transpile, route and optimize the circuit
qc_transpiled = transpile_to_IQM(qc, backend=backend, layout_method='sabre', optimization_level=3)

# Print the circuit
qc_transpiled.draw(output='mpl', style="clifford", idle_wires=False)

Standard stages

The Pulla compiler comes with a pre-defined “standard” set of customizable stages. IQM Server applies these same standard stages when a circuit is submitted to it directly (without using Pulla).

A CompilationStage consists of multiple stage pass functions, which are just normal Python functions. The pass function arguments fall into the following three categories:

The first argument is always the “payload”, i.e. the circuits to compile. It can have any name and type.
Any argument named after one of the iqm.cpc.compiler.compilation_stage.DEFAULT_CONTEXT_KEYS is fetched automatically from the context (this includes requesting the whole context as an argument).
Any other argument is a so-called stage option: it becomes a changeable setting under Compiler.get_settings(...).stages.
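
The three argument categories can be illustrated with a small, self-contained sketch. It does not use Pulla at all: CONTEXT_KEYS is a made-up stand-in for DEFAULT_CONTEXT_KEYS (the real key names may differ), and classify_pass_arguments and example_pass are hypothetical helpers that only show how a function signature could be split into payload, context arguments and stage options.

```python
import inspect

# Hypothetical stand-in for DEFAULT_CONTEXT_KEYS; the real set lives in
# iqm.cpc.compiler.compilation_stage and may contain different names.
CONTEXT_KEYS = {"builder", "calibration_set", "context"}


def classify_pass_arguments(pass_function):
    """Split a pass function's parameters into payload, context args and options."""
    names = list(inspect.signature(pass_function).parameters)
    payload, rest = names[0], names[1:]
    return {
        "payload": payload,
        "from_context": [n for n in rest if n in CONTEXT_KEYS],
        "stage_options": [n for n in rest if n not in CONTEXT_KEYS],
    }


def example_pass(circuits, builder, max_depth=10):
    """Toy pass: `circuits` is the payload, `builder` would come from the
    context, and `max_depth` would surface as a stage option."""
    return circuits


print(classify_pass_arguments(example_pass))
```

Under these assumptions, only max_depth would show up as a user-editable setting; the other two arguments are supplied by the compiler.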

View the standard stages via Compiler.show_stages(). Click on an individual stage to show the stage pass functions it consists of, then click on an individual pass to view its docstring.

compiler = pulla.get_standard_compiler()
compiler.show_stages()

Compiling the above circuit is possible by running the standard stages. The compiler context is used to implement stateful effects during the compilation and it will collect some useful info and metadata about the circuits in question.

circuits, compiler = qiskit_to_pulla(pulla, backend, qc_transpiled)
job_definition, context = compiler.compile(circuits)

print("Final job definition:")
pprint(job_definition)

print("Context fields:")
print(list(context.keys()))

Instead of calling compile() (which runs all the stages), each stage, or even each separate pass, can be run individually. In that case, the following has to be taken care of:

1. Provide the initial context to the first pass of the first stage.
2. Save the data and context returned by each pass (or stage) in order to provide them to the next pass (or stage).

To help with 1, the compiler has a method compiler_context, which returns the initial context as a dictionary. The method takes two mandatory arguments, components and settings. The former is used with parametric circuits to generate the circuit for a particular subset of the QPU and can be left as None here; the latter is the settings object returned by Compiler.get_settings.

context = compiler.compiler_context(components=None, settings=compiler.get_settings(circuits=circuits))
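
The bookkeeping involved in running passes by hand can be sketched in plain Python. This is a toy model, not the Pulla calling convention: run_passes, count_operations and drop_barriers are hypothetical, and every toy pass is assumed to take and return a (payload, context) pair.

```python
def run_passes(payload, passes, context):
    """Run passes in order, feeding each one the data and context of the previous."""
    for pass_function in passes:
        payload, context = pass_function(payload, context)
    return payload, context


def count_operations(circuits, context):
    # Record metadata in a new context dict without touching the payload.
    new_context = {**context, "n_operations": sum(len(c) for c in circuits)}
    return circuits, new_context


def drop_barriers(circuits, context):
    # Transform the payload and pass the context through unchanged.
    return [[op for op in c if op != "barrier"] for c in circuits], context


initial_context = {"settings": None}
data, final_context = run_passes(
    [["prx", "cz", "barrier", "measure"]],
    [count_operations, drop_barriers],
    initial_context,
)
print(data, final_context)
```

The point of the sketch is the loop in run_passes: whatever one step returns, both data and context, is exactly what the next step receives.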

Next, compile the circuit “manually” by running each of the standard stages one by one in the notebook.

1st standard stage: circuit-level passes

The first stage is circuit-level passes:

When defining a circuit in IQM JSON or IQM Pulse format directly, specify an implementation for each gate (selecting from implementations provided by the calibration set). If no implementation is specified, the standard circuit-level stage will select the default implementation for each gate automatically.

Note: the circuit generation will automatically become the first compilation stage whenever you call Compiler.run_stages. In the case of parametrized circuits, Compiler provides the circuit parameter values to the circuit generation function (values provided either in the settings or in sweeps). Here, the circuit is not parametrized, so Compiler just wraps it into a trivial circuit generation function.

processed_circuits, context = compiler.run_stages(
    circuits,
    stages=[compiler.circuit_stages[0]],
    context=context
)
pprint(processed_circuits)
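
The idea of wrapping a fixed circuit into a trivial circuit generation function, as mentioned in the note above, can be sketched as follows. Both helpers are hypothetical and only mimic the concept; how Compiler actually performs the wrapping and discovers parameters is not shown in this notebook.

```python
import inspect


def as_circuit_generator(circuits):
    """Return `circuits` unchanged if it is callable, else wrap it in a no-arg function."""
    if callable(circuits):
        return circuits

    def trivial_generator():
        return circuits

    return trivial_generator


def generator_parameters(generator):
    """Collect keyword parameters and their defaults, i.e. what could become settings."""
    return {
        name: p.default
        for name, p in inspect.signature(generator).parameters.items()
        if p.default is not inspect.Parameter.empty
    }


def parametrized(n_cz: int = 1):
    # Toy "circuit": a list of gate names, with a tunable number of CZ gates.
    return [["h"] + ["cz"] * n_cz + ["measure"]]


fixed = as_circuit_generator([["h", "cx", "measure"]])
print(fixed(), generator_parameters(fixed))
print(generator_parameters(parametrized))
```

A fixed circuit exposes no parameters, while a parametrized generator exposes its keyword arguments, which is the kind of value that can then be set through the settings or swept.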

2nd standard stage: circuit resolution

Next, convert the circuit to TimeBoxes. TimeBox is a pulse-level intermediate representation used in the IQM stack: a container for one or more instruction schedule fragments, to be scheduled according to a given scheduling strategy.

timeboxes, context = compiler.run_stages(
    processed_circuits,
    stages=[compiler.pulse_stages[0]],
    context=context
)
# let's look into the first TimeBox which corresponds to the PRX instruction on QB1
timeboxes[0].print()

timeboxes is a list of TimeBox objects, and can be edited manually. A TimeBox can contain multiple child TimeBoxes, each containing either more TimeBoxes or a Schedule. A TimeBox containing a Schedule rather than children is referred to as “atomic”. In the provided example, the circuit was converted into one TimeBox containing 11 atomic child TimeBoxes, which correspond to the 11 circuit operations (7 gates + 1 barrier + 3 measurements). An atomic TimeBox holds its Schedule in its atom property:

timeboxes[0][0].atom.pprint()
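
The nesting described above can be modelled with a few lines of plain Python. ToyTimeBox is a hypothetical stand-in, not the real TimeBox class: it only captures the rule that a box holds either an atom or children, and the channel name and duration in the example are made up.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ToyTimeBox:
    """Hypothetical stand-in: a box holds either an atom or child boxes."""

    label: str
    atom: Optional[dict] = None  # channel name -> duration, set for atomic boxes
    children: list["ToyTimeBox"] = field(default_factory=list)

    @property
    def is_atomic(self) -> bool:
        return self.atom is not None

    def atomic_boxes(self):
        """Yield all atomic boxes below (and including) this one, depth-first."""
        if self.is_atomic:
            yield self
        for child in self.children:
            yield from child.atomic_boxes()


# Mirror the example: one outer box with 7 gates, 1 barrier and 3 measurements.
labels = ["gate"] * 7 + ["barrier"] + ["measure"] * 3
circuit = ToyTimeBox(
    "circuit",
    children=[ToyTimeBox(label, atom={"some_channel": 40}) for label in labels],
)
print(circuit.is_atomic, len(list(circuit.atomic_boxes())))
```

Because atomic_boxes recurses, the count stays the same however deeply the children are nested.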

3rd standard stage: timebox-level passes

The measurements are multiplexed in the timebox-level stage. The measure_all() in the circuit creation adds a single TimeBox with the gate implementation Measure_Constant for each qubit in the circuit. With the default settings option, the first stage has also added a measurement TimeBox for the unused qubits.

Multiplexing means executing all of these measurements at once, instead of one after the other. The multiplexing pass takes care of this optimization:

multiplexed_timeboxes, context = compiler.run_stages(
    timeboxes,
    stages=[compiler.pulse_stages[1]],
    context=context
)
multiplexed_timeboxes[0].print()

To ensure that measurements are multiplexed together in a Qiskit circuit, wrap them with barriers. This prevents the Qiskit transpiler from putting any other instructions acting on the same qubits in between the measurements, which in turn allows the compiler to multiplex them.
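
Why intervening instructions matter can be seen in a toy model of multiplexing. The function below is a hypothetical simplification working on (name, qubits) tuples; it assumes that only directly adjacent measurements on distinct qubits are merged, which is a stand-in for, not a description of, the real multiplexing pass.

```python
def multiplex_measurements(operations):
    """Merge runs of consecutive measurements into one joint measurement.

    Each operation is a (name, qubits) tuple. A run is only extended while the
    next measurement acts on qubits not already in the merged measurement.
    """
    merged = []
    for name, qubits in operations:
        previous = merged[-1] if merged else None
        if (
            name == "measure"
            and previous is not None
            and previous[0] == "measure"
            and not set(previous[1]) & set(qubits)
        ):
            merged[-1] = ("measure", previous[1] + tuple(qubits))
        else:
            merged.append((name, tuple(qubits)))
    return merged


# Measurements kept together: they collapse into a single joint measurement.
together = [
    ("cz", ("QB1", "QB2")),
    ("measure", ("QB1",)),
    ("measure", ("QB2",)),
    ("measure", ("QB3",)),
]
# A gate slipped in between: the two measurements stay separate.
interleaved = [("measure", ("QB1",)), ("prx", ("QB2",)), ("measure", ("QB2",))]

print(multiplex_measurements(together))
print(multiplex_measurements(interleaved))
```

In the second case nothing can be merged, which is the situation the barriers are meant to prevent.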

Notice that there is also an additional Wait TimeBox added at the beginning of the circuit. This corresponds to reset. By default, the reset is done via relaxation, so we just add a long-enough wait time (this is also calibrated per QPU) before the actual circuit. If active reset is used (it can be switched on via the settings), we would add that instead.

The TimeBox stage also adds another measurement in the beginning of the circuit if heralding is used (by default it is not, but this can be changed in the settings).

4th standard stage: timebox resolution

Next, convert the TimeBoxes into a single Schedule. This is a recursive process which resolves all nested TimeBoxes into atomic TimeBoxes, and finally assembles a single Schedule out of each batch of TimeBoxes. At this stage, all relative timings between pulses are resolved and fixed.

schedules, context = compiler.run_stages(
    multiplexed_timeboxes,
    stages=[compiler.pulse_stages[2]],
    context=context
)
schedules[0]._contents
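
What “resolving relative timings” means can be sketched with a toy scheduler. It assumes an as-soon-as-possible placement, which is only one possible scheduling strategy; resolve_asap, the box labels, the channel names and the durations are all hypothetical and unrelated to real calibration data.

```python
def resolve_asap(atomic_boxes):
    """Place boxes in order; each starts as soon as all of its channels are free.

    `atomic_boxes` is a list of (label, {channel: duration}) pairs.
    Returns a list of (label, start, end) with fixed absolute timings.
    """
    channel_free_at = {}
    placed = []
    for label, durations in atomic_boxes:
        start = max((channel_free_at.get(ch, 0) for ch in durations), default=0)
        end = start + max(durations.values(), default=0)
        # All channels of a box are blocked until the whole box has finished.
        for ch in durations:
            channel_free_at[ch] = end
        placed.append((label, start, end))
    return placed


boxes = [
    ("prx_QB1", {"QB1": 40}),
    ("prx_QB2", {"QB2": 40}),
    ("cz", {"QB1": 120, "QB2": 120}),
    ("measure", {"QB1": 500, "QB2": 500}),
]
for entry in resolve_asap(boxes):
    print(entry)
```

The two single-qubit boxes run in parallel because they share no channel, while the two-qubit box has to wait for both of them.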

5th standard stage: dynamical decoupling

Dynamical decoupling pulse sequences get inserted to replace Wait instructions. The process is controlled by a user-submitted dynamical decoupling strategy. By default, this stage is disabled. Consult other notebooks for examples of how to enable and apply dynamical decoupling.

6th standard stage: schedule-level passes

Next is a schedule-level stage. Its first pass applies calibrated phase corrections if MOVE gates are used (only applicable to QCs with a computational resonator, i.e. the IQM Star Architecture). The second pass removes non-functional instructions from the schedules. This is the last standard compilation stage.

processed_schedules, context = compiler.run_stages(
    schedules,
    stages=[compiler.pulse_stages[4]],
    context=context
)

Job definition finalization

Finally, compiler.final_stages contains the stages and passes for turning the processed_schedules into an executable job definition. The job definition contains the playlist, which can now be visualized.

job_definition, context = compiler.finalize(processed_schedules, context)
HTML(inspect_playlist(job_definition.sweep_definition.playlist, [0]))

In order to submit this final schedule for execution, one more step is required: building the Station Control settings. The settings control the behaviour of instruments.

At this point everything is ready to be submitted for execution to the server.

job = pulla.submit_playlist(job_definition, context=context)
job.wait_for_completion()

Now these raw results can be converted into a Qiskit Result object:

qiskit_result = sweep_job_to_qiskit(job, shots=1000)
print(f"Qiskit result counts: {qiskit_result.get_counts()}")
visualization.plot_histogram(qiskit_result.get_counts())

The same circuit can also be submitted to IQM Server for execution. IQM Server uses a server-side Pulla with fixed standard stages. Since we started with a normal Qiskit backend and a circuit, execution is as simple as:

job = backend.run(qc_transpiled, shots=1000)

print(job.result().get_counts())
visualization.plot_histogram(job.result().get_counts())

Post-processing

After execution, the raw hardware data needs to be transformed into a human-readable result. Pulla provides two post-processing modes:

Circuit-style — Returns a CircuitExecutionResults with measurement bit-strings mapped back to the original circuit. Used when calling job.result() without a compiler.
EXA-style — Returns a RunResult containing an xarray Dataset with rich analysis data. Used when calling job.result(compiler=compiler), which invokes compiler.post_process().

The exa_style_pp flag in pulla.get_standard_compiler(exa_style_pp=True) (the default) controls which post-processing stages the compiler is created with. When no compiler is passed to job.result(), circuit-style post-processing is always used regardless of this flag.

Circuit-style post-processing

Runs a single stage (construct_circuit_execution_results) that:

Unmaps hashed readout keys back to their original names.
Converts raw sweep results into per-circuit measurement dictionaries (bit-strings to counts).
Handles heralding if enabled.
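
The last two steps can be pictured with a toy example. The helpers below are hypothetical and not part of Pulla; the sketch assumes that each shot is a tuple of 0/1 outcomes in a fixed qubit order, and that heralding keeps only the shots whose heralding measurement read all zeros.

```python
from collections import Counter


def shots_to_counts(shots):
    """Turn per-shot measurement outcomes into a bit-string histogram.

    `shots` is a list of shots, each a sequence of 0/1 outcomes, one per qubit.
    """
    return dict(Counter("".join(str(bit) for bit in shot) for shot in shots))


def apply_heralding(shots, herald_bits):
    """Keep only shots whose heralding measurement read all zeros (assumed rule)."""
    return [shot for shot, herald in zip(shots, herald_bits) if not any(herald)]


shots = [(0, 0, 0), (1, 1, 1), (1, 1, 1), (0, 0, 0), (1, 1, 1)]
heralds = [(0,), (0,), (1,), (0,), (0,)]

print(shots_to_counts(shots))
print(shots_to_counts(apply_heralding(shots, heralds)))
```

With heralding, the number of returned shots can be smaller than the number requested, since discarded shots never reach the histogram.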

EXA-style post-processing

Runs two steps sequentially:

Step 1: construct_run_result — Aggregates raw hardware data into a structured RunResult backed by an xarray Dataset.

Step 2: construct_data_variables — Applies a series of passes to enrich the dataset.

To use EXA-style post-processing, pass the compiler to job.result(compiler=compiler), which will call compiler.post_process() internally.

Adding custom stages/passes

Now it is time to customize the standard compilation pipeline, augmenting it with a custom pass.

In principle, there is no limit to what a pass function may do. For demonstration purposes, create a pass that merges adjacent VirtualZ instructions into a single longer instruction whose rotation angle is the sum of the original instructions’ angles. In practice, this kind of pass could be useful for complex circuits with many CZ gates, as it could save some instruction memory on the hardware (going further, one could try dropping unneeded VirtualZ instructions altogether and merging the remaining ones into the next drive IQPulse as a phase increment).

# Define a circuit that has a lot of VirtualZ instructions in it (one CZ gate results in one VirtualZ instruction in each of its qubit drive channels).
def _qiskit_circuit(n_cz: int = 1):
    qc = QuantumCircuit(2, 2)
    qc.h(0)
    for _ in range(n_cz):
        qc.cz(0, 1)
    qc.measure([0,1], [0,1])
    return [qc]

# Use the Qiskit compiler this time
compiler = get_qiskit_compiler(pulla, backend)
settings = compiler.get_settings(circuits=_qiskit_circuit)

# Add a lot of repeated CZ gates => a lot of VirtualZ instructions
settings.stages.circuit_generation.circuit.n_cz = 20

job_definition, context = compiler.compile(
    circuits=_qiskit_circuit,
    components=["QB3", "QB5"],
    settings=settings,
)
HTML(inspect_playlist(job_definition.sweep_definition.playlist, [0]))

This shows a lot of back-to-back VirtualZ instructions in the drive channel. Adding a pass that merges them saves some instruction memory on the hardware.

# Define the VirtualZ merging stage pass as a Python function
def merge_adjacent_vzs(schedules: list[Schedule], builder: ScheduleBuilder) -> list[Schedule]:
    """Merge adjacent VirtualZ instructions into a single instruction, saving HW memory

    The rotation angle and the duration are the sums of the individual VirtualZ angles and durations.

    Args:
        schedules: The resolved pulse-level schedules.
        builder: The ScheduleBuilder.

    Returns:
        The schedules with adjacent VirtualZ instructions merged.
    """
    new_schedules = []
    for schedule in schedules:
        new_schedule = {}
        for ch, seg in schedule.items():
            new_seg = []
            prev_rz = None
            for inst in seg:
                if "drive.awg" in ch:
                    if isinstance(inst, VirtualRZ):
                        if prev_rz:
                            prev_rz = VirtualRZ(
                                duration=prev_rz.duration + inst.duration,
                                phase_increment=prev_rz.phase_increment + inst.phase_increment
                            )
                        else:
                            prev_rz = inst
                    else:
                        if prev_rz:
                            new_seg.extend([prev_rz, inst])
                            prev_rz = None
                        else:
                            new_seg.append(inst)
                else:
                    new_seg.append(inst)
            if prev_rz:
                new_seg.append(prev_rz)
                prev_rz = None
            new_schedule[ch] = Segment(new_seg)
        new_schedules.append(Schedule(new_schedule))
    return new_schedules
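
The run-merging idea behind this pass can be checked without any hardware or IQM classes. The sketch below is a simplified stand-in, not the pass itself: instructions are plain (name, duration, phase) tuples, "rz" plays the role of VirtualRZ, and merge_adjacent_rz is a hypothetical helper built on itertools.groupby.

```python
from itertools import groupby


def merge_adjacent_rz(instructions):
    """Merge runs of adjacent ("rz", duration, phase) tuples; keep everything else."""
    merged = []
    for is_rz, run in groupby(instructions, key=lambda inst: inst[0] == "rz"):
        run = list(run)
        if is_rz:
            # One instruction replaces the run: durations and phases add up.
            merged.append(("rz", sum(i[1] for i in run), sum(i[2] for i in run)))
        else:
            merged.extend(run)
    return merged


segment = [
    ("rz", 0, 0.25),
    ("rz", 0, 0.5),
    ("pulse", 40, 0.0),
    ("rz", 0, 1.0),
]
print(merge_adjacent_rz(segment))
```

Only directly adjacent instructions are merged, so the order relative to every other instruction in the channel is preserved.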

compiler = get_qiskit_compiler(pulla, backend)

# Insert the stage pass into the schedule-level stage (as the final pass)
compiler.pulse_stages.schedule_stage.add_passes(merge_adjacent_vzs)

# Inspect the stages after this
compiler.show_stages(pass_name="merge_adjacent_vzs")

Finally, compile with the new pass included and visualize the playlist.

settings = compiler.get_settings(circuits=_qiskit_circuit)

# Add a lot of repeated CZ gates => a lot of VirtualZ instructions
settings.stages.circuit_generation.circuit.n_cz = 20

job_definition, context = compiler.compile(
    circuits=_qiskit_circuit,
    components=["QB3", "QB5"],
    settings=settings,
)
HTML(inspect_playlist(job_definition.sweep_definition.playlist, [0]))

The same could have been achieved by defining a new CompilationStage altogether, named e.g. “virtual_z_merging”, adding the stage after the default schedule-level stage, and adding the pass into it. CompilationStage is just an abstraction layer for dividing the stage passes into useful categories; in practice, the passes are simply executed sequentially.
