Why Is It So Difficult to Write Software for Quantum Computers? And What Can Be Done About It
Many organizations have realized that quantum computing could fuel transformational changes in their industries. Whether these are substantial cost reductions — such as transportation route optimizations — or new revenue streams — such as the discovery of new compounds — the potential impact of quantum computing is too large to ignore.
It is typical to see a modest ‘wish list’ from organizations that form quantum computing teams: develop internal expertise (as opposed to outsourcing everything), short prototyping cycles, quantum algorithms that scale (to avoid rewriting them when stronger hardware becomes available), and the desire to bring in domain experts who are not versed in quantum computing.
But to their dismay and sometimes to their surprise, many teams discover that creating useful or sophisticated software for quantum computing is very difficult. Why is this the case and what can be done about it?
Translating classical into quantum is difficult
Converting an algorithm from C to Python is easy. Expressing a classical algorithm in a way that can be executed efficiently on a quantum computer is much more difficult. One reason is that many classical programming constructs — such as loops, conditional branching, and random-access memory — either don’t exist or work completely differently in quantum programming. Another reason is that the power of quantum computing comes from the unique properties of qubits: superposition, entanglement, and interference. For a quantum computer to offer any kind of advantage over classical computers, algorithm designers need to think in quantum terms. That’s why quantum algorithms such as Grover’s search look nothing like their classical counterparts.
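To see how different this thinking is, consider a minimal sketch of one Grover iteration over two qubits, written here with Qiskit (an assumed SDK choice; any gate-level toolkit would look similar). Nothing in it resembles a classical linear scan over candidates.

```python
# A minimal sketch of one Grover iteration on 2 qubits (assumes Qiskit is
# installed). The circuit amplifies the marked state |11>.
from qiskit import QuantumCircuit

qc = QuantumCircuit(2, 2)

# Prepare a uniform superposition over all four basis states.
qc.h([0, 1])

# Oracle: flip the phase of the marked state |11>.
qc.cz(0, 1)

# Diffusion operator: reflect all amplitudes about their mean.
qc.h([0, 1])
qc.z([0, 1])
qc.cz(0, 1)
qc.h([0, 1])

qc.measure([0, 1], [0, 1])
print(qc.draw())
```

After a single iteration, measurement yields 11 with certainty: the answer emerges from interference, not from examining items one by one.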
Today’s quantum computers have many practical limitations. The number of qubits is relatively small, as is the ability of a quantum computer to perform long calculations. As a result, practical algorithms in the current NISQ era are often hybrid algorithms — executing some portion on quantum computers and another portion on a controlling classical machine. This adds an extra layer of complexity to creating quantum algorithms.
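The hybrid pattern itself is easy to sketch: a classical optimizer proposes parameters, the quantum processor evaluates a cost, and the loop repeats. Below, `evaluate_on_qpu` is a hypothetical stand-in for a real hardware call, replaced by a toy cost landscape so the snippet runs on its own.

```python
# A sketch of the hybrid NISQ loop: classical optimizer outside, quantum
# evaluation inside. `evaluate_on_qpu` is a hypothetical placeholder.
import numpy as np
from scipy.optimize import minimize

def evaluate_on_qpu(theta):
    # Stand-in for running a parametrized circuit and estimating an
    # expectation value from measurement shots.
    return np.cos(theta[0]) + 0.5 * np.cos(2 * theta[1])

result = minimize(evaluate_on_qpu, x0=np.array([0.1, 0.1]), method="COBYLA")
print("optimal parameters:", result.x, "cost:", result.fun)
```

The orchestration looks trivial here; in practice, deciding what runs where, batching circuit executions, and coping with noisy estimates is where much of the extra complexity lives.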
Even when the starting point is not an algorithm, but rather a method of calculating a number (e.g. derivatives of financial options), specifying this method as a quantum-ready recipe is difficult. Such a recipe is typically expressed as a collection of simpler functional blocks. These might be “perform QFT” or “for each bit of the register, apply a controlled rotation of angle X on the accumulator qubit conditioned on all previous bits having been zero.” As an example, look at academic articles written by quantum teams at leading Wall Street firms. One can easily appreciate the complexity of breaking an algorithm into functional blocks that could then be implemented on quantum computers; a sketch of one such block follows.
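Here is a hedged sketch of one plausible reading of the second quoted recipe, in Qiskit (assuming a recent version; the angles are placeholders, not values from any particular paper):

```python
# One plausible reading of the quoted block: for each bit of a register,
# rotate an accumulator qubit, conditioned on all previous bits being zero.
import numpy as np
from qiskit import QuantumCircuit, QuantumRegister

n = 4
reg = QuantumRegister(n, "reg")
acc = QuantumRegister(1, "acc")
qc = QuantumCircuit(reg, acc)

for i in range(n):
    theta = np.pi / 2 ** (i + 1)  # placeholder angle for bit i
    if i == 0:
        qc.ry(theta, acc[0])      # no previous bits to condition on
    else:
        prev = reg[:i]
        qc.x(prev)                # condition on |0>: flip, control, flip back
        qc.mcry(theta, prev, acc[0])
        qc.x(prev)

print(qc.draw())
```

Even this small block hides decisions: how to realize the multi-controlled rotation, whether to spend ancilla qubits on it, and how the flip-control-flip pattern interacts with neighboring blocks.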
The complexity at the functional block level
Once the functional blocks are defined, a new set of difficulties arises.
As in classical computing, there are often multiple ways to implement a given operation. Wikipedia, for instance, lists 43 different number-sorting algorithms. Some are faster, some are more efficient, some are simpler to program. While there might not be that many options in quantum computing today, a designer still has to decide which implementation to use. For instance, would you use a QFT adder or a ripple adder? The selection criteria are similar, yet different — some algorithms are shallower (meaning they require fewer sequential steps), some require more qubits, and so on. Even for a given implementation, the designer has flexibility in the order of commutative blocks, or even in whether to perform such blocks in parallel or serially.
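Qiskit’s circuit library happens to ship both flavors of adder, which makes the trade-off easy to inspect. A hedged sketch, assuming a recent Qiskit version (the reported depth depends on how far the blocks are decomposed):

```python
# Compare a QFT-based adder with a ripple-carry adder on 4-bit registers:
# different qubit counts, different depths, same arithmetic.
from qiskit.circuit.library import CDKMRippleCarryAdder, DraperQFTAdder

for adder in (DraperQFTAdder(4), CDKMRippleCarryAdder(4)):
    decomposed = adder.decompose().decompose().decompose()
    print(f"{adder.name}: {adder.num_qubits} qubits, depth {decomposed.depth()}")
```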
Many quantum blocks use auxiliary qubits, sometimes called ancilla qubits. These are used either to help with the particular operation or to make the quantum circuit reversible. Any qubit that is not measured but will be used later in the circuit needs to be “uncomputed”. Ancilla qubits can’t simply be reset to zero: they may have become entangled with other qubits, and resetting them could disturb those qubits. Thus, the complexity of each block increases because any ancilla qubits need to be uncomputed. On the other hand, using more ancilla qubits could simplify an implementation.
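The standard remedy is the compute/use/uncompute pattern. A minimal sketch in Qiskit: an ancilla temporarily holds x AND y, and because the Toffoli gate is its own inverse, applying it again returns the ancilla to |0> and disentangles it.

```python
# Compute / use / uncompute: the ancilla is borrowed, used, and cleanly
# returned, at the cost of extra gates.
from qiskit import QuantumCircuit

qc = QuantumCircuit(4)   # qubits: 0 = x, 1 = y, 2 = ancilla, 3 = output
qc.ccx(0, 1, 2)          # compute: ancilla <- x AND y
qc.cx(2, 3)              # use: fold the intermediate result into the output
qc.ccx(0, 1, 2)          # uncompute: restore the ancilla to |0>
print(qc.draw())
```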
Another factor is the accuracy of the circuit. For example, a calculation might need an approximation of a Gaussian distribution. How accurate should the approximation be? Greater accuracy might mean more qubits, deeper circuits, and so forth.
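The trade-off is easy to quantify. A small sketch in plain NumPy (no quantum SDK needed): discretizing a standard normal distribution onto the 2^n grid points an n-qubit register can represent shows how each extra qubit halves the grid spacing, while the state-preparation circuit typically grows deeper.

```python
# Discretize a standard Gaussian onto the 2**n grid points that an
# n-qubit register can represent: more qubits, finer resolution.
import numpy as np

def discretized_gaussian(num_qubits, lo=-4.0, hi=4.0):
    x = np.linspace(lo, hi, 2 ** num_qubits)
    p = np.exp(-x ** 2 / 2)
    return x, p / p.sum()

for n in (3, 5, 8):
    x, p = discretized_gaussian(n)
    print(f"{n} qubits: {2 ** n:3d} grid points, spacing {x[1] - x[0]:.4f}")
```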
System-level complexity
When the individual functional blocks are designed, it is tempting to say that all that is left is to string them together. While technically true, the hard work is just beginning.
An algorithm designer has to address not just functional-level requirements but also system-level constraints. Are there enough qubits to tie all the blocks together? Is the resulting circuit so deep that it will be overrun by noise? Which functions can be performed in parallel with others, and which must be done serially?
Another complication is the native gate set of the target hardware. Different quantum computers have different native gate sets, and using these native gates is preferred. The designer must either consider the target hardware or create multiple versions to accommodate different target computers.
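Gate-level retargeting is mechanical but consequential. A hedged sketch with Qiskit’s transpiler, using illustrative stand-ins for a superconducting-style and a trapped-ion-style gate set (the exact sets vary by vendor):

```python
# Retarget the same logical circuit to two hypothetical native gate sets
# and compare the resulting gate counts.
from qiskit import QuantumCircuit, transpile

qc = QuantumCircuit(2)
qc.h(0)
qc.cx(0, 1)

for basis in (["rz", "sx", "x", "cx"], ["rx", "ry", "rxx"]):
    native = transpile(qc, basis_gates=basis, optimization_level=1)
    print(basis, "->", dict(native.count_ops()))
```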
Beyond understanding which gates are native to the particular hardware, the design could benefit from awareness of other hardware attributes. For instance, the level of connectivity between qubits affects the number of qubit swaps that are required. Superconducting quantum computers typically have lower connectivity than atom-based qubits. The noise and fidelity model of the hardware could also be taken into account. If not all qubits are created equal — some are noisier than others — maybe it makes sense to design a deeper circuit that uses fewer qubits instead of a shallower circuit that uses more.
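Connectivity effects can also be made visible. A sketch (again assuming a recent Qiskit version): routing a circuit whose two-qubit gates span distant qubits onto a line-shaped device forces the transpiler to insert swaps, inflating the gate count.

```python
# Route a circuit onto a linear qubit topology and observe the extra
# swap operations the router must insert.
from qiskit import QuantumCircuit, transpile
from qiskit.transpiler import CouplingMap

qc = QuantumCircuit(4)
qc.cx(0, 3)  # two-qubit gates between distant qubits
qc.cx(1, 3)

routed = transpile(qc, coupling_map=CouplingMap.from_line(4), optimization_level=0)
print("original ops:", dict(qc.count_ops()))
print("routed ops:  ", dict(routed.count_ops()))
```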
A new set of problems arises when considering memory management — the usage of ancilla qubits. The designer needs to understand which qubits can be reused and how to best do so. An ancilla qubit earlier in the circuit might be useful as an output qubit later on. If a functional block used many ancilla qubits, are there enough left for other blocks and, if so, have they been returned to their original state? On the other hand, ancilla qubits that don’t need to be reused might not need to be uncomputed, hence reducing the gate count and depth of the circuit.
The design also needs to make sense from an accuracy perspective: if one block uses low-accuracy approximations, does it make sense to follow it by high-accuracy calculations, or should there be a better balance between them?
Here is the most important takeaway: addressing the system-level constraints and optimizing the circuit at the system level might require going back and modifying the individual blocks. Block-level and system-level optimizations are interdependent. This is one key aspect of why writing software for quantum computers is so difficult.
Why is it important to optimize at all? Because that can be the reason you execute quantum code six months before your competitor. Because that can be the reason that, on a given computer, you can optimize a portfolio of 100 assets, not just 50. You can do a lot with few resources — the Apollo 11 on-board computer had just 72K of ROM and roughly 4K of RAM — but you need to become really good at squeezing performance out of the given hardware.
But wait, there’s more…beyond the system level
Typical desktop machines can simulate quantum circuits of about 30 qubits. High-performance supercomputers might be able to simulate a 50-qubit design. But what about larger circuits, such as those for the newly-announced 127-qubit IBM machine? How can you debug a circuit that you can’t simulate?
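The arithmetic behind that wall is simple: a full statevector of n qubits holds 2^n complex amplitudes, so every added qubit doubles the memory bill.

```python
# Back-of-the-envelope memory for full statevector simulation:
# 2**n amplitudes at 16 bytes each (double-precision complex).
for n in (30, 40, 50):
    gib = 2 ** n * 16 / 2 ** 30
    print(f"{n} qubits: {gib:>12,.0f} GiB for the state alone")
```

Thirty qubits fit in a well-equipped desktop’s 16 GiB; fifty qubits demand on the order of 16 million GiB; and 127 qubits are beyond any conceivable memory.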
When writing classical software, you can set a breakpoint and inspect a variable. You can set that variable to a new value before continuing on. How can you do this on a quantum computer? Are we relegated to having no debugger for increasingly-sophisticated quantum programs?
And even before the debugging phase, a designer would love to know whether the algorithm in mind can fit on a given computer. How many qubits are optimally required? How deep is the circuit? What is the best hardware on which to execute this algorithm? This knowledge might prompt the designer to wait a few years for stronger computers, or to simplify the algorithm to fit the current ones. The ability to estimate the required resources can be a big time-saver, providing a go/no-go decision on a project before significant additional effort has been invested in writing code.
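Some of these questions can be answered statically, without running anything. A small hedged sketch, reusing a Qiskit library block as a stand-in for a real design:

```python
# Read off qubit count, depth, and gate census of a candidate circuit
# before committing to (or rejecting) a target machine.
from qiskit.circuit.library import DraperQFTAdder

candidate = DraperQFTAdder(8).decompose().decompose()
print("qubits:", candidate.num_qubits)
print("depth: ", candidate.depth())
print("ops:   ", dict(candidate.count_ops()))
```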
The Classiq approach
As much as these are difficult problems, they are also important ones. If we don’t solve them, quantum computing might end up a nice and exciting science experiment, but lacking in practical applications. If we can only solve “toy quantum problems” by hand-crafting simple circuits, we haven’t fulfilled the promise of quantum.
When the team at Classiq started looking at these problems, seeking inspiration for solutions, we realized that chip design — the process of putting together digital integrated circuits — had faced many similar problems. For instance, chips have an ever-increasing number of transistors. The Intel 4004, the first commercially produced microprocessor, had just over 2,000 transistors. A modern i7 chip has over 4 billion transistors, and there is no way such large chips could be designed and optimized manually.
Chips also have high-level building blocks such as memory, a USB controller, a graphics processor, and so forth, as well as lower-level blocks such as counters, shift registers, and data latches. Designing, testing, and scaling up such blocks cannot be done at the transistor level. To make this process much more efficient — and in many cases possible at all — hardware description languages such as VHDL and Verilog were invented, allowing the designer to express the behavior of the circuit and letting classical computers synthesize a gate-level circuit from it.
All of these blocks have to be integrated and jointly optimized. If the RAM uses too many transistors, there will not be enough left for the GPU. The USB controller perhaps needs to sit near the edge of the chip to allow connectivity. Timing has to be observed so that memory access is reliable. Circuit synthesis tools take these constraints into account, exploring numerous options to find ones that fulfill the requirements while meeting the constraints.
Armed with this understanding of the history and the state-of-the-art of chip design, we set out at Classiq to create a similar process for quantum algorithm design: a design language that can help engineers express the desired behavior, a modular approach that allows for multiple implementations of each building block, a synthesis engine that converts the functional models into qubits and quantum gates, and unique verification and debugging tools that make it easier to analyze these circuits that are too large to be simulated. We wanted to achieve a division of labor: allow the designer to focus on the “What” — on defining what needs to happen and what constraints need to be met — and have a powerful classical computer focus on the “How” — synthesizing these requirements into intricate and optimized quantum circuits. We think we succeeded, but we look forward to working with other players in the quantum ecosystem to make it even better.
Originally published at https://quantumcomputingreport.com on December 3, 2021.