Ponder This Challenge - May 2023 - Computing quadratic forms

This problem was suggested by Lorenzo Gianferrari Pini and Radu-Alexandru Todor - thanks!

Let $x\in\mathbb{R}^n$ be a length $n$ vector $(x_0, x_1, \ldots, x_{n-1})$ and $A\in\mathbb{R}^{n\times n}$ be an $n\times n$ matrix. We seek to compute the quadratic form $x^TAx$ .

Assume $x$ is equally spaced between $-1$ and $1$ , e.g., for $n=5$ , we have

$x = (-1, -0.5, 0, 0.5, 1)$

The matrix $A$ is generated in the following manner:

Let $k\in\mathbb{N}$ be a natural number and define a vector $a\in\mathbb{R}^{2^k}$ that is equally spaced between $0$ and $1$ , i.e., $a_t =\frac{t}{2^k-1}$ for $t=0,1,\ldots,2^k-1$ . The values of $A$ will be taken from the vector $a$ in the following manner:

We are given a sequence $Q_0, Q_1,\ldots, Q_{n-1}$ , with each $Q_i\in\mathbb{N}^k$ being a vector of $k$ natural numbers. Given two such vectors, define $Q_i==Q_j$ as the binary vector of length $k$ that has 1s in the entries that are equal in $Q_i, Q_j$ and 0s in the other entries. Let $[Q_i==Q_j]$ be the natural number whose binary representation is $Q_i==Q_j$ , where the least significant bit is on index 0. For example, if

$Q_i = (1, 5, 7, 8)$

$Q_j = (2, 5, 6, 8)$

Then

$Q_i==Q_j = (0, 1, 0, 1)$

And

$[Q_i==Q_j]$ = $10$ (since the vector $(0, 1, 0, 1)$ stands for the binary representation 1010).

Now define $A_{ij} = a_{[Q_i==Q_j]}$ .

So one can think of $A_{ij}$ as taking on a value from a fixed list of $2^k$ values based on the "similarity" between $Q_i$ and $Q_j$ .

The values of the $Q_i$ 's are chosen pseudo-randomly using the formula

$Q_i[t] = \left\lfloor2^k\cdot (\sin((i+1)\cdot (t+1))-\left\lfloor \sin((i+1)\cdot (t+1))\right\rfloor )\right\rfloor$

For example, if $k=5$ , then $Q_{13}=(31, 8, 2, 15, 24)$

Your goal: Find $x^TAx$ (rounded to three decimals) for $k=5, n=2^{20}$ .

A bonus "*" will be given for finding $x^TAx$ (rounded to three decimals) for $k=5, n=2^{30}$ .

Solution

May 2023 Solution:
The numerical results are
52390.274 for $n=2^{20}$
156032.269 for $n=2^{30}$
We have accepted other, less exact solutions.
The naive solution for computing $x^TAx=\sum_{i,j=0}^{n-1}A_{ij}x_ix_j$ requires $O(n^2)$ steps (ignoring $k$ ). Instead, We describe here a clever way to compute the form in $O(n\cdot 2^k)$ steps, based on the solution communicated to us by Lorenzo Gianferrari Pini.
First of all, note that $A_{ij}$ takes values from the length $2^k$ vector $a$ , so we can rewrite the sum by agglomerating over the values of $a$ :
$\sum_{i,j=0}^{n-1}A_{ij}x_ix_j=\sum_{t=0}^{2^k-1}a_t\cdot\sum_{i,j:[Q_i==Q_j]=t}x_ix_j$
Define a length $2^k$ vector $u$ by $u_{t}=\sum_{i,j:[Q_i==Q_j]=t}x_ix_j$ . Given $u$ , the requested solution is $a\cdot u^T$ .
From now on we can focus on computing $u_{t}$ for a specific value of $t$ . This is done using a technique similar to the inclusion-exclusion principle.
We illustrate this for $k=5$ and
$Q_{1}=\left[0,0,0,0,0\right]$
$Q_{2}=\left[0,1,0,1,0\right]$
$Q_{3}=\left[0,1,0,0,0\right]$
In this case, $\sum_{i,j:\left[Q_{i}==Q_{j}\right]=\left\{ 1,3,5\right\} }x_{i}x_{j}=x_{1}x_{2}+x_{2}x_{1}$ since Q_2 agree precisely on the indices $1,3,5$ and disagree on $2,4$ . Since $Q_3$ Agrees with $Q_1$ on index $4$ and with $Q_2$ on index $2$ , it does not participate in this sum.
We now present a different way of obtaining this sum (more efficient when used in general). For any subset $A\subseteq\{1,\ldots,k\}$ Define an indicator function $\left[Q_{i}=Q_{j}\right]^A$ taking the value 1 if $Q_i$ and $Q_j$ agree on the indices in $A$ (indices not in $A$ are not considered at all) and 0 otherwise.
Now note that the sum $\sum_{i,j:\left[Q_{i}==Q_{j}\right]=\left\{ 1,3,5\right\} }x_{i}x_{j}$ can be written using indicators:
$\sum\_{i,j}x\_{i}x\_{j}\left[Q\_{i}=Q\_{j}\right]^{\left\{ 1\right\} }\left(1-\left[Q\_{i}=Q\_{j}\right]^{\left\{ 2\right\} }\right)\left[Q\_{i}=Q\_{j}\right]^{\left\{ 3\right\} }\left(1-\left[Q\_{i}=Q\_{j}\right]^{\left\{ 4\right\} }\right)\left[Q\_{i}=Q\_{j}\right]^{\left\{ 5\right\} }$
Opening the product, we obtain a sum of products of the form
$\sum\_{i,j}x\_{i}x\_{j}\left(\left[Q\_{i}=Q\_{j}\right]^{\left\{ 1,3,5\right\} }-\left[Q\_{i}=Q\_{j}\right]^{\left\{ 1,2,3,5\right\} }- \left[Q\_{i}=Q\_{j}\right]^{\left\{ 1,3,4,5\right\} }+\left[Q\_{i}=Q\_{j}\right]^{\left\{ 1,2,3,4,5\right\} }\right)$
Each summand is an indicator for a set $A$ which is a superset of $\{1,3,5\}$ . The sign of the summand is related to the oddity of the extra number of indices added to $A$ (indices not from $\{1,3,5\}$ , which come from a term of the form $1-\left[Q_{i}=Q_{j}\right]^{\left\{ r\right\} }$ ). We can write this in short form as
$\sum\_{i,j}x\_{i}x\_{j}\sum\_{\left\{ 1,3,5\right\} \subseteq A}\left(-1\right)^{\left|A\right|-\left|\left\{ 1,3,5\right\} \right|}\left[Q\_{i}=Q\_{j}\right]^{A}$
Switching the summation order we obtain
$\sum\_{\left\{ 1,3,5\right\} \subseteq A}\left(-1\right)^{\left|A\right|-\left|\left\{ 1,3,5\right\} \right|}\sum\_{i,j}\left[Q\_{i}=Q\_{j}\right]^{A}x\_{i}x\_{j}$
We are left with the question of how to compute $\sum_{i,j}\left[Q_{i}=Q_{j}\right]^{A}x_{i}x_{j}$ . This is relatively easy since we do not require $Q_i, Q_j$ to be distinct in the indices not present in $A$ . This means the relation $i\sim\_A j\iff\left[Q\_{i}=Q\_{j}\right]^{A}=1$ is an equivalence relation over the set $[n]=\{1,2,\ldots,n\}$ , and we can write: $\sum\_{i,j}\left[Q\_{i}=Q\_{j}\right]^{A}x\_{i}x\_{j}=\sum\_{X\in\left(\left[n\right]/\sim\_{A}\right)}\left(\sum\_{i\in X}x\_{i}\right)^{2}$
In our example, we have
$\sum_{i,j}\left[Q_{i}=Q_{j}\right]^{\left\{ 1,3,5\right\} }x_{i}x_{j}=\left(x_{1}+x_{2}+x_{3}\right)^{2}$ $\sum_{i,j}\left[Q_{i}=Q_{j}\right]^{\left\{ 1,2,3,5\right\} } x_{i}x_{j}=x_{1}^{2}+\left(x_{2}+x_{3}\right)^{2}$ $\sum_{i,j}\left[Q_{i}=Q_{j}\right]^{\left\{ 1,3,4,5\right\} } x_{i}x_{j}=\left(x_{1}+x_{3}\right)^{2}+x_{2}^{2}$ $\sum_{i,j}\left[Q_{i}=Q_{j}\right]^{\left\{ 1,2,3,4,5\right\} } x_{i}x_{j}=x_{1}^{2}+x_{2}^{2}+x_{3}^{2}$
And indeed
$\left(x\_{1}+x\_{2}+x\_{3}\right)^{2}-\left[x\_{1}^{2}+\left(x\_{2}+x\_{3}\right)^{2}\right]- \left[\left(x\_{1}+x\_{3}\right)^{2}+x\_{2}^{2}\right]+\left[x\_{1}^{2}+x\_{2}^{2}+x\_{3}^{2}\right]=x\_{1}x\_{2}+x\_{2}x\_{1}$
This works in general. For any $t\in\left\{ 0,1,\ldots,2^{k}-1\right\}$ , denote by $B_t$ the set of indices in the binary representation of $t$ which are $1$ . We have
$\sum\_{i,j:\left[Q\_{i}=Q\_{j}\right]=t}x\_{i}x\_{j}=\sum\_{B\_{t}\subseteq A}\left(-1\right)^{\left(\left|A\right|- \left|B\_{t}\right|\right)}\sum\_{X\in\left(\left[n\right]/\sim\_{A}\right)}\left(\sum\_{i\in X}x\_{i}\right)^{2}$
Since the expression $\sum_{X\in\left(\left[n\right]/\sim_{A}\right)}\left(\sum_{i\in X}x_{i}\right)^{2}$ is independent of $B_t$ , it suffices to compute it once, for every subset $A\subseteq\left\{ 0,1,\ldots,2^{k}-1\right\}$ . We create a vector $v$ such that
$v_{t}=\sum_{X\in\left(\left[n\right]/\sim_{A_{t}}\right)}\left(\sum_{i\in X}x_{i}\right)^{2}$
The inclusion-exclusion part is handled by defining a matrix
$M_{t,s}=\begin{cases}\left(-1\right)^{\left|A_{s}\right|-\left|B_{t}\right|} & B_{t}\subseteq A_{s}\\0 & B_{t}\not\subseteq A_{s}\end{cases}$
Recall that we defined
$u_{t}=\sum_{i,j:[Q_i==Q_j]=t}x_ix_j$
Such that $a\cdot u^T$ is the required solution. We have now seen that
$u=M\cdot v^T$
Thus, the algorithm for solving the problem efficiently is
1. Compute the matrix $M$ using the formula $M_{t,s}=\begin{cases}\left(-1\right)^{\left|A_{s}\right|-\left|B_{t}\right|} & B_{t}\subseteq A_{s}\\0 & B_{t}\not\subseteq A_{s}\end{cases}$
2. Compute the vector $v$ using the formula $v_{t}=\sum_{X\in\left(\left[n\right]/\sim_{A_{t}}\right)}\left(\sum_{i\in X}x_{i}\right)^{2}$
3. Compute $a\cdot M\cdot v^T$
Steps 1 and 3 involve operations polynomial in $O(2^k)$ ; the main computation is done in step 2, where for each of the $2^k$ values of $t$ , one has to go over the $O(n)$ vector $x$ and sum according to the equivalance classes.
An unoptimized Python code for this computation is
import functools import numpy as np k = 5 @functools.lru_cache(maxsize=None) def Q(i): return [int((2**k)* (np.sin((i+1)*(t+1))-np.floor(np.sin((i+1)*(t+1))))) for t in range(k)] def num_to_set(t): binary = bin(t)[2:] return {i for i, bit in enumerate(binary[::-1]) if bit == '1'} def M(t, s): A = num_to_set(s) B = num_to_set(t) if not A.issuperset(B): return 0 else: diff = len(A.difference(B)) if diff % 2 == 0: return 1 else: return -1 def equiv_class_sum(t, x): A = num_to_set(t) results = {} for i in range(n): Qi = Q(i) subQ = tuple([Qi[j] for j in A]) if subQ not in results: results[subQ] = 0 results[subQ] += x[i] return sum(res**2 for res in results.values()) n = 2**30 a = np.linspace(0, 1, 2**k).reshape((1,2**k)) M = np.array([[M(t,s) for s in range(2**k)] for t in range(2**k)]) x = np.linspace(-1,1,n) v = np.array([equiv_class_sum(t, x) for t in range(2**k)]).reshape((2**k,1)) print((a @ M @ v)[0][0])

Solvers

*Lazar Ilic (1/5/2023 4:26 PM IDT)
Lorenz Reichel (1/5/2023 6:25 PM IDT)
*Yan-Wu He (1/5/2023 10:24 PM IDT)
Amos Guler (2/5/2023 12:25 PM IDT)
*Alper Halbutogullari (2/5/2023 10:59 PM IDT)
*Bertram Felgenhauer (3/5/2023 1:49 PM IDT)
Dieter Beckerle (4/5/2023 5:52 PM IDT)
Sanandan Swaminathan (4/5/2023 5:55 PM IDT)
Julien Pradier (5/5/2023 5:27 PM IDT)
*Martin Thorne (5/5/2023 7:40 PM IDT)
Gary M. Gerken (6/5/2023 4:56 PM IDT)
*Li Li (7/5/2023 1:53 AM IDT)
Evan Semet (7/5/2023 3:26 AM IDT)
Philip Bui (7/5/2023 4:04 AM IDT)
*Harald Bögeholz (7/5/2023 8:05 AM IDT)
*Lawrence Au (7/5/2023 5:02 PM IDT)
Asaf Zimmerman (7/5/2023 6:14 PM IDT)
Lorenzo Gianferrari Pini (8/5/2023 12:29 PM IDT)
*Guy Daniel Hadas (8/5/2023 9:41 PM IDT)
*Peter Moser (9/5/2023 8:41 PM IDT)
Sachal Mahajan (10/5/2023 2:15 AM IDT)
Joram Meron (10/5/2023 12:58 PM IDT)
Stéphane Higueret (12/5/2023 7:21 AM IDT)
Daniel Chong Jyh Tar (12/5/2023 9:30 PM IDT)
Victor Chang (13/5/2023 5:04 AM IDT)
Michael Liepelt (13/5/2023 8:16 AM IDT)
*Dominik Reichl (13/5/2023 7:53 PM IDT)
*David F.H. Dunkley (14/5/2023 2:44 AM IDT)
*David Greer (14/5/2023 9:00 PM IDT)
Marco Bellocchi (16/5/2023 11:14 AM IDT)
*Daniel Chong Jyh Tar (16/5/2023 1:14 PM IDT)
*Tim Walters (16/5/2023 10:29 PM IDT)
*Andreas Stiller (17/5/2023 3:25 PM IDT)
Daniel Bitin (17/5/2023 6:50 PM IDT)
*Vladimir Volevich (18/5/2023 10:57 AM IDT)
*Dan Dima (18/5/2023 3:30 PM IDT)
Reda Kebbaj (19/5/2023 6:11 PM IDT)
*Bert Dobbelaere (19/5/2023 9:38 PM IDT)
*Daniel Copeland (21/5/2023 4:17 PM IDT)
*Daniel Mayer (21/5/2023 11:01 PM IDT)
Phil Proudman (22/5/2023 7:25 PM IDT)
*Hok Leung Nip (23/5/2023 8:10 PM IDT)
*Motty Porat (27/5/2023 3:21 AM IDT)
*Reiner Martin (27/5/2023 6:14 PM IDT)
Shouky Dan & Tamir Ganor (29/5/2023 11:56 AM IDT)
Latchezar Christov (30/5/2023 1:23 PM IDT)
*Diane Pham (31/5/2023 12:07 PM IDT)
*Kang Jin Cho (31/5/2023 7:51 PM IDT)
*Balakrishnan V (1/6/2023 5:48 AM IDT)
Nyles Heise (1/6/2023 7:56 PM IDT)
Matt Cristina (2/6/2023 12:55 AM IDT)
*Vaskor Basak (2/6/2023 2:52 PM IDT)

Ponder This Challenge - May 2023 - Computing quadratic forms

Solution

Solvers

Related posts

Ponder This Challenge - February 2026 - Blot-avoiding backgammon strategy

Ponder This Challenge - January 2026 - Number splitting

Ponder This Challenge - December 2025 - Sums of a prime and an even number

Ponder This Challenge - November 2025 - The CAT sequence