# XOR Basis with deletions

## Preface

This article uses terms from linear algebra. If you understand them, you can skip this section.

## Introduction

Let's recall the following well-known problem:

You can learn more about how to solve this problem here.

In this article, we will consider a more complex problem where there are also deletion queries of integers from the set $S$. Moreover, we receive the next query only after answering the previous one (i.e., the problem needs to be solved "online").

The initial problem can be solved in $O(m)$ per query, but we will learn to solve the problem with deletions in $O(m_{2})$.

## Structure

In the problem with deletions, uncertainties begin when the integer we add to the XOR basis is represented by $xor$s of existing integers of basis.

We want to effectively store each integer that is added to $S$, so we will maintain a sequence $X_{n}$ ($nâ‰¥1$) of XOR basises (each of which is initially empty), instead of a single basis.

## Adding a integer $v$ to $S$

Let's consider how we will add a new integer $vî€=0$ to the set $S$:

We will gradually iterate over our basises starting from the first. Let $i$ be the integer of the current basis we are considering;

If $vâˆˆ/span(X_{i})$, we simply add $v$ to $X_{i}$ and finish the procedure;

Otherwise, $vâˆˆspan(X_{i})$, so we can't add it to the basis, but we want to store this integer somewhere. Therefore, we increment $i$ by $1$ and return to step $2$.

In other words, we find such minimal $i$ that $vâˆˆ/span(X_{i})$ and add $v$ to $X_{i}$. If we implement this algorithm directly, in the worst case it will work in $O(âˆ£Sâˆ£)$, which is quite inefficient.

We will improve this algorithm. First, let's consider some properties of this sequence of bases $X_{n}$:

Using these two properties, we can improve the algorithm so that addition works in $O(m_{2})$ or even $O(mgm)$.

Recall that we find such minimal $i$ that $vâˆˆ/span(X_{i})$ and add $v$ to $X_{i}$. Since there are no more than $m+1$ unique $span(X_{i})$, we need only to iterate over those $i$, for which $i=1$ or $span(X_{i})î€=span(X_{iâˆ’1})$. Then we achieve complexity $O(m_{2})$. If we use binary search over these indices $i$, we achieve complexity $O(mgm)$.

Thus, we have learned how to add a integer $vî€=0$ to the set $S$.

## Deleting a integer $vâˆˆS$

Choose $j$ such that $vâˆˆX_{j}$. If we simply delete $v$ from $X_{j}$, the properties we noted earlier will be violated. We aim to ensure that these properties hold after deletion.

We will delete in the following way:

While $span(X_{j})=span(X_{j+1})$, we swap the bases $X_{j}$ and $X_{j+1}$ ($swap(X_{j},X_{j+1})$) and increment $j$ by $1$. Since $span(X_{j})=span(X_{j+1})$, it does not matter in what order these basises are located in the sequence, so we can rearrange them arbitrarily;

Now $span(X_{j})î€=span(X_{j+1})$. We have two cases:

$span(X_{j}âˆ–{v})âŠ‡span(X_{j+1})$. In this case, we can simply delete the integer $v$ from $X_{j}$ and finish the procedure;

$span(X_{j}âˆ–{v})âŠ‰span(X_{j+1})$. Consider the following statement:

We find this $yâˆˆX_{j+1}$ from the statement. Then we can replace $v$ in $X_{j}$ with $y$ and now continue the procedure from step $1$, but now we will be deleting the integer $y$ from the set $X_{j+1}$.

Let's analyze the time complexity of the deletion procedure. How many times will we encounter case $2.2$? It is easy to understand that no more than $m$ times, since each time we return to step $1$, the size of our current basis $X_{j}$ decreases by at least $1$.

Steps $2.1âˆ’2.2$ can be done directly in $O(m_{2})$:

Completely rebuild the XOR basis $X_{j}$ after deleting the element $v$ in $O(m_{2})$;

Add elements from the basis $X_{j+1}$ to $X_{j}âˆ–{v}$ in $O(m_{2})$ to find $y$ or ensure that it does not exist.

Since we can do these steps at most $m$ times in the worst case, deletion will work in $O(m_{3})$.

In fact, we can perform steps $2.1âˆ’2.2$ in $O(m)$.

Before moving on to optimization, let's consider how we represent the XOR basises $X_{i}$ so that we can work with them efficiently.

Let $A$ and $B$ be XOR basises that are built using the algorithm above. A XOR basis represented in such a way has several important properties and consequences:

Let's consider how exactly to delete a integer $v$ from the basis and rebuild it in $O(m)$:

Each integer in the basis also has a $mask$, which denotes the linear combination of the initial basis elements used to obtain this basis element $val$.

We need to ensure that the integer $v$ is no longer present in the linear combinations, and at the same time, the basis remains correctly constructed (the property we mentioned above is preserved).

It turns out that this is quite simple to do. Namely, we find such a integer in the basis $x$ that contains $v$ in the linear combination and among such integers has the smallest highest bit.

Then we delete this found integer from the basis and modify other integers in the basis that contained $v$ in the linear combination by xoring the corresponding integer with $x$. Thus, the highest bit in these integers will not change, so the property is preserved.

Simultaneously, we calculate the integer $bits$ that contains all the highest bits of the basis integers that contained the integer $v$ in the linear combination.

int rebuild_and_delete (int id) { int pos = 0, mn = inf, p2 = 0; for (int i = 0; i < basis.size(); i++) { if (basis[i].id == id) { pos = i; } } int bits = 0; for (int i = 0; i < basis.size(); i++) { if (basis[i].mask & (1 << pos)) { if (mn > basis[i].high) { mn = basis[i].high; p2 = i; } bits ^= 1 << basis[i].high; } } if (p2 != pos) { swap(basis[p2].id, basis[pos].id); for (auto &i : basis) { swap_bits(i.mask, pos, p2); // just swaps pos-th and p2-th bit in i.mask } pos = p2; } for (int i = 0; i < basis.size(); i++) { if (i != pos) { if (basis[i].mask & (1 << pos)) { basis[i].val ^= basis[pos].val; basis[i].mask ^= basis[pos].mask; } } } int good = (1 << pos) - 1; int other = ((1 << len(basis)) - 1) ^ (good | (1 << pos)); basis.erase(basis.begin() + pos); for (auto &i : basis) { i.mask = ((i.mask & other) >> 1) | (i.mask & good); } return bits; }

Next, we need to find $y$ from Statement $1$ in $O(m)$ or say that it does not exist. For $y$, the following condition must hold: $span((X_{j}âˆ–{v})âˆª{y})=span(X_{j})$. We know which highest bit disappeared when deleting the integer $v$ from $X_{j}$. Therefore, when adding $yâˆˆX_{j+1}$ to $X_{j}âˆ–{v}$, $y$ must "be responsible" for this disappeared bit.

Then, we can check for each $xâˆˆX_{j+1}$ in $O(1)$ whether it is suitable as $y$. It is not difficult to verify that $x$ is suitable if and only if $cnt(bitsÂ&Âx)ÂmodÂ2=1$, where $cnt(t)$ is a function that returns the integer of set bits in the integer.

int get_the_same_high_bit (int bits, vector <int> &val) { for (auto &i : basis) { if (__builtin_popcount(val[i.id] & bits) & 1) { return i.id; } } return -1; }

Thus, we have learned to delete in $O(m_{2})$.

Here is full version of the code:

P. S. Similarly to adding, we can try to use binary search to do deletion: find maximum $r$ that $span(X_{j}âˆ–{v})âŠ‰span(X_{r})$. But after moving an integer from $X_{r}$ to $X_{j}$ the property of sequence of basises can still not hold. And thus using binary search doesn't improve time complexity.

For some time I thought that we can do better than $O(m_{2})$, but didn't manage to improve. The reason why is that using this property $span(X_{i})âŠ‡span(X_{i+1})$ we can construct a sequence of integers $a_{i}$, such that $span(X_{i})=span({a_{1},a_{2},â€¦,a_{dim(X_{i})}})$ for all $i$. Having it, we can actually do additions in $O(m)$, but still can't figure out how to optimize deletions.