This article uses terms from linear algebra. If you understand them, you can skip this section.
Introduction
Let's recall the following well-known problem:
You can learn more about how to solve this problem here.
In this article, we will consider a more complex problem where there are also deletion queries of integers from the set S. Moreover, we receive the next query only after answering the previous one (i.e., the problem needs to be solved "online").
The initial problem can be solved in O(m) per query, but we will learn to solve the problem with deletions in O(m2).
Structure
In the problem with deletions, uncertainties begin when the integer we add to the XOR basis is represented by xors of existing integers of basis.
We want to effectively store each integer that is added to S, so we will maintain a sequence Xn (n≥1) of XOR basises (each of which is initially empty), instead of a single basis.
Adding a integer v to S
Let's consider how we will add a new integer v=0 to the set S:
We will gradually iterate over our basises starting from the first. Let i be the integer of the current basis we are considering;
If v∈/span(Xi), we simply add v to Xi and finish the procedure;
Otherwise, v∈span(Xi), so we can't add it to the basis, but we want to store this integer somewhere. Therefore, we increment i by 1 and return to step 2.
In other words, we find such minimal i that v∈/span(Xi) and add v to Xi. If we implement this algorithm directly, in the worst case it will work in O(∣S∣), which is quite inefficient.
We will improve this algorithm. First, let's consider some properties of this sequence of bases Xn:
Using these two properties, we can improve the algorithm so that addition works in O(m2) or even O(mlogm).
Recall that we find such minimal i that v∈/span(Xi) and add v to Xi. Since there are no more than m+1 unique span(Xi), we need only to iterate over those i, for which i=1 or span(Xi)=span(Xi−1). Then we achieve complexity O(m2). If we use binary search over these indices i, we achieve complexity O(mlogm).
Thus, we have learned how to add a integer v=0 to the set S.
Deleting a integer v∈S
Choose j such that v∈Xj. If we simply delete v from Xj, the properties we noted earlier will be violated. We aim to ensure that these properties hold after deletion.
We will delete in the following way:
While span(Xj)=span(Xj+1), we swap the bases Xj and Xj+1 (swap(Xj,Xj+1)) and increment j by 1. Since span(Xj)=span(Xj+1), it does not matter in what order these basises are located in the sequence, so we can rearrange them arbitrarily;
Now span(Xj)=span(Xj+1). We have two cases:
span(Xj∖{v})⊇span(Xj+1). In this case, we can simply delete the integer v from Xj and finish the procedure;
span(Xj∖{v})⊉span(Xj+1). Consider the following statement:
We find this y∈Xj+1 from the statement. Then we can replace v in Xj with y and now continue the procedure from step 1, but now we will be deleting the integer y from the set Xj+1.
Let's analyze the time complexity of the deletion procedure. How many times will we encounter case 2.2? It is easy to understand that no more than m times, since each time we return to step 1, the size of our current basis Xj decreases by at least 1.
Steps 2.1−2.2 can be done directly in O(m2):
Completely rebuild the XOR basis Xj after deleting the element v in O(m2);
Add elements from the basis Xj+1 to Xj∖{v} in O(m2) to find y or ensure that it does not exist.
Since we can do these steps at most m times in the worst case, deletion will work in O(m3).
In fact, we can perform steps 2.1−2.2 in O(m).
Before moving on to optimization, let's consider how we represent the XOR basises Xi so that we can work with them efficiently.
Let A and B be XOR basises that are built using the algorithm above. A XOR basis represented in such a way has several important properties and consequences:
Let's consider how exactly to delete a integer v from the basis and rebuild it in O(m):
Each integer in the basis also has a mask, which denotes the linear combination of the initial basis elements used to obtain this basis element val.
We need to ensure that the integer v is no longer present in the linear combinations, and at the same time, the basis remains correctly constructed (the property we mentioned above is preserved).
It turns out that this is quite simple to do. Namely, we find such a integer in the basis x that contains v in the linear combination and among such integers has the smallest highest bit.
Then we delete this found integer from the basis and modify other integers in the basis that contained v in the linear combination by xoring the corresponding integer with x. Thus, the highest bit in these integers will not change, so the property is preserved.
Simultaneously, we calculate the integer bits that contains all the highest bits of the basis integers that contained the integer v in the linear combination.
int rebuild_and_delete (int id) {
int pos = 0, mn = inf, p2 = 0;
for (int i = 0; i < basis.size(); i++) {
if (basis[i].id == id) {
pos = i;
}
}
int bits = 0;
for (int i = 0; i < basis.size(); i++) {
if (basis[i].mask & (1 << pos)) {
if (mn > basis[i].high) {
mn = basis[i].high;
p2 = i;
}
bits ^= 1 << basis[i].high;
}
}
if (p2 != pos) {
swap(basis[p2].id, basis[pos].id);
for (auto &i : basis) {
swap_bits(i.mask, pos, p2); // just swaps pos-th and p2-th bit in i.mask
}
pos = p2;
}
for (int i = 0; i < basis.size(); i++) {
if (i != pos) {
if (basis[i].mask & (1 << pos)) {
basis[i].val ^= basis[pos].val;
basis[i].mask ^= basis[pos].mask;
}
}
}
int good = (1 << pos) - 1;
int other = ((1 << len(basis)) - 1) ^ (good | (1 << pos));
basis.erase(basis.begin() + pos);
for (auto &i : basis) {
i.mask = ((i.mask & other) >> 1) | (i.mask & good);
}
return bits;
}
Next, we need to find y from Statement 1 in O(m) or say that it does not exist. For y, the following condition must hold: span((Xj∖{v})∪{y})=span(Xj). We know which highest bit disappeared when deleting the integer v from Xj. Therefore, when adding y∈Xj+1 to Xj∖{v}, y must "be responsible" for this disappeared bit.
Then, we can check for each x∈Xj+1 in O(1) whether it is suitable as y. It is not difficult to verify that x is suitable if and only if cnt(bits&x)mod2=1, where cnt(t) is a function that returns the integer of set bits in the integer.
int get_the_same_high_bit (int bits, vector <int> &val) {
for (auto &i : basis) {
if (__builtin_popcount(val[i.id] & bits) & 1) {
return i.id;
}
}
return -1;
}
Thus, we have learned to delete in O(m2).
Here is full version of the code:
P. S. Similarly to adding, we can try to use binary search to do deletion: find maximum r that span(Xj∖{v})⊉span(Xr). But after moving an integer from Xr to Xj the property of sequence of basises can still not hold. And thus using binary search doesn't improve time complexity.
For some time I thought that we can do better than O(m2), but didn't manage to improve. The reason why is that using this property span(Xi)⊇span(Xi+1) we can construct a sequence of integers ai, such that span(Xi)=span({a1,a2,…,adim(Xi)}) for all i. Having it, we can actually do additions in O(m), but still can't figure out how to optimize deletions.