Basecamp
    Home
    Problems
    Contests
    Courses
    Rating
    Posts
    Discord
XOR Basis with deletions
Sign in
denisovkostya
•Article•7 months ago

XOR Basis with deletions

Preface

This article uses terms from linear algebra. If you understand them, you can skip this section.

An integer between 0 (inclusive) and 2m (exclusive) can be considered as an m-dimensional vector, taking into account its binary representation. The operation of computing xor (exclusive OR) corresponds to summing each element modulo 2, and the operation of selecting certain elements from a set of integers and computing their xor corresponds to multiplying vectors by 0 or 1 and summing them modulo 2. The sum of several scalar multiplications and vectors is called a linear combination.

span(X) denotes the set of all possible xors of some elements of the set X. For example, span({1,2,4})={0,1,2,4,1⊕2,1⊕4,2⊕4,1⊕2⊕4}={0,1,2,3,4,5,6,7} (xor of the empty set equals 0).

Introduction

Note

In the article, we assume that all integers are within [0,2m).

Let's recall the following well-known problem:

Problem

There are q queries of the following type:

  • Add a integer x to the set S (initially it is empty).

After each query, you need to output the size of an arbitrary XOR basis of the current set S.

An XOR basis of a set S is the minimal set of integers X that allows representing any integer y∈S using xors of integers from X.

You can learn more about how to solve this problem here.

In this article, we will consider a more complex problem where there are also deletion queries of integers from the set S. Moreover, we receive the next query only after answering the previous one (i.e., the problem needs to be solved "online").

The initial problem can be solved in O(m) per query, but we will learn to solve the problem with deletions in O(m2).

Structure

In the problem with deletions, uncertainties begin when the integer we add to the XOR basis is represented by xors of existing integers of basis.

We want to effectively store each integer that is added to S, so we will maintain a sequence Xn​ (n≥1) of XOR basises (each of which is initially empty), instead of a single basis.

Adding a integer v to S

Let's consider how we will add a new integer v=0 to the set S:

  1. We will gradually iterate over our basises starting from the first. Let i be the integer of the current basis we are considering;

  2. If v∈/span(Xi​), we simply add v to Xi​ and finish the procedure;

  3. Otherwise, v∈span(Xi​), so we can't add it to the basis, but we want to store this integer somewhere. Therefore, we increment i by 1 and return to step 2.

In other words, we find such minimal i that v∈/span(Xi​) and add v to Xi​. If we implement this algorithm directly, in the worst case it will work in O(∣S∣), which is quite inefficient.

We will improve this algorithm. First, let's consider some properties of this sequence of bases Xn​:

Property 1

span(Xi​)⊇span(Xi+1​) for all i≥1.

Property 2

There are no more than m+1 unique span(Xi​).

Using these two properties, we can improve the algorithm so that addition works in O(m2) or even O(mlogm).

Recall that we find such minimal i that v∈/span(Xi​) and add v to Xi​. Since there are no more than m+1 unique span(Xi​), we need only to iterate over those i, for which i=1 or span(Xi​)=span(Xi−1​). Then we achieve complexity O(m2). If we use binary search over these indices i, we achieve complexity O(mlogm).

Thus, we have learned how to add a integer v=0 to the set S.

Deleting a integer v∈S

Choose j such that v∈Xj​. If we simply delete v from Xj​, the properties we noted earlier will be violated. We aim to ensure that these properties hold after deletion.

We will delete in the following way:

  1. While span(Xj​)=span(Xj+1​), we swap the bases Xj​ and Xj+1​ (swap(Xj​,Xj+1​)) and increment j by 1. Since span(Xj​)=span(Xj+1​), it does not matter in what order these basises are located in the sequence, so we can rearrange them arbitrarily;

  2. Now span(Xj​)=span(Xj+1​). We have two cases:

    1. span(Xj​∖{v})⊇span(Xj+1​). In this case, we can simply delete the integer v from Xj​ and finish the procedure;

    2. span(Xj​∖{v})⊉span(Xj+1​). Consider the following statement:

      Statement 1

      There exists y∈Xj+1​ such that span((Xj​∖{v})∪{y})=span(Xj​).

      We find this y∈Xj+1​ from the statement. Then we can replace v in Xj​ with y and now continue the procedure from step 1, but now we will be deleting the integer y from the set Xj+1​.

Let's analyze the time complexity of the deletion procedure. How many times will we encounter case 2.2? It is easy to understand that no more than m times, since each time we return to step 1, the size of our current basis Xj​ decreases by at least 1.

Steps 2.1−2.2 can be done directly in O(m2):

  1. Completely rebuild the XOR basis Xj​ after deleting the element v in O(m2);

  2. Add elements from the basis Xj+1​ to Xj​∖{v} in O(m2) to find y or ensure that it does not exist.

Since we can do these steps at most m times in the worst case, deletion will work in O(m3).

In fact, we can perform steps 2.1−2.2 in O(m).

Before moving on to optimization, let's consider how we represent the XOR basises Xi​ so that we can work with them efficiently.

XOR basis
vector <node> basis;
bool add (int x, int id) {
    int mask = 0;
    for (auto &i : basis) {
        if (x > (x ^ i.val)) {
            x ^= i.val;
            mask ^= i.mask;
        } 
    }
    if (x) {
        mask |= 1 << ((int)basis.size());
        for (auto &i : basis) {
            if (i.val > (i.val ^ x)) {
                i.val ^= x;
                i.mask ^= mask;
            }
        }
        basis.push_back(node{val: x, id: id, mask: mask, high: get_high(x)});
        return true;
    }
    return false;
}
C++
22 lines
529 bytes

Let A and B be XOR basises that are built using the algorithm above. A XOR basis represented in such a way has several important properties and consequences:

Property of the constructed basis

Each integer x∈A "is responsible" for a certain bit in this basis.

Formally, let high(t) be a function that returns the highest bit of the integer t. Then, for every y∈A, y=x, we have that the integer y does not have the bit high(x). That is, only the integer x has the bit high(x).

First Consequence of the Property

span(A)=span(B) if and only if A=B as sets.

Let's consider how exactly to delete a integer v from the basis and rebuild it in O(m):

  1. Each integer in the basis also has a mask, which denotes the linear combination of the initial basis elements used to obtain this basis element val.

  2. We need to ensure that the integer v is no longer present in the linear combinations, and at the same time, the basis remains correctly constructed (the property we mentioned above is preserved).

  3. It turns out that this is quite simple to do. Namely, we find such a integer in the basis x that contains v in the linear combination and among such integers has the smallest highest bit.

  4. Then we delete this found integer from the basis and modify other integers in the basis that contained v in the linear combination by xoring the corresponding integer with x. Thus, the highest bit in these integers will not change, so the property is preserved.

  5. Simultaneously, we calculate the integer bits that contains all the highest bits of the basis integers that contained the integer v in the linear combination.

int rebuild_and_delete (int id) {
    int pos = 0, mn = inf, p2 = 0;
    for (int i = 0; i < basis.size(); i++) {
        if (basis[i].id == id) {
            pos = i;
        }
    }
    int bits = 0;
    for (int i = 0; i < basis.size(); i++) {
        if (basis[i].mask & (1 << pos)) {
            if (mn > basis[i].high) {
                mn = basis[i].high;
                p2 = i;
            }
            bits ^= 1 << basis[i].high;
        }
    }

    if (p2 != pos) {
        swap(basis[p2].id, basis[pos].id);
        for (auto &i : basis) {
            swap_bits(i.mask, pos, p2); // just swaps pos-th and p2-th bit in i.mask
        }
        pos = p2;
    }
    for (int i = 0; i < basis.size(); i++) {
        if (i != pos) {
            if (basis[i].mask & (1 << pos)) {
                basis[i].val ^= basis[pos].val;
                basis[i].mask ^= basis[pos].mask;
            }
        }
    }
    int good = (1 << pos) - 1;
    int other = ((1 << len(basis)) - 1) ^ (good | (1 << pos));
    basis.erase(basis.begin() + pos);
    for (auto &i : basis) {
        i.mask = ((i.mask & other) >> 1) | (i.mask & good);
    }
    return bits;
}
C++
41 lines
1161 bytes

Next, we need to find y from Statement 1 in O(m) or say that it does not exist. For y, the following condition must hold: span((Xj​∖{v})∪{y})=span(Xj​). We know which highest bit disappeared when deleting the integer v from Xj​. Therefore, when adding y∈Xj+1​ to Xj​∖{v}, y must "be responsible" for this disappeared bit.

Then, we can check for each x∈Xj+1​ in O(1) whether it is suitable as y. It is not difficult to verify that x is suitable if and only if cnt(bits & x) mod 2=1, where cnt(t) is a function that returns the integer of set bits in the integer.

int get_the_same_high_bit (int bits, vector <int> &val) {
    for (auto &i : basis) {
        if (__builtin_popcount(val[i.id] & bits) & 1) {
            return i.id;
        }
    }
    return -1;
}
C++
8 lines
200 bytes

Thus, we have learned to delete in O(m2).

Here is full version of the code:

Code
//#pragma GCC optimize("Ofast", "unroll-loops")
//#pragma GCC target("sse", "sse2", "sse3", "ssse3", "sse4")

#include <bits/stdc++.h>

#define all(a) a.begin(),a.end()
#define len(a) (int)(a.size())
#define mp make_pair
#define pb push_back
#define fir first
#define sec second
#define fi first
#define se second

using namespace std;

typedef pair<int, int> pii;
typedef long long ll;
typedef long double ld;

template<typename T>
bool umin(T &a, T b) {
    if (b < a) {
        a = b;
        return true;
    }
    return false;
}
template<typename T>
bool umax(T &a, T b) {
    if (a < b) {
        a = b;
        return true;
    }
    return false;
}

#ifdef KIVI
#define DEBUG for (bool _FLAG = true; _FLAG; _FLAG = false)
#define LOG(...) print(#__VA_ARGS__" ::", __VA_ARGS__) << endl
template <class ...Ts> auto &print(Ts ...ts) { return ((cerr << ts << " "), ...); }
#else
#define DEBUG while (false)
#define LOG(...)
#endif

const int max_n = -1, inf = 1000111222;


const int C = 20;

struct node {
    int val, id, mask, high;
};


inline int get_high (int x) {
    if (x == 0) {
        return -1;
    }
    return 31 - __builtin_clz(x);
}

inline void swap_bits (int &x, int a, int b) {
    int x1 = bool(x & (1 << a));
    int x2 = bool(x & (1 << b));
    x ^= (x1 ^ x2) << a;
    x ^= (x1 ^ x2) << b;
}

struct xor_basis {
    vector <node> basis;

    inline bool add (int x, int id) {
        int mask = 0;
        for (auto &i : basis) {
            if (umin(x, x ^ i.val)) {
                mask ^= i.mask;
            }
        }
        if (x) {
            mask |= 1 << len(basis);
            for (auto &i : basis) {
                if (umin(i.val, i.val ^ x)) {
                    i.mask ^= mask;
                }
            }
            basis.pb(node{val: x, id: id, mask: mask, high: get_high(x)});
            return true;
        }
        return false;
    }



    inline int get_the_same_high_bit (int bits, const vector <int> &val) {
        for (auto &i : basis) {
            if (__builtin_popcount(val[i.id] & bits) & 1) {
                return i.id;
            }
        }
        return -1;
    }


    inline int rebuild_and_delete (int id) {
        int pos = 0, mn = inf, p2 = 0;
        for (int i = 0; i < len(basis); i++) {
            if (basis[i].id == id) {
                pos = i;
            }
        }
        int bits = 0;
        for (int i = 0; i < len(basis); i++) {
            if (basis[i].mask & (1 << pos)) {
                if (umin(mn, basis[i].high)) {
                    p2 = i;
                }
                bits ^= 1 << basis[i].high;
            }
        }

        if (p2 != pos) {
            swap(basis[p2].id, basis[pos].id);
            for (auto &i : basis) {
                swap_bits(i.mask, pos, p2);
            }
            pos = p2;
        }
        for (int i = 0; i < len(basis); i++) {
            if (i != pos) {
                if (basis[i].mask & (1 << pos)) {
                    basis[i].val ^= basis[pos].val;
                    basis[i].mask ^= basis[pos].mask;
                }
            }
        }
        int good = (1 << pos) - 1;
        int other = ((1 << len(basis)) - 1) ^ (good | (1 << pos));
        basis.erase(basis.begin() + pos);
        for (auto &i : basis) {
            i.mask = ((i.mask & other) >> 1) | (i.mask & good);
        }
        return bits;
    }

};

template<int max_bit> // not inclusive
struct xor_basis_online {

    vector <xor_basis> basises[max_bit + 1];

    int mx;

    vector <pii> where;
    vector <int> val;

    xor_basis_online() : mx(0), cur_id(0) {}

    int cur_id;

    inline int add (int x) {
        val.pb(x);
        where.pb(make_pair(-1, -1));
        int id = cur_id++;
        if (x == 0) {
            return id;
        }
        for (int i = mx; i >= 0; i--) {
            if (basises[i].empty()) {
                continue;
            }
            if (basises[i].back().add(x, id)) {
                basises[i + 1].pb(basises[i].back());
                basises[i].pop_back();
                umax(mx, i + 1);
                for (auto &j : basises[i + 1].back().basis) {
                    where[j.id] = make_pair(i + 1, len(basises[i + 1]) - 1);
                }
                return id;
            }
        }
        basises[1].pb(xor_basis());
        basises[1].back().add(x, id);
        where.back() = make_pair(1, len(basises[1]) - 1);
        umax(mx, 1);
        return id;
    }

    inline int move_back (int sz, int pos) {
        int to = len(basises[sz]) - 1;
        if (to == pos) {
            return pos;
        }
        for (auto &i : basises[sz][pos].basis) {
            where[i.id].second = to;
        }
        for (auto &i : basises[sz][to].basis) {
            where[i.id].second = pos;
        }
        swap(basises[sz][pos], basises[sz][to]);
        return to;
    }

    inline void del (int id) {
        if (val[id] == 0) {
            return;
        }
        int sz, pos;
        tie(sz, pos) = where[id];
        pos = move_back(sz, pos);
        while (true) {
            int bits = basises[sz].back().rebuild_and_delete(id);
            int i = sz - 1;
            while (i > 0 && basises[i].empty()) {
                --i;
            }
            int new_id = -1;
            if (i > 0) {
                new_id = basises[i].back().get_the_same_high_bit(bits, val);
            }
            if (new_id == -1) {
                if (sz > 1) {
                    basises[sz - 1].pb(basises[sz].back());
                    for (auto &j : basises[sz - 1].back().basis) {
                        where[j.id] = make_pair(sz - 1, len(basises[sz - 1]) - 1);
                    }
                }
                basises[sz].pop_back();
                if (mx > 0 && basises[mx].empty()) {
                    --mx;
                }
                return;
            }
            int cur = val[new_id];
            assert(basises[sz].back().add(cur, new_id));
            int new_sz = i;
            int new_pos = len(basises[i]) - 1;
            where[new_id] = make_pair(sz, pos);
            id = new_id;
            sz = new_sz;
            pos = new_pos;
        }
    }

    inline int size () {
        return mx;
    }
};

int main() {
//    freopen("input.txt", "r", stdin);
//    freopen("output.txt", "w", stdout);

    ios_base::sync_with_stdio(0);
    cin.tie(0);


    // solution to https://basecamp.eolymp.com/uk/problems/11732
    int n, q, add;
    cin >> n >> add;
    vector <int> p(n);
    xor_basis_online<19> t;
    vector <int> now(n);
    for (int i = 0; i < n; i++) {
        cin >> p[i];
        now[i] = t.add(i ^ p[i]);
    }
    cin >> q;
    int ans = t.size();
    cout << ans << '\n';
    for (int i = 0, l, r; i < q; i++) {
        cin >> l >> r;
        l = (l + ans * add) % n;
        r = (r + ans * add) % n;
        if (l != r) {
            t.del(now[l]);
            t.del(now[r]);
            swap(p[l], p[r]);
            now[l] = t.add(l ^ p[l]);
            now[r] = t.add(r ^ p[r]);
        }
        ans = t.size();
        cout << ans << '\n';
    }
}
C++
287 lines
7127 bytes

P. S. Similarly to adding, we can try to use binary search to do deletion: find maximum r that span(Xj​∖{v})⊉span(Xr​). But after moving an integer from Xr​ to Xj​ the property of sequence of basises can still not hold. And thus using binary search doesn't improve time complexity.

For some time I thought that we can do better than O(m2), but didn't manage to improve. The reason why is that using this property span(Xi​)⊇span(Xi+1​) we can construct a sequence of integers ai​, such that span(Xi​)=span({a1​,a2​,…,adim(Xi​)​}) for all i. Having it, we can actually do additions in O(m), but still can't figure out how to optimize deletions.


36

Comments (1)

Loading

Just a moment, getting data from the server
SANYAAAAAAAAAAAAA•28 days ago•2 revision

Is it possible to get size of set of integers that belong to a span of at least one basis, several basises given

0