•Article•9 months ago

XOR Basis with deletions

Preface

This article uses terms from linear algebra. If you understand them, you can skip this section.

Introduction

Let's recall the following well-known problem:

You can learn more about how to solve this problem here.

In this article, we will consider a more complex problem where there are also deletion queries of integers from the set $S$ . Moreover, we receive the next query only after answering the previous one (i.e., the problem needs to be solved "online").

The initial problem can be solved in $O (m)$ per query, but we will learn to solve the problem with deletions in $O (m^{2})$ .

Structure

In the problem with deletions, uncertainties begin when the integer we add to the XOR basis is represented by $xor$ s of existing integers of basis.

We want to effectively store each integer that is added to $S$ , so we will maintain a sequence $X_{n}$ ( $n \geq 1$ ) of XOR basises (each of which is initially empty), instead of a single basis.

Adding a integer $v$ to $S$

Let's consider how we will add a new integer $v \neq = 0$ to the set $S$ :

We will gradually iterate over our basises starting from the first. Let $i$ be the integer of the current basis we are considering;
If $v \in / span (X_{i})$ , we simply add $v$ to $X_{i}$ and finish the procedure;
Otherwise, $v \in span (X_{i})$ , so we can't add it to the basis, but we want to store this integer somewhere. Therefore, we increment $i$ by $1$ and return to step $2$ .

In other words, we find such minimal $i$ that $v \in / span (X_{i})$ and add $v$ to $X_{i}$ . If we implement this algorithm directly, in the worst case it will work in $O (∣ S ∣)$ , which is quite inefficient.

We will improve this algorithm. First, let's consider some properties of this sequence of bases $X_{n}$ :

Using these two properties, we can improve the algorithm so that addition works in $O (m^{2})$ or even $O (m lo g m)$ .

Recall that we find such minimal $i$ that $v \in / span (X_{i})$ and add $v$ to $X_{i}$ . Since there are no more than $m + 1$ unique $span (X_{i})$ , we need only to iterate over those $i$ , for which $i = 1$ or $span (X_{i}) \neq = span (X_{i - 1})$ . Then we achieve complexity $O (m^{2})$ . If we use binary search over these indices $i$ , we achieve complexity $O (m lo g m)$ .

Thus, we have learned how to add a integer $v \neq = 0$ to the set $S$ .

Deleting a integer $v \in S$

Choose $j$ such that $v \in X_{j}$ . If we simply delete $v$ from $X_{j}$ , the properties we noted earlier will be violated. We aim to ensure that these properties hold after deletion.

We will delete in the following way:

While $span (X_{j}) = span (X_{j + 1})$ , we swap the bases $X_{j}$ and $X_{j + 1}$ ( $swap (X_{j}, X_{j + 1})$ ) and increment $j$ by $1$ . Since $span (X_{j}) = span (X_{j + 1})$ , it does not matter in what order these basises are located in the sequence, so we can rearrange them arbitrarily;
Now $span (X_{j}) \neq = span (X_{j + 1})$ . We have two cases:
1. $span (X_{j} ∖ {v}) \supseteq span (X_{j + 1})$ . In this case, we can simply delete the integer $v$ from $X_{j}$ and finish the procedure;
2. $span (X_{j} ∖ {v}) ⊉ span (X_{j + 1})$ . Consider the following statement:
  Statement 1
  There exists $y \in X_{j + 1}$ such that $span ((X_{j} ∖ {v}) \cup {y}) = span (X_{j})$ .
  We find this $y \in X_{j + 1}$ from the statement. Then we can replace $v$ in $X_{j}$ with $y$ and now continue the procedure from step $1$ , but now we will be deleting the integer $y$ from the set $X_{j + 1}$ .

Let's analyze the time complexity of the deletion procedure. How many times will we encounter case $2.2$ ? It is easy to understand that no more than $m$ times, since each time we return to step $1$ , the size of our current basis $X_{j}$ decreases by at least $1$ .

Steps $2.1 - 2.2$ can be done directly in $O (m^{2})$ :

Completely rebuild the XOR basis $X_{j}$ after deleting the element $v$ in $O (m^{2})$ ;
Add elements from the basis $X_{j + 1}$ to $X_{j} ∖ {v}$ in $O (m^{2})$ to find $y$ or ensure that it does not exist.

Since we can do these steps at most $m$ times in the worst case, deletion will work in $O (m^{3})$ .

In fact, we can perform steps $2.1 - 2.2$ in $O (m)$ .

Before moving on to optimization, let's consider how we represent the XOR basises $X_{i}$ so that we can work with them efficiently.

Let $A$ and $B$ be XOR basises that are built using the algorithm above. A XOR basis represented in such a way has several important properties and consequences:

Let's consider how exactly to delete a integer $v$ from the basis and rebuild it in $O (m)$ :

Each integer in the basis also has a $ma s k$ , which denotes the linear combination of the initial basis elements used to obtain this basis element $v a l$ .
We need to ensure that the integer $v$ is no longer present in the linear combinations, and at the same time, the basis remains correctly constructed (the property we mentioned above is preserved).
It turns out that this is quite simple to do. Namely, we find such a integer in the basis $x$ that contains $v$ in the linear combination and among such integers has the smallest highest bit.
Then we delete this found integer from the basis and modify other integers in the basis that contained $v$ in the linear combination by xoring the corresponding integer with $x$ . Thus, the highest bit in these integers will not change, so the property is preserved.
Simultaneously, we calculate the integer $bi t s$ that contains all the highest bits of the basis integers that contained the integer $v$ in the linear combination.

int rebuild_and_delete (int id) {
    int pos = 0, mn = inf, p2 = 0;
    for (int i = 0; i < basis.size(); i++) {
        if (basis[i].id == id) {
            pos = i;
        }
    }
    int bits = 0;
    for (int i = 0; i < basis.size(); i++) {
        if (basis[i].mask & (1 << pos)) {
            if (mn > basis[i].high) {
                mn = basis[i].high;
                p2 = i;
            }
            bits ^= 1 << basis[i].high;
        }
    }

    if (p2 != pos) {
        swap(basis[p2].id, basis[pos].id);
        for (auto &i : basis) {
            swap_bits(i.mask, pos, p2); // just swaps pos-th and p2-th bit in i.mask
        }
        pos = p2;
    }
    for (int i = 0; i < basis.size(); i++) {
        if (i != pos) {
            if (basis[i].mask & (1 << pos)) {
                basis[i].val ^= basis[pos].val;
                basis[i].mask ^= basis[pos].mask;
            }
        }
    }
    int good = (1 << pos) - 1;
    int other = ((1 << len(basis)) - 1) ^ (good | (1 << pos));
    basis.erase(basis.begin() + pos);
    for (auto &i : basis) {
        i.mask = ((i.mask & other) >> 1) | (i.mask & good);
    }
    return bits;
}

Next, we need to find $y$ from Statement $1$ in $O (m)$ or say that it does not exist. For $y$ , the following condition must hold: $span ((X_{j} ∖ {v}) \cup {y}) = span (X_{j})$ . We know which highest bit disappeared when deleting the integer $v$ from $X_{j}$ . Therefore, when adding $y \in X_{j + 1}$ to $X_{j} ∖ {v}$ , $y$ must "be responsible" for this disappeared bit.

Then, we can check for each $x \in X_{j + 1}$ in $O (1)$ whether it is suitable as $y$ . It is not difficult to verify that $x$ is suitable if and only if $cnt (bi t s & x) mod 2 = 1$ , where $cnt (t)$ is a function that returns the integer of set bits in the integer.

int get_the_same_high_bit (int bits, vector <int> &val) {
    for (auto &i : basis) {
        if (__builtin_popcount(val[i.id] & bits) & 1) {
            return i.id;
        }
    }
    return -1;
}

Thus, we have learned to delete in $O (m^{2})$ .

Here is full version of the code:

//#pragma GCC optimize("Ofast", "unroll-loops")
//#pragma GCC target("sse", "sse2", "sse3", "ssse3", "sse4")

#include <bits/stdc++.h>

#define all(a) a.begin(),a.end()
#define len(a) (int)(a.size())
#define mp make_pair
#define pb push_back
#define fir first
#define sec second
#define fi first
#define se second

using namespace std;

typedef pair<int, int> pii;
typedef long long ll;
typedef long double ld;

template<typename T>
bool umin(T &a, T b) {
    if (b < a) {
        a = b;
        return true;
    }
    return false;
}
template<typename T>
bool umax(T &a, T b) {
    if (a < b) {
        a = b;
        return true;
    }
    return false;
}

#ifdef KIVI
#define DEBUG for (bool _FLAG = true; _FLAG; _FLAG = false)
#define LOG(...) print(#__VA_ARGS__" ::", __VA_ARGS__) << endl
template <class ...Ts> auto &print(Ts ...ts) { return ((cerr << ts << " "), ...); }
#else
#define DEBUG while (false)
#define LOG(...)
#endif

const int max_n = -1, inf = 1000111222;


const int C = 20;

struct node {
    int val, id, mask, high;
};


inline int get_high (int x) {
    if (x == 0) {
        return -1;
    }
    return 31 - __builtin_clz(x);
}

inline void swap_bits (int &x, int a, int b) {
    int x1 = bool(x & (1 << a));
    int x2 = bool(x & (1 << b));
    x ^= (x1 ^ x2) << a;
    x ^= (x1 ^ x2) << b;
}

struct xor_basis {
    vector <node> basis;

    inline bool add (int x, int id) {
        int mask = 0;
        for (auto &i : basis) {
            if (umin(x, x ^ i.val)) {
                mask ^= i.mask;
            }
        }
        if (x) {
            mask |= 1 << len(basis);
            for (auto &i : basis) {
                if (umin(i.val, i.val ^ x)) {
                    i.mask ^= mask;
                }
            }
            basis.pb(node{val: x, id: id, mask: mask, high: get_high(x)});
            return true;
        }
        return false;
    }



    inline int get_the_same_high_bit (int bits, const vector <int> &val) {
        for (auto &i : basis) {
            if (__builtin_popcount(val[i.id] & bits) & 1) {
                return i.id;
            }
        }
        return -1;
    }


    inline int rebuild_and_delete (int id) {
        int pos = 0, mn = inf, p2 = 0;
        for (int i = 0; i < len(basis); i++) {
            if (basis[i].id == id) {
                pos = i;
            }
        }
        int bits = 0;
        for (int i = 0; i < len(basis); i++) {
            if (basis[i].mask & (1 << pos)) {
                if (umin(mn, basis[i].high)) {
                    p2 = i;
                }
                bits ^= 1 << basis[i].high;
            }
        }

        if (p2 != pos) {
            swap(basis[p2].id, basis[pos].id);
            for (auto &i : basis) {
                swap_bits(i.mask, pos, p2);
            }
            pos = p2;
        }
        for (int i = 0; i < len(basis); i++) {
            if (i != pos) {
                if (basis[i].mask & (1 << pos)) {
                    basis[i].val ^= basis[pos].val;
                    basis[i].mask ^= basis[pos].mask;
                }
            }
        }
        int good = (1 << pos) - 1;
        int other = ((1 << len(basis)) - 1) ^ (good | (1 << pos));
        basis.erase(basis.begin() + pos);
        for (auto &i : basis) {
            i.mask = ((i.mask & other) >> 1) | (i.mask & good);
        }
        return bits;
    }

};

template<int max_bit> // not inclusive
struct xor_basis_online {

    vector <xor_basis> basises[max_bit + 1];

    int mx;

    vector <pii> where;
    vector <int> val;

    xor_basis_online() : mx(0), cur_id(0) {}

    int cur_id;

    inline int add (int x) {
        val.pb(x);
        where.pb(make_pair(-1, -1));
        int id = cur_id++;
        if (x == 0) {
            return id;
        }
        for (int i = mx; i >= 0; i--) {
            if (basises[i].empty()) {
                continue;
            }
            if (basises[i].back().add(x, id)) {
                basises[i + 1].pb(basises[i].back());
                basises[i].pop_back();
                umax(mx, i + 1);
                for (auto &j : basises[i + 1].back().basis) {
                    where[j.id] = make_pair(i + 1, len(basises[i + 1]) - 1);
                }
                return id;
            }
        }
        basises[1].pb(xor_basis());
        basises[1].back().add(x, id);
        where.back() = make_pair(1, len(basises[1]) - 1);
        umax(mx, 1);
        return id;
    }

    inline int move_back (int sz, int pos) {
        int to = len(basises[sz]) - 1;
        if (to == pos) {
            return pos;
        }
        for (auto &i : basises[sz][pos].basis) {
            where[i.id].second = to;
        }
        for (auto &i : basises[sz][to].basis) {
            where[i.id].second = pos;
        }
        swap(basises[sz][pos], basises[sz][to]);
        return to;
    }

    inline void del (int id) {
        if (val[id] == 0) {
            return;
        }
        int sz, pos;
        tie(sz, pos) = where[id];
        pos = move_back(sz, pos);
        while (true) {
            int bits = basises[sz].back().rebuild_and_delete(id);
            int i = sz - 1;
            while (i > 0 && basises[i].empty()) {
                --i;
            }
            int new_id = -1;
            if (i > 0) {
                new_id = basises[i].back().get_the_same_high_bit(bits, val);
            }
            if (new_id == -1) {
                if (sz > 1) {
                    basises[sz - 1].pb(basises[sz].back());
                    for (auto &j : basises[sz - 1].back().basis) {
                        where[j.id] = make_pair(sz - 1, len(basises[sz - 1]) - 1);
                    }
                }
                basises[sz].pop_back();
                if (mx > 0 && basises[mx].empty()) {
                    --mx;
                }
                return;
            }
            int cur = val[new_id];
            assert(basises[sz].back().add(cur, new_id));
            int new_sz = i;
            int new_pos = len(basises[i]) - 1;
            where[new_id] = make_pair(sz, pos);
            id = new_id;
            sz = new_sz;
            pos = new_pos;
        }
    }

    inline int size () {
        return mx;
    }
};

int main() {
//    freopen("input.txt", "r", stdin);
//    freopen("output.txt", "w", stdout);

    ios_base::sync_with_stdio(0);
    cin.tie(0);


    // solution to https://basecamp.eolymp.com/uk/problems/11732
    int n, q, add;
    cin >> n >> add;
    vector <int> p(n);
    xor_basis_online<19> t;
    vector <int> now(n);
    for (int i = 0; i < n; i++) {
        cin >> p[i];
        now[i] = t.add(i ^ p[i]);
    }
    cin >> q;
    int ans = t.size();
    cout << ans << '\n';
    for (int i = 0, l, r; i < q; i++) {
        cin >> l >> r;
        l = (l + ans * add) % n;
        r = (r + ans * add) % n;
        if (l != r) {
            t.del(now[l]);
            t.del(now[r]);
            swap(p[l], p[r]);
            now[l] = t.add(l ^ p[l]);
            now[r] = t.add(r ^ p[r]);
        }
        ans = t.size();
        cout << ans << '\n';
    }
}

P. S. Similarly to adding, we can try to use binary search to do deletion: find maximum $r$ that $span (X_{j} ∖ {v}) ⊉ span (X_{r})$ . But after moving an integer from $X_{r}$ to $X_{j}$ the property of sequence of basises can still not hold. And thus using binary search doesn't improve time complexity.

For some time I thought that we can do better than $O (m^{2})$ , but didn't manage to improve. The reason why is that using this property $span (X_{i}) \supseteq span (X_{i + 1})$ we can construct a sequence of integers $a_{i}$ , such that $span (X_{i}) = span ({a_{1}, a_{2}, \dots, a_{d i m (X_{i})}})$ for all $i$ . Having it, we can actually do additions in $O (m)$ , but still can't figure out how to optimize deletions.

Comments (1)

SANYAAAAAAAAAAAAA•3 months ago•2 revision

Is it possible to get size of set of integers that belong to a span of at least one basis, several basises given

Loading

Just a moment, getting data from the server