Multiple bitset index

Problem description

function int_out = bitsets(int_in, idxs, vals)
    int_out = int_in 
    for cols = [idxs; vals]
        int_out = bitset(int_out, cols(1), cols(2));
    end
end

My system

  • OS: Windows 10

Solution

  • Kind of vectorization operation

Just not using a for loop seems to work for me:

bitset ([0, 4], [2, 3], [true, false])

This code will work just fine. As you note, it may be slow because of the for loop.

If you need something faster, most programmers use AND and OR operations when they need to set or reset multiple bits at a time. See functions bitand and bitor.

As an example, use bitor to set the 4th and 1st bits.

n = bitor (0, 0b1001)
n = 9
dec2bin (n)
ans = 1001

I made use of the binary notation 0b in the Octave interpreter. But you could have used hex 0x or even decimal 9 say after converting with bin2dec

Similarly, to reset a bit to 0 you can use bitand with a mask that is all 1 except for the bits you want to set to 0. Using the example above, if I want to reset bits 1 and 2 I would form a mask of 0b1100 and then use bitand.

bitand (9, 0b1100)
ans = 8
dec2bin (ans)
ans = 1000

The issue is that this creates a two-element vector. The original post is about setting or clearing multiple bits for a single value.

1 Like

Ah, thanks. I misread.

It’s always a good idea to see just how slow things are and whether it matters in the grand scheme of things. I wrote a benchmark for the code

N = 1e3;

tic;
for i = 1:N
  v = bitsets (0, [1, 3, 5, 8], [1, 0, 1, 0]); 
endfor
toc

Running this script takes 0.85 seconds or 8.5 milliseconds per function call. That’s not very much and not something a human would necessarily notice. However, if you are using this within a large loop in a critical portion of your code this might become the bottleneck.

Here’s an implementation that is roughly 6x faster (1.35 milliseconds per function call). It could be made even slightly faster if you decide on the size of the bitvectors you are using and replace the calculated value pow2 (sz) - 1 with a hardcoded constant such as 0xFFFFFFFF for 32-bit vectors.

function int_out = bitsets(int_in, idxs, vals)
  sz = 32;
  vals = logical (vals);
  set_bits = sum (pow2 (idxs(vals) - 1));
  int_out = bitor (int_in, set_bits);
  clr_bits = (pow2 (sz) - 1) - sum (pow2 (idxs(! vals) - 1));
  int_out = bitand (int_out, clr_bits);
end

1 Like

Yeah, the function will be called thousands of times and might be implemented in a long binary set. So yeah, it need to be optimised.

Thank you, I will try to understand your code and might come back here later if I found something about it.

But still, your code seems suboptimal. Can’t we just implement it to the core of Octave (maybe submit it as feature submission? Is that possible in Octave community?). Furthermore, there is exist fetching multiple bits value using bitget.

In what way? What is optimal? Is there a Matlab-equivalent function that Octave should be emulating?

You can always submit a request over on the issue tracker at bugs.octave.org. This is free software which means it is all coded by volunteers. Either someone else has to take an interest in this issue, or , better yet, you code a solution, submit it, and it gets adopted in to core Octave.

Regarding bitget, Octave behaves the way it does there because it needs to maintain compatibility with Matlab. Similarly, Matlab has said that multiple values and indices must produce multiple outputs for bitget, not a single output.

If you want to write new functionality for bitset it needs to belong to a new function.

NO. MATLAB doesn’t have that equal function. That’s why I leave the MATLAB community and strongly advise my colleague to slowly moving to Octave. But in the Python community, it’s always preferable to use pip to download the library and there are exist lots of optimized libraries. But in the Octave community, I don’t know what is the norm here in contributing to the code base. Directly to the core Octave or add ons library (callable library exist?).

Thank you, I will take a look at how to properly add a function that didn’t exist in MATLAB yet.

YES, I understand that.

YES. But, which one is proper in the community, a code exchange and anyone need to manually copy-paste the code, a pipeline of a callable library, or submit a new function that didn’t exist in MATLAB to Octave core?

The use of existing function #joke. Currently, your solution might be best. But, direct memory manipulation (low level) might be a great idea to be implemented here. Sadly, that’s not my expertise.

All options are a possibility. The direct jump from function to core Octave is the least common. More frequently is that someone writes a bit of code for their own needs, but makes it available to others. That might mean posting it somewhere on the Octave Wiki (wiki.octave.org). You could also upload it as a submission to core Octave at the patch tracker (kept on bugs.octave.org, but you can switch from browsing bug reports to browsing patches). Another possibility is to see if it might belong more naturally to an Octave package (Octave Forge - Packages).

My own opinion is that this is reasonably related to the other existing bitXXX functions and probably should go in core Octave if it is to be adopted. I know there are some programmers who would consider it unnecessary since you can accomplish the same thing with bitand and bitor and this is just a user-friendly layer on top of those functions.

After some searching, yeah, i think that’s the proper way Octave Forge - Developers (sourceforge.io).

YES

NO, two times operation (AND and OR) make it slower than if there is exist direct manipulation.

Even in C/C++ the implementation is likely to take the form of an AND and an OR operation. You might lean on the C++ <bitset> functions from the standard library, but then there would be conversion from integer to double that might make this a slower option.

Also, do take programmer time in to account. Current implementation takes 1.35 milliseconds. Assume you re-code something in C++ that is 25% faster so each call is one millisecond. How long does it take you to write that C++ code? Maybe 3 hours because you need to figure out the basics of writing a C++ function callable by Octave (See the Appendix on external code in the Octave manual or look around in the source code in directory libinterp/corefcn). That’s a lot of time when the existing solution in an m-file works pretty well.

Thank you,

But, since this is part of a library and my way to hone my skill in programming (I don’t have a formal CS degree), I love to learn from the basics. But, yes, learning C/C++ might be too far for me. All in all, thank you very much for your consideration and solution.