How to vectorize this Wordle code?

Hello everyone. I am learning Octave by working through problems. I have a function for Wordle, but it is slow because of its loops. How can I vectorize it?

# given a guessword and the actualword with n letters each,
# this function computes a vector "result" of length n, where
# result(i) == 2 if the ith letter of guessword is present in actualword in the same place 
# result(i) == 1 if the ith letter of guessword is present in actualword in a different place
# result(i) == 0 if the ith letter of guessword is not present in actualword at all

function result = getvector (guessword, actualword)
  n = length(guessword);
  result = zeros(1,n);
  used = (guessword == actualword);
  result(used) = 2;

  for i = find (result==0)
    for j = find ((actualword == guessword(i)) & ~used)
      if (j ~= i)
        result(i) = 1;
        used(j) = 1;
        break;  # consume only one matching letter of actualword per guess letter
      end
    end
  end
endfunction

It needs to handle repeated letters properly, so if the actual word is “abbey” and the guess is “babby”, then the result is [1 1 2 0 2].

I tried writing actualword' == guessword, but that gave a 5x5 matrix that I did not know how to use.

OS: Windows and Linux both.
Octave: 6.4.0 (Linux distro), 7.1.0 (Windows installed from official Octave website)

ismember is what you are looking for:

function result = getvector (guessword, actualword)
  result = ismember (guessword, actualword);
  result = double (result);
  result(actualword == guessword) = 2;
endfunction

guessword = 'babby';
actualword = 'abbey';
getvector (guessword, actualword)

ans =

     1     1     2     0     2

This is a step in the right direction but doesn’t seem quite right. If the actual word were “abbot” and the guess were “abcde”, then ismember (actual, guess) is [1 1 1 0 0], and incrementing the matching elements to 2 makes it [2 2 1 0 0], which gives the erroneous impression that the actual word “abbot” has the letter “c”. Should that be ismember (guess, actual) instead of the other way round?

To the OP: depending on how many times you’re calling that function, you could also write it in C++ and turn it into an oct file that you can call from Octave. See the mkoctfile documentation to check whether that matches your needs.

You are right: the two arguments in the ismember call should be flipped.

I realized that doesn’t work either. Take the guess to be “hooch” and the actual to be “orate”. Then ismember (guess, actual) is [0 1 1 0 0] and ismember (actual, guess) is [1 0 0 0 0] but the correct result should be [0 1 0 0 0] indicating that there is one letter O (not two) in the actual word but it is not in the second place. Is there any function that takes a 0/1 vector and retains only the first K 1s and changes all 1s beyond that to 0? It’s tough to vectorize that without using find and a loop.
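
For anyone who wants to reproduce this, the counterexample above can be checked with a few lines. This is only a small sketch; the variable names are made up for illustration.

```octave
guess  = "hooch";
actual = "orate";

# Which letters of the guess occur anywhere in the actual word?
# Both O's are flagged, although the actual word has only one O.
in_actual = ismember (guess, actual)   # 0 1 1 0 0

# Which letters of the actual word occur anywhere in the guess?
in_guess = ismember (actual, guess)    # 1 0 0 0 0
```

Neither call counts how many copies of a letter are available, which is why ismember alone cannot give [0 1 0 0 0].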

I’m not sure I understand, but maybe you mean

v = [0 1 1 0 0];
v(cumsum(v) > 1) = 0;

?
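
If that is the idea, the same trick should extend to keeping the first K ones instead of only the first one. A minimal sketch, assuming v is a 0/1 row vector and K is a non-negative integer (both names are just for illustration):

```octave
v = [0 1 1 0 1];
K = 2;                 # number of leading ones to keep (illustrative value)

# cumsum (v) counts the ones seen so far; every position where that
# count exceeds K lies after the K-th one, so it is cleared.
v(cumsum (v) > K) = 0;

disp (v)               # 0 1 1 0 0
```

Positions that were already 0 are also set to 0 by the assignment, which is harmless.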

Loops are slower than vectorized code, but this code itself is not slow. As long as loop counts are low, it is fine to use them. In this case, Wordle uses words of length 5, which meets the low-loop-count criterion. To test, I put your code into the file wordle.m and then ran this benchmark:

g = "hooch";  w = "orate";
tic; for i = 1:1e4, wordle (g, w); end; bm = toc
bm = 0.6577
octave:16> bm / 1e4
ans = 6.5767e-05

The average running time is about 66 microseconds per function call, which is quite small.

For fun and for educational purposes you could try to vectorize the code, but it isn’t necessary for performance.
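
If you want to try, here is one possible loop-free sketch. It is not taken from any of the posts above, and the name getvector_vec is made up. It assumes both words are row vectors of characters with the same length, and it relies on broadcasting in the == comparisons. The idea is to mark the exact matches first, then give each remaining guess letter a 1 only if it is the k-th copy of that letter among the unmatched guess letters and the unmatched part of the actual word holds at least k copies.

```octave
function result = getvector_vec (guessword, actualword)
  # Hypothetical vectorized variant of getvector (a sketch, not a reference).
  exact  = (guessword == actualword);   # letters in the right place
  result = 2 * exact;

  g = guessword(~exact);                # guess letters still to be scored
  a = actualword(~exact);               # actual letters not matched exactly

  # occ(k): how many of g(1..k) equal g(k), i.e. g(k) is the occ(k)-th copy
  occ = sum (tril (double (g' == g)), 2)';

  # avail(k): how many copies of g(k) are left in the actual word
  avail = sum (a' == g, 1);

  result(~exact) = (occ <= avail);
endfunction
```

With this, getvector_vec ("babby", "abbey") gives [1 1 2 0 2] and getvector_vec ("hooch", "orate") gives [0 1 0 0 0]. For 5-letter words I would not expect it to be noticeably faster than the loop, since it builds two small matrices on every call.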