A package (Digital Forestry Toolbox) throws: error: regexp: the input string is invalid UTF-8

A package (Digital Forestry Toolbox) throws: error: regexp: the input string is invalid UTF-8.

Specifically, when running dft_tutorial_2, I get this error:

error: regexp: the input string is invalid UTF-8
error: called from
    LASread at line 2005 column 50
    dft_tutorial_2 at line 36 column 4

which points to an error in this line:

pc = LASread('zh_2014_a.las');

I don’t understand where the problem lies. The same code works fine in Matlab.

Which character encoding is used for the file ‘zh_2014_a.las’?
It might help if you set the file encoding that Octave uses by default to that encoding with __mfile_encoding__ ("some_encoding").

How do I figure this out?

Since the code works in Matlab, wouldn’t this suggest that I need to adjust some Octave setting?

Use file -i (Linux) or file -I (macOS) [see link]
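
If the file command is not at hand, a rough cross-check is to see whether the bytes decode as UTF-8 at all. The following is only a minimal sketch in Python (not Octave, and not part of the Digital Forestry Toolbox); the function name first_invalid_utf8_offset and the sample bytes are made up for illustration.

```python
def first_invalid_utf8_offset(data: bytes):
    """Return the byte offset of the first invalid UTF-8 sequence, or None if all bytes decode."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as err:
        # err.start is the index of the first byte that could not be decoded.
        return err.start
    return None


# Hypothetical sample: "caf" followed by a lone 0xE9 byte (a Latin-1 encoded e-acute).
bad_sample = b"caf\xe9"
good_sample = "caf\u00e9".encode("utf-8")

print(first_invalid_utf8_offset(bad_sample))   # offset of the offending byte
print(first_invalid_utf8_offset(good_sample))  # None means the bytes are valid UTF-8
```

To run this on a real file, the bytes could be read with pathlib.Path("zh_2014_a.las").read_bytes() and passed to the same function.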

For MS Windows, Octave has Notepad++ in its installation bundle.

Probably yes, but to make this magic happen, we need to know what went wrong in your case. Please help us, and in the future you might not run into this issue again :innocent:

UTF-8 is pretty common now. So as a first guess you could try setting Octave’s default file encoding to that encoding:

>> __mfile_encoding__ ("utf-8");

And try to read the file after that.
If that doesn’t work, try to open the file in a text editor. For example, open it in Notepad++ as Kai suggested. That program uses heuristics to guess the encoding of the file and displays its best guess in the status bar at the bottom of its main window.
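
To give a feel for what such a guess involves, here is a very crude sketch in Python. It is not how Notepad++ actually works; it simply tries a few candidate encodings in order and reports the first one that decodes without error. The function name guess_encoding and the candidate list are assumptions chosen for illustration.

```python
def guess_encoding(data: bytes, candidates=("ascii", "utf-8", "cp1252", "latin-1")):
    """Return the first candidate encoding that decodes data without error, or None."""
    for name in candidates:
        try:
            data.decode(name)
        except UnicodeDecodeError:
            continue  # this candidate cannot represent the bytes, try the next one
        return name
    return None


# Hypothetical samples, not taken from the file discussed in this thread.
print(guess_encoding(b"plain text"))                # pure ASCII
print(guess_encoding("caf\u00e9".encode("utf-8")))  # valid UTF-8 with a non-ASCII character
print(guess_encoding(b"caf\xe9"))                   # a lone 0xE9 byte is not valid UTF-8
```

Note that latin-1 accepts every possible byte, so it acts as a catch-all at the end of the list; a match there says little about the real encoding. A name found this way could then be tried with __mfile_encoding__ in Octave.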

Could you attach a file that cannot be read?