Encoding problems and dlmread

octave:13> error: dlmread: error parsing RANGE
�octave:4> �

octave:4> error: parse error:

invalid character ‘�’ (ASCII 254)


Sounds like an encoding issue with the file you are trying to read.
What is the encoding used for the file? What is your system’s locale encoding? And what is the default encoding used by Octave (__mfile_encoding__)?

Maybe it is not supported.
octave:5> __mfile_encoding__
ans = system

Best way to go forward is probably to convert the file you are trying to read from whatever encoding it is currently in to UTF-8.
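A minimal sketch of such a conversion done inside Octave, assuming the file is currently in ISO-8859-1 (that source encoding is only a guess and must be replaced by the real one). The file names are temporary placeholders and the sample bytes are made up for illustration; native2unicode returns the text as UTF-8.

```octave
% Placeholder file names; ISO-8859-1 as the source encoding is an assumption.
src = tempname ();
dst = tempname ();

% Create a small sample file that contains one non-ASCII byte (228).
fid = fopen (src, "w");
fwrite (fid, uint8 ([49 32 50 10 228 10]));
fclose (fid);

% Read the raw bytes without interpreting them.
fid = fopen (src, "r");
raw = fread (fid, Inf, "uint8=>uint8").';
fclose (fid);

% Re-encode the bytes; the result is a UTF-8 encoded char vector.
txt = native2unicode (raw, "ISO-8859-1");

% Write the converted text to the new file.
fid = fopen (dst, "w");
fwrite (fid, txt, "uint8");
fclose (fid);
```

The converted copy in dst can then be passed to dlmread instead of the original file.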

I use
__mfile_encoding__ ("utf-8");
And then I use:
v=dlmread(fn," ");
But in this case the last column is filled with zeros, even though there is nothing there.
There are 101 columns in this file, but dlmread finds 102 columns.
But the header line of the text file has 102 columns… I guess it sets the width to the header width,
even though the header is not read because it is text. It still seems to affect the results.
This part can then be fixed with:
v=dlmread(fn," ",1,0);
But in this case I should not need the ‘zero padding’ at the end, and the header text could have any width, different from the width of the numeric data.
But another file-reading function fails:
and I cannot set it to stop reading exactly at the end of the first line.
So it is set as:


And cellstr is needed at the final phase…

Did you convert the file to UTF-8?

The file has only numbers and normal characters. How can I convert it? Maybe the encoding is already a normal one.
This is the new 6.3.0 version. I’m not sure whether the earlier 5.2.0 had the same issue or not.
Also, if I now press + it exits from Octave, which may not have been the case earlier.
You can inspect the file below, but it appears to contain only normal characters:
unit.m (437 Bytes)

That script doesn’t use dlmread.

I’ll now send one that uses dlmread:
filread.m (555 Bytes)
In the part below:


Is it possible to read this with a single command instead?
This would read the header of the file.
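One possible way to get the header with few calls is sketched here, assuming a single header line of space-separated labels. The sample file and its labels are made up for illustration. strsplit already returns a cell array of strings, so no separate cellstr step is needed.

```octave
% Hypothetical sample file: one header line followed by one numeric row.
fn = tempname ();
fid = fopen (fn, "w");
fputs (fid, "time volt amp\n1 2 3\n");
fclose (fid);

% Read only the first line and split it into its labels.
fid = fopen (fn, "r");
header = strsplit (strtrim (fgetl (fid)), " ");
fclose (fid);
delete (fn);
```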

That file contains this comment:

# Text header MUST be skipped to avoid wrong data width

But earlier you wrote that the files you are reading only contain numbers…

Do you mean the above? By “characters” I meant letters (like a-z).
Maybe I worded it wrongly.

Try help dlmread for more information. The first sentence is

    Read numeric data from the text file FILE which uses the delimiter
     SEP between data values.

dlmread can only read numeric values. If you have text data at the start of the file then you need to skip it by using the row argument to dlmread. The script filread.m is skipping the first line, but maybe you need to skip more to get to the start of the numeric data.
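As a small illustration of the row argument, here is a sketch with a made-up file whose header line has more fields than the data rows. The file contents are assumptions for the example only; the row and column offsets passed to dlmread are zero-based.

```octave
% Hypothetical sample file: one text header line, then numeric rows.
fn = tempname ();
fid = fopen (fn, "w");
fputs (fid, "t x y label\n");   % header has 4 fields
fputs (fid, "1 2 3\n");         % data rows have 3 fields
fputs (fid, "4 5 6\n");
fclose (fid);

% Start at row offset 1 and column offset 0, so the header is never parsed.
data = dlmread (fn, " ", 1, 0);
disp (size (data));
delete (fn);
```

Because the header is skipped, its width no longer influences the width of the result.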

Can you upload a sample of the first 15 lines of the file? Someone could verify whether they can read it.

This is a test file:
test.txt (16.8 KB)
(header+data lines).
In some cases dlmread() works and skips the header on its own when the header is text.
But it is better to skip the header with the start row argument of dlmread.
The problem seems more likely on Zorin OS 16. On Linux Mint 20.2 it seems to work.
Maybe it is a different terminal setting. In both cases I used flatpak to install Octave 6.3.0.

I can read the test file just fine with

data = dlmread ('test.txt', " ", 1, 0);

Dimensions of the variable data are [9 101].

Seems basically fine to me.

It works if you skip the header line.
But if the number of header lines varies between files, this method looks problematic.

That’s not really a problem. If you don’t know the format of the data file then it is the programmer’s responsibility to find that out. That can be done by inspecting the first 20 lines of the file. And it only needs to be done once if the file is being generated by the same instrument/process so there’s no real overhead issue.

The file might need to be scanned line by line, with code similar to the above, until
it finds where the numeric lines start.
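A rough sketch of such a scan, assuming space-separated fields and that a line counts as data only when every field parses as a number. The sample file and the variable name nskip are made up for illustration. Note that a data row containing NaN, or a header field such as "i" that str2double reads as a complex number, would be misclassified by this simple test.

```octave
% Hypothetical sample file with two non-numeric lines before the data.
fn = tempname ();
fid = fopen (fn, "w");
fputs (fid, "# comment line\n");
fputs (fid, "t x y\n");
fputs (fid, "1 2 3\n");
fputs (fid, "4 5 6\n");
fclose (fid);

% Count leading lines that contain at least one non-numeric field.
nskip = 0;
fid = fopen (fn, "r");
while (true)
  txt = fgetl (fid);
  if (! ischar (txt))
    break;                      % end of file reached
  endif
  fields = strsplit (strtrim (txt), " ");
  if (! any (isnan (str2double (fields))))
    break;                      % first line where every field is numeric
  endif
  nskip = nskip + 1;
endwhile
fclose (fid);

% Pass the counted header lines to dlmread as the start row.
data = dlmread (fn, " ", nskip, 0);
delete (fn);
```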