Is there command to find kind of "prime factors" (that is different than actual prime factors) for any number?

I added here the sample code, but this kind of “factorization” looks difficult…
testh.m (1.9 KB)

It looks little faster though when each case is checked only once as in below code:
testh2.m (1.4 KB)

A little different method:
testh3.m (1.6 KB)

It still might not be best possible way to do it.
The question slowness looks coming from generating the product vector…
Searching the vector for the result is still relatively faster.

How to generate the product matrix for this (assuming that it is the good way to solve this)?
And for example if I set
k=[-1 0 1];
and I want repeat k n times in new vector m, how to make it?
m=[k k k k];
but let’s say k is repeated n times instead of 4 times as in above case.

Code looks operating relatively fast and gives results as:

Give real number (0.1-0.99999999): 0.95
Calculating "prime factors"...
2^-15*3^-10*5^6*7^6=0.9500485160703729≈0.95
[a,b,c,d]=[-15,-10,6,6]
log(0.95)=-0.0222764≈-15*log(2)-10*log(3)+6*log(5)+6*log(7)=-0.0222542

Give real number (0.1-0.99999999): 0.32
Calculating "prime factors"...
2^3*3^0*5^-2*7^0=0.32≈0.32
[a,b,c,d]=[3,0,-2,0]
log(0.32)=-0.49485≈3*log(2)-2*log(5)=-0.49485

Looping the search algorithm an use same product vector:
testh4.m (1.7 KB)

Possibly some type of bug, because for 100 I get as:

Give real number (0.1-0.99999999, 0=quit): 100
2^6*3^-17*5^1*7^9=99.99332461737717≈100
[a,b,c,d]=[6,-17,1,9]
log(100)=2≈6*log(2)-17*log(3)+1*log(5)+9*log(7)=1.99997

while 0.01 seems to work:

Give real number (0.1-0.99999999, 0=quit): 0.01
2^-2*3^0*5^-2*7^0=0.01≈0.01
[a,b,c,d]=[-2,0,-2,0]
log(0.01)=-2≈-2*log(2)-2*log(5)=-2

Bug fixed (The array elements where overwritten during generation):
testh5.m (1.7 KB)
It might still occur in some other conditions though or not.

It sounds like you want both the prime factors (which may be known) and the exponents (which are not known) – have I understood correctly?

Does the two-output form of factor help?

[p,n] = factor(2520)
p =

   2   3   5   7

n =

   3   2   1   1

The function rat may be useful to you to convert a real number into a rational approximation.

1 Like

The factorization could help in some cases, but for example 11 didn’t factor into 2,3,5,7.
testh5.m (1.7 KB)
I tried to add accuracy for the non-factorizable numbers as in the above program…

Give real number (0.1-0.99999999, 0=quit): 11
2^26*3^1*5^15*7^-21=10.99999193456363???11
[a,b,c,d]=[26,1,15,-21]
log(11)=1.04139???26*log(2)+log(3)+15*log(5)-21*log(7)=1.04139

If the exponents become large enough it can approach the numbers with certain maximum error.

Another approach is that you want integer solutions to the following linear equation:
a*log(2) + b*log(3) + c*log(5) + d*log(7) = log(x)
where x is given, and the only unknowns are a,b,c,d, all integer. Since there are likely to be many solutions to that underdetermined system, you could pose it as an integer optimization problem for GLPK or other suitable MIP solver. A recent work in progress is this: Improving Octave's linear and integer programming facility

Or you can explicitly solve for d given a,b,c and run a loop or meshgrid on a,b,c, accepting only if d is close to an integer as well – your approach is similar to this already.

1 Like

It looks I can do the ‘grid’ or ‘table’ with below settings:
base=80;
cent=40;
But not sure what is maximum for these settings.
Give real number (0.1-0.99999999, 0=quit): 0.24
2^13^15^-2=0.24≈0.24
[a,b,c,d]=[1,1,-2,0]
log(0.24)=-0.619789≈log(2)+log(3)-2*log(5)=-0.619789
Above case is easily found…
but some numbers do not fit this criteria (when not factorizable to 2,3,5,7) and the exponent(s)
should approach to +/-infinity or some very large number to get that number.
The calculation accuracy with the eps is limited also…
I guess the algorithm to find this numbers would be similar to one inside the ‘factor’-command,
so maybe quite difficult one.

Give real number (0.1-0.99999999, 0=quit): 3.141592653589793
2^42*3^-41*5^36*7^-21=3.141591333310138≈3.141592653589793
[a,b,c,d]=[42,-41,36,-21]
log(3.14159)=0.49715≈42*log(2)-41*log(3)+36*log(5)-21*log(7)=0.49715

Changed the search in to the logarithms instead of the input value:
testh6.m (1.7 KB)

Give real number (0.1-0.99999999, 0=quit): 0.123456789
2^1*3^-4*5^1=0.1234567901234568≈0.123456789
[a,b,c,d]=[1,-4,1,0]
log(0.123457)=-0.908485≈log(2)-4*log(3)+log(5)=-0.908485

I think I could not improve the accuracy without ‘quad floating point’…
Not sure if that is possible by any means.
I think I was already asking earlier if the octave has any method to connect two dual floating point register in series to work like quad floating point register…
This case if course the speed would go the 1/2. But that might not be too slow to some rare calculations where it is asked for more accuracy.

1 Like

See if this helps you:

tic

weights = log([2 3 5 7]);

N = 0.12345; # <--- this is your RHS value, change as needed
tol = 1e-6; # <--- integer tolerance, tune as needed
maxexp = 50; # <--- exponents will be in the range (-maxexp:+maxexp)

tbl = (-maxexp:+maxexp)';
tbl = setcross (tbl, tbl, tbl); # <--- setcross can be replaced by meshgrid too
toc

# assume that tbl refers to the exponents of 3,5,7, and compute the exponent of 2:
tmp = (log(N) - sum(weights(2:4) .* tbl, 2)) / weights(1);

# pick only integer exponents within maxexp:
ii = (abs(tmp) <= maxexp & abs(tmp - round(tmp)) <= tol);
soln = [tmp(ii), tbl(ii,:)]; # this is the full solution set
clear tmp ii

[~,order] = sort(sum(abs(soln),2)); # rank-list with smaller exponents first
soln = soln(order,:)(1:min(5,end),:) # and pick up to the 5 best solutions
clear order

toc

where setcross is this function of mine that already existed:

function x = setcross (a, b, varargin)
  if (nargin < 2)
    error ('setcross must have at least 2 arguments')
  end

  [ii,jj] = meshgrid (1:size(a,1), 1:size(b,1));
  x = unique ([a(ii,:), b(jj,:)], "rows");

  if (nargin > 2)
    x = setcross (x, varargin{:});
  end
end

Output for N = 0.12345:

Elapsed time is 0.092731 seconds.
soln =

   -3.0000   24.0000    9.0000  -21.0000

Elapsed time is 0.158505 seconds.

That does check out: 3^24 / 2^3 / 7^21 * 5^9 is indeed 0.1234500655878999, which is within the specified tolerance of 1e-6.

@jarkky I have a way to achieve your higher precision needs without needing quad precision.

Consider my code above. It gives small exponents that get close to the answer (smallness and closeness tunable by you, but let’s keep them where they are for now). Say you wanted some real number N to be achieved but it only gave you a rational number M near N, with a difference less than 1e-6. Then the ratio (N/M - 1) can be fed to the code again to achieve a second number R1 that represents the ratio, which acts like a first order correction to M. If (N/M - 1) is too small, it can be multiplied by 10 a few times and you can remove that many 2s and 5s from its answer. This process can be repeated to get a sequence R2, R3 etc. Your rational numbers made of 2,3,5,7 would then be M, (M with R1), (M with R1 and R2), etc. That gives you a sense of where you want to truncate the search beyond which the increased accuracy is negligible. I think that will be of more value to you than a single numerical answer.

I have to check this.
It looks similar in sense that use the grid.
Maybe this would take some time to check it:
testi.m (1.1 KB)
But this case it had some error:

warning: function name 'setcross' does not agree with function filename '/home/jari/bin/testi.m'
error: setcross must have at least 2 arguments
error: called from
    testi at line 5 column 5

It looks I have to run it command line. I set it as:

N = 3.141592653589793;
soln =
   42.00000060630438  -41.00000000000000   36.00000000000000  -21.00000000000000

But its problematic that the solution is not exact integers.

octave:102> prod([2 3 5 7].^soln)
ans = 3.141592653589819
octave:103> prod([2 3 5 7].^[42 -41 36 -21])
ans = 3.141591333310121

Setting:
tol = 1e-5;
gives:

soln =
   3.300000149990753e+01  -1.300000000000000e+01  -1.000000000000000e+00  -3.000000000000000e+00
  -1.700000831346594e+01  -1.500000000000000e+01  -1.800000000000000e+01   3.000000000000000e+01
  -8.000009207069089e+00  -4.300000000000000e+01   1.900000000000000e+01   1.200000000000000e+01
   2.400000239351068e+01   1.500000000000000e+01  -3.800000000000000e+01   1.500000000000000e+01
  -2.600000604964107e+01   4.500000000000000e+01   9.000000000000000e+00  -2.300000000000000e+01

Setting:
tol = 1e-7;
gives:
soln = [](0x4)

It looks quite fast, but not giving exact integers always…
It should be obvious that table size can be reduced with calculating the exponent of 2 using the other variables, but otherwise I don’t know this commands.

I added below command to check the results:
prod([2 3 5 7].^round(soln)')

But it looks like the FPU (and double) accuracy is reached and I could not really get more than 6-7 decimals to the result.
Maybe if by some means the octave can use quad accuracy, it could get solved better.

1 Like

Try relaxing this

ii = (abs(tmp) <= maxexp & abs(tmp - round(tmp)) <= tol);

to this

ii = (abs(tmp - round(tmp)) <= tol);

to get more solutions.

You can also increase maxexp to say 100 or more if there are no solutions for a given tolerance.

You can tighten the tolerance to reduce the number of solutions if there are too many.

I fixed the table size in my code:
testh7.m (1.9 KB)
And now it works faster and better and memory will note get filled,
but still I think FPU accuracy is limiting the accuracy of the results.
It is getting better as:

Calculating "prime factors"...
Elapsed time is 21.2116 seconds.
Give real number (0.1-0.99999999, 0=quit): 3.141592653589793
2^72*3^-344*5^242*7^-31=3.14159265665141≈3.141592653589793
[a,b,c,d]=[72,-344,242,-31]
log(3.14159)=0.49715≈72*log(2)-344*log(3)+242*log(5)-31*log(7)=0.49715

But I could not get this better though.
Another machine with larger memory can give it as:

Calculating "prime factors"...
Elapsed time is 121.373 seconds.
Give real number (0.1-0.99999999, 0=quit): 3.141592653589793
2^-570*3^495*5^-186*7^78=3.141592651289402≈3.141592653589793
[a,b,c,d]=[-570,495,-186,78]
log(3.14159)=0.49715≈-570*log(2)+495*log(3)-186*log(5)+78*log(7)=0.49715

But not much more better…

And 2nd test case is as:

Give real number (0.1-0.99999999, 0=quit): 11
2^-1138*3^499*5^47*7^86=10.99999998383237≈11
[a,b,c,d]=[-1138,499,47,86]
log(11)=1.04139≈-1138*log(2)+499*log(3)+47*log(5)+86*log(7)=1.04139

But I could not reach 15-16 digits accuracy by any means, I guess it need longer looping:
testh8.m (2.2 KB)
Still nothing better found or just slightly better:

2^-214*3^-66*5^418*7^-231=11.00000001050144≈11
[a,b,c,d]=[-214,-66,418,-231]
log(11)=1.04139≈-214*log(2)-66*log(3)+418*log(5)-231*log(7)=1.04139

It can run this code, but I guess the accuracy is not enough to give results:
testh9.m (2.2 KB)

1 Like

Since you’re using log10 anyway, can you try replacing log10(2) == 1 - log10(5) to maybe eliminate one loop? It’s not clear from your code whether you’re doing that.

In best case I can get 8-9 digits accuracy only.

The symbolic package can call SymPy for arbitrary precision arithmetic. You can also install the apa package (GitHub - gnu-octave/pkg-apa: Octave/Matlab arbitrary precision arithmetic) and see if that helps.

1 Like

What about the ‘extended format’ (80 bits?):

I think the octave uses the ‘double’ instead of this ‘extended’ format:

This looks double presentation:

format bit
pi
ans = 0100000000001001001000011111101101010100010001000010110100011000
(with 64 binary digits rather than 80 binary digits).

I “logged” the results in the below file:
lognum.log (1.2 KB)

Loading the symbolic package didn’t seem to work:

> pkg load apa
error: MEX interface creation failed.  APA cannot be used.
error: called from
    install_apa at line 86 column 7
    /home/jari/octave/apa-1.0.0/PKG_ADD at line 2 column 1
    load_packages_and_dependencies at line 56 column 5
    load_packages at line 53 column 3
    pkg at line 588 column 7
octave:4> pkg test apa
Testing functions in package 'apa':

Integrated test scripts:

  /home/jari/octave/apa-1.0.0/install_apa.m ...................... pass    1/1   

Fixed test scripts:


Summary:

  PASS                                1
  FAIL                                0

See the file /home/jari/bin/fntests.log for additional details.

270 (of 271) .m files have no tests.

Please help improve Octave by contributing tests for these files
(see the list in the file /home/jari/bin/fntests.log).

error: MEX interface creation failed.  APA cannot be used.
error: called from
    install_apa at line 86 column 7
    /home/jari/octave/apa-1.0.0/PKG_ADD at line 2 column 1
    pkg at line 757 column 9

Can I find “pythagoren triples” made of these number:
(2^2*3)^2+5^2~(2^-778*3^84*5^23*7^212)^2. <= not exact match
This program below would give only exact matches:
pythgen.m (781 Bytes)
It loooks this program is ‘skipping’ the case:
21^2+28^2=35^2

That is likely only being missed because of floating point rounding.

It appears you are trying to make Octave work for you beyond the limits of floating point precision, and you’re not getting help from the symbolic or apa packages. You might be better off using programs aimed at pure math, such as Sage (sagemath.org).

Yes… I should have the ‘extended floating point’ version of the octave or it selectable between double/extended/quad floating poing somehow at the startup or somehow.
But I got better result for the pi as:

5^299*2^-210*3^-618*7^177=3.141592654396437≈3.141592653589793
[a,b,c,d]=[299,-210,-618,177]
log(3.14159)=0.49715≈299*log(5)-210*log(2)-618*log(3)+177*log(7)=0.49715

when the order of the primes is changed from [2 3 5 7] into [5 2 3 7] (as in below code):
testh8b.m (2.9 KB)

I might try the symbolic packages also. In this spefic problem it looks the WolframAlpha is not working.
Improved results for the number ‘29’ as well:

5^-364*2^848*3^561*7^-316=29.0000000047892≈29
[a,b,c,d]=[-364,848,561,-316]
log(29)=1.4624≈-364*log(5)+848*log(2)+561*log(3)-316*log(7)=1.4624

lognum.log (4.7 KB)

I tried the sage (that seems available also through the apt-command):

sage: N(sin(1),digits=100)
0.8414709848078965066525023216302989996225630607983710656727517099919104043912396689486397435430526959

It seems to work and higher precision is available there.
Don’t know how that is made, maybe it is still made somehow in the FPU…
The help-menus or the tutorial fails to work, and looks like with Python related errors.
I try find the online-help and check how the for-loops, matrices, etc work in this sage.

It looks the sage gets some results, then crashing:

[354, -70, 185, -239]
3.1415926589061544881
j = -48
/usr/share/sagemath/bin/sage-python: line 2: 96052 Killed                  sage -python "$@"

for the below code:
testh8b.s (1.1 KB)
It looks considerably slower. I might try to check if they have support page.

I could be wrong that the accuracy is problem. The search space probably is too big to search…
I try increase the search space, but then it would be come slow.
Using the sage, I found one more solution as:

[-288, 769, -243, -130]
3.1415926535444999703

which seems more accuracte than any of the octave solutions.
So it could be also accuracy issue with the double-format.
testh8a.s (1.9 KB)
Or because the 3^769 goes to ‘infinity’ with the double format…

octave:4> 2^-288*3^769*5^-243*7^-130
ans = Inf
octave:5> 2^-288*3^256*5^-243*3^256*7^-130*3^257
ans = 3.141592653544500

Octave could not ‘divide’ the 3^769 into smaller parts automatically (like 256+256+257)…
So this could become difficult case if this should be programmed to work out…
So I guess it must be handled fully logarithmic when the exponent becomes very large:

2^-288*3^769*5^-243*7^-130=3.141592653544178≈3.141592653589793
[a,b,c,d]=[-288,769,-243,-130]
log(3.14159)=0.49715≈-288*log(2)+769*log(3)-243*log(5)-130*log(7)=0.49715

testh8b.m (3.0 KB)
Above the logarithmic calculation gives 3.141592653544178, which is slighlty
different than the actual value 3.1415926535444999703, because required use of the
logarithm that causes loss of accuracy in that last decimals.
Octave can find also case as:

2^1147*3^-1106*5^2785*7^-2087=3.14159265362066~3.141592653589793
[a,b,c,d]=[1147,-1106,2785,-2087]
log(3.14159)=0.49715~1147*log(2)-1106*log(3)+2785*log(5)-2087*log(7)=0.49715

But the remaining 4-5 digits are not possible to be fixed…

There is some differences in the last 4-5 digits for ‘sage’ vs. ‘octave’:
octave:
2^5552*3^544*5^-124*7^-2181=11.00000000002673
sage:

sage: N(2^5552*3^544*5^-124*7^-2181,digits=16)
11.00000000003184

and depending on the log-base used in the octave:

octave:19> exp(log(2)*5552+log(3)*544+log(5)*-124+log(7)*-2181)
ans = 11.00000000004090
octave:20> 10^(log10(2)*5552+log10(3)*544+log10(5)*-124+log10(7)*-2181)
ans = 11.00000000002673
octave:21> 2^(log2(2)*5552+log2(3)*544+log2(5)*-124+log2(7)*-2181)
ans = 11.00000000003705

Using base=10 seems most accurate option here…

Using function as:

function [r]=powc(t)
  SMAX=10000;
  LSMAX=log10(SMAX);
  digits=[2 3 5 7];
  ldigits=log10(digits);
  limdig=floor(LSMAX./ldigits);
  t0=floor(abs(t./limdig));
  n1=max(t0);
  t1=t/n1;
  t2(1,:)=[0 0 0 0];
  for i=1:n1,
    t2(i+1,:)=t2(i,:)+t1;
  end;
  t3=diff(round(t2));
  r=1;
  for i=1:n1,
    r=r*prod(digits.^t3(i,:));
  end;
endfunction;

seems giving more accurate results, but could be too slow to be used in the search algorithm.

octave:4> powc([5552 544 -124 -2181])
ans = 11.00000000003247

Also I made c-code to compare ‘double’ and ‘long double’ and it looks its just ‘double’ both of them:

$ ./printnum 
FLOAT:       3.1415927410125732421875000
DOUBLE:      3.1415926535897931159979635
LONG DOUBLE: 3.1415926535897931159979635

The FLOAT & DOUBLE have clear difference. Maybe the FPU didn’t support the long double or its not implemented in the c-language…

// FPU format checking...
#include <stdarg.h>
#include <stdio.h>
#include <stddef.h>
#include <stdlib.h>
#include <float.h>
#include <math.h>

int main(int argc, char *argv[])
{

  int i;
  float a;
  double b;
  long double c;

  a=3.141592653589793238462643383279502884197;
  b=3.141592653589793238462643383279502884197;  
  c=3.141592653589793238462643383279502884197;  
  printf("FLOAT:       %.25f\n",a);
  printf("DOUBLE:      %.25f\n",b);
  printf("LONG DOUBLE: %.25Lf\n",c); 
};

printnum.c (488 Bytes)

The problem is how to solve the problem like what is O(2) with O(1) or to reduce the level
of the O(), then its maybe solved in right manner.
There must be some way to reduce the computational complexity that didn’t come into mind,
because it still takes a lot of time to run this and last 4-5 digits didn’t get fixed with this code.

I think the “long double” really is possible to be used with c-programs and when I checked
the execution times are about 2x longer than for “double”.
I hope the octave can be somehow used with “double” & “long double” modes both
and also maybe in “float” mode if it should work even faster.

This seems to work fine:

octave:4> double(pi)
ans = 3.141592653589793
octave:5> single(pi)
ans = 3.1415927
octave:6> format bit
octave:7> double(pi)
ans = 0100000000001001001000011111101101010100010001000010110100011000
octave:8> single(pi)
ans = 01000000010010010000111111011011

but long double option is not available.


Improvement of result looks increasingly too difficult for error < 2e-12…

It could be little bit ‘off-course’ to ask how to get distance to integer in C-language in this forum,
but let me try with this sample program:
testh11.c (2.9 KB)
The distance to integer is value (0 to 0.5) for any real number, but not sure if that is
available in C-language.
I also wonder how to make this C-language code to run in parallel processing mode in multiple threads at the same time.
And also if octave can use the MPI-library multicore processing somehow…