Rechenkraft.net e.V.

Verfasst: **09.09.2017 17:24**

OK, I didn't see that you are already at 2^44. And that's nice! Sorry about the previous post, and now I know that you are really exploring new territories!

If you need to see whether you can optimize your code, I would be willing to help. I don't know what the exact algorithm is, but I am very interested.

fwjmath.

Verfasst: **09.09.2017 20:43**

Hello, fwjmath.

Actually, the last my post was caused by the need to close the gestalt. )))
I'm very glad that 3 monthes of manual crunch had finished.
So, I peed it right to post the last totals of manual crunch.
I am grateful to all participants who helped me to improve app, with their support the achievment became possible:
A1ex01
rpisarev
5erg
dimus
vasyannyasha
firstomega
(_KoDAk_)

But... now we are starting the new life as a sub-project of yoyo@home!
We started from 10T, for some reasons which I would like to keep in secret till a proper time. )
Here, on sub-project status page you can find out where we are.
The legend in the right side is clickable, click on "To Do", for example, to hide this category and for better scale of others categories.
"In Progress" — the range, where "porridge is boiling", with "holes" between checked ranges.
"Done" means continuos range of (at least) double-checked numbers, where "porridge has already boiled and cooled down".

So, now I'm waiting for when the range up to 30'359'586'957'394 will be crystallized, to double check our manual crunch.

You are welcome in Telegram group "Perfect Cuboid" to discuss math and code sides of algorythm.

Verfasst: **24.10.2017 20:43**

fwjmath hat geschrieben:I don't know what the exact algorithm is, but I am very interested.

I have found that the first half of the algorythm had already used and described at I. Korec, Lower bounds for Perfect Rational Cuboids, Math. Slovaca, 42 (1992), No. 5, p. 576.

Korec hat geschrieben:The program was written in the language TURBO PASCAL v. 5.5 and run on PC AT. In every run, all integers n from an interval [n1, n2] are considered and at the end input data for the next run are prepared.
After the start, several tables are computed. They later help in fast distinguishing quadratic residues and non-residues, in computing square roots modulo several integers, etc. The computation time of this stage is very short.
Whenever necessary, a portion of suitable n is prepared by a sieve method. It starts with all n = 1 (mod 4) from a suitable subinterval [a, b] of the interval [n1, n2]. Then for every prime p = 3 (mod 4), p < sqrt(b), all multiples of p are excluded. This is done also for some composite p because it is faster than testing primality. Then only n whose prime divisors are all = 1 (mod 4) remain, and they are considered in the further computation in the usual order.
Now let a suitable n be fixed. At first it is factorized, by the classical method, but only primes = 1 (mod 4) are treated. If n is prime, it is immediately excluded.

The next part is similar but not identical:

Korec hat geschrieben:Otherwise the list
(A0, B0),(A1, B1),..., (At, Bt), (4.1)
(A0, B0) = (n, 0) of all nonnegative integer solutions (a, b) of the equation a^2 + (4b)^2 = n^2 is computed by a method based on Lemma 2.2 and Lemma 2.3.

Instead of that I do:

x3mEn hat geschrieben:Otherwise the list
(A0, B0), (A1, B1),..., (At, Bt),
(A0, B0) = (0, n) of all nonnegative integer solutions (a, b) of the equation a^2 + b^2 = n^2 is computed.

And next:

Korec hat geschrieben:Then the main (and the most time consuming) part starts, which consists of three nested loops.

Instead of that I do only two nested loops. And they are enough to find out not only a Perfect cuboid, but also +5 other types of almost-perfect cuboids: Face, Edge, Perfect Complex, Imaginary and Twilight (the last 2 types I was forced to call by myself because of the reason which is out of scope of algorithm explanation).

Verfasst: **28.10.2017 00:21**

9 October we have added per user statistics of found cuboids, as well as subproject status page was extended by the common statistics.

Today, 27 October, we have added Hits to Perfect Cuboid subproject, so from now you can compete not only in earned Credits, but also in the amount of found cuboids. Free-DC is already showing Hits in Subproject Stats.

And also good news, today we have "crossed the equator" of the 1st Batch, Max In Progress threshold has exceeded 2^49 = 562.949.953.421.312, which is the half of our 1st goal — 2^50.

Verfasst: **28.10.2017 08:03**

Very nice!

Michael.

Verfasst: **01.11.2017 10:28**

Hi x3mEn,

Thanks again for sharing some more details on your algorithm. I have read the paper you provided. Although we cannot judge the efficiency of an algorithm by simply counting the number of nested loops, I admit that your two-loop idea may be better in implementation than the three-loop one in the paper, mainly by its simplicity.

I now have the following suggestions:

1. You may want to use more knowledge in number theory.

Many of the constraints in the three-loop algorithm concern the value mod 4 of results. We know that, for a primitive cuboid, its diagonal is of the form 4k+1, meaning that its sides are 0, 0, 1 mod 4. But I think you have already used this fact.

A more interesting one comes from checking for squares. More precisely, as you said before, you have g^2=a^2+f^2=b^2+e^2, and you want to check for a^2+b^2=d^2 for some d in the decompositions. You are doing it now first with byte residue, then with integer residue (32 bit), then finally with float point (long double). In fact, you can do it using Chinese Remainder Theorem (https://en.wikipedia.org/wiki/Chinese_remainder_theorem).

Basically, if you want to check an equality x+y=z, with z<n_1 * n_2 * ... * n_k, and all the n_i's are all coprime in pairs, you only need to check the equality mod all the n_i's. If all gives equal, then we must have a+b=c. In your case, a and b are 64 integers, therefore you need a collection of numbers with a product larger than 2^129, with each number less than 2^32. That is fairly easy to do. We know that 2^32-5, 2^32-17, 2^32-65, 2^32-99 are all primes, and you can also use 2^32 (your integer check), which is also coprime with them. You just perform the integer check, then check mod these five numbers (four if you restrict a, b to be less than 2^63). If you can use 64bit integers native to the processor (you can in C, with multiplication that gives you a 128-bit integer, in the form of hi- and low-64 bits), you may simply try 2^64 (the 64-bit integer check), 2^64-59, and (optionally, unnecessary for a,b > 2^63) 2^64-83.

A more interesting fact is that the residue mod a number of the form 2^32-a or 2^64-a for a small a takes much less time to compute than other modulus. Indeed, take 2^32-a as an example. If I want to compute n mod 2^32-a, with n a 64bit integer written as n1*2^32+n2, then we have

n mod 2^32-a = n1*a + n2 mod 2^32 - a

With a shift, a multiplication (with small number!) and an addition, we almost get the residue. On modern processors, this combinations gives you typically a large speed up, since division is much slower than multiplication. Of course, I say almost, because it is possible that n1*a + n2 is larger than 2^32 - a. In this case, if n1*a + n2 is larger than 2^32, you just do it again and it is good; otherwise, you substract 2^32 - a (or add a and take the lower 32 bits) and it is also good. The test is easy and fast.

On modern processors, integer instructions are much, much faster than float point ones. Especially you used extended double (80-bit) here, which must use the FPU. Square root in FPU takes about 40 cycles, while integer multiplication in CPU takes only 3 to 4 cycles (I look up some old reference, may be even better now). So I think there is a margin that you can gain here.

And for your information, on modern processors, checking for byte is not faster than checking for integer. In fact, the most natural check is to the native word length, which is 64 in your case.

2. Where is the hotspot?

I don't know if you have already profile your program and see the hotspot that takes most of the time, and try to improve it. I don't know much about Delphi (I can write code in it, but I mainly works on C), but there must be some tools that you can use for Delphi (or Pascal). Since integer factorization is much more expensive than any other part of your algorithm, I think that up to some turning point, the main cost will be on factorization. In this case, I have a good way to totally eliminate the cost of factorization, but it will complicate your code and the project for quite a bit. I don't recommend to do it unless you find out the hotspot in your current program. Determining hotspot is the basic of code optimization.

Hope that these commentaries can help you.

Verfasst: **01.11.2017 21:12**

Hi, fwjmath.

Thank you for your interest to the algorithm. As I mentioned before, I invite you join my Telegram channel to discuss all question in more interactive mode. And also I have some limitations of publishing app code here so if you are interested on code optimization, there's no other way to see it, only private conversation.

Next, you missed, I've already written: "First, this spring I at last learned C (pure C, not C++) and translate my ideas and previous developments on C language.", and also "TURBO PASCAL v. 5.5 and run on PC AT" — there was a quote of Korec words, so forget about Pascal and Delphi.

Regarding

you have g^2=a^2+f^2=b^2+e^2, and you want to check for a^2+b^2=d^2 for some d in the decompositions

I understand what you concern about, I have very laconic solutions for both cases (sum and subtract of squares) for 64-bit CPUs:

Code: Alles auswählen

static __inline__ uint64_t is_sub_of_squares_square_too(uint64_t a, uint64_t b)
{
    if (a == b) return 0;
    if (a < b) xchgu64(&a, &b);
    __uint128_t p = (__uint128_t)a*(__uint128_t)a - (__uint128_t)b*(__uint128_t)b;
    uint64_t c = rintl(sqrtl(p));
    return (p == (__uint128_t)c*(__uint128_t)c) ? c : 0;
}

static __inline__ uint64_t is_sum_of_squares_square_too(uint64_t a, uint64_t b)
{
    __uint128_t p = (__uint128_t)a*(__uint128_t)a + (__uint128_t)b*(__uint128_t)b;
    uint64_t c = rintl(sqrtl(p));
    return (p == (__uint128_t)c*(__uint128_t)c) ? c : 0;
}

If you have ideas how to make these routines faster, you are welcome!

Regarding

Since integer factorization is much more expensive than any other part of your algorithm, I think that up to some turning point, the main cost will be on factorization

You are right, during app evolution there was time when factorization was the most expensive routine, but after August'17 refactoring (thanks to Rob D. Matson) and entire changing factorization/filtering method to sieve, the share of factorization/filtering became insignificant. Just for comparison:

factoring/filtering only:

Code: Alles auswählen

Research Range borders  : from 1000000000000001 to 1000000099999997 step 4
Total amount of Numbers : 25000000
Block Size              : 100000
Starting date timestamp : 28.07.2017 05:06:35
Elapsed time            : 00:00:06.217
Candidates investigated : 4038022

full-functional app with factoring/filtering, decomposition and testing routines:

Code: Alles auswählen

Research Range borders  : from 1000000000000001 to 1000000099999997 step 4
Total amount of Numbers : 25000000
Factoring Block Size    : 100000
Starting date timestamp : 01.11.2017 21:48:55
Elapsed time            : 00:08:53.187
Candidates investigated : 4038022

So, factoring/filtering takes only 00:00:06.217 = 1.166% of 00:08:53.187

Verfasst: **04.11.2017 13:29**

Hello, guys!
I'm from Russia. My name's Zakhar Pekhterev. Your project is very interesting to us. We investigated this with my colleague Natalia Makarova. So we created fast algorytm that allowed to get good result. We varified 156481 Pythagorean triples with calculating values of space diagonals into all possible Euler briks up to value of space diagonal is g=3918370664973947,8649970910420777... The side's values of last verified cuboid are: a=1238639696233380, b=1239555398362281, c=3504697245902160. Last verified Pythagorean triple is w=120561, v=85239, u=85260. So we didn't find at least one integger value of space diagonal. So we have any teqnical dificaltties, but we can get over them if it's necеssary, however, at this time, we decided that here is no further meaning. Good luck for your searching!

Verfasst: **04.11.2017 15:06**

We got formulas for our calculatings from here: https://en.wikipedia.org/wiki/Euler_brick

Verfasst: **04.11.2017 18:14**

Zak hat geschrieben:We got formulas for our calculatings from here: https://en.wikipedia.org/wiki/Euler_brick

I think that the parametric family there does not cover all possible Euler bricks. So I think that what you and your colleague checked is an infinite family of Euler bricks, but not all possible Euler bricks.

Reference: http://mathworld.wolfram.com/EulerBrick.html

Verfasst: **04.11.2017 20:34**

Can you give an example of an Euler brick that does not match Sounderson's parametric formulas? Do you know parametric formulas found by Euler himself?

Verfasst: **04.11.2017 20:57**

Zak hat geschrieben:Can you give an example of an Euler brick that does not match Sounderson's parametric formulas? Do you know parametric formulas found by Euler himself?

For example, (240, 252, 275) is an Euler brick, but not in Sounderson's parametrization. I don't have reference for Euler's parametrization though.

And it has been proved that Sounderson's parametrization will never give a perfect cuboid.

Reference: W. Spohn. On the integral cuboid. American Mathematical Monthly, 79:57–59, 1972

Rechenkraft.net e.V.

[sub-project] Perfect cuboid

Re: [sub-project] Perfect cuboid

Re: [sub-project] Perfect cuboid

Re: [sub-project] Perfect cuboid

Re: [sub-project] Perfect cuboid

Re: [sub-project] Perfect cuboid

Re: [sub-project] Perfect cuboid

Re: [sub-project] Perfect cuboid

Re: [sub-project] Perfect cuboid

Re: [sub-project] Perfect cuboid

Re: [sub-project] Perfect cuboid

Re: [sub-project] Perfect cuboid

Re: [sub-project] Perfect cuboid