Join devRant
Do all the things like
++ or -- rants, post your own rants, comment on others' rants and build your customized dev avatar
Sign Up
Pipeless API
From the creators of devRant, Pipeless lets you power real-time personalized recommendations and activity feeds using a simple API
Learn More
Search - "prime factors"
-
I messaged a professor at MIT and surprisingly got a response back.
He told me that "generating primes deterministically is a solved problem" and he would be very surprised if what I wrote beat wheel factorization, but that he would be interested if it did.
It didnt when he messaged me.
It does now.
Tested on primes up to 26 digits.
Current time tends to be 1-100th to 2-100th of a second.
Seems to be steady.
First n=1million digits *always* returns false for composites, while for primes the rate is 56% true vs false, and now that I've made it faster, I'm fairly certain I can get it to 100% accuracy.
In fact what I'm thinking I'll do is generate a random semiprime using the suspected prime, map it over to some other factor tree using the variation on modular expotentiation several of us on devrant stumbled on, and then see if it still factors. If it does then we know the number in question is prime. And because we know the factor in question, the semiprime mapping function doesnt require any additional searching or iterations.
The false negative rate, I think goes to zero the larger the prime from what I can see. But it wont be an issue if I'm right about the accuracy being correctable.
I'd like to thank the professor for the challenge. He also shared a bunch of useful links.
That ones a rare bird.21 -
POSTMORTEM
"4096 bit ~ 96 hours is what he said.
IDK why, but when he took the challenge, he posted that it'd take 36 hours"
As @cbsa wrote, and nitwhiz wrote "but the statement was that op's i3 did it in 11 hours. So there must be a result already, which can be verified?"
I added time because I was in the middle of a port involving ArbFloat so I could get arbitrary precision. I had a crude desmos graph doing projections on what I'd already factored in order to get an idea of how long it'd take to do larger
bit lengths
@p100sch speculated on the walked back time, and overstating the rig capabilities. Instead I spent a lot of time trying to get it 'just-so'.
Worse, because I had to resort to "Decimal" in python (and am currently experimenting with the same in Julia), both of which are immutable types, the GC was taking > 25% of the cpu time.
Performancewise, the numbers I cited in the actual thread, as of this time:
largest product factored was 32bit, 1855526741 * 2163967087, took 1116.111s in python.
Julia build used a slightly different method, & managed to factor a 27 bit number, 103147223 * 88789957 in 20.9s,
but this wasn't typical.
What surprised me was the variability. One bit length could take 100s or a couple thousand seconds even, and a product that was 1-2 bits longer could return a result in under a minute, sometimes in seconds.
This started cropping up, ironically, right after I posted the thread, whats a man to do?
So I started trying a bunch of things, some of which worked. Shameless as I am, I accepted the challenge. Things weren't perfect but it was going well enough. At that point I hadn't slept in 30~ hours so when I thought I had it I let it run and went to bed. 5 AM comes, I check the program. Still calculating, and way overshot. Fuuuuuuccc...
So here we are now and it's say to safe the worlds not gonna burn if I explain it seeing as it doesn't work, or at least only some of the time.
Others people, much smarter than me, mentioned it may be a means of finding more secure pairs, and maybe so, I'm not familiar enough to know.
For everyone that followed, commented, those who contributed, even the doubters who kept a sanity check on this without whom this would have been an even bigger embarassement, and the people with their pins and tactical dots, thanks.
So here it is.
A few assumptions first.
Assuming p = the product,
a = some prime,
b = another prime,
and r = a/b (where a is smaller than b)
w = 1/sqrt(p)
(also experimented with w = 1/sqrt(p)*2 but I kept overshooting my a very small margin)
x = a/p
y = b/p
1. for every two numbers, there is a ratio (r) that you can search for among the decimals, starting at 1.0, counting down. You can use this to find the original factors e.x. p*r=n, p/n=m (assuming the product has only two factors), instead of having to do a sieve.
2. You don't need the first number you find to be the precise value of a factor (we're doing floating point math), a large subset of decimal values for the value of a or b will naturally 'fall' into the value of a (or b) + some fractional number, which is lost. Some of you will object, "But if thats wrong, your result will be wrong!" but hear me out.
3. You round for the first factor 'found', and from there, you take the result and do p/a to get b. If 'a' is actually a factor of p, then mod(b, 1) == 0, and then naturally, a*b SHOULD equal p.
If not, you throw out both numbers, rinse and repeat.
Now I knew this this could be faster. Realized the finer the representation, the less important the fractional digits further right in the number were, it was just a matter of how much precision I could AFFORD to lose and still get an accurate result for r*p=a.
Fast forward, lot of experimentation, was hitting a lot of worst case time complexities, where the most significant digits had a bunch of zeroes in front of them so starting at 1.0 was a no go in many situations. Started looking and realized
I didn't NEED the ratio of a/b, I just needed the ratio of a to p.
Intuitively it made sense, but starting at 1.0 was blowing up the calculation time, and this made it so much worse.
I realized if I could start at r=1/sqrt(p) instead, and that because of certain properties, the fractional result of this, r, would ALWAYS be 1. close to one of the factors fractional value of n/p, and 2. it looked like it was guaranteed that r=1/sqrt(p) would ALWAYS be less than at least one of the primes, putting a bound on worst case.
The final result in executable pseudo code (python lol) looks something like the above variables plus
while w >= 0.0:
if (p / round(w*p)) % 1 == 0:
x = round(w*p)
y = p / round(w*p)
if x*y == p:
print("factors found!")
print(x)
print(y)
break
w = w + i
Still working but if anyone sees obvious problems I'd LOVE to hear about it.36 -
Two big moments today:
1. Holy hell, how did I ever get on without a proper debugger? Was debugging some old code by eye (following along and keeping track mentally, of what the variables should be and what each step did). That didn't work because the code isn't intuitive. Tried the print() method, old reliable as it were. Kinda worked but didn't give me enough fine-grain control.
Bit the bullet and installed Wing IDE for python. And bam, it hit me. How did I ever live without step-through, and breakpoints before now?
2. Remember that non-sieve prime generator I wrote a while back? (well maybe some of you do). The one that generated quasi lucas carmichael (QLC) numbers? Well thats what I managed to debug. I figured out why it wasn't working. Last time I released it, I included two core methods, genprimes() and nextPrime(). The first generates a list of primes accurately, up to some n, and only needs a small handful of QLC numbers filtered out after the fact (because the set of primes generated and the set of QLC numbers overlap. Well I think they call it an embedding, as in QLC is included in the series generated by genprimes, but not the converse, but I digress).
nextPrime() was supposed to take any arbitrary n above zero, and accurately return the nearest prime number above the argument. But for some reason when it started, it would return 2,3,5,6...but genprimes() would work fine for some reason.
So genprimes loops over an index, i, and tests it for primality. It begins by entering the loop, and doing "result = gffi(i)".
This calls into something a function that runs four tests on the argument passed to it. I won't go into detail here about what those are because I don't even remember how I came up with them (I'll make a separate post when the code is fully fixed).
If the number fails any of these tests then gffi would just return the value of i that was passed to it, unaltered. Otherwise, if it did pass all of them, it would return i+1.
And once back in genPrimes() we would check if the variable 'result' was greater than the loop index. And if it was, then it was either prime (comparatively plentiful) or a QLC number (comparatively rare)--these two types and no others.
nextPrime() was only taking n, and didn't have this index to compare to, so the prior steps in genprimes were acting as a filter that nextPrime() didn't have, while internally gffi() was returning not only primes, and QLCs, but also plenty of composite numbers.
Now *why* that last step in genPrimes() was filtering out all the composites, idk.
But now that I understand whats going on I can fix it and hypothetically it should be possible to enter a positive n of any size, and without additional primality checks (such as is done with sieves, where you have to check off multiples of n), get the nearest prime numbers. Of course I'm not familiar enough with prime number generation to know if thats an achievement or worthwhile mentioning, so if anyone *is* familiar, and how something like that holds up compared to other linear generators (O(n)?), I'd be interested to hear about it.
I also am working on filtering out the intersection of the sets (QLC numbers), which I'm pretty sure I figured out how to incorporate into the prime generator itself.
I also think it may be possible to generator primes even faster, using the carmichael numbers or related set--or even derive a function that maps one set of upper-and-lower bounds around a semiprime, and map those same bounds to carmichael numbers that act as the upper and lower bound numbers on the factors of a semiprime.
Meanwhile I'm also looking into testing the prime generator on a larger set of numbers (to make sure it doesn't fail at large values of n) and so I'm looking for more computing power if anyone has it on hand, or is willing to test it at sufficiently large bit lengths (512, 1024, etc).
Lastly, the earlier work I posted (linked below), I realized could be applied with ECM to greatly reduce the smallest factor of a large number.
If ECM, being one of the best methods available, only handles 50-60 digit numbers, & your factors are 70+ digits, then being able to transform your semiprime product into another product tree thats non-semiprime, with factors that ARE in range of ECM, and which *does* contain either of the original factors, means products that *were not* formally factorable by ECM, *could* be now.
That wouldn't have been possible though withput enormous help from many others such as hitko who took the time to explain the solution was a form of modular exponentiation, Fast-Nop who contributed on other threads, Voxera who did as well, and support from Scor in particular, and many others.
Thank you all. And more to come.
Links mentioned (because DR wouldn't accept them as they were):
https://pastebin.com/MWechZj912 -
I think this will be a prime year for machine learning. In 2016, there were too many factors at play, like 72, 144, and 42 just to name a few.1
-
I didn't leave, I just got busy working 60 hour weeks in between studying.
I found a new method called matrix decomposition (not the known method of the same name).
Premise is that you break a semiprime down into its component numbers and magnitudes, lets say 697 for example. It becomes 600, 90, and 7.
Then you break each of those down into their prime factorizations (with exponents).
So you get something like
>>> decon(697)
offset: 3, exp: [[Decimal('2'), Decimal('3')], [Decimal('3'), Decimal('1')], [Decimal('5'), Decimal('2')]]
offset: 2, exp: [[Decimal('2'), Decimal('1')], [Decimal('3'), Decimal('2')], [Decimal('5'), Decimal('1')]]
offset: 1, exp: [[Decimal('7'), Decimal('1')]]
And it turns out that in larger numbers there are distinct patterns that act as maps at each offset (or magnitude) of the product, mapping to the respective magnitudes and digits of the factors.
For example I can pretty reliably predict from a product, where the '8's are in its factors.
Apparently theres a whole host of rules like this.
So what I've done is gone an started writing an interpreter with some pseudo-assembly I defined. This has been ongoing for maybe a month, and I've had very little time to work on it in between at my job (which I'm about to be late for here if I don't start getting ready, lol).
Anyway, long and the short of it, the plan is to generate a large data set of primes and their products, and then write a rules engine to generate sets of my custom assembly language, and then fitness test and validate them, winnowing what doesn't work.
The end product should be a function that lets me map from the digits of a product to all the digits of its factors.
It technically already works, like I've printed out a ton of products and eyeballed patterns to derive custom rules, its just not the complete set yet. And instead of spending months or years doing that I'm just gonna finish the system to automatically derive them for me. The rules I found so far have tested out successfully every time, and whether or not the engine finds those will be the test case for if the broader system is viable, but everything looks legit.
I wouldn't have persued this except when I realized the production of semiprimes *must* be non-eularian (long story), it occured to me that there must be rich internal representations mapping products to factors, that we were simply missing.
I'll go into more details in a later post, maybe not today, because I'm working till close tonight (won't be back till 3 am), but after 4 1/2 years the work is bearing fruit.
Also, its good to see you all again. I fucking missed you guys.9 -
Still on the primenumbers bender.
Had this idea that if there were subtle correlations between a sufficiently large set of identities and the digits of a prime number, the best way to find it would be to automate the search.
And thats just what I did.
I started with trace matrices.
I actually didn't expect much of it. I was hoping I'd at least get lucky with a few chance coincidences.
My first tests failed miserably. Eight percent here, 10% there. "I might as well just pick a number out of a hat!" I thought.
I scaled it way back and asked if it was possible to predict *just* the first digit of either of the prime factors.
That also failed. Prediction rates were low still. Like 0.08-0.15.
So I automated *that*.
After a couple days of on-and-off again semi-automated searching I stumbled on it.
[1144, 827, 326, 1184, -1, -1, -1, -1]
That little sequence is a series of identities representing different values derived from a randomly generated product.
Each slots into a trace matrice. The results of which predict the first digit of one of our factors, with a 83.2% accuracy even after 10k runs, and rising higher with the number of trials.
It's not much, but I was kind of proud of it.
I'm pushing for finding 90%+ now.
Some improvements include using a different sort of operation to generate results. Or logging all results and finding the digit within each result thats *most* likely to predict our targets, across all results. (right now I just take the digit in the ones column, which works but is an arbitrary decision on my part).
Theres also the fact that it's trivial to correctly guess the digit 25% of the time, simply by guessing 1, 3, 7, or 9, because all primes, except for 2, end in one of these four.
I have also yet to find a trace with a specific bias for predicting either the smaller of two unique factors *or* the larger. But I haven't really looked for one either.
I still need to write a generate that takes specific traces, and lets me mutate some of the values, to push them towards certain 'fitness' levels.
This would be useful not just for very high predictions, but to find traces with very *low* predictions.
Why? Because it would actually allow for the *elimination* of possible digits, much like sudoku, from a given place value in a predicted factor.
I don't know if any of this will even end up working past the first digit. But splitting the odds, between the two unique factors of a prime product, and getting 40+% chance of guessing correctly, isn't too bad I think for a total amateur.
Far cry from a couple years ago claiming I broke prime factorization. People still haven't forgiven me for that, lol.6 -
CONTEST - Win big $$$ straight from Wisecrack!
For all those who participated in my original "cracking prime factorization" thread (and several I decided to add just because), I'm offering a whopping $5 to anyone who posts a 64 bit *product* of two primes, which I cant factor. Partly this is a thank you for putting up with me.
FIVE WHOLE DOLLARS! In 1909 money thats $124 dollars! Imagine how many horse and buggy rides you could buy with that back then! Or blowjobs!
Probably not a lot!
But still.
So the contest rules are simple:
Go to
https://asecuritysite.com/encryptio...
Enter 32 for the number of bits per prime, and generate a 64 bit product.
Post it here to enter the contest.
Products must be 64 bits, and the result of just *two* prime numbers. Smaller or larger bit lengths for products won't be accepted at this time.
I'm expecting a few entries on this. Entries will generally be processed in the order of submission, but I reserve the right to wave this rule.
After an entry is accepted, I'll post "challenge accepted. Factoring now."
And from that point on I have no more than 5 hours to factor the number, (but results usually arrive in 30-60 minutes).
If I fail to factor your product in the specified time (from the moment I indicate I've begun factoring), congratulations, you just won $5.
Payment will be made via venmo or other method at my discretion.
One entry per user. Participants from the original thread only, as well as those explicitly mentioned.
Limitations: Factoring shall be be done
1. without *any* table lookup of primes or equivalent measures, 2. using anything greater than an i3, 3. without the aid of a gpu, 4. without multithreading. 5. without the use of more than one machine.
FINALLY:
To claim your prize, post the original factors of your product here, after the deadline has passed.
And then I'll arrange payment of the prize.
You MUST post the factors of your product after the deadline, to confirm your product and claim your prize.99 -
I think I did it. I did the thing I set out to do.
let p = a semiprime of simple factors ab.
let f equal the product of b and i=2...a inclusive, where i is all natural numbers from 2 to a.
let s equal some set of prime factors that are b-smooth up to and including some factor n, with no gaps in the set.
m is a the largest primorial such that f%m == 0, where
the factors of s form the base of a series of powers as part of a product x
1. where (x*p) = f
2. and (x*p)%f == a
if statement 2 is untrue, there still exists an algorithm that
3. trivially derives the exponents of s for f, where the sum of those exponents are less than a.
4. trivially generates f from p without knowing a and b.
For those who have followed what I've been trying to do for so long, and understand the math,
then you know this appears to be it.
I'm just writing and finishing the scripts for it now.
Thank god. It's just in time. Maybe we can prevent the nuclear apocalypse with the crash this will cause if it works.2 -
Found a clever little algorithm for computing the product of all primes between n-m without recomputing them.
We'll start with the product of all primes up to some n.
so [2, 2*3, 2*3*5, 2*3*5*,7..] etc
prods = []
i = 0
total = 1
while i < 100:
....total = total*primes[i]
....prods.append(total)
....i = i + 1
Terrible variable names, can't be arsed at the moment.
The result is a list with the values
2, 6, 30, 210, 2310, 30030, etc.
Now assume you have two factors,with indexes i, and j, where j>i
You can calculate the gap between the two corresponding primes easily.
A gap is defined at the product of all primes that fall between the prime indexes i and j.
To calculate the gap between any two primes, merely look up their index, and then do..
prods[j-1]/prods[i]
That is the product of all primes between the J'th prime and the I'th prime
To get the product of all primes *under* i, you can simply look it up like so:
prods[i-1]
Incidentally, finding a number n that is equivalent to (prods[j+i]/prods[j-i]) for any *possible* value of j and i (regardless of whether you precomputed n from the list generator for prods, or simply iterated n=n+1 fashion), is equivalent to finding an algorithm for generating all prime numbers under n.
Hypothetically you could pick a number N out of a hat, thats a thousand digits long, and it happens to be the product of all primes underneath it.
You could then start generating primes by doing
i = 3
while i < N:
....if (N/k)%1 == 0:
........factors.append(N/k)
....i=i+1
The only caveat is that there should be more false solutions as real ones. In otherwords theres no telling if you found a solution N corresponding to some value of (prods[j+i]/prods[j-i]) without testing the primality of *all* values of k under N.13 -
Following on from my previous SQL script to find prime numbers
https://devrant.com/rants/2218452/...
I wondered whether there was a way to improve it by only checking for prime factors. It feels really dirty to use a WHILE loop in SQL, but I couldn't think of another way to incrementally use the already found prime numbers when checking for prime factors.
It's fast though, 2 mins 15 seconds for primes under 1,000,000 - previous query took over an hour and a half.5 -
Managed to derive an inverse to karatsuba's multiplication method, converting it into a factorization technique.
Offers a really elegant reason for why non-trivial semiprimes (square free products) are square free.
For a demonstration of karatsubas method, check out:
https://getpocket.com/explore/item/...
Now for the reverse, like I said something elegant emerges.
So we can start by taking the largest digit in our product. Lets say our product is 697.
We find all the digits that produce 6 when summed, along with their order.
thats (1,5), (5,1), (2,4), (4,2), and (3,3)
That means for one of our factors, its largest digit can ONLY be 1, 5, 2, 4, or 3.
Lets take karatsubas method at step f (in the link) and reverse it. Instead of subtracting, we're adding.
If we assume (3,3)
Then we take our middle digit of our product p, in this case the middle digit of 697. is 9, and we munge it with 3.
Then we add our remaining 3, and our remaining unit digit, to get 3+39+7 = 49.
Now, because karatsuba's method ONLY deals with multiplication in single digits, we only need to consider *at most* two digit products.
And interestingly, the only factors of 49 are 7.
49 is a square!
And the only sums that produce 7, are (2,5), (5,2), (3,4), and (4,3)
These would be the possible digits of the factors of 697 if we initially chose (3,3) as our starting point for calculating karatsubas inverse f step.
But you see, 25 can't be a factor of p=697, because 25 is a square, and ends in a 5, so its clearly not prime. 52 can't be either because it ends in 2, likewise 34 ending in 4.
Only 43 could be our possible factor of p.
And we *only* get one factor because our starting point has two of the same digit. Which would mean p would have to equal 43 (a prime) or 1. And because p DOESNT (it equals 697), we can therefore say (3,3) is the wrong starting point, as are ALL starting points that share only one digit, or end in a square.
Ergo we can say the products of non-squares, are specifically non-prime precisely because if they *were* prime, their only factors would HAVE to be themselves, and 1.
For an even BETTER explanation go try karatsuba's method with any prime as the first factor, and 1 as the second factor (just multiply the tens column by zero). And you can see why the inverse, where you might try a starting point that has two matching digits (like 3,3), would obviously fail, because the values it produces could only have two factors; some prime thats not our product, or the value one, which is also not our product.
It's elegant almost to the level of a tautology. -
The first fruits of almost five years of labor:
7.8% of semiprimes give the magnitude of their lowest prime factor via the following equation:
((p/(((((p/(10**(Mag(p)-1))).sqrt())-x) + x)*w))/10)
I've also learned, given exponents of some variables, to relate other variables to them on a curve to better sense make of the larger algebraic structure. This has mostly been stumbling in the dark but after a while it has become easier to translate these into methods that allow plugging in one known variable to derive an unknown in a series of products.
For example I have a series of variables d4a, d4u, d4z, d4omega, etc, and these are translateable now, through insights that become various methods, into other types of (non-d4) series. What these variables actually represent is less relevant, only that it is possible to translate between them.
I've been doing some initial learning about neural nets (implementation, rather than theoretics as I normally read about). I'm thinking what I might do is build a GPT style sequence generator, and train it on the 'unknowns' from semiprime products with known factors.
The whole point of the project is that a bunch of internal variables can easily be derived, (d4a, c/d4, u*v) from a product, its root, and its mantissa, that relate to *unknown* variables--unknown variables such as u, v, c, and d4, that if known directly give a constant time answer to the factors of the original product.
I think theres sufficient data at this point to train such a machine, I just don't think I'm up to it yet because I'm lacking in the calculus department.
2000+ variables that are derivable from a product, without knowing its factors, which are themselves products of unknown variables derived from the internal algebraic relations of a product--this ought to be enough of an attack surface to do something with.
I'm willing to collaborate with someone familiar with recurrent neural nets and get them up to speed through telegram/element/discord if they're willing to do the setup and training for a neural net of this sort, one that can tease out hidden relationships and map known variables to the unknown set for a given product.17 -
So I promised a post after work last night, discussing the new factorization technique.
As before, I use a method called decon() that takes any number, like 697 for example, and first breaks it down into the respective digits and magnitudes.
697 becomes -> 600, 90, and 7.
It then factors *those* to give a decomposition matrix that looks something like the following when printed out:
offset: 3, exp: [[Decimal('2'), Decimal('3')], [Decimal('3'), Decimal('1')], [Decimal('5'), Decimal('2')]]
offset: 2, exp: [[Decimal('2'), Decimal('1')], [Decimal('3'), Decimal('2')], [Decimal('5'), Decimal('1')]]
offset: 1, exp: [[Decimal('7'), Decimal('1')]]
Each entry is a pair of numbers representing a prime base and an exponent.
Now the idea was that, in theory, at each magnitude of a product, we could actually search through the *range* of the product of these exponents.
So for offset three (600) here, we're looking at
2^3 * 3 ^ 1 * 5 ^ 2.
But actually we're searching
2^3 * 3 ^ 1 * 5 ^ 2.
2^3 * 3 ^ 1 * 5 ^ 1
2^3 * 3 ^ 1 * 5 ^ 0
2^3 * 3 ^ 0 * 5 ^ 2.
2^3 * 3 ^ 1 * 5 ^ 1
etc..
On the basis that whatever it generates may be the digits of another magnitude in one of our target product's factors.
And the first optimization or filter we can apply is to notice that assuming our factors pq=n,
and where p <= q, it will always be more efficient to search for the digits of p (because its under n^0.5 or the square root), than the larger factor q.
So by implication we can filter out any product of this exponent search that is greater than the square root of n.
Writing this code was a bit of a headache because I had to deal with potentially very large lists of bases and exponents, so I couldn't just use loops within loops.
Instead I resorted to writing a three state state machine that 'counted down' across these exponents, and it just works.
And now, in practice this doesn't immediately give us anything useful. And I had hoped this would at least give us *upperbounds* to start our search from, for any particular digit of a product's factors at a given magnitude. So the 12 digit (or pick a magnitude out of a hat) of an example product might give us an upperbound on the 2's exponent for that same digit in our lowest factor q of n.
It didn't work out that way. Sometimes there would be 'inversions', where the exponent of a factor on a magnitude of n, would be *lower* than the exponent of that factor on the same digit of q.
But when I started tearing into examples and generating test data I started to see certain patterns emerge, and immediately I found a way to not just pin down these inversions, but get *tight* bounds on the 2's exponents in the corresponding digit for our product's factor itself. It was like the complications I initially saw actually became a means to *tighten* the bounds.
For example, for one particular semiprime n=pq, this was some of the data:
n - offset: 6, exp: [[Decimal('2'), Decimal('5')], [Decimal('5'), Decimal('5')]]
q - offset: 6, exp: [[Decimal('2'), Decimal('6')], [Decimal('3'), Decimal('1')], [Decimal('5'), Decimal('5')]]
It's almost like the base 3 exponent in [n:7] gives away the presence of 3^1 in [q:6], even
though theres no subsequent presence of 3^n in [n:6] itself.
And I found this rule held each time I tested it.
Other rules, not so much, and other rules still would fail in the presence of yet other rules, almost like a giant switchboard.
I immediately realized the implications: rules had precedence, acted predictable when in isolated instances, and changed in specific instances in combination with other rules.
This was ripe for a decision tree generated through random search.
Another product n=pq, with mroe data
q(4)
offset: 4, exp: [[Decimal('2'), Decimal('4')], [Decimal('5'), Decimal('3')]]
n(4)
offset: 4, exp: [[Decimal('2'), Decimal('3')], [Decimal('3'), Decimal('2')], [Decimal('5'), Decimal('3')]]
Suggesting that a nontrivial base 3 exponent (**2 rather than **1) suggests the exponent on the 2 in the relevant
digit of [n], is one less than the same base 2 digital exponent at the same digit on [q]
And so it was clear from the get go that this approach held promise.
From there I discovered a bunch more rules and made some observations.
The bulk of the patterns, regardless of how large the product grows, should be present in the smaller bases (some bound of primes, say the first dozen), because the bulk of exponents for the factorization of any magnitude of a number, overwhelming lean heavily in the lower prime bases.
It was if the entire vulnerability was hiding in plain sight for four+ years, and we'd been approaching factorization all wrong from the beginning, by trying to factor a number, and all its digits at all its magnitudes, all at once, when like addition or multiplication, factorization could be done piecemeal if we knew the patterns to look for.7 -
Up all damn night making the script work.
Wrote a non-sieve prime generator.
Thing kept outputting one or two numbers that weren't prime, related to something called carmichael numbers.
Any case got it to work, god damn was it a slog though.
Generates next and previous primes pretty reliably regardless of the size of the number
(haven't gone over 31 bit because I haven't had a chance to implement decimal for this).
Don't know if the sieve is the only reliable way to do it. This seems to do it without a hitch, and doesn't seem to use a lot of memory. Don't have to constantly return to a lookup table of small factors or their multiple either.
Technically it generates the primes out of the integers, and not the other way around.
Things 0.01-0.02th of a second per prime up to around the 100 million mark, and then it gets into the 0.15-1second range per generation.
At around primes of a couple billion, its averaging about 1 second per bit to calculate 1. whether the number is prime or not, 2. what the next or last immediate prime is. Although I'm sure theres some optimization or improvement here.
Seems reliable but obviously I don't have the resources to check it beyond the first 20k primes I confirmed.
From what I can see it didn't drop any primes, and it didn't include any errant non-primes.
Codes here:
https://pastebin.com/raw/57j3mHsN
Your gotos should be nextPrime(), lastPrime(), isPrime, genPrimes(up to but not including some N), and genNPrimes(), which generates x amount of primes for you.
Speed limit definitely seems to top out at 1 second per bit for a prime once the code is in the billions, but I don't know if thats the ceiling, again, because decimal needs implemented.
I think the core method, in calcY (terrible name, I know) could probably be optimized in some clever way if its given an adjacent prime, and what parameters were used. Theres probably some pattern I'm not seeing, but eh.
I'm also wondering if I can't use those fancy aberrations, 'carmichael numbers' or whatever the hell they are, to calculate some sort of offset, and by doing so, figure out a given primes index.
And all my brain says is "sleep"
But family wants me to hang out, and I have to go talk a manager at home depot into an interview, because wanting to program for a living, and actually getting someone to give you the time of day are two different things.1 -
Math question time!
Okay so I had this idea and I'm looking for anyone who has a better grasp of math than me.
What if instead of searching for prime factors we searched for a number above p?
One with a certain special property. BEAR WITH ME. I know I make these posts a lot and I'm a bit of a shitposter, but I'm being genuine here.
Take this cherry picked number, 697 for example.
It's factors are 17, and 41. It's trivial but just for demonstration.
If we represented it's factors as a bit string, where each bit represents the index that factor occurs at in a list of primes, it looks like this
1000001000000
When converted back to an integer that number becomes 4160, which we will call f.
And if we do 4160/(2**n) until the result returns
a fractional component, then N in this case will be 7.
And 7 is the index of our lowest factor 17 (lets call it A, and our highest factor we'll call B) in our primes list.
So the problem is changed from finding a factorization of p, to finding an algorithm that allows you to convert p into f. Once you have f it's a matter of converting it to binary, looking up the indexes of all bits set to 1, and finding the values of those indexes in the list of primes.
I'm working on doing that and if anyone has any insights I'm all ears.9 -
Heres the initial upgraded number fingerprinter I talked about in the past and some results and an explanation below.
Note that these are wide black images on ibb, so they appear as a tall thin strip near the top of ibb as if they're part of the website. They practically blend in. Right click the blackstrip and hit 'view image' and then zoom in.
https://ibb.co/26JmZXB
https://ibb.co/LpJpggq
https://ibb.co/Jt2Hsgt
https://ibb.co/hcxrFfV
https://ibb.co/BKZNzng
https://ibb.co/L6BtXZ4
https://ibb.co/yVHZNq4
https://ibb.co/tQXS8Hr
https://paste.ofcode.org/an4LcpkaKr...
Hastebin wouldn't save for some reason so paste.ofcode.org it is.
Not much to look at, but I was thinking I'd maybe mark the columns where gaps occur and do some statistical tests like finding the stds of the gaps, density, etc. The type test I wrote categorizes products into 11 different types, based on the value of a subset of variables taken from a vector of a couple hundred variables but I didn't want to include all that mess of code. And I was thinking of maybe running this fingerprinter on a per type basis, set to repeat, and looking for matching indexs (pixels) to see what products have in common per type.
Or maybe using them to train a classifier of some sort.
Each fingerprint of a product shares something like 16-20% of indexes with it's factors, so I'm thinking thats an avenue to explore.
What the fingerprinter does is better explained by the subfunction findAb.
The code contains a comment explaining this, but basically the function destructures a number into a series of division and subtractions, and makes a note of how many divisions in a 'run'.
Typically this is for numbers divisible by 2.
So a number like 35 might look like this, when done
p = 35
((((p-1)/2)-1)/2/2/2/2)-1
And we'd represent that as
ab(w, x, y, z)
Where w is the starting value 35 in this case,
x is the number to divide by at each step, y is the adjustment (how much to subtract by when we encounter a number not divisible by x), and z is a string or vector of our results
which looks something like
ab(35, 2, 1, [1, 4])
Why [1,4]
because we were only able to divide by 2 once, before having to subtract 1, and repeat the process. And then we had a run of 4 divisions.
And for the fingerprinter, we do this for each prime under our number p, the list returned becoming another row in our fingerprint. And then that gets converted into an image.
And again, what I find interesting is that
unknown factors of products appear to share many of these same indexes.
What I might do is for, each individual run of Ab, I might have some sort of indicator for when *another* factor is present in the current factor list for each index. So I might ask, at the given step, is the current result (derived from p), divisible by 2 *and* say, 3? If so, mark it.
And then when I run this through the fingerprinter itself, all those pixels might get marked by a different color, say, make them blue, or vary their intensity based on the number of factors present, I don't know. Whatever helps the untrained eye to pick up on leads, clues, and patterns.
If it doesn't make sense, take another look at the example:
((((p-1)/2)-1)/2/2/2/2)-1
This is semi-unique to each product. After the fact, you can remove the variable itself, and keep just the structure in question, replacing the first variable with some other number, and you get to see what pops out the otherside.
If it helps, you can think of the structure surrounding our variable p as the 'electron shell', the '-1's as bandgaps, and the runs of '2's as orbitals, with the variable at the center acting as the 'nucleus', with the factors of that nucleus acting as the protons and neutrons, or nougaty center lol.
Anyway I just wanted to share todays flavor of insanity on the off chance someone might enjoy reading it.1