Join devRant
Do all the things like
++ or -- rants, post your own rants, comment on others' rants and build your customized dev avatar
Sign Up
Pipeless API
From the creators of devRant, Pipeless lets you power real-time personalized recommendations and activity feeds using a simple API
Learn More
Search - "factorization"
-
So I cracked prime factorization. For real.
I can factor a 1024 bit product in 11hours on an i3.
No GPU acceleration, no massive memory overhead. Probably a lot faster with parallel computation on a better cpu, or even on a gpu.
4096 bits in 97-98 hours.
Verifiable. Not shitting you. My hearts beating out of my fucking chest. Maybe it was an act of god, I don't know, but it works.
What should I do with it?240 -
I messaged a professor at MIT and surprisingly got a response back.
He told me that "generating primes deterministically is a solved problem" and he would be very surprised if what I wrote beat wheel factorization, but that he would be interested if it did.
It didnt when he messaged me.
It does now.
Tested on primes up to 26 digits.
Current time tends to be 1-100th to 2-100th of a second.
Seems to be steady.
First n=1million digits *always* returns false for composites, while for primes the rate is 56% true vs false, and now that I've made it faster, I'm fairly certain I can get it to 100% accuracy.
In fact what I'm thinking I'll do is generate a random semiprime using the suspected prime, map it over to some other factor tree using the variation on modular expotentiation several of us on devrant stumbled on, and then see if it still factors. If it does then we know the number in question is prime. And because we know the factor in question, the semiprime mapping function doesnt require any additional searching or iterations.
The false negative rate, I think goes to zero the larger the prime from what I can see. But it wont be an issue if I'm right about the accuracy being correctable.
I'd like to thank the professor for the challenge. He also shared a bunch of useful links.
That ones a rare bird.21 -
POSTMORTEM
"4096 bit ~ 96 hours is what he said.
IDK why, but when he took the challenge, he posted that it'd take 36 hours"
As @cbsa wrote, and nitwhiz wrote "but the statement was that op's i3 did it in 11 hours. So there must be a result already, which can be verified?"
I added time because I was in the middle of a port involving ArbFloat so I could get arbitrary precision. I had a crude desmos graph doing projections on what I'd already factored in order to get an idea of how long it'd take to do larger
bit lengths
@p100sch speculated on the walked back time, and overstating the rig capabilities. Instead I spent a lot of time trying to get it 'just-so'.
Worse, because I had to resort to "Decimal" in python (and am currently experimenting with the same in Julia), both of which are immutable types, the GC was taking > 25% of the cpu time.
Performancewise, the numbers I cited in the actual thread, as of this time:
largest product factored was 32bit, 1855526741 * 2163967087, took 1116.111s in python.
Julia build used a slightly different method, & managed to factor a 27 bit number, 103147223 * 88789957 in 20.9s,
but this wasn't typical.
What surprised me was the variability. One bit length could take 100s or a couple thousand seconds even, and a product that was 1-2 bits longer could return a result in under a minute, sometimes in seconds.
This started cropping up, ironically, right after I posted the thread, whats a man to do?
So I started trying a bunch of things, some of which worked. Shameless as I am, I accepted the challenge. Things weren't perfect but it was going well enough. At that point I hadn't slept in 30~ hours so when I thought I had it I let it run and went to bed. 5 AM comes, I check the program. Still calculating, and way overshot. Fuuuuuuccc...
So here we are now and it's say to safe the worlds not gonna burn if I explain it seeing as it doesn't work, or at least only some of the time.
Others people, much smarter than me, mentioned it may be a means of finding more secure pairs, and maybe so, I'm not familiar enough to know.
For everyone that followed, commented, those who contributed, even the doubters who kept a sanity check on this without whom this would have been an even bigger embarassement, and the people with their pins and tactical dots, thanks.
So here it is.
A few assumptions first.
Assuming p = the product,
a = some prime,
b = another prime,
and r = a/b (where a is smaller than b)
w = 1/sqrt(p)
(also experimented with w = 1/sqrt(p)*2 but I kept overshooting my a very small margin)
x = a/p
y = b/p
1. for every two numbers, there is a ratio (r) that you can search for among the decimals, starting at 1.0, counting down. You can use this to find the original factors e.x. p*r=n, p/n=m (assuming the product has only two factors), instead of having to do a sieve.
2. You don't need the first number you find to be the precise value of a factor (we're doing floating point math), a large subset of decimal values for the value of a or b will naturally 'fall' into the value of a (or b) + some fractional number, which is lost. Some of you will object, "But if thats wrong, your result will be wrong!" but hear me out.
3. You round for the first factor 'found', and from there, you take the result and do p/a to get b. If 'a' is actually a factor of p, then mod(b, 1) == 0, and then naturally, a*b SHOULD equal p.
If not, you throw out both numbers, rinse and repeat.
Now I knew this this could be faster. Realized the finer the representation, the less important the fractional digits further right in the number were, it was just a matter of how much precision I could AFFORD to lose and still get an accurate result for r*p=a.
Fast forward, lot of experimentation, was hitting a lot of worst case time complexities, where the most significant digits had a bunch of zeroes in front of them so starting at 1.0 was a no go in many situations. Started looking and realized
I didn't NEED the ratio of a/b, I just needed the ratio of a to p.
Intuitively it made sense, but starting at 1.0 was blowing up the calculation time, and this made it so much worse.
I realized if I could start at r=1/sqrt(p) instead, and that because of certain properties, the fractional result of this, r, would ALWAYS be 1. close to one of the factors fractional value of n/p, and 2. it looked like it was guaranteed that r=1/sqrt(p) would ALWAYS be less than at least one of the primes, putting a bound on worst case.
The final result in executable pseudo code (python lol) looks something like the above variables plus
while w >= 0.0:
if (p / round(w*p)) % 1 == 0:
x = round(w*p)
y = p / round(w*p)
if x*y == p:
print("factors found!")
print(x)
print(y)
break
w = w + i
Still working but if anyone sees obvious problems I'd LOVE to hear about it.36 -
As you can see from the screenshot, its working.
The system is actually learning the associations between the digit sequence of semiprime hidden variables and known variables.
Training loss and value loss are super high at the moment and I'm using an absurdly small training set (10k sequence pairs). I'm running on the assumption that there is a very strong correlation between the structures (and that it isn't just all ephemeral).
This initial run is just to see if training an machine learning model is a viable approach.
Won't know for a while. Training loss could get very low (thats a good thing, indicating actual learning), only for it to spike later on, and if it does, I won't know if the sample size is too small, or if I need to do more training, or if the problem is actually intractable.
If or when that happens I'll experiment with different configurations like batch sizes, and more epochs, as well as upping the training set incrementally.
Either case, once the initial model is trained, I need to test it on samples never seen before (products I want to factor) and see if it generates some or all of the digits needed for rapid factorization.
Even partial digits would be a success here.
And I expect to create multiple training sets for each semiprime product and its unknown internal variables versus deriable known variables. The intersections of the sets, and what digits they have in common might be the best shot available for factorizing very large numbers in this approach.
Regardless, once I see that the model works at the small scale, the next step will be to increase the scope of the training data, and begin building out the distributed training platform so I can cut down the training time on a larger model.
I also want to train on random products of very large primes, just for variety and see what happens with that. But everything appears to be working. Working way better than I expected.
The model is running and learning to factorize primes from the set of identities I've been exploring for the last three fucking years.
Feels like things are paying off finally.
Will post updates specifically to this rant as they come. Probably once a day.2 -
Still on the primenumbers bender.
Had this idea that if there were subtle correlations between a sufficiently large set of identities and the digits of a prime number, the best way to find it would be to automate the search.
And thats just what I did.
I started with trace matrices.
I actually didn't expect much of it. I was hoping I'd at least get lucky with a few chance coincidences.
My first tests failed miserably. Eight percent here, 10% there. "I might as well just pick a number out of a hat!" I thought.
I scaled it way back and asked if it was possible to predict *just* the first digit of either of the prime factors.
That also failed. Prediction rates were low still. Like 0.08-0.15.
So I automated *that*.
After a couple days of on-and-off again semi-automated searching I stumbled on it.
[1144, 827, 326, 1184, -1, -1, -1, -1]
That little sequence is a series of identities representing different values derived from a randomly generated product.
Each slots into a trace matrice. The results of which predict the first digit of one of our factors, with a 83.2% accuracy even after 10k runs, and rising higher with the number of trials.
It's not much, but I was kind of proud of it.
I'm pushing for finding 90%+ now.
Some improvements include using a different sort of operation to generate results. Or logging all results and finding the digit within each result thats *most* likely to predict our targets, across all results. (right now I just take the digit in the ones column, which works but is an arbitrary decision on my part).
Theres also the fact that it's trivial to correctly guess the digit 25% of the time, simply by guessing 1, 3, 7, or 9, because all primes, except for 2, end in one of these four.
I have also yet to find a trace with a specific bias for predicting either the smaller of two unique factors *or* the larger. But I haven't really looked for one either.
I still need to write a generate that takes specific traces, and lets me mutate some of the values, to push them towards certain 'fitness' levels.
This would be useful not just for very high predictions, but to find traces with very *low* predictions.
Why? Because it would actually allow for the *elimination* of possible digits, much like sudoku, from a given place value in a predicted factor.
I don't know if any of this will even end up working past the first digit. But splitting the odds, between the two unique factors of a prime product, and getting 40+% chance of guessing correctly, isn't too bad I think for a total amateur.
Far cry from a couple years ago claiming I broke prime factorization. People still haven't forgiven me for that, lol.6 -
CONTEST - Win big $$$ straight from Wisecrack!
For all those who participated in my original "cracking prime factorization" thread (and several I decided to add just because), I'm offering a whopping $5 to anyone who posts a 64 bit *product* of two primes, which I cant factor. Partly this is a thank you for putting up with me.
FIVE WHOLE DOLLARS! In 1909 money thats $124 dollars! Imagine how many horse and buggy rides you could buy with that back then! Or blowjobs!
Probably not a lot!
But still.
So the contest rules are simple:
Go to
https://asecuritysite.com/encryptio...
Enter 32 for the number of bits per prime, and generate a 64 bit product.
Post it here to enter the contest.
Products must be 64 bits, and the result of just *two* prime numbers. Smaller or larger bit lengths for products won't be accepted at this time.
I'm expecting a few entries on this. Entries will generally be processed in the order of submission, but I reserve the right to wave this rule.
After an entry is accepted, I'll post "challenge accepted. Factoring now."
And from that point on I have no more than 5 hours to factor the number, (but results usually arrive in 30-60 minutes).
If I fail to factor your product in the specified time (from the moment I indicate I've begun factoring), congratulations, you just won $5.
Payment will be made via venmo or other method at my discretion.
One entry per user. Participants from the original thread only, as well as those explicitly mentioned.
Limitations: Factoring shall be be done
1. without *any* table lookup of primes or equivalent measures, 2. using anything greater than an i3, 3. without the aid of a gpu, 4. without multithreading. 5. without the use of more than one machine.
FINALLY:
To claim your prize, post the original factors of your product here, after the deadline has passed.
And then I'll arrange payment of the prize.
You MUST post the factors of your product after the deadline, to confirm your product and claim your prize.99 -
Officially faster bruteforcing:
https://pastebin.com/uBFwkwTj
Provided toy values for others to try. Haven't tested if it works with cryptographic secure prime pairs (gcf(p, q) == 1)
It's a 50% reduction in time to bruteforce a semiprime. But I also have some inroads to a/30.
It's not "broke prime factorization for good!" levels of fast, but its still pretty nifty.
Could use decimal support with higher precision so I don't cause massive overflows on larger numbers, but this is just a demonstration after all.13 -
Managed to derive an inverse to karatsuba's multiplication method, converting it into a factorization technique.
Offers a really elegant reason for why non-trivial semiprimes (square free products) are square free.
For a demonstration of karatsubas method, check out:
https://getpocket.com/explore/item/...
Now for the reverse, like I said something elegant emerges.
So we can start by taking the largest digit in our product. Lets say our product is 697.
We find all the digits that produce 6 when summed, along with their order.
thats (1,5), (5,1), (2,4), (4,2), and (3,3)
That means for one of our factors, its largest digit can ONLY be 1, 5, 2, 4, or 3.
Lets take karatsubas method at step f (in the link) and reverse it. Instead of subtracting, we're adding.
If we assume (3,3)
Then we take our middle digit of our product p, in this case the middle digit of 697. is 9, and we munge it with 3.
Then we add our remaining 3, and our remaining unit digit, to get 3+39+7 = 49.
Now, because karatsuba's method ONLY deals with multiplication in single digits, we only need to consider *at most* two digit products.
And interestingly, the only factors of 49 are 7.
49 is a square!
And the only sums that produce 7, are (2,5), (5,2), (3,4), and (4,3)
These would be the possible digits of the factors of 697 if we initially chose (3,3) as our starting point for calculating karatsubas inverse f step.
But you see, 25 can't be a factor of p=697, because 25 is a square, and ends in a 5, so its clearly not prime. 52 can't be either because it ends in 2, likewise 34 ending in 4.
Only 43 could be our possible factor of p.
And we *only* get one factor because our starting point has two of the same digit. Which would mean p would have to equal 43 (a prime) or 1. And because p DOESNT (it equals 697), we can therefore say (3,3) is the wrong starting point, as are ALL starting points that share only one digit, or end in a square.
Ergo we can say the products of non-squares, are specifically non-prime precisely because if they *were* prime, their only factors would HAVE to be themselves, and 1.
For an even BETTER explanation go try karatsuba's method with any prime as the first factor, and 1 as the second factor (just multiply the tens column by zero). And you can see why the inverse, where you might try a starting point that has two matching digits (like 3,3), would obviously fail, because the values it produces could only have two factors; some prime thats not our product, or the value one, which is also not our product.
It's elegant almost to the level of a tautology. -
I've been away a while, mostly working 60-70 hour weeks.
Found a managers job and the illusion of low-level stability.
Also been exploring elliptic curve cryptography and other fun stuff, like this fun equation...
i = log(n, 2**0.5)
base = (((int((n/(n*(1-(n/((((abs(int(n+(n/(1/((n/(n-i))+(i+1)))))+i)-(i*2))/1))/1/i)))))*i)-i)+i))
...as it relates to A143975 a(n) = floor(n*(n+3)/3)
Most semiprimes n=pq, where p<q, appear to have values k in the sequence, where k is such that n+m mod k equals either p||q or a multiple thereof.
Tested successfully up to 49 bits and counting. Mostly haven't gone further because of work.
Theres a little more math involved, and I've (probably incorrectly) explained the last bit but the gist is the factorization doesn't turn up anything, *however* trial lookups on the sequence and then finding a related mod yields k instead, which can be used to trivially find p and q.
It has some relations to calculating on an elliptic curve but thats mostly over my head, and would probably bore people to sleep.2 -
After a lot of work, the new factorization algorithm has a search space thats the factorial of (log(log(n))**2) from what it looks like.
But thats outerloop type stuff. Subgraph search (inner loop) doesn't appear to need to do any factor testing above about 97, so its all trivial factors for sequence analysis, but I haven't explored the parameter space for improvements.
It converts finding the factors of a semiprime into a sequence search on a modulus related to
OIS sequence A143975 a(n) = floor(n*(n+3)/3)
and returns a number m such that n=pq, m%p == 0||(p*i), but m%q != 0||(q*k)
where i and k are respective multiples of p and q.
This is similar in principal to earlier work where I discovered that if i = p/2, where n=p*q then
r = (abs(((((n)-(9**i)-9)+1))-((((9**i)-(n)-9)-2)))-n+1+1)
yielding a new number r that shared p as a factor with n, but is coprime with n for q, meaning you now had a third number that you could use, sharing only one non-trivial factor with n, that you could use to triangulate or suss out the factors of n.
The problem with that variation on modular exponentiation, as @hitko discovered,
was that if q was greater than about 3^p, the abs in the formula messes the whole thing up. He wrote an improvement but I didn't undertsand his code enough to use it at the time. The other thing was that you had to know p/2 beforehand to find r and I never did find a way to get at r without p/2
This doesn't have that problem, though I won't play stupid and pretend not to know that a search space of (log(log(n))**2)! isn't an enormous improvement over state of the art,
unless I'm misunderstanding.
I haven't posted the full details here, or sequence generation code, but when I'm more confident in what my eyes are seeing, and I've tested thoroughly to understand what I'm looking at, I'll post some code.
hitko's post I mentioned earlier is in this thread here:
https://devrant.com/rants/5632235/...2 -
So I promised a post after work last night, discussing the new factorization technique.
As before, I use a method called decon() that takes any number, like 697 for example, and first breaks it down into the respective digits and magnitudes.
697 becomes -> 600, 90, and 7.
It then factors *those* to give a decomposition matrix that looks something like the following when printed out:
offset: 3, exp: [[Decimal('2'), Decimal('3')], [Decimal('3'), Decimal('1')], [Decimal('5'), Decimal('2')]]
offset: 2, exp: [[Decimal('2'), Decimal('1')], [Decimal('3'), Decimal('2')], [Decimal('5'), Decimal('1')]]
offset: 1, exp: [[Decimal('7'), Decimal('1')]]
Each entry is a pair of numbers representing a prime base and an exponent.
Now the idea was that, in theory, at each magnitude of a product, we could actually search through the *range* of the product of these exponents.
So for offset three (600) here, we're looking at
2^3 * 3 ^ 1 * 5 ^ 2.
But actually we're searching
2^3 * 3 ^ 1 * 5 ^ 2.
2^3 * 3 ^ 1 * 5 ^ 1
2^3 * 3 ^ 1 * 5 ^ 0
2^3 * 3 ^ 0 * 5 ^ 2.
2^3 * 3 ^ 1 * 5 ^ 1
etc..
On the basis that whatever it generates may be the digits of another magnitude in one of our target product's factors.
And the first optimization or filter we can apply is to notice that assuming our factors pq=n,
and where p <= q, it will always be more efficient to search for the digits of p (because its under n^0.5 or the square root), than the larger factor q.
So by implication we can filter out any product of this exponent search that is greater than the square root of n.
Writing this code was a bit of a headache because I had to deal with potentially very large lists of bases and exponents, so I couldn't just use loops within loops.
Instead I resorted to writing a three state state machine that 'counted down' across these exponents, and it just works.
And now, in practice this doesn't immediately give us anything useful. And I had hoped this would at least give us *upperbounds* to start our search from, for any particular digit of a product's factors at a given magnitude. So the 12 digit (or pick a magnitude out of a hat) of an example product might give us an upperbound on the 2's exponent for that same digit in our lowest factor q of n.
It didn't work out that way. Sometimes there would be 'inversions', where the exponent of a factor on a magnitude of n, would be *lower* than the exponent of that factor on the same digit of q.
But when I started tearing into examples and generating test data I started to see certain patterns emerge, and immediately I found a way to not just pin down these inversions, but get *tight* bounds on the 2's exponents in the corresponding digit for our product's factor itself. It was like the complications I initially saw actually became a means to *tighten* the bounds.
For example, for one particular semiprime n=pq, this was some of the data:
n - offset: 6, exp: [[Decimal('2'), Decimal('5')], [Decimal('5'), Decimal('5')]]
q - offset: 6, exp: [[Decimal('2'), Decimal('6')], [Decimal('3'), Decimal('1')], [Decimal('5'), Decimal('5')]]
It's almost like the base 3 exponent in [n:7] gives away the presence of 3^1 in [q:6], even
though theres no subsequent presence of 3^n in [n:6] itself.
And I found this rule held each time I tested it.
Other rules, not so much, and other rules still would fail in the presence of yet other rules, almost like a giant switchboard.
I immediately realized the implications: rules had precedence, acted predictable when in isolated instances, and changed in specific instances in combination with other rules.
This was ripe for a decision tree generated through random search.
Another product n=pq, with mroe data
q(4)
offset: 4, exp: [[Decimal('2'), Decimal('4')], [Decimal('5'), Decimal('3')]]
n(4)
offset: 4, exp: [[Decimal('2'), Decimal('3')], [Decimal('3'), Decimal('2')], [Decimal('5'), Decimal('3')]]
Suggesting that a nontrivial base 3 exponent (**2 rather than **1) suggests the exponent on the 2 in the relevant
digit of [n], is one less than the same base 2 digital exponent at the same digit on [q]
And so it was clear from the get go that this approach held promise.
From there I discovered a bunch more rules and made some observations.
The bulk of the patterns, regardless of how large the product grows, should be present in the smaller bases (some bound of primes, say the first dozen), because the bulk of exponents for the factorization of any magnitude of a number, overwhelming lean heavily in the lower prime bases.
It was if the entire vulnerability was hiding in plain sight for four+ years, and we'd been approaching factorization all wrong from the beginning, by trying to factor a number, and all its digits at all its magnitudes, all at once, when like addition or multiplication, factorization could be done piecemeal if we knew the patterns to look for.7 -
Is there any exact way to get the product of all primes under n multiplied together, without explicitly knowing what those primes are?
Lets call this number V.
Because hypothethetically, if we calculate from the *base* of V, then we can derive easy divisibility rules for V-1 and V+1, as laid out
here:
https://notaboutapples.wordpress.com/...
And then, unless I've misunderstood something, the problem of factorization has been changed from division into an addition and subtraction problem.12 -
I may have accidentally found a legit factorization method that converts factoring to a combinatorics problem over a graph, with a time complexity that is the factorial of the logarithm of the semi prime being factored.
I don't know if this is supposed to be good or not, and I don't want to post it prematurely like I almost always do. Not at least until I study its properties better, but it's still a pretty interesting find I think.11 -
Looking for someone to test a new factorization script I wrote.
https://pastebin.com/Td2XTKe6
Tested against a set of products from all primes under 1000. Worked even on numbers up to 87954921289
Worked for about 66% of the products tested.
Obviously I'm cheating a little bit because I'm checking for four conditions n%a == 0, n%a == 1, n%b == 0, and n%b == 1
It appears it is possible to generate the series from just the product, and then factor each result. The last factor in each each set of factors becomes x, and we do p%x and check for zero.
if it works, we've found our answer.
Kind of wonky but basically what its doing is taking p, tacking on a 0 to the right, and then tacking p to the right of *that*.
So if you had a product like
314
The starting number we look at is
3140314
The middle digit becomes i, and the unit digit becomes j.
Don't know why it works more often then not, and don't know if it would really be any faster.
Just think it's cool.9 -
How resource calculations for software services like code analysis, monitoring, etc are done:
Opening fridge, putting all the beer one can find in it.
Opening the necessary tools, e.g Excel, Accounting software, ....
Drinking the first beer.
Starting to aggregate the monthly costs - cause you can never trust the reports written by someone else...
First beer poof.
Looking at the monthly cost, adding columns "Intended use", "Actual usage pattern", "Usage factor"...
Opening next beer...
Usage factor is btw a factor of 0.1 ... 1.0 - to give an estimate how much the products feature are actually used, for further analysis if the invest is justified or not...
Oh. Another half bottle gone...
Filling in the columns...
Oh. Bottle empty and the next one toooooooooooooooo...
*burping*
*cracking finger joints*
Now let's get to the sad part...
Next worksheet, adding infrastructure costs...
Cost and description as columns.
Hehe. Column sounds like gollum.
Another beer...
Ugh. Need the paper reports, manually typing off things for stuff that was e.g. tax deductible.
Many beers die during this task. Poor little beers, dying for such an boring and mundane task...
SUM is a real useful function. I don't think I can add numbers anymore.
Now we can add another sheet.
Hehe. Sheet sounds like shit. And yes, everything in this file is shit.
Summing up costs from both sheets and including the cost factor from 1
... Beeeeeeeer Beeeeer beer we need more beer here... Beer beer beer...
Where was I. Oh yeah. Cost factorization total vs effective.
Why do I want to get even more drunk.
Oh yeah. Most software is completely underused and the costs aren't justified.
Let's add some colored highlighting ...
Uuuuh. ,Too much red. Better change the highlights.
Too much red.
More beer.
Don't give a fuck.
Hm.
Time for some whiskey.
What else is there to do....
Oh yeah.
Diagrams.
The bloody wankers from accounting need diagrams as numbers are too boring.
Not that everything in accounting is boring, no matter how much you paint colors on it... *sigh*
Hm. More whiskey...
Hehe. Whiskey rhymes with frisky.
Uff. Now just need to write mail. Mail mail mail....
"Copy paste the last mail from last month"
Hm.
Ah.
*sipping whiskey*
Spell check extension - to the rescue.
Thesaurus *burps*.
Let's change a few words here and there... Maybe another paragraph there.
Uh....
Trying to attach file...
*fucking mouse is pretty constantly crashing into empty beer bottles*
Done.
Damn.
Need to press send button.
*Creating mess on the desk by just randomly crashing the beer bottles*
Done.
*Pressing computers power button*
Mwahahahaha. No mouse needed.
*regretting to stand up too quickly, nearly barfing on the floor*
Couch ... Where Couch...
After hitting several doors, frames and other stuff, the glorious mission ended successfully with a most graciously executed gut buster on the couch.
(Regretting next morning to have emptied two 6 packs and a few glasses of whiskey) -
Math question time!
Okay so I had this idea and I'm looking for anyone who has a better grasp of math than me.
What if instead of searching for prime factors we searched for a number above p?
One with a certain special property. BEAR WITH ME. I know I make these posts a lot and I'm a bit of a shitposter, but I'm being genuine here.
Take this cherry picked number, 697 for example.
It's factors are 17, and 41. It's trivial but just for demonstration.
If we represented it's factors as a bit string, where each bit represents the index that factor occurs at in a list of primes, it looks like this
1000001000000
When converted back to an integer that number becomes 4160, which we will call f.
And if we do 4160/(2**n) until the result returns
a fractional component, then N in this case will be 7.
And 7 is the index of our lowest factor 17 (lets call it A, and our highest factor we'll call B) in our primes list.
So the problem is changed from finding a factorization of p, to finding an algorithm that allows you to convert p into f. Once you have f it's a matter of converting it to binary, looking up the indexes of all bits set to 1, and finding the values of those indexes in the list of primes.
I'm working on doing that and if anyone has any insights I'm all ears.9 -
https://i.postimg.cc/4ycRFNZf/...
The factorization shit I'm always ranting about? I decided for once to explain it visually in this handy dandy little infographic.
We're essentially transforming the product from an unsmooth set of potential factors in its factor tree, to a factorization tree that guarantees first that the set of potential factors are all 2, 3, 5, and a or b of p, and second, that all the factors are *smooth integers* of a or b.
This is basically what Adi Shamir was trying to do with TWINKLE and TWIRL, despite checking a hundred thousand+ potential primes.
I did it in four.7 -
So I made a couple slight modifications to the formula in the previous post and got some pretty cool results.
The original post is here:
https://devrant.com/rants/5632235/...
The default transformation from p, to the new product (call it p2) leads to *very* large products (even for products of the first 100 primes).
Take for example
a = 6229, b = 10477, p = a*b = 65261233
While the new product the formula generates, has a factor tree that contains our factor (a), the product is huge.
How huge?
6489397687944607231601420206388875594346703505936926682969449167115933666916914363806993605...
and
So huge I put the whole number in a pastebin here:
https://pastebin.com/1bC5kqGH
Now, that number DOES contain our example factor 6229. I demonstrated that in the prior post.
But first, it's huge, 2972 digits long, and second, many of its factors are huge too.
Right from the get go I had hunch, and did (p2 mod p) and the result was surprisingly small, much closer to the original product. Then just to see what happens I subtracted this result from the original product.
The modification looks like this:
(p-(((abs(((((p)-(9**i)-9)+1))-((((9**i)-(p)-9)-2)))-p+1)-p)%p))
The result is '49856916'
Thats within the ballpark of our original product.
And then I factored it.
1, 2, 3, 4, 6, 12, 23, 29, 46, 58, 69, 87, 92, 116, 138, 174, 276, 348, 667, 1334, 2001, 2668, 4002, 6229, 8004, 12458, 18687, 24916, 37374, 74748, 143267, 180641, 286534, 361282, 429801, 541923, 573068, 722564, 859602, 1083846, 1719204, 2167692, 4154743, 8309486, 12464229, 16618972, 24928458, 49856916
Well damn. It's not a-smooth or b-smooth (where 'smoothness' is defined as 'all factors are beneath some number n')
but this is far more approachable than just factoring the original product.
It still requires a value of i equal to
i = floor(a/2)
But the results are actually factorable now if this works for other products.
I rewrote the script and tested on a couple million products and added decimal support, and I'm happy to report it works.
Script is posted here if you want to test it yourself:
https://pastebin.com/RNu1iiQ8
What I'll do next is probably add some basic factorization of trivial primes
(say the first 100), and then figure out the average number of factors in each derived product.
I'm also still working on getting to values of i < a/2, but only having sporadic success.
It also means *very* large numbers (either a subset of them or universally) with *lots* of factors may be reducible to unique products with just two non-trivial factors, but thats a big question mark for now.
@scor if you want to take a look.5 -
I'm teaching a couple of classes where students (~18 years old) work on their own projects. I just deleted two of those from my machine: one Angular and one Spring Boot, but just boilerplate. Together, they were about 500 MB. I spent 2-3 hours working on a little Go tool to make concurrent HTTP requests and to report statistics on the response time. The entire repository is roughly 500 kB in size, but solves a genuine problem. My students have a bloat ratio of 1000 compared to me as a baseline, but my stuff actually does things. Today, I programmed prime factorization in PHP for some load tests (mod_php vs. PHP-FPM). The PHP script is 1148 bytes long (but the file system reports 4 kB). My students could learn more from such a script than from their overblown "projects", but "PHP sucks" I hearsay, so let's bloat on.11
-
Let some number P be the product of two factors, A and B
Let the iterations of A*1..2..3..N up to B+1 be a directed graph.
Would this graph be eularian?
If so then it should be possible to use the BEST algorithm to count the number of eularian circuits, yielding B, no?
Edit: this is supported by the following text:
An arborescence can equivalently be defined as a rooted digraph in which the path from the root to any other vertex is unique.
* * *
Where the product of any two primes is unique, the path to it across our graph here, is also unique. -
Hang with me! This is *not* a math shitpost, I repeat, it is NOT a math shitpost, not entirely anyway.
It appears there is for products of two non-trivial factors, a real number n (well a rational number anyway) such that p/n = i (some number in the set of integers), whos factor chain is apparently no greater than floor(log(log(p))**2)-2, and whos largest factor is never greater than p^(1/4).
And that this number is at least derivable, laboriously with the following:
where p=a*b
https://pastebin.com/Z4thebha
And assuming you have the factors of p/z = jkl..
then instead of doing
p/(jkl..) = z
you can do
p-(jkl) to get the value of [result] whos index is a-1
Getting the actual factor tree of p/z is another matter, but its a start.
Edit: you have to provide your own product.
Preferably import Decimal first.3 -
I am trying to decompose a 3D matrix using python library scikit-tensor. I managed to decompose my Tensor (with dimensions 100x50x5) into three matrices. My question is how can I compose the initial matrix again using the decomposed matrix produced with Tensor factorization? I want to check if the decomposition has any meaning. My code is the following:
import logging
from scipy.io.matlab import loadmat
from sktensor import dtensor, cp_als
import numpy as np
//Set logging to DEBUG to see CP-ALS information
logging.basicConfig(level=logging.DEBUG)
T = np.ones((400, 50))
T = dtensor(T)
P, fit, itr, exectimes = cp_als(T, 10, init='random')
// how can I re-compose the Matrix T? TA = np.dot(P.U[0], P.U[1].T)
I am using the canonical decomposition as provided from the scikit-tensor library function cp_als. Also what is the expected dimensionality of the decomposed matrices.1