Join devRant
Do all the things like
++ or -- rants, post your own rants, comment on others' rants and build your customized dev avatar
Sign Up
Pipeless API
From the creators of devRant, Pipeless lets you power real-time personalized recommendations and activity feeds using a simple API
Learn More
Search - "mutation testing"
-
Has anyone else actually *used* mutation testing at all?
Heard a lot about it recently - it seems all the rage amongst the bloggers, but I'm generally always very sceptical of things touted as the "latest hotness" (my thoughts on blockchain for instance are well known.)
So I went ahead and whacked http://pitest.org/ into one of my more recent pet projects to see if it offered anything decent. Surprisingly, it did - in particular it caught a number of places where switching "<" for "<=" and similar had no affect on the pass / fail rate (indicating the tests should be better.) There were a *few* false positives, and some which were borderline useful, but as a whole I'd say it was a worthwhile addition.
Curious as to if anyone else has had the same experience?1 -
I had the idea that part of the problem of NN and ML research is we all use the same standard loss and nonlinear functions. In theory most NN architectures are universal aproximators. But theres a big gap between symbolic and numeric computation.
But some of our bigger leaps in improvement weren't just from new architectures, but entire new approaches to how data is transformed, and how we calculate loss, for example KL divergence.
And it occured to me all we really need is training/test/validation data and with the right approach we can let the system discover the architecture (been done before), but also the nonlinear and loss functions itself, and see what pops out the other side as a result.
If a network can instrument its own code as it were, maybe it'd find new and useful nonlinear functions and losses. Networks wouldn't just specificy a conv layer here, or a maxpool there, but derive implementations of these all on their own.
More importantly with a little pruning, we could even use successful examples for bootstrapping smaller more efficient algorithms, all within the graph itself, and use genetic algorithms to mix and match nodes at training time to discover what works or doesn't, or do training, testing, and validation in batches, to anneal a network in the correct direction.
By generating variations of successful nodes and graphs, and using substitution, we can use comparison to minimize error (for some measure of error over accuracy and precision), and select the best graph variations, without strictly having to do much point mutation within any given node, minimizing deleterious effects, sort of like how gene expression leads to unexpected but fitness-improving results for an entire organism, while point-mutations typically cause disease.
It might seem like this wouldn't work out the gate, just on the basis of intuition, but I think the benefit of working through node substitutions or entire subgraph substitution, is that we can check test/validation loss before training is even complete.
If we train a network to specify a known loss, we can even have that evaluate the networks themselves, and run variations on our network loss node to find better losses during training time, and at some point let nodes refer to these same loss calculation graphs, within themselves, switching between them dynamically..via variation and substitution.
I could even invision probabilistic lists of jump addresses, or mappings of value ranges to jump addresses, or having await() style opcodes on some nodes that upon being encountered, queue-up ticks from upstream nodes whose calculations the await()ed node relies on, to do things like emergent convolution.
I've written all the classes and started on the interpreter itself, just a few things that need fleshed out now.
Heres my shitty little partial sketch of the opcodes and ideas.
https://pastebin.com/5yDTaApS
I think I'll teach it to do convolution, color recognition, maybe try mnist, or teach it step by step how to do sequence masking and prediction, dunno yet.6 -
Our pipeline had been running for 45 minutes, I went to investigate
Our mutation testing framework had identified 90 potential mutants and started running through our 342 tests for every single mutant, so 90 * 342 tests
And I had been complaining about the amount of pipeline agents being too low and pipelines blocking up...5 -
I wish everyone would move away from code coverage as a metric and towards some kind of mutation framework.
There seem to be increasing numbers of devs getting themselves off on their "shiny 95% coverage" and patting themselves on the back for covering everything but the 5% of the codebase that actually needs thorough testing.
Oh, and that's ignoring the tests that just assert an exception isn't thrown, or don't assert anything at all. Completely bloody useless, but hey, you just carry on boasting how great all your tests are because you've got a higher coverage than the team next door 😤🙄3