abhgh 9 hours ago

Note that the website (ai-contest.com) the post links to seems to have been hijacked by a gambling site.

For the use-cases where Genetic Programming was popular, I would recommend looking at Bayesian Optimization (bayesopt) as an alternative today (I know I keep recommending the area - but I hope I only do so when it is relevant :-)). This is mostly because IMHO it has a principled foundation that has been productively developed further in the past few years. Here's a good book on the topic [1], and I've written a tutorial as well [2]. Interestingly, one of the books I encountered when reading up on Genetic Algorithms years ago was by Melanie Mitchell [3]!

Bayesopt, Genetic Programming, or any other search algorithm that can operate over non-differentiable objective functions is very useful in practice - for example, when performing model selection in the space of hyperparameters of a model that is not differentiable, such as a traditional Decision Tree [4], or in exotic use-cases like molecule discovery [5].

You can try out bayesopt using the botorch or hyperopt libraries. The latter implements only one specific bayesopt algorithm, which was (and still is) popular, but seems to have been bettered of late [4].
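
To make that concrete, here is a minimal sketch of the hyperparameter-tuning use-case above, using hyperopt's TPE algorithm to tune a scikit-learn Decision Tree (the dataset, search space, and evaluation budget are arbitrary choices for illustration):

    from hyperopt import fmin, hp, tpe
    from sklearn.datasets import load_breast_cancer
    from sklearn.model_selection import cross_val_score
    from sklearn.tree import DecisionTreeClassifier

    X, y = load_breast_cancer(return_X_y=True)

    # Integer-valued search space via quantized uniform distributions.
    space = {
        "max_depth": hp.quniform("max_depth", 1, 20, 1),
        "min_samples_leaf": hp.quniform("min_samples_leaf", 1, 20, 1),
    }

    # hyperopt minimizes, so return the negative cross-validated accuracy.
    def objective(params):
        clf = DecisionTreeClassifier(
            max_depth=int(params["max_depth"]),
            min_samples_leaf=int(params["min_samples_leaf"]))
        return -cross_val_score(clf, X, y, cv=5).mean()

    best = fmin(fn=objective, space=space, algo=tpe.suggest, max_evals=50)
    print(best)  # best hyperparameters found, e.g. {'max_depth': 4.0, ...}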

[1] https://bayesoptbook.com/

[2] Part 1 https://blog.quipu-strands.com/bayesopt_1_key_ideas_GPs

[3] Found a free copy online https://www.boente.eti.br/fuzzy/ebook-fuzzy-mitchell.pdf

[4] "... Analysis of the Black-Box Optimization Challenge 2020" https://proceedings.mlr.press/v133/turner21a.html

[5] ChemBO is an example but there are others https://proceedings.mlr.press/v108/korovina20a.html

  • jmmcd 4 hours ago

    No, you're confusing GA with GP.

    • abhgh 4 hours ago

      You're right! My bad. Thank you for pointing it out! Leaving my comment up for info on bayesopt.

mark_l_watson 2 hours ago

Nice writeup - short, so it is more approachable than John Koza's classic book on GP. That said, if GP looks useful to you, eventually read Koza's book, or at least experiment with his Common Lisp code.

Also don’t confuse Genetic Algorithms (GA) with GP.

hintymad 12 hours ago

Naive question: what are the most suitable problems for Genetic Programming to solve, given that machine learning, especially deep learning, is all the rage now? Or do we have models that integrate genetic programming into deep learning?

  • bob1029 5 hours ago

    Real-time and energy-constrained problems would be the top use cases in my mind given the current state of affairs. Beyond this, problems that are non-differentiable or where we desire strong symbolic interpretation. I think these techniques also tend to extrapolate more robustly than DNNs. Technically, any problem where algorithmic shortcuts like the chain rule of calculus break down.

    I've been working on GP architectures that use short linear program tapes as part of their genome. Executing these programs can be done very quickly. Assuming the program tape is <=10kb in length and you are constraining to ~a megacycle in the interpreter, you can execute these programs 100k~1m times per second on a chunky workstation. The problems you can solve in this time & space are non-trivial. Once you evolve the desired program, you can write the interpreter for it in any language on any platform within half an hour.
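
    To give a flavor of what such a tape might look like, here is a toy Python sketch (the instruction set, register count, and encoding below are arbitrary illustrative choices, not the actual architecture; a production interpreter would be a tight loop in C or similar):

        import random

        # Toy instruction set; each tape entry is (op, dst, src_a, src_b).
        OPS = {
            "add": lambda a, b: a + b,
            "sub": lambda a, b: a - b,
            "mul": lambda a, b: a * b,
        }

        def random_tape(length, n_regs=8):
            return [(random.choice(list(OPS)),
                     random.randrange(n_regs),
                     random.randrange(n_regs),
                     random.randrange(n_regs))
                    for _ in range(length)]

        def run_tape(tape, inputs, n_regs=8):
            regs = [0.0] * n_regs
            regs[:len(inputs)] = inputs    # load inputs into the low registers
            for op, dst, a, b in tape:     # straight-line execution, no branches
                regs[dst] = OPS[op](regs[a], regs[b])
            return regs[0]                 # convention: output in register 0

    Mutation and crossover then operate directly on the instruction list, and fitness is just run_tape scored against the objective.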

    Training is definitely going to be expensive, but it can fit on one powerful machine. You don't need to keep a terabyte of data around in VRAM to make reasonable rates of progress. Paging out parts of the population to disk or S3 buckets is feasible with these techniques. It subdivides very nicely into however many islands you can afford to run.

    Inference is essentially free and could run on the processor in your toaster oven.

    These approaches can make CUDA & friends look like a Rube Goldberg machine by comparison. Being able to explain how the entire thing works to a peer in a single afternoon is perhaps another significant feature of these designs.

    At the end of the day, there is a pretty good reason these techniques aren't popular - there aren't any quadratic scaling laws that constrain the architecture of GP, only the much scarier exponential ones. I contend that the search space of all linear programs is viable, but I still don't have any proof that this is the case after chasing the rabbit for about a year. My central argument is that there are many high-quality programs out there; we just need to find one of them. I think the arguments against this tend to unfairly cast the problem as searching for a single needle in a haystack of 10^10000.

  • ggerules 10 hours ago

    Genetic Programming and the wider field of Evolutionary Computation and Genetic Algorithms are really good at some kinds of optimization.

    If the problem fits into a tree form, then Genetic Programming is your best friend.

    If the problem can be encoded onto a set of substrings of 0s and 1s, then Genetic Algorithms are your best friend.

    Likewise, if your problem can be encoded using floating point numbers, then Evolutionary Strategies are your best friend.
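
    To make the bitstring case concrete, here is a minimal GA sketch for the classic one-max problem (population size, mutation rate, and the other settings are arbitrary toy choices):

        import random

        POP, BITS, GENS, MUT = 50, 32, 100, 0.02

        def fitness(ind):                  # one-max: count the 1 bits
            return sum(ind)

        pop = [[random.randint(0, 1) for _ in range(BITS)] for _ in range(POP)]
        for gen in range(GENS):
            pop.sort(key=fitness, reverse=True)
            if fitness(pop[0]) == BITS:    # perfect bitstring found
                break
            parents = pop[:POP // 2]       # truncation selection
            children = []
            while len(parents) + len(children) < POP:
                a, b = random.sample(parents, 2)
                cut = random.randrange(1, BITS)                 # one-point crossover
                children.append([bit ^ (random.random() < MUT)  # bit-flip mutation
                                 for bit in a[:cut] + b[cut:]])
            pop = parents + children
        print(fitness(pop[0]), pop[0])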

    Now... I could ruffle some feathers by saying that Genetic Algorithms and Evolutionary Strategies aren't really all that different - the same algorithms could be used for both. But historically they arrived at roughly the same time from different places on earth: Illinois vs. Germany.

    Back to GP. The cool thing about GP is that when the solution is evolved, you have THE solution to the problem - no futzing about with how to interpret the results or what they mean.

    A big problem in the past was that GP doesn't scale well the way neural networks do. It is not embarrassingly parallel. It has been very limited by the types of hardware/architectures available.

    But GP is a great field of exploration!

    • italodev 7 hours ago

      Fitness evaluation can be massively parallel and scales easily.

  • dimatura 7 hours ago

    Note that genetic programming is a specific subset of genetic algorithms that focuses on searching for "programs" (hence the name), typically encoded in the form of a tree structure similar to an AST, though other representations exist. In theory, you could use GP for almost any situation where you'd want to synthesize a mathematical expression or piece of code for which you have a criterion to optimize (i.e. "fitness"). In practice, since GP is basically just semi-randomly searching a huge combinatorial space, it tends to work best on low-dimensional problems, ideally with a fitness function that is cheap to evaluate. It can work nicely for finding nonlinear symbolic formulas for regression, for example, as sketched below. And there have been other cool results over the years - Hod Lipson, for instance, has published several striking ones in robotics.
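
    As a concrete illustration of the tree encoding, here is a toy sketch of random expression trees for symbolic regression (the operator set, depth limit, and target function are arbitrary choices; a real GP system would layer crossover, mutation, and selection on top):

        import operator
        import random

        # Internal nodes are (fn, symbol, left, right); leaves are "x" or constants.
        OPS = [(operator.add, "+"), (operator.sub, "-"), (operator.mul, "*")]

        def random_tree(depth=3):
            if depth == 0 or random.random() < 0.3:
                return random.choice(["x", round(random.uniform(-1, 1), 2)])
            fn, sym = random.choice(OPS)
            return (fn, sym, random_tree(depth - 1), random_tree(depth - 1))

        def evaluate(tree, x):
            if tree == "x":
                return x
            if isinstance(tree, float):
                return tree
            fn, _, left, right = tree
            return fn(evaluate(left, x), evaluate(right, x))

        # Fitness for symbolic regression: mean squared error against a
        # target function, here x^2 + x.
        def mse(tree, xs):
            return sum((evaluate(tree, x) - (x * x + x)) ** 2 for x in xs) / len(xs)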

    Until a few years ago, popular deep learning methods like CNNs weren't great at that kind of thing, but LLMs definitely changed that - it's safe to say that, by drawing on huge amounts of data, LLMs are much better at most practical programming tasks. They're not necessarily a complete replacement, though; there's definitely space for hybrid approaches, e.g. https://arxiv.org/abs/2401.07102 or https://www.nature.com/articles/s41586-023-06924-6.

  • noosphr 10 hours ago

    Areas where the problem space is continuous but the solution model isn't differentiable.

    An example would be a neural network over a discrete field.

    Back at university I played around with boolean neural networks, and this was one way of training them directly.
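
    As a toy sketch of the idea: a tiny threshold network with weights restricted to {-1, +1} (so there is nothing to differentiate), trained on XOR by a simple mutate-and-select loop. The architecture and parameters here are made up for illustration:

        import random

        # 2 inputs -> 2 hidden units -> 1 output; every weight is -1 or +1.
        N_WEIGHTS = 6  # 4 hidden-layer weights + 2 output weights

        def forward(w, x1, x2):
            h1 = int(w[0] * x1 + w[1] * x2 >= 1)
            h2 = int(w[2] * x1 + w[3] * x2 >= 1)
            return int(w[4] * h1 + w[5] * h2 >= 1)

        XOR = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

        def fitness(w):
            return sum(forward(w, *x) == t for x, t in XOR)

        # (1+1)-style evolution: flip weight signs at random, keep if no worse.
        best = [random.choice([-1, 1]) for _ in range(N_WEIGHTS)]
        while fitness(best) < len(XOR):
            child = [-wi if random.random() < 0.2 else wi for wi in best]
            if fitness(child) >= fitness(best):
                best = child
        print(best, fitness(best))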

  • friendzis 6 hours ago

    Definitely not an expert here, but the main difference is that in machine learning the goal is to find/optimize an algorithm to fit a given dataset, whereas in genetic programming the goal is to find/optimize a dataset to fit a given algorithm. There is a lot of overlap, though.

    EDIT: > what are the most suitable problems for Genetic Programming to solve

    Given the above, genetic algorithms are really suitable when bounded execution time is part of your problem domain. You perform n iterations of the genetic algorithm, and even if the result does not fully satisfy the fitness function, you still get some result that is closer to it than a wild guess.

    • jmmcd 5 hours ago

      > in genetic programming the goal is to find/optimize a dataset to fit a given algorithm

      No. Possibly you're confused between GAs and GP, a common confusion. In GP, the goal is to find an algorithm - in a real programming language, not as weights - to optimise an objective function. Often that is specified by input-output pairs, similar to supervised learning.

  • stargrazer 12 hours ago

    I suppose with Genetic Programming, given an appropriate set of descriptive symbols, it is relatively easy to understand the final result and to intuit whether there is any over-fitting involved. On the other hand, machine learning results are typically a black box; the weights involved do not easily lend themselves to understanding the nuances of the solution.

  • krapht 12 hours ago

    Big combinatorial problems still use genetic algorithms. Specifically I know it's still used for logistics routing problems that are too big for solvers like Gurobi.

    Deep learning on graphs is unfortunately still a little underwhelming.

artful314w 13 hours ago

The ai-contest.com links go to a strange, irrelevant, and suspicious website.

  • frainfreeze 9 hours ago

    Scammers, spammers, online casinos, porn sites: all of them often take over expired domains to host their trash and collect stray traffic.

ggerules 10 hours ago

Nice work! Thanks for posting.

Cogito 11 hours ago

This is from 2011, FYI.

ribs 12 hours ago

Last I checked, genetic programming wasn’t promising, and I’m a little surprised to see people paying attention to it here.

OTOH that was similar to what people were saying about hidden layers, so YMMV