Sat, 04 Jan 2003 05:52:09 -0800
Yes, if you have a huge amount of space and time resources available, you can start your system with a blank slate -- nothing but a very simple learning algorithm, and let it learn how to learn, learn how to structure its memory, etc. etc. etc.
This is pretty much what OOPS does, and what is suggested in Marcus Hutter's related work.
It is not a practical approach, in my view. My belief is that, given realistic resource constraints, you can't take such a general approach; you have to start the system off with specific learning methods and, beyond that, with a collection of functionally specialized combinations of learning algorithms.
I could be wrong of course but I have seen no evidence to the contrary, so far...
A fixed collection of methods won't scale: the power of a method should correspond to the generality (predictive power) of a pattern. The whole point of such pattern-specific & level-specific scaling of methods IS computational efficiency: it's a lot less expensive to incrementally scale methods for individual patterns than to indiscriminately apply a fixed set of them to patterns, most of which are either too complex or too simple for any given method.
To select formulas you must have an implicit criterion; why not try to make it explicit? I don't believe we need complex math for AI: complex methods can't be universal; generalization is a reduction. What we need is an autonomously scalable method.
Well, if you know some simple math that is adequate for deriving a practical AI design, please speak up. Point me to the URL where you've posted the paper containing this math! I'll be very curious to read it!!!! ;-)
We both know that there is no practical general AI yet; I'm trying to suggest a theoretically consistent one. Given that the whole endeavor is context-free, it should ultimately be the same thing. I don't have any papers; when the theory is finished I'll write a program, not a paper.
My method is ultimately simple: sequential expansion of search for correlations of sequentially increasing arithmetic power/derivation, for inputs which had above-average compression over the shorter range of search / lower arithmetic power/derivation. What's new here (correct me if I'm wrong) is how I define compression, which determines the value of a pattern, & how I encode these patterns to preserve restorability & enable analytical comparison (between individual variable types within patterns). Both are necessary to selectively scale the search, & I don't see them in OOPS.
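A very rough sketch of just the "selective expansion of search range" part, to make the control flow concrete. Everything in it is an illustrative assumption rather than the actual method: `match` is a placeholder stand-in for the compression measure (which is not defined in this message), `selective_search` and `max_range` are hypothetical names, and the increase in arithmetic power/derivation is not modeled at all.

```python
def match(a, b):
    # Placeholder compression measure (assumption): the magnitude that two
    # non-negative inputs have in common.
    return min(a, b)


def selective_search(inputs, max_range=3):
    """Compare each surviving input with the input `rng` positions back.

    Only inputs whose match at the current range is above the average
    for that range are compared again at the next, longer range.
    Returns a list of (range, average_match, surviving_indices).
    """
    survivors = list(range(len(inputs)))
    history = []
    for rng in range(1, max_range + 1):
        # Score only survivors that have a counterpart `rng` positions back.
        scores = {i: match(inputs[i], inputs[i - rng])
                  for i in survivors if i >= rng}
        if not scores:
            break
        avg = sum(scores.values()) / len(scores)
        # Selective scaling: extend the search only for above-average inputs.
        survivors = [i for i, s in scores.items() if s > avg]
        history.append((rng, avg, survivors))
    return history


if __name__ == "__main__":
    for rng, avg, kept in selective_search([5, 6, 5, 1, 0, 7, 8, 7]):
        print(rng, round(avg, 2), kept)
```

With this toy input the set of surviving indices shrinks at each longer range, so the more expensive comparisons are spent only on inputs that already compressed well at the shorter range.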
It's in my introduction, someplace, but I realize it must be mental torture to try to figure it out. Why would you work on it? Only if you agree with my theoretical assumptions, I suppose; the method is uniquely consistent with them.