Thursday, August 28, 2014

Distraction and focus of attention again

Each layer of the Asa H hierarchy passes a vector up to the next layer.  Perhaps focus of attention could be obtained in the following way: calculate the average and standard deviation of all the vector components (assuming all components are positive), keep only those components that are "a couple" of standard deviations above the average, delete all other components, and renormalize the vector. Report a zero vector if no components survive this test. (What number should "a couple" really be? Should it vary?) I plan to try this on Asa H.
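
Here is a rough sketch of that calculation in Python (illustrative only, not the actual Asa H code; the function name and the default k = 2 are placeholders):

```python
import numpy as np

def focus_attention(v, k=2.0):
    """Keep only components 'k' standard deviations above the average.

    Assumes all components of v are nonnegative.  k is the tunable
    "a couple" of standard deviations.
    """
    v = np.asarray(v, dtype=float)
    mask = v >= v.mean() + k * v.std()          # components that stand out
    if not mask.any():
        return np.zeros_like(v)                 # report a zero vector
    focused = np.where(mask, v, 0.0)            # delete all other components
    return focused / np.linalg.norm(focused)    # renormalize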

Wednesday, August 27, 2014

A separate training phase in Asa H

We can give Asa H distinct training and performance stages by altering thresholds like Th2 (line 75 of my code in the blog of 10 Feb. 2011). A casebase can be recorded while using one value of Th2 (code from the 26 Aug. 2013 blog) and then employed by an agent using a different value of Th2 (and possibly other thresholds).
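
A toy illustration of the idea (not the actual Asa H code; here Th2 is treated as a similarity threshold for matching new input against stored cases, which may differ from its role at line 75 of the 10 Feb. 2011 code):

```python
import numpy as np

class CaseBase:
    """Toy stand-in for an Asa H casebase gated by a single threshold."""
    def __init__(self, th2):
        self.th2 = th2            # similarity threshold (Th2)
        self.cases = []

    def observe(self, v):
        """Merge v into the closest stored case if similar enough,
        otherwise record it as a new case."""
        v = np.asarray(v, dtype=float)
        for i, c in enumerate(self.cases):
            sim = np.dot(v, c) / (np.linalg.norm(v) * np.linalg.norm(c))
            if sim >= self.th2:
                self.cases[i] = (c + v) / 2.0
                return i
        self.cases.append(v)
        return len(self.cases) - 1

# Training stage: a permissive Th2 records a rich casebase.
trainer = CaseBase(th2=0.70)
# ... feed training vectors to trainer.observe(...) and save the casebase ...

# Performance stage: the recorded cases are reused under a stricter Th2.
performer = CaseBase(th2=0.95)
performer.cases = trainer.cases
```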

Friday, August 22, 2014

Specialist AIs

Asa H can be trained in an area of expertise and the resulting casebase/knowledgebase saved to an external drive (see, for example, my blog of 26 Aug. 2013). I have a 4 terabyte drive for this purpose.  Such specialty knowledge can be organized much as in the Dewey decimal system or the standard industrial classification.
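
One possible on-disk layout (a sketch only; the paths, classification codes, and use of pickle are my assumptions, not the save routine from the 26 Aug. 2013 post):

```python
import os
import pickle

def save_casebase(casebase, drive, code, topic):
    """Store a specialist casebase under a classification code,
    e.g. save_casebase(cb, "/mnt/asa_drive", "530", "physics")."""
    path = os.path.join(drive, code, topic)
    os.makedirs(path, exist_ok=True)
    with open(os.path.join(path, "casebase.pkl"), "wb") as f:
        pickle.dump(casebase, f)

def load_casebase(drive, code, topic):
    """Retrieve a specialist casebase for use by another agent."""
    with open(os.path.join(drive, code, topic, "casebase.pkl"), "rb") as f:
        return pickle.load(f)
```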

Friday, August 15, 2014

The Asa H value hierarchy

The values assigned to Asa H cases may vary from one level in the hierarchy to another. At the lowest level(s) case length and how often the case is seen to recur are valued (see, for instance, Asa H 2.0 light in my blog of 10 Feb. 2011).  At the highest level in the hierarchy agent lifespan and number of offspring (disk copies) may be what's most highly valued (see, for instance, my paper, Trans. Kan. Acad. Sci., vol. 109, no. 3/4, 2006).

Ensemble learning with Asa H

Various Asa H experiments have employed ensemble learning.  Perhaps the simplest approach averages the outputs of two or more individual Asa H agents.  These may, for instance, use different similarity measures or have been trained separately. Ensemble learning is also possible within a single Asa agent: the N best case matches can be followed, for example, and the output generated by voting, averaging, interpolation, or the like.  The individual outputs can be weighted by the degree of case match and by case utility. Again, as a rule groups make better decisions than individuals do.
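
A minimal sketch of the weighted combination (illustrative only; the names and the simple product weighting are placeholders):

```python
import numpy as np

def ensemble_output(outputs, match_scores, utilities):
    """Combine candidate outputs by weighted averaging.

    outputs      : one output vector per matched case (or per agent)
    match_scores : degree of case match for each candidate
    utilities    : utility of each candidate case
    """
    outputs = np.asarray(outputs, dtype=float)
    weights = np.asarray(match_scores) * np.asarray(utilities)  # weight by match and utility
    return np.average(outputs, axis=0, weights=weights)
```

Voting or interpolation over the same weighted candidates could be substituted for the average.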

Thursday, August 14, 2014

Granular computing and Asa H

Asa H can be considered a project in granular computing, "interpreted as the abstraction, generalization, clustering, levels of abstraction, levels of detail, and so on" (see, for example, Y. Y. Yao, Proc. 4th Chinese National Conf. on Rough Sets and Soft Computing, 2004).

Tuesday, August 12, 2014

Big data and artificial intelligence

It is being suggested that big data may be the key to a strong artificial intelligence (see, for example, "AI gets its groove back" by Lamont Wood, Computerworld, 14 April 2014).  In the 1980s, as part of the work on knowledge-based expert systems, it was common to hear the claim that "you can't be intelligent without knowing a lot."

Certainly big data may offer an environment in which humans find themselves at a disadvantage once again. Currently some environments are easier for humans (natural language conversation, for example) while others are easier for computing machinery (the arithmetic done by pocket calculators, for example).

Along these lines, over the last couple of years I have been slowly increasing the volume and rate of the data flowing into my various Asa H AI experiments.

Monday, August 11, 2014

Boilerplate

A good way to speed up the writing of scientific publications is the use of boilerplate.  People have mixed feelings about this practice.  When I was doing plasma physics, boilerplate might include:

1. a diagram of the experimental machine
2. a table of typical operating parameters/conditions
3. a paragraph or two describing the device and its operation
4. a paragraph or two describing the plasma diagnostics used

These would change from one publication to the next only if the device or values really did change or if one could in some way improve the boilerplate.

I have had one or two people criticize this practice as somehow "cheating."  I disagree completely.  If one can refer to an earlier paper to present such information, fine, dispense with the boilerplate.  But to the extent that a given publication is to be self-contained, boilerplate may actually serve as quality control (again, so long as it is kept current).

Most plasma fusion work will at least have a diagram of the machine and a paragraph describing it.  The cost of such machines is so high that they and their descriptions will not be changing from one paper to the next.

If, for some reason, one were to do the same experiment over and over but with a different fill gas, let's say, each publication might then be much like the one before it.  I know of people who do spectroscopic work (not plasma physics) where this has been common.

I've known a number of scientists who would create a talk/presentation by selecting slides from a collection they had assembled (supplemented by any new results recently obtained).
It is not cheating to work smart.

Saturday, August 9, 2014

Scalar utility

Asa H 2.0 has been run with both scalar and vector utilities.  An example of a scalar utility is that of Asa H 2.0 light (blog of 11 Feb. 2011).  In that code the utility of a case is the total time during which that particular pattern (case) has been observed, i.e., the product of the time duration/length of the case and the number of times the case has occurred.
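
In other words (the variable names are mine, not those of the actual code):

```python
def case_utility(case_length, occurrences):
    """Scalar utility of a case: total time the pattern has been observed,
    i.e. (time duration/length of the case) x (number of occurrences)."""
    return case_length * occurrences

# e.g. a case 5 time steps long that has occurred 12 times has utility 60
```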

Wednesday, August 6, 2014

Vector intelligence again

In their paper "Fractionating human intelligence" (Neuron, 19 Dec. 2012) Hampshire, Highfield, and Owen offer evidence that IQ can not be described by a scalar quantity but requires at least 3 (vector) components.

Tuesday, August 5, 2014

Natural language versus mentalese

Natural languages are sequential; we can only say or write one word at a time.  The language of thought (mentalese) is, at least in part, parallel and non-sequential.  The brain is a parallel distributed processor; many concepts activate one another across the brain simultaneously.  Humans must teach one another sequentially because they are using natural language.  Robots could transfer data to one another in parallel.

Symbol grounding

I agree with Werbos' definition of intelligence, "a system to handle all of the calculations from crude inputs through to overt actions in an adaptive way so as to maximize some measure of performance over time" (IEEE Trans. Systems, Man, and Cybernetics, 1987, p. 7).  A brain is a control system.  In any artificial intelligence all symbols (internal representations) are then surely grounded insofar as they functionally connect sensory and utility/value inputs with outputs/responses. But in deep (complex) networks some symbols will be far removed from primitive/raw perceptions, i.e., they will be more "abstract" concepts.

Monday, August 4, 2014

What thoughts are made up of

In their paper "what thoughts are made of" (in Embodied Grounding, edited by Gun and Smith, Cambridge U. Press, 2008, pg 108) Boroditsky and Prinz detail a view of the nature of thought which is very similar to my own (as it occurs in my artificial intelligence, Asa H).  They also suggest that teaching an agent a natural language (as I have been doing with Asa H) may enhance the agent's level of intelligence.

One difference is that Boroditsky and Prinz discuss representations (of concepts) in terms of feature lists whereas I employ vectors and allow each feature (vector component) to have variable activation.
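
The contrast, in a toy example (the feature names are invented for illustration):

```python
# A feature-list representation only marks features as present or absent:
dog_as_feature_list = {"furry", "barks", "four_legs"}

# A vector representation gives each feature (component) a variable activation:
dog_as_vector = {"furry": 0.9, "barks": 0.7, "four_legs": 1.0, "meows": 0.0}
```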

Friday, August 1, 2014

Repressed memories in Asa H

The performance elements of Asa prefer cases with high utility.  Forgetting/deleting cases with low utility speeds up search.  Additional low-utility cases ("repressed" memories) can be retained for use (in a larger "augmented" casebase) by the learning element.  Knowing what NOT to do is useful there.
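
A sketch of such a split (the quantile cutoff and keep_fraction are my placeholders, not parameters of the actual code):

```python
import numpy as np

def split_casebase(cases, utilities, keep_fraction=0.8):
    """Separate the working casebase from the "repressed" low-utility cases.

    The performance element searches only the working set; the learning
    element may consult working + repressed (the "augmented" casebase).
    """
    utilities = np.asarray(utilities, dtype=float)
    cutoff = np.quantile(utilities, 1.0 - keep_fraction)
    working = [c for c, u in zip(cases, utilities) if u >= cutoff]
    repressed = [c for c, u in zip(cases, utilities) if u < cutoff]
    return working, repressed
```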

When a vector utility is employed, we prefer to delete cases from the more densely populated regions of the case vector space.  Also, if a case has a single vector component that is high, we prefer to retain it.
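
One way those preferences might be coded (a sketch only; the mean-cosine-similarity density measure and the spike_fraction rule are stand-ins of my own choosing):

```python
import numpy as np

def choose_case_to_forget(cases, spike_fraction=0.9):
    """Pick one case to delete: prefer cases in densely populated regions
    of the case vector space, and spare any case dominated by a single
    high component.  Assumes nonnegative components."""
    if len(cases) < 2:
        return None
    vecs = [np.asarray(c, dtype=float) for c in cases]
    units = [v / (np.linalg.norm(v) + 1e-12) for v in vecs]
    candidate, crowding = None, -1.0
    for i, v in enumerate(vecs):
        if v.max() >= spike_fraction * v.sum():      # single dominant component: retain
            continue
        # density = how crowded this case's neighborhood is
        density = np.mean([np.dot(units[i], u) for j, u in enumerate(units) if j != i])
        if density > crowding:
            candidate, crowding = i, density
    return candidate       # None if every case was spared
```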