After briefly describing the underlying principles of each approach, we proceed to cite representative examples where researchers have applied "intelligent" techniques to solve real-world problems. We restrict ourselves to six biologically inspired soft computing methods here, these being Artificial Neural Networks, Genetic Algorithms, swarms, and DNA-, immune-, and membrane-based computing. We basically do not consider Fuzzy Systems, reasoning systems, or rule-based Expert Systems as such. However, we do mention such systems in the context of combinations/hybrids of soft computing techniques, which has become the province of the field of Computational Intelligence (CI) in recent times (Fulcher & Jain, 2008).

Table 1. Classical vs. soft computing

Classical computing | Soft computing
Two-valued (Boolean/crisp) logic | Many-valued (fuzzy) logic
Precise | Approximate
Deterministic | Stochastic (i.e., incorporates some randomness/unpredictability)
Exact/precise data | Ambiguous/approximate/inconsistent data
Sequential processing | Parallel processing
ARTIFICIAL NEURAL NETWORKS (ANNs)
Artificial Neural Networks (ANNs) are simplistic models of biological neural networks (brains), typically comprising dozens (but not billions) of neurons and hundreds (not tens of billions) of synapses (connections between neurons). The output (axon) of a biological neuron “fires” (produces a pulse train signal output) whenever the weighted sum of the signals from the inputs (synapses) exceeds some preset threshold. Now excitation (inhibition) of individual neurons is essentially an electrochemical process, involving different concentrations of potassium and sodium ions within and outside of the cell body; moreover, this is inherently an analog (linear) process. In the simplified neuron model commonly used in ANNs, neuron “firing” corresponds to a simple output level shift (0->1; 1->0). One characteristic of biological networks is their localized behaviour; in other words, certain areas of the brain are responsible for processing information incoming from our senses (although substantial preprocessing often takes place in the cerebral cortex prior to arriving in the brain proper), or for producing the necessary outputs (motor movement, speech, and so forth). Some ANN models reflect this localized behaviour, while others employ a more uniform, holistic architecture.
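By way of illustration, the simplified neuron model just described can be sketched in a few lines of Python; the inputs, weights, and threshold below are arbitrary values chosen for the example, not taken from the text:

```python
def neuron_output(inputs, weights, threshold):
    """Fire (output 1) when the weighted input sum exceeds the threshold."""
    weighted_sum = sum(x * w for x, w in zip(inputs, weights))
    return 1 if weighted_sum > threshold else 0

# Three synaptic inputs; weighted sum = 0.5 + 0.0 + 0.8 = 1.3 > 1.0, so it fires.
print(neuron_output([1, 0, 1], [0.5, -0.3, 0.8], threshold=1.0))  # -> 1
```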
Another characteristic of biological brains is their massively parallel (analog) processing capability: despite the relatively slow response of individual neurons (milliseconds), their collective processing power far exceeds that of the fastest supercomputers, at least for some tasks. In practice, such parallel, analog neural networks are realized by way of (sequential) software simulation on digital computers. The ANN models in common usage are very much simplified versions of the biological networks from which they derive their inspiration. The most popular ANN model (Wong, Lai, & Lam, 2000) is the Multilayer Perceptron (MLP) of Figure 2.
ANNs are not programmed in the traditional algorithmic sense, but rather learn by example, at least in the supervised kind. Accordingly, supervised networks require numerous input-output training data pairs in order to learn the underlying "intelligence" of the system under study. Once trained, an ANN is capable of correctly recognizing input patterns not previously met during the training process; in other words, it exhibits generalization ability. Note that such a training process is an inherently data-driven, bottom-up approach (in contrast to conventional model-driven, top-down, algorithmic approaches). Furthermore, the training process can be quite time consuming; however, once trained, an ANN can respond almost instantaneously to new inputs.
Figure 2. A (fully-connected) 3-layer MLP/BP
The Multilayer Perceptron (MLP) of Figure 2 is a fully connected, 3-layer, supervised, feedforward ANN, comprising input, hidden, and output layers, which contain n, p, and m neurons, respectively. By "feedforward," we mean that connections (weights) only exist in a forward direction, that is, from one of the n neurons in the Input Layer to one of the p neurons in the Hidden Layer (or from one of the p neurons in the Hidden Layer to one of the m neurons in the Output Layer). By contrast, no such restrictions apply in brains. The MLP employs the so-called BackPropagation learning rule, which, simply stated, says that upon presentation of an input-output training exemplar pair, the actual output produced by the network is compared with the desired output. During each successive training iteration, the weights are adjusted in proportion to this error (∆ or difference) signal: firstly, the weights connecting the Hidden Layer to the Output Layer, then those between the Input Layer and the Hidden Layer. In this manner, the error signal "propagates" backward from the ANN output to its input, adjusting its weights in the process; hence the name (BP).
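The following is a minimal sketch, not a definitive implementation, of one such BP weight update for the network of Figure 2; the sigmoid activation, layer sizes, and learning rate are illustrative assumptions, since the text does not prescribe them:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
n, p, m = 4, 3, 2                        # Input, Hidden, Output layer sizes
W1 = rng.normal(scale=0.5, size=(p, n))  # Input -> Hidden weights
W2 = rng.normal(scale=0.5, size=(m, p))  # Hidden -> Output weights
eta = 0.5                                # learning rate (assumed value)

def bp_step(x, target):
    """One BP iteration: forward pass, then weight adjustments in proportion
    to the error (delta) signal, Hidden->Output weights first, then Input->Hidden."""
    global W1, W2
    h = sigmoid(W1 @ x)                  # hidden-layer activations
    y = sigmoid(W2 @ h)                  # actual network output
    delta_out = (target - y) * y * (1 - y)        # error at the output
    delta_hid = (W2.T @ delta_out) * h * (1 - h)  # error propagated backward
    W2 += eta * np.outer(delta_out, h)   # adjust Hidden -> Output weights first
    W1 += eta * np.outer(delta_hid, x)   # ... then Input -> Hidden weights
    return y
```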
Presentation of all input-output training pairs (exemplars) constitutes one "epoch," during which the weights change in many different (and incompatible/conflicting) directions. In practice, many epochs will be necessary for the network to converge to an acceptable solution (which corresponds to the network having learned all I/O pattern associations).
It has been proven mathematically that the BP algorithm will eventually converge to an acceptable solution, although this might not be within a convenient timeframe from a user’s perspective! In practice, training of ANNs can take several hours, perhaps even overnight, even on top-of-the-range computers. Training ceases when the error (difference) signal falls below a certain level (say 0.1%), or alternatively after a certain predetermined number of training epochs.
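As a hedged sketch of these stopping criteria, the `bp_step` routine from the previous sketch can be wrapped in a training loop that halts when the mean error falls below a preset level (0.1% here) or after a fixed number of epochs; the tolerance and epoch limit are illustrative:

```python
def train(training_pairs, max_epochs=10_000, tolerance=1e-3):
    """Train until the mean squared error falls below `tolerance` (0.1%)
    or `max_epochs` is reached; returns the number of epochs used."""
    for epoch in range(max_epochs):
        errors = []
        for x, target in training_pairs:           # one full pass = one epoch
            y = bp_step(x, target)
            errors.append(float(np.mean((target - y) ** 2)))
        if np.mean(errors) < tolerance:            # error signal low enough
            return epoch + 1                       # converged
    return max_epochs                              # halted by the epoch limit
```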
Common practice is to divide the available training data in two, then use one half for training and the other half to test (verify) the network once trained. Now in practice such labelled I/O training data (exemplars) may not always be available, and hence some people prefer to use unsupervised neural networks. One has to exercise caution with the latter, however, because the resulting classes/clusters the network produces are often suspect. We restrict our current discussion here to supervised ANNs, and indeed to only one type (MLP/BP).
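A minimal sketch of the 50/50 division described above might look as follows; the shuffle, seed, and variable names are illustrative assumptions:

```python
import random

def split_half(exemplars, seed=42):
    """Shuffle the labelled exemplars, then split them 50/50 into a
    training set and a test (verification) set."""
    exemplars = list(exemplars)
    random.Random(seed).shuffle(exemplars)
    mid = len(exemplars) // 2
    return exemplars[:mid], exemplars[mid:]

# train_set, test_set = split_half(training_pairs)
```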
There is also the issue of how many I-O training exemplar pairs constitute a "minimum-yet-sufficient" set: too few will not lead to network convergence, whereas too many could result in "overtraining" (akin to "overfitting" in mathematical function approximation/curve fitting).
ANNs are especially good at pattern recognition and pattern classification, irrespective of what the pattern actually represents. This means that in practice we need to be able to encode the pattern of interest (be this vision, speech, time series, or whatever) into an appropriate form. Indeed, preprocessing is often the most challenging aspect of applying ANNs to real-world problems. Typical preprocessing tasks include the handling of missing, incomplete, or noisy data, and most especially dimensionality reduction. As we have already seen, ANN training times are quite long; indeed, they increase exponentially with the number of network weights, so any reduction in the dimensionality of the training data will have a dramatic effect on network convergence times.
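To make these preprocessing tasks concrete, the sketch below fills missing values with per-feature means and reduces dimensionality by projecting onto the top k principal components; both techniques are illustrative choices on our part, as the text does not prescribe particular methods:

```python
import numpy as np

def preprocess(X, k):
    """Impute missing entries (NaNs) with column means, then project the
    data onto its first k principal components (via SVD)."""
    X = np.asarray(X, dtype=float)
    col_means = np.nanmean(X, axis=0)        # per-feature means, ignoring NaNs
    X = np.where(np.isnan(X), col_means, X)  # fill in the missing entries
    X_centered = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X_centered, full_matrices=False)
    return X_centered @ Vt[:k].T             # shape: (samples, k)
```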
Verma & Panchal (2006) used ANNs (supervised MLP/BP) in a standard pattern classification task, that of discriminating between malignant (cancerous) and benign pap smears. Fyfe (2008) applied an unsupervised ANN, the Self-Organizing Map (SOM), to data clustering and visualization. Likewise, Yin (2008) showed how the SOM, and variants thereof, could also be applied to vector quantization, image segmentation, density modeling, gene expression analysis, and text mining. More sophisticated (Higher-Order, supervised) ANN models have been used for both satellite weather prediction (Zhang & Fulcher, 2004) and financial time series prediction (Fulcher, Zhang & Xu, 2006). Zeleznikow (2004) combined ANNs and rule-based reasoning in the development of an Intelligent Legal Decision Support System. By contrast, Fu, Li, Wang, Ong, and Turner (2008) combined ANNs and multi-agents in order to predict network traffic over media grids.