Q1. A multi-layer perceptron differs from the single layer perceptron in that it has more layers of perceptron-like units.
A1. True - there are no other differences
A2. True - but there are other differences as well, that are at least as important
A3. False - layers refers to a mathematical effect and is not used in its usual sense here
Q2. The multi-layer perceptron multiplies its inputs by weights and then adds up all the values.
A1. True - it then passes this result on to the next layer
A2. True - it then passes it through the Heaviside threshold function
A3. True - it then passes it through an S-shaped function
A4. False - it doesn't do a weighted sum
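For reference, the operations named in Q2 can be sketched for a single unit as follows. This is a minimal illustration, not a definitive implementation: the function names, the sample inputs and weights, and the choice of the logistic function as the S-shaped function are all assumptions made for this example.

```python
import math

def weighted_sum(inputs, weights, bias=0.0):
    # Multiply each input by its weight and add up the results.
    return sum(x * w for x, w in zip(inputs, weights)) + bias

def heaviside(a):
    # Hard threshold: the output jumps from 0 to 1 at a = 0.
    return 1.0 if a >= 0 else 0.0

def sigmoid(a):
    # Logistic function (assumed here as the S-shaped function).
    return 1.0 / (1.0 + math.exp(-a))

# Hypothetical sample values, chosen only for illustration.
inputs = [1.0, 0.5, -1.0]
weights = [0.2, -0.4, 0.1]

a = weighted_sum(inputs, weights)
print(a, heaviside(a), sigmoid(a))
```

The weighted sum is the same in each case; only the function applied to it afterwards differs.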
Q3. The sigmoid function is
A1. S-shaped
A2. Z-shaped
A3. A step function
A4. U-shaped
Q4. Why is the sigmoid function important to the success of the multi-layer perceptron?
A1. It allows the perceptrons to reconfigure their arrangement and hence to solve the problem.
A2. It allows the thresholding to be carried out more quickly
A3. It allows the perceptrons to learn
Q5. What is back-propagation?
A1. It is another name given to the curvy function in the perceptron unit.
A2. It is the transmission of error back through the network to adjust the inputs
A3. It is the transmission of error back through the network to allow weights to be adjusted so that the net can learn.
Q6. The Widrow-Hoff delta rule is used in the multi-layer perceptron as well as in the single layer perceptron.
A1. True
A2. False
Q7. A multi-layer perceptron has to have the same number of input nodes as output nodes.
A1. True
A2. False
Q8. Multilayer perceptrons have full connectivity between the layers
A1. True
A2. False
Q9. The number of connections in the network scales linearly with the number of nodes
A1. True
A2. False
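To explore Q9, one can count the connections directly. The sketch below assumes full connectivity between adjacent layers and no bias connections; the helper name count_connections and the layer sizes are hypothetical.

```python
def count_connections(layer_sizes):
    # Every node in one layer connects to every node in the next layer,
    # so each adjacent pair of layers contributes (size_a * size_b) weights.
    return sum(a * b for a, b in zip(layer_sizes, layer_sizes[1:]))

# Double the width of a three-layer network and compare the two totals.
for n in (2, 4, 8, 16):
    sizes = [n, n, n]
    print("nodes:", sum(sizes), "connections:", count_connections(sizes))
```

Comparing how the two printed columns grow as n doubles gives a concrete basis for answering the question.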
Q10. What are hidden nodes?
A1. Nodes that do not do any computation and so do not assist in producing the output
A2. Nodes that have '0' as an input value
A3. Nodes that have no direct connection to the input or the output
A4. Nodes that have no direct connections to any other nodes
Q11. What is the error function defined as?
A1. 0.5*sum(weights*inputs)
A2. 0.5*sum((target-output)^2)
A3. 0.5*sum(target-output)
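The two candidate expressions in Q11 that involve targets and outputs can be evaluated on a small example to see how they behave. This sketch reads option A2 as a sum of squared differences, which is an assumption about the intended notation; the function names and sample patterns are made up for illustration.

```python
def half_sum_squared_diff(targets, outputs):
    # 0.5 * sum over output nodes of (target - output) squared.
    return 0.5 * sum((t - o) ** 2 for t, o in zip(targets, outputs))

def half_sum_diff(targets, outputs):
    # 0.5 * sum over output nodes of the signed difference (target - output).
    return 0.5 * sum(t - o for t, o in zip(targets, outputs))

# Hypothetical case: both output nodes are completely wrong.
targets = [1.0, 0.0]
outputs = [0.0, 1.0]

print(half_sum_squared_diff(targets, outputs))
print(half_sum_diff(targets, outputs))
```

In this example the signed differences have opposite signs, so it is worth comparing what each expression reports for an output that is wrong on every node.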
Q12. Why is the error function called that?
A1. It just is
A2. Because it reflects the error in the output of the network - large values are close to being correct, whilst small values are very wrong
A3. Because it reflects the error in the output of the network - small values are close to being correct, whilst large values are very wrong
Q13. How can network learning be explained in terms of the error function?
A1. It can't - it's irrelevant
A2. The network learns by altering its weights to reduce the error each time
A3. The network reduces the error by altering the target patterns each time
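The ideas in Q5 and in the last question can be tried out on a very small network. The following is a minimal sketch under stated assumptions, not the method of any particular textbook or library: a 2-2-1 network with logistic units, no bias terms, a single training pattern, a squared-error measure, and plain gradient descent. All names, starting weights, and the learning rate are hypothetical.

```python
import math

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def forward(x, w_hidden, w_out):
    # Each hidden unit takes a weighted sum of the inputs, then the sigmoid.
    hidden = [sigmoid(sum(xi * wi for xi, wi in zip(x, row))) for row in w_hidden]
    # The single output unit does the same with the hidden activations.
    output = sigmoid(sum(h * w for h, w in zip(hidden, w_out)))
    return hidden, output

def error(target, output):
    # Assumed error measure: half the squared difference.
    return 0.5 * (target - output) ** 2

def train_step(x, target, w_hidden, w_out, rate=0.5):
    hidden, output = forward(x, w_hidden, w_out)
    # Output delta: the error scaled by the slope of the sigmoid.
    delta_out = (target - output) * output * (1.0 - output)
    # Hidden deltas: the output delta passed back along each outgoing weight.
    delta_hidden = [delta_out * w * h * (1.0 - h) for w, h in zip(w_out, hidden)]
    # Adjust the weights (not the inputs or the targets).
    new_w_out = [w + rate * delta_out * h for w, h in zip(w_out, hidden)]
    new_w_hidden = [[w + rate * d * xi for w, xi in zip(row, x)]
                    for row, d in zip(w_hidden, delta_hidden)]
    return new_w_hidden, new_w_out

# Hypothetical pattern and starting weights.
x, target = [1.0, 0.0], 1.0
w_hidden = [[0.1, -0.2], [0.3, 0.4]]
w_out = [0.2, -0.1]

_, out = forward(x, w_hidden, w_out)
initial_error = error(target, out)
for _ in range(100):
    w_hidden, w_out = train_step(x, target, w_hidden, w_out)
_, out = forward(x, w_hidden, w_out)
final_error = error(target, out)

print(initial_error, final_error)
```

Printing the error before and after the updates shows how it changes as the weights are adjusted.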