Question

What does: "t. Train it to fit the target concept -2 + x1 + x2 > 0." mean? in machine leaning ...

What does: "t. Train it to fit the target concept -2 + x1 + x2 > 0." mean? in machine leaning (Delta training function)

Answer #1

The statement describes the target concept your network has to learn: an input (x1, x2) belongs to the positive class exactly when -2 + x1 + x2 > 0, i.e., when the sum of the input variables x1 and x2 exceeds 2. The task is to train a simple linear unit (a perceptron-style classifier) on your training data with the Delta training rule so that its learned weights reproduce this decision boundary, ideally w0 = -2, w1 = 1, w2 = 1.
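As a concrete illustration, the target concept is just a labeling rule over input pairs. Here is a minimal Python sketch (the sample points and the helper name target_concept are ours, invented for illustration):

def target_concept(x1, x2):
    # Positive class exactly when -2 + x1 + x2 > 0, i.e. when x1 + x2 > 2
    return 1 if -2 + x1 + x2 > 0 else 0

for x1, x2 in [(0, 0), (1, 0), (1, 2), (3, 1), (2, 2)]:
    print((x1, x2), "->", target_concept(x1, x2))
# (0, 0) -> 0, (1, 0) -> 0, (1, 2) -> 1, (3, 1) -> 1, (2, 2) -> 1

Any classifier whose decision boundary is the line x1 + x2 = 2 realizes this concept.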

You can work out the result on the basis of the following concept:

The Delta Rule uses the difference between the target activation and the obtained activation to drive learning. The activation function used here is a linear activation function, in which the output node's activation is simply equal to the sum of the network's respective input/weight products. The threshold activation function is dropped; instead, a linear sum of products is used to calculate the activation of the output neuron. The strength of the network connections (i.e., the values of the weights) is adjusted to reduce the difference between the target and the actual output activation (i.e., the error).


During forward propagation through a network, the output (activation) of a given node is a function of its inputs. The inputs to a node, which are simply the products of the outputs of preceding nodes with their associated weights, are summed and then passed through an activation function before being sent out from the node. Thus, we have the following:

S_j = \sum_i w_{ij} a_i

and

a_j = f(S_j)

where S_j is the sum of all relevant products of weights and outputs from the previous layer i, w_{ij} represents the relevant weights connecting layer i with layer j, a_i represents the activation of the nodes in the previous layer i, a_j is the activation of the node at hand, and f is the activation function.
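A minimal sketch of this forward step for a single node with a linear activation (the function and variable names below are ours, not from the original):

def forward(weights, prev_activations):
    # S_j: sum of the products of incoming weights and previous-layer activations
    s_j = sum(w * a for w, a in zip(weights, prev_activations))
    # Linear activation: a_j = f(S_j) = S_j
    return s_j

# Bias folded in as a constant input a_0 = 1 with weight w_0 = -2,
# matching the target concept from the question.
print(forward([-2.0, 1.0, 1.0], [1.0, 1.0, 2.0]))  # -2 + 1 + 2 = 1.0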

For any given set of input data and weights, there will be an associated magnitude of error, which is measured by an error function. The Delta Rule employs the error function for what is known as gradient descent learning, which involves the "modification of weights along the most direct path in weight-space to minimize error"; the change applied to a given weight is therefore proportional to the negative of the derivative of the error with respect to that weight.

The error/cost function is commonly given as the sum of the squares of the differences between all target and actual node activations for the output layer. For a particular training pattern (i.e., training case), the error is thus given by:

E_p = \frac{1}{2} \sum_n (t_{j_n} - a_{j_n})^2

where E_p is the total error over the training pattern, the factor 1/2 is applied to simplify the function's derivative, n ranges over all output nodes for a given training pattern, t_{j_n} represents the target value for node n in output layer j, and a_{j_n} represents the actual activation of the same node. This particular error measure is attractive because its derivative, whose value is needed by the Delta Rule, is easily calculated. The error over an entire set of training patterns is calculated by summing all E_p:

E = \sum_p E_p

where E is the total error and p ranges over all training patterns. An equivalent term for E in the earlier equation is the sum-of-squares error. A normalized version of this equation is given by the mean squared error (MSE):

MSE = \frac{1}{PN} \sum_p \sum_n (t_{j_n} - a_{j_n})^2

where P and N are the total numbers of training patterns and output nodes, respectively. It is the error of both previous equations that gradient descent attempts to minimize (strictly speaking, this is not quite true if the weights are changed after each input pattern is presented to the network). The error over a given training set is commonly expressed as the total sum-of-squares error, which is simply the sum of all squared errors over all output nodes and all training patterns. The negative of the derivative of the error function is required in order to perform gradient descent learning. The derivative of the equation above (which measures the error for a given pattern p), with respect to a particular weight w_{ij_x}, is given by the chain rule as:

\frac{\partial E_p}{\partial w_{ij_x}} = \frac{\partial E_p}{\partial a_{j_z}} \cdot \frac{\partial a_{j_z}}{\partial w_{ij_x}}

where a_{j_z} is the activation of the node in the output layer that corresponds to the weight w_{ij_x}. It follows that:

\frac{\partial E_p}{\partial a_{j_z}} = -(t_{j_z} - a_{j_z})

and

\frac{\partial a_{j_z}}{\partial w_{ij_x}} = a_i

Thus, the derivative of the error over an individual training pattern is given by the product of the two derivatives above:

\frac{\partial E_p}{\partial w_{ij_x}} = -(t_{j_z} - a_{j_z}) \, a_i

Because gradient descent learning requires that any change in a particular weight be proportional to the negative of the derivative of the error, the change in a given weight must be proportional to the negative of the equation above. Replacing the difference between the target and actual activation of the relevant output node by δ, and introducing a learning rate ε (epsilon), that equation can be rewritten in the final form of the Delta Rule:

\Delta w_{ij_x} = \varepsilon \, \delta \, a_i

Delta Rule for Perceptrons
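Tying this back to the original question, here is a sketch of delta-rule training of a single linear unit on the target concept -2 + x1 + x2 > 0 (the learning rate, grid of training points, and epoch count are our illustrative choices, not prescribed anywhere above):

import random

random.seed(0)
eps = 0.05                      # learning rate (epsilon)
w = [0.0, 0.0, 0.0]             # [w0 (bias weight), w1, w2]

# Training patterns on an integer grid, labeled by the target concept.
patterns = [(x1, x2, 1.0 if -2 + x1 + x2 > 0 else 0.0)
            for x1 in range(-2, 5) for x2 in range(-2, 5)]

for epoch in range(200):
    random.shuffle(patterns)
    for x1, x2, t in patterns:
        a = [1.0, x1, x2]                            # a_0 = 1 carries the bias
        out = sum(wi * ai for wi, ai in zip(w, a))   # linear activation
        delta = t - out                              # delta = target - actual
        for i in range(3):
            w[i] += eps * delta * a[i]               # dw_i = eps * delta * a_i

# The linear unit converges toward the least-squares fit of the 0/1 targets;
# thresholding its output at 0.5 should reproduce the concept on this grid.
mistakes = sum(1 for x1, x2, t in patterns
               if (sum(wi * ai for wi, ai in zip(w, [1.0, x1, x2])) > 0.5) != (t == 1.0))
print(w, mistakes)  # expect mistakes == 0

Note that the learned weights need not equal (-2, 1, 1) exactly: any weight vector whose thresholded output separates the positive from the negative examples fits the concept.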

The reasoning behind the use of a linear activation function here, instead of a threshold activation function, can now be justified: the threshold activation function that characterizes both the McCulloch-Pitts network and the perceptron is not differentiable at the transition between the activations of 0 and 1 (the slope is infinite), and its derivative is 0 over the remainder of the function. Hence, the threshold activation function cannot be used in gradient descent learning, whereas a linear activation function (or any other differentiable function) allows the derivative of the error to be calculated.
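In symbols (our restatement of the paragraph above, with threshold θ):

f_{\text{threshold}}(S) = \begin{cases} 1 & \text{if } S > \theta \\ 0 & \text{otherwise} \end{cases}
\quad\Rightarrow\quad
f'(S) = 0 \text{ for } S \neq \theta, \text{ undefined at } S = \theta

whereas

f_{\text{linear}}(S) = S \quad\Rightarrow\quad f'(S) = 1 \text{ everywhere},

so the factor \partial a_j / \partial S_j that the chain rule needs is unusable for the threshold unit but trivially available for the linear unit.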

[Figure: three-dimensional depiction of an actual error surface]

[Figure: two-dimensional depiction of the error surface]
