2.4. Complexity Regularization

Given a fixed training sample, an ANN with excessive hidden neurons will overfit the data, whereas an ANN with insufficient neurons cannot capture all of the system's properties and becomes unstable.
In analogy to the linear programming problem, the excessive-neurons issue is like having more equations than variables, whereas the insufficient-neurons issue is like having more variables than equations. One idea is to begin with an ANN that has zero hidden neurons, instead of fixing the ANN structure in advance, and then insert hidden neurons as needed until the MSE is reduced to an acceptable level. One commonly accepted algorithm of this kind is cascade-correlation (CC), developed by Fahlman and Lebiere [15].

First, the initial CC neural network contains zero hidden neurons, so the target MSE is unlikely to be reached even with a large training sample. Second, several so-called candidate neurons are created and connected, with random weights, only to the input neurons and the existing hidden neurons. Third, the weights of the candidate neurons are trained to maximize the correlation between each candidate's activation and the overall network error, which is calculated with (10). Fourth, the candidate neuron with the highest correlation is selected, its incoming connection weights are frozen (i.e., they remain unchanged during later training), and it is connected to the output neurons with random weights. At this point the CC network has grown by one neuron. Finally, the new CC network is trained again to minimize the MSE. If the target MSE is reached, the training process ends; otherwise, the procedure returns to the second step and repeats until the target MSE is reached. Consequently, the final CC network contains multiple single-neuron hidden layers:

\[ C = \frac{\sum_{o} \sum_{p} (h_p - \bar{h})(e_{op} - \bar{e}_o)}{\sum_{o} \sum_{p} (e_{op} - \bar{e}_o)^2}, \quad (10) \]

where h_p is the candidate hidden neuron's activation for training pattern p; e_{op} is the network error at output o for pattern p; and \bar{h} and \bar{e}_o are the corresponding means over the training patterns.

The application of ANNs to traffic studies is still in its infancy. Lu et al. developed a neural-network-based tool to filter and mine highly skewed traffic data [16]; Huang used a wavelet neural network to forecast traffic flow, and the results showed improved forecasting accuracy compared with traditional methods [17]; Chong et al. used a feedforward neural network to train a driver model in simulation on naturalistic data, and the resulting driving behavior was closer to the actual observations than that of traditional car-following models [18]. Jia et al. trained an ANN-based car-following model with data collected via a five-wheel system. The inputs include the speed of the following vehicle, relative speed, relative distance, and desired speed. The output vector contains the acceleration of the following vehicle [19].
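Returning to the candidate-selection step of cascade-correlation, the score in (10) can be illustrated with a minimal sketch. The function names `candidate_score` and `pick_best_candidate`, the list-based data layout, and the `absolute` flag are assumptions made for illustration and do not come from the cited works. The flag reflects that Fahlman and Lebiere's original formulation takes the magnitude of each output's covariance term; with `absolute=False` the code follows (10) as printed here.

```python
def candidate_score(h, e, absolute=False):
    """Score one candidate neuron in the form of equation (10).

    h: list of length P, candidate activation for each training pattern p
    e: list of P rows, each of length O, network error per pattern and output
    absolute: if True, use the magnitude of each output's covariance sum
        (an assumption; see the note above)
    """
    n_patterns = len(h)
    n_outputs = len(e[0])
    h_mean = sum(h) / n_patterns
    # Mean error of each output over all patterns.
    e_mean = [sum(row[o] for row in e) / n_patterns for o in range(n_outputs)]

    numerator = 0.0
    denominator = 0.0
    for o in range(n_outputs):
        # Inner sum over patterns p for this output o.
        cov = sum((h[p] - h_mean) * (e[p][o] - e_mean[o])
                  for p in range(n_patterns))
        numerator += abs(cov) if absolute else cov
        denominator += sum((e[p][o] - e_mean[o]) ** 2
                           for p in range(n_patterns))

    if denominator == 0.0:
        # Constant error carries no signal to correlate with.
        return 0.0
    return numerator / denominator


def pick_best_candidate(candidate_activations, e, absolute=False):
    """Return (index, scores) for the candidate with the highest score."""
    scores = [candidate_score(h, e, absolute) for h in candidate_activations]
    best = max(range(len(scores)), key=lambda i: scores[i])
    return best, scores


# Toy data: four training patterns, one output neuron.
errors = [[1.0], [2.0], [3.0], [4.0]]
candidates = [[1.0, -1.0, 1.0, -1.0],   # weakly related to the error
              [1.0, 2.0, 3.0, 4.0]]     # tracks the error exactly
best_index, scores = pick_best_candidate(candidates, errors)
```

In this toy case the second candidate is selected because its activation follows the residual error most closely. In a full CC implementation the candidate weights would first be trained to maximize this score before the selection is made.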