Because training an MLP takes a very long time, parallelizing the algorithm is worthwhile. There are two fundamentally different methods for training an MLP, each with its own consequences for parallelization: on-line training and batch learning.
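The distinction can be illustrated with a minimal sketch (an illustrative assumption, not the author's implementation): a single linear neuron trained with squared error. In on-line training the weights are updated after every example, so the updates form a sequential chain; in batch learning the per-example gradients are only accumulated, and that accumulation is a sum that can be split across processors. The data, learning rate, and model below are hypothetical.

```python
def online_epoch(w, b, data, lr=0.1):
    """On-line training: update the weights after every single example.
    Each update depends on the previous one, which hinders parallelization."""
    for x, t in data:
        y = w * x + b          # forward pass of the linear neuron
        err = y - t            # error for this one example
        w -= lr * err * x      # immediate gradient step
        b -= lr * err
    return w, b

def batch_epoch(w, b, data, lr=0.1):
    """Batch learning: accumulate the gradient over all examples, then
    apply one combined update. The accumulation loop is a plain sum,
    so the examples could be distributed over several processors."""
    gw = gb = 0.0
    for x, t in data:
        y = w * x + b
        err = y - t
        gw += err * x          # sum of per-example gradients
        gb += err
    n = len(data)
    w -= lr * gw / n           # single update at the end of the epoch
    b -= lr * gb / n
    return w, b

# Toy data whose targets follow t = 2x + 1 (illustrative only).
data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]
w = b = 0.0
for _ in range(1000):
    w, b = batch_epoch(w, b, data)
```

Both schemes converge to the same weights on this toy problem; the point is that the batch version exposes a data-parallel sum, which is what the parallelization discussed here exploits.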