[Demo panels: Last input; Last prediction (Standard Classifier); Last prediction (Randomized Priors); Last prediction (SWA-Gaussian)]
This project compares different methods for approximating the predictive posterior distribution of a neural network. A standard, basic classifier was trained on MNIST and achieved an accuracy of 95.8%. The network is intentionally small so that multiple models can be run at once; its structure is visible here.

Techniques from the papers A Simple Baseline for Bayesian Uncertainty in Deep Learning (SWA-Gaussian) and Randomized Prior Functions for Deep Reinforcement Learning were then implemented in PyTorch and exported to ONNX. Each of these classifiers makes its prediction by drawing 20 networks from the approximate "posterior" and averaging their outputs. The implementations are available here, and the network structures can be explored with Netron for Randomized Priors and SWA-Gaussian.

The SWA-Gaussian and Randomized Priors outputs include error bars corresponding to the standard deviation of the predictions from the 20 sampled models, which shows how confident the networks are in their predictions. Normally a Softmax layer would be used to normalize the outputs, but I chose instead to scale the outputs so that they sum to 1.0, which makes the standard deviations more interpretable (a small sketch of this averaging and scaling step is included at the end of this page).

To use the demo, draw a digit in the box with your mouse and see how the different techniques behave, especially on digits that are not clearly of one class.
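For reference, the sketch below shows roughly how the averaged prediction and error bars described above can be computed. It is a simplified illustration, not the project's actual code: the tiny, independently initialized classifiers only stand in for the 20 networks drawn from the SWA-Gaussian or Randomized Priors posterior, and a Softplus output layer is used only so that the sum-to-1.0 scaling is well defined.

    # Minimal sketch of the ensemble step: 20 sampled networks each produce an
    # output vector, the outputs are scaled to sum to 1.0 instead of applying
    # Softmax, the mean over models is the reported prediction, and the standard
    # deviation over models gives the error bars. Not the project's actual code.
    import torch
    import torch.nn as nn

    N_SAMPLES = 20

    def scale_to_one(out: torch.Tensor) -> torch.Tensor:
        # Scale each output vector so its entries sum to 1.0 (used in place of
        # Softmax, keeping the per-class standard deviations interpretable).
        # Assumes the raw outputs are non-negative.
        return out / out.sum(dim=-1, keepdim=True)

    @torch.no_grad()
    def ensemble_predict(models, x: torch.Tensor):
        # Stack one scaled prediction per sampled network, then reduce over models.
        per_model = torch.stack([scale_to_one(m(x)) for m in models])  # (S, B, 10)
        return per_model.mean(dim=0), per_model.std(dim=0)             # mean, error bars

    if __name__ == "__main__":
        # Stand-in for 20 draws from the approximate posterior: here just 20
        # independently initialized small classifiers with non-negative outputs.
        models = [nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10), nn.Softplus())
                  for _ in range(N_SAMPLES)]
        x = torch.rand(1, 1, 28, 28)  # a fake 28x28 "digit"
        mean, std = ensemble_predict(models, x)
        print(mean, std)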