diff --git a/.DS_Store b/.DS_Store
index ad0a671abfad4af6d42b24e7b4057108fecddd9c..b0c8a765a5d9a446e18ead874510ac99e3bf575e 100644
Binary files a/.DS_Store and b/.DS_Store differ
diff --git a/BLcourse4/bayesian_neural_networks_wine.ipynb b/BLcourse4/bayesian_neural_networks_wine.ipynb
new file mode 100644
index 0000000000000000000000000000000000000000..fe250fa5b8b584bc3b8c95fc615b524715b0bbfc
--- /dev/null
+++ b/BLcourse4/bayesian_neural_networks_wine.ipynb
@@ -0,0 +1,746 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "id": "bQufHvzxJZX1"
+   },
+   "source": [
+    "# Probabilistic Bayesian Neural Networks\n",
+    "\n",
+    "**Author:** [Khalid Salama](https://www.linkedin.com/in/khalid-salama-24403144/)<br>\n",
+    "**Date created:** 2021/01/15<br>\n",
+    "**Last modified:** 2021/01/15<br>\n",
+    "**Description:** Building probabilistic Bayesian neural network models with TensorFlow Probability."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "id": "d6RFQWP6JZX3"
+   },
+   "source": [
+    "## Introduction\n",
+    "\n",
+    "Taking a probabilistic approach to deep learning allows us to account for *uncertainty*,\n",
+    "so that models can assign lower levels of confidence to incorrect predictions.\n",
+    "Sources of uncertainty can be found in the data, due to measurement error or\n",
+    "noise in the labels, or in the model, due to insufficient data for\n",
+    "the model to learn effectively.\n",
+    "\n",
+    "\n",
+    "This example demonstrates how to build basic probabilistic Bayesian neural networks\n",
+    "to account for these two types of uncertainty.\n",
+    "We use the [TensorFlow Probability](https://www.tensorflow.org/probability) library,\n",
+    "which is compatible with the Keras API.\n",
+    "\n",
+    "This example requires TensorFlow 2.3 or higher.\n",
+    "You can install TensorFlow Probability using the following command:\n",
+    "\n",
+    "```python\n",
+    "pip install tensorflow-probability\n",
+    "```"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "id": "7upK2dweJZX4"
+   },
+   "source": [
+    "## The dataset\n",
+    "\n",
+    "We use the [Wine Quality](https://archive.ics.uci.edu/ml/datasets/wine+quality)\n",
+    "dataset, which is available in the [TensorFlow Datasets](https://www.tensorflow.org/datasets/catalog/wine_quality) catalog.\n",
+    "We use the white wine subset (the default TFDS config), which contains 4,898 examples.\n",
+    "The dataset has 11 numerical physicochemical features of the wine, and the task\n",
+    "is to predict the wine quality, which is a score between 0 and 10.\n",
+    "In this example, we treat this as a regression task.\n",
+    "\n",
+    "You can install TensorFlow Datasets using the following command:\n",
+    "\n",
+    "```python\n",
+    "pip install tensorflow-datasets\n",
+    "```"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "id": "Rg3x9SPVJZX4"
+   },
+   "source": [
+    "## Setup"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 2,
+   "metadata": {
+    "id": "SZ8HIsVcJZX5",
+    "outputId": "d986c301-f9f1-4982-b0cf-91571dfd9b2f",
+    "colab": {
+     "base_uri": "https://localhost:8080/"
+    }
+   },
+   "outputs": [
+    {
+     "output_type": "stream",
+     "name": "stdout",
+     "text": [
+      "Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/\n",
+      "Requirement already satisfied: tensorflow-probability in /usr/local/lib/python3.9/dist-packages (0.19.0)\n",
+      "Requirement already satisfied: six>=1.10.0 in /usr/local/lib/python3.9/dist-packages (from tensorflow-probability) (1.16.0)\n",
+      "Requirement already satisfied: cloudpickle>=1.3 in 
/usr/local/lib/python3.9/dist-packages (from tensorflow-probability) (2.2.1)\n", + "Requirement already satisfied: numpy>=1.13.3 in /usr/local/lib/python3.9/dist-packages (from tensorflow-probability) (1.22.4)\n", + "Requirement already satisfied: gast>=0.3.2 in /usr/local/lib/python3.9/dist-packages (from tensorflow-probability) (0.4.0)\n", + "Requirement already satisfied: absl-py in /usr/local/lib/python3.9/dist-packages (from tensorflow-probability) (1.4.0)\n", + "Requirement already satisfied: decorator in /usr/local/lib/python3.9/dist-packages (from tensorflow-probability) (4.4.2)\n", + "Requirement already satisfied: dm-tree in /usr/local/lib/python3.9/dist-packages (from tensorflow-probability) (0.1.8)\n", + "Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/\n", + "Requirement already satisfied: tensorflow-datasets in /usr/local/lib/python3.9/dist-packages (4.8.3)\n", + "Requirement already satisfied: numpy in /usr/local/lib/python3.9/dist-packages (from tensorflow-datasets) (1.22.4)\n", + "Requirement already satisfied: termcolor in /usr/local/lib/python3.9/dist-packages (from tensorflow-datasets) (2.2.0)\n", + "Requirement already satisfied: tensorflow-metadata in /usr/local/lib/python3.9/dist-packages (from tensorflow-datasets) (1.12.0)\n", + "Requirement already satisfied: click in /usr/local/lib/python3.9/dist-packages (from tensorflow-datasets) (8.1.3)\n", + "Requirement already satisfied: dm-tree in /usr/local/lib/python3.9/dist-packages (from tensorflow-datasets) (0.1.8)\n", + "Requirement already satisfied: wrapt in /usr/local/lib/python3.9/dist-packages (from tensorflow-datasets) (1.15.0)\n", + "Requirement already satisfied: tqdm in /usr/local/lib/python3.9/dist-packages (from tensorflow-datasets) (4.65.0)\n", + "Requirement already satisfied: toml in /usr/local/lib/python3.9/dist-packages (from tensorflow-datasets) (0.10.2)\n", + "Requirement already satisfied: etils[enp,epath]>=0.9.0 in /usr/local/lib/python3.9/dist-packages (from tensorflow-datasets) (1.1.1)\n", + "Requirement already satisfied: promise in /usr/local/lib/python3.9/dist-packages (from tensorflow-datasets) (2.3)\n", + "Requirement already satisfied: psutil in /usr/local/lib/python3.9/dist-packages (from tensorflow-datasets) (5.9.4)\n", + "Requirement already satisfied: requests>=2.19.0 in /usr/local/lib/python3.9/dist-packages (from tensorflow-datasets) (2.27.1)\n", + "Requirement already satisfied: absl-py in /usr/local/lib/python3.9/dist-packages (from tensorflow-datasets) (1.4.0)\n", + "Requirement already satisfied: protobuf>=3.12.2 in /usr/local/lib/python3.9/dist-packages (from tensorflow-datasets) (3.19.6)\n", + "Requirement already satisfied: typing_extensions in /usr/local/lib/python3.9/dist-packages (from etils[enp,epath]>=0.9.0->tensorflow-datasets) (4.5.0)\n", + "Requirement already satisfied: importlib_resources in /usr/local/lib/python3.9/dist-packages (from etils[enp,epath]>=0.9.0->tensorflow-datasets) (5.12.0)\n", + "Requirement already satisfied: zipp in /usr/local/lib/python3.9/dist-packages (from etils[enp,epath]>=0.9.0->tensorflow-datasets) (3.15.0)\n", + "Requirement already satisfied: urllib3<1.27,>=1.21.1 in /usr/local/lib/python3.9/dist-packages (from requests>=2.19.0->tensorflow-datasets) (1.26.15)\n", + "Requirement already satisfied: charset-normalizer~=2.0.0 in /usr/local/lib/python3.9/dist-packages (from requests>=2.19.0->tensorflow-datasets) (2.0.12)\n", + "Requirement already satisfied: idna<4,>=2.5 in 
/usr/local/lib/python3.9/dist-packages (from requests>=2.19.0->tensorflow-datasets) (3.4)\n",
+      "Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.9/dist-packages (from requests>=2.19.0->tensorflow-datasets) (2022.12.7)\n",
+      "Requirement already satisfied: six in /usr/local/lib/python3.9/dist-packages (from promise->tensorflow-datasets) (1.16.0)\n",
+      "Requirement already satisfied: googleapis-common-protos<2,>=1.52.0 in /usr/local/lib/python3.9/dist-packages (from tensorflow-metadata->tensorflow-datasets) (1.59.0)\n"
+     ]
+    }
+   ],
+   "source": [
+    "!pip install tensorflow-probability\n",
+    "!pip install tensorflow-datasets\n",
+    "import numpy as np\n",
+    "import tensorflow as tf\n",
+    "from tensorflow import keras\n",
+    "from tensorflow.keras import layers\n",
+    "import tensorflow_datasets as tfds\n",
+    "import tensorflow_probability as tfp"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "id": "r_fw7qGhJZX5"
+   },
+   "source": [
+    "## Create training and evaluation datasets\n",
+    "\n",
+    "Here, we load the `wine_quality` dataset using `tfds.load()`, and we convert\n",
+    "the target feature to float. Then, we shuffle the dataset and split it into\n",
+    "training and test sets. We take the first `train_size` examples as the train\n",
+    "split, and the rest as the test split."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 3,
+   "metadata": {
+    "id": "47HHMmwvJZX6"
+   },
+   "outputs": [],
+   "source": [
+    "\n",
+    "def get_train_and_test_splits(train_size, batch_size=1):\n",
+    "    # We prefetch with a buffer the same size as the dataset because the dataset\n",
+    "    # is very small and fits into memory.\n",
+    "    # Note: `dataset_size` is a global that is defined further below, before\n",
+    "    # this function is first called.\n",
+    "    dataset = (\n",
+    "        tfds.load(name=\"wine_quality\", as_supervised=True, split=\"train\")\n",
+    "        .map(lambda x, y: (x, tf.cast(y, tf.float32)))\n",
+    "        .prefetch(buffer_size=dataset_size)\n",
+    "        .cache()\n",
+    "    )\n",
+    "    # We shuffle with a buffer the same size as the dataset.\n",
+    "    train_dataset = (\n",
+    "        dataset.take(train_size).shuffle(buffer_size=train_size).batch(batch_size)\n",
+    "    )\n",
+    "    test_dataset = dataset.skip(train_size).batch(batch_size)\n",
+    "\n",
+    "    return train_dataset, test_dataset\n"
+   ]
+  },
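+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "As a quick sanity check (a minimal sketch, not part of the original pipeline,\n",
+    "and assuming the dataset download succeeds in the current environment), we can\n",
+    "peek at a single `(features, target)` pair to see the dictionary of named\n",
+    "features that the model inputs will consume."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Illustrative only: inspect one raw example from the dataset.\n",
+    "raw_dataset = tfds.load(name=\"wine_quality\", as_supervised=True, split=\"train\")\n",
+    "features, target = next(iter(raw_dataset.take(1)))\n",
+    "print(sorted(features.keys()))  # the 11 physicochemical feature names\n",
+    "print(float(target))  # the quality score to predict\n"
+   ]
+  },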
acidity\",\n", + " \"volatile acidity\",\n", + " \"citric acid\",\n", + " \"residual sugar\",\n", + " \"chlorides\",\n", + " \"free sulfur dioxide\",\n", + " \"total sulfur dioxide\",\n", + " \"density\",\n", + " \"pH\",\n", + " \"sulphates\",\n", + " \"alcohol\",\n", + "]\n", + "\n", + "\n", + "def create_model_inputs():\n", + " inputs = {}\n", + " for feature_name in FEATURE_NAMES:\n", + " inputs[feature_name] = layers.Input(\n", + " name=feature_name, shape=(1,), dtype=tf.float32\n", + " )\n", + " return inputs\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "RyCIq6FlJZX7" + }, + "source": [ + "## Experiment 1: standard neural network\n", + "\n", + "We create a standard deterministic neural network model as a baseline." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "NmMnI_LzJZX7" + }, + "outputs": [], + "source": [ + "\n", + "def create_baseline_model():\n", + " inputs = create_model_inputs()\n", + " input_values = [value for _, value in sorted(inputs.items())]\n", + " features = keras.layers.concatenate(input_values)\n", + " features = layers.BatchNormalization()(features)\n", + "\n", + " # Create hidden layers with deterministic weights using the Dense layer.\n", + " for units in hidden_units:\n", + " features = layers.Dense(units, activation=\"sigmoid\")(features)\n", + " # The output is deterministic: a single point estimate.\n", + " outputs = layers.Dense(units=1)(features)\n", + "\n", + " model = keras.Model(inputs=inputs, outputs=outputs)\n", + " return model\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "bV0P5F9FJZX7" + }, + "source": [ + "Let's split the wine dataset into training and test sets, with 85% and 15% of\n", + "the examples, respectively." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "j71HZHqEJZX7" + }, + "outputs": [], + "source": [ + "dataset_size = 4898\n", + "batch_size = 256\n", + "train_size = int(dataset_size * 0.85)\n", + "train_dataset, test_dataset = get_train_and_test_splits(train_size, batch_size)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "mwPeGoCTJZX8" + }, + "source": [ + "Now let's train the baseline model. We use the `MeanSquaredError`\n", + "as the loss function." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "pAMJafENJZX8" + }, + "outputs": [], + "source": [ + "num_epochs = 100\n", + "mse_loss = keras.losses.MeanSquaredError()\n", + "baseline_model = create_baseline_model()\n", + "run_experiment(baseline_model, mse_loss, train_dataset, test_dataset)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "SYVUafCKJZX8" + }, + "source": [ + "We take a sample from the test set use the model to obtain predictions for them.\n", + "Note that since the baseline model is deterministic, we get a single a\n", + "*point estimate* prediction for each test example, with no information about the\n", + "uncertainty of the model nor the prediction." 
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "id": "eRZ-K9-yJZX8"
+   },
+   "outputs": [],
+   "source": [
+    "# Define the prior weight distribution as a Normal with mean=0 and stddev=1.\n",
+    "# Note that, in this example, the prior distribution is not trainable,\n",
+    "# as we fix its parameters.\n",
+    "def prior(kernel_size, bias_size, dtype=None):\n",
+    "    n = kernel_size + bias_size\n",
+    "    prior_model = keras.Sequential(\n",
+    "        [\n",
+    "            tfp.layers.DistributionLambda(\n",
+    "                lambda t: tfp.distributions.MultivariateNormalDiag(\n",
+    "                    loc=tf.zeros(n), scale_diag=tf.ones(n)\n",
+    "                )\n",
+    "            )\n",
+    "        ]\n",
+    "    )\n",
+    "    return prior_model\n",
+    "\n",
+    "\n",
+    "# Define the variational posterior weight distribution as a multivariate Gaussian.\n",
+    "# Note that the learnable parameters for this distribution are the means,\n",
+    "# variances, and covariances.\n",
+    "def posterior(kernel_size, bias_size, dtype=None):\n",
+    "    n = kernel_size + bias_size\n",
+    "    posterior_model = keras.Sequential(\n",
+    "        [\n",
+    "            tfp.layers.VariableLayer(\n",
+    "                tfp.layers.MultivariateNormalTriL.params_size(n), dtype=dtype\n",
+    "            ),\n",
+    "            tfp.layers.MultivariateNormalTriL(n),\n",
+    "        ]\n",
+    "    )\n",
+    "    return posterior_model\n"
+   ]
+  },
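+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "As a quick sanity check (illustrative, not part of the tutorial), we can count\n",
+    "the learnable parameters of this full-covariance posterior: for `n` weights it\n",
+    "needs `n` means plus the `n * (n + 1) / 2` entries of a lower-triangular scale\n",
+    "matrix. The value `n = 4` below is an arbitrary example."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "n = 4  # e.g. a hypothetical layer slice with 3 kernel weights + 1 bias\n",
+    "# Expected: 4 means + 10 lower-triangular entries = 14 parameters.\n",
+    "print(tfp.layers.MultivariateNormalTriL.params_size(n))\n"
+   ]
+  },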
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "id": "1oyFZyk6JZX9"
+   },
+   "source": [
+    "We use the `tfp.layers.DenseVariational` layer instead of the standard\n",
+    "`keras.layers.Dense` layer in the neural network model."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "id": "EtdF-vuJJZX9"
+   },
+   "outputs": [],
+   "source": [
+    "\n",
+    "def create_bnn_model(train_size):\n",
+    "    inputs = create_model_inputs()\n",
+    "    features = keras.layers.concatenate(list(inputs.values()))\n",
+    "    features = layers.BatchNormalization()(features)\n",
+    "\n",
+    "    # Create hidden layers with weight uncertainty using the DenseVariational layer.\n",
+    "    for units in hidden_units:\n",
+    "        features = tfp.layers.DenseVariational(\n",
+    "            units=units,\n",
+    "            make_prior_fn=prior,\n",
+    "            make_posterior_fn=posterior,\n",
+    "            kl_weight=1 / train_size,\n",
+    "            activation=\"sigmoid\",\n",
+    "        )(features)\n",
+    "\n",
+    "    # The output is deterministic: a single point estimate.\n",
+    "    outputs = layers.Dense(units=1)(features)\n",
+    "    model = keras.Model(inputs=inputs, outputs=outputs)\n",
+    "    return model\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "id": "7mWuwXVoJZX9"
+   },
+   "source": [
+    "The epistemic uncertainty can be reduced as we increase the size of the\n",
+    "training data. That is, the more data the BNN model sees, the more certain it is\n",
+    "about its estimates for the weights (distribution parameters).\n",
+    "Let's test this behaviour by training the BNN model on a small subset of\n",
+    "the training set, and then on the full training set, to compare the output variances."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "id": "qVqiPpv3JZX9"
+   },
+   "source": [
+    "### Train BNN with a small training subset."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "id": "_FTuHpudJZX9"
+   },
+   "outputs": [],
+   "source": [
+    "num_epochs = 500\n",
+    "train_sample_size = int(train_size * 0.3)\n",
+    "small_train_dataset = train_dataset.unbatch().take(train_sample_size).batch(batch_size)\n",
+    "\n",
+    "bnn_model_small = create_bnn_model(train_sample_size)\n",
+    "run_experiment(bnn_model_small, mse_loss, small_train_dataset, test_dataset)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "id": "fO-VvjydJZX9"
+   },
+   "source": [
+    "Since we have trained a BNN model, the model produces a different output each time\n",
+    "we call it with the same input, because each time a new set of weights is sampled\n",
+    "from the distributions to construct the network and produce an output.\n",
+    "The less certain the model weights are, the more variability (a wider range) we\n",
+    "will see in the outputs for the same inputs."
+   ]
+  },
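+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "As a quick illustration of this stochasticity (using the `bnn_model_small` and\n",
+    "`examples` defined above), two consecutive calls on the same examples generally\n",
+    "return different predictions:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Each call samples a fresh set of weights from the learned posteriors,\n",
+    "# so the two outputs below generally differ for the very same input.\n",
+    "print(float(bnn_model_small(examples).numpy()[0][0]))\n",
+    "print(float(bnn_model_small(examples).numpy()[0][0]))\n"
+   ]
+  },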
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "id": "oDLpALs7JZX-"
+   },
+   "outputs": [],
+   "source": [
+    "\n",
+    "def compute_predictions(model, iterations=100):\n",
+    "    predicted = []\n",
+    "    for _ in range(iterations):\n",
+    "        predicted.append(model(examples).numpy())\n",
+    "    predicted = np.concatenate(predicted, axis=1)\n",
+    "\n",
+    "    prediction_mean = np.mean(predicted, axis=1).tolist()\n",
+    "    prediction_min = np.min(predicted, axis=1).tolist()\n",
+    "    prediction_max = np.max(predicted, axis=1).tolist()\n",
+    "    prediction_range = (np.max(predicted, axis=1) - np.min(predicted, axis=1)).tolist()\n",
+    "\n",
+    "    for idx in range(sample):\n",
+    "        print(\n",
+    "            f\"Predictions mean: {round(prediction_mean[idx], 2)}, \"\n",
+    "            f\"min: {round(prediction_min[idx], 2)}, \"\n",
+    "            f\"max: {round(prediction_max[idx], 2)}, \"\n",
+    "            f\"range: {round(prediction_range[idx], 2)} - \"\n",
+    "            f\"Actual: {targets[idx]}\"\n",
+    "        )\n",
+    "\n",
+    "\n",
+    "compute_predictions(bnn_model_small)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "id": "Mh1cMJ1VJZX-"
+   },
+   "source": [
+    "### Train BNN with the whole training set."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "id": "_e8b8DCJJZX-"
+   },
+   "outputs": [],
+   "source": [
+    "num_epochs = 500\n",
+    "bnn_model_full = create_bnn_model(train_size)\n",
+    "run_experiment(bnn_model_full, mse_loss, train_dataset, test_dataset)\n",
+    "\n",
+    "compute_predictions(bnn_model_full)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "id": "Zd0ET93XJZX-"
+   },
+   "source": [
+    "Notice that the model trained with the full training dataset shows a smaller range\n",
+    "(uncertainty) in the prediction values for the same inputs, compared to the model\n",
+    "trained with a subset of the training dataset."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "id": "2b2sbxYRJZX-"
+   },
+   "source": [
+    "## Experiment 3: probabilistic Bayesian neural network\n",
+    "\n",
+    "So far, the output of the standard and the Bayesian NN models that we built is\n",
+    "deterministic, that is, it produces a point estimate as a prediction for a given example.\n",
+    "We can create a probabilistic NN by letting the model output a distribution.\n",
+    "In this case, the model captures the *aleatoric uncertainty* as well,\n",
+    "which is due to irreducible noise in the data, or to the stochastic nature of the\n",
+    "process generating the data.\n",
+    "\n",
+    "In this example, we model the output as an `IndependentNormal` distribution,\n",
+    "with learnable mean and variance parameters. If the task were classification,\n",
+    "we would have used `IndependentBernoulli` for binary classes, and `OneHotCategorical`\n",
+    "for multiple classes, to model the distribution of the model output."
+   ]
+  },
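+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "To see what this output layer does in isolation, here is a minimal sketch\n",
+    "(illustrative only): `IndependentNormal(1)` consumes two parameters per example\n",
+    "- a location and a raw scale that the layer maps through a positivity\n",
+    "transform - and returns a distribution object rather than a plain tensor.\n",
+    "The parameter values below are made up."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Hypothetical distribution parameters for one example: (loc, raw scale).\n",
+    "params = tf.constant([[0.0, 1.0]])\n",
+    "dist = tfp.layers.IndependentNormal(1)(params)\n",
+    "# The result behaves like a distribution: it has a mean, stddev, and log_prob.\n",
+    "print(dist.mean().numpy(), dist.stddev().numpy())\n"
+   ]
+  },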
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "id": "9nzDZQedJZX_"
+   },
+   "outputs": [],
+   "source": [
+    "\n",
+    "def create_probabilistic_bnn_model(train_size):\n",
+    "    inputs = create_model_inputs()\n",
+    "    features = keras.layers.concatenate(list(inputs.values()))\n",
+    "    features = layers.BatchNormalization()(features)\n",
+    "\n",
+    "    # Create hidden layers with weight uncertainty using the DenseVariational layer.\n",
+    "    for units in hidden_units:\n",
+    "        features = tfp.layers.DenseVariational(\n",
+    "            units=units,\n",
+    "            make_prior_fn=prior,\n",
+    "            make_posterior_fn=posterior,\n",
+    "            kl_weight=1 / train_size,\n",
+    "            activation=\"sigmoid\",\n",
+    "        )(features)\n",
+    "\n",
+    "    # Create a probabilistic output (Normal distribution), and use the `Dense` layer\n",
+    "    # to produce the parameters of the distribution.\n",
+    "    # We set units=2 to learn both the mean and the variance of the Normal distribution.\n",
+    "    distribution_params = layers.Dense(units=2)(features)\n",
+    "    outputs = tfp.layers.IndependentNormal(1)(distribution_params)\n",
+    "\n",
+    "    model = keras.Model(inputs=inputs, outputs=outputs)\n",
+    "    return model\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "id": "qUFOwXTyJZX_"
+   },
+   "source": [
+    "Since the output of the model is a distribution, rather than a point estimate,\n",
+    "we use the [negative log-likelihood](https://en.wikipedia.org/wiki/Likelihood_function)\n",
+    "as our loss function to compute how likely it is to see the true data (targets)\n",
+    "under the estimated distribution produced by the model."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "id": "5SdzlMuvJZYA"
+   },
+   "outputs": [],
+   "source": [
+    "\n",
+    "def negative_loglikelihood(targets, estimated_distribution):\n",
+    "    return -estimated_distribution.log_prob(targets)\n",
+    "\n",
+    "\n",
+    "num_epochs = 1000\n",
+    "prob_bnn_model = create_probabilistic_bnn_model(train_size)\n",
+    "run_experiment(prob_bnn_model, negative_loglikelihood, train_dataset, test_dataset)"
+   ]
+  },
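+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "To build intuition for this loss, here is a small worked example (not part of\n",
+    "the pipeline): for a unit-width Normal centred at 5.0, observing a target of\n",
+    "6.0 gives a negative log-likelihood of 0.5 * ln(2 * pi) + 0.5 * (6 - 5)^2,\n",
+    "which is about 1.419. The further the target lies from the predicted\n",
+    "distribution, the larger the loss."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Illustrative: NLL of observing y = 6.0 under N(loc=5.0, scale=1.0).\n",
+    "example_dist = tfp.distributions.Normal(loc=5.0, scale=1.0)\n",
+    "# Analytically: 0.5 * ln(2 * pi) + 0.5 * (6 - 5)^2 ~= 1.419\n",
+    "print(negative_loglikelihood(6.0, example_dist).numpy())\n"
+   ]
+  },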
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "id": "53qcYDKyJZYA"
+   },
+   "source": [
+    "Now let's produce an output from the model given the test examples.\n",
+    "The output is now a distribution, and we can use its mean and variance\n",
+    "to compute the confidence intervals (CI) of the prediction."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "id": "Dv_RwLyFJZYA"
+   },
+   "outputs": [],
+   "source": [
+    "prediction_distribution = prob_bnn_model(examples)\n",
+    "prediction_mean = prediction_distribution.mean().numpy()\n",
+    "prediction_stdv = prediction_distribution.stddev().numpy()\n",
+    "\n",
+    "# The 95% CI is computed as mean ± (1.96 * stddev), where 1.96 is the\n",
+    "# 97.5th percentile of the standard Normal distribution.\n",
+    "upper = (prediction_mean + (1.96 * prediction_stdv)).tolist()\n",
+    "lower = (prediction_mean - (1.96 * prediction_stdv)).tolist()\n",
+    "prediction_mean = prediction_mean.tolist()\n",
+    "prediction_stdv = prediction_stdv.tolist()\n",
+    "\n",
+    "for idx in range(sample):\n",
+    "    print(\n",
+    "        f\"Prediction mean: {round(prediction_mean[idx][0], 2)}, \"\n",
+    "        f\"stddev: {round(prediction_stdv[idx][0], 2)}, \"\n",
+    "        f\"95% CI: [{round(lower[idx][0], 2)} - {round(upper[idx][0], 2)}]\"\n",
+    "        f\" - Actual: {targets[idx]}\"\n",
+    "    )"
+   ]
+  }
+ ],
+ "metadata": {
+  "accelerator": "GPU",
+  "colab": {
+   "name": "bayesian_neural_networks_wine",
+   "provenance": [],
+   "toc_visible": true
+  },
+  "kernelspec": {
+   "display_name": "Python 3",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.7.0"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 0
+}
\ No newline at end of file
diff --git a/slides/BLcourse4.key b/slides/BLcourse4.key
index fb30c03e2b75c733b4883069300a78f86bd24779..35b87219b6589e1555df9cc7b52be339795d2b23 100755
Binary files a/slides/BLcourse4.key and b/slides/BLcourse4.key differ