
Update for MEAP v6.

Eli Stevens 6 years ago
parent
commit
7c92922760

BIN
data/p1ch6/birds_vs_airplanes.pt


The diff is not shown because the file is too large.
+ 0 - 0
p1ch2/2_pre_trained_networks.ipynb


+ 86 - 0
p1ch4/4_audio_chirp.ipynb

@@ -1,5 +1,13 @@
 {
  "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Audio\n",
+    "===="
+   ]
+  },
   {
    "cell_type": "code",
    "execution_count": 1,
@@ -11,6 +19,23 @@
     "torch.set_printoptions(edgeitems=2, threshold=50)"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Sound can be seen as fluctuations of pressure of a medium, air for instance, at a certain location in time. There are other representations that we'll get into in a minute, but we can think about this as the _raw_, time-domain representation. In order for the human ear to appreciate sound, pressure must fluctuate with a frequency between 20 and 20000 oscillations per second (measured in Hertz, Hz). More oscillations per second will lead to a higher perceived pitch.\n",
+    "\n",
+    "By recording pressure fluctuations in time using a microphone and converting every pressure level at each time point into a number (e.g. a 16-bit integer), we can now represent sound as a vector of numbers. This is known as Pulse Code Modulation (PCM), where a continuous signal is both sampled in time and quantized in amplitude. If we want to make sure we hear the highest possible pitch in a recording, we'll have to record our samples at slightly more than twice the maximum audible frequency, i.e. just over 40000 times per second. It is not by chance that audio CD's have a sampling frequency of 44100 Hz. This means that a one-hour stereo (i.e. 2 channels) CD track where samples are recorded at 16-bit precision will amount to `2 * 16 * 44100 * 3600 = 5080320000 bit = 605.6 MB` if stored without compression.\n",
+    "\n",
+    "There are a plethora of audio formats, WAV, AIFF, MP3, AAC being the most popular, where raw audio signals are typically encoded in compressed form by leveraging on both correlation between successive samples in the time series, between the two stereo channels as well as elimination of barely audible frequencies. This can result in dramatic reduction of storage requirements (a one-hour audio file in AAC format takes less than 60 MB). In addition, audio players can decode these formats on the fly on dedicated hardware, consuming a tiny amount of power.\n",
+    "\n",
+    "In our data scientist role we may have to feed audio samples to our network and classify them, or generate captions, for instance. In that case, we won't work with compressed data, rather we'll have to find a way to load an audio file in some format and lay it out as an uncompressed time series in a tensor. Let's do that now.\n",
+    "\n",
+    "We can download a fair number of environmental sounds at the ESC-50 repository (https://github.com/karoldvl/ESC-50) in the `audio` directory. Let's get `1-100038-A-14.wav` for instance, containing the sound of a bird chirping.\n",
+    "\n",
+    "In order to load the sound we resort to SciPy, specifically `scipy.io.wavfile.read`, which has the nice property to return data as a NumPy array:"
+   ]
+  },
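+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# A minimal sketch of that call; the path to the downloaded ESC-50 clip\n",
+    "# is an assumption - adjust it to wherever the file actually lives.\n",
+    "import scipy.io.wavfile\n",
+    "freq, waveform_arr = scipy.io.wavfile.read('1-100038-A-14.wav')"
+   ]
+  },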
   {
    "cell_type": "code",
    "execution_count": 2,
@@ -34,6 +59,15 @@
     "freq, waveform_arr"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "The `read` function returns two outputs, namely the sampling frequency and the waveform as a 16-bit integer 1D array. It's a single 1D array, which tells us that it's a mono recording - we'd have two waveforms (two channels) if the sound were stereo.\n",
+    "\n",
+    "We can convert the array to a tensor and we're good to go. We might also want to convert the waveform tensor to a float tensor since we're at it."
+   ]
+  },
   {
    "cell_type": "code",
    "execution_count": 3,
@@ -55,6 +89,31 @@
     "waveform.shape"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "In a typical dataset, we'll have more than one waveform, and possibly over more than one channel. Depending on the kind of network employed for carrying out a task, for instance a sound classification task, we would be required to lay out the tensor in one of two ways.\n",
+    "\n",
+    "For architectures based on filtering the 1D signal with cascades of learned filter banks, such as convolutional networks, we would need to lay out the tensor as `N x C x L`, where `N` is the number of sounds in a dataset, `C` the number of channels and `L` the number of samples in time.\n",
+    "\n",
+    "Conversely, for architectures that incorporate the notion of temporal sequences, just as recurrent networks we mentioned for text, data needs to be laid out as `L x N x C` - sequence length comes first. Intuitively, this is because the latter architectures take one set of `C` values at a time - the signal is not considered as a whole, but as an individual input changing in time.\n",
+    "\n",
+    "Although the most straightforward, this is only one of the ways to represent audio so that it is digestible by a neural network. Anther way is turning the audio signal into a _spectrogram_.\n",
+    "\n",
+    "Instead of representing oscillations explicitly in time, we can characterize what at frequencies those oscillations occur for short time intervals. So, for instance, if we pluck the fifth string of our (hopefully tuned) guitar and we focus on 0.1 seconds of that recording, we will see that the waveform oscillates at 440 cycles per second, plus smaller spurious oscillations at different frequencies that make up the timbre of the sound. If we move on to subsequent 0.1 second intervals, we now see that the frequency content doesn't change, but the intensity does, as the sound of our string fades. If we now decide to pluck another string, we will observe new frequencies fading in time.\n",
+    "\n",
+    "We could indeed build a plot having time in the X-axis, frequencies heard at that time in the Y-axis and encode intensity of those frequencies as a value at that X and Y. Or color. Ok, that starts to look like an image, right?\n",
+    "\n",
+    "That's correct, spectrograms are a representation of the intensity at each frequency at each point in time. It turns out that one can train convolutional neural networks built for analyzing images (we'll see about those in a couple of chapters) on sound represented as a spectrogram.\n",
+    "\n",
+    "Let's see how we can turn the sound we loaded earlier into a spectrogram. To do that, we need to resort to a method for converting a signal in the time domain into its frequency content. This is known as the Fourier transform, and the algorithm that allows us to compute it efficiently is the Fast Fourier Trasform (FFT), which is one of the most widespread algorithms out there. If we do that consecutively for short bursts of sound in time, we can build out spectrogram column by column.\n",
+    "\n",
+    "This is the general idea and we won't go into too many details here. Luckily for us SciPy has a function that gets us a shiny spectrogram given an input waveform. We import the `signal` module from SciPy,\n",
+    "then provide the `spectrogram` function with the waveform and the sampling frequency that we got previously.\n",
+    "The return values are all NumPy arrays, namely frequency `f_arr` (values along the Y axis), time `t_arr` (values along the X axis) and the actual spectrogra `sp_arr` as a 2D array. Turning the latter into a PyTorch tensor is trivial:\n"
+   ]
+  },
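+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# A minimal sketch of those calls (exact arguments assumed):\n",
+    "from scipy import signal\n",
+    "f_arr, t_arr, sp_arr = signal.spectrogram(waveform_arr, freq)\n",
+    "sp_mono = torch.from_numpy(sp_arr)\n",
+    "sp_mono.shape"
+   ]
+  },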
   {
    "cell_type": "code",
    "execution_count": 4,
@@ -80,6 +139,16 @@
     "sp_mono.shape"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "\n",
+    "Dimensions are `F x T`, where `F` is frequency and `T` is time.\n",
+    "\n",
+    "As we mentioned earlier, stereo sound has two channels, which will lead to a two-channel spectrogram. Suppose we have two spectrograms, one for each channel. We can convert the two channels separately:\n"
+   ]
+  },
   {
    "cell_type": "code",
    "execution_count": 5,
@@ -103,6 +172,14 @@
     "sp_left_tensor.shape, sp_right_tensor.shape"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "\n",
+    "and stack the two tensors along the first dimension to obtain a two channels image of size `C x F x T`, where `C` is the number channels:"
+   ]
+  },
   {
    "cell_type": "code",
    "execution_count": 6,
@@ -123,6 +200,15 @@
     "sp_tensor = torch.stack((sp_left_tensor, sp_right_tensor), dim=0)\n",
     "sp_tensor.shape"
    ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "If we want to build a dataset to use as input for a network, we will stack multiple spectrograms representing multiple sounds in a dataset along the first dimension, leading to a `N x C x F x T` tensor.\n",
+    "\n",
+    "Such tensor is indistinguishable from what we would build for a dataset set of images, where `F` is represents rows and `T` columns of an image. Indeed, we would tackle a sound classification problem on spectrograms with the exact same networks."
+   ]
   }
  ],
  "metadata": {

+ 45 - 0
p1ch4/7_video_cockatoo.ipynb

@@ -1,5 +1,13 @@
 {
  "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Video\n",
+    "===="
+   ]
+  },
   {
    "cell_type": "code",
    "execution_count": 1,
@@ -11,6 +19,16 @@
     "torch.set_printoptions(edgeitems=2, threshold=50)"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "When it comes to the shape of tensors, video data can be seen as equivalent to volumetric data, with `depth` replaced by the `time` dimension. The result is again a 5D tensor with shape `N x C x T x H x W`.\n",
+    "\n",
+    "There are several formats for video, especially geared towards compression by exploiting redundancies in space and time. Luckily for us, `imageio` reads video data as well. Suppose we'd like to retain 100 consecutive frames in our 512 x 512 RBG video for classifying an action using a convolutional neural network. We first create a reader instance for the video, that will allow us to get information about the video and iterate over the frames in time.\n",
+    "Let's see what the meta data for the video looks like:"
+   ]
+  },
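+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# A minimal sketch; the file name here is an assumption for this example.\n",
+    "import imageio\n",
+    "reader = imageio.get_reader('cockatoo.mp4')\n",
+    "meta = reader.get_meta_data()\n",
+    "meta"
+   ]
+  },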
   {
    "cell_type": "code",
    "execution_count": 2,
@@ -41,6 +59,13 @@
     "meta"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "We now have all the information to size the tensor that will store the video frames:"
+   ]
+  },
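+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# A sketch under the stated assumptions: 3 channels, 100 frames, 512 x 512 pixels.\n",
+    "n_frames = 100\n",
+    "video = torch.empty(3, n_frames, 512, 512)  # C x T x H x W\n",
+    "video.shape"
+   ]
+  },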
   {
    "cell_type": "code",
    "execution_count": 3,
@@ -65,6 +90,14 @@
     "video.shape"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Now we just iterate over the reader and set the values for all three channels into in the proper `i`-th time slice.\n",
+    "This might take a few seconds to finish!"
+   ]
+  },
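+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# A sketch of the loop in full; the guard against writing past n_frames\n",
+    "# is our addition.\n",
+    "for i, frame_arr in enumerate(reader):\n",
+    "    if i >= n_frames:\n",
+    "        break\n",
+    "    frame = torch.from_numpy(frame_arr).float()  # H x W x C\n",
+    "    video[:, i] = torch.transpose(frame, 0, 2)"
+   ]
+  },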
   {
    "cell_type": "code",
    "execution_count": 4,
@@ -76,6 +109,18 @@
     "    video[:, i] = torch.transpose(frame, 0, 2)"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "In the above, we iterate over individual frames and set each frame in the `C x T x H x W` video tensor, after transposing the channel. We can then obtain a batch by stacking multiple 4D tensors or pre-allocating a 5D tensor with a known batch size and filling it iteratively, clip by clip, assuming clips are trimmed to a fixed number of frames.\n",
+    "\n",
+    "Equating video data to volumetric data is not the only way to represent video for training purposes. This is a valid strategy if we deal with video bursts of fixed length. An alternative strategy is to resort to network architectures capable of processing long sequences and exploiting short and long-term relationships in time, just like for text or audio.\n",
+    "// We'll see this kind of architectures when we take on recurrent networks.\n",
+    "\n",
+    "This next approach accounts for time along the batch dimension. Hence, we'll build our dataset as a 4D tensor, stacking frame by frame in the batch:\n"
+   ]
+  },
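+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# A sketch, assuming a fresh reader: stack frames along a new first dimension,\n",
+    "# so time plays the role of the batch in a T x C x H x W tensor.\n",
+    "reader = imageio.get_reader('cockatoo.mp4')  # file name assumed, as above\n",
+    "frame_list = []\n",
+    "for i, frame_arr in enumerate(reader):\n",
+    "    if i >= n_frames:\n",
+    "        break\n",
+    "    frame_list.append(torch.from_numpy(frame_arr).float().permute(2, 0, 1))  # C x H x W\n",
+    "video_flat = torch.stack(frame_list, dim=0)\n",
+    "video_flat.shape"
+   ]
+  },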
   {
    "cell_type": "code",
    "execution_count": 5,

The diff is not shown because the file is too large.
+ 451 - 0
p1ch5/1_parameter_estimation.ipynb


+ 195 - 0
p1ch5/2_autograd.ipynb

@@ -0,0 +1,195 @@
+{
+ "cells": [
+  {
+   "cell_type": "code",
+   "execution_count": 1,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "%matplotlib inline\n",
+    "import numpy as np\n",
+    "import torch\n",
+    "torch.set_printoptions(edgeitems=2)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 2,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "t_c = torch.tensor([0.5, 14.0, 15.0, 28.0, 11.0, 8.0, 3.0, -4.0, 6.0, 13.0, 21.0])\n",
+    "t_u = torch.tensor([35.7, 55.9, 58.2, 81.9, 56.3, 48.9, 33.9, 21.8, 48.4, 60.4, 68.4])\n",
+    "t_un = 0.1 * t_u"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 3,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "def model(t_u, w, b):\n",
+    "    return w * t_u + b"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 4,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "def loss_fn(t_p, t_c):\n",
+    "    squared_diffs = (t_p - t_c)**2\n",
+    "    return squared_diffs.mean()"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 5,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "params = torch.tensor([1.0, 0.0], requires_grad=True)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 6,
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "True"
+      ]
+     },
+     "execution_count": 6,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "params.grad is None"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 7,
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "tensor([4517.2969,   82.6000])"
+      ]
+     },
+     "execution_count": 7,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "loss = loss_fn(model(t_u, *params), t_c)\n",
+    "loss.backward()\n",
+    "\n",
+    "params.grad"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 8,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "if params.grad is not None:\n",
+    "    params.grad.zero_()"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 9,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "def training_loop(n_epochs, learning_rate, params, t_u, t_c):\n",
+    "    for epoch in range(1, n_epochs + 1):\n",
+    "        if params.grad is not None:  # <1>\n",
+    "            params.grad.zero_()\n",
+    "        \n",
+    "        t_p = model(t_u, *params) \n",
+    "        loss = loss_fn(t_p, t_c)\n",
+    "        loss.backward()\n",
+    "        \n",
+    "        params = (params - learning_rate * params.grad).detach().requires_grad_()\n",
+    "\n",
+    "        if epoch % 500 == 0:\n",
+    "            print('Epoch %d, Loss %f' % (epoch, float(loss)))\n",
+    "            \n",
+    "    return params"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 10,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "Epoch 500, Loss 7.860116\n",
+      "Epoch 1000, Loss 3.828538\n",
+      "Epoch 1500, Loss 3.092191\n",
+      "Epoch 2000, Loss 2.957697\n",
+      "Epoch 2500, Loss 2.933134\n",
+      "Epoch 3000, Loss 2.928648\n",
+      "Epoch 3500, Loss 2.927830\n",
+      "Epoch 4000, Loss 2.927679\n",
+      "Epoch 4500, Loss 2.927652\n",
+      "Epoch 5000, Loss 2.927647\n"
+     ]
+    },
+    {
+     "data": {
+      "text/plain": [
+       "tensor([  5.3671, -17.3012], requires_grad=True)"
+      ]
+     },
+     "execution_count": 10,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "training_loop(\n",
+    "    n_epochs = 5000, \n",
+    "    learning_rate = 1e-2, \n",
+    "    params = torch.tensor([1.0, 0.0], requires_grad=True), # <1> \n",
+    "    t_u = t_un, # <2> \n",
+    "    t_c = t_c)"
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.6.6"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 2
+}

+ 433 - 0
p1ch5/3_optimizers.ipynb

@@ -0,0 +1,433 @@
+{
+ "cells": [
+  {
+   "cell_type": "code",
+   "execution_count": 1,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "%matplotlib inline\n",
+    "import numpy as np\n",
+    "import torch\n",
+    "torch.set_printoptions(edgeitems=2)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 2,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "t_c = torch.tensor([0.5, 14.0, 15.0, 28.0, 11.0, 8.0, 3.0, -4.0, 6.0, 13.0, 21.0])\n",
+    "t_u = torch.tensor([35.7, 55.9, 58.2, 81.9, 56.3, 48.9, 33.9, 21.8, 48.4, 60.4, 68.4])\n",
+    "t_un = 0.1 * t_u"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 3,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "def model(t_u, w, b):\n",
+    "    return w * t_u + b"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 4,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "def loss_fn(t_p, t_c):\n",
+    "    squared_diffs = (t_p - t_c)**2\n",
+    "    return squared_diffs.mean()"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 5,
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "['ASGD',\n",
+       " 'Adadelta',\n",
+       " 'Adagrad',\n",
+       " 'Adam',\n",
+       " 'Adamax',\n",
+       " 'LBFGS',\n",
+       " 'Optimizer',\n",
+       " 'RMSprop',\n",
+       " 'Rprop',\n",
+       " 'SGD',\n",
+       " 'SparseAdam',\n",
+       " '__builtins__',\n",
+       " '__cached__',\n",
+       " '__doc__',\n",
+       " '__file__',\n",
+       " '__loader__',\n",
+       " '__name__',\n",
+       " '__package__',\n",
+       " '__path__',\n",
+       " '__spec__',\n",
+       " 'lr_scheduler']"
+      ]
+     },
+     "execution_count": 5,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "import torch.optim as optim\n",
+    "\n",
+    "dir(optim)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 6,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "params = torch.tensor([1.0, 0.0], requires_grad=True)\n",
+    "learning_rate = 1e-5\n",
+    "optimizer = optim.SGD([params], lr=learning_rate)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 7,
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "tensor([ 9.5483e-01, -8.2600e-04], requires_grad=True)"
+      ]
+     },
+     "execution_count": 7,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "t_p = model(t_u, *params)\n",
+    "loss = loss_fn(t_p, t_c)\n",
+    "loss.backward()\n",
+    "\n",
+    "optimizer.step()\n",
+    "\n",
+    "params"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 8,
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "tensor([1.7761, 0.1064], requires_grad=True)"
+      ]
+     },
+     "execution_count": 8,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "params = torch.tensor([1.0, 0.0], requires_grad=True)\n",
+    "learning_rate = 1e-2\n",
+    "optimizer = optim.SGD([params], lr=learning_rate)\n",
+    "\n",
+    "t_p = model(t_un, *params)\n",
+    "loss = loss_fn(t_p, t_c)\n",
+    "\n",
+    "optimizer.zero_grad() # <1>\n",
+    "loss.backward()\n",
+    "optimizer.step()\n",
+    "\n",
+    "params"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 9,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "def training_loop(n_epochs, optimizer, params, t_u, t_c):\n",
+    "    for epoch in range(1, n_epochs + 1):\n",
+    "        t_p = model(t_u, *params) \n",
+    "        loss = loss_fn(t_p, t_c)\n",
+    "        \n",
+    "        optimizer.zero_grad()\n",
+    "        loss.backward()\n",
+    "        optimizer.step()\n",
+    "\n",
+    "        if epoch % 500 == 0:\n",
+    "            print('Epoch %d, Loss %f' % (epoch, float(loss)))\n",
+    "            \n",
+    "    return params"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 10,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "Epoch 500, Loss 7.860116\n",
+      "Epoch 1000, Loss 3.828538\n",
+      "Epoch 1500, Loss 3.092191\n",
+      "Epoch 2000, Loss 2.957697\n",
+      "Epoch 2500, Loss 2.933134\n",
+      "Epoch 3000, Loss 2.928648\n",
+      "Epoch 3500, Loss 2.927830\n",
+      "Epoch 4000, Loss 2.927679\n",
+      "Epoch 4500, Loss 2.927652\n",
+      "Epoch 5000, Loss 2.927647\n"
+     ]
+    },
+    {
+     "data": {
+      "text/plain": [
+       "tensor([  5.3671, -17.3012], requires_grad=True)"
+      ]
+     },
+     "execution_count": 10,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "params = torch.tensor([1.0, 0.0], requires_grad=True)\n",
+    "learning_rate = 1e-2\n",
+    "optimizer = optim.SGD([params], lr=learning_rate) # <1>\n",
+    "\n",
+    "training_loop(\n",
+    "    n_epochs = 5000, \n",
+    "    optimizer = optimizer,\n",
+    "    params = params, # <1> \n",
+    "    t_u = t_un,\n",
+    "    t_c = t_c)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 11,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "Epoch 500, Loss 7.612901\n",
+      "Epoch 1000, Loss 3.086700\n",
+      "Epoch 1500, Loss 2.928578\n",
+      "Epoch 2000, Loss 2.927646\n"
+     ]
+    },
+    {
+     "data": {
+      "text/plain": [
+       "tensor([  0.5367, -17.3021], requires_grad=True)"
+      ]
+     },
+     "execution_count": 11,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "params = torch.tensor([1.0, 0.0], requires_grad=True)\n",
+    "learning_rate = 1e-1\n",
+    "optimizer = optim.Adam([params], lr=learning_rate) # <1>\n",
+    "\n",
+    "training_loop(\n",
+    "    n_epochs = 2000, \n",
+    "    optimizer = optimizer,\n",
+    "    params = params,\n",
+    "    t_u = t_u, # <2> \n",
+    "    t_c = t_c)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 12,
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "(tensor([ 8,  0,  3,  6,  4,  1,  2,  5, 10]), tensor([9, 7]))"
+      ]
+     },
+     "execution_count": 12,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "n_samples = t_u.shape[0]\n",
+    "n_val = int(0.2 * n_samples)\n",
+    "\n",
+    "shuffled_indices = torch.randperm(n_samples)\n",
+    "\n",
+    "train_indices = shuffled_indices[:-n_val]\n",
+    "val_indices = shuffled_indices[-n_val:]\n",
+    "\n",
+    "train_indices, val_indices  # <1>"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 13,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "t_u_train = t_u[train_indices]\n",
+    "t_c_train = t_c[train_indices]\n",
+    "\n",
+    "t_u_val = t_u[val_indices]\n",
+    "t_c_val = t_c[val_indices]\n",
+    "\n",
+    "t_un_train = 0.1 * t_u_train\n",
+    "t_un_val = 0.1 * t_u_val"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 14,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "def training_loop(n_epochs, optimizer, params, t_u_train, t_u_val, t_c_train, t_c_val):\n",
+    "    for epoch in range(1, n_epochs + 1):\n",
+    "        t_p_train = model(t_un_train, *params) # <1>\n",
+    "        loss_train = loss_fn(t_p_train, t_c_train)\n",
+    "\n",
+    "        t_p_val = model(t_un_val, *params) # <1>\n",
+    "        loss_val = loss_fn(t_p_val, t_c_val)\n",
+    "        \n",
+    "        optimizer.zero_grad()\n",
+    "        loss_train.backward() # <2>\n",
+    "        optimizer.step()\n",
+    "\n",
+    "        if epoch <= 3 or epoch % 500 == 0:\n",
+    "            print('Epoch {}, Training loss {}, Validation loss {}'.format(\n",
+    "                epoch, float(loss_train), float(loss_val)))\n",
+    "            \n",
+    "    return params"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 15,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "Epoch 1, Training loss 88.59708404541016, Validation loss 43.31699752807617\n",
+      "Epoch 2, Training loss 34.42190933227539, Validation loss 35.03486633300781\n",
+      "Epoch 3, Training loss 27.57990264892578, Validation loss 40.214229583740234\n",
+      "Epoch 500, Training loss 9.516923904418945, Validation loss 9.02982234954834\n",
+      "Epoch 1000, Training loss 4.543173789978027, Validation loss 2.596876621246338\n",
+      "Epoch 1500, Training loss 3.1108808517456055, Validation loss 2.9066450595855713\n",
+      "Epoch 2000, Training loss 2.6984243392944336, Validation loss 4.1561737060546875\n",
+      "Epoch 2500, Training loss 2.579646348953247, Validation loss 5.138668537139893\n",
+      "Epoch 3000, Training loss 2.5454416275024414, Validation loss 5.755766868591309\n"
+     ]
+    },
+    {
+     "data": {
+      "text/plain": [
+       "tensor([  5.6473, -18.7334], requires_grad=True)"
+      ]
+     },
+     "execution_count": 15,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "params = torch.tensor([1.0, 0.0], requires_grad=True)\n",
+    "learning_rate = 1e-2\n",
+    "optimizer = optim.SGD([params], lr=learning_rate)\n",
+    "\n",
+    "training_loop(\n",
+    "    n_epochs = 3000, \n",
+    "    optimizer = optimizer,\n",
+    "    params = params,\n",
+    "    t_u_train = t_un_train, # <1> \n",
+    "    t_u_val = t_un_val, # <1> \n",
+    "    t_c_train = t_c_train,\n",
+    "    t_c_val = t_c_val)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 16,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "def training_loop(n_epochs, optimizer, params, t_u_train, t_u_val, t_c_train, t_c_val):\n",
+    "    for epoch in range(1, n_epochs + 1):\n",
+    "        t_p_train = model(t_un_train, *params)\n",
+    "        loss_train = loss_fn(t_p_train, t_c_train)\n",
+    "\n",
+    "        with torch.no_grad(): # <1>\n",
+    "            t_p_val = model(t_un_val, *params)\n",
+    "            loss_val = loss_fn(t_p_val, t_c_val)\n",
+    "            assert loss_val.requires_grad == False # <2>\n",
+    "            \n",
+    "        optimizer.zero_grad()\n",
+    "        loss_train.backward()\n",
+    "        optimizer.step()"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 18,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "def calc_forward(t_u, t_c, is_train):\n",
+    "    with torch.set_grad_enabled(is_train):\n",
+    "        t_p = model(t_u, *params)\n",
+    "        loss = loss_fn(t_p, t_c)\n",
+    "    return loss"
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.6.6"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 2
+}

The diff is not shown because the file is too large.
+ 573 - 0
p1ch5/4_neural_networks.ipynb


The diff is not shown because the file is too large.
+ 0 - 21931
p1ch5/p1ch5.ipynb


The diff is not shown because the file is too large.
+ 56 - 0
p1ch6/1_datasets.ipynb


The diff is not shown because the file is too large.
+ 208 - 0
p1ch6/2_birds_airplanes.ipynb


The diff is not shown because the file is too large.
+ 270 - 0
p1ch6/4_convolution.ipynb


The diff is not shown because the file is too large.
+ 35 - 8
p1ch6/p1ch6.ipynb


+ 2 - 0
p2ch07/vis.py

@@ -74,3 +74,5 @@ def showNodule(series_uid, batch_ndx=None):
 
 
     print(series_uid, batch_ndx, bool(malignant_tensor[0]), malignant_list)
+
+

+ 200 - 34
p2ch07_explore_data.ipynb

@@ -12,23 +12,9 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 3,
+   "execution_count": 2,
    "metadata": {},
-   "outputs": [
-    {
-     "ename": "ImportError",
-     "evalue": "DLL load failed: The operating system cannot run %1.",
-     "output_type": "error",
-     "traceback": [
-      "\u001b[1;31m---------------------------------------------------------------------------\u001b[0m",
-      "\u001b[1;31mImportError\u001b[0m                               Traceback (most recent call last)",
-      "\u001b[1;32m<ipython-input-3-8765eae447ac>\u001b[0m in \u001b[0;36m<module>\u001b[1;34m()\u001b[0m\n\u001b[1;32m----> 1\u001b[1;33m \u001b[1;32mfrom\u001b[0m \u001b[0mp2ch07\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mdsets\u001b[0m \u001b[1;32mimport\u001b[0m \u001b[0mgetNoduleInfoList\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mgetCt\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m      2\u001b[0m \u001b[0mnoduleInfo_list\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0mgetNoduleInfoList\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mrequireDataOnDisk_bool\u001b[0m\u001b[1;33m=\u001b[0m\u001b[1;32mFalse\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m      3\u001b[0m \u001b[0mmalignantInfo_list\u001b[0m \u001b[1;33m=\u001b[0m \u001b[1;33m[\u001b[0m\u001b[0mx\u001b[0m \u001b[1;32mfor\u001b[0m \u001b[0mx\u001b[0m \u001b[1;32min\u001b[0m \u001b[0mnoduleInfo_list\u001b[0m \u001b[1;32mif\u001b[0m \u001b[0mx\u001b[0m\u001b[1;33m[\u001b[0m\u001b[1;36m0\u001b[0m\u001b[1;33m]\u001b[0m\u001b[1;33m]\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m      4\u001b[0m \u001b[0mdiameter_list\u001b[0m \u001b[1;33m=\u001b[0m \u001b[1;33m[\u001b[0m\u001b[0mx\u001b[0m\u001b[1;33m[\u001b[0m\u001b[1;36m1\u001b[0m\u001b[1;33m]\u001b[0m \u001b[1;32mfor\u001b[0m \u001b[0mx\u001b[0m \u001b[1;32min\u001b[0m \u001b[0mmalignantInfo_list\u001b[0m\u001b[1;33m]\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n",
-      "\u001b[1;32m~\\linux-home\\edit\\book\\code\\p2ch07\\dsets.py\u001b[0m in \u001b[0;36m<module>\u001b[1;34m()\u001b[0m\n\u001b[0;32m     11\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m     12\u001b[0m \u001b[1;32mimport\u001b[0m \u001b[0mnumpy\u001b[0m \u001b[1;32mas\u001b[0m \u001b[0mnp\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m---> 13\u001b[1;33m \u001b[1;32mimport\u001b[0m \u001b[0mtorch\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m     14\u001b[0m \u001b[1;32mimport\u001b[0m \u001b[0mtorch\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mcuda\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m     15\u001b[0m \u001b[1;32mfrom\u001b[0m \u001b[0mtorch\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mutils\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mdata\u001b[0m \u001b[1;32mimport\u001b[0m \u001b[0mDataset\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n",
-      "\u001b[1;32m~\\Miniconda3\\envs\\book\\lib\\site-packages\\torch\\__init__.py\u001b[0m in \u001b[0;36m<module>\u001b[1;34m()\u001b[0m\n\u001b[0;32m     76\u001b[0m     \u001b[1;32mpass\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m     77\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m---> 78\u001b[1;33m \u001b[1;32mfrom\u001b[0m \u001b[0mtorch\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0m_C\u001b[0m \u001b[1;32mimport\u001b[0m \u001b[1;33m*\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m     79\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m     80\u001b[0m __all__ += [name for name in dir(_C)\n",
-      "\u001b[1;31mImportError\u001b[0m: DLL load failed: The operating system cannot run %1."
-     ]
-    }
-   ],
+   "outputs": [],
    "source": [
     "from p2ch07.dsets import getNoduleInfoList, getCt\n",
     "noduleInfo_list = getNoduleInfoList(requireDataOnDisk_bool=False)\n",
@@ -38,9 +24,18 @@
   },
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 3,
    "metadata": {},
-   "outputs": [],
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "1351\n",
+      "(True, 32.27003025, '1.3.6.1.4.1.14519.5.2.1.6279.6001.287966244644280690737019247886', (67.61451718, 85.02525992, -109.8084416))\n"
+     ]
+    }
+   ],
    "source": [
     "print(len(malignantInfo_list))\n",
     "print(malignantInfo_list[0])"
@@ -48,9 +43,30 @@
   },
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 4,
    "metadata": {},
-   "outputs": [],
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "   0  32.3 mm\n",
+      " 100  17.7 mm\n",
+      " 200  13.0 mm\n",
+      " 300  10.0 mm\n",
+      " 400   8.2 mm\n",
+      " 500   7.0 mm\n",
+      " 600   6.3 mm\n",
+      " 700   5.7 mm\n",
+      " 800   5.1 mm\n",
+      " 900   4.7 mm\n",
+      "1000   4.0 mm\n",
+      "1100   0.0 mm\n",
+      "1200   0.0 mm\n",
+      "1300   0.0 mm\n"
+     ]
+    }
+   ],
    "source": [
     "for i in range(0, len(diameter_list), 100):\n",
     "    print('{:4}  {:4.1f} mm'.format(i, diameter_list[i]))"
@@ -58,9 +74,36 @@
   },
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 5,
    "metadata": {},
-   "outputs": [],
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "(True, 32.27003025, '1.3.6.1.4.1.14519.5.2.1.6279.6001.287966244644280690737019247886', (67.61451718, 85.02525992, -109.8084416))\n",
+      "(True, 30.61040636, '1.3.6.1.4.1.14519.5.2.1.6279.6001.112740418331256326754121315800', (47.90350511, 37.60442008, -99.93417567))\n",
+      "(True, 30.61040636, '1.3.6.1.4.1.14519.5.2.1.6279.6001.112740418331256326754121315800', (44.19, 37.79, -107.01))\n",
+      "(True, 30.61040636, '1.3.6.1.4.1.14519.5.2.1.6279.6001.112740418331256326754121315800', (40.69, 32.19, -97.15))\n",
+      "(True, 27.44242293, '1.3.6.1.4.1.14519.5.2.1.6279.6001.943403138251347598519939390311', (-45.29440163, 74.86925386, -97.52812481))\n",
+      "(True, 27.07544345, '1.3.6.1.4.1.14519.5.2.1.6279.6001.481278873893653517789960724156', (-102.571208, -5.186558766, -205.1033412))\n",
+      "(True, 26.83708074, '1.3.6.1.4.1.14519.5.2.1.6279.6001.487268565754493433372433148666', (121.152909372, 12.9136003304, -159.399497186))\n",
+      "(True, 26.83708074, '1.3.6.1.4.1.14519.5.2.1.6279.6001.487268565754493433372433148666', (118.8539408, 11.54202797, -165.5042458))\n",
+      "(True, 25.87269662, '1.3.6.1.4.1.14519.5.2.1.6279.6001.177086402277715068525592995222', (-66.628286875, 57.151972075, -110.12035075))\n",
+      "(True, 25.41540526, '1.3.6.1.4.1.14519.5.2.1.6279.6001.219618492426142913407827034169', (-101.7504204, -95.65460516, -138.4943211))\n",
+      "(True, 0.0, '1.3.6.1.4.1.14519.5.2.1.6279.6001.107109359065300889765026303943', (-100.57, -66.23, -218.76))\n",
+      "(True, 0.0, '1.3.6.1.4.1.14519.5.2.1.6279.6001.106379658920626694402549886949', (-71.09, 68.3, -160.4))\n",
+      "(True, 0.0, '1.3.6.1.4.1.14519.5.2.1.6279.6001.102681962408431413578140925249', (106.18, 12.61, -96.81))\n",
+      "(True, 0.0, '1.3.6.1.4.1.14519.5.2.1.6279.6001.102681962408431413578140925249', (96.2846726653, 19.0348690723, -88.478440818))\n",
+      "(True, 0.0, '1.3.6.1.4.1.14519.5.2.1.6279.6001.100621383016233746780170740405', (89.32, 190.84, -516.82))\n",
+      "(True, 0.0, '1.3.6.1.4.1.14519.5.2.1.6279.6001.100621383016233746780170740405', (89.32, 143.23, -427.1))\n",
+      "(True, 0.0, '1.3.6.1.4.1.14519.5.2.1.6279.6001.100621383016233746780170740405', (85.12, 152.33, -425.7))\n",
+      "(True, 0.0, '1.3.6.1.4.1.14519.5.2.1.6279.6001.100621383016233746780170740405', (8.8, 174.74, -401.87))\n",
+      "(True, 0.0, '1.3.6.1.4.1.14519.5.2.1.6279.6001.100621383016233746780170740405', (5.99, 171.94, -398.37))\n",
+      "(True, 0.0, '1.3.6.1.4.1.14519.5.2.1.6279.6001.100621383016233746780170740405', (1.79, 166.34, -408.88))\n"
+     ]
+    }
+   ],
    "source": [
     "for nodule_tup in malignantInfo_list[:10]:\n",
     "    print(nodule_tup)\n",
@@ -70,18 +113,40 @@
   },
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 6,
    "metadata": {},
-   "outputs": [],
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "(array([323, 466, 248, 111,  71,  57,  37,  29,   5,   4], dtype=int64),\n",
+       " array([ 0.        ,  3.22700302,  6.45400605,  9.68100907, 12.9080121 ,\n",
+       "        16.13501512, 19.36201815, 22.58902117, 25.8160242 , 29.04302722,\n",
+       "        32.27003025]))"
+      ]
+     },
+     "execution_count": 6,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
    "source": [
     "np.histogram(diameter_list)"
    ]
   },
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 7,
    "metadata": {},
-   "outputs": [],
+   "outputs": [
+    {
+     "name": "stderr",
+     "output_type": "stream",
+     "text": [
+      "2018-12-11 11:24:57,865 INFO     pid:30236 p2ch07.dsets:201:__init__ <p2ch07.dsets.LunaDataset object at 0x000001B30A802438>: 551065 training samples\n"
+     ]
+    }
+   ],
    "source": [
     "from p2ch07.vis import findMalignantSamples, showNodule\n",
     "malignantSample_list = findMalignantSamples()"
@@ -89,9 +154,24 @@
   },
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 8,
    "metadata": {},
-   "outputs": [],
+   "outputs": [
+    {
+     "name": "stderr",
+     "output_type": "stream",
+     "text": [
+      "2018-12-11 11:25:01,788 INFO     pid:30236 p2ch07.dsets:201:__init__ <p2ch07.dsets.LunaDataset object at 0x000001B30A7D6668>: 602 training samples\n"
+     ]
+    },
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "1.3.6.1.4.1.14519.5.2.1.6279.6001.183982839679953938397312236359 0 True [0, 1, 2, 3, 4, 5, 6]\n"
+     ]
+    }
+   ],
    "source": [
     "series_uid = malignantSample_list[11][2]\n",
     "showNodule(series_uid)"
@@ -99,9 +179,24 @@
   },
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 9,
    "metadata": {},
-   "outputs": [],
+   "outputs": [
+    {
+     "name": "stderr",
+     "output_type": "stream",
+     "text": [
+      "2018-12-11 11:25:08,042 INFO     pid:30236 p2ch07.dsets:201:__init__ <p2ch07.dsets.LunaDataset object at 0x000001B30E7C6BE0>: 605 training samples\n"
+     ]
+    },
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "1.3.6.1.4.1.14519.5.2.1.6279.6001.126264578931778258890371755354 0 True [0]\n"
+     ]
+    }
+   ],
    "source": [
     "series_uid = '1.3.6.1.4.1.14519.5.2.1.6279.6001.126264578931778258890371755354'\n",
     "showNodule(series_uid)"
@@ -111,7 +206,30 @@
    "cell_type": "code",
    "execution_count": null,
    "metadata": {},
-   "outputs": [],
+   "outputs": [
+    {
+     "name": "stderr",
+     "output_type": "stream",
+     "text": [
+      "C:\\Users\\elis\\Miniconda3\\envs\\book\\lib\\site-packages\\ipyvolume\\serialize.py:81: RuntimeWarning: invalid value encountered in true_divide\n",
+      "  gradient = gradient / np.sqrt(gradient[0]**2 + gradient[1]**2 + gradient[2]**2)\n"
+     ]
+    },
+    {
+     "data": {
+      "application/vnd.jupyter.widget-view+json": {
+       "model_id": "339fa8710d8b459182d9d3afeb08f720",
+       "version_major": 2,
+       "version_minor": 0
+      },
+      "text/plain": [
+       "VBox(children=(VBox(children=(HBox(children=(Label(value='levels:'), FloatSlider(value=0.25, max=1.0, step=0.0…"
+      ]
+     },
+     "metadata": {},
+     "output_type": "display_data"
+    }
+   ],
    "source": [
     "import numpy as np\n",
     "import ipyvolume as ipv\n",
@@ -129,7 +247,32 @@
    "cell_type": "code",
    "execution_count": null,
    "metadata": {},
-   "outputs": [],
+   "outputs": [
+    {
+     "name": "stderr",
+     "output_type": "stream",
+     "text": [
+      "C:\\Users\\elis\\Miniconda3\\envs\\book\\lib\\site-packages\\ipyvolume\\widgets.py:179: FutureWarning: Using a non-tuple sequence for multidimensional indexing is deprecated; use `arr[tuple(seq)]` instead of `arr[seq]`. In the future this will be interpreted as an array index, `arr[np.array(seq)]`, which will result either in an error or a different result.\n",
+      "  data_view = self.data_original[view]\n",
+      "C:\\Users\\elis\\Miniconda3\\envs\\book\\lib\\site-packages\\ipyvolume\\utils.py:204: FutureWarning: Using a non-tuple sequence for multidimensional indexing is deprecated; use `arr[tuple(seq)]` instead of `arr[seq]`. In the future this will be interpreted as an array index, `arr[np.array(seq)]`, which will result either in an error or a different result.\n",
+      "  data = (data[slices1] + data[slices2])/2\n"
+     ]
+    },
+    {
+     "data": {
+      "application/vnd.jupyter.widget-view+json": {
+       "model_id": "61e1eb24c53149d8bdc8bd6188257862",
+       "version_major": 2,
+       "version_minor": 0
+      },
+      "text/plain": [
+       "VBox(children=(VBox(children=(HBox(children=(Label(value='levels:'), FloatSlider(value=0.25, max=1.0, step=0.0…"
+      ]
+     },
+     "metadata": {},
+     "output_type": "display_data"
+    }
+   ],
    "source": [
     "ct = getCt(series_uid)\n",
     "ipv.quickvolshow(ct.ary, level=[0.25, 0.5, 0.9], opacity=0.1, level_width=0.1, data_min=0, data_max=2)"
@@ -150,7 +293,30 @@
    "cell_type": "code",
    "execution_count": null,
    "metadata": {},
-   "outputs": [],
+   "outputs": [
+    {
+     "name": "stderr",
+     "output_type": "stream",
+     "text": [
+      "C:\\Users\\elis\\Miniconda3\\envs\\book\\lib\\site-packages\\ipyvolume\\serialize.py:81: RuntimeWarning: invalid value encountered in sqrt\n",
+      "  gradient = gradient / np.sqrt(gradient[0]**2 + gradient[1]**2 + gradient[2]**2)\n"
+     ]
+    },
+    {
+     "data": {
+      "application/vnd.jupyter.widget-view+json": {
+       "model_id": "9c40dfe3a3cc4b49aebce4cbee8e3d62",
+       "version_major": 2,
+       "version_minor": 0
+      },
+      "text/plain": [
+       "VBox(children=(VBox(children=(HBox(children=(Label(value='levels:'), FloatSlider(value=0.17, max=1.0, step=0.0…"
+      ]
+     },
+     "metadata": {},
+     "output_type": "display_data"
+    }
+   ],
    "source": [
     "bones = ct.ary * (ct.ary > 1.5)\n",
     "lungs = ct.ary * air_mask\n",
@@ -187,7 +353,7 @@
    "name": "python",
    "nbconvert_exporter": "python",
    "pygments_lexer": "ipython3",
-   "version": "3.6.5"
+   "version": "3.6.6"
   }
  },
  "nbformat": 4,

+ 31 - 12
p2ch10/dsets.py

@@ -229,7 +229,7 @@ class Ct(object):
 
             slice_list.append(slice(start_ndx, end_ndx))
 
-        ct_chunk = self.ary[slice_list]
+        ct_chunk = self.ary[tuple(slice_list)]
 
         return ct_chunk, center_irc
 
@@ -301,9 +301,14 @@ def getCtCubicChunk(series_uid, center_xyz, maxWidth_mm):
 
     return ct_chunk, center_irc
 
-def getCtAugmentedNodule(augmentation_dict, series_uid, center_xyz, width_mm, voxels_int, maxWidth_mm=32.0):
+def getCtAugmentedNodule(augmentation_dict, series_uid, center_xyz, width_mm, voxels_int, maxWidth_mm=32.0, use_cache=True):
     assert width_mm <= maxWidth_mm
-    cubic_chunk, center_irc = getCtCubicChunk(series_uid, center_xyz, maxWidth_mm)
+
+    if use_cache:
+        cubic_chunk, center_irc = getCtCubicChunk(series_uid, center_xyz, maxWidth_mm)
+    else:
+        ct = getCt(series_uid)
+        cubic_chunk, center_irc = ct.getCubicInputChunk(center_xyz, maxWidth_mm)
 
     slice_list = []
     for axis in range(3):
@@ -391,7 +396,7 @@ class LunaPrepcacheDataset(Dataset):
         return 0
 
 
-class LunaNoduleDataset(Dataset):
+class LunaClassificationDataset(Dataset):
     def __init__(self,
                  test_stride=0,
                  isTestSet_bool=None,
@@ -401,6 +406,7 @@ class LunaNoduleDataset(Dataset):
                  scaled_bool=False,
                  multiscaled_bool=False,
                  augmented_bool=False,
+                 noduleInfo_list=None,
             ):
         self.ratio_int = ratio_int
         self.scaled_bool = scaled_bool
@@ -420,7 +426,12 @@ class LunaNoduleDataset(Dataset):
         else:
             self.augmentation_dict = {}
 
-        self.noduleInfo_list = copy.copy(getNoduleInfoList())
+        if noduleInfo_list:
+            self.noduleInfo_list = copy.copy(noduleInfo_list)
+            self.use_cache = False
+        else:
+            self.noduleInfo_list = copy.copy(getNoduleInfoList())
+            self.use_cache = True
 
         if series_uid:
             self.noduleInfo_list = [x for x in self.noduleInfo_list if x[2] == series_uid]
@@ -459,6 +470,7 @@ class LunaNoduleDataset(Dataset):
 
     def __len__(self):
         if self.ratio_int:
+            # return 10000
             return 100000
         elif self.augmentation_dict:
             return len(self.noduleInfo_list) * 5
@@ -567,7 +579,7 @@ class Luna2dSegmentationDataset(Dataset):
 
             ct_tensor[i] = torch.from_numpy(ct.ary[context_ndx].astype(np.float32))
 
-        air_mask, lung_mask = ct.build2dLungMask(sample_ndx)[2:]
+        air_mask, lung_mask = ct.build2dLungMask(sample_ndx)[:2]
 
         ct_tensor[-1] = torch.from_numpy(lung_mask.astype(np.float32))
 
@@ -576,6 +588,7 @@ class Luna2dSegmentationDataset(Dataset):
 
         masks_tensor[0] = torch.from_numpy(mal_mask.astype(np.float32))
         masks_tensor[1] = torch.from_numpy((mal_mask | ben_mask).astype(np.float32))
+        # masks_tensor[1] = torch.from_numpy(ben_mask.astype(np.float32))
 
         return ct_tensor.contiguous(), masks_tensor.contiguous(), ct.series_uid, sample_ndx
 
@@ -593,21 +606,27 @@ class TrainingLuna2dSegmentationDataset(Luna2dSegmentationDataset):
         assert self.series_list
 
     def __len__(self):
+        # return 100
+        # return 1000
         # return 10000
         return 20000
         # return 40000
 
-    def __getitem__(self, key):
+    def __getitem__(self, ndx):
         if self.needsShuffle_bool:
             random.shuffle(self.series_list)
             self.needsShuffle_bool = False
 
-        if random.random() < 0.1:
-            self.series_list.append(self.series_list.pop(0))  # ring buffer
+        if random.random() < 0.01:
+            self.series_list.append(self.series_list.pop(0))
 
-        series_uid = self.series_list[ndx % ctCache_depth]
-        ct = getCt(series_uid)
-        sample_ndx = random.choice(ct.malignant_indexes or ct.benign_indexes)
+        if isinstance(ndx, int):
+            series_uid = self.series_list[ndx % ctCache_depth]
+            ct = getCt(series_uid)
+            sample_ndx = random.choice(ct.malignant_indexes or ct.benign_indexes)
+            # series_uid, sample_ndx = self.sample_list[ndx % len(self.sample_list)]
+        else:
+            series_uid, sample_ndx = ndx
 
         # if ndx % 2 == 0:
         #     sample_ndx = random.choice(ct.malignant_indexes or ct.benign_indexes)

+ 13 - 9
p2ch10/prepcache.py

@@ -9,7 +9,7 @@ from torch.optim import SGD
 from torch.utils.data import DataLoader
 
 from util.util import enumerateWithEstimate
-from .dsets import LunaPrepcacheDataset
+from .dsets import LunaClassificationDataset, getCtSize
 from util.logconf import logging
 # from .model import LunaModel
 
@@ -28,7 +28,7 @@ class LunaPrepCacheApp(object):
         parser = argparse.ArgumentParser()
         parser.add_argument('--batch-size',
             help='Batch size to use for training',
-            default=1,
+            default=32,
             type=int,
         )
         parser.add_argument('--num-workers',
@@ -36,11 +36,11 @@ class LunaPrepCacheApp(object):
             default=8,
             type=int,
         )
-        parser.add_argument('--scaled',
-            help="Scale the CT chunks to square voxels.",
-            default=False,
-            action='store_true',
-        )
+        # parser.add_argument('--scaled',
+        #     help="Scale the CT chunks to square voxels.",
+        #     default=False,
+        #     action='store_true',
+        # )
 
         self.cli_args = parser.parse_args(sys_argv)
 
@@ -48,7 +48,8 @@ class LunaPrepCacheApp(object):
         log.info("Starting {}, {}".format(type(self).__name__, self.cli_args))
 
         self.prep_dl = DataLoader(
-            LunaPrepcacheDataset(
+            LunaClassificationDataset(
+                sortby_str='series_uid',
             ),
             batch_size=self.cli_args.batch_size,
             num_workers=self.cli_args.num_workers,
@@ -60,9 +61,12 @@ class LunaPrepCacheApp(object):
             start_ndx=self.prep_dl.num_workers,
         )
         for batch_ndx, batch_tup in batch_iter:
-            pass
+            _nodule_tensor, _malignant_tensor, series_list, _center_list = batch_tup
+            for series_uid in sorted(set(series_list)):
+                getCtSize(series_uid)
             # input_tensor, label_tensor, _series_list, _start_list = batch_tup
 
 
+
 if __name__ == '__main__':
     sys.exit(LunaPrepCacheApp().main() or 0)

+ 333 - 194
p2ch10/training.py

@@ -1,6 +1,7 @@
 import argparse
 import datetime
 import os
+import socket
 import sys
 
 import numpy as np
@@ -14,17 +15,17 @@ from torch.optim import SGD, Adam
 from torch.utils.data import DataLoader
 
 from util.util import enumerateWithEstimate
-from .dsets import TrainingLuna2dSegmentationDataset, TestingLuna2dSegmentationDataset, getCt
+from .dsets import TrainingLuna2dSegmentationDataset, TestingLuna2dSegmentationDataset, LunaClassificationDataset, getCt
 from util.logconf import logging
 from util.util import xyz2irc
-from .model import UNetWrapper
+from .model import UNetWrapper, LunaModel
 
 log = logging.getLogger(__name__)
 # log.setLevel(logging.WARN)
 # log.setLevel(logging.INFO)
 log.setLevel(logging.DEBUG)
 
-# Used for computeBatchLoss and logMetrics to index into metrics_tensor/metrics_ary
+# Used for computeClassificationLoss and logMetrics to index into metrics_tensor/metrics_ary
 # METRICS_LABEL_NDX=0
 # METRICS_PRED_NDX=1
 # METRICS_LOSS_NDX=2
@@ -37,19 +38,24 @@ log.setLevel(logging.DEBUG)
 
 METRICS_LOSS_NDX = 0
 METRICS_LABEL_NDX = 1
-METRICS_MFOUND_NDX = 2
+METRICS_PRED_NDX = 2
 
-METRICS_MOK_NDX = 3
-METRICS_MTP_NDX = 4
-METRICS_MFN_NDX = 5
-METRICS_MFP_NDX = 6
-METRICS_BTP_NDX = 7
-METRICS_BFN_NDX = 8
-METRICS_BFP_NDX = 9
+METRICS_MTP_NDX = 3
+METRICS_MFN_NDX = 4
+METRICS_MFP_NDX = 5
+METRICS_BTP_NDX = 6
+METRICS_BFN_NDX = 7
+METRICS_BFP_NDX = 8
 
-METRICS_MAL_LOSS_NDX = 10
-METRICS_BEN_LOSS_NDX = 11
-METRICS_SIZE = 12
+METRICS_MAL_LOSS_NDX = 9
+METRICS_BEN_LOSS_NDX = 10
+
+# METRICS_MFOUND_NDX = 2
+
+# METRICS_MOK_NDX = 2
+
+# METRICS_FLG_LOSS_NDX = 10
+METRICS_SIZE = 11
 
 
 
@@ -123,7 +129,7 @@ class LunaTrainingApp(object):
         )
 
         self.cli_args = parser.parse_args(sys_argv)
-        self.time_str = datetime.datetime.now().strftime('%Y-%m-%d_%H:%M:%S')
+        self.time_str = datetime.datetime.now().strftime('%Y-%m-%d_%H.%M.%S')
 
         self.trn_writer = None
         self.tst_writer = None
@@ -131,6 +137,11 @@ class LunaTrainingApp(object):
         self.use_cuda = torch.cuda.is_available()
         self.device = torch.device("cuda" if self.use_cuda else "cpu")
 
+        # TODO: remove this if block before print
+        # This is due to an odd setup that the author is using to test the code; please ignore for now
+        if socket.gethostname() == 'c2':
+            self.device = torch.device("cuda:1")
+
         self.model = self.initModel()
         self.optimizer = self.initOptimizer()
 
@@ -142,38 +153,52 @@ class LunaTrainingApp(object):
         if self.cli_args.segmentation:
             model = UNetWrapper(in_channels=8, n_classes=2, depth=5, wf=6, padding=True, batch_norm=True, up_mode='upconv')
         else:
-            assert False
+            model = LunaModel()
 
         if self.use_cuda:
             if torch.cuda.device_count() > 1:
-                model = nn.DataParallel(model)
+
+                # TODO: remove this if block before print
+                # This is due to an odd setup that the author is using to test the code; please ignore for now
+                if socket.gethostname() == 'c2':
+                    model = nn.DataParallel(model, device_ids=[1, 0])
+                else:
+                    model = nn.DataParallel(model)
 
             model = model.to(self.device)
 
+
         return model
 
     def initOptimizer(self):
-
-        # self.optimizer = SGD(self.model.parameters(), lr=0.01, momentum=0.99)
-        return Adam(self.model.parameters())
+        return SGD(self.model.parameters(), lr=0.01, momentum=0.99)
+        # return Adam(self.model.parameters())
 
 
-    def initTrainDl(self, epoch_ndx):
+    def initTrainDl(self):
         if self.cli_args.segmentation:
             train_ds = TrainingLuna2dSegmentationDataset(
                     test_stride=10,
-                    isTestSet_bool=False,
                     contextSlices_count=3,
                 )
         else:
-            assert False
+            train_ds = LunaClassificationDataset(
+                 test_stride=10,
+                 isTestSet_bool=False,
+                 # series_uid=None,
+                 # sortby_str='random',
+                 ratio_int=int(self.cli_args.balanced),
+                 # scaled_bool=False,
+                 # multiscaled_bool=False,
+                 # augmented_bool=False,
+                 # noduleInfo_list=None,
+            )
 
         train_dl = DataLoader(
             train_ds,
             batch_size=self.cli_args.batch_size * (torch.cuda.device_count() if self.use_cuda else 1),
             num_workers=self.cli_args.num_workers,
             pin_memory=self.use_cuda,
-            # sampler=StridedSampler(train_ds, num_workers=self.cli_args.num_workers, batch_size=self.cli_args.batch_size * (torch.cuda.device_count() if self.use_cuda else 1))
         )
 
         return train_dl
@@ -182,15 +207,20 @@ class LunaTrainingApp(object):
         if self.cli_args.segmentation:
             test_ds = TestingLuna2dSegmentationDataset(
                     test_stride=10,
-                    isTestSet_bool=True,
                     contextSlices_count=3,
-
-                    # scaled_bool=self.cli_args.scaled or self.cli_args.multiscaled or self.cli_args.augmented,
-                    # multiscaled_bool=self.cli_args.multiscaled,
-                    # augmented_bool=self.cli_args.augmented,
                 )
         else:
-            assert False
+            test_ds = LunaClassificationDataset(
+                 test_stride=10,
+                 isTestSet_bool=True,
+                 # series_uid=None,
+                 # sortby_str='random',
+                 # ratio_int=int(self.cli_args.balanced),
+                 # scaled_bool=False,
+                 # multiscaled_bool=False,
+                 # augmented_bool=False,
+                 # noduleInfo_list=None,
+            )
 
         test_dl = DataLoader(
             test_ds,
@@ -205,19 +235,25 @@ class LunaTrainingApp(object):
         if self.trn_writer is None:
             log_dir = os.path.join('runs', self.cli_args.tb_prefix, self.time_str)
 
-            self.trn_writer = SummaryWriter(log_dir=log_dir + '_segtrn_' + self.cli_args.comment)
-            self.tst_writer = SummaryWriter(log_dir=log_dir + '_segtst_' + self.cli_args.comment)
+            type_str = 'seg_' if self.cli_args.segmentation else 'cls_'
+
+            self.trn_writer = SummaryWriter(log_dir=log_dir + '_trn_' + type_str + self.cli_args.comment)
+            self.tst_writer = SummaryWriter(log_dir=log_dir + '_tst_' + type_str + self.cli_args.comment)
 
 
 
     def main(self):
         log.info("Starting {}, {}".format(type(self).__name__, self.cli_args))
 
+        train_dl = self.initTrainDl()
         test_dl = self.initTestDl()
 
-        for epoch_ndx in range(1, self.cli_args.epochs + 1):
-            train_dl = self.initTrainDl(epoch_ndx)
+        self.initTensorboardWriters()
+        self.logModelMetrics(self.model)
+
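+        # Track the best test-epoch score so the matching checkpoint can be
+        # flagged in saveModel.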
+        best_score = 0.0
 
+        for epoch_ndx in range(1, self.cli_args.epochs + 1):
             log.info("Epoch {} of {}, {}/{} batches of size {}*{}".format(
                 epoch_ndx,
                 self.cli_args.epochs,
@@ -228,13 +264,19 @@ class LunaTrainingApp(object):
             ))
 
             trainingMetrics_tensor = self.doTraining(epoch_ndx, train_dl)
+            self.logPerformanceMetrics(epoch_ndx, 'trn', trainingMetrics_tensor)
+
+            self.logModelMetrics(self.model)
+
             if self.cli_args.segmentation:
                 self.logImages(epoch_ndx, train_dl, test_dl)
 
             testingMetrics_tensor = self.doTesting(epoch_ndx, test_dl)
-            self.logMetrics(epoch_ndx, trainingMetrics_tensor, testingMetrics_tensor)
+            score = self.logPerformanceMetrics(epoch_ndx, 'tst', testingMetrics_tensor)
+            best_score = max(score, best_score)
 
-            self.saveModel(epoch_ndx)
+            self.saveModel('seg' if self.cli_args.segmentation else 'cls', epoch_ndx, score == best_score)
 
         if hasattr(self, 'trn_writer'):
             self.trn_writer.close()
@@ -243,7 +285,7 @@ class LunaTrainingApp(object):
     def doTraining(self, epoch_ndx, train_dl):
         self.model.train()
         trainingMetrics_tensor = torch.zeros(METRICS_SIZE, len(train_dl.dataset))
-        train_dl.dataset.shuffleSamples()
+        # train_dl.dataset.shuffleSamples()
         batch_iter = enumerateWithEstimate(
             train_dl,
             "E{} Training".format(epoch_ndx),
@@ -255,7 +297,7 @@ class LunaTrainingApp(object):
             if self.cli_args.segmentation:
                 loss_var = self.computeSegmentationLoss(batch_ndx, batch_tup, train_dl.batch_size, trainingMetrics_tensor)
             else:
-                loss_var = self.computeBatchLoss(batch_ndx, batch_tup, train_dl.batch_size, trainingMetrics_tensor)
+                loss_var = self.computeClassificationLoss(batch_ndx, batch_tup, train_dl.batch_size, trainingMetrics_tensor)
 
             if loss_var is not None:
                 loss_var.backward()
@@ -279,11 +321,11 @@ class LunaTrainingApp(object):
                 if self.cli_args.segmentation:
                     self.computeSegmentationLoss(batch_ndx, batch_tup, test_dl.batch_size, testingMetrics_tensor)
                 else:
-                    self.computeBatchLoss(batch_ndx, batch_tup, test_dl.batch_size, testingMetrics_tensor)
+                    self.computeClassificationLoss(batch_ndx, batch_tup, test_dl.batch_size, testingMetrics_tensor)
 
         return testingMetrics_tensor
 
-    def computeBatchLoss(self, batch_ndx, batch_tup, batch_size, metrics_tensor):
+    def computeClassificationLoss(self, batch_ndx, batch_tup, batch_size, metrics_tensor):
         input_tensor, label_tensor, _series_list, _center_list = batch_tup
 
         input_devtensor = input_tensor.to(self.device)
@@ -294,9 +336,38 @@ class LunaTrainingApp(object):
 
         start_ndx = batch_ndx * batch_size
         end_ndx = start_ndx + label_tensor.size(0)
-        metrics_tensor[METRICS_LABEL_NDX, start_ndx:end_ndx] = label_tensor
-        metrics_tensor[METRICS_PRED_NDX, start_ndx:end_ndx] = prediction_devtensor.to('cpu')
-        metrics_tensor[METRICS_LOSS_NDX, start_ndx:end_ndx] = loss_devtensor.to('cpu')
+
+        with torch.no_grad():
+            # log.debug([metrics_tensor[METRICS_LABEL_NDX, start_ndx:end_ndx].shape, label_tensor.shape])
+
+            metrics_tensor[METRICS_LABEL_NDX, start_ndx:end_ndx] = label_tensor[:,0]
+            metrics_tensor[METRICS_PRED_NDX, start_ndx:end_ndx] = prediction_devtensor.to('cpu')[:,0]
+            # metrics_tensor[METRICS_LOSS_NDX, start_ndx:end_ndx] = loss_devtensor.to('cpu')
+
+            prediction_tensor = prediction_devtensor.to('cpu', non_blocking=True)
+            loss_tensor = loss_devtensor.to('cpu', non_blocking=True)[:,0]
+            malLabel_tensor = (label_tensor > 0.5)[:,0]
+            benLabel_tensor = ~malLabel_tensor
+
+
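+            # Per-sample booleans: index the malignancy channel so everything
+            # below is shape (N,) and lines up with the metrics columns.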
+            malPred_tensor = prediction_tensor[:,0] > 0.5
+            benPred_tensor = ~malPred_tensor
+            metrics_tensor[METRICS_MTP_NDX, start_ndx:end_ndx] = (malLabel_tensor * malPred_tensor)
+            metrics_tensor[METRICS_MFN_NDX, start_ndx:end_ndx] = (malLabel_tensor * benPred_tensor)
+            metrics_tensor[METRICS_MFP_NDX, start_ndx:end_ndx] = (benLabel_tensor * malPred_tensor)
+
+            metrics_tensor[METRICS_BTP_NDX, start_ndx:end_ndx] = (benLabel_tensor * benPred_tensor)
+            metrics_tensor[METRICS_BFN_NDX, start_ndx:end_ndx] = (benLabel_tensor * malPred_tensor)
+            metrics_tensor[METRICS_BFP_NDX, start_ndx:end_ndx] = (malLabel_tensor * benPred_tensor)
+
+            metrics_tensor[METRICS_LOSS_NDX, start_ndx:end_ndx] = loss_tensor
+
+            metrics_tensor[METRICS_BEN_LOSS_NDX, start_ndx:end_ndx] = loss_tensor * benLabel_tensor.type(torch.float32)
+            metrics_tensor[METRICS_MAL_LOSS_NDX, start_ndx:end_ndx] = loss_tensor * malLabel_tensor.type(torch.float32)
+
 
         # TODO: replace with torch.autograd.detect_anomaly
         # assert np.isfinite(metrics_tensor).all()
@@ -322,34 +393,38 @@ class LunaTrainingApp(object):
         intersectionSum = lambda a, b: (a * b.to(torch.float32)).view(a.size(0), -1).sum(dim=1)
 
         diceLoss_devtensor = self.diceLoss(label_devtensor, prediction_devtensor)
+        malLoss_devtensor = self.diceLoss(label_devtensor[:,0], prediction_devtensor[:,0])
+        benLoss_devtensor = self.diceLoss(label_devtensor[:,1], prediction_devtensor[:,1])
 
         with torch.no_grad():
+            bPred_tensor = prediction_devtensor.to('cpu', non_blocking=True)
+            diceLoss_tensor = diceLoss_devtensor.to('cpu', non_blocking=True)
+            malLoss_tensor = malLoss_devtensor.to('cpu', non_blocking=True)
+            benLoss_tensor = benLoss_devtensor.to('cpu', non_blocking=True)
 
-            boolPrediction_tensor = prediction_devtensor.to('cpu') > 0.5
+            # flgLoss_devtensor = self.diceLoss(label_devtensor[:,0], label_devtensor[:,0] * prediction_devtensor[:,1])
+            # flgLoss_tensor = flgLoss_devtensor.to('cpu', non_blocking=True)#.unsqueeze(1)
 
-            metrics_tensor[METRICS_LABEL_NDX, start_ndx:end_ndx] = max2(label_tensor[:,0])
-            metrics_tensor[METRICS_MFOUND_NDX, start_ndx:end_ndx] = (max2(label_tensor[:, 0] * boolPrediction_tensor[:, 1].to(torch.float32)) > 0.5)
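+            # Encode the slice-level ground truth as a bit mask: bit 0 set if
+            # any malignant pixels are present, bit 1 if any benign pixels are
+            # (so 3 means both).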
+            metrics_tensor[METRICS_LABEL_NDX, start_ndx:end_ndx] = max2(label_tensor[:,0]) + max2(label_tensor[:,1]) * 2
+            # metrics_tensor[METRICS_MFOUND_NDX, start_ndx:end_ndx] = (max2(label_tensor[:, 0] * bPred_tensor[:, 1].to(torch.float32)) > 0.5)
 
-            metrics_tensor[METRICS_MOK_NDX, start_ndx:end_ndx] = intersectionSum( label_tensor[:,0],  torch.max(boolPrediction_tensor, dim=1)[0])
+            # metrics_tensor[METRICS_MOK_NDX, start_ndx:end_ndx] = intersectionSum( label_tensor[:,0],  bPred_tensor[:,1])
 
-            metrics_tensor[METRICS_MTP_NDX, start_ndx:end_ndx] = intersectionSum( label_tensor[:,0],  boolPrediction_tensor[:,0])
-            metrics_tensor[METRICS_MFN_NDX, start_ndx:end_ndx] = intersectionSum( label_tensor[:,0], ~boolPrediction_tensor[:,0])
-            metrics_tensor[METRICS_MFP_NDX, start_ndx:end_ndx] = intersectionSum(1 - label_tensor[:,0],  boolPrediction_tensor[:,0])
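+            # Threshold the predictions, then count per-sample pixel overlaps
+            # against the labels: channel 0 is malignant, channel 1 benign.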
+            bPred_tensor = bPred_tensor > 0.5
+            metrics_tensor[METRICS_MTP_NDX, start_ndx:end_ndx] = intersectionSum(    label_tensor[:,0],  bPred_tensor[:,0])
+            metrics_tensor[METRICS_MFN_NDX, start_ndx:end_ndx] = intersectionSum(    label_tensor[:,0], ~bPred_tensor[:,0])
+            metrics_tensor[METRICS_MFP_NDX, start_ndx:end_ndx] = intersectionSum(1 - label_tensor[:,0],  bPred_tensor[:,0])
 
-            metrics_tensor[METRICS_BTP_NDX, start_ndx:end_ndx] = intersectionSum( label_tensor[:,1],  boolPrediction_tensor[:,1])
-            metrics_tensor[METRICS_BFN_NDX, start_ndx:end_ndx] = intersectionSum( label_tensor[:,1], ~boolPrediction_tensor[:,1])
-            metrics_tensor[METRICS_BFP_NDX, start_ndx:end_ndx] = intersectionSum(1 - label_tensor[:,1],  boolPrediction_tensor[:,1])
+            metrics_tensor[METRICS_BTP_NDX, start_ndx:end_ndx] = intersectionSum(    label_tensor[:,1],  bPred_tensor[:,1])
+            metrics_tensor[METRICS_BFN_NDX, start_ndx:end_ndx] = intersectionSum(    label_tensor[:,1], ~bPred_tensor[:,1])
+            metrics_tensor[METRICS_BFP_NDX, start_ndx:end_ndx] = intersectionSum(1 - label_tensor[:,1],  bPred_tensor[:,1])
 
-            diceLoss_tensor = diceLoss_devtensor.to('cpu')
             metrics_tensor[METRICS_LOSS_NDX, start_ndx:end_ndx] = diceLoss_tensor
 
-            malLoss_devtensor = self.diceLoss(label_devtensor[:,0], prediction_devtensor[:,0])
-            malLoss_tensor = malLoss_devtensor.to('cpu')#.unsqueeze(1)
+            metrics_tensor[METRICS_BEN_LOSS_NDX, start_ndx:end_ndx] = benLoss_tensor
             metrics_tensor[METRICS_MAL_LOSS_NDX, start_ndx:end_ndx] = malLoss_tensor
+            # metrics_tensor[METRICS_FLG_LOSS_NDX, start_ndx:end_ndx] = flgLoss_tensor
 
-            benLoss_devtensor = self.diceLoss(label_devtensor[:,1], prediction_devtensor[:,1])
-            benLoss_tensor = benLoss_devtensor.to('cpu')#.unsqueeze(1)
-            metrics_tensor[METRICS_BEN_LOSS_NDX, start_ndx:end_ndx] = benLoss_tensor
 
             # lungLoss_devtensor = self.diceLoss(label_devtensor[:,2], prediction_devtensor[:,2])
             # lungLoss_tensor = lungLoss_devtensor.to('cpu').unsqueeze(1)
@@ -360,10 +435,10 @@ class LunaTrainingApp(object):
 
         # return nn.MSELoss()(prediction_devtensor, label_devtensor)
 
-        return diceLoss_devtensor.mean()
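+        # diceLoss_devtensor above is only recorded as a metric; the training
+        # objective sums the per-channel losses so each class weighs equally.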
+        return malLoss_devtensor.mean() + benLoss_devtensor.mean()
         # return self.diceLoss(label_devtensor[:,0], prediction_devtensor[:,0]).mean()
 
-    def diceLoss(self, label_devtensor, prediction_devtensor, epsilon=0.01):
+    def diceLoss(self, label_devtensor, prediction_devtensor, epsilon=0.01, p=False):
         # sum2 = lambda t: t.sum([1,2,3,4])
         sum2 = lambda t: t.view(t.size(0), -1).sum(dim=1)
         # max2 = lambda t: t.view(t.size(0), -1).max(dim=1)[0]
@@ -374,188 +449,246 @@ class LunaTrainingApp(object):
         epsilon_devtensor = torch.ones_like(diceCorrect_devtensor) * epsilon
         diceLoss_devtensor = 1 - (2 * diceCorrect_devtensor + epsilon_devtensor) / (dicePrediction_devtensor + diceLabel_devtensor + epsilon_devtensor)
 
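+        # If any Dice term goes non-finite, dump the intermediates so the
+        # offending batch can be inspected.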
+        if not torch.isfinite(diceLoss_devtensor).all():
+            log.debug('')
+            log.debug('diceLoss_devtensor')
+            log.debug(diceLoss_devtensor.to('cpu'))
+            log.debug('diceCorrect_devtensor')
+            log.debug(diceCorrect_devtensor.to('cpu'))
+            log.debug('dicePrediction_devtensor')
+            log.debug(dicePrediction_devtensor.to('cpu'))
+            log.debug('diceLabel_devtensor')
+            log.debug(diceLabel_devtensor.to('cpu'))
+
         return diceLoss_devtensor
 
 
 
     def logImages(self, epoch_ndx, train_dl, test_dl):
-        if epoch_ndx > 0: # TODO revert
-            self.initTensorboardWriters()
-
-            for mode_str, dl in [('trn', train_dl), ('tst', test_dl)]:
-                for i, series_uid in enumerate(sorted(dl.dataset.series_list)[:12]):
-                    ct = getCt(series_uid)
-                    noduleInfo_tup = (ct.malignantInfo_list or ct.benignInfo_list)[0]
-                    center_irc = xyz2irc(noduleInfo_tup.center_xyz, ct.origin_xyz, ct.vxSize_xyz, ct.direction_tup)
+        for mode_str, dl in [('trn', train_dl), ('tst', test_dl)]:
+            for i, series_uid in enumerate(sorted(dl.dataset.series_list)[:12]):
+                ct = getCt(series_uid)
+                noduleInfo_tup = (ct.malignantInfo_list or ct.benignInfo_list)[0]
+                center_irc = xyz2irc(noduleInfo_tup.center_xyz, ct.origin_xyz, ct.vxSize_xyz, ct.direction_tup)
 
-                    sample_tup = dl.dataset[(series_uid, int(center_irc.index))]
-                    input_tensor = sample_tup[0].unsqueeze(0)
-                    label_tensor = sample_tup[1].unsqueeze(0)
+                sample_tup = dl.dataset[(series_uid, int(center_irc.index))]
+                input_tensor = sample_tup[0].unsqueeze(0)
+                label_tensor = sample_tup[1].unsqueeze(0)
 
-                    input_devtensor = input_tensor.to(self.device)
-                    label_devtensor = label_tensor.to(self.device)
+                input_devtensor = input_tensor.to(self.device)
+                label_devtensor = label_tensor.to(self.device)
 
-                    prediction_devtensor = self.model(input_devtensor)
-                    prediction_ary = prediction_devtensor.to('cpu').detach().numpy()
+                prediction_devtensor = self.model(input_devtensor)
+                prediction_ary = prediction_devtensor.to('cpu').detach().numpy()
 
-                    image_ary = np.zeros((512, 512, 3), dtype=np.float32)
-                    image_ary[:,:,:] = (input_tensor[0,2].numpy().reshape((512,512,1))) * 0.25
-                    image_ary[:,:,0] += prediction_ary[0,0] * 0.5
-                    image_ary[:,:,1] += prediction_ary[0,1] * 0.25
-                    # image_ary[:,:,2] += prediction_ary[0,2] * 0.5
+                image_ary = np.zeros((512, 512, 3), dtype=np.float32)
+                image_ary[:,:,:] = (input_tensor[0,2].numpy().reshape((512,512,1))) * 0.25
+                image_ary[:,:,0] += prediction_ary[0,0] * 0.5
+                image_ary[:,:,1] += prediction_ary[0,1] * 0.25
+                # image_ary[:,:,2] += prediction_ary[0,2] * 0.5
 
-                    # log.debug([image_ary.__array_interface__['typestr']])
+                # log.debug([image_ary.__array_interface__['typestr']])
 
-                    # image_ary = (image_ary * 255).astype(np.uint8)
+                # image_ary = (image_ary * 255).astype(np.uint8)
 
-                    # log.debug([image_ary.__array_interface__['typestr']])
+                # log.debug([image_ary.__array_interface__['typestr']])
 
-                    writer = getattr(self, mode_str + '_writer')
-                    writer.add_image('{}/{}_pred'.format(mode_str, i), image_ary, self.totalTrainingSamples_count)
+                writer = getattr(self, mode_str + '_writer')
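+                # These arrays are laid out HWC; add_image defaults to CHW,
+                # so the layout must be stated explicitly.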
+                try:
+                    writer.add_image('{}/{}_pred'.format(mode_str, i), image_ary, self.totalTrainingSamples_count, dataformats='HWC')
+                except Exception:
+                    log.debug([image_ary.shape, image_ary.dtype])
+                    raise
 
-                    if epoch_ndx == 1:
-                        label_ary = label_tensor.numpy()
+                if epoch_ndx == 1:
+                    label_ary = label_tensor.numpy()
 
-                        image_ary = np.zeros((512, 512, 3), dtype=np.float32)
-                        image_ary[:,:,:] = (input_tensor[0,2].numpy().reshape((512,512,1))) * 0.25
-                        image_ary[:,:,0] += label_ary[0,0] * 0.5
-                        image_ary[:,:,1] += label_ary[0,1] * 0.25
-                        image_ary[:,:,2] += (input_tensor[0,-1].numpy() - (label_ary[0,0].astype(np.bool) | label_ary[0,1].astype(np.bool))) * 0.25
+                    image_ary = np.zeros((512, 512, 3), dtype=np.float32)
+                    image_ary[:,:,:] = (input_tensor[0,2].numpy().reshape((512,512,1))) * 0.25
+                    image_ary[:,:,0] += label_ary[0,0] * 0.5
+                    image_ary[:,:,1] += label_ary[0,1] * 0.25
+                    image_ary[:,:,2] += (input_tensor[0,-1].numpy() - (label_ary[0,0].astype(np.bool) | label_ary[0,1].astype(np.bool))) * 0.25
 
-                        # log.debug([image_ary.__array_interface__['typestr']])
+                    # log.debug([image_ary.__array_interface__['typestr']])
 
-                        image_ary = (image_ary * 255).astype(np.uint8)
+                    image_ary = (image_ary * 255).astype(np.uint8)
 
-                        # log.debug([image_ary.__array_interface__['typestr']])
+                    # log.debug([image_ary.__array_interface__['typestr']])
 
-                        writer = getattr(self, mode_str + '_writer')
-                        writer.add_image('{}/{}_label'.format(mode_str, i), image_ary, self.totalTrainingSamples_count)
+                    writer = getattr(self, mode_str + '_writer')
+                    writer.add_image('{}/{}_label'.format(mode_str, i), image_ary, self.totalTrainingSamples_count, dataformats='HWC')
 
 
-    def logMetrics(self,
-                   epoch_ndx,
-                   trainingMetrics_tensor,
-                   testingMetrics_tensor,
-                   classificationThreshold_float=0.5,
-                   ):
+    def logPerformanceMetrics(self,
+                              epoch_ndx,
+                              mode_str,
+                              metrics_tensor,
+                              # trainingMetrics_tensor,
+                              # testingMetrics_tensor,
+                              classificationThreshold_float=0.5,
+                              ):
         log.info("E{} {}".format(
             epoch_ndx,
             type(self).__name__,
         ))
 
 
-        for mode_str, metrics_tensor in [('trn', trainingMetrics_tensor), ('tst', testingMetrics_tensor)]:
-            metrics_ary = metrics_tensor.cpu().detach().numpy()
-            sum_ary = metrics_ary.sum(axis=1)
-            assert np.isfinite(metrics_ary).all()
+        # for mode_str, metrics_tensor in [('trn', trainingMetrics_tensor), ('tst', testingMetrics_tensor)]:
+        metrics_ary = metrics_tensor.cpu().detach().numpy()
+        sum_ary = metrics_ary.sum(axis=1)
+        assert np.isfinite(metrics_ary).all()
 
-            malLabel_mask = metrics_ary[METRICS_LABEL_NDX] > classificationThreshold_float
-            malFound_mask = metrics_ary[METRICS_MFOUND_NDX] > classificationThreshold_float
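+        # METRICS_LABEL_NDX holds bit-mask labels (1 = malignant, 2 = benign,
+        # 3 = both) for segmentation; classification labels are plain 0/1.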
+        malLabel_mask = (metrics_ary[METRICS_LABEL_NDX] == 1) | (metrics_ary[METRICS_LABEL_NDX] == 3)
 
-            # malLabel_mask = ~benLabel_mask
-            # malPred_mask = ~benPred_mask
+        if self.cli_args.segmentation:
+            benLabel_mask = (metrics_ary[METRICS_LABEL_NDX] == 2) | (metrics_ary[METRICS_LABEL_NDX] == 3)
+        else:
+            benLabel_mask = ~malLabel_mask
+        # malFound_mask = metrics_ary[METRICS_MFOUND_NDX] > classificationThreshold_float
+
+        # malLabel_mask = ~benLabel_mask
+        # malPred_mask = ~benPred_mask
 
-            benLabel_count = sum_ary[METRICS_BTP_NDX] + sum_ary[METRICS_BFN_NDX]
-            malLabel_count = sum_ary[METRICS_MTP_NDX] + sum_ary[METRICS_MFN_NDX]
+        benLabel_count = sum_ary[METRICS_BTP_NDX] + sum_ary[METRICS_BFN_NDX]
+        malLabel_count = sum_ary[METRICS_MTP_NDX] + sum_ary[METRICS_MFN_NDX]
 
-            trueNeg_count = benCorrect_count = sum_ary[METRICS_BTP_NDX]
-            truePos_count = malCorrect_count = sum_ary[METRICS_MTP_NDX]
+        trueNeg_count = benCorrect_count = sum_ary[METRICS_BTP_NDX]
+        truePos_count = malCorrect_count = sum_ary[METRICS_MTP_NDX]
 #
 #             falsePos_count = benLabel_count - benCorrect_count
 #             falseNeg_count = malLabel_count - malCorrect_count
 
 
-            metrics_dict = {}
-            metrics_dict['loss/all'] = metrics_ary[METRICS_LOSS_NDX].mean()
-            # metrics_dict['loss/msk'] = metrics_ary[METRICS_MASKLOSS_NDX].mean()
-            # metrics_dict['loss/mal'] = metrics_ary[METRICS_MALLOSS_NDX].mean()
-            # metrics_dict['loss/lng'] = metrics_ary[METRICS_LUNG_LOSS_NDX, benLabel_mask].mean()
-            metrics_dict['loss/mal'] = metrics_ary[METRICS_MAL_LOSS_NDX].mean()
-            metrics_dict['loss/ben'] = metrics_ary[METRICS_BEN_LOSS_NDX].mean()
+        metrics_dict = {}
+        metrics_dict['loss/all'] = metrics_ary[METRICS_LOSS_NDX].mean()
+        # metrics_dict['loss/msk'] = metrics_ary[METRICS_MASKLOSS_NDX].mean()
+        # metrics_dict['loss/mal'] = metrics_ary[METRICS_MALLOSS_NDX].mean()
+        # metrics_dict['loss/lng'] = metrics_ary[METRICS_LUNG_LOSS_NDX, benLabel_mask].mean()
+        metrics_dict['loss/mal'] = np.nan_to_num(metrics_ary[METRICS_MAL_LOSS_NDX, malLabel_mask].mean())
+        metrics_dict['loss/ben'] = metrics_ary[METRICS_BEN_LOSS_NDX, benLabel_mask].mean()
+        # metrics_dict['loss/flg'] = metrics_ary[METRICS_FLG_LOSS_NDX].mean()
 
-            metrics_dict['flagged/all'] = sum_ary[METRICS_MOK_NDX] / (sum_ary[METRICS_MTP_NDX] + sum_ary[METRICS_MFN_NDX]) * 100
-            metrics_dict['flagged/slices'] = (malLabel_mask & malFound_mask).sum() / malLabel_mask.sum() * 100
+        # metrics_dict['flagged/all'] = sum_ary[METRICS_MOK_NDX] / (sum_ary[METRICS_MTP_NDX] + sum_ary[METRICS_MFN_NDX]) * 100
+        # metrics_dict['flagged/slices'] = (malLabel_mask & malFound_mask).sum() / malLabel_mask.sum() * 100
 
-            metrics_dict['correct/mal'] = sum_ary[METRICS_MTP_NDX] / (sum_ary[METRICS_MTP_NDX] + sum_ary[METRICS_MFN_NDX]) * 100
-            metrics_dict['correct/ben'] = sum_ary[METRICS_BTP_NDX] / (sum_ary[METRICS_BTP_NDX] + sum_ary[METRICS_BFN_NDX]) * 100
+        metrics_dict['correct/mal'] = sum_ary[METRICS_MTP_NDX] / (sum_ary[METRICS_MTP_NDX] + sum_ary[METRICS_MFN_NDX]) * 100
+        metrics_dict['correct/ben'] = sum_ary[METRICS_BTP_NDX] / (sum_ary[METRICS_BTP_NDX] + sum_ary[METRICS_BFN_NDX]) * 100
 
-            precision = metrics_dict['pr/precision'] = sum_ary[METRICS_MTP_NDX] / ((sum_ary[METRICS_MTP_NDX] + sum_ary[METRICS_MFP_NDX]) or 1)
-            recall    = metrics_dict['pr/recall']    = sum_ary[METRICS_MTP_NDX] / ((sum_ary[METRICS_MTP_NDX] + sum_ary[METRICS_MFN_NDX]) or 1)
+        precision = metrics_dict['pr/precision'] = sum_ary[METRICS_MTP_NDX] / ((sum_ary[METRICS_MTP_NDX] + sum_ary[METRICS_MFP_NDX]) or 1)
+        recall    = metrics_dict['pr/recall']    = sum_ary[METRICS_MTP_NDX] / ((sum_ary[METRICS_MTP_NDX] + sum_ary[METRICS_MFN_NDX]) or 1)
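+        # The "or 1" fallback keeps these ratios finite when a class is absent.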
 
-            metrics_dict['pr/f1_score'] = 2 * (precision * recall) / ((precision + recall) or 1)
+        metrics_dict['pr/f1_score'] = 2 * (precision * recall) / ((precision + recall) or 1)
 
-            log.info(("E{} {:8} "
-                     + "{loss/all:.4f} loss, "
-                     + "{flagged/all:-5.1f}% pixels flagged, "
-                     + "{flagged/slices:-5.1f}% slices flagged, "
-                     + "{pr/precision:.4f} precision, "
-                     + "{pr/recall:.4f} recall, "
-                     + "{pr/f1_score:.4f} f1 score"
-                      ).format(
-                epoch_ndx,
-                mode_str,
-                **metrics_dict,
-            ))
-            log.info(("E{} {:8} "
-                     + "{loss/mal:.4f} loss, "
-                     + "{correct/mal:-5.1f}% correct ({malCorrect_count:} of {malLabel_count:})"
-            ).format(
-                epoch_ndx,
-                mode_str + '_mal',
-                malCorrect_count=malCorrect_count,
-                malLabel_count=malLabel_count,
-                **metrics_dict,
-            ))
-            log.info(("E{} {:8} "
-                     + "{loss/ben:.4f} loss, "
-                     + "{correct/ben:-5.1f}% correct ({benCorrect_count:} of {benLabel_count:})"
-            ).format(
-                epoch_ndx,
-                mode_str + '_msk',
-                benCorrect_count=benCorrect_count,
-                benLabel_count=benLabel_count,
-                **metrics_dict,
-            ))
+        log.info(("E{} {:8} "
+                 + "{loss/all:.4f} loss, "
+                 # + "{loss/flg:.4f} flagged loss, "
+                 # + "{flagged/all:-5.1f}% pixels flagged, "
+                 # + "{flagged/slices:-5.1f}% slices flagged, "
+                 + "{pr/precision:.4f} precision, "
+                 + "{pr/recall:.4f} recall, "
+                 + "{pr/f1_score:.4f} f1 score"
+                  ).format(
+            epoch_ndx,
+            mode_str,
+            **metrics_dict,
+        ))
+        log.info(("E{} {:8} "
+                 + "{loss/mal:.4f} loss, "
+                 + "{correct/mal:-5.1f}% correct ({malCorrect_count:} of {malLabel_count:})"
+        ).format(
+            epoch_ndx,
+            mode_str + '_mal',
+            malCorrect_count=malCorrect_count,
+            malLabel_count=malLabel_count,
+            **metrics_dict,
+        ))
+        log.info(("E{} {:8} "
+                 + "{loss/ben:.4f} loss, "
+                 + "{correct/ben:-5.1f}% correct ({benCorrect_count:} of {benLabel_count:})"
+        ).format(
+            epoch_ndx,
+            mode_str + '_ben',
+            benCorrect_count=benCorrect_count,
+            benLabel_count=benLabel_count,
+            **metrics_dict,
+        ))
 
-            if epoch_ndx > 0: # TODO revert
-                self.initTensorboardWriters()
-                writer = getattr(self, mode_str + '_writer')
+        writer = getattr(self, mode_str + '_writer')
 
-                for key, value in metrics_dict.items():
-                    writer.add_scalar('seg_' + key, value, self.totalTrainingSamples_count)
+        prefix_str = 'seg_' if self.cli_args.segmentation else ''
 
-#                 writer.add_pr_curve(
-#                     'pr',
-#                     metrics_ary[METRICS_LABEL_NDX],
-#                     metrics_ary[METRICS_PRED_NDX],
-#                     self.totalTrainingSamples_count,
-#                 )
+        for key, value in metrics_dict.items():
+            writer.add_scalar(prefix_str + key, value, self.totalTrainingSamples_count)
 
-#                 benHist_mask = benLabel_mask & (metrics_ary[METRICS_PRED_NDX] > 0.01)
-#                 malHist_mask = malLabel_mask & (metrics_ary[METRICS_PRED_NDX] < 0.99)
-#
-#                 bins = [x/50.0 for x in range(51)]
-#                 writer.add_histogram(
-#                     'is_ben',
-#                     metrics_ary[METRICS_PRED_NDX, benHist_mask],
-#                     self.totalTrainingSamples_count,
-#                     bins=bins,
-#                 )
-#                 writer.add_histogram(
-#                     'is_mal',
-#                     metrics_ary[METRICS_PRED_NDX, malHist_mask],
-#                     self.totalTrainingSamples_count,
-#                     bins=bins,
-#                 )
-
-    def saveModel(self, epoch_ndx):
-        file_path = os.path.join('data', 'models', self.cli_args.tb_prefix, '{}_{}.{}.state'.format(self.time_str, self.cli_args.comment, self.totalTrainingSamples_count))
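+        # PR curve and prediction histograms only make sense for the
+        # classification head; log them once per epoch.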
+        if not self.cli_args.segmentation:
+            writer.add_pr_curve(
+                'pr',
+                metrics_ary[METRICS_LABEL_NDX],
+                metrics_ary[METRICS_PRED_NDX],
+                self.totalTrainingSamples_count,
+            )
+
+            benHist_mask = benLabel_mask & (metrics_ary[METRICS_PRED_NDX] > 0.01)
+            malHist_mask = malLabel_mask & (metrics_ary[METRICS_PRED_NDX] < 0.99)
+
+            bins = [x/50.0 for x in range(51)]
+            writer.add_histogram(
+                'is_ben',
+                metrics_ary[METRICS_PRED_NDX, benHist_mask],
+                self.totalTrainingSamples_count,
+                bins=bins,
+            )
+            writer.add_histogram(
+                'is_mal',
+                metrics_ary[METRICS_PRED_NDX, malHist_mask],
+                self.totalTrainingSamples_count,
+                bins=bins,
+            )
+
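+        # Composite model-selection score: F1 dominates, with small loss
+        # penalties acting as tie-breakers.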
+        score = 1 \
+            + metrics_dict['pr/f1_score'] \
+            - metrics_dict['loss/mal'] * 0.01 \
+            - metrics_dict['loss/all'] * 0.0001
+
+        return score
+
+    def logModelMetrics(self, model):
+        writer = getattr(self, 'trn_writer')
+
+        model = getattr(model, 'module', model)
+
+        for name, param in model.named_parameters():
+            if param.requires_grad:
+                min_data = float(param.data.min())
+                max_data = float(param.data.max())
+                max_extent = max(abs(min_data), abs(max_data))
+
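+                # Symmetric bins scaled to the largest weight magnitude.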
+                bins = [x/50*max_extent for x in range(-50, 51)]
+
+                writer.add_histogram(
+                    name.rsplit('.', 1)[-1] + '/' + name,
+                    param.data.cpu().numpy(),
+                    self.totalTrainingSamples_count,
+                    bins=bins,
+                )
+
+                # print(name, param.data)
+
+    def saveModel(self, type_str, epoch_ndx, isBest=False):
+        file_path = os.path.join('data-unversioned', 'models', self.cli_args.tb_prefix, '{}_{}_{}.{}.state'.format(type_str, self.time_str, self.cli_args.comment, self.totalTrainingSamples_count))
 
         os.makedirs(os.path.dirname(file_path), mode=0o755, exist_ok=True)
 
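+        # Persist the underlying module's weights rather than the
+        # DataParallel wrapper's, so checkpoints load on a single device.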
+        model = self.model
+        if hasattr(model, 'module'):
+            model = model.module
+
         state = {
-            'model_state': self.model.state_dict(),
-            'model_name': type(self.model).__name__,
+            'model_state': model.state_dict(),
+            'model_name': type(model).__name__,
             'optimizer_state' : self.optimizer.state_dict(),
             'optimizer_name': type(self.optimizer).__name__,
             'epoch': epoch_ndx,
@@ -566,6 +699,12 @@ class LunaTrainingApp(object):
 
         log.debug("Saved model params to {}".format(file_path))
 
+        if isBest:
+            file_path = os.path.join('data-unversioned', 'models', self.cli_args.tb_prefix, '{}_{}_{}.{}.state'.format(type_str, self.time_str, self.cli_args.comment, 'best'))
+            torch.save(state, file_path)
+
+            log.debug("Saved model params to {}".format(file_path))
+
 
 if __name__ == '__main__':
     sys.exit(LunaTrainingApp().main() or 0)

+ 1 - 1
p2ch10_explore_data.ipynb

@@ -412,7 +412,7 @@
    "name": "python",
    "nbconvert_exporter": "python",
    "pygments_lexer": "ipython3",
-   "version": "3.6.5"
+   "version": "3.6.6"
   }
  },
  "nbformat": 4,

+ 7 - 1
util/disk.py

@@ -78,7 +78,13 @@ class GzipDisk(Disk):
         return value
 
 def getCache(scope_str):
-    return FanoutCache('data/cache/' + scope_str, disk=GzipDisk, shards=32, timeout=1, size_limit=2e11)
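+    # The cache lives under data-unversioned/ (kept out of version control).
+    # More shards reduce lock contention across worker processes, and values
+    # under 1 MiB (disk_min_file_size=2**20) stay in SQLite rather than
+    # spilling to individual files.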
+    return FanoutCache('data-unversioned/cache/' + scope_str,
+                       disk=GzipDisk,
+                       shards=128,
+                       timeout=1,
+                       size_limit=2e11,
+                       disk_min_file_size=2**20,
+                       )
 
 # def disk_cache(base_path, memsize=2):
 #     def disk_cache_decorator(f):

+ 2 - 0
util/unet.py

@@ -98,12 +98,14 @@ class UNetConvBlock(nn.Module):
         block.append(nn.Conv2d(in_size, out_size, kernel_size=3,
                                padding=int(padding)))
         block.append(nn.ReLU())
+        # block.append(nn.LeakyReLU())
         if batch_norm:
             block.append(nn.BatchNorm2d(out_size))
 
         block.append(nn.Conv2d(out_size, out_size, kernel_size=3,
                                padding=int(padding)))
         block.append(nn.ReLU())
+        # block.append(nn.LeakyReLU())
         if batch_norm:
             block.append(nn.BatchNorm2d(out_size))
 

+ 9 - 1
util/util.py

@@ -32,7 +32,15 @@ def xyz2irc(coord_xyz, origin_xyz, vxSize_xyz, direction_tup):
 def irc2xyz(coord_irc, origin_xyz, vxSize_xyz, direction_tup):
     # Note: _cri means Col,Row,Index
     coord_cri = np.array(list(reversed(coord_irc)))
-    coord_xyz = coord_cri * np.array(vxSize_xyz) + np.array(origin_xyz)
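+    # Only axis-aligned orientations are handled: the identity and the common
+    # (-1, -1, 1) flip. Anything else (e.g. oblique acquisitions) raises
+    # rather than silently producing wrong coordinates.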
+    if direction_tup == (1, 0, 0, 0, 1, 0, 0, 0, 1):
+        direction_ary = np.ones((3,))
+    elif direction_tup == (-1, 0, 0, 0, -1, 0, 0, 0, 1):
+        direction_ary = np.array((-1, -1, 1))
+    else:
+        raise Exception("Unsupported direction_tup: {}".format(direction_tup))
+
+
+    coord_xyz = coord_cri * direction_ary * np.array(vxSize_xyz) + np.array(origin_xyz)
     return XyzTuple(*coord_xyz.tolist())
 
 

Too many files were changed in this changeset, so some files are not shown.