TensorFlow is an open-source platform that makes it easier to build your own machine learning algorithm. Instead of working through the linear algebra and writing each layer individually, you can simply specify the number of nodes, the layers and the activation function. In the following section, we demonstrate how to retrain a machine learning algorithm for a new classification task.
Getting started
The following instructions are for Ubuntu with an NVIDIA graphics card. Install the NVIDIA driver, Docker (https://docs.docker.com/install/) and NVIDIA Docker (https://github.com/NVIDIA/nvidia-docker), then pull the TensorFlow Docker image from the terminal.
For docker image with gpu support and jupyter notebook:
$ docker pull tensorflow/tensorflow:2.0.0b1-gpu-py3-jupyter
For docker image without gpu support:
$ docker pull tensorflow/tensorflow:2.0.0b1-py3-jupyter
For docker image with gpu support but without jupyter notebook:
$ docker pull tensorflow/tensorflow:2.0.0b1-gpu-py3
To start a jupyter notebook:
$ docker run --runtime=nvidia -it -p 8888:8888 tensorflow/tensorflow:2.0.0b1-gpu-py3-jupyter
Alternatively, to start a tensorflow session without using jupyter notebook:
$ docker run --runtime=nvidia -it tensorflow/tensorflow:2.0.0b1-gpu-py3
To check that the installation works, run the following in jupyter notebook:
import tensorflow as tf
print(tf.__version__)
This should give the output 2.0.0-beta1, as shown in Fig. 10.
The remaining steps are performed in jupyter notebook.
Install the following packages (Fig. 11):
pip install keras
pip install -U scikit-learn
pip install scikit-image
Upload your data as a zip or tgz file (example data: http://download.tensorflow.org/example_images/flower_photos.tgz). You can also use your own data; however, the folder structure should follow that of Fig. 12. Each class (Class_1, Class_2, etc.) has its own folder, and the images belonging to that class are placed in that folder. In the example data, the classes are the types of flowers: daisy, dandelion, roses, sunflowers and tulips.
To extract your uploaded file (named RANZCR_data.tgz here; substitute the name of your own file), use:
import tarfile
tar = tarfile.open("RANZCR_data.tgz", "r:gz")
tar.extractall()
tar.close()
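The upload may be a zip rather than a tgz file, and the later code relies on one subfolder per class. The following is a minimal sketch of how both points could be handled with the standard library. The helper names unpack_upload and summarise_classes are hypothetical, as are the file names in the usage comments; shutil.unpack_archive picks the archive format from the file extension.

```python
import pathlib
import shutil


def unpack_upload(archive_path, dest="."):
    """Extract a .zip, .tgz or .tar.gz archive into dest."""
    shutil.unpack_archive(str(archive_path), str(dest))


def summarise_classes(data_root):
    """Return {class_folder_name: number_of_files} for each subfolder of data_root."""
    root = pathlib.Path(data_root)
    return {
        folder.name: sum(1 for f in folder.iterdir() if f.is_file())
        for folder in sorted(root.iterdir())
        if folder.is_dir()
    }


# Hypothetical usage in the notebook:
# unpack_upload("RANZCR_data.tgz")
# print(summarise_classes("./RANZCR_data"))
```

A class folder that reports zero files, or a class that is missing from the summary, usually means the archive did not follow the one-folder-per-class layout.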
The algorithm
The following code is adapted from the tensorflow 2.0 beta online tutorial (https://www.tensorflow.org/beta).
from __future__ import absolute_import, division, print_function, unicode_literals
import tensorflow as tf
import pathlib
import random
import numpy as np
AUTOTUNE = tf.data.experimental.AUTOTUNE
## Import needed modules
def load_and_preprocess_image(path):
    image = tf.io.read_file(path)
    return preprocess_image(image)
def preprocess_image(image):
    image = tf.image.decode_jpeg(image, channels=3)
    image = tf.image.resize(image, (299, 299))
    image /= 255.0
    return image
## Define functions for loading images. Inception v3 expects an image size of 299 x 299 pixels. We normalize each image so that its pixel values lie between 0 and 1.
data_root = pathlib.Path('./RANZCR_data')
## This is where the data is located
all_image_paths = list(data_root.glob('*/*'))
all_image_paths = [str(path) for path in all_image_paths]
random.shuffle(all_image_paths)
image_count = len(all_image_paths)
## Builds a list of all the image paths, then shuffles it.
label_names = sorted(item.name for item in data_root.glob('*/') if item.is_dir())
## label_names is taken from the names of the subfolders of data_root
label_to_index = dict((name, index) for index, name in enumerate(label_names))
## We assign an index to each label, so each label becomes a number
all_image_labels = [label_to_index[pathlib.Path(path).parent.name]
                    for path in all_image_paths]
## Now we create a list of labels for each file
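To make the label encoding concrete, here is a small sketch that uses the five flower classes from the example data and two made-up file paths. The paths and the demo_ variable names are illustrative assumptions, chosen so that the sketch does not overwrite the variables defined above.

```python
import pathlib

# Class folder names from the example data, sorted alphabetically as above
demo_label_names = sorted(["tulips", "daisy", "roses", "dandelion", "sunflowers"])

# Map each class name to an integer index
demo_label_to_index = {name: index for index, name in enumerate(demo_label_names)}

# Hypothetical image paths; the parent folder name is the class
demo_paths = ["flower_photos/roses/img_001.jpg", "flower_photos/daisy/img_002.jpg"]
demo_labels = [demo_label_to_index[pathlib.Path(p).parent.name] for p in demo_paths]

print(demo_label_to_index)  # {'daisy': 0, 'dandelion': 1, 'roses': 2, 'sunflowers': 3, 'tulips': 4}
print(demo_labels)          # [2, 0]
```

Because the names are sorted before they are numbered, the same folder names always produce the same indices, regardless of the order in which the file system lists the folders.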
path_ds = tf.data.Dataset.from_tensor_slices(all_image_paths)
image_ds = path_ds.map(load_and_preprocess_image, num_parallel_calls=AUTOTUNE)
## This loads and formats images over the dataset of paths
label_ds = tf.data.Dataset.from_tensor_slices(tf.cast(all_image_labels, tf.int64))
image_label_ds = tf.data.Dataset.zip((image_ds, label_ds))
ds = tf.data.Dataset.from_tensor_slices((all_image_paths, all_image_labels))
def load_and_preprocess_from_path_label(path, label):
    return load_and_preprocess_image(path), label
image_label_ds = ds.map(load_and_preprocess_from_path_label)
## Create our image, label pairs
BATCH_SIZE = 32
ds = image_label_ds.shuffle(buffer_size=image_count)
ds = ds.repeat()
ds = ds.batch(BATCH_SIZE)
ds = ds.prefetch(buffer_size=AUTOTUNE)
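## Shuffles the dataset, repeats it indefinitely, groups it into batches of BATCH_SIZE and prefetches batches in the background while the model trains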
inception = tf.keras.applications.inception_v3.InceptionV3(input_shape=(299, 299, 3), include_top=False, weights='imagenet')
inception.trainable=False
## Loads the weights of inception v3, pretrained on imagenet.
def change_range(image,label):
    return 2*image-1, label
keras_ds = ds.map(change_range)
## The model expects inputs normalised to the [-1, 1] range, so we need to convert from [0, 1] to [-1, 1]
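The arithmetic behind this rescaling can be checked by hand. The sketch below, in plain Python, applies the two training steps (divide by 255, then 2*x - 1) to a few raw pixel values and compares the result with scaling the raw pixel directly by x / 127.5 - 1. The function names are illustrative only. The comparison is relevant because the Keras Inception v3 preprocess_input function also scales raw pixels to [-1, 1], so training and test images should end up in the same range.

```python
def to_unit_range(pixel):
    """Scale a raw 0-255 pixel value to [0, 1], as preprocess_image does."""
    return pixel / 255.0


def to_signed_range(value):
    """Map a [0, 1] value to [-1, 1], as change_range does."""
    return 2 * value - 1


def raw_to_signed(pixel):
    """Map a raw 0-255 pixel value straight to [-1, 1]."""
    return pixel / 127.5 - 1


for pixel in (0, 51, 255):
    two_step = to_signed_range(to_unit_range(pixel))
    one_step = raw_to_signed(pixel)
    # The two routes agree up to floating-point rounding
    print(pixel, round(two_step, 6), round(one_step, 6))
```

A raw value of 0 maps to -1 and 255 maps to 1 by either route.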
model = tf.keras.Sequential([
    inception,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(len(label_names), activation = 'softmax')])
## This is our model, with inception v3 as the base model and an output layer that has one node for each entry in label_names
model.compile(optimizer=tf.keras.optimizers.Adam(),
              loss='sparse_categorical_crossentropy',
              metrics=["accuracy"])
## We compile the model, specifying the optimizer, how the loss is calculated and the metric to report
steps_per_epoch=tf.math.ceil(len(all_image_paths)/BATCH_SIZE).numpy()
model.fit(keras_ds, epochs=10, steps_per_epoch=steps_per_epoch)
## Training our model, specifying the number of epochs (one epoch is one pass of all the training data through the algorithm) and the number of steps per epoch. Because the dataset repeats indefinitely, steps_per_epoch tells the model where each epoch ends. See an example of training in Fig. 12.
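As a worked example of the steps-per-epoch calculation, assume a hypothetical dataset of 1000 images (this is not the size of the example data). With a batch size of 32, 31 full batches cover only 992 images, so rounding up gives 32 steps per epoch. The demo_ names are illustrative.

```python
import math

demo_image_count = 1000  # hypothetical number of training images
demo_batch_size = 32     # same batch size as used above

# Round up so that the final, partly filled batch is still counted
demo_steps = math.ceil(demo_image_count / demo_batch_size)
print(demo_steps)  # 32
```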
Testing images
from keras.preprocessing import image
from keras.applications.inception_v3 import preprocess_input
## importing the modules we need
img_path='test_image.jpg'
img = image.load_img(img_path, target_size=(299, 299))
x = image.img_to_array(img)
x = np.expand_dims(x, axis=0)
x = preprocess_input(x)
preds = model.predict(x)
print(preds)
## This outputs the predicted probability for each category. See the example in Fig. 13. The test image was a tulip and the algorithm correctly identified it with 97% probability.
print(label_names)
## This is the order of our labels. See Fig. 13.
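To turn the printed probabilities into a single answer, the index of the largest probability can be looked up in label_names. The sketch below uses made-up probabilities and the flower class names so that it runs on its own. The helper top_prediction is hypothetical; in the notebook it could be called as top_prediction(preds[0], label_names).

```python
def top_prediction(probabilities, names):
    """Return (class_name, probability) for the most probable class."""
    best_index = max(range(len(names)), key=lambda i: probabilities[i])
    return names[best_index], float(probabilities[best_index])


# Made-up values for illustration only (not real model output)
demo_names = ["daisy", "dandelion", "roses", "sunflowers", "tulips"]
demo_probs = [0.01, 0.005, 0.01, 0.005, 0.97]

print(top_prediction(demo_probs, demo_names))  # ('tulips', 0.97)
```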
import IPython.display as display
display.display(display.Image(img_path))
## This displays our test image so we can check visually whether the algorithm has predicted correctly.