Keras: Multi-class Classification Example

We’ll be using the packaged Reuters dataset, which contains short news articles, each labeled with 1 of 46 topics. The task is to train a classifier that assigns an article to one of those 46 topics.

from keras.datasets import reuters
(tr_data, tr_labels), (ts_data, ts_labels) = reuters.load_data(num_words=10000)

The num_words=10000 argument tells Keras to keep only the 10,000 most frequently occurring words in the data; rarer words are dropped.
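
Before moving on, it can help to check what load_data() actually returned; a quick look along these lines works (the exact counts depend on your Keras version):

print(len(tr_data), len(ts_data))   # number of training and test articles
print(tr_data[0][:10])              # each article is a list of word indexes
print(tr_labels[0])                 # each label is an integer topic id between 0 and 45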

Preparing the Data

The training and test data consist of arrays of integer indexes, where each index refers to a unique word. For example, tr_data[1] is [1, 3267, 699, …]. Index 1 is a special marker for the beginning of the text (it decodes to “?” below), index 3267 refers to the word “generale”, and index 699 refers to the word “de”.

You can decode the data with this:

word_index = reuters.get_word_index()  # maps each word to its integer index
reverse_word_index = dict([(val, key) for (key, val) in word_index.items()])  # index -> word
# Indexes are offset by 3 because 0, 1, and 2 are reserved for padding, start-of-sequence, and unknown.
article = ' '.join([reverse_word_index.get(i - 3, '?') for i in tr_data[1]])

We want to one-hot encode this data so that each article becomes an array of 10,000 entries, with a 1 in a position if the corresponding word appears in the article. You can think of deep learning as working on input that looks like a 2D table: there needs to be a fixed number of columns, but an article can contain an arbitrary number of words. So we project each article onto 10,000 columns, where each column represents one of the 10,000 most common unique words.

import numpy as np

def vectorize_sequences(sequences, dimension=10000):
    # One row per article, one column per word index; start with all zeros.
    vectors = np.zeros((len(sequences), dimension))
    for i, sequence in enumerate(sequences):
        vectors[i, sequence] = 1  # set a 1 for every word index present in this article
    return vectors

tr_x = vectorize_sequences(tr_data)
ts_x = vectorize_sequences(ts_data)
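
A quick check confirms the vectorized data has the expected shape (assuming the default dimension=10000 used above):

print(tr_x.shape)      # (number of training articles, 10000)
print(tr_x[0][:10])    # each row is a 10,000-dimensional vector of 0s and 1s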

Preparing the Labels

The training and test labels are integer topic ids from 0 to 45, and we want to one-hot encode these as well. We could write a function similar to the one above, or use the built-in to_categorical() method.

from keras.utils.np_utils import to_categorical

tr_y = to_categorical(tr_labels)
ts_y = to_categorical(ts_labels)
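
For reference, a hand-rolled version of the label encoding could look like the sketch below, following the same pattern as vectorize_sequences() above (the helper name to_one_hot is just for illustration and produces the same result as to_categorical()):

def to_one_hot(labels, dimension=46):
    # One row per label, one column per class; a single 1 marks the true topic.
    results = np.zeros((len(labels), dimension))
    for i, label in enumerate(labels):
        results[i, label] = 1
    return results

assert (to_one_hot(tr_labels) == tr_y).all()   # same arrays as to_categorical() above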

Building the Model

Now we build our model, i.e. the deep learning architecture, for multi-class classification.

from keras import models
from keras import layers

model = models.Sequential()
model.add(layers.Dense(128, activation='relu', input_shape=(10000,)))
model.add(layers.Dense(128, activation='relu'))
model.add(layers.Dense(46, activation='softmax'))

model.compile(optimizer='rmsprop', loss='categorical_crossentropy', metrics=['accuracy'])

3 things to note in the above model-building code:

  • The last layer (the output layer) has 46 nodes because we’re trying to classify each example into 1 of 46 classes.
  • The output layer uses a softmax activation function to produce a probability distribution over the 46 output classes (a quick check of this is sketched below the list).
  • In the model compilation, the loss function is “categorical_crossentropy”, the appropriate loss for a multi-class classification task with one-hot encoded labels.
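
As a quick illustration of the second point, you can push a single encoded article through the model, even before training, and confirm that the 46 outputs behave like a probability distribution (this snippet assumes the model and tr_x defined above):

probs = model.predict(tr_x[:1])   # shape (1, 46)
print(probs.sum())                # sums to approximately 1, thanks to softmax
print(probs.argmax())             # index of the most probable class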

Train and Test Model

Finally, train/fit the model and evaluate it on the test data and labels.

model.fit(tr_x, tr_y, epochs=4, batch_size=512)

model.evaluate(ts_x, ts_y)   # returns the test loss and test accuracy
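
Once trained, the model can also produce predictions for new articles. Each prediction is a 46-way probability distribution, so taking the argmax gives the predicted topic id; a short sketch:

predictions = model.predict(ts_x)
print(predictions.shape)              # (number of test articles, 46)

predicted_topics = predictions.argmax(axis=1)
print(predicted_topics[:10])          # predicted topic ids for the first 10 test articles
print(ts_labels[:10])                 # true topic ids, for comparison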