We’ll be using the packaged Reuters dataset, which contains short newswire articles, each labeled with one of 46 topics. The task is to train a classifier that assigns an article to one of those 46 topics.
from keras.datasets import reuters

(tr_data, tr_labels), (ts_data, ts_labels) = reuters.load_data(num_words=10000)
The num_words argument tells Keras to keep only the 10,000 most frequently occurring words in the data; rarer words are discarded.
Preparing the Data
The training and test data consist of lists of word indices, one list per article. For example, an article in tr_data looks like [1, 3267, 699, …].
Index 1 is a special marker for the beginning of the text (it decodes to “?”), index 3267 maps to the word “generale”, and index 699 to the word “de”.
You can decode the data with this:
word_index = reuters.get_word_index()
reverse_word_index = dict([(val, key) for (key, val) in word_index.items()])
# Indices are offset by 3 because 0, 1, and 2 are reserved markers
article = ' '.join([reverse_word_index.get(i - 3, '?') for i in tr_data[0]])
We want to one-hot encode this data so that each article becomes a vector of 10,000 entries, with a 1 in a given position if the corresponding word appears in the article. You can think of deep learning as operating on a 2D table: there must be a fixed number of columns, but an article can contain an arbitrary number of words. We therefore project each article onto 10,000 columns, where each column represents one of the 10,000 most common unique words.
import numpy as np

def vectorize_sequences(sequences, dimension=10000):
    vector = np.zeros((len(sequences), dimension))
    for i, sequence in enumerate(sequences):
        vector[i, sequence] = 1
    return vector

tr_x = vectorize_sequences(tr_data)
ts_x = vectorize_sequences(ts_data)
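To see what the encoding does, here is a toy sanity check (made-up index sequences and a vocabulary of 6 “words”, not the Reuters data): each row gets a 1 in every column whose index appears in the corresponding sequence.

```python
import numpy as np

def vectorize_sequences(sequences, dimension=10000):
    # Multi-hot encode: row i gets a 1 in every column listed in sequences[i]
    vector = np.zeros((len(sequences), dimension))
    for i, sequence in enumerate(sequences):
        vector[i, sequence] = 1
    return vector

# Toy example: two "articles" over a 6-word vocabulary
toy = [[1, 3], [0, 3, 5]]
encoded = vectorize_sequences(toy, dimension=6)
print(encoded)
# [[0. 1. 0. 1. 0. 0.]
#  [1. 0. 0. 1. 0. 1.]]
```

Note that repeated words collapse to a single 1, so word counts are discarded; only presence or absence is kept.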
Preparing the Labels
The training and test labels are integer class ids from 0 to 45, and we want to one-hot encode these as well. We could write a function similar to the one above, or use the built-in to_categorical() method.
from keras.utils.np_utils import to_categorical

tr_y = to_categorical(tr_labels)
ts_y = to_categorical(ts_labels)
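Conceptually, one-hot encoding a label just picks a row of an identity matrix. A minimal numpy equivalent (a hypothetical helper, not the Keras implementation) looks like this:

```python
import numpy as np

def one_hot(labels, num_classes):
    # Row i is all zeros except a 1 at column labels[i]
    return np.eye(num_classes)[labels]

print(one_hot(np.array([0, 2, 1]), 3))
# [[1. 0. 0.]
#  [0. 0. 1.]
#  [0. 1. 0.]]
```

For the Reuters labels, num_classes would be 46, giving each article a 46-element target vector.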
Building the Model
Now we build our model or deep learning architecture for multi-class classification.
from keras import models
from keras import layers

model = models.Sequential()
model.add(layers.Dense(128, activation='relu', input_shape=(10000,)))
model.add(layers.Dense(128, activation='relu'))
model.add(layers.Dense(46, activation='softmax'))

model.compile(optimizer='rmsprop',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
Three things to note in the model-building code above:
- The last layer (the output layer) has 46 nodes because we’re classifying each example into 1 of 46 classes.
- The last layer uses a softmax activation function to produce a probability distribution over the 46 output classes.
- In the model compilation, the loss function is “categorical_crossentropy”, the standard choice for a single-label, multi-class classification task.
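To see what softmax does, here is a small numpy sketch (toy logits, not real model outputs): it exponentiates each score and normalizes, so the outputs are non-negative and sum to 1, forming a probability distribution.

```python
import numpy as np

def softmax(logits):
    # Subtract the max for numerical stability before exponentiating
    exp = np.exp(logits - np.max(logits))
    return exp / exp.sum()

probs = softmax(np.array([2.0, 1.0, 0.1]))
print(probs)                  # sums to 1 (up to float rounding)
print(int(np.argmax(probs)))  # 0 -- the highest-scoring class
```

In the real model, the same thing happens over 46 logits, so the class with the largest raw score becomes the most probable topic.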
Train and Test Model
Finally, fit the model on the training data and evaluate it on the test data and labels.
model.fit(tr_x, tr_y, epochs=4, batch_size=512)
model.evaluate(ts_x, ts_y)
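To turn the model’s probability outputs into topic predictions, take the argmax over each row returned by predict(). A numpy sketch with fabricated probabilities (assumed shapes only, not real model output):

```python
import numpy as np

# Pretend these are two rows of model.predict(ts_x): one probability
# distribution over 46 topics per article
fake_probs = np.zeros((2, 46))
fake_probs[0, 3] = 0.9
fake_probs[0, 7] = 0.1
fake_probs[1, 40] = 1.0

# argmax along axis 1 picks the most probable topic id for each article
predicted_topics = fake_probs.argmax(axis=1)
print(predicted_topics)  # [ 3 40]
```

These integer topic ids match the original label encoding, so they can be compared directly against ts_labels.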