Super quick guide to starting a Keras job on AWS

You can follow my last post on how to set up all the drivers, docker, and jupyter to get your own keras/tensorflow up and running from a plain ubuntu machine.

But AWS provides an AMI that includes everything you need without the dockerization.

Simply launch a Deep Learning AMI. I used the following one.
[Screenshot: the Deep Learning AMI I selected]

Then ssh into your machine and simply run

jupyter notebook --ip=0.0.0.0 --allow-root --NotebookApp.token=''

Point your browser to http://SERVER:8888/lab

And that’s it!


Setting up Docker Container with Tensorflow/Keras using Ubuntu Nvidia GPU acceleration

Deep learning is all the rage now. Here’s a quick and dirty guide to setting up a docker container with tensorflow/keras that takes advantage of GPU acceleration. The info here is available on the official sites of Docker, Nvidia, Ubuntu, and Tensorflow, but I put it all together here so you don’t have to hunt around.

I’m assuming you’re on Ubuntu with an Nvidia GPU. (I tested on Ubuntu 18)
In AWS, you can set your instance type to anything that starts with p* (e.g. p3.16xlarge).

Download the Nvidia driver

Visit https://www.nvidia.com/object/unix.html
(Probably pick the Latest Long Lived Branch Version of Linux x86_64/AMD64/EM64T)

wget the download link
e.g.

wget http://us.download.nvidia.com/XFree86/Linux-x86_64/410.93/NVIDIA-Linux-x86_64-410.93.run

Run the nvidia driver install script

chmod +x NVIDIA-Linux-x86_64-410.93.run
sudo ./NVIDIA-Linux-x86_64-410.93.run

Install Docker
reference

sudo apt-get install \
    apt-transport-https \
    ca-certificates \
    curl \
    gnupg-agent \
    software-properties-common

curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -

sudo add-apt-repository \
   "deb [arch=amd64] https://download.docker.com/linux/ubuntu \
   $(lsb_release -cs) \
   stable"

sudo apt-get update

sudo apt-get install docker-ce

Install Nvidia-Docker 2
reference

# Add the package repositories
curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | \
  sudo apt-key add -

distribution=$(. /etc/os-release;echo $ID$VERSION_ID)

curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | \
  sudo tee /etc/apt/sources.list.d/nvidia-docker.list

sudo apt-get update

# Install nvidia-docker2 and reload the Docker daemon configuration
sudo apt-get install -y nvidia-docker2
sudo pkill -SIGHUP dockerd

# Test nvidia-smi with the latest official CUDA image
sudo docker run --runtime=nvidia --rm nvidia/cuda:9.0-base nvidia-smi

This is some personal misc setup

Create a “notebooks” folder under your home dir (/home/ubuntu)

mkdir ~/notebooks

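Create a jupyter startup script in your home folder (/home/ubuntu) and make it executable (chmod +x ~/jup)
filename: jup
Content:

#!/bin/bash
# Start Jupyter in the directory passed as $1, or in $NOTEBOOK_HOME if no argument is given
if [ $# -eq 0 ]
  then
    cd "$NOTEBOOK_HOME" && jupyter notebook --ip=0.0.0.0 --allow-root --NotebookApp.token=''
  else
    cd "$1" && jupyter notebook --ip=0.0.0.0 --allow-root --NotebookApp.token=''
fi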

Start Docker container with tensorflow-gpu

sudo docker run --runtime=nvidia --env NOTEBOOK_HOME=/home/ubuntu/notebooks -p 8888:8888 -p 8080:8080 -v /home:/home -it --rm tensorflow/tensorflow:latest-gpu-py3-jupyter bash

This docker container will give you tensorflow with gpu support, python3, and a jupyter notebook.
For a list of other tensorflow containers (e.g. non-gpu or python2 versions), see the tags of the tensorflow/tensorflow image on Docker Hub.

If you created the jup script earlier, you can call that to start the Jupyter Notebook. This will also point the notebook home dir to ~/notebooks folder you created:

/home/ubuntu/jup

If you did not install the jup script, then you can run the following command.

jupyter notebook --allow-root

Note that the first time you invoke this, you’ll need to hit the url with the token that’s given to you

To exit the terminal without shutting down Jupyter notebook and the docker container:

Hit Ctrl+p, then Ctrl+q

Inside Jupyter Notebook
Open a browser to:

http://SERVER:8888/tree

Some packages require git, so you may install it like so

!apt-get update
!apt-get install --assume-yes git

Inside the notebook, you can install python libraries like so:

!pip install keras
!pip install git+https://www.github.com/keras-team/keras-contrib.git

You can check to make sure your keras is using gpu as backend:

from keras import backend
assert len(backend.tensorflow_backend._get_available_gpus()) > 0
backend.tensorflow_backend._get_available_gpus()
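
The check above relies on a private Keras helper, so here is a rough alternative that only asks the driver what it can see. It is a minimal sketch, assuming nvidia-smi is on the PATH inside the container; the function names parse_gpu_list and list_gpus are my own, not part of any library.

```python
import shutil
import subprocess


def parse_gpu_list(csv_text):
    """Turn `nvidia-smi --query-gpu=name,memory.total --format=csv,noheader` output into dicts."""
    gpus = []
    for line in csv_text.strip().splitlines():
        if not line.strip():
            continue
        # Each line looks like "<gpu name>, <total memory>"
        name, memory = [field.strip() for field in line.split(",", 1)]
        gpus.append({"name": name, "memory": memory})
    return gpus


def list_gpus():
    """Return the GPUs the driver reports, or an empty list if nvidia-smi is missing or fails."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,memory.total", "--format=csv,noheader"],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        return []
    return parse_gpu_list(result.stdout)


print(list_gpus())
```

An empty list means the container cannot see a GPU at all, which usually points at the driver or the --runtime=nvidia flag rather than at keras.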

And that’s how you create a docker container with gpu support in ubuntu.
After you install your packages, feel free to save your docker image so you don’t have to redo the apt-get and pip installs every time.


Deep Learning with Google Colab

I am beginning to learn deep learning and I’ve been working in the AWS environment with dockerized containers of tensorflow and keras. But it’s been a bit of a pain: transferring files to/from the machines, starting/stopping them, etc. A GPU machine is also pretty expensive.

Google is now offering free GPU-accelerated Jupyter notebooks which they call Colab. You just create a folder in your Google Drive and add the Colaboratory app.
Just follow this tutorial and you’ll be up and running in 5 minutes!

Google Colab Free GPU Tutorial

(The only hitch I had was in mounting the drive. The blog said to run

drive.mount('/content/drive/')

That gave me an error. Instead I ran

drive.mount('/content/drive')

Note the removal of the trailing slash.)
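
If the mount point comes from a variable or config, a tiny helper can strip the trailing slash before mounting. This is just a sketch; normalize_mount_point is a name I made up, and the google.colab import only works inside a Colab notebook, so it is left commented out.

```python
def normalize_mount_point(path):
    """Strip trailing slashes so '/content/drive/' becomes '/content/drive'."""
    stripped = path.rstrip("/")
    # Keep the root path intact if the input was only slashes
    return stripped or "/"


mount_point = normalize_mount_point("/content/drive/")
print(mount_point)

# In a Colab notebook:
# from google.colab import drive
# drive.mount(mount_point)
```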


Angularjs: programmatically clicking a link on behalf of user

If you have a link, you can embed it in the html for the user

<a href="/path/to/page">Click here</a>

But what if you don’t have the url yet and it has to be fetched first? When does this scenario arise? As an example, I’ve developed a protocol where the user clicks a link which invokes an API that generates some data and returns a one-time download link for it. Now I could have the user click once to generate the link, write that link on the page, and have the user click again. But that’s 2 clicks. Can we do the 2nd click on behalf of the user?

Yes, we can. Here’s how.

<a href="" ng-click="initiateDownloadSequence('/path/to/api')">Click here</a>

$scope.initiateDownloadSequence = function(first_url) {
  $http.get(first_url)
    .then(function(resp) {
      var second_url = resp.data;

      var anchor = document.createElement("a");
      anchor.href = second_url;

      // next 2 lines only if it's a download url
      // var filename = 'my-download.txt';
      // anchor.download = filename;
    
      anchor.click();
    })
}

AngularJs: Initiating Downloads with Headers

If you have a url to download a file, you can simply put it in the html like so

<a href="/path/to/download" download>Download Now</a>

But what if that path is an API that requires authentication or some other headers to be set when it’s invoked? The above solution does not send over any headers that your app needs. Instead do this:

<a href="" ng-click="startDownload('/path/to/download')">Download Now</a>

  $scope.startDownload = function(downloadUrl) {

    $http({
      method: 'GET',
      url: downloadUrl,
      headers: {
        'Authorization': 'Bearer XSDREDX...'
      }
    }).then(function(resp) {
        var data = JSON.stringify(resp.data);
        var blob = new Blob([data], { type: "text/plain" });
        let objectUrl = window.URL.createObjectURL(blob);

        let anchor = document.createElement("a");
        anchor.href = objectUrl;
        var filename = 'downloaded-file.txt';
        anchor.download = filename;
        anchor.click();

        window.URL.revokeObjectURL(objectUrl);      
      })
  }


ellipsis

Use this to ellipsicize content that doesn’t fit within its specified width.

.ellipsis {
  white-space: nowrap;
  overflow: hidden;
  text-overflow: ellipsis;
  -o-text-overflow: ellipsis;
}

Using AWS Glacier

This is a surprisingly difficult task and infinitely harder than using AWS S3. So why store stuff in Glacier? Because it’s cheap.

Here’s how much you will pay per month to store a 1TB (1000GB) file today.

Storage        Cost per GB-month   Monthly cost for 1TB
EBS SSD        $0.10               $100
EBS Snapshot   $0.05               $50
S3             $0.023              $23
Glacier        $0.004              $4
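
The monthly figures are just the per-GB price times the size. Here is a small sketch of that arithmetic, using the prices from the table above (they will drift over time) and a helper name, monthly_cost, that I made up. Decimal is used so the cents don't pick up floating-point noise.

```python
from decimal import Decimal

# Per-GB monthly prices copied from the table above
PRICE_PER_GB_MONTH = {
    "EBS SSD": Decimal("0.10"),
    "EBS Snapshot": Decimal("0.05"),
    "S3": Decimal("0.023"),
    "Glacier": Decimal("0.004"),
}


def monthly_cost(storage, size_gb):
    """Return the monthly storage cost in dollars for size_gb gigabytes."""
    return PRICE_PER_GB_MONTH[storage] * size_gb


for storage in PRICE_PER_GB_MONTH:
    print(f"{storage}: ${monthly_cost(storage, 1000):.2f}/month")
```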

What else do you need to know about it? Glacier retrieval is not instant. You make a request, it takes a while to get fulfilled, and then you pick it up when it’s ready. That’s what makes it different from other storage models.

Anyways, here are the steps

1) In the AWS Console, go to the Glacier service and create a vault (e.g. my-vault)

2) Make sure you have a user that has permissions to the vault. Here’s a sample policy

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Action": [
                "glacier:*"
            ],
            "Sid": "Stmt1376667184000",
            "Resource": [
                "arn:aws:glacier:us-east-1:112233445566:vaults/my-vault"
            ],
            "Effect": "Allow"
        }
    ]
}

3) On your machine, ensure you have the awscli installed

4) Upload your files to the vault using the awscli

aws glacier upload-archive --vault-name my-vault --account-id - --body my-file.zip

Output will look like this:

{
    "checksum": "e5d002bf40...",
    "location": "/112233445566/vaults/my-vault/archives/KYKdL...",
    "archiveId": "KYKdL..."
}
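
As far as I understand the Glacier docs, the checksum in that output is the SHA-256 tree hash of the archive: the file is hashed in 1 MiB chunks and the chunk hashes are combined pairwise until one remains. If you want to verify an upload locally, a minimal sketch (the tree_hash name is mine, and it reads the whole archive into memory, so it is only meant for modest file sizes) could look like this:

```python
import hashlib

CHUNK_SIZE = 1024 * 1024  # the archive is hashed in 1 MiB chunks


def tree_hash(data):
    """Compute the SHA-256 tree hash of a bytes object and return it as hex."""
    # Hash each 1 MiB chunk; an empty archive is treated as one empty chunk
    chunks = [data[i:i + CHUNK_SIZE] for i in range(0, len(data), CHUNK_SIZE)] or [b""]
    level = [hashlib.sha256(chunk).digest() for chunk in chunks]
    # Combine neighbouring hashes pairwise until a single root hash remains
    while len(level) > 1:
        next_level = []
        for i in range(0, len(level), 2):
            pair = level[i:i + 2]
            if len(pair) == 2:
                next_level.append(hashlib.sha256(pair[0] + pair[1]).digest())
            else:
                # An odd hash out is carried up to the next level unchanged
                next_level.append(pair[0])
        level = next_level
    return level[0].hex()


print(tree_hash(b"hello glacier"))
```

For a file, something like tree_hash(open("my-file.zip", "rb").read()) should match the checksum field if the upload went through intact.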

5) To download, you have to make a request. Depending on the retrieval tier you choose, it could take minutes to hours before your request is fulfilled.

First, create a request.json file like the following:

{
  "Type": "archive-retrieval",
  "ArchiveId": "KYKdL...",
  "Description": "Retrieve archive on 2015-07-17",
  "Tier":"Expedited",
  "SNSTopic":"arn:aws:sns:us-east-1:112233445566:glacier-alert"
}

Type defines the type of job. In this case, you want “archive-retrieval” to retrieve the archived file.

ArchiveId is the archiveId returned in the output when you uploaded the file.

Tier determines how quickly your request is fulfilled. It has several choices that vary in speed and price.
For the most up-to-date tier pricing and speeds, check the Data Retrievals section of the Glacier FAQ.
Here’s the pricing and speeds as of this writing

Tier                 Price                         Fulfillment
Standard (default)   $0.01/GB + $0.05/retrieval    3 – 5 hours
Bulk                 $0.01/GB + $0.05/retrieval    5 – 12 hours
Expedited            $0.03/GB + $0.01/retrieval    1 – 5 minutes

SNSTopic allows you to register for alerts when the request is done.
You can create SNS Topics in the Simple Notification Service tab of the AWS Console and then copy the Topic ARN here.
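
If you retrieve archives often, you may prefer to generate the request file rather than edit it by hand. Here is a minimal sketch that builds the same fields described above; build_retrieval_request is a hypothetical helper of mine, and the archive id and topic ARN are the placeholders from this post.

```python
import json


def build_retrieval_request(archive_id, tier="Standard", sns_topic=None, description=None):
    """Build the job-parameters dict for an archive-retrieval job."""
    if tier not in ("Expedited", "Standard", "Bulk"):
        raise ValueError(f"unknown tier: {tier}")
    request = {"Type": "archive-retrieval", "ArchiveId": archive_id, "Tier": tier}
    # Description and SNSTopic are only included when provided
    if description:
        request["Description"] = description
    if sns_topic:
        request["SNSTopic"] = sns_topic
    return request


request = build_retrieval_request(
    "KYKdL...",  # placeholder; use the archiveId from your upload output
    tier="Expedited",
    sns_topic="arn:aws:sns:us-east-1:112233445566:glacier-alert",
)
print(json.dumps(request, indent=2))
```

Redirect the output to request.json and it can be passed to the initiate-job command as is.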

Next, run this command to initiate the job request

aws glacier initiate-job --vault-name my-vault --account-id - --job-parameters file://request.json

You’ll get a job with a JobID which you’ll need later.

6) You can also check the status of the job like this:

aws glacier describe-job --vault-name my-vault --account-id - --job-id WI6sdXS...

7) Once the job is complete, you can get the output of the job

aws glacier get-job-output --vault-name my-vault --account-id - --job-id WI6sdXS... [MY-OUTPUT-FILE]
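
If you don't want to babysit the job, the check-then-download part can be scripted. This is a rough sketch that shells out to the same awscli commands shown above and relies on the Completed and StatusCode fields in the describe-job output; the function names and the 5 minute polling interval are my own choices.

```python
import json
import subprocess
import time


def job_is_done(describe_output):
    """Return True once the describe-job JSON reports the job as completed."""
    job = json.loads(describe_output)
    if job.get("StatusCode") == "Failed":
        raise RuntimeError(job.get("StatusMessage", "job failed"))
    return bool(job.get("Completed"))


def fetch_when_ready(vault, job_id, out_file, poll_seconds=300):
    """Poll describe-job until the job completes, then save its output to out_file."""
    ids = ["--vault-name", vault, "--account-id", "-", "--job-id", job_id]
    while True:
        result = subprocess.run(
            ["aws", "glacier", "describe-job"] + ids,
            capture_output=True, text=True, check=True,
        )
        if job_is_done(result.stdout):
            break
        time.sleep(poll_seconds)
    subprocess.run(["aws", "glacier", "get-job-output"] + ids + [out_file], check=True)
```

Usage would be something like fetch_when_ready("my-vault", "WI6sdXS...", "my-file.zip"), with the job id you got back from initiate-job.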

4.5) Yes, 4.5 because this should have gone between steps 4 and 5. You can list the files in your vault like so:

aws glacier initiate-job --account-id - --vault-name my-vault --job-parameters '{"Type": "inventory-retrieval"}'

So why didn’t I just tell you about this earlier? Well, because you don’t get back a list of archives after calling this. Notice you “initiate-job” again, which means you have to wait for the job to complete (step 6) and then get the output of the job (step 7). So you have to learn steps 5 through 7 before you can do step 4.5.
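
Once the inventory job finishes and you've saved its output with get-job-output, what you have is a JSON document with an ArchiveList array. Here is a small sketch for pulling the archive ids out of it; list_archives is my own helper name and the sample inventory is made up.

```python
import json


def list_archives(inventory_json):
    """Return (archive id, description, size in bytes) for each archive in an inventory."""
    inventory = json.loads(inventory_json)
    return [
        (a["ArchiveId"], a.get("ArchiveDescription", ""), a["Size"])
        for a in inventory.get("ArchiveList", [])
    ]


# Made-up inventory in the shape Glacier returns
sample = json.dumps({
    "VaultARN": "arn:aws:glacier:us-east-1:112233445566:vaults/my-vault",
    "ArchiveList": [
        {"ArchiveId": "KYKdL...", "ArchiveDescription": "", "Size": 1048576},
    ],
})
for archive_id, description, size in list_archives(sample):
    print(archive_id, description, size)
```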

Clear as mud? Good.


Auth0: enriching the id token and access token

With a little auth0 experience under my belt, let’s dive further into a new topic. What are the ID Token and Access Token?

Once you log in, you get back an object that looks like this in angular

{
  accessToken: "eyJ0eX…",
  idToken: "eyJ0eXA…",
  idTokenPayload: {
    sub: "auth0|5adfa…",
    nickname: "kane",
    name: "ksee@inferlink.com",
    picture: "https://s.gravatar.com/…",
    email: "ksee@inferlink.com"
  },
  …
}

At this point, you’ve got the idTokenPayload which stores all the info you would need about the user. But if you need more, you can create additional “claims”.
Reference

To do this, create a new Rule in auth0.com. Start with an empty rule and paste in something like this:

function (user, context, callback) {
  const namespace = 'https://myapp.example.com/';
  context.idToken[namespace + 'favorite_color'] = user.favorite_color;
  context.idToken[namespace + 'preferred_contact'] = user.user_metadata.preferred_contact;
  callback(null, user, context);
}

Now your payload should contain the favorite_color and preferred_contact fields.

But what about the backend? We only send the access token in the Authorization header, and you’ll notice that it doesn’t carry these other neat user fields.

You can do the same thing with a Rule. Simply use context.accessToken instead like so:

function (user, context, callback) {
  const namespace = 'https://myapp.example.com/';
  context.accessToken[namespace + 'favorite_color'] = user.favorite_color;
  context.accessToken[namespace + 'preferred_contact'] = user.user_metadata.preferred_contact;
  callback(null, user, context);
}
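
On the backend, the namespaced claims then show up as ordinary keys in the token payload. Here is a sketch of reading them back, assuming the access token is a JWT. It is written in Python purely for illustration, the helper names are mine, and decode_payload does NOT verify the signature, so in a real service you would let your JWT library validate the token first.

```python
import base64
import json

NAMESPACE = "https://myapp.example.com/"  # same namespace as in the Rule above


def decode_payload(token):
    """Decode the payload segment of a JWT WITHOUT verifying its signature."""
    payload_b64 = token.split(".")[1]
    # JWT segments are base64url without padding; restore the padding first
    payload_b64 += "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(payload_b64))


def custom_claims(payload, namespace=NAMESPACE):
    """Collect the namespaced claims into a plain dict keyed by their short names."""
    return {
        key[len(namespace):]: value
        for key, value in payload.items()
        if key.startswith(namespace)
    }
```

With the Rule above in place, custom_claims(decode_payload(token)) would give you a dict with favorite_color and preferred_contact.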

Auth0 Authentication and NodeJs

If you follow this blog, you’ll know I posted a write-up and starter kit for doing authentication in nodejs leveraging passportjs. In it, I rolled my own authentication package which you can use to start your own projects.

What if you want a more robust solution with plenty more features? I’ve had the opportunity to switch my company to a 3rd party vendor’s solution: auth0. And it’s pretty good.

So what do you get with auth0 that you don’t get with my home-grown solution?

  • You can easily offer other types of Social logins without doing any extra work (e.g. Facebook, Google, Twitter logins)
  • You get security features such as email verification and multifactor authentication
  • You get a user management console to delete users, add users, etc.
  • You get Single Sign On (SSO). It’s what makes it possible for you to sign into gmail service and then use google calendars and google docs without having to sign in again. This is very useful if you have a suite of services to offer users across different domains.

What’s the cost?

  • Pricing (at the time of this post, they’re offering 7k free active users and unlimited logins)
  • There is a learning curve to understand how to use this very flexible but complex system. They have lots of examples and offer technical support. Fortunately, I’m going to simplify things for you with a starter kit, although it’s specific to our application model. You can probably set up a lot of other authentication models with their service.

So what’s similar between auth0 and what we did here? For nodejs, they both leverage passportjs. They both use jwt to sign the payload to prevent tampering. But there are definitely big enough workflow differences that we need to build a different wrapper around it.

One big difference is that the user is redirected off your site to auth0.com for login and registration. Once complete, they are redirected back to your site. So if you hold any temporary variables/state, you’ll lose it. In the previous implementation, for example, we held the last url route in the $rootScope so we could redirect the user back after login. However, the $rootScope is wiped once you leave the site to auth0.com for login, so we have to leverage the browser’s web storage (localStorage) instead.

You still have to build some code around auth0 even though it tries to abstract away most of the hard authentication stuff. For example, once you log in, it hands you an access_token and id_token. It’s up to you to store that for the duration of the user’s session. Fortunately, we built some of this infrastructure the last time we made our home-grown authentication starter kit. We just have to adapt it a bit to auth0.

For my starter kit, I started with auth0’s angular example which can be found here.

To integrate into our application model, I made the following changes and additions.

AuthService

Their AuthService was in auth.service.js and I merged it in with other services and options in an auth0.services.js file.

By default, the authResult.idTokenPayload (in handleAuthentication()) included only a sub field

{sub: "auth0|ea4d8b…"}

I modified the login call to angularAuth0.authorize() to include a scope

    angularAuth0.authorize({
      scope:'openid profile email'
    });

This returns much more info back to your app

{
  sub: "auth0|ea4d8b…",
  nickname: "…",
  name: "…",
  email: "…",
  email_verified: true,
  …
}

I also implemented a redirect so that after the user returns from auth0’s login page, they land back on the page they last visited.

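function login() {
  // Remember where the user was so we can send them back after login
  $window.localStorage.setItem('nextPath', $location.path());
  angularAuth0.authorize({
    scope: 'openid profile email'
  });
}

function handleAuthentication() {
  angularAuth0.parseHash(function(err, authResult) {
    // ... store the tokens from authResult here ...
    var nextPath = $window.localStorage.getItem('nextPath');
    if (nextPath) {
      $location.path(nextPath);
    }
  });
}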

UserSession

I added a UserSession service to store auth0’s tokens and user info. This service is the same as before.

authHandler

Just like before, there’s an authHandler service that handles redirecting the user to auth0’s login page by invoking AuthService.login() when it detects a 401 (Unauthorized).

It also attaches the Authorization header to every request once a login is established.

User Registration Hook

If you want to track the users on your app, then you can set up an auth0 Hook. Specifically, you can set up a Post User Registration hook so that when a new user registers on auth0, auth0 will invoke a hook with a small piece of code you write to store that info on your app.

There are 2 parts to this. First set up your web hook. In the auth0 website, go to Hooks and then under Post User Registration, click Create New Hook

[Screenshot: creating a new Post User Registration hook in auth0]

Then you want to edit the code snippet and add something like this. Make sure to change the URL value

module.exports = function (user, context, cb) {
  var request = require('request');
  
  var url = '##INSERT_YOUR_URL##';
  var data = {
    userid: user.id,
    name: user.username,
    email: user.email
  }
  var options = {
    uri: url,
    method: 'POST',
    json: true,
    headers: {
        "content-type": "application/json",
        },
    body: data
  };
  console.log('logging registration')
  request(options, function(error, response, body) {
    if( error) {
      console.error(error)
    } else {
      console.log('successfully logged registration')
    }    
    cb();

  })
};

The second part of this is on your app server’s side. You’ll need to implement the api endpoint called by the hook to receive the user info and store it in your database. I’ve provided a sample in my starter kit but make sure you modify it to store it in your database table as you see fit.
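
To give a feel for what that endpoint has to do, here is a minimal sketch, not the starter kit's code. It is in Python with sqlite purely for illustration, and the table layout, function names, and handler class are all my own assumptions. It accepts the JSON body the hook posts (userid, name, email) and upserts it into a users table.

```python
import json
import sqlite3
from http.server import BaseHTTPRequestHandler


def init_db(conn):
    """Create the (hypothetical) users table if it doesn't exist yet."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS users (userid TEXT PRIMARY KEY, name TEXT, email TEXT)"
    )


def store_registration(conn, payload):
    """Insert or update the user described by the hook's JSON body."""
    conn.execute(
        "INSERT OR REPLACE INTO users (userid, name, email) VALUES (?, ?, ?)",
        (payload["userid"], payload.get("name"), payload.get("email")),
    )
    conn.commit()


class RegistrationHandler(BaseHTTPRequestHandler):
    """Handles the POST sent by the hook."""
    db = None  # set this to an sqlite3 connection before serving

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        try:
            payload = json.loads(self.rfile.read(length))
            store_registration(self.db, payload)
            self.send_response(204)
        except (ValueError, KeyError, TypeError):
            # Malformed JSON or a body without a userid
            self.send_response(400)
        self.end_headers()
```

To try it out you would set RegistrationHandler.db to a connection, call init_db on it, and serve the handler with http.server.HTTPServer. Note this sketch does no authentication of the caller, so a real endpoint should check a shared secret or similar before trusting the request.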

So without further ado, here’s a link to the starter kit with all the pieces of the code you need to start an angular/nodejs project with auth0.
https://github.com/kanesee/ng-node-auth0-kit/tree/master/shared

This blog explains the components of the code but the README hopefully provides enough info for you to start using it.


Using supervisord to start Docker services

Automatically start your docker services

Alternative solutions: http://centos-vn.blogspot.com/2014/06/daemon-showdown-upstart-vs-runit-vs.html

supervisord: https://docs.docker.com/engine/articles/using_supervisord/