The power of Shadow DOM

I recently had to inject another site’s webpage into a div in our own site. Back in the early 2000s, you’d use something like an iframe, but this is 2017. The requirements are the following:

  1. Pull the html source of the other site’s webpage
  2. Display it in our own webpage
  3. Make sure all styles are applied
    1. Ensure that styles don’t spill over to our own page’s DOM

Requirement 1: Pull the html source of the other site’s webpage

If you’ve done any web programming in the last decade, you know that simply pulling the HTML of another site from frontend JavaScript is a big no-no: the browser’s same-origin policy blocks the request unless the other site explicitly allows it. The trick is to have the backend webserver fetch the data and send it to the frontend. In a way, you’re proxying the data. This is a lesson for another time and not the focus of this post.

Requirement 2: Display it in our own webpage

This is actually pretty trivial once your backend webserver sends you the data. If you’re using AngularJS, make sure you’re “trusting” the source using $sce.trustAsHtml(htmlString).

Requirement 3: Make sure all styles are applied

But once you do this, you’ll soon realize that while the content is shown, the styling is missing. That’s because the .css files aren’t being fetched, and there’s really no good way of doing that.

You could parse the html and manually fetch the .css. But you’ll run into another problem. That site’s css now conflicts with your own site’s css and you get some weird mixup of styles.

If only there was a way to fetch the css and restrict it to the specific subtree of the DOM…

Shadow DOM

Here’s where this new (yet to be official) W3C standard comes in. As of this writing, Shadow DOM is only supported in Chrome 53.0+, Safari 10.0+, and Android WebView 53.0+.

The idea is that you can create a subtree within the DOM, but it’s a special shadow subtree. What makes it special is that you can restrict css to just within this DOM.

Why is this such a big deal? Well, it solves the problem at hand. But it also solves a much more general problem.

Have you ever added an id attribute in a page only to find that you or someone else created that same id value on another page? That wasn’t such a big deal back in 2010, but many popular frontend web frameworks (e.g. AngularJS, Backbone) consolidate all your pages into one and route between them. What that means is that ids can conflict. Even class-based css styles can conflict between two pages (how often have you created a class=“value”, no? just me?)

Anyways, Shadow DOM creates a true separation of DOM subtrees, meaning developers can build webpages as individual modules and not worry about conflicting ids, classes, and styles. And with websites getting increasingly complex, a finer, more modular approach is needed.

Shadow DOM isn’t official yet, but I would bet my right arm that it will be.

Here’s some angular code to see how simple it is in action.

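<div id="shadowSubtree"></div>
...
// Angular's built-in jqLite can't look elements up by selector (that needs full jQuery),
// so find the host node with querySelector before wrapping it
var host = angular.element(document.querySelector('#shadowSubtree'))[0];
var shadow = host.attachShadow({mode: 'open'});
// rel="stylesheet" is what makes the browser actually load the CSS
shadow.innerHTML = '<link rel="stylesheet" type="text/css" href="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.7/css/bootstrap.min.css"><a class="btn">Bootstrap styled button</a>';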


When you need things to line up

Suppose you have the following html

<div class="container">
  <div class="thing"></div>
  <div class="thing"></div>
  <div class="thing"></div>
</div>

If you want things to line up horizontally, you could use a tried and true method

.thing {
    float:left;
}

But if the width of the container is not big enough, these things will wrap to the next line.

e.g.

.container {
    width: 50px;
}

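Sometimes you want them to stay on one line and just get cut off where the container ends. For these cases, replace the float with these settings (and add overflow: hidden to the container if the overflowing things should actually be clipped rather than spill out).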

.container {
    white-space: nowrap;
}
.thing {
    display: inline-block;
}

If you actually wanted to see the rest of the things, then you could add a horizontal scroll bar

.container {
    overflow-x: scroll;
}

Setting up Tomcat for debugging

Short PSA. If you want to set up Tomcat for debugging, set up JPDA.

To do so, simply open the startup.sh file, which is somewhere in TOMCAT/bin folder.

Add these two lines near the top

export JPDA_ADDRESS=7000
export JPDA_TRANSPORT=dt_socket

Then modify the exec line (usually the last line in the file) from this:

exec "$PRGDIR"/"$EXECUTABLE" start "$@"

to this

exec "$PRGDIR"/"$EXECUTABLE" jpda start "$@"
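
Once Tomcat is back up, it can be handy to confirm from inside the webapp that the JVM really was started with the debug agent. Here is a minimal sketch of one way to do that. The class and method names are made up for illustration, and the flag prefixes it looks for are an assumption about what the startup script passes to the JVM (newer scripts typically use -agentlib:jdwp, older ones -Xrunjdwp), so check your own catalina.sh if it reports false.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

// Hypothetical helper, not part of Tomcat.
class DebugCheck {

    // True if any JVM argument looks like a JDWP debug agent flag.
    static boolean hasJdwpAgent(List<String> jvmArgs) {
        for (String arg : jvmArgs) {
            if (arg.startsWith("-agentlib:jdwp") || arg.startsWith("-Xrunjdwp")) {
                return true;
            }
        }
        return false;
    }

    // Checks the arguments the current JVM was actually started with.
    static boolean currentJvmIsDebuggable() {
        return hasJdwpAgent(ManagementFactory.getRuntimeMXBean().getInputArguments());
    }
}
```

Logging DebugCheck.currentJvmIsDebuggable() at startup is enough to tell whether the jpda change took effect before you go hunting for firewall problems on port 7000.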

Upgrading to Jersey 2.x in Tomcat8

Jersey 1.x is still supported. As of this writing, version 1.19.3 was just released on Oct 24th, 2016.

But recently I discovered that Tomcat8 doesn’t play well with Jersey 1.x so we simply have to upgrade to Jersey 2.x. Easy right? (If it were, I wouldn’t be writing this post.)

Jersey 1.x
Let’s review the Jersey 1.x configurations first

As far as maven dependencies, here’s what I used in my pom.xml

<dependency>
  <groupId>com.cedarsoft.rest</groupId>
  <artifactId>jersey</artifactId>
  <version>1.0.0</version>
</dependency>

<dependency>
  <groupId>com.sun.jersey</groupId>
  <artifactId>jersey-json</artifactId>
  <version>1.5</version>
</dependency>

com.cedarsoft.rest:jersey is a bundle that includes these dependencies.

Unfortunately, I could find no such bundle for Jersey 2.x so I had to mix and match until trial-and-error led me to a workable solution. (I’ll save you the trouble.)

Here’s my old web.xml

<web-app version="2.4"
 xmlns="http://java.sun.com/xml/ns/j2ee"
 xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
 xsi:schemaLocation="http://java.sun.com/xml/ns/j2ee http://java.sun.com/xml/ns/j2ee/web-app_2_4.xsd">
  <servlet>
    <servlet-name>JerseyREST</servlet-name>
    <servlet-class>com.sun.jersey.spi.container.servlet.ServletContainer</servlet-class>
    <init-param>
      <param-name>com.sun.jersey.config.property.packages</param-name>
      <param-value>INSERT_MY_PACKAGES_AND_CLASSES_HERE</param-value>
    </init-param>
    <load-on-startup>2</load-on-startup>
  </servlet>
  <servlet-mapping>
    <servlet-name>JerseyREST</servlet-name>
    <url-pattern>/rest/*</url-pattern>
  </servlet-mapping>
</web-app>

Jersey 2.x
Now here are the changes for Jersey 2.

   <dependency>
     <groupId>javax.servlet</groupId>
     <artifactId>javax.servlet-api</artifactId>
     <version>3.1.0</version>
     <scope>provided</scope>
   </dependency>
   <dependency>
     <groupId>org.glassfish.jersey.containers</groupId>
     <artifactId>jersey-container-servlet-core</artifactId>
     <version>2.13</version>
   </dependency>
   <dependency>
     <groupId>org.glassfish.jersey.containers</groupId>
     <artifactId>jersey-container-servlet</artifactId>
     <version>2.13</version>
   </dependency>
   <dependency>
     <groupId>com.fasterxml.jackson.core</groupId>
     <artifactId>jackson-databind</artifactId>
     <version>2.8.5</version>
   </dependency>
   <dependency>
     <groupId>org.glassfish.jersey.media</groupId>
     <artifactId>jersey-media-json-jackson</artifactId>
     <version>2.13</version>
   </dependency>
   <!-- jersey file upload dependencies -->
   <dependency>
     <groupId>org.glassfish.jersey.media</groupId>
     <artifactId>jersey-media-multipart</artifactId>
     <version>2.13</version>
   </dependency>

org.glassfish.jersey.media:jersey-media-multipart is only required for file upload capability.

Here’s the new web.xml

<web-app version="3.1"
         metadata-complete="false"
         xmlns="http://xmlns.jcp.org/xml/ns/javaee"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://xmlns.jcp.org/xml/ns/javaee
                             http://xmlns.jcp.org/xml/ns/javaee/web-app_3_1.xsd">
  <servlet>
    <servlet-name>JerseyREST</servlet-name>
    <servlet-class>org.glassfish.jersey.servlet.ServletContainer</servlet-class>
    <init-param>
        <param-name>jersey.config.server.provider.classnames</param-name>
        <param-value>org.glassfish.jersey.media.multipart.MultiPartFeature</param-value>
    </init-param>        
    <init-param>
      <param-name>jersey.config.server.provider.packages</param-name>
      <param-value>
        INSERT_MY_PACKAGES_AND_CLASSES_HERE
      </param-value>
    </init-param>
    <load-on-startup>1</load-on-startup>
  </servlet>
  <servlet-mapping>
      <servlet-name>JerseyREST</servlet-name>
      <url-pattern>/rest/*</url-pattern>
  </servlet-mapping>
</web-app>

Notice the new servlet class.
It’s also important to register org.glassfish.jersey.media.multipart.MultiPartFeature to support file upload capability.

You might also have to change some of your code.
For example, all your @JsonIgnore imports will have to change from
import org.codehaus.jackson.annotate.JsonIgnore;
to
import com.fasterxml.jackson.annotation.JsonIgnore;


Quick Ubuntu 16 Setup with java8, mysql 5.7, tomcat7

sudo apt-get update

# get latest java, which is java8 at time of writing
sudo apt-get install default-jdk

# get latest mysql, which is mysql5.7 at time of writing
sudo apt-get install mysql-server

# get tomcat7
sudo apt-get install tomcat7

Meet Watson

I had the opportunity to attend the IBM Watson Conference in San Francisco a few weeks ago. It was an amazing event to showcase the new Watson AI Platform.

IBM Watson is a platform that encompasses many AI and ML (machine learning) services. The so-called Watson Services include APIs to help you understand natural language, convert speech-to-text and text-to-speech, build chatbots, recognize images, perform trade-off analytics and much more.

You can try out demos of some of the services to get a feel for how well they work. Just click on the Launch app link next to each Starter Kit.

I believe many of the AI technologies underneath the hood of Watson have been around for many years. I know I have used some of them via different open source toolkits.

However, IBM Watson introduces three game-changing characteristics:

  • Accessibility of knowledge and tools
  • Integration of services
  • Fundamental change in the business model

Accessibility
IBM designed their AI tools to be accessible, not only to non-AI experts, but even to non-developers. Their user interface makes tasks such as training and identifying images very easy, without a PhD in computer science or even programming know-how.

Having a single tool can be useful but is still limiting. More interesting applications will rise from the integration of all your tools. How much can you do with a single screwdriver, right?

Integration
To that effect, Watson supplies you with a workbench of AI tools. The value of the platform is that it can seamlessly integrate and manage all services in one place. There’s a common language between the services so to speak.

Cost Model
Finally, this platform is offered as an on-demand pay-as-you-go service. IBM has historically been known as an enterprise software service. Expect to come with a check with lots of zeros behind it if you want to use something IBM-branded. But Watson has an Amazon Web Services model where you only pay for the service calls and CPU time used on their machines. Instead of millions of dollars to get started, developers can get started for free and small businesses can probably run on a few hundred dollars a month.

Final Thoughts
The knowledge and tools that were once locked in the hands of select companies and experts are now open to the world, and they’re available at a fraction of the cost. Microsoft is moving towards this model and Google has already released free AI tools. I believe we are entering a new world, where the greatest value will come from those who can blend just the right AI/ML services into the most interesting applications.

As an AI developer, this scares me a bit, but personally, I welcome this new world. It will mean change on my part. I believe my value and that of my company will be to offer expert consultation on services like Watson’s and others, to show you the possibilities as well as the boundaries of this new world.

Disclaimer: While I worked for IBM in a previous life, I am not affiliated with them in any way. I am part of a smaller, more agile AI company now that is agnostic to the tools we use. The opinions in this piece are my own, and do not necessarily reflect IBM’s or my current company’s views.

 


How to hash mysql varchar/string into different bins

First off, what am I talking about?
Here’s a scenario.

You have a bunch of rows in your table that are URLs. You want to select an evenly distributed random set of them every so often and check if they’re still alive. How do you do this?

You could just take a range.

Get the first 10

SELECT url FROM myTable
ORDER BY url
LIMIT 0, 10

Get the next 10

SELECT url FROM myTable
ORDER BY url
LIMIT 10, 10

And the next 10…

SELECT url FROM myTable
ORDER BY url
LIMIT 20, 10

But what if you wanted to evenly distribute your checks, because the URLs from the same site are sequentially ordered?

Well, you could ensure an auto-increment ID and just get chunks of them based on a mod.

So to get every 10th one

SELECT id, url FROM myTable
WHERE id % 10 = 0 

And the next 10th sequence

SELECT id, url FROM myTable
WHERE id % 10 = 1 

And the next 10th sequence

SELECT id, url FROM myTable
WHERE id % 10 = 2 

This is still not that well distributed, since you might run into large clusters of URLs from the same site (i.e. more than 10 in a row).
It also requires a nice integer column.

Instead, you could hash the VARCHAR value itself, such as the URL.

So to get the first bin

SELECT id, ... FROM myTable
WHERE CAST(CONV(SUBSTRING(MD5(url), 1, 16), 16, 10) AS SIGNED INTEGER) % 10 = 0

To get the next bin

SELECT id, ... FROM myTable
WHERE CAST(CONV(SUBSTRING(MD5(url), 1, 16), 16, 10) AS SIGNED INTEGER) % 10 = 1

And so on…

Let’s go over what this does.

MD5() is a hash function that converts your VARCHAR into a seemingly random sequence of numbers and letters. It’s not random, though. The same input always produces the same sequence, but the hash values are spread far more uniformly than the original strings.

The SUBSTRING(…, 1, 16) takes the first 16 hex digits of the MD5 hash value. That’s the first 64 bits of it, which is the most CONV() can work with; take more and you risk an overflow.

The CONV(…, 16, 10) function converts the hash (which is a hex or base-16 value) into a base-10 value.

The CAST(… AS SIGNED INTEGER) function converts it to a signed integer. (If you’re going to read this value into java, you want a signed integer otherwise you’ll get an overflow)

Then simply mod (%) it by the number of bins you want. In my example, I modded it with 10.
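
If you’d rather do the binning on the application side, the same idea can be sketched in Java. This is only an illustration under a few assumptions: the class and method names are made up, the URL is hashed as UTF-8 bytes, and the first 64 bits are treated as an unsigned number, so the bin numbers are not guaranteed to match what the SQL expression above produces (that one casts to a signed value).

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper for illustration.
class UrlBins {

    // Maps a URL to a bin in the range 0..bins-1 using the first 64 bits of its MD5 hash.
    static int binFor(String url, int bins) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5")
                    .digest(url.getBytes(StandardCharsets.UTF_8));
            // The first 8 bytes of the digest are the first 16 hex digits of the hash.
            long first64Bits = ByteBuffer.wrap(digest).getLong();
            // Treat the value as unsigned so the remainder is never negative.
            return (int) Long.remainderUnsigned(first64Bits, bins);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 should be available on every JVM", e);
        }
    }
}
```

Calling UrlBins.binFor(url, 10) always returns the same bin for the same URL, so you can check bin 0 today, bin 1 tomorrow, and so on.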


Formatting DATE and DATETIME in Mysql

Did you ever inherit a table with a VARCHAR for one of the date fields?
Doesn’t seem that bad, except that gives license for people to start putting different date formats into it.

e.g.
2001-May-05 11:30
11-19-2009 23:33
Nov 4, 1998 8:03
3/18/08 3:50
8-15-1999 13:00

You should put these into a DATE or DATETIME column. And here’s how you would parse them

SELECT id, strDate,
  CASE WHEN LENGTH(DATE(STR_TO_DATE(strDate,"%Y-%m-%d %H:%i:%S"))) IS NOT NULL THEN STR_TO_DATE(strDate,"%Y-%m-%d %H:%i:%S")
       WHEN LENGTH(DATE(STR_TO_DATE(strDate,"%Y-%M-%d %H:%i:%S"))) IS NOT NULL THEN STR_TO_DATE(strDate,"%Y-%M-%d %H:%i:%S")
       WHEN LENGTH(DATE(STR_TO_DATE(strDate,"%d-%M-%Y %H:%i:%S"))) IS NOT NULL THEN STR_TO_DATE(strDate,"%d-%M-%Y %H:%i:%S")
  END AS newDate
FROM date_table
WHERE strDate IS NOT NULL

Add as many formats as you like and make sure you test!
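
The same “try each format until one fits” idea works outside the database too. Below is a rough Java sketch of it, in case you’d rather clean the values up before they ever reach MySQL. The class name and the list of patterns are my own assumptions, chosen to cover the five sample strings above; they are not a translation of the STR_TO_DATE formats in the query.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.Optional;

// Hypothetical helper for illustration.
class MessyDates {

    // Candidate patterns, tried in order. Add as many as your data needs.
    private static final List<DateTimeFormatter> FORMATS = Arrays.asList(
            fmt("yyyy-MMM-d H:mm"),   // 2001-May-05 11:30
            fmt("M-d-yyyy H:mm"),     // 11-19-2009 23:33 and 8-15-1999 13:00
            fmt("MMM d, yyyy H:mm"),  // Nov 4, 1998 8:03
            fmt("M/d/yy H:mm"));      // 3/18/08 3:50

    private static DateTimeFormatter fmt(String pattern) {
        // English month names, regardless of the server's default locale.
        return DateTimeFormatter.ofPattern(pattern, Locale.ENGLISH);
    }

    // Returns the first successful parse, or empty if no pattern matches.
    static Optional<LocalDateTime> parse(String text) {
        for (DateTimeFormatter format : FORMATS) {
            try {
                return Optional.of(LocalDateTime.parse(text.trim(), format));
            } catch (DateTimeParseException e) {
                // Not this format, try the next one.
            }
        }
        return Optional.empty();
    }
}
```

Anything that comes back empty is a format you haven’t accounted for yet, which is a good list to review by hand.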

Also, if you wanted to update the date_table with this new DATETIME value, you can do this

UPDATE date_table
SET newDate = CASE
  WHEN LENGTH(DATE(STR_TO_DATE(strDate,"%Y-%m-%d %H:%i:%S"))) IS NOT NULL THEN STR_TO_DATE(strDate,"%Y-%m-%d %H:%i:%S")
  WHEN LENGTH(DATE(STR_TO_DATE(strDate,"%Y-%M-%d %H:%i:%S"))) IS NOT NULL THEN STR_TO_DATE(strDate,"%Y-%M-%d %H:%i:%S")
  WHEN LENGTH(DATE(STR_TO_DATE(strDate,"%d-%M-%Y %H:%i:%S"))) IS NOT NULL THEN STR_TO_DATE(strDate,"%d-%M-%Y %H:%i:%S")  
END
WHERE strDate IS NOT NULL
AND newDate IS NULL

One thing to note. If you’re going to CREATE, UPDATE or INSERT into a table with these values, there’s a chance you may run into the following error

 “Incorrect datetime value: ‘XXXX’ for function str_to_date”

It may be that your MySQL server is running in strict mode.

To check, run

select @@session.sql_mode

It might produce something like “STRICT_ALL_TABLES” or “STRICT_TRANS_TABLES,NO_ENGINE_SUBSTITUTION”

To set it to a less strict mode, run

set session sql_mode =''

Now your UPDATE, INSERT or CREATE should work.

Once it completes, you may want to set the sql_mode back to the previous value.


Dear Mom, Yours Truly, Program

Have you ever wanted to write an email to your mom… sent by your program? Of course! What respectable programmer hasn’t. Well, this tutorial will show you how to do it from Java.

import java.util.Properties;

import javax.mail.Message;
import javax.mail.PasswordAuthentication;
import javax.mail.Session;
import javax.mail.Transport;
import javax.mail.internet.InternetAddress;
import javax.mail.internet.MimeMessage;


  // static, because sendEmail() below is static; the port is a String like the other properties
  private static final String SMTP_HOST = "smtp.office365.com";
  private static final String SMTP_PORT = "587";
  private static final boolean DEBUG = true;

  private static void sendEmail(String contentType, final String login, final String password,
                                String fromEmail, String replyToEmail, String[] a_to, String a_subject,
                                String a_contents) {
    try {
      // authenticated SMTP over STARTTLS
      Properties props = new Properties();
      props.put("mail.smtp.auth", "true");
      props.put("mail.smtp.starttls.enable", "true");
      props.put("mail.smtp.host", SMTP_HOST);
      props.put("mail.smtp.port", SMTP_PORT);
      Session session = Session.getInstance(props,
          new javax.mail.Authenticator() {
        protected PasswordAuthentication getPasswordAuthentication() {
          return new PasswordAuthentication(login, password);
        }
      });
      // print the SMTP conversation to stdout
      if (DEBUG) session.setDebug(true);

      MimeMessage message = new MimeMessage(session);
      message.setFrom(new InternetAddress(fromEmail));
      for (String toTarget : a_to) {
        message.addRecipient(Message.RecipientType.TO, new InternetAddress(
            toTarget));
      }
      message.setReplyTo(new InternetAddress[] { new InternetAddress(replyToEmail) });
      message.setSubject(a_subject);
      message.setContent(a_contents, contentType);
      message.setHeader("Content-Transfer-Encoding", "7bit");

      Transport.send(message);
    } catch (Exception e) {
      throw new RuntimeException("Unable to send HTML mail", e);
    }
  }

I won’t explain the code. Just use it. You’re smart. You’ll figure it out.

So let’s say we wanted to send from a Gmail account. Just change SMTP_HOST to “smtp.gmail.com” and we’re good, right? No, not quite.

First of all, you need to create an App password as opposed to using your account login password.

Then you’ll test it on your local machine and exclaim “Yay! It works!”. Then you push it to your Amazon server and go to Friday Happy Hour. Then at 4am on Saturday, you’ll be woken up by an alert saying the email failed with an error like this

Caused by: javax.mail.MessagingException: [EOF]
        at com.sun.mail.smtp.SMTPTransport.issueCommand(SMTPTransport.java:1481)
        at com.sun.mail.smtp.SMTPTransport.helo(SMTPTransport.java:917)
        at com.sun.mail.smtp.SMTPTransport.protocolConnect(SMTPTransport.java:417)
        at javax.mail.Service.connect(Service.java:310)
        at javax.mail.Service.connect(Service.java:169)
        at javax.mail.Service.connect(Service.java:118)
        at javax.mail.Transport.send0(Transport.java:188)
        at javax.mail.Transport.send(Transport.java:118)

Why didn’t it work? Seems Google smtp servers have blocked emails from being sent from AWS machines. Why? Because they hate you, and also probably to prevent spam machines from polluting our internet.

So what can you do to send from a Gmail account? You have to use Google’s own brand of code. It looks the same, but take a close look at the imports.

import java.util.Properties;

import com.google.code.javax.mail.Message;
import com.google.code.javax.mail.PasswordAuthentication;
import com.google.code.javax.mail.Session;
import com.google.code.javax.mail.Transport;
import com.google.code.javax.mail.internet.InternetAddress;
import com.google.code.javax.mail.internet.MimeMessage;

  private static void sendEmail(String contentType, final String username, final String password,
                                String fromAddr, String replyToEmail, String[] toAddr, String subj,
                                String txt) {
    
    Properties props = new Properties();
    props.put("mail.smtp.auth", "true");
    props.put("mail.smtp.starttls.enable", "true");
    props.put("mail.smtp.host", "smtp.gmail.com");
    props.put("mail.smtp.port", "587");
    Session session = Session.getInstance(props,
        new com.google.code.javax.mail.Authenticator() {
          protected PasswordAuthentication getPasswordAuthentication() {
            return new PasswordAuthentication(username, password);
          }
        });
    try {
      Message message = new MimeMessage(session);
      message.setFrom(new InternetAddress(username, fromAddr));
      for (String ta : toAddr) {
        message.addRecipients(Message.RecipientType.TO,
            InternetAddress.parse(ta));
      }
      message.setSubject(subj);
      message.setContent(txt, "text/html; charset=utf-8");
      Transport.send(message);
    } catch (Exception e) {
      throw new RuntimeException(e);
    }
  }  

Ok, now go write your mom an email.


Maven is a weirdo when it comes to resolving dependencies

The other day, I was experimenting with maven dependency conflicts. That is, if your project pulls in the same dependency with conflicting versions, maven will pick one for you depending on your rules and its heuristics.

For the record, I’m using maven 3.3

According to maven docs

“by default Maven resolves version conflicts with a nearest-wins strategy”

You’d think these heuristics are simple, but not really. Let’s look at some examples.

Let’s say you have a pom with two conflicting dependencies

<dependencies>
<dependency>
<groupId>commons-codec</groupId>
<artifactId>commons-codec</artifactId>
<version>1.4</version>
</dependency>

<dependency>
<groupId>commons-codec</groupId>
<artifactId>commons-codec</artifactId>
<version>1.5</version>
</dependency>
</dependencies>

You can run “mvn dependency:tree -Dverbose” to see which of the two commons-codec version it picks.

In this case, maven seems to prefer the last commons-codec in the list of dependencies. 

That makes some sense. Maybe developers have the habit of adding dependencies to the end of the list so maven prefers that one

Let’s suppose we have a dependency, such as hadoop-common, that depends on commons-codec 1.4 and we have a commons-codec 1.5 dependency at the top-level. Which version would it prefer then?

<dependencies>
<dependency>
<groupId>commons-codec</groupId>
<artifactId>commons-codec</artifactId>
<version>1.5</version>
</dependency>

<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-common</artifactId>
<version>2.4.0</version>
</dependency>
</dependencies>

Maven prefers the top-level commons-codec 1.5 version here.

Even though the commons-codec 1.4 within hadoop-common comes later in the dependency list, maven prefers the top-level one. This makes sense, since the top-level dependency is explicitly chosen by the developer while the one within hadoop-common is more implicit. So maven seems to obey explicit top-level dependencies.

Here’s where it gets a little weird. What happens if we have two dependencies that depend on different versions of commons-codec?

poi depends on commons-codec 1.5 and hadoop-common depends on commons-codec 1.4

<dependencies>
<dependency>
<groupId>org.apache.poi</groupId>
<artifactId>poi</artifactId>
<version>3.8-beta5</version>
</dependency>

<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-common</artifactId>
<version>2.4.0</version>
</dependency>
</dependencies>

Maven will choose the FIRST version it sees, in this case, it will prefer commons-codec 1.5 found in the earlier poi dependency.

This is a bit counter-intuitive. Remember that previously, maven preferred the LAST version of commons-codec when both were listed at the top level.

Let’s dive deeper. Does the depth at which commons-codec is found matter?

hadoop-client depends on hadoop-common which depends on commons-codec 1.4. And poi depends on commons-codec 1.5

<dependencies>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-client</artifactId>
<version>2.4.0</version>
</dependency>

<dependency>
<groupId>org.apache.poi</groupId>
<artifactId>poi</artifactId>
<version>3.8-beta5</version>
</dependency>
</dependencies>

Maven prefers poi’s commons-codec 1.5 since it is found at the 2nd level, whereas commons-codec 1.4 is found at the 3rd level of hadoop-client.

It seems that the closer to the top-level the dependency is, the more maven prefers it. This is probably consistent with the fact that maven picks explicit top-level dependencies over sub-dependencies at lower levels. You can try switching the order of hadoop-client and poi and you’ll see that the depth is more important than the dependency order here.
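
To make the transitive cases concrete, here is a toy model of the rule as I understand it: the shallowest occurrence wins, and if two occurrences sit at the same depth, the one declared first wins. This is just a sketch with made-up class names, not maven’s actual resolver code, and it deliberately does not cover the first experiment, where the same dependency was declared twice in one pom.

```java
import java.util.Arrays;
import java.util.List;

// Toy model for illustration only.
class NearestWins {

    // One occurrence of a dependency somewhere in the tree.
    static final class Candidate {
        final String version;
        final int depth;   // 1 = declared directly in your pom
        final int order;   // position of the top-level dependency that pulls it in

        Candidate(String version, int depth, int order) {
            this.version = version;
            this.depth = depth;
            this.order = order;
        }
    }

    // Picks the shallowest candidate; ties go to the earliest declaration.
    static String pick(List<Candidate> candidates) {
        Candidate best = null;
        for (Candidate c : candidates) {
            if (best == null
                    || c.depth < best.depth
                    || (c.depth == best.depth && c.order < best.order)) {
                best = c;
            }
        }
        return best == null ? null : best.version;
    }
}
```

Feeding it the last experiment, new Candidate("1.4", 3, 0) via hadoop-client and new Candidate("1.5", 2, 1) via poi, returns 1.5 no matter which one is listed first, which matches what mvn dependency:tree showed.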

So do you think you have a good handle on how maven resolves dependencies?