Introduction to ml5.js for Beginners


When it comes to machine learning, the languages that come to most people’s minds are Python and R. Not many are aware, however, that there is a growing number of ML libraries in JavaScript, and ml5.js is one of them. In this article, we give you a brief introduction to what ml5.js is, why you should use it, and which pre-trained models it offers.

What is ml5.js?

ml5.js is a JavaScript library built on top of TensorFlow.js that provides access to various machine learning and deep learning algorithms within the browser. The library is intended to make machine learning accessible to the huge JavaScript community.

ml5.js uses the TensorFlow.js API layer to define, train, and test models easily, and it supports GPU acceleration to improve computational efficiency. It also has built-in functionality for other tasks such as visualization, rendering, and porting to other platforms, which makes it easier to work with.

Why use web-based machine learning?

The ability to build machine learning models in the browser is very useful because the data can stay within the user’s browser. An added bonus is that webcams and microphones can be used directly, without the external libraries that other languages require, which makes it quite convenient.

Why use ml5.js?

There are other JavaScript libraries for machine learning, and TensorFlow.js is quite a popular one, so why should you use ml5.js?

Well, to begin with, ml5.js is actually built on TensorFlow.js and adds a layer of API abstraction. This means you don’t need to struggle with the low-level nitty-gritty of TensorFlow.js, yet you can leverage the same power through the high-level ml5.js library.

Below are some more points that add to ml5.js’s credibility.

  • Supports acceleration using WebGL and GPUs
  • Native support for converting browser I/O streams to model input data structures
  • Standardization of model format
  • Small model sizes, low-latency, portable model format
  • Out of the box Pretrained models for turnkey use.

How to Install and Use ml5.js 

Since ml5.js is a JavaScript library, it does not require any explicit installation. You can use it simply by importing it with a script tag.

You can use the below code to import the latest version of ml5.js:

<script src="" type="text/javascript"></script>

Or if you are interested in a certain version of ml5.js use:

<script src="<version>/dist/ml5.min.js" type="text/javascript"></script>

ml5.js pre-trained models

1. Image Based Models

i) Image classifier

The image classifier of ml5.js is trained on 15 million images and can classify approximately 1000 different classes, ranging from a dog to an aircraft carrier. It can be applied to images as well as videos.

Image classification
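
The classifier hands back a list of label and confidence pairs. The following is a minimal sketch of how it might be wired up, assuming the callback-style API of the ml5.js 0.x releases (`ml5.imageClassifier('MobileNet', ...)` and `classifier.classify(...)`). The helper names `topPrediction` and `classifyPhoto` are hypothetical, and the `ml5` object is passed in as a parameter so that nothing runs until a page calls the function:

```javascript
// Pick the most confident entry from results shaped like
// [{ label: 'cat', confidence: 0.8 }, ...]. Expects a non-empty array.
function topPrediction(results) {
  return results.reduce((best, r) => (r.confidence > best.confidence ? r : best));
}

// Browser-only wiring (assumes the ml5.js 0.x callback API).
// `ml5` is the global created by the script tag;
// `imageElement` is an <img> or <video> element on the page.
function classifyPhoto(ml5, imageElement, onDone) {
  const classifier = ml5.imageClassifier('MobileNet', () => {
    // The model has finished loading; classify and report the best guess.
    classifier.classify(imageElement, (err, results) => {
      if (err) {
        onDone(err);
        return;
      }
      onDone(null, topPrediction(results));
    });
  });
}
```

In a page, this could be called as `classifyPhoto(ml5, document.getElementById('photo'), (err, top) => console.log(err || top.label))`, where the `photo` id is only an example.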

ii) Object detector

This ml5.js pre-trained model can detect objects defined within the COCO dataset, which is a large-scale object detection, segmentation, and captioning dataset. The model is capable of detecting 90 classes of objects.

iii) Body pose detection

poseNet() is another pre-trained model of ml5.js that allows real-time pose estimation on an image or video feed. It can estimate multiple poses and detect multiple people in a single image or video.

3D human pose estimation

iv) Body Pix

Body Pix is a segmentation model of ml5.js built on top of TensorFlow.js. It can classify an image as ‘having a person’ or not and further extract the person from the original image. It detects a human body and 24 different body parts (hands, legs, face, torso, etc.).

v) U-Net model

The U-Net model was developed by the computer science department of the University of Freiburg, Germany. It is a fully convolutional image segmentation model used in the biomedical field that can be trained efficiently on a small dataset.

P.S: Currently it can only segment and extract faces from images and videos.

Person segmentation

vi) Handpose model

Built on top of the TensorFlow.js ‘HandPose’ model, this model can detect 21 landmarks of the hand and palm. Detection models are usually heavyweight, but the ml5.js models are lightweight and can easily be used with webcams in the browser for real-time hand pose detection.

vii) Face mesh model

Another extension of the TensorFlow.js pre-trained models is the face mesh model. It is designed to detect multiple faces and 468 3-D landmarks per face in a video or image feed. It is an extremely specific model that needs specific environments for confident predictions.

viii) Face-API model

Face-API is a face-detection model of ml5.js and you can see the example below.

Face mesh model example

2. Generative Based Models

i) Style Transfer

ml5.js offers a cool pre-trained style-transfer model. Suppose we have two images, A and B; the process of imposing the style of image A onto the content of image B is known as neural style transfer. It is not machine learning in the usual predictive sense, but it can be classified as generative learning because it generates a styled image.

Neural style transfer

ii) Conditional Variational Autoencoder(CVAE)

Autoencoders are a common architecture for dimensionality reduction in which the input and output have similar sizes. For example, one can pass a dataset of images through an autoencoder so that it generates similar images with smoother features. CVAEs are generative autoencoders that can generate new data by adding random noise to the data and augmenting it.

iii) DCGANs

GANs are a fascinating family of generative algorithms that consist of a generator and a discriminator, and they can generate stunning fake output. Some common examples of GANs are cartoonGAN (generates new cartoons) and StyleGAN (can generate new faces that don’t actually exist). In our case, ml5.js offers a trained DCGAN out of the box in the browser.

Cartoon GAN

iv) SketchRNN

Recurrent neural networks are generally not used for images, but they can be used to generate new data. This ml5.js model can generate new doodle drawings. It was trained on millions of images from the quickdraw dataset (3.5 million images of random labeled drawings).

Sketch-RNN demo

v) Pix2Pix

It is a generative adversarial network designed to perform image-to-image translation, i.e., the user draws an image and sends it to the model, and the model outputs a real-world-looking image that closely resembles the drawing.

Pix2Pix example

3. Sound Based Models

i) Pitch detection

This is a basic sound detection algorithm that estimates the pitch, or frequency, of a sound, usually a speech recording or a musical note. The model returns an array of numeric values, each representing a characteristic of the recorded sound. One can use the browser microphone to provide input to the model directly.
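
Once a pitch is available as a frequency, it is often handy to turn it into a note name. The sketch below is plain JavaScript and not part of ml5.js; it assumes the detector reports a frequency in hertz, and `frequencyToNote` is a hypothetical helper name. It uses the standard MIDI convention in which A4 is note 69 at 440 Hz:

```javascript
const NOTE_NAMES = ['C', 'C#', 'D', 'D#', 'E', 'F', 'F#', 'G', 'G#', 'A', 'A#', 'B'];

// Convert a frequency in hertz to the nearest note name, e.g. 440 -> "A4".
function frequencyToNote(frequencyHz) {
  // Each semitone is a factor of 2^(1/12), so 12 * log2(f / 440)
  // counts the semitones between f and A4 (MIDI note 69).
  const midi = Math.round(69 + 12 * Math.log2(frequencyHz / 440));
  const name = NOTE_NAMES[((midi % 12) + 12) % 12];
  const octave = Math.floor(midi / 12) - 1;
  return name + octave;
}
```

For instance, `frequencyToNote(440)` gives `"A4"` and `frequencyToNote(261.63)` gives `"C4"` (middle C).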

ii) Sound classifier

The sound classifier classifies sounds into predefined categories. This model allows you to add new classes and train the classifier again, or to use the existing classes for classification in your own apps. It can detect the following day-to-day words:

"zero" to "nine", "up", "down", "left", "right", "go", "stop", "yes", "no", "unknown word", "background noise".
Speech and pitch processing models

4. Text Based Models

i) RNN (charRNN)

This model is built with recurrent neural networks, which are used for text-sequence processing. It takes a body of text as input and generates new text (based on the input) as output. For example:

A model trained on Virginia Woolf’s novels generates the following text when given 'the meaning of life' as input:

"So long as you write what you wish to write, that is all that matters; and whether it matters for ages or only for hours, nobody can say"
A famous quote from Virginia Woolf’s books.

ii) Sentiment Classifier

This model uses natural language processing to classify words as neutral, positive, or negative. It is highly applicable to real-life projects such as customer satisfaction or product review analysis. For example, reviews on a website are classified into stars (or ‘good’ and ‘bad’) so that they can be used to convey the user/buyer sentiment.

iii) Universal Sentence Encoder (word2vec)

The Universal Sentence Encoder model, which produces 512-dimensional embeddings, can be used for sentiment and similarity detection models. Put simply, embeddings group related words together, and they can be used to train new ML models.

Word embeddings example
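
Similarity between two embeddings is commonly measured with cosine similarity. The sketch below is plain JavaScript rather than an ml5.js call; `cosineSimilarity` is a hypothetical helper, and the short vectors in the usage note stand in for real 512-dimensional embeddings:

```javascript
// Cosine similarity between two equal-length numeric vectors.
// 1 means same direction, 0 means orthogonal, -1 means opposite.
function cosineSimilarity(a, b) {
  if (a.length !== b.length) {
    throw new Error('Vectors must have the same length');
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```

Two vectors pointing the same way, such as `[1, 2, 3]` and `[2, 4, 6]`, score close to 1, while unrelated (orthogonal) vectors such as `[1, 0]` and `[0, 1]` score 0.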

5. Helper Models

i) Neural networks

Neural networks are the basic building block of deep learning and employ the power of backpropagation and gradient descent. In Python, one typically has to add layers sequentially and train the model manually, whereas in ml5.js you can use a neural network directly as a function without writing numerous lines of code.
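
To give a feel for what gradient descent does under the hood, here is a minimal sketch in plain JavaScript that fits a single weight. It is not the ml5.js API; `fitWeight`, the learning rate, and the epoch count are illustrative assumptions:

```javascript
// Fit y ≈ w * x by gradient descent on the mean squared error.
function fitWeight(xs, ys, learningRate = 0.01, epochs = 200) {
  let w = 0; // start from an arbitrary guess
  for (let epoch = 0; epoch < epochs; epoch++) {
    // Gradient of mean((w * x - y)^2) with respect to w.
    let gradient = 0;
    for (let i = 0; i < xs.length; i++) {
      gradient += 2 * (w * xs[i] - ys[i]) * xs[i];
    }
    gradient /= xs.length;
    // Step against the gradient to reduce the error.
    w -= learningRate * gradient;
  }
  return w;
}
```

With the inputs `[1, 2, 3, 4]` and targets `[2, 4, 6, 8]`, the weight converges towards 2. A real network repeats the same idea for many weights at once, with backpropagation supplying the gradients.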

ii) Feature Extractors

ml5.js accommodates the new wave in ML known as transfer learning. Feature extraction basically means passing your dataset (images) through a pre-trained neural network so that the pre-trained weights can be used as a base for the classifier we intend to build.

iii) KNN classifier

A commonly used supervised learning algorithm that labels new data points according to the labels of the closest known data points.
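
The idea can be sketched in a few lines of plain JavaScript. This is an illustration of the algorithm, not the ml5.js implementation; `knnClassify` and the example data shape are assumptions made for the sketch:

```javascript
// Straight-line distance between two equal-length feature vectors.
function euclidean(a, b) {
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    sum += (a[i] - b[i]) ** 2;
  }
  return Math.sqrt(sum);
}

// examples: [{ features: [x, y, ...], label: 'A' }, ...]
// Returns the majority label among the k examples closest to `query`.
function knnClassify(examples, query, k = 3) {
  const nearest = examples
    .map((ex) => ({ label: ex.label, distance: euclidean(ex.features, query) }))
    .sort((a, b) => a.distance - b.distance)
    .slice(0, k);

  // Count one vote per neighbour.
  const votes = {};
  for (const n of nearest) {
    votes[n.label] = (votes[n.label] || 0) + 1;
  }
  return Object.keys(votes).reduce((best, label) => (votes[label] > votes[best] ? label : best));
}
```

Given a few points labelled 'A' near the origin and a few labelled 'B' near (5, 5), a query at (0.5, 0.5) is labelled 'A' because its three nearest neighbours all carry that label.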

iv) K means clustering

K-means is the classic unsupervised learning algorithm. It helps us group data points into K groups. The initial centroids can be any K random objects or simply the first K objects in sequence.

K-means clustering
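
The algorithm alternates between assigning each point to its nearest centroid and moving each centroid to the mean of its points. Below is a minimal plain JavaScript sketch, not the ml5.js implementation; `kMeans` and `nearestIndex` are hypothetical names, and seeding the centroids with the first K points is just one simple choice:

```javascript
// Index of the centroid closest to `point` (squared Euclidean distance).
function nearestIndex(point, centroids) {
  let best = 0;
  let bestDist = Infinity;
  centroids.forEach((c, i) => {
    const dist = c.reduce((sum, value, d) => sum + (value - point[d]) ** 2, 0);
    if (dist < bestDist) {
      bestDist = dist;
      best = i;
    }
  });
  return best;
}

function kMeans(points, k, iterations = 10) {
  // Seed the centroids with copies of the first k points.
  let centroids = points.slice(0, k).map((p) => p.slice());
  for (let iter = 0; iter < iterations; iter++) {
    // Assignment step: attach every point to its nearest centroid.
    const assignments = points.map((p) => nearestIndex(p, centroids));
    // Update step: move each centroid to the mean of its members.
    centroids = centroids.map((old, c) => {
      const members = points.filter((_, i) => assignments[i] === c);
      if (members.length === 0) return old; // keep an empty cluster in place
      return old.map((_, d) => members.reduce((s, m) => s + m[d], 0) / members.length);
    });
  }
  // Final assignments against the final centroids.
  const assignments = points.map((p) => nearestIndex(p, centroids));
  return { centroids, assignments };
}
```

Calling `kMeans(points, 2)` on two well-separated blobs returns one centroid near the middle of each blob, plus a cluster index for every point.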

FAQs about ml5.js

Q. Does ml5.js support Node.js like TensorFlow.js?

A. No

Q. What version should I use?

A. There is no compatibility difference between different versions of ml5.js; newer versions simply have newer functions. Thus, one should use the latest version.

Q. What are the prerequisites to learn ml5.js?

A. HTML, JavaScript, and you are good to go!

Q. What level of development does ml5.js support?

A. The motto of ml5.js is to make ML accessible to everyone, so anyone from a student to a seasoned developer can use it.

Q. Is ml5.js portable?

A. It is portable in the sense that it can be used in any browser or app.

Q. How is ml5.js different from TensorFlow.js?

A. ml5.js is built with TensorFlow.js as its base. It employs TensorFlow.js to provide its pre-trained models and to create new ones.

Important links

  • You can learn more by visiting the official site–> Here
  • The official Github repository–> Here
  • Know more about the pre-trained models–> Here



  • Gaurav Maindola

    I am a machine learning enthusiast with a keen interest in web development. My main interest is in the field of computer vision and I am fascinated with all things that comprise making computers learn and love to learn new things myself.
