- 1 Introduction
- 2 What is ml5.js?
- 3 Why use web-based machine learning?
- 4 Why use ml5.js?
- 5 How to Install and Use ml5.js
- 6 ml5.js pre-trained models
- 6.1 Image Based Models
- 6.2 Generative Based Models
- 6.3 Sound Based Models
- 6.4 Text Based Models
- 6.5 Helper Models
- 7 FAQs about ml5.js
- 8 Important links
What is ml5.js?
ml5.js uses the TensorFlow.js API layer to define, train, and test models easily, and it supports GPU acceleration for better computational efficiency. It also has built-in utilities for visualization, rendering, and porting to other platforms, which make it easier to work with.
Why use web-based machine learning?
Building machine learning models in the browser is useful because the data can stay entirely within the user’s browser. Webcams and microphones can also be used directly, without the external libraries that other languages typically need, which makes the browser a convenient place to work.
Why use ml5.js?
To begin with, ml5.js is built on TensorFlow.js and adds a layer of API abstraction on top of it. This means you don’t need to struggle with the low-level details of TensorFlow.js, yet you can draw on the same power through the high-level ml5.js library.
Below are some more points in favor of ml5.js:
- Supports acceleration using WebGL and GPUs
- Native support for converting browser I/O streams to model input data structures
- Standardization of model format
- Small model sizes, low-latency, portable model format
- Out-of-the-box pre-trained models for turnkey use
How to Install and Use ml5.js
You can use the below code to import the latest version of ml5.js:
Or if you are interested in a certain version of ml5.js use:
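As a rough sketch of both import options, the helper below builds the script URL for either the latest release or a pinned one. It assumes the library is served from the unpkg CDN under the path ml5@<version>/dist/ml5.min.js, which is the path commonly shown for ml5.js 0.x releases. The helper names ml5ScriptUrl and loadMl5 and the version number 0.12.2 are illustrative and do not come from the original text. In a plain HTML page, the same URL would normally go in the src attribute of a script tag.

```javascript
// Build the CDN URL for a given ml5.js version.
// Assumption: unpkg hosts the bundle at ml5@<version>/dist/ml5.min.js.
function ml5ScriptUrl(version = 'latest') {
  return `https://unpkg.com/ml5@${version}/dist/ml5.min.js`;
}

// Inject a <script> tag for that URL; resolves once the library has loaded.
// Only meaningful in a browser, so it rejects when no DOM is available.
function loadMl5(version = 'latest') {
  return new Promise((resolve, reject) => {
    if (typeof document === 'undefined') {
      reject(new Error('loadMl5 needs a browser DOM'));
      return;
    }
    const script = document.createElement('script');
    script.src = ml5ScriptUrl(version);
    script.onload = () => resolve(window.ml5);
    script.onerror = () => reject(new Error(`could not load ${script.src}`));
    document.head.appendChild(script);
  });
}

console.log(ml5ScriptUrl());          // URL for the latest release
console.log(ml5ScriptUrl('0.12.2'));  // URL for a pinned release (example version)
```

Pinning a version keeps a page from changing behavior when a new release is published, at the cost of missing newer functions.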
ml5.js pre-trained models
1. Image Based Models
i) Image classifier
The image classifier of ml5.js is trained on 15 million images and can classify approximately 1000 different classes, ranging from a dog to an aircraft carrier. It can be applied to images as well as videos.
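Here is a minimal sketch of how the classifier might be used. It assumes the ml5.js 0.x API, where ml5.imageClassifier('MobileNet', callback) loads the model and classify() reports an array of { label, confidence } objects through an error-first callback. The helper formatTopK and the element id photo are hypothetical names used only for illustration.

```javascript
// Turn classifier output into short "label (xx.x%)" strings, best guess first.
// Assumption: results look like [{ label: 'dog', confidence: 0.92 }, ...].
function formatTopK(results, k = 3) {
  return [...results]
    .sort((a, b) => b.confidence - a.confidence)
    .slice(0, k)
    .map(r => `${r.label} (${(r.confidence * 100).toFixed(1)}%)`);
}

// Browser-only part: runs when ml5 has been loaded and the page contains
// an <img id="photo"> element (the id is hypothetical).
if (typeof ml5 !== 'undefined' && typeof document !== 'undefined') {
  const img = document.getElementById('photo');
  const classifier = ml5.imageClassifier('MobileNet', () => {
    classifier.classify(img, (error, results) => {
      if (error) {
        console.error(error);
        return;
      }
      console.log(formatTopK(results));
    });
  });
}
```

Keeping the formatting in its own function means it can be checked without a browser or a loaded model.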
ii) Object detector
This ml5.js pre-trained model can detect the objects defined in the COCO dataset, a large-scale object detection, segmentation, and captioning dataset. The model is capable of detecting 80 classes of objects.
iii) Body pose detection
poseNet() is another pre-trained ml5.js model; it performs real-time pose estimation on an image or video feed. It can estimate multiple poses, so it can detect several people in a single image or video.
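The sketch below shows one way to work with pose results. It assumes the ml5.js 0.x output shape, where each detected person carries a pose.keypoints array of { part, score, position: { x, y } } entries. The helper confidentKeypoints and the 0.5 threshold are illustrative choices, not part of the library.

```javascript
// Keep only the keypoints the model is reasonably sure about, per detected person.
// Assumed input shape (ml5.js 0.x poseNet 'pose' event):
//   [{ pose: { keypoints: [{ part, score, position: { x, y } }, ...] } }, ...]
function confidentKeypoints(poses, minScore = 0.5) {
  return poses.map(({ pose }) =>
    pose.keypoints
      .filter(kp => kp.score >= minScore)
      .map(kp => ({ part: kp.part, x: kp.position.x, y: kp.position.y }))
  );
}

// Browser-only wiring: assumes the page has a <video> element showing the webcam.
if (typeof ml5 !== 'undefined' && typeof document !== 'undefined') {
  const video = document.querySelector('video');
  const poseNet = ml5.poseNet(video, () => console.log('PoseNet ready'));
  poseNet.on('pose', poses => console.log(confidentKeypoints(poses)));
}
```

Filtering by score is a common first step, since low-confidence keypoints tend to jitter from frame to frame.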
iv) Body Pix
Body Pix is an ml5.js segmentation model built on top of TensorFlow.js. It classifies each part of an image as ‘person’ or ‘not person’ and can then extract the person from the original image. It detects a human body and 24 different body parts (hands, legs, face, torso, etc.).
v) U-Net model
The U-Net model was developed by the computer science department of the University of Freiburg, Germany. It is a fully convolutional image segmentation model from the biomedical field that can be trained efficiently on a small dataset.
P.S: Currently it can only segment and extract faces from images and videos.
vi) Handpose model
Built on top of the TensorFlow.js ‘HandPose’ model, this model detects 21 landmarks of the hand and palm. Detection models are usually heavyweight, but the ml5.js models are light enough to run against a webcam in the browser for real-time handpose detection.
vii) Face mesh model
The face-mesh model is another extension of the TensorFlow.js pre-trained models. It is designed to detect multiple faces and 468 3-D landmarks per face in a video or image feed. It is a highly specialized model that needs suitable conditions to make confident predictions.
viii) Face-API model
Face-API is a face-detection model of ml5.js.
2. Generative Based Models
i) Style Transfer
ml5.js offers a cool pre-trained style-transfer model. Suppose we have two images, A and B; the process of imposing the style of image A onto the content of image B is known as neural style transfer. It is not a conventional predictive machine learning task, but it can be classed as generative learning because it produces a new, styled image.
ii) Conditional Variational Autoencoder (CVAE)
Autoencoders are common architectures for dimensionality reduction whose input and output have the same size. For example, passing a dataset of images through an autoencoder produces similar images with smoother features. CVAEs are generative autoencoders that can generate new data by adding random noise to the data and augmenting it.
iii) DCGAN
GANs are a fascinating family of generative algorithms that consist of a generator and a discriminator, and they can produce strikingly realistic fake output. Common examples are CartoonGAN (generates new cartoons) and StyleGAN (generates faces of people who do not exist). ml5.js offers a trained DCGAN out of the box in the browser.
iv) SketchRNN
Recurrent neural networks are generally not used for images, but they can generate new sequential data such as the pen strokes of a drawing. This ml5.js model generates new doodle drawings. It was trained on millions of images from the Quick, Draw! dataset (3.5 million labeled drawings).
v) Pix2Pix
Pix2Pix is a generative adversarial network designed for image-to-image translation: the user draws an image and sends it to the model, and the model outputs a real-world-style image that closely resembles the drawing.
3. Sound Based Models
i) Pitch detection
A basic sound analysis algorithm that estimates the pitch, i.e. the fundamental frequency, of a sound, usually a speech recording or a musical note. The model returns numeric values describing the recorded sound, such as the estimated frequency. The browser microphone can provide input to the model directly.
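Once a frequency is available, it can be mapped to a musical note. The sketch below assumes the detector reports a single fundamental frequency in hertz (or nothing when no pitch is found) and uses the standard equal-temperament formula with A4 = 440 Hz. The helper name frequencyToNote is illustrative.

```javascript
const NOTE_NAMES = ['C', 'C#', 'D', 'D#', 'E', 'F', 'F#', 'G', 'G#', 'A', 'A#', 'B'];

// Convert a frequency in Hz to the nearest equal-tempered note name.
// Uses MIDI numbering, where note 69 is A4 (440 Hz) and each semitone is one step.
function frequencyToNote(frequency) {
  if (!(frequency > 0)) return null; // silence, null, or no pitch found
  const midi = Math.round(69 + 12 * Math.log2(frequency / 440));
  const name = NOTE_NAMES[((midi % 12) + 12) % 12];
  const octave = Math.floor(midi / 12) - 1;
  return `${name}${octave}`;
}

console.log(frequencyToNote(440));    // A4
console.log(frequencyToNote(261.63)); // C4 (middle C)
```

Rounding to the nearest semitone hides small tuning errors, which is usually what a simple tuner or melody display wants.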
ii) Sound classifier
This model classifies sounds into predefined categories. You can add new classes and train the classifier again, or use it as-is for classification in your own apps. It can detect the following everyday words:
"zero" to "nine", "up", "down", "left", "right", "go", "stop", "yes", "no", "unknown word", "background noise".
4. Text Based Models
i) RNN (charRNN)
Built on recurrent neural networks, which are used for processing text sequences, this model takes a body of text as input and generates new text (based on the input) as output. For example:
A model trained on Virginia Woolf's novels generates the following text when given 'the meaning of life' as input: "So long as you write what you wish to write, that is all that matters; and whether it matters for ages or only for hours, nobody can say". This is a famous quote from Virginia Woolf's books.
ii) Sentiment Classifier
This model uses natural language processing to classify text as neutral, positive, or negative. It is highly applicable to real-life projects such as customer satisfaction or product review analysis. For example, reviews on a website can be classified into stars (or ‘good’ and ‘bad’) to summarize the sentiment of users or buyers.
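A small sketch of the labelling step: it assumes the classifier reports a single score between 0 (negative) and 1 (positive), as the ml5.js 0.x sentiment model does, and turns it into one of the three labels. The helper name sentimentLabel and the cut-offs 0.4 and 0.6 are hypothetical and would need tuning for a real project.

```javascript
// Map a sentiment score in [0, 1] to a coarse label.
// The thresholds are illustrative assumptions, not values from ml5.js.
function sentimentLabel(score, low = 0.4, high = 0.6) {
  if (score < low) return 'negative';
  if (score > high) return 'positive';
  return 'neutral';
}

// Example: label a handful of made-up review scores.
const reviewScores = [0.95, 0.5, 0.1];
console.log(reviewScores.map(s => sentimentLabel(s)));
```

Widening the gap between the two thresholds sends more borderline reviews to 'neutral'.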
iii) Universal Sentence Encoder (word2vec)
The Universal Sentence Encoder model produces 512-dimensional embeddings that can be used in sentiment and similarity detection models. Put simply, text with similar meaning is mapped to nearby vectors, and those vectors can be used to train new ML models.
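Similarity between two embeddings is usually measured with cosine similarity. The sketch below is plain JavaScript and does not call ml5.js. The three-element vectors are made-up stand-ins for real 512-dimensional embeddings, and the function name cosineSimilarity is illustrative.

```javascript
// Cosine similarity between two equal-length numeric vectors.
// 1 means same direction, 0 means unrelated, -1 means opposite.
function cosineSimilarity(a, b) {
  if (a.length !== b.length) throw new Error('vectors must have the same length');
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // a zero vector has no direction
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Toy embeddings (hypothetical values; real ones would have 512 entries).
const greeting = [0.9, 0.1, 0.0];
const hello = [0.8, 0.2, 0.1];
const invoice = [0.0, 0.1, 0.9];
console.log(cosineSimilarity(greeting, hello) > cosineSimilarity(greeting, invoice));
```

Because the measure ignores vector length, it compares meaning rather than magnitude.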
5. Helper Models
i) Neural networks
Neural networks are the basic building block of deep learning and rely on backpropagation and gradient descent. In Python, one typically adds layers sequentially and trains the model manually, whereas in ml5.js you can use a neural network directly as a function, without numerous lines of code.
ii) Feature Extractors
ml5.js supports transfer learning, a newer wave in ML. Feature extraction basically means passing your dataset (images) through a pre-trained neural network so that the pre-trained weights serve as the base for the classifier you intend to build.
iii) KNN classifier
A commonly used supervised learning algorithm that labels a new data point according to the labels of its closest data points.
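To make the idea concrete, here is a small plain-JavaScript sketch of k-nearest-neighbour classification by majority vote. It illustrates the algorithm itself and is not the ml5.js implementation. The names knnClassify and squaredDistance and the sample points are invented for the example.

```javascript
// Squared Euclidean distance between two equal-length numeric arrays.
function squaredDistance(a, b) {
  return a.reduce((sum, v, i) => sum + (v - b[i]) ** 2, 0);
}

// Label `point` by majority vote among its k nearest labelled examples.
// `examples` is an array of { features: number[], label: string }.
function knnClassify(examples, point, k = 3) {
  const nearest = examples
    .map(ex => ({ label: ex.label, dist: squaredDistance(ex.features, point) }))
    .sort((a, b) => a.dist - b.dist)
    .slice(0, k);

  // Count votes; a Map keeps labels in the order they were first seen.
  const votes = new Map();
  for (const { label } of nearest) votes.set(label, (votes.get(label) || 0) + 1);

  // On a tie, the label that appeared first (closest neighbour) wins.
  let best = null;
  let bestCount = 0;
  for (const [label, count] of votes) {
    if (count > bestCount) {
      best = label;
      bestCount = count;
    }
  }
  return best;
}

const examples = [
  { features: [1, 1], label: 'red' },
  { features: [1, 2], label: 'red' },
  { features: [2, 1], label: 'red' },
  { features: [8, 8], label: 'blue' },
  { features: [9, 8], label: 'blue' },
  { features: [8, 9], label: 'blue' }
];
console.log(knnClassify(examples, [1.5, 1.5])); // red
console.log(knnClassify(examples, [8.5, 8.5])); // blue
```

An odd k avoids most ties when there are only two labels.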
iv) K means clustering
K-means is the classic unsupervised learning algorithm. It groups data points into K groups. The initial centroids can be any K randomly chosen objects, or simply the first K objects in sequence.
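The algorithm can be sketched in plain JavaScript as follows. This is an illustration of the technique rather than the ml5.js implementation, and it takes the first K points in sequence as the initial centroids. The names kMeans and dist2 and the sample points are invented for the example.

```javascript
// Squared Euclidean distance between two equal-length numeric arrays.
function dist2(a, b) {
  return a.reduce((sum, v, i) => sum + (v - b[i]) ** 2, 0);
}

// Group `points` into k clusters; returns the centroids and each point's cluster index.
function kMeans(points, k, maxIterations = 100) {
  if (k < 1 || k > points.length) throw new Error('k must be between 1 and points.length');
  let centroids = points.slice(0, k).map(p => [...p]); // first K points as initial centroids
  let assignments = new Array(points.length).fill(-1);

  for (let iter = 0; iter < maxIterations; iter++) {
    // Assignment step: attach every point to its nearest centroid.
    const next = points.map(p => {
      let best = 0;
      for (let c = 1; c < k; c++) {
        if (dist2(p, centroids[c]) < dist2(p, centroids[best])) best = c;
      }
      return best;
    });
    const changed = next.some((c, i) => c !== assignments[i]);
    assignments = next;
    if (!changed) break; // converged: no point switched cluster

    // Update step: move each centroid to the mean of its members.
    centroids = centroids.map((old, c) => {
      const members = points.filter((_, i) => assignments[i] === c);
      if (members.length === 0) return old; // keep an empty cluster's centroid in place
      return old.map((_, d) => members.reduce((s, p) => s + p[d], 0) / members.length);
    });
  }
  return { centroids, assignments };
}

const points = [[1, 1], [9, 9], [1, 2], [8, 9], [2, 1], [9, 8]];
const result = kMeans(points, 2);
console.log(result.assignments); // cluster index per point
console.log(result.centroids);   // one centroid per cluster
```

Starting from the first K points makes the run repeatable, although a poor starting order can lead to a worse grouping than a random start would.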
FAQs about ml5.js
Q. Does ml5.js support node.js like tensorflow.js?
Q. What version should I use?
A. There is no compatibility difference between versions of ml5.js; newer versions simply have newer functions. Thus one should use the latest version.
Q. What are the prerequisites to learn ml5.js?
Q. What level of development does ml5.js support?
A. The motto of ml5.js is to make ML accessible to everyone, so anyone from a student to a seasoned developer can use it.
Q. Is ml5.js portable?
A. It is portable in the sense that it can be used in any browser or app.
Q. How is ml5.js different from tensorflow.js?
A. ml5.js is built with TensorFlow.js as its base. It employs TensorFlow.js to make its pre-trained models and to create new ones.
Important links
- You can learn more by visiting the official site–> Here
- The official Github repository–> Here
- Know more about the pre-trained models–> Here