Artificial Intelligence has many branches, and among them Natural Language Processing (NLP) has become widely popular across various domains. In this article, we look at GitHub repositories with interesting and useful natural language processing projects to inspire you. You can implement these NLP projects on your own or enhance them with more features. Let us go through them.
We also list the number of stars (★) and forks (⑂) each GitHub repository had at the time of writing, to give you an idea of its popularity.
Natural Language Processing GitHub Repositories
DeepMoji (⭐ – 1k | ⑂ – 249)
DeepMoji is a deep learning model for analyzing sentiment, emotion, sarcasm, and similar signals in text. It was trained on 1.2 billion tweets containing emojis to learn how language is used to express emotions. The repository contains the deep learning model along with example code snippets, training data, and tests for evaluating the code.
Me_Bot (⭐ – 610 | ⑂ – 47)
This is an interesting NLP GitHub repository that focuses on creating a bot, "Me_Bot", that learns from your WhatsApp conversations and then starts to converse like you. The chats have to be exported from the phone so that the bot can be trained on them. This is a lightweight, fun project, but you can build upon the idea to create similar bots of your own.
Speech Emotion Analyzer (⭐ – 448 | ⑂ – 171)
The idea behind this project is to create a neural network model that detects emotions from the conversations we have in our daily life. The model can detect up to five different emotions in male and female speakers. This can be used for personalization in marketing, for example recommending products based on the detected emotion. Similarly, automotive companies could use it to detect a driver's emotional state and adjust the vehicle's speed to help avoid collisions.
Automatic Summarization of Scientific Papers (⭐ – 105 | ⑂ – 29)
This NLP GitHub project tries to make life easier for people who regularly read research papers and want to summarize what they have learned. It builds a supervised learning-based system that summarizes scientific papers. This can be a good project to learn from for beginners and intermediate learners.
Paraphrase Detection (⭐ – 136 | ⑂ – 47)
Paraphrase detection is a popular application of Natural Language Processing: the task is to detect whether two different sentences have the same meaning. It has applications in areas like machine translation, question answering, information extraction, and summarization. This GitHub repository contains a project that identifies paraphrasing and is worth checking out for beginners.
Generating research paper titles (⭐ – 46 | ⑂ – 7)
This GitHub repository hosts a very innovative project. Here a GPT-2 model is trained on data extracted from arXiv to generate titles of research papers. Along the way, we also get to learn about the web scraper used to extract the text of research papers, which is later fed to the model for training. The same approach has other variants, such as generating song lyrics, dialogues, and many other text generation tasks.
Toxic Comments Classification (⭐ – 3 | ⑂ – 7)
Eliminating toxic comments has been a prominent issue in the world of social media. This repository hosts a project that can be used as a starting point for working on the classification of toxic comments. The problem was part of a Kaggle competition in which participants had to propose solutions for classifying toxic comments into various categories using natural language processing concepts.
Document Similarity (⭐ – 11 | ⑂ – 14)
This beginner-level natural language processing GitHub repository is about document similarity. The idea behind the document similarity application is to find the common topic discussed across documents. There are various methods for measuring similarity; this repository uses cosine similarity computed over the words in the documents. This kind of application can be used in many different domains.
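To make the cosine similarity idea concrete, here is a minimal sketch that uses only the Python standard library. It is not the repository's code: the function names `term_counts` and `cosine_similarity`, the simple regex tokenizer, and the choice of raw word counts as the vector representation are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def term_counts(text):
    """Lowercase the text and count its word tokens (a simple bag of words)."""
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))


def cosine_similarity(doc_a, doc_b):
    """Cosine of the angle between the two documents' word-count vectors."""
    a, b = term_counts(doc_a), term_counts(doc_b)
    # Only words that occur in both documents contribute to the dot product.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document is treated as similar to nothing
    return dot / (norm_a * norm_b)
```

A score near 1 means the two documents use nearly the same words in similar proportions, while 0 means they share no words. Real projects often replace raw counts with TF-IDF weights so that very common words matter less.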
Shakespeare Text Generation (⭐ – 3 | ⑂ – 7)
This is a fun NLP project that hosts a web app for generating Shakespeare-like text. It is implemented by training an LSTM on Shakespearean data to create a language model that generates text in Shakespearean style. You may like to explore this repository to create a language model of a different style.
Opinion Summarizer (⭐ – 34 | ⑂ – 9)
This is a useful repository for summarizing customer reviews and opinions from Amazon and Yelp. The basic idea is to produce abstractive summaries that represent a group of similar reviews. The model built for this task is based on Bayesian autoencoding. The repository contains all the relevant data from Amazon and Yelp, along with files that help in pre-processing the data and evaluating the model.
Document Similarity Check with RestAPI (⭐ – 34 | ⑂ – 9)
This NLP project on GitHub will help you build a complete application consisting of a RESTful API that checks the similarity of documents using natural language processing. Another impressive part of this repository is that it shows how to deploy the API with Docker and use it as a web application.
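As a rough illustration of what a similarity-check endpoint can look like, the sketch below uses only Python's built-in `http.server`. It does not reflect the repository's actual framework, routes, or scoring method: the `/similarity` route, the `text1` and `text2` field names, port 8000, and the word-count cosine score are all hypothetical choices.

```python
import json
import math
import re
from collections import Counter
from http.server import BaseHTTPRequestHandler, HTTPServer


def similarity(text_a, text_b):
    """Cosine similarity between the word-count vectors of two texts."""
    a = Counter(re.findall(r"\w+", text_a.lower()))
    b = Counter(re.findall(r"\w+", text_b.lower()))
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(c * c for c in a.values())) * math.sqrt(
        sum(c * c for c in b.values())
    )
    return dot / norm if norm else 0.0


def handle_payload(raw_body):
    """Turn a JSON request body into an (HTTP status, response dict) pair."""
    try:
        data = json.loads(raw_body)
        score = similarity(data["text1"], data["text2"])  # hypothetical field names
    except (ValueError, KeyError, TypeError, AttributeError):
        return 400, {"error": "expected JSON with 'text1' and 'text2' strings"}
    return 200, {"similarity": round(score, 4)}


class SimilarityHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        if self.path != "/similarity":  # hypothetical route
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        status, payload = handle_payload(self.rfile.read(length))
        body = json.dumps(payload).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def run(port=8000):
    """Serve the API until interrupted; the port is an arbitrary choice."""
    HTTPServer(("0.0.0.0", port), SimilarityHandler).serve_forever()
```

Calling `run()` starts the server, after which a client can POST a JSON body such as `{"text1": "...", "text2": "..."}` to `/similarity`. Keeping the scoring logic in `handle_payload`, separate from the HTTP handler, makes it easy to test without starting a server and to move into a container later.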
That brings us to the end of this article, in which we looked at GitHub repositories containing natural language processing projects. These projects cover various topics in NLP. You can learn how to develop such projects by studying these repositories, and also pick up good practices for maintaining a GitHub repository.