- 1 Introduction
- 2 Natural Language Processing GitHub Repositories
- 2.1 DeepMoji (⭐ – 1k | ⑂ – 249)
- 2.2 Me_Bot (⭐ – 610 | ⑂ – 47)
- 2.3 Speech Emotion Analyzer (⭐ – 448 | ⑂ – 171)
- 2.4 Automatic Summarization of Scientific Papers (⭐ – 105 | ⑂ – 29)
- 2.5 Paraphrase Detection (⭐ – 136 | ⑂ – 47)
- 2.6 Generating research paper titles (⭐ – 46 | ⑂ – 7)
- 2.7 Toxic Comments Classification (⭐ – 3 | ⑂ – 7)
- 2.8 Document Similarity (⭐ – 11 | ⑂ – 14)
- 2.9 Shakespeare Text Generation (⭐ – 3 | ⑂ – 7)
- 3 Conclusion
Artificial Intelligence has many branches, and among them Natural Language Processing (NLP) has become widely popular across various domains. In this article, we look at GitHub repositories containing interesting and useful NLP projects to inspire you. You can implement these projects on your own or extend them with more features. So let us go through them.
We also list the number of stars (⭐) and forks (⑂) each repository had at the time of writing, to give you an idea of its popularity.
Natural Language Processing GitHub Repositories
1. DeepMoji (⭐ – 1k | ⑂ – 249)
DeepMoji is a deep learning model for analyzing sentiment, emotion, sarcasm, and similar signals in text. It was trained on 1.2 billion tweets containing emojis to learn how language is used to express emotions. The repository contains the model along with example code snippets, training data, and tests for evaluating the code.
2. Me_Bot (⭐ – 610 | ⑂ – 47)
This is an interesting NLP GitHub repository that focuses on creating a bot, “Me_Bot”, that learns from your WhatsApp conversations and then starts chatting the way you do. The chats have to be exported from the phone so that the bot can be trained on them. This is a lightweight, fun project, but you can build upon the idea to create similar bots of your own.
3. Speech Emotion Analyzer (⭐ – 448 | ⑂ – 171)
The idea behind this project is to create a neural network model that detects emotions from the conversations we have in daily life. The model can detect up to five different emotions in both male and female speakers. This can be used for personalization in marketing, for example recommending products based on a customer's emotions. Similarly, automotive companies could use it to detect the emotions of drivers and adjust the vehicle's speed to help avoid collisions.
4. Automatic Summarization of Scientific Papers (⭐ – 105 | ⑂ – 29)
This NLP GitHub project aims to make life easier for people who regularly read research papers and want to summarize what they learn. It builds a supervised learning-based system that summarizes scientific papers. It is a good project for beginners and intermediate learners to study.
5. Paraphrase Detection (⭐ – 136 | ⑂ – 47)
Paraphrase detection is a popular application of Natural Language Processing: the task is to detect whether two different sentences have the same meaning. It has applications in areas such as machine translation, question answering, information extraction, and summarization. This GitHub repository contains a project that identifies paraphrases and is worth checking out for beginners.
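To make the task concrete, here is a minimal sketch of a lexical-overlap baseline for paraphrase detection. It is not the method used in the repository; it simply scores two sentences by the Jaccard overlap of their word sets and applies a cutoff. The function names and the `threshold` value of 0.5 are assumptions chosen for illustration.

```python
import re


def token_set(sentence):
    """Return the set of lowercase word tokens in a sentence."""
    return set(re.findall(r"[a-z0-9']+", sentence.lower()))


def jaccard_overlap(sentence_a, sentence_b):
    """Jaccard overlap: shared words divided by all distinct words."""
    a, b = token_set(sentence_a), token_set(sentence_b)
    if not a and not b:
        return 1.0  # Two empty sentences are treated as identical.
    return len(a & b) / len(a | b)


def is_paraphrase(sentence_a, sentence_b, threshold=0.5):
    """Hypothetical decision rule: overlap at or above the threshold."""
    return jaccard_overlap(sentence_a, sentence_b) >= threshold


if __name__ == "__main__":
    print(is_paraphrase("The cat sat on the mat", "The cat sat on a mat"))
    print(is_paraphrase("Hello world", "Goodbye moon"))
```

A baseline like this only captures surface overlap, so it misses paraphrases that use different words and can be fooled by sentences that share words but differ in meaning. Learned models are typically used to close that gap.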
6. Generating research paper titles (⭐ – 46 | ⑂ – 7)
This GitHub repository hosts an innovative project in which a GPT-2 model is trained on data extracted from arXiv to generate titles of research papers. Along the way, you also get to learn about web scraping, since a scraper is used to extract the text of research papers that is later fed to the model for training. The application also has variants for other text generation tasks, such as song lyrics and dialogues.
7. Toxic Comments Classification (⭐ – 3 | ⑂ – 7)
Eliminating toxic comments has been a prominent issue in the world of social media. This repository hosts a project that can serve as a starting point for classifying toxic comments. The problem was part of a Kaggle competition in which participants had to propose solutions for classifying toxic comments into various categories using natural language processing techniques.
8. Document Similarity (⭐ – 11 | ⑂ – 14)
This beginner-level natural language processing GitHub repository is about document similarity. The idea behind a document similarity application is to find the common topics discussed across documents. There are various methods for measuring similarity; this repository uses cosine similarity computed over the documents' words. This kind of application can be used in many different domains.
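As an illustration of the cosine similarity idea, the sketch below compares two documents by their word-count vectors using only the Python standard library. It is a simplified example and not the repository's code: the tokenizer, the use of raw counts rather than a weighting such as TF-IDF, and the function names are all assumptions.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def cosine_similarity(doc_a, doc_b):
    """Cosine similarity between the word-count vectors of two documents."""
    counts_a = Counter(tokenize(doc_a))
    counts_b = Counter(tokenize(doc_b))
    # Dot product over the words that appear in both documents.
    shared = counts_a.keys() & counts_b.keys()
    dot = sum(counts_a[w] * counts_b[w] for w in shared)
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # An empty document has no direction to compare.
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    print(cosine_similarity("NLP models process text", "Text is processed by NLP models"))
    print(cosine_similarity("apple banana", "car engine"))
```

The score ranges from 0 (no words in common) to 1 (identical word distributions), regardless of document length, which is why cosine similarity is a common choice for comparing documents of different sizes.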
9. Shakespeare Text Generation (⭐ – 3 | ⑂ – 7)
This is a fun NLP project that hosts a web app for generating Shakespeare-style text. It is implemented by training an LSTM on Shakespearean data to create a language model that generates text in the Shakespearean style. You may like to explore this repository to create a language model for a different style.
We have reached the end of another article, in which we looked at more GitHub repositories containing natural language processing projects. These projects cover various NLP topics. You can learn how to develop such projects by studying these repositories, and also pick up the practices followed to maintain a GitHub repository.