What is LangChain | Introduction to LangChain for Beginners with Examples

Introduction

Ever since OpenAI released ChatGPT in late 2022, the buzz around LLMs (Large Language Models) has grown exponentially, with new LLMs being released every other day. As if so many LLMs were not confusing enough, there is another growing buzz in the community around an LLM framework called LangChain. In this article, we will understand what LangChain is, why it is needed, and how it works. We will also show some examples, but will not dive deep into low-level aspects, as this is an introduction to LangChain for beginners.

What is LangChain

LangChain is an open-source framework that streamlines, simplifies, and standardizes the development of LLM applications through its integrations with various LLMs, data sources, APIs, and tools. It officially supports two programming languages – Python and JavaScript – however, it also exposes a rich set of APIs, making it possible to work with it from other programming environments.

Quick History

LangChain was created by Harrison Chase in October 2022 and quickly gained early traction in the GitHub community. The timing could not have been more perfect, because OpenAI launched ChatGPT soon after, and the euphoria that followed around LLMs meant the LangChain project found more backers. It was incorporated as a formal startup in April 2023, has since raised funding, and is rapidly growing the LangChain framework with more integrations and features. It also recently announced LangSmith, a managed platform to test, debug, and monitor production-grade LLM applications.

Why LangChain is Useful

LangChain brings a variety of LLMs under a single umbrella framework and provides integrations with multiple data sources, external APIs, and tools in a simplified workflow. But first, let us understand the underlying problems that LangChain is trying to resolve –

Standardization in LLMs Framework

We all know that OpenAI provides a library to programmatically interact with its GPT models to build applications. However, this OpenAI library cannot be used to work with other LLMs. To work with another LLM, you would have to learn its own library, and with so many LLMs available, it is not practical to learn a new library every other time.

LangChain offers a standard interface that you have to learn only once, after which you can work with any LLM from the huge list of LLMs that it supports.

Standard Integration with Data Source

Multiple LLMs are just one end of the problem; the other end is the variety of data sources. The data you want to work with may reside in a simple file like a CSV, Excel, or PDF, in a database, or in the cloud on AWS or GCP. In fact, you may even want to fetch data from Google, Wikipedia, websites, or third-party platforms at runtime. However, connecting to each data source requires its own integration, which comes with its own complexity.

LangChain resolves this problem by offering ready-to-use integrations with most of the popular data sources you might want to work with, so you can connect to any of them with minimal hassle.
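For example, here is a minimal sketch of loading local files through LangChain's document loaders (the file names are placeholders, and the legacy langchain package used in the examples section of this article is assumed):

from langchain.document_loaders import TextLoader, CSVLoader

# load a plain text file (file name is just a placeholder)
text_docs = TextLoader("notes.txt").load()

# the same load() interface works for a CSV file via a different loader
csv_docs = CSVLoader("sales.csv").load()

print(text_docs[0].page_content[:100])

Whatever the source, every loader returns the same list of Document objects, which is what makes the data sources interchangeable.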

Plug & Play with Integrations

While creating an LLM application, you may want to change your LLM because it is not giving proper results, or you may want to change the data source, say from Google to Wikipedia. With LangChain's standard interface, it is just a matter of plug and play with these LLM and data source integrations. This flexibility is what makes LangChain such a powerful tool.

Simplifies Complex Workflows

A production or enterprise-grade LLM application may have multiple data sources. It may also require multiple LLMs working together in synergy to produce the desired output. This is where LangChain really shines, as it can chain multiple data sources and LLMs together as per your requirements with minimal complexity. Without LangChain, writing custom integrations for multiple data sources and LLMs, and tying them together, would be a very difficult job, if not an impossible one. This is the true magic of LangChain: it allows you to focus on designing your application instead of worrying about custom integrations.

Quick Learning Curve

In spite of having a rich set of integrations with various data sources and LLMs, the learning curve to get started with LangChain is really small. You can actually make a meaningful start with LangChain in just 5-6 lines of code.
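For instance, a sketch as small as the one below is enough to query an LLM (assuming the OpenAI API key is already set as an environment variable, as shown later in the examples section):

from langchain.llms import OpenAI

# picks up the OPENAI_API_KEY environment variable automatically
llm = OpenAI()
print(llm("Suggest a name for a bakery"))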


LangChain Modules

LangChain consists of the following modules –

1. Model I/O

The Model I/O module of LangChain helps you interact with any supported Large Language Model, allowing you to send prompts to these LLMs and receive output in return.

The main features of LangChain Model I/O are –

  • Support for multiple LLMs through a common standard interface.
  • Ability to plug and play different LLMs with the same code.
  • Support for both Chat APIs and Text Completion APIs (a brief sketch of both follows this list).
  • Lets you create prompt templates for ease of use.
  • Lets you save prompts for later reuse.
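As a brief sketch (assuming the OpenAI key setup shown in the examples section), here is the same question asked through a Text Completion API and a Chat API:

from langchain.llms import OpenAI
from langchain.chat_models import ChatOpenAI
from langchain.schema import HumanMessage

# Text Completion API: plain string in, plain string out
completion_llm = OpenAI()
print(completion_llm("Translate 'good morning' to French:"))

# Chat API: a list of messages in, an AIMessage out
chat_llm = ChatOpenAI()
print(chat_llm([HumanMessage(content="Translate 'good morning' to French.")]).content)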

2. Retrieval

The Retrieval module of LangChain helps you connect your LLM with various data sources such as documents, databases, vector stores, the cloud, the internet, etc.

The main features of the LangChain Retrieval module are –

  • Support for multiple data sources through a common standard interface.
  • Flexibility to plug and play different data sources.
  • Ability to apply transformations to documents, such as chunking, translation, and filtering out redundant docs.
  • Support for text embeddings of data through integrations with more than 25 embedding providers.
  • Support for more than 50 vector stores, ranging from open-source to proprietary ones (a minimal retrieval sketch follows this list).
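Here is a minimal sketch of a typical retrieval flow – load, chunk, embed, and store – using OpenAI embeddings and a local FAISS vector store (the file name is a placeholder, the faiss-cpu package must be installed, and any other supported embedding provider or vector store could be swapped in):

from langchain.document_loaders import TextLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import FAISS

# load the source document and split it into overlapping chunks
docs = TextLoader("company_faq.txt").load()
chunks = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50).split_documents(docs)

# embed the chunks and index them in a local FAISS vector store
vector_store = FAISS.from_documents(chunks, OpenAIEmbeddings())

# fetch the chunks most similar to a query
results = vector_store.similarity_search("What is the refund policy?", k=2)
print(results[0].page_content)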

3. Chains

The Chains module of LangChain lets you link multiple LLM calls together in chains to create complex applications.

The main features of the LangChain Chains module are –

  • Multiple LLMs can be chained together to carry out complex workflows.
  • The LLMs in a chain can easily be swapped, plug-and-play style.
  • LangChain offers two high-level ways to work with Chains –
    • The first is the Chain interface, which LangChain now considers a legacy approach.
    • The second is the newer LangChain Expression Language (LCEL).
  • The legacy Chain interface will continue to be supported and can also be used from within LCEL (a small comparison of the two styles follows this list).
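To make the two styles concrete, here is a minimal sketch of the same one-step chain written both ways (OpenAI setup assumed, as in the examples section):

from langchain.llms import OpenAI
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain

llm = OpenAI()
prompt = PromptTemplate(input_variables=["product"], template="Suggest a name for a company that makes {product}.")

# legacy Chain interface
legacy_chain = LLMChain(llm=llm, prompt=prompt)
print(legacy_chain.run(product="shoes"))

# the equivalent chain in LangChain Expression Language (LCEL)
lcel_chain = prompt | llm
print(lcel_chain.invoke({"product": "shoes"}))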

4. Memory

The Memory module of LangChain lets you store users’ interactions with LLMs so they can be referred to for context in future interactions. This is useful for designing a conversational agent like ChatGPT that maintains a history of user interactions.

The main features of the LangChain Memory module are –

  • Flexibility to store all interactions or only a limited number of them (see the sketch after this list).
  • It can also limit memory storage in terms of tokens.
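As a brief sketch (OpenAI setup assumed), the snippet below keeps only the last two exchanges in memory; a token-based limit works similarly via ConversationTokenBufferMemory:

from langchain.llms import OpenAI
from langchain.chains import ConversationChain
from langchain.memory import ConversationBufferWindowMemory

# remember only the last 2 user-assistant exchanges
memory = ConversationBufferWindowMemory(k=2)
conversation = ConversationChain(llm=OpenAI(), memory=memory)

conversation.predict(input="Hi, my name is Sam.")
print(conversation.predict(input="What is my name?"))  # answered from memory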


5. Agents

The Agents module of LangChain lets LLMs reason about a task and take appropriate actions in a chain. This helps avoid hardcoding actions and enables an intelligent approach to complex workflows.

The main features of the LangChain Agents module are –

  • LangChain offers different types of Agents, each with its own style of reasoning and of handling inputs and outputs.
  • An agent can be provided with Tools, which are nothing but utility functions. The agent can use a tool along with user input or intermediate output to decide on the next action.
  • LangChain provides a wide range of ready-to-use tools out of the box. However, you can also write your own tools quite easily.
  • LangChain also has the concept of a Toolkit, which is a group of related tools for a particular task. Again, several built-in Toolkit integrations are available for use (a minimal agent sketch follows this list).
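Here is a minimal agent sketch using LangChain's built-in llm-math tool (OpenAI setup assumed; the question is arbitrary):

from langchain.llms import OpenAI
from langchain.agents import load_tools, initialize_agent, AgentType

llm = OpenAI(temperature=0)

# give the agent a ready-made calculator tool
tools = load_tools(["llm-math"], llm=llm)

# a ReAct-style agent that reasons about which tool to use next
agent = initialize_agent(tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True)
print(agent.run("What is 15% of 240?"))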


LangChain Disadvantages

No framework can be a hundred percent perfect, especially one that is new and undergoing active changes. Let us look at some disadvantages of using the LangChain framework.

Difficulty in Debugging

With so many layers of abstraction, LangChain can become a black box, making it difficult to debug what is happening in the background.

Integration Issues

Due to its integrations with so many LLMs, tools, data sources, and platforms, LangChain has to make sure it stays up to date with changes across this big ecosystem. Recently, OpenAI made some major changes to its APIs, which broke LangChain integrations.

Lack of Good Documentation

LangChain's documentation needs to be clearer and more self-explanatory, with good examples; this is one area where it is lacking. To their credit, the team is making continuous improvements to the documentation at the time of writing this article.

Legacy Interfaces

Recently, LangChain has marked many of its interfaces as legacy and is pushing its new LangChain Expression Language (LCEL). This means many LangChain interfaces that developers have already learned are becoming obsolete, while on the other hand the LCEL documentation still lacks clarity and good examples.

Inability to Customize

With LangChain, you are limited to the set of integrations and features it supports. If you want to integrate with a new LLM, for example, you cannot easily do so unless LangChain supports it officially.

Lock In

You may design a complex application with LangChain only to realize later that you are stuck with its limitations and cannot move away from it easily due to technical lock-in. On the other hand, an application designed with native integrations to LLMs and data sources can offer better control and flexibility. Hence, while designing an enterprise-grade application, one should consider lock-in with frameworks and technologies and its long-term implications.


LangChain Examples

In this section, we will show some of the basic functionalities of LangChain with examples so that beginners can understand them better. Throughout the examples, we will work with two LLMs – OpenAI’s GPT model and Google’s Flan-T5 model. This will help us showcase how easily we can interchange LLMs with the help of the standard interface offered by LangChain.

Required Installations

In this section, we will do some prerequisite installations for our examples.

Install LangChain

The latest version of LangChain can be installed with pip as shown below.

In [0]:

pip install langchain

Install OpenAI

The OpenAI package has to be installed because LangChain uses it internally to work with GPT LLMs.

In [1]:

pip install openai

Install Hugging Face Hub

The Hugging Face Hub package is required for LangChain to internally connect to and work with the Google Flan-T5 model.

In [2]:

pip install huggingface_hub

Setting Up OpenAI GPT LLM

API Keys

You should have an OpenAI account and generate an API key; keep in mind that it is a paid API. However, if you do a new sign-up, OpenAI provides some free credits to start with.
Once you have the API key, it needs to be set as an environment variable as shown below.
In [3]:
import os

os.environ["OPENAI_API_KEY"] = '<your api key>'

Using OpenAI LLM with LangChain

Next, we import the OpenAI module through LangChain and initialize an instance of it. To quickly check that it is working, we pass a prompt through this object and print the output.

In [4]:

from langchain.llms import OpenAI

openai_llm = OpenAI()
print(openai_llm("Name any one movie of Will Smith "))
Out[4]:
Independence Day

Setting Up Google Flan T5 LLM

API Keys

You should create an account on Hugging Face and then generate an API key to connect to and work with the Google Flan-T5 LLM, which is free to use. In fact, on Hugging Face you can find many free LLMs that can be used if you don’t wish to spend on LLMs like OpenAI’s.

Once you have the API key, it needs to be set as an environment variable as shown below.
In [5]:
import os

os.environ["HUGGINGFACEHUB_API_TOKEN"] = '<your api key>'

Using Hugging Face with LangChain

Next, we import the HuggingFaceHub module of LangChain and instantiate it using the repo id of the Google Flan-T5 model. To quickly check that it is working, we pass a prompt through this object and print the output.

In [6]:

from langchain.llms import HuggingFaceHub

repo_id = "google/flan-t5-xxl"
flan_llm = HuggingFaceHub(repo_id=repo_id, model_kwargs={"temperature": 0.5, "max_length": 512})

print(flan_llm("Name any one movie of Will Smith "))

Out[6]:

independance day

LangChain Templates Examples

Single Input Template

Prompt templates can be created using the PromptTemplate module. Here a variable {movie} is used as a placeholder in the prompt template. To run the prompt, we just have to pass the value of movie, without writing the complete prompt again and again, as shown in the example below.

In [7]:

from langchain.prompts import PromptTemplate

prompt_template = PromptTemplate(input_variables=["movie"], template="The Director of {movie} is ")

prompt = prompt_template.format(movie="Jurassic Park")

print("Input Prompt: " + prompt)
print("Output: " + flan_llm(prompt))

prompt = prompt_template.format(movie="Titanic")

print("Input Prompt: " + prompt)
print("Output: " + flan_llm(prompt))

Out[7]:

Input Prompt: The Director of Jurassic Park is 
Output: Steven Spielberg
Input Prompt: The Director of Titanic is 
Output: James Cameron


Multiple Input Template

This example extends the previous one by including two variables inside the prompt template – {crew_type} and {movie}. It shows the kind of versatile templates you can design for maximum reuse.

In [8]:

from langchain.prompts import PromptTemplate

multi_input_template = PromptTemplate(input_variables=["crew_type", "movie"], template="The {crew_type} of {movie} is ")

prompt = multi_input_template.format(movie="Interstellar", crew_type="Director")

print("Input Prompt: " + prompt)
print("Output: " + flan_llm(prompt))

prompt = multi_input_template.format(movie="Iron Man", crew_type="Lead Actor")

print("Input Prompt: " + prompt)
print("Output: " + flan_llm(prompt))

Out[8]:

Input Prompt: The Director of Interstellar is 
Output: Christopher Nolan
Input Prompt: The Lead Actor of Iron Man is 
Output: Robert Downey Jr.

LangChain Chain Examples

In this example of a LangChain Chain, we are using the latest recommended LangChain Expression Language (LCEL) instead of the legacy Chain interface.

With the help of prompt templates, we have designed two prompts. The first prompt gets the popular dish of a given place. The second prompt takes the dish name produced by the first prompt as input and generates the recipe for it. This works like a chain, and you can add more prompts for a longer, more complex chain.

In [9]:

from langchain.schema import StrOutputParser
from langchain.schema.runnable import RunnablePassthrough
from langchain.prompts import PromptTemplate

place_prompt_template = PromptTemplate(input_variables=["place_name"], template="A popular dish from the place {place_name} is ")
dish_recipe_prompt_template = PromptTemplate(input_variables=["dish_name"], template="Summarize the recipe for {dish_name}")

place_name_chain = place_prompt_template | flan_llm | StrOutputParser()
dish_recipe_chain = dish_recipe_prompt_template | flan_llm | StrOutputParser()
chain = {"dish_name": place_name_chain} | RunnablePassthrough.assign(recipe=dish_recipe_chain)
chain.invoke({"place_name": "Bangalore"})
Out[9]:
{'dish_name': 'neer dosa',
 'recipe': 'Preheat the tawa on medium heat. Make the batter by mixing the gram flour, salt, red chilli powder and coriander powder. Add the water and mix it to a smooth batter. Add the grated coconut and mix it well. Pour the batter in the hot tawa. Spread it in a circular motion.'}

Interchanging LLMs (Plug and Play Example)

In the above example, we used the Google Flan-T5 LLM for both prompts in the chain. However, the recipe generated by the Flan-T5 LLM is not elaborate. In this example, we simply plug the Flan-T5 LLM out of the second prompt and plug the OpenAI LLM in to get better recipe results. This really shows how easy it is to plug and play with multiple LLMs using LangChain’s standard interface.

In [10]:

place_name_chain = place_prompt_template | flan_llm | StrOutputParser()
dish_recipe_chain = dish_recipe_prompt_template | openai_llm | StrOutputParser()

chain = {"dish_name": place_name_chain} | RunnablePassthrough.assign(recipe=dish_recipe_chain)
chain.invoke({"place_name": "Bangalore"})
Out[10]:
{'dish_name': 'neer dosa',
 'recipe': '\n\nNeer dosa is a popular dish from the South Indian region of Karnataka. It is a thin, crepe-like pancake made from a batter of ground rice and coconut milk. The batter is seasoned with turmeric, salt, and cumin, and is typically served with a vegetable curry or chutney. To make neer dosa, mix together 1 cup of ground rice, 1 cup of coconut milk, 1/2 teaspoon of turmeric, 1/2 teaspoon of salt, and 1/2 teaspoon of cumin. Heat a non-stick pan over medium-high heat and lightly grease it with a few drops of oil. Pour 1/4 cup of the batter into the pan and spread it out to form a thin, round dosa. Cook for 1-2 minutes, or until the edges start to turn golden brown. Flip the dosa and cook for another minute, then transfer it to a plate. Repeat with the remaining batter. Serve neer dosa with a vegetable curry or chutney.'}

Chat Model Example in LangChain

In this example, we are using the OpenAI chat model, on which the popular ChatGPT platform is based. We also import the AIMessage, HumanMessage, and SystemMessage modules of LangChain.

Next, we compose the message to the LLM, which has two parts – a System Message and a Human Message –

  • The System Message defines the role of the Chat Model.
  • The Human Message is nothing but the user’s prompt to the LLM.

Finally, we pass this message to the Chat Model and get an AI Message response as output.

In [11]:

from langchain.chat_models import ChatOpenAI
from langchain.schema import AIMessage, HumanMessage, SystemMessage

chat = ChatOpenAI(temperature=0)

messages = [
    SystemMessage(
        content="You are a helpful assistant that explains scientific concepts in one line for children."
    ),
    HumanMessage(
        content="Can you please explain the concept of thermodynamics to a 10-year-old kid in one line?"
    ),
]

chat(messages)
Out[11]:
AIMessage(content='Thermodynamics is the study of how heat and energy move and change, like when you cook food or ride a bike.')

Chat Model Prompt Template Example in LangChain

Here we convert the prompts of the previous example into templates for generic reuse. This time we import the following modules – ChatPromptTemplate, SystemMessagePromptTemplate, AIMessagePromptTemplate, and HumanMessagePromptTemplate.

First, we create a system prompt template that has a {subject} variable. Next, the human prompt template is created with {topic} and {age} as variables. Using these two templates, the final chat prompt template is created.
For final execution, we pass the values of the respective variables to the chat prompt and get a reply back as an AI Message.
In [12]:
from langchain.prompts.chat import (
    ChatPromptTemplate,
    SystemMessagePromptTemplate,
    AIMessagePromptTemplate,
    HumanMessagePromptTemplate,
)

system_template = "You are a helpful assistant that explains {subject} concepts in one line."
system_message_prompt = SystemMessagePromptTemplate.from_template(system_template)

human_template = "Can you please explain the concept of {topic} to a {age}-year-old in one line?"
human_message_prompt = HumanMessagePromptTemplate.from_template(human_template)

chat_prompt = ChatPromptTemplate.from_messages([system_message_prompt, human_message_prompt])

chat(chat_prompt.format_prompt(subject="Geography", topic="Rain Forest", age="10").to_messages())
Out[12]:
AIMessage(content='A rainforest is a dense and diverse ecosystem found in tropical regions, characterized by heavy rainfall and a wide variety of plant and animal species.')
