Introduction

This tutorial covers part-of-speech (POS) tagging with the spaCy library for Python. We first explain what POS tagging is and why it is used, and then work through examples in spaCy.

What is POS Tagging?

Part-of-speech tagging, or POS tagging, is the process of labeling each word in a text with a particular part of speech, based on both the word's definition and its context. Put simply, POS tagging identifies each word as a noun, pronoun, verb, adjective, and so on.

Why POS tag is used

Many words can play more than one grammatical role depending on where they appear. For example, "book" is a noun in "read a book" but a verb in "book a flight". POS tagging identifies which role a word plays in a given sentence. This makes it useful for tasks such as sentence parsing, information retrieval, and sentiment analysis.

POS Tagging in Spacy Library

Spacy POS Tags List

spaCy assigns every token a coarse-grained POS tag from the following list:

POS	DESCRIPTION	EXAMPLES
ADJ	adjective	big, old, green, incomprehensible, first
ADP	adposition	in, to, during
ADV	adverb	very, tomorrow, down, where, there
AUX	auxiliary	is, has (done), will (do), should (do)
CONJ	conjunction	and, or, but
CCONJ	coordinating conjunction	and, or, but
DET	determiner	a, an, the
INTJ	interjection	psst, ouch, bravo, hello
NOUN	noun	girl, cat, tree, air, beauty
NUM	numeral	1, 2017, one, seventy-seven, IV, MMXIV
PART	particle	’s, not
PRON	pronoun	I, you, he, she, myself, themselves, somebody
PROPN	proper noun	Mary, John, London, NATO, HBO
PUNCT	punctuation	., (, ), ?
SCONJ	subordinating conjunction	if, while, that
SYM	symbol	$, %, §, ©, +, −, ×, ÷, =, :), 😝
VERB	verb	run, runs, running, eat, ate, eating
X	other	sfpksdpsxmsa
SPACE	space

Spacy POS Tagging Example

POS tagging in spaCy is straightforward, as the example below shows. We load a pipeline with spacy.load() and call it on a string to get a Doc object. Iterating over the Doc yields tokens, and each token exposes its coarse-grained tag as pos_, its fine-grained tag as tag_, and its dependency label as dep_. The spacy.explain() function returns a short description of a tag, which we print alongside the tags in the same loop.

In [1]:

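import spacy

# Load the small English pipeline (it must be installed first)
nlp = spacy.load("en_core_web_sm")
doc = nlp("Get busy living or get busy dying.")

# Header row; the number after each colon is a minimum column width
print(f"{'text':{8}} {'POS':{6}} {'TAG':{6}} {'Dep':{6}} {'POS explained':{20}} {'tag explained'} ")
for token in doc:
    # pos_ is the coarse-grained tag, tag_ the fine-grained tag, dep_ the dependency label
    print(f'{token.text:{8}} {token.pos_:{6}} {token.tag_:{6}} {token.dep_:{6}} {spacy.explain(token.pos_):{20}} {spacy.explain(token.tag_)}')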

[Out] :

text     POS    TAG    Dep    POS explained        tag explained 
Get      AUX    VB     ROOT   auxiliary            verb, base form
busy     ADJ    JJ     amod   adjective            adjective
living   NOUN   NN     dobj   noun                 noun, singular or mass
or       CCONJ  CC     cc     coordinating conjunction conjunction, coordinating
get      AUX    VB     conj   auxiliary            verb, base form
busy     ADJ    JJ     acomp  adjective            adjective
dying    VERB   VBG    xcomp  verb                 verb, gerund or present participle
.        PUNCT  .      punct  punctuation          punctuation mark, sentence closer
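
The "or" row in the output above spills out of its column because the number after the colon in each f-string field is a minimum width, not a limit: shorter values are padded with spaces, while longer ones are printed in full. The standalone sketch below reproduces this with two strings taken from that output. The [:20] slice at the end is just one possible way to force alignment and is not part of the original example.

```python
# The format spec ":20" pads to at least 20 characters but never truncates.
short = "auxiliary"
long_ = "coordinating conjunction"  # 24 characters, wider than the column

padded = f"{short:20}|"         # padded with spaces up to 20 characters
overflow = f"{long_:20}|"       # printed in full, so the column shifts right
clipped = f"{long_[:20]:20}|"   # slicing first keeps the column at 20 characters

print(padded)
print(overflow)
print(clipped)
```

Clipping keeps the table aligned at the cost of cutting off the description, so widening the column is the other obvious option.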

Fine Grained POS Tag

spaCy also provides a fine-grained tag that places each token in a more specific sub-category. For example, an adjective can be tagged JJ (adjective), JJR (comparative adjective), JJS (superlative adjective), or AFX (affix). The list of fine-grained tags that a pipeline's tagger can assign is available from nlp.pipe_labels['tagger'], as shown in the example below.

In [2]:

import spacy

nlp = spacy.load("en_core_web_sm")
tag_lst = nlp.pipe_labels['tagger']

print(len(tag_lst))
print(tag_lst)

[Out] :

50
['$', "''", ',', '-LRB-', '-RRB-', '.', ':', 'ADD', 'AFX', 'CC', 'CD', 'DT', 'EX', 'FW', 'HYPH', 'IN', 'JJ', 'JJR', 'JJS', 'LS', 'MD', 'NFP', 'NN', 'NNP', 'NNPS', 'NNS', 'PDT', 'POS', 'PRP', 'PRP$', 'RB', 'RBR', 'RBS', 'RP', 'SYM', 'TO', 'UH', 'VB', 'VBD', 'VBG', 'VBN', 'VBP', 'VBZ', 'WDT', 'WP', 'WP$', 'WRB', 'XX', '_SP', '``']
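
The labels in this list are terse, so it can help to print a description next to each one. The sketch below does this for a handful of tags copied from the output above, using spacy.explain(), which looks terms up in spaCy's built-in glossary and does not need a loaded pipeline. The sample_tags list is a hand-picked subset chosen for illustration, and the exact wording of the descriptions may differ between spaCy versions.

```python
import spacy

# A hand-picked subset of the fine-grained tags listed above (illustrative only)
sample_tags = ["AFX", "JJR", "JJS", "MD", "NNP", "VBG"]

# spacy.explain() returns a description string, or None for an unknown term
descriptions = {tag: spacy.explain(tag) for tag in sample_tags}

for tag, description in descriptions.items():
    print(f"{tag:6} {description}")
```

To describe every label of a loaded pipeline, iterate over nlp.pipe_labels['tagger'] instead of sample_tags.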

Fine Grained POS Tag list

The table below lists each coarse-grained POS tag with its description, the corresponding fine-grained tag with its description, the associated morphological features, and an example.

	POS	POS_Description	Fine-grained Tag	Description	Morphology	EXAMPLE
0	ADJ	adjective	AFX	affix	Hyph=yes	The Flintstones were a pre-historic family.
1	ADJ	adjective	JJ	adjective	Degree=pos	This is a good sentence.
2	ADJ	adjective	JJR	adjective, comparative	Degree=comp	This is a better sentence.
3	ADJ	adjective	JJS	adjective, superlative	Degree=sup	This is the best sentence.
4	ADJ	adjective	PDT	predeterminer	AdjType=pdt PronType=prn	Waking up is half the battle.
5	ADJ	adjective	PRP$	pronoun, possessive	PronType=prs Poss=yes	His arm hurts.
6	ADJ	adjective	WDT	wh-determiner	PronType=int rel	It’s blue, which is odd.
7	ADJ	adjective	WP$	wh-pronoun, possessive	Poss=yes PronType=int rel	We don’t know whose it is.
8	ADP	adposition	IN	conjunction, subordinating or preposition		It arrived in a box.
9	ADV	adverb	EX	existential there	AdvType=ex	There is cake.
10	ADV	adverb	RB	adverb	Degree=pos	He ran quickly.
11	ADV	adverb	RBR	adverb, comparative	Degree=comp	He ran quicker.
12	ADV	adverb	RBS	adverb, superlative	Degree=sup	He ran fastest.
13	ADV	adverb	WRB	wh-adverb	PronType=int rel	When was that?
14	CONJ	conjunction	CC	conjunction, coordinating	ConjType=coor	The balloon popped and everyone jumped.
15	DET	determiner	DT	determiner		This is a sentence.
16	INTJ	interjection	UH	interjection		Um, I don’t know.
17	NOUN	noun	NN	noun, singular or mass	Number=sing	This is a sentence.
18	NOUN	noun	NNS	noun, plural	Number=plur	These are words.
19	NOUN	noun	WP	wh-pronoun, personal	PronType=int rel	Who was that?
20	NUM	numeral	CD	cardinal number	NumType=card	I want three things.
21	PART	particle	POS	possessive ending	Poss=yes	Fred’s name is short.
22	PART	particle	RP	adverb, particle		Put it back!
23	PART	particle	TO	infinitival to	PartType=inf VerbForm=inf	I want to go.
24	PRON	pronoun	PRP	pronoun, personal	PronType=prs	I want you to go.
25	PROPN	proper noun	NNP	noun, proper singular	NounType=prop Number=sing	Kilroy was here.
26	PROPN	proper noun	NNPS	noun, proper plural	NounType=prop Number=plur	The Flintstones were a pre-historic family.
27	PUNCT	punctuation	-LRB-	left round bracket	PunctType=brck PunctSide=ini	rounded brackets (also called parentheses)
28	PUNCT	punctuation	-RRB-	right round bracket	PunctType=brck PunctSide=fin	rounded brackets (also called parentheses)
29	PUNCT	punctuation	,	punctuation mark, comma	PunctType=comm	I, me and myself.
30	PUNCT	punctuation	:	punctuation mark, colon or ellipsis		colon : is a punctuation mark
31	PUNCT	punctuation	.	punctuation mark, sentence closer	PunctType=peri	Punctuation at the end of sentence.
32	PUNCT	punctuation	''	closing quotation mark	PunctType=quot PunctSide=fin	"machine learning"
33	PUNCT	punctuation	""	closing quotation mark	PunctType=quot PunctSide=fin	""
34	PUNCT	punctuation	``	opening quotation mark	PunctType=quot PunctSide=ini	"machine learning"
35	PUNCT	punctuation	HYPH	punctuation mark, hyphen	PunctType=dash	ML site - machinelearningknowledge.ai
36	PUNCT	punctuation	LS	list item marker	NumType=ord
37	PUNCT	punctuation	NFP	superfluous punctuation
38	SYM	symbol	#	symbol, number sign	SymType=numbersign	This is hash# symbol.
39	SYM	symbol	$	symbol, currency	SymType=currency	Dollar $ is the name of more than 20 curre…
40	SYM	symbol	SYM	symbol		this is a symbol $
41	VERB	verb	BES	auxiliary “be”		Let it be.
42	VERB	verb	HVS	forms of “have”		I’ve seen the Queen
43	VERB	verb	MD	verb, modal auxiliary	VerbType=mod	This could work.
44	VERB	verb	VB	verb, base form	VerbForm=inf	I want to go.
45	VERB	verb	VBD	verb, past tense	VerbForm=fin Tense=past	This was a sentence.
46	VERB	verb	VBG	verb, gerund or present participle	VerbForm=part Tense=pres Aspect=prog	I am going.
47	VERB	verb	VBN	verb, past participle	VerbForm=part Tense=past Aspect=perf	The treasure was lost.
48	VERB	verb	VBP	verb, non-3rd person singular present	VerbForm=fin Tense=pres	I want to go.
49	VERB	verb	VBZ	verb, 3rd person singular present	VerbForm=fin Tense=pres Number=sing Person=3	He wants to go.
50	X	other	ADD	email		[email protected]
51	X	other	FW	foreign word	Foreign=yes	Hello in spanish is Hola
52	X	other	GW	additional word in multi-word expression
53	X	other	XX	unknown
54	SPACE	space	_SP	space
55		NIL	missing tag
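
Note that the table has 56 rows, while the en_core_web_sm tagger in the earlier output reported only 50 labels. A quick set comparison shows which table entries that pipeline does not use. The sketch below hardcodes both tag lists, copied from the table and from the nlp.pipe_labels['tagger'] output shown earlier. It assumes the three quotation-mark rows of the table stand for the tags '', "" and ``, and its result applies only to the pipeline version that produced that output.

```python
# Fine-grained tags copied from the table above
# (assumption: the three quotation-mark rows are '', "" and ``)
table_tags = {
    "AFX", "JJ", "JJR", "JJS", "PDT", "PRP$", "WDT", "WP$", "IN", "EX",
    "RB", "RBR", "RBS", "WRB", "CC", "DT", "UH", "NN", "NNS", "WP", "CD",
    "POS", "RP", "TO", "PRP", "NNP", "NNPS", "-LRB-", "-RRB-", ",", ":",
    ".", "''", '""', "``", "HYPH", "LS", "NFP", "#", "$", "SYM", "BES",
    "HVS", "MD", "VB", "VBD", "VBG", "VBN", "VBP", "VBZ", "ADD", "FW",
    "GW", "XX", "_SP", "NIL",
}

# Labels copied from the nlp.pipe_labels['tagger'] output shown earlier
model_tags = {
    "$", "''", ",", "-LRB-", "-RRB-", ".", ":", "ADD", "AFX", "CC", "CD",
    "DT", "EX", "FW", "HYPH", "IN", "JJ", "JJR", "JJS", "LS", "MD", "NFP",
    "NN", "NNP", "NNPS", "NNS", "PDT", "POS", "PRP", "PRP$", "RB", "RBR",
    "RBS", "RP", "SYM", "TO", "UH", "VB", "VBD", "VBG", "VBN", "VBP",
    "VBZ", "WDT", "WP", "WP$", "WRB", "XX", "_SP", "``",
}

# Tags documented in the table but absent from the pipeline, and vice versa
only_in_table = sorted(table_tags - model_tags)
only_in_model = sorted(model_tags - table_tags)

print(only_in_table)
print(only_in_model)
```

With the lists as written, this reports six table-only tags ("", #, BES, GW, HVS and NIL) and no pipeline labels that are missing from the table.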
