## Introduction

In earlier tutorials we saw how useful pandas DataFrames are in data science and machine learning projects. In this tutorial, we look at three more pandas operations: **copy(), cut() and query()**. For each function we cover the syntax and then work through examples of how it is used.

### Importing Pandas Library

We start by importing the pandas library, along with NumPy, which some of the later examples use.

```
import pandas as pd
import numpy as np
```

**Pandas Copy : copy()**

The pandas copy() function creates a copy of an object's data and indices.

### Syntax

**DataFrame.copy(deep=True)**

**deep : bool** : Controls whether a deep or a shallow copy of the calling object is made. The default value of the **deep** parameter is True.

If set to **True**, a new object is created with a copy of the calling object's data and indices. Modifications to the data or indices of the copy are not reflected in the original object, and vice versa.

If set to **False**, a new object is created without copying the calling object's data or index; only references to them are copied. Changes to the data of the original are then reflected in the shallow copy, and vice versa. This sharing depends on the pandas version: with Copy-on-Write enabled (the default from pandas 3.0, according to the pandas documentation), a write to either object triggers a copy, so changes no longer propagate.

The function returns a new object of the same type as the caller (a Series or DataFrame).

### Example 1: Simple example of Pandas Copy Function

Using the copy() function, we can create a copy of a Series object.

```
s = pd.Series([7, 9], index=["p", "q"])
s_copy = s.copy()
s_copy
# p    7
# q    9
# dtype: int64
```

### Example 2: Showing difference in Pandas Shallow and Deep copy

In this example, we will look at the difference between shallow and deep copy created using the copy() function of pandas.

```
s = pd.Series([7, 9], index=["p", "q"])
```

To create a deep copy, call copy() with no arguments, since deep=True is the default. To create a shallow copy, pass deep=False.

```
deep = s.copy()
deep
# p    7
# q    9
# dtype: int64
```

```
shallow = s.copy(deep=False)
shallow
# p    7
# q    9
# dtype: int64
```

A shallow copy is still a new Python object, so an identity check against the original returns **False**.

```
s is shallow
```

False

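What a shallow copy shares with the original is the underlying data and index: they are referenced, not copied. In the pandas version used for this tutorial, the check below therefore returns **True**. Newer pandas versions may return new wrapper objects from `.values` and `.index`, in which case the same check can return False even though the memory is still shared.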

```
s.values is shallow.values and s.index is shallow.index
```

True

A deep copy is also a distinct object from the original. In addition, its values and index are copied rather than shared, so both checks below return **False**.

```
s is deep
```

False

```
s.values is deep.values or s.index is deep.index
```

False
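Identity checks with `is` depend on how a given pandas version wraps its internal arrays, so they can be a fragile test of whether data is shared. The following is a minimal sketch of a more direct check, assuming NumPy's `np.shares_memory` is an acceptable substitute for the `is` comparisons above. The variable names `original`, `shallow_copy`, and `deep_copy` are illustrative and not part of the original example.

```python
import numpy as np
import pandas as pd

original = pd.Series([7, 9], index=["p", "q"])
shallow_copy = original.copy(deep=False)
deep_copy = original.copy()  # deep=True is the default

# np.shares_memory reports whether two arrays overlap in memory,
# regardless of whether they are the same Python object.
shares_with_shallow = np.shares_memory(original.to_numpy(), shallow_copy.to_numpy())
shares_with_deep = np.shares_memory(original.to_numpy(), deep_copy.to_numpy())

print(shares_with_shallow)  # True: the shallow copy reuses the original buffer
print(shares_with_deep)     # False: the deep copy has its own buffer
```

This only inspects the data buffer, not the index, which is usually the part that matters when reasoning about accidental modification.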

### Example 3: Main difference in Pandas Shallow and Deep Copy

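The practical difference shows up when the data is modified. A shallow copy shares its data with the original, so a change made through either one is visible in the other. A deep copy has its own data, so it is unaffected by changes to the original. The outputs below assume a pandas version without Copy-on-Write; with Copy-on-Write enabled, the shallow copy would no longer reflect these changes.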

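```
# Use .iloc for positional access; plain integer keys on a
# label-indexed Series are deprecated in recent pandas versions.
s.iloc[0] = 3
shallow.iloc[1] = 4
```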

```
s
# p    3
# q    4
# dtype: int64
```

```
shallow
# p    3
# q    4
# dtype: int64
```

```
deep
# p    7
# q    9
# dtype: int64
```
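One common use of a deep copy is to keep a function from modifying the DataFrame that was passed to it. The sketch below illustrates that pattern; the helper `add_total` and the column names `price`, `qty`, and `total` are hypothetical and chosen only for illustration.

```python
import pandas as pd


def add_total(frame: pd.DataFrame) -> pd.DataFrame:
    """Return a new DataFrame with a 'total' column; the input is left unchanged."""
    result = frame.copy()  # deep copy: independent data and index
    result["total"] = result["price"] * result["qty"]
    return result


orders = pd.DataFrame({"price": [2.5, 4.0], "qty": [2, 3]})
with_total = add_total(orders)

print(list(orders.columns))          # ['price', 'qty']
print(with_total["total"].tolist())  # [5.0, 12.0]
```

Because the new column is added to the copy, the caller's `orders` DataFrame keeps its original two columns.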

**Pandas Cut : cut()**

The pandas cut() function bins values into discrete intervals. It is useful when we want to segment and sort data values into bins, for example to turn a continuous variable into a categorical one.

### Syntax

**pandas.cut(x, bins, right=True, labels=None, retbins=False, precision=3, include_lowest=False, duplicates='raise')**

**x : array-like** – The one-dimensional array to be binned.

**bins : int, sequence of scalars, or IntervalIndex** – The criteria to bin by. An int gives the number of equal-width bins over the range of x; a sequence of scalars gives the bin edges, which allows non-uniform widths.

**right : bool** – Whether the bins include their right edge. With the default of True, the bins [1, 2, 3] produce the intervals (1, 2] and (2, 3].

**labels : array or False** – The labels for the returned bins. If False, only the integer indicators of the bins are returned.

**retbins : bool** – Whether the bins should be returned along with the binned data.

**precision : int** – The precision at which to store and display the bin labels.

**include_lowest : bool** – Whether the first interval should be left-inclusive.

**duplicates : {default 'raise', 'drop'}, optional** – If bin edges are not unique, either raise a ValueError or drop the non-unique edges.

The function returns an array-like object giving the bin of each value of x. If retbins=True, it also returns the bins that were computed or specified.
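The parameters above are easier to follow with explicit bin edges and labels. The following is a small sketch using made-up age data; the Series `ages` and the label names are assumptions for illustration only.

```python
import pandas as pd

ages = pd.Series([5, 17, 18, 42, 65, 80])  # illustrative data

# Explicit edges with the default right=True: (0, 17], (17, 64], (64, 120]
groups = pd.cut(ages, bins=[0, 17, 64, 120], labels=["child", "adult", "senior"])
print(groups.tolist())
# ['child', 'child', 'adult', 'adult', 'senior', 'senior']

# right=False closes each interval on the left instead: [0, 18), [18, 65), [65, 120)
groups_left = pd.cut(
    ages, bins=[0, 18, 65, 120], right=False, labels=["child", "adult", "senior"]
)
print(groups_left.tolist())
# ['child', 'child', 'adult', 'adult', 'senior', 'senior']
```

Both calls produce the same grouping here; the difference is which side of each interval is closed, which decides where a value that falls exactly on an edge ends up.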

### Example 1: Simple example of Pandas Cut Function

Segmenting the values into three equal-width bins. Here the range of the array is divided into three bins of equal width, and the bin assigned to each value is displayed as output.

```
pd.cut(np.array([2, 8, 3, 9, 6, 7]), 3)
# [(1.993, 4.333], (6.667, 9.0], (1.993, 4.333], (6.667, 9.0], (4.333, 6.667], (6.667, 9.0]]
# Categories (3, interval[float64]): [(1.993, 4.333] < (4.333, 6.667] < (6.667, 9.0]]
```
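The lowest edge in the output is 1.993 rather than 2 because, when bins is an int, pandas widens the range slightly (by 0.1% of the range, according to the pandas documentation) so that the minimum value falls inside the first half-open interval. A minimal sketch that makes the computed edges visible with retbins=True; the names `values`, `binned`, and `edges` are illustrative.

```python
import numpy as np
import pandas as pd

values = np.array([2, 8, 3, 9, 6, 7])

# retbins=True returns a tuple: (binned values, array of bin edges)
binned, edges = pd.cut(values, 3, retbins=True)

print(edges)                  # four edges delimit three bins
print(binned.codes.tolist())  # position of each value's bin: [0, 2, 0, 2, 1, 2]
```

The codes line up with the intervals shown above: 2 and 3 fall in the first bin, 6 in the second, and 7, 8 and 9 in the third.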

### Example 2: Using series as an input

cut() also accepts a Series. The result is a Series with the same index and a categorical dtype.

```
s = pd.Series(np.array([1, 3, 5, 7, 9]), index=['p', 'q', 'r', 's', 't'])
pd.cut(s, 3)
# p    (0.992, 3.667]
# q    (0.992, 3.667]
# r    (3.667, 6.333]
# s      (6.333, 9.0]
# t      (6.333, 9.0]
# dtype: category
# Categories (3, interval[float64]): [(0.992, 3.667] < (3.667, 6.333] < (6.333, 9.0]]
```
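Binned Series are often summarised by counting how many values land in each bin. The sketch below reuses the same data with illustrative labels ("low", "mid", "high" are assumptions, not part of the original example) and counts the members of each bin.

```python
import numpy as np
import pandas as pd

s = pd.Series(np.array([1, 3, 5, 7, 9]), index=["p", "q", "r", "s", "t"])

# Same three equal-width bins as above, with readable labels
binned = pd.cut(s, 3, labels=["low", "mid", "high"])
print(binned.index.tolist())  # ['p', 'q', 'r', 's', 't']

# sort=False keeps the bins in category order rather than by frequency
counts = binned.value_counts(sort=False)
print(counts.to_dict())       # {'low': 2, 'mid': 1, 'high': 2}
```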

**Pandas Query : query()**

The pandas query() function selects the rows of a DataFrame that satisfy a boolean expression over its columns.

### Syntax

**DataFrame.query(expr, inplace=False, \*\*kwargs)**

**expr : str** – The query string to evaluate.

**inplace : bool** – Whether to modify the DataFrame in place or return a modified copy.

**kwargs** – Additional keyword arguments. See the documentation for eval() for the keyword arguments that query() accepts.
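Two details of the syntax are worth seeing in code: a query string can refer to a Python variable by prefixing it with `@`, and inplace=True filters the DataFrame itself instead of returning a new one. A minimal sketch, where the variable name `threshold` is illustrative:

```python
import pandas as pd

df = pd.DataFrame({"A": range(2, 7), "B": range(20, 0, -4)})

threshold = 4  # ordinary Python variable, referenced in the query with '@'
above = df.query("A > @threshold")
print(above["A"].tolist())  # [5, 6]

# inplace=True filters df itself and returns None
df.query("B < 10", inplace=True)
print(df["B"].tolist())     # [8, 4]
```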

### Example 1: Simple example of Pandas Query Function

Here a DataFrame is created using the range() function.

```
df = pd.DataFrame({'A': range(2, 7),
                   'B': range(20, 0, -4),
                   'C': range(20, 10, -2)})
```

```
df
```

|   | A | B | C |
|---|---|---|---|
| 0 | 2 | 20 | 20 |
| 1 | 3 | 16 | 18 |
| 2 | 4 | 12 | 16 |
| 3 | 5 | 8 | 14 |
| 4 | 6 | 4 | 12 |

Only the row at index 4 has a greater value in column 'A' than in column 'B' (6 > 4), so it is the only row returned.

```
df.query('A > B')
```

|   | A | B | C |
|---|---|---|---|
| 4 | 6 | 4 | 12 |

### Example 2: Checking equal condition

Only the row at index 0 satisfies the condition (B and C are both 20), so it is the only row returned.

```
df.query('B == C')
```

|   | A | B | C |
|---|---|---|---|
| 0 | 2 | 20 | 20 |
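Conditions can also be combined inside a single query string with `and` / `or`. The sketch below rebuilds the same DataFrame and compares a combined query with the equivalent boolean-mask selection; the names `by_query` and `by_mask` are illustrative.

```python
import pandas as pd

df = pd.DataFrame({"A": range(2, 7),
                   "B": range(20, 0, -4),
                   "C": range(20, 10, -2)})

# Rows where A is at least 4 and B is smaller than C
by_query = df.query("A >= 4 and B < C")

# The same selection written with boolean masks
by_mask = df[(df["A"] >= 4) & (df["B"] < df["C"])]

print(by_query.index.tolist())   # [2, 3, 4]
print(by_query.equals(by_mask))  # True
```

The two forms select the same rows; the query string is mostly a readability choice when several columns are involved.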

## Conclusion

We have reached the end of this article. In it, we learned about three pandas functions: **copy(), cut() and query()**. These functions are helpful when applying operations to a pandas DataFrame or Series. We looked at the syntax of each function and worked through examples that show how it is used.


*Reference –* https://pandas.pydata.org/docs/