How to do a vLookup in Python using Pandas
vLookup is a operation in Excel that searches for a value in a table and returns related data from another column. In Python, we can achieve this by merging two data tables using the pandas.merge() function. This method allows us to combine data from two DataFrames based on a common column (or index), similar to how SQL joins work.
In this article, we are going to cover:
- What vLookup means in a data context
- How to merge DataFrames using different join types
- Examples of inner, left, right and outer joins using Pandas
Syntax
In Pandas, the merge() function is used to join two DataFrames. The syntax is:
dataframe.merge(dataframe1, dataframe2, how, on, copy, indicator, suffixes, validate)
Parameters:
- datafram1: dataframe object to be merged with.
- dataframe2: dataframe object to be merged.
- how: {left, right, inner, outer} specifies how merging will be done
- on: specifies column or index names used for performing join.
- suffixes: suffix used for overlapping columns.For exception use values (False, False).
- validate: If specified, checks the kind of merging.The type of merge could be (one-one, one-many, many-one, many-many).
Different Types of Join
- Inner Join: Returns only the rows with matching values in both DataFrames.
- Left Join: Returns all rows from the left DataFrame and matching rows from the right DataFrame. Non-matching rows in the right DataFrame are replaced with NaN.
- Right Join: Returns all rows from the right DataFrame and matching rows from the left DataFrame. Non-matching rows in the left DataFrame are replaced with NaN.
- Outer Join: Returns all rows from both DataFrames, filling NaN where there are no matches.
Setting Up the Data
We are goin to work on two csv files n this article:
- Student_data.csv â Contains student information.
- Course_enrolled.csv â Contains details of the courses students are enrolled in.
For demonstration, let's read these CSV files into Pandas DataFrames:
import pandas as pd
# read csv data
df1 = pd.read_csv('path_to_Student_data.csv')
df2 = pd.read_csv('path_to_Course_enrolled.csv')
print(df1)
print(df2)
Output:


Example 1: Inner Join
Inner join returns only the rows that have matching values in both DataFrames.
import pandas as pd
df1 = pd.read_csv('path_to_Student_data.csv')
df2 = pd.read_csv('path_to_Course_enrolled.csv')
inner_join = pd.merge(df1, df2, on='Name', how='inner')
print(inner_join)
Output:
Example 2: Left Join
Left join returns all rows from the first DataFrame and only matching rows from the second DataFrame.
import pandas as pd
df1 = pd.read_csv('path_to_Student_data.csv')
df2 = pd.read_csv('path_to_Course_enrolled.csv')
left_join = pd.merge(df1, df2, on='Name', how='left')
print(left_join)
Output:
Example 3: Right Join
Right join returns all rows from the second DataFrame and only matching rows from the first DataFrame.
import pandas as pd
df1 = pd.read_csv('path_to_Student_data.csv')
df2 = pd.read_csv('path_to_Course_enrolled.csv')
right_join = pd.merge(df1, df2, on='Name', how='right')
print(right_join)
Output:
Example 4: Outer Join
Outer join returns all rows from both DataFrames, filling in NaN for missing matches.
import pandas as pd
df1 = pd.read_csv('path_to_Student_data.csv')
df2 = pd.read_csv('path_to_Course_enrolled.csv')
outer_join = pd.merge(df1, df2, on='Name', how='outer')
print(outer_join)
Output: