Handling duplicate data is a common task in data processing and cleaning. Pandas provides a convenient method called drop_duplicates()
to remove duplicate rows from a DataFrame. This method can be customized to drop duplicates based on specific columns, keep particular occurrences, or drop all duplicates.
The simplest way to use drop_duplicates()
is to call it without any parameters, which will drop all duplicate rows, keeping the first occurrence.
import pandas as pd
data = {'Name': ['Alice', 'Bob', 'Alice', 'Charlie'], 'Age': [25, 30, 25, 35]}
df = pd.DataFrame(data)
# Drop duplicates
df_unique = df.drop_duplicates()
print(df_unique)
In this example, the duplicate row for 'Alice' is removed, keeping only the first occurrence. The drop_duplicates()
method by default keeps the first occurrence of each duplicate row.
Console Output:
Name Age 0 Alice 25 1 Bob 30 3 Charlie 35
You can drop duplicates based on specific columns by passing the column names to the subset
parameter.
import pandas as pd
data = {'Name': ['Alice', 'Bob', 'Alice', 'Charlie'], 'Age': [25, 30, 26, 35]}
df = pd.DataFrame(data)
# Drop duplicates based on 'Name'
df_unique_name = df.drop_duplicates(subset=['Name'])
print(df_unique_name)
In this example, duplicates are removed based on the 'Name' column. Even though the ages differ, the second occurrence of 'Alice' is considered a duplicate and is removed.
Console Output:
Name Age 0 Alice 25 1 Bob 30 3 Charlie 35
You can specify whether to keep the first or last occurrence of each duplicate row using the keep
parameter.
import pandas as pd
data = {'Name': ['Alice', 'Bob', 'Alice', 'Charlie'], 'Age': [25, 30, 25, 35]}
df = pd.DataFrame(data)
# Drop duplicates, keep the last occurrence
df_keep_last = df.drop_duplicates(keep='last')
print(df_keep_last)
In this example, the method keeps the last occurrence of each duplicate row. The first occurrence of 'Alice' is removed, keeping the second one.
Console Output:
Name Age 1 Bob 30 2 Alice 25 3 Charlie 35
To drop all duplicates and not keep any occurrence, set the keep
parameter to False
.
import pandas as pd
data = {'Name': ['Alice', 'Bob', 'Alice', 'Charlie'], 'Age': [25, 30, 25, 35]}
df = pd.DataFrame(data)
# Drop all duplicates
df_no_duplicates = df.drop_duplicates(keep=False)
print(df_no_duplicates)
In this case, all duplicate occurrences of 'Alice' are removed, leaving only unique rows in the DataFrame.
Console Output:
Name Age 1 Bob 30 3 Charlie 35
After dropping duplicates, the index might not be continuous. You can reset the index using the reset_index()
method.
import pandas as pd
data = {'Name': ['Alice', 'Bob', 'Alice', 'Charlie'], 'Age': [25, 30, 25, 35]}
df = pd.DataFrame(data)
# Drop duplicates and reset index
df_unique_reset = df.drop_duplicates().reset_index(drop=True)
print(df_unique_reset)
This example demonstrates how to reset the index after dropping duplicates, resulting in a DataFrame with a continuous index.
Console Output:
Name Age 0 Alice 25 1 Bob 30 2 Charlie 35
Newsletter
Subscribe to our newsletter for weekly updates and promotions.
Wiki E-Learning
E-LearningComputer Science and EngineeringMathematicsNatural SciencesSocial SciencesBusiness and ManagementHumanitiesHealth and MedicineEngineeringWiki E-Learning
E-LearningComputer Science and EngineeringMathematicsNatural SciencesSocial SciencesBusiness and ManagementHumanitiesHealth and MedicineEngineeringWiki E-Learning
E-LearningComputer Science and EngineeringMathematicsNatural SciencesSocial SciencesBusiness and ManagementHumanitiesHealth and MedicineEngineeringWiki E-Learning
E-LearningComputer Science and EngineeringMathematicsNatural SciencesSocial SciencesBusiness and ManagementHumanitiesHealth and MedicineEngineeringWiki E-Learning
E-LearningComputer Science and EngineeringMathematicsNatural SciencesSocial SciencesBusiness and ManagementHumanitiesHealth and MedicineEngineeringWiki E-Learning
E-LearningComputer Science and EngineeringMathematicsNatural SciencesSocial SciencesBusiness and ManagementHumanitiesHealth and MedicineEngineeringWiki E-Learning
E-LearningComputer Science and EngineeringMathematicsNatural SciencesSocial SciencesBusiness and ManagementHumanitiesHealth and MedicineEngineeringWiki E-Learning
E-LearningComputer Science and EngineeringMathematicsNatural SciencesSocial SciencesBusiness and ManagementHumanitiesHealth and MedicineEngineeringWiki E-Learning
E-LearningComputer Science and EngineeringMathematicsNatural SciencesSocial SciencesBusiness and ManagementHumanitiesHealth and MedicineEngineeringWiki E-Learning
E-LearningComputer Science and EngineeringMathematicsNatural SciencesSocial SciencesBusiness and ManagementHumanitiesHealth and MedicineEngineeringWikiCode
Programming LanguagesWeb DevelopmentMobile App DevelopmentData Science and Machine LearningDatabase ManagementDevOps and Cloud ComputingSoftware EngineeringCybersecurityGame DevelopmentWikiCode
Programming LanguagesWeb DevelopmentMobile App DevelopmentData Science and Machine LearningDatabase ManagementDevOps and Cloud ComputingSoftware EngineeringCybersecurityGame DevelopmentWikiCode
Programming LanguagesWeb DevelopmentMobile App DevelopmentData Science and Machine LearningDatabase ManagementDevOps and Cloud ComputingSoftware EngineeringCybersecurityGame DevelopmentWikiCode
Programming LanguagesWeb DevelopmentMobile App DevelopmentData Science and Machine LearningDatabase ManagementDevOps and Cloud ComputingSoftware EngineeringCybersecurityGame DevelopmentWikiCode
Programming LanguagesWeb DevelopmentMobile App DevelopmentData Science and Machine LearningDatabase ManagementDevOps and Cloud ComputingSoftware EngineeringCybersecurityGame DevelopmentWikiCode
Programming LanguagesWeb DevelopmentMobile App DevelopmentData Science and Machine LearningDatabase ManagementDevOps and Cloud ComputingSoftware EngineeringCybersecurityGame DevelopmentWiki News
World NewsPolitics NewsBusiness NewsTechnology NewsHealth NewsScience NewsSports NewsEntertainment NewsEducation NewsWiki News
World NewsPolitics NewsBusiness NewsTechnology NewsHealth NewsScience NewsSports NewsEntertainment NewsEducation NewsWiki News
World NewsPolitics NewsBusiness NewsTechnology NewsHealth NewsScience NewsSports NewsEntertainment NewsEducation NewsWiki News
World NewsPolitics NewsBusiness NewsTechnology NewsHealth NewsScience NewsSports NewsEntertainment NewsEducation NewsWiki News
World NewsPolitics NewsBusiness NewsTechnology NewsHealth NewsScience NewsSports NewsEntertainment NewsEducation NewsWiki News
World NewsPolitics NewsBusiness NewsTechnology NewsHealth NewsScience NewsSports NewsEntertainment NewsEducation NewsWiki Tools
JPEG/PNG Size ReductionPDF Size CompressionPDF Password RemoverSign PDFPower Point to PDFPDF to Power PointJPEG to PDF ConverterPDF to JPEG ConverterWord to PDF ConverterWiki Tools
JPEG/PNG Size ReductionPDF Size CompressionPDF Password RemoverSign PDFPower Point to PDFPDF to Power PointJPEG to PDF ConverterPDF to JPEG ConverterWord to PDF ConverterWiki Tools
JPEG/PNG Size ReductionPDF Size CompressionPDF Password RemoverSign PDFPower Point to PDFPDF to Power PointJPEG to PDF ConverterPDF to JPEG ConverterWord to PDF ConverterWiki Tools
JPEG/PNG Size ReductionPDF Size CompressionPDF Password RemoverSign PDFPower Point to PDFPDF to Power PointJPEG to PDF ConverterPDF to JPEG ConverterWord to PDF ConverterWiki Tools
JPEG/PNG Size ReductionPDF Size CompressionPDF Password RemoverSign PDFPower Point to PDFPDF to Power PointJPEG to PDF ConverterPDF to JPEG ConverterWord to PDF ConverterWiki Tools
JPEG/PNG Size ReductionPDF Size CompressionPDF Password RemoverSign PDFPower Point to PDFPDF to Power PointJPEG to PDF ConverterPDF to JPEG ConverterWord to PDF ConverterCompany
About usCareersPressCompany
About usCareersPressCompany
About usCareersPressLegal
TermsPrivacyContactAds PoliciesLegal
TermsPrivacyContactAds PoliciesLegal
TermsPrivacyContactAds PoliciesCompany
About usCareersPressCompany
About usCareersPressCompany
About usCareersPressLegal
TermsPrivacyContactAds PoliciesLegal
TermsPrivacyContactAds PoliciesLegal
TermsPrivacyContactAds PoliciesLegal
TermsPrivacyContactAds PoliciesAds Policies