Python Remove List Duplicates: Fast and Simple Methods
Introduction: Why Remove Duplicates?
Python lists often contain repeated elements, especially when aggregating or merging data. Removing duplicates is important for:
- Data cleaning
- Unique value extraction
- Optimizing lookups
- Deduplicating logs, emails, IDs
Python offers several elegant, fast ways to remove duplicates from a list, whether or not you need to preserve order.
In this guide, you'll learn:
- How to remove duplicates with set, dict, and loops
- How to preserve or ignore order depending on your use case
- How to use collections, pandas, and list comprehension
- Best practices for handling duplicates
Quickest Way: Use set() to Remove Duplicates
numbers = [1, 2, 2, 3, 4, 1]
unique = list(set(numbers))
print(unique) # [1, 2, 3, 4] (order not preserved)
Removes duplicates fast, but the original order may be lost. All items must be hashable.
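If a predictable order matters more than the original order, and the items are sortable, one option is to sort the set. A minimal sketch reusing the numbers list from above; the result is in ascending order, which is not necessarily the input order:

```python
numbers = [1, 2, 2, 3, 4, 1]

# set() drops duplicates; sorted() then returns a list in ascending order.
unique_sorted = sorted(set(numbers))
print(unique_sorted)  # [1, 2, 3, 4]

# set() requires hashable items, so a list of lists raises TypeError.
try:
    set([[1, 2], [1, 2]])
except TypeError as exc:
    print("unhashable:", exc)
```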
Preserve Order: Use dict.fromkeys()
names = ["Alice", "Bob", "Alice", "Eve"]
unique = list(dict.fromkeys(names))
print(unique) # ['Alice', 'Bob', 'Eve']
Python 3.7+ guarantees that dicts maintain insertion order, so the first occurrence of each item keeps its position.
Loop-Based Approach (Manual Deduplication)
data = ["x", "y", "x", "z", "y"]
unique = []
for item in data:
    if item not in unique:
        unique.append(item)
print(unique) # ['x', 'y', 'z']
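Works in all Python versions and preserves order.
Note: slower on large lists, because each "item not in unique" check scans the whole result list, which makes the loop quadratic in the worst case.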
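The repeated lookups can be avoided by tracking seen items in a set alongside the result list. A sketch, assuming all items are hashable; dedupe_keep_order is an illustrative name, not a standard library function:

```python
def dedupe_keep_order(items):
    """Return a new list with duplicates removed, keeping first occurrences."""
    seen = set()      # membership checks on a set are O(1) on average
    unique = []
    for item in items:
        if item not in seen:
            seen.add(item)
            unique.append(item)
    return unique


data = ["x", "y", "x", "z", "y"]
print(dedupe_keep_order(data))  # ['x', 'y', 'z']
```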
Using collections.OrderedDict (Python <3.7)
from collections import OrderedDict
items = [3, 2, 3, 1, 2]
unique = list(OrderedDict.fromkeys(items))
print(unique) # [3, 2, 1]
Preserves order on Python versions before 3.7, where plain dicts did not guarantee insertion order.
Use pandas for List-Like Series
import pandas as pd
values = [5, 3, 5, 2, 1, 2]
unique = pd.Series(values).drop_duplicates().tolist()
print(unique) # [5, 3, 2, 1]
Useful in data science and analytics workflows.
List Comprehension + Set for Fast Deduplication
seen = set()
unique = [x for x in [1, 1, 2, 3, 2] if not (x in seen or seen.add(x))]
print(unique) # [1, 2, 3]
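Short and efficient, and preserves order.
It relies on a side effect inside the comprehension: set.add() returns None, so for a new item "x in seen" is False, seen.add(x) records it and returns None, and the negated condition is True. For an item already seen, the condition is False and the item is skipped.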
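The same seen-set idea extends to deduplicating by a derived value, for example ignoring letter case. A sketch under the assumption that the key function returns hashable values; unique_by and its key parameter are hypothetical names chosen for this example:

```python
def unique_by(items, key=lambda x: x):
    """Yield items whose key has not been seen before, in original order."""
    seen = set()
    for item in items:
        k = key(item)
        if k not in seen:
            seen.add(k)
            yield item


emails = ["Ann@example.com", "bob@example.com", "ann@example.com"]
print(list(unique_by(emails, key=str.lower)))
# ['Ann@example.com', 'bob@example.com']
```

Writing it as a generator with an explicit loop avoids the side-effect trick and lets callers stop early without processing the whole input.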
Remove Duplicates from List of Tuples
pairs = [(1, 2), (2, 3), (1, 2)]
unique = list(dict.fromkeys(pairs))
print(unique) # [(1, 2), (2, 3)]
Works because tuples are hashable, as long as every element inside them is hashable too.
Remove Duplicates from List of Dictionaries
import json
people = [{"name": "Alice"}, {"name": "Bob"}, {"name": "Alice"}]
seen = set()
unique = []
for p in people:
    j = json.dumps(p, sort_keys=True)
    if j not in seen:
        unique.append(p)
        seen.add(j)
print(unique)
Use json.dumps() with sort_keys=True to turn each dict into a hashable string. This assumes the values are JSON-serializable.
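The role of sort_keys=True is easier to see with dicts that hold the same data in a different key order. A sketch wrapping the loop above in a function; dedupe_dicts is a hypothetical helper name, and the sample records are made up for illustration:

```python
import json


def dedupe_dicts(records):
    """Drop dicts whose JSON form (with sorted keys) was already seen."""
    seen = set()
    unique = []
    for record in records:
        # sort_keys=True makes {"a": 1, "b": 2} and {"b": 2, "a": 1}
        # serialize to the same string.
        fingerprint = json.dumps(record, sort_keys=True)
        if fingerprint not in seen:
            seen.add(fingerprint)
            unique.append(record)
    return unique


people = [
    {"name": "Alice", "age": 30},
    {"age": 30, "name": "Alice"},   # same content, different key order
    {"name": "Bob", "age": 25},
]
print(dedupe_dicts(people))
# [{'name': 'Alice', 'age': 30}, {'name': 'Bob', 'age': 25}]
```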
Best Practices
Do This | Avoid This |
---|---|
Use set() for fast deduplication | Expecting order preservation with set() |
Use dict.fromkeys() to preserve order | Relying on set() for ordered results |
Use json.dumps() for lists of dicts | Putting dicts in a set or using them as dict keys (dicts are unhashable) |
Profile speed on large datasets | Overusing slow if-in-list in loops |
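To make the profiling advice concrete, here is a rough sketch using the standard timeit module to compare the list-lookup loop with dict.fromkeys(). The function names and the input size are arbitrary choices for illustration, and the timings will vary by machine:

```python
import timeit

data = list(range(2_000)) * 2   # 4,000 items, each value appears twice


def with_list_lookup(items):
    unique = []
    for item in items:
        if item not in unique:      # linear scan of the result list
            unique.append(item)
    return unique


def with_dict_fromkeys(items):
    return list(dict.fromkeys(items))


# Both approaches should agree before their speed is compared.
assert with_list_lookup(data) == with_dict_fromkeys(data)

for func in (with_list_lookup, with_dict_fromkeys):
    seconds = timeit.timeit(lambda: func(data), number=5)
    print(f"{func.__name__}: {seconds:.4f} s for 5 runs")
```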
Summary: Recap & Next Steps
Python offers fast and flexible tools to remove duplicates from lists. Your choice depends on whether you need to preserve order, maintain performance, or deduplicate complex structures.
Key Takeaways:
- Use set() to remove duplicates quickly (unordered)
- Use dict.fromkeys() to preserve order in Python 3.7+
- Use list comprehension + set.add() for fast, compact code
- Use json.dumps() to handle unhashable dicts
Real-World Relevance:
Used in data preprocessing, user input cleaning, log file parsing, deduplicating API responses, and data deduplication pipelines.
FAQ: Python Remove List Duplicates
How do I remove duplicates from a list while keeping order?
Use:
list(dict.fromkeys(your_list))
Why does set() not preserve order?
Because sets are unordered collections. Since Python 3.7, dicts preserve insertion order, but sets make no such guarantee.
Can I remove duplicates from a list of dictionaries?
Yes. Convert the dicts to strings using json.dumps() and track uniqueness with a set.
What's the fastest way to remove duplicates?
Use:
list(set(your_list))
Not suitable if you need to preserve order.
Can I remove duplicates from nested lists?
Not directly, because lists are unhashable. Convert each inner list to a tuple first, then back to a list:
nested = [[1,2], [3,4], [1,2]]
unique = [list(t) for t in dict.fromkeys(map(tuple, nested))]
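Note that map(tuple, ...) converts only the outer level, so it raises TypeError if the inner lists themselves contain lists. A sketch of a recursive variant; to_hashable and dedupe_nested are hypothetical helper names, and the code assumes the data consists only of lists and hashable leaf values:

```python
def to_hashable(value):
    """Recursively convert lists to tuples so the result can be hashed."""
    if isinstance(value, list):
        return tuple(to_hashable(v) for v in value)
    return value


def dedupe_nested(items):
    """Remove duplicate (possibly deeply nested) lists, keeping first occurrences."""
    seen = set()
    unique = []
    for item in items:
        key = to_hashable(item)
        if key not in seen:
            seen.add(key)
            unique.append(item)   # keep the original list object
    return unique


nested = [[1, [2, 3]], [4, 5], [1, [2, 3]]]
print(dedupe_nested(nested))  # [[1, [2, 3]], [4, 5]]
```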