NumPy Random Permutation – Shuffle Arrays for Data Sampling
Introduction – Why Learn Permutations in NumPy?
A random permutation rearranges data in random order, which is what you need when shuffling the rows of a dataset, creating randomized training batches, or simulating probability scenarios. NumPy’s np.random.permutation() and np.random.shuffle() make this fast and flexible.
By the end of this guide, you’ll:
- Use np.random.permutation() to randomly reorder elements
- Understand the difference between permutation() and shuffle()
- Apply permutations to 1D and 2D arrays
- Build reproducible experiments using random seeds
Step 1: Shuffle with np.random.permutation() (Returns a Copy)
import numpy as np
arr = np.array([1, 2, 3, 4, 5])
shuffled = np.random.permutation(arr)
print("Original:", arr)
print("Shuffled:", shuffled)
Explanation:
- permutation() returns a new array with shuffled values
- The original array remains unchanged
Useful for random sampling without modifying the source
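To check the copy behavior described above, here is a minimal sketch. It only verifies the two claims (the source is untouched, the result holds the same values); the actual order of `shuffled` varies from run to run.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5])
shuffled = np.random.permutation(arr)

# The source array keeps its original order
print(np.array_equal(arr, [1, 2, 3, 4, 5]))      # True

# The result holds the same values, just in a (possibly) different order
print(np.array_equal(np.sort(shuffled), arr))    # True

# The result is a separate array, so editing it cannot affect arr
print(np.shares_memory(arr, shuffled))           # False
```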
Step 2: Shuffle with np.random.shuffle() (In-Place)
arr = np.array([1, 2, 3, 4, 5])
np.random.shuffle(arr)
print("Shuffled in-place:", arr)
Explanation:
- shuffle() modifies the array in-place
- You lose the original order
Best when you don’t need the original array afterward
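One consequence of in-place shuffling is that shuffle() returns None, so assigning its result back to the array is a common mistake. The sketch below illustrates this; the `backup` variable is only an illustrative name for a copy taken beforehand in case the original order is needed later.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5])
backup = arr.copy()                 # keep the original order, in case it is needed

result = np.random.shuffle(arr)     # reorders arr itself

print(result)    # None: shuffle() has no return value
print(arr)       # same values, new order (varies per run)
print(backup)    # [1 2 3 4 5]
```

Writing `arr = np.random.shuffle(arr)` would therefore replace the array with None.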
Step 3: Permute Rows of a 2D Array
matrix = np.array([[10, 20], [30, 40], [50, 60]])
shuffled = np.random.permutation(matrix)
print(shuffled)
Explanation:
- permutation() shuffles rows, not individual elements
- Each subarray (row) stays intact
Ideal for row-based data like datasets or matrices
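A quick way to confirm that rows stay intact is to compare the set of rows before and after shuffling. This is a small sketch for illustration; the set comparison works here because every row in the example matrix is distinct.

```python
import numpy as np

matrix = np.array([[10, 20], [30, 40], [50, 60]])
shuffled = np.random.permutation(matrix)

# Convert each row to a tuple so rows can be compared as whole units
original_rows = {tuple(row) for row in matrix.tolist()}
shuffled_rows = {tuple(row) for row in shuffled.tolist()}

print(shuffled.shape == matrix.shape)    # True: same shape
print(original_rows == shuffled_rows)    # True: same rows, only their order may differ
```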
What if You Shuffle Columns?
matrix = np.array([[10, 20], [30, 40], [50, 60]])
shuffled = np.random.permutation(matrix.T).T
print(shuffled)
Explanation:
- matrix.T transposes the matrix (rows ↔ columns)
- Then, shuffle rows (which were columns originally)
- Transpose back using .T again
This shuffles columns instead of rows
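An alternative to the double transpose is to permute the column indices and use them for indexing. This is a sketch of that variant, not a replacement for the approach above; `cols` and `shuffled_cols` are illustrative names.

```python
import numpy as np

matrix = np.array([[10, 20], [30, 40], [50, 60]])

# Random order of the column indices 0 .. n_columns - 1
cols = np.random.permutation(matrix.shape[1])

# Select all rows, with columns taken in the permuted order
shuffled_cols = matrix[:, cols]

print(cols)
print(shuffled_cols)
```

Each column moves as a whole, so the values within a row stay on the same row.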
Step 4: Get Permutation Indices
indices = np.random.permutation(5)
print(indices)
Explanation:
- np.random.permutation(n) returns a shuffled array of indices from 0 to n-1
- Useful when you want to reorder another array manually:
arr = np.array([100, 200, 300, 400, 500])
print(arr[indices]) # Reordered based on permutation
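Index permutations are especially handy when two arrays must stay aligned, for example features and their labels. The sketch below uses made-up data and hypothetical names (`features`, `labels`) to show one index array reordering both.

```python
import numpy as np

# Toy data: row i of features belongs to labels[i]
features = np.array([[1.0, 1.1], [2.0, 2.1], [3.0, 3.1], [4.0, 4.1]])
labels = np.array([1, 2, 3, 4])

# One permutation of the row indices, applied to both arrays
indices = np.random.permutation(len(features))
features_shuffled = features[indices]
labels_shuffled = labels[indices]

# Each feature row still sits next to its own label
for row, label in zip(features_shuffled, labels_shuffled):
    print(label, row)
```

Calling permutation() separately on each array would shuffle them in different orders and break the pairing.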
Step 5: Reproducibility with Seed
np.random.seed(42)
print(np.random.permutation([1, 2, 3, 4, 5]))
Explanation:
- Setting a seed before shuffling makes the random number generator produce the same sequence, so the same calls yield the same shuffled output on every run
- Perfect for ML reproducibility, experiments, or debugging
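The sketch below checks reproducibility by reseeding and shuffling twice. The second half uses np.random.default_rng(), NumPy's newer Generator interface, which is not covered in this guide and is shown only as an optional alternative that keeps the seed local to one generator object instead of global state.

```python
import numpy as np

data = [1, 2, 3, 4, 5]

# Legacy global seed: reseeding with the same value repeats the same shuffle
np.random.seed(42)
first = np.random.permutation(data)
np.random.seed(42)
second = np.random.permutation(data)
print(np.array_equal(first, second))    # True

# Generator objects: two generators with the same seed agree with each other
rng_a = np.random.default_rng(42)
rng_b = np.random.default_rng(42)
gen_first = rng_a.permutation(data)
gen_second = rng_b.permutation(data)
print(np.array_equal(gen_first, gen_second))    # True
```

The legacy functions and the Generator use different algorithms, so the same seed value need not give the same order across the two interfaces.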
Permutation vs Shuffle – What’s the Difference?
| Feature | np.random.permutation() | np.random.shuffle() |
|---|---|---|
| Modifies Original? | No (returns a copy) | Yes (in-place) |
| Works on multi-dimensional? | Yes (shuffles rows) | Yes (shuffles rows) |
| Use Case | When you need both original & shuffled | When only shuffled version is needed |
Real-World Use Cases
- Shuffle datasets before splitting into train/test sets
- Randomize orders in quizzes or games
- Reorder rows in image, audio, or sensor data
- Create unique permutations for simulations
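As an illustration of the first use case, here is a minimal train/test split built on a permutation of row indices. The toy data, the 80/20 ratio, and names such as `train_idx` are assumptions made for this sketch.

```python
import numpy as np

n_samples = 10
data = np.arange(n_samples * 2).reshape(n_samples, 2)   # toy dataset, 10 rows

np.random.seed(0)                          # fixed seed so the split is repeatable
indices = np.random.permutation(n_samples)

split = int(0.8 * n_samples)               # assumed 80% train, 20% test
train_idx, test_idx = indices[:split], indices[split:]

train, test = data[train_idx], data[test_idx]
print(train.shape, test.shape)             # (8, 2) (2, 2)
```

Because both index sets come from one permutation, no row can end up in both the training and the test set.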
Summary – Recap & Next Steps
Permutation is a fast and safe way to randomize data in NumPy. Whether you need to shuffle rows, generate random index orders, or reorder datasets without overwriting originals, np.random.permutation() is your best tool.
Key Takeaways:
- Use permutation() when you need a shuffled copy
- Use shuffle() when in-place modification is okay
- Shuffle rows in 2D arrays, or use transpose to shuffle columns
- Use seeds to make results reproducible
Real-world relevance: Random permutations are at the heart of training ML models, building fair testing conditions, and running simulations.
FAQs – NumPy Random Permutation
What’s the difference between shuffle() and permutation()?
shuffle() changes the array in-place; permutation() returns a new array.
Can I shuffle a 2D array by columns instead of rows?
Yes, transpose → shuffle → transpose back:
np.random.permutation(arr.T).T
How do I ensure the same shuffle every time?
Set a seed:
np.random.seed(0)
Can I shuffle strings or objects?
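Yes. permutation() accepts any array-like input, such as a NumPy array or a list of strings or objects; shuffle() works in place on NumPy arrays and other mutable sequences such as lists.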
Does permutation() work with lists?
Yes. You can pass a list or NumPy array.
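A short sketch of the list case, using a made-up list of names. Note that the result is a NumPy array rather than a list, and the input list is left as it was.

```python
import numpy as np

names = ["ada", "grace", "alan", "edsger"]    # plain Python list of strings
shuffled_names = np.random.permutation(names)

print(type(shuffled_names))    # <class 'numpy.ndarray'>
print(shuffled_names)          # same names, order varies per run
print(names)                   # ['ada', 'grace', 'alan', 'edsger']
```

Call `shuffled_names.tolist()` if a plain Python list is needed afterward.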
