Let’s start with a basic DataFrame with a few columns.
>>> import pandas as pd
>>> import numpy as np
>>>
>>> df = pd.DataFrame(np.random.rand(5,5), columns=['a', 'b', 'c', 'd', 'e'])
>>> df['max'] = df.max(axis=1)
>>>
>>> df
          a         b         c         d         e       max
0  0.067423  0.058920  0.999309  0.440547  0.572163  0.999309
1  0.384196  0.732857  0.138881  0.764242  0.096347  0.764242
2  0.900311  0.662776  0.223959  0.903363  0.349328  0.903363
3  0.988267  0.852733  0.913800  0.106388  0.864908  0.988267
4  0.830644  0.647775  0.596375  0.631442  0.907743  0.907743
First, let’s just review the basics. Without moving or dropping columns, we can view any column we want in any order by just selecting them.
>>> df['max']
0    0.999309
1    0.764242
2    0.903363
3    0.988267
4    0.907743
Name: max, dtype: float64
We can also select any set of columns in any order, even listing the same column more than once.
>>> df[['d', 'a', 'max', 'b', 'd']]
          d         a       max         b         d
0  0.440547  0.067423  0.999309  0.058920  0.440547
1  0.764242  0.384196  0.764242  0.732857  0.764242
2  0.903363  0.900311  0.903363  0.662776  0.903363
3  0.106388  0.988267  0.988267  0.852733  0.106388
4  0.631442  0.830644  0.907743  0.647775  0.631442
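Assigning the selection back to our variable makes the reordering permanent. Note that any column left out of the list is dropped, which is what happens to column c here.
>>> df = df[['d', 'a', 'b', 'max', 'e']]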
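If the goal is to move one column without listing every other column by hand (and without accidentally dropping any), the selection list can be built from the existing columns. The following is a minimal sketch on a small made-up frame; the data and the variable names new_order and reordered are illustrative only.

```python
import pandas as pd

# Small illustrative frame; the values are arbitrary.
df = pd.DataFrame({"a": [1, 2], "b": [3, 4], "c": [5, 6], "max": [5, 6]})

# Put "max" first and keep every other column in its existing order.
new_order = ["max"] + [col for col in df.columns if col != "max"]
reordered = df[new_order]

print(list(reordered.columns))
```

Because the list is derived from df.columns, no column is lost in the process.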
Since the columns are just an Index, they can be converted to a list and manipulated. You can also use the reindex method to change the column ordering; it returns a new DataFrame, so assign the result if you want to keep it. Note that you don’t want to just assign the sorted names to columns: this won’t move them, but will rename them!
>>> df.reindex(columns=sorted(df.columns))
          a         b         d         e       max
0  0.067423  0.058920  0.440547  0.572163  0.999309
1  0.384196  0.732857  0.764242  0.096347  0.764242
2  0.900311  0.662776  0.903363  0.349328  0.903363
3  0.988267  0.852733  0.106388  0.864908  0.988267
4  0.830644  0.647775  0.631442  0.907743  0.907743
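To make the difference between moving and renaming concrete, here is a small sketch on a two-column frame with made-up values. It compares reindex with direct assignment to the columns attribute; the names moved and renamed are illustrative.

```python
import pandas as pd

df = pd.DataFrame({"b": [1, 2], "a": [3, 4]})

# reindex moves each column together with its data.
moved = df.reindex(columns=sorted(df.columns))

# Assigning to .columns only relabels the existing columns by position.
renamed = df.copy()
renamed.columns = sorted(renamed.columns)

print(moved["a"].tolist())    # the data originally stored under "a"
print(renamed["a"].tolist())  # the data originally stored under "b"
```

In the second case the label "a" now sits on top of what used to be column b, which is rarely what you want.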
Also, when you are first creating a column, you can use the insert method to place it exactly where you want it to appear. By default, adding a column using the [] operator will put it at the end.
>>> df.insert(3, "min", df.min(axis=1))
>>> df
          d         a         b       min       max         e
0  0.440547  0.067423  0.058920  0.058920  0.999309  0.572163
1  0.764242  0.384196  0.732857  0.096347  0.764242  0.096347
2  0.903363  0.900311  0.662776  0.349328  0.903363  0.349328
3  0.106388  0.988267  0.852733  0.106388  0.988267  0.864908
4  0.631442  0.830644  0.647775  0.631442  0.907743  0.907743
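A short side-by-side sketch of the two ways to add a column, again on made-up data. It also shows one detail worth knowing: insert modifies the DataFrame in place and, by default, raises a ValueError if the column name already exists. The flag duplicate_rejected is just an illustrative variable.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})

df["total"] = df["a"] + df["b"]          # [] appends at the end
df.insert(0, "diff", df["b"] - df["a"])  # insert places it at position 0

print(list(df.columns))

# Inserting a name that already exists is refused by default.
try:
    df.insert(1, "a", df["a"])
    duplicate_rejected = False
except ValueError:
    duplicate_rejected = True

print(duplicate_rejected)
```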
Finally, you can pop the column, then re-insert it. Popping a column removes it and returns it, as you’d expect.
>>> col_e = df.pop("e")
>>> df.insert(3, "e", col_e)
>>>
>>> df
          d         a         b         e       min       max
0  0.440547  0.067423  0.058920  0.572163  0.058920  0.999309
1  0.764242  0.384196  0.732857  0.096347  0.096347  0.764242
2  0.903363  0.900311  0.662776  0.349328  0.349328  0.903363
3  0.106388  0.988267  0.852733  0.864908  0.106388  0.988267
4  0.631442  0.830644  0.647775  0.907743  0.631442  0.907743
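If you do this often, the pop-then-insert pattern can be wrapped in a small function. The helper below, move_column, is hypothetical (it is not part of pandas) and is only a sketch; it changes the DataFrame in place, and the position is interpreted after the column has been removed.

```python
import pandas as pd

def move_column(df, name, position):
    """Hypothetical helper: move column `name` to integer `position`, in place."""
    column = df.pop(name)              # removes the column and returns it as a Series
    df.insert(position, name, column)  # puts it back at the requested position
    return df

df = pd.DataFrame({"a": [1], "b": [2], "c": [3]})
move_column(df, "c", 0)

print(list(df.columns))
```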
So as you can see, there are a number of ways to manipulate the column ordering in your DataFrame.