I have a dataset that consists of hundreds of columns and thousands of rows.
In [119]:
df.columns
Out[119]:
Index(['column 1', 'column2',
...
'column 100'],
dtype='object', name='var_name')
Usually I run value_counts() on every single column to see its distribution:
In [121]:
a = df['column 1'].value_counts()
In [122]:
a
Out[122]:
1 77494
2 5389
0 2016
3 878
Name: column 1, dtype: int64
But with this dataframe, doing this for every column would make my notebook very messy. How can I automate this? Is there a function that helps?
For additional context: all my data is int64, but I hope the best answer gives a solution that works in every case. I would like the solution to be a pandas DataFrame.
Based on @MaxU's suggestion, this is a simplified version of my dataframe:
df
id column1 column2 column3
1 3 1 7
2 3 2 8
3 2 3 7
4 2 1 8
5 1 2 7
and my expected output is:
column1 count
1 1
2 2
3 2
column2 count
1 2
2 2
3 1
column3 count
7 3
8 2
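To make the expected output concrete, here is a runnable sketch of the kind of automation I'm imagining. Both the loop and the `apply` variant are my own guesses at a solution, not a confirmed answer:

```python
import pandas as pd

# Simplified dataframe from the question
df = pd.DataFrame({
    "id": [1, 2, 3, 4, 5],
    "column1": [3, 3, 2, 2, 1],
    "column2": [1, 2, 3, 1, 2],
    "column3": [7, 8, 7, 8, 7],
})

# Loop variant: print value_counts() for every column except the id,
# sorted by value to match the expected output above
for col in df.columns.drop("id"):
    print(f"{col} count")
    print(df[col].value_counts().sort_index())

# DataFrame variant: one wide table of counts per column,
# with NaN where a value never occurs in that column
counts = df.drop(columns="id").apply(pd.Series.value_counts)
print(counts)
```

The `apply(pd.Series.value_counts)` variant returns a DataFrame, which matches what I asked for, though it pads missing values with NaN.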