I have a dataset that consists of hundreds of columns and thousands of rows.
In [119]:
df.columns
Out[119]:
Index(['column 1', 'column2',
...
'column 100'],
dtype='object', name='var_name')
Usually I run value_counts() on every single column to see its distribution:
In [121]:
a = df['column 1'].value_counts()
In [122]:
a
Out[122]:
1 77494
2 5389
0 2016
3 878
Name: column 1, dtype: int64
But with this dataframe, doing this for every column would make my notebook very messy. How can I automate this? Is there a function that helps?
For additional context: all my data is int64, but I hope the best answer gives a solution that works in every case. I would like the solution to be a pandas DataFrame.
Based on @MaxU's suggestion, this is a simplified version of my dataframe:
df
id column1 column2 column3
1 3 1 7
2 3 2 8
3 2 3 7
4 2 1 8
5 1 2 7
and my expected output is:
column1 count
1 1
2 2
3 2
column2 count
1 2
2 2
3 1
column3 count
7 3
8 2
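To make the expected output concrete, here is a runnable sketch of the kind of automation I'm imagining. Both the loop and the `apply` variant are my own guesses at a solution, not a confirmed answer:

```python
import pandas as pd

# Simplified dataframe from the question
df = pd.DataFrame({
    "id": [1, 2, 3, 4, 5],
    "column1": [3, 3, 2, 2, 1],
    "column2": [1, 2, 3, 1, 2],
    "column3": [7, 8, 7, 8, 7],
})

# Loop variant: print value_counts() for every column except the id,
# sorted by value to match the expected output above
for col in df.columns.drop("id"):
    print(f"{col} count")
    print(df[col].value_counts().sort_index())

# DataFrame variant: one wide table of counts per column,
# with NaN where a value never occurs in that column
counts = df.drop(columns="id").apply(pd.Series.value_counts)
print(counts)
```

The `apply(pd.Series.value_counts)` variant returns a DataFrame, which matches what I asked for, though it pads missing values with NaN.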