1. I have a folder containing multiple CSV files. I am trying to fetch the companies and parameters that are common across the files, based on the Year column and on the Risk Adjusted Return column.
1. There are two conditions: Risk Adjusted Return > 0, and the company name must be present in all 10 years. If both conditions are satisfied, the matching rows should be saved to a separate file.
Sample data view (I have one CSV file like this for each of the 10 years):
| Company Name | Total Reserves/Total Assets | Total Reserves/Total Current Liabilities | Total Reserves/Net Assets | Closing Cash & Cash Equivalent Growth | Year | Risk Adjusted Return |
|---|---|---|---|---|---|---|
| Archies Ltd. | 0.609629 | 3.420958 | 0.922252 | -13.2027 | 2009-10 | 6.963131 |
| Shoppers Stop Ltd. | 0.279474 | 0.751713 | 0.860496 | -53.3125 | 2009-10 | 5.017301 |
| Bata India Ltd. | 0.405643 | 0.828716 | 0.910929 | 101.2264 | 2009-10 | 3.676464 |
| Vaibhav Global Ltd. | 0.252348 | 2.246791 | 0.775817 | 96.97768 | 2009-10 | 0.718685 |
| Trent Ltd. | 0.480978 | 1.881055 | 0.965781 | 239.456 | 2009-10 | 0.075448 |
1. There are 10 CSV files like this. I am trying to extract the company names they have in common, based on the number of years in which each company appears across the 10 years, whether in scattered years or in consecutive years, for more than 5 years.
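The difference between appearing in scattered years and appearing in consecutive years could be sketched roughly as follows. This is only an illustration: it assumes the `Year` labels always look like `2009-10`, and the helper names `start_year`, `longest_run` and `year_summary` are made up for this example.

```python
import pandas as pd

def start_year(label) -> int:
    """Turn a fiscal-year label such as '2009-10' into its starting year, 2009."""
    return int(str(label).split("-")[0])

def longest_run(years) -> int:
    """Length of the longest streak of consecutive years in an iterable of ints."""
    best = current = 0
    previous = None
    for year in sorted(set(years)):
        # Extend the streak if this year directly follows the previous one
        current = current + 1 if previous is not None and year == previous + 1 else 1
        best = max(best, current)
        previous = year
    return best

def year_summary(frame: pd.DataFrame) -> pd.DataFrame:
    """Per company: number of distinct years present and longest consecutive streak."""
    years = frame["Year"].map(start_year)
    grouped = years.groupby(frame["Company Name"])
    return pd.DataFrame({
        "years_present": grouped.nunique(),
        "longest_streak": grouped.apply(longest_run),
    })
```

A company with `years_present` above 5 has appeared in more than 5 years in total, while `longest_streak` above 5 means those years were continuous.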
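import glob
import pandas as pd

path = r'C:/Users/harik/Success company'  # use your path
all_files = glob.glob(path + "/*.csv")

# Read each yearly CSV file into its own DataFrame
li = []
for filename in all_files:
    df = pd.read_csv(filename, index_col=None, header=0)
    li.append(df)

# Stack all years into a single DataFrame
frame = pd.concat(li, axis=0, ignore_index=True)
# drop() returns a new DataFrame, so the result has to be assigned back
frame = frame.drop(columns=['Unnamed: 0'], errors='ignore')
frame.head()
# Count the number of rows (yearly files) per company
frame['Company Name'].value_counts()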
Output: the number of yearly CSV files in which each company name appears
> Trent Ltd. 8
Vaibhav Global Ltd. 8
Bata India Ltd. 6
Archies Ltd. 6
V-Mart Retail Ltd. 6
V2 Retail Ltd. 6
SORIL Infra Resources Ltd. 5
Shoppers Stop Ltd. 5
Future Enterprises Ltd. 4
Avenue Supermarts Ltd. 2
Future Retail Ltd. 2
SRS Ltd. 2
Brandhouse Retails Ltd. 1
Olympia Industries Ltd. 1
eDynamics Solutions Ltd. 1
dtype: int64
- But I want the output along with the relevant features from the dataset.
- I have done this much. Now, how do I write the conditions below for all the data?
1. Consider only those companies whose Risk Adjusted Return is > 0.
2. Do not compare companies across industries. Comparisons should be made within an industry.
3. Remove the inconsistent companies before finding the commonalities.
4. A company should have at least 4-5 years of data to be used for the commonalities. Look across the 117 industries to find the consistency between companies. For the industries that have no consistent companies, make a list.
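The four points above might be expressed along these lines. This is a minimal sketch, not a definitive implementation: the sample data shows no industry column, so `Industry` is a hypothetical column name, and the function names and the default `min_years=5` are assumptions made for illustration.

```python
import pandas as pd

def consistent_companies(frame, min_years=5, industry_col=None):
    """Rows of companies with a positive Risk Adjusted Return
    in at least `min_years` distinct years (points 1, 3 and 4)."""
    # Point 1: keep only rows with a positive risk adjusted return
    positive = frame[frame["Risk Adjusted Return"] > 0]
    # Point 2: count years per company within its industry, if one is given
    keys = ["Company Name"] if industry_col is None else [industry_col, "Company Name"]
    n_years = positive.groupby(keys)["Year"].transform("nunique")
    # Points 3 and 4: drop companies with too few qualifying years;
    # every feature column of the remaining rows is kept
    return positive[n_years >= min_years]

def industries_without_consistency(frame, consistent, industry_col):
    """Industries in `frame` that contribute no company to `consistent`."""
    all_industries = set(frame[industry_col].dropna())
    return sorted(all_industries - set(consistent[industry_col]))

# Hypothetical usage with the concatenated `frame` built from the yearly files:
# result = consistent_companies(frame, min_years=5)
# result.to_csv("consistent_companies.csv", index=False)
```

Because the filter returns whole rows rather than counts, each company comes back with its features, and `groupby(industry_col)` on the result would allow comparisons within a single industry.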
How can these points be applied? Could someone help me express them in code?
Thanks