I want to replicate the values from a folder of numerous excel files that has values in columns a through g into a single excel file.
The following piece of code has been tested by me:
` import os
import pandas as pd
# Specify the input folder containing Excel files
input_folder = 'C:/Users/User/Desktop/FilesToProcess'
# Create an empty DataFrame to store the selected data
selected_data = pd.DataFrame()
# Loop through all Excel files in the input folder
for file_name in os.listdir(input_folder):
if file_name.endswith('.xlsx'):
# Read in the Excel file and select the desired columns
df = pd.read_excel(os.path.join(input_folder, file_name), usecols='A:G')
# Append the selected data to the main DataFrame
selected_data = selected_data.append(df)[df.columns.tolist()]
# Specify the output file name and sheet name
output_file = 'C:/Users/User/Desktop/output1.xlsx'
output_sheet_name = 'Sheet1'
# Write the selected columns to a new Excel file
with pd.ExcelWriter(output_file) as writer:
selected_data.to_excel(writer, sheet_name=output_sheet_name, index=False)`
Although it initially functions as intended, after a few files, the columns become messed up. It starts copying values in columns > G specifically. Due to the nature of the data, I am unable to provide a result example.