Ponder with Pandas — Text to Excel and Feature Engineering

Shalabh Bhatnagar
2 min read · May 19, 2022

Of all the things we do in machine learning, changing data, transforming it, and making it feature-ready takes a lot of time. Extracting from a database, a CSV file, or an XML file is the easy part, as they all exhibit a schema and many libraries make our lives easy.

In a series of mini fragments, I will share code snippets that I hope make your machine learning tasks easier.

I make a point by implementing a real-world use case. If, for some reason, you have not yet run into a use case I implemented, no worries. If and when you encounter a matching one, you can at least come back.

All of this is of course free.

Applies to

Reading any text file and converting it to a spreadsheet cell representation in Microsoft Excel

Benefits:

1. Rapidly convert text to an Excel spreadsheet

2. Lay out text as if it were a structured dataset (that is, features)

When to use

When you have loads of text files and want to flatten them into a structure at a character level.
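To make "character level" concrete, here is a minimal sketch of the layout this produces. The sample string is an assumption standing in for the contents of a text file; no file or Excel output is involved yet.

```python
import pandas as pd

# Hypothetical sample text standing in for the contents of a text file
text = "The quick brown fox"

# Wrap the list of characters in an outer list so pandas builds
# one row with one column per character (spaces included)
df = pd.DataFrame([list(text)])

print(df.shape)       # (1, 19)
print(df.iloc[0, 4])  # q
```

Each column then behaves like a positional feature: column 0 holds the first character, column 1 the second, and so on.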

import pandas as pd

# Create a DataFrame from a text file named fox.txt and transpose it so that
# each character lands in its own column of a single row.
# Please ensure that the text file is in the same folder as this code, or
# adjust the path as per your needs.
with open("fox.txt", "rt") as f:
    df = pd.DataFrame.from_records(data=f.read()).transpose()

# Write to an Excel file (requires an Excel writer engine such as openpyxl)
# and voilà, you are done.
df.to_excel("some.xlsx", merge_cells=True, index=False)
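If you really do have loads of text files, the same idea might be extended to one row per file. The following is only a sketch under my own assumptions: the helper name `files_to_char_frame` is hypothetical, files are assumed to be UTF-8, and shorter files are simply padded with missing values on the right.

```python
from pathlib import Path

import pandas as pd


def files_to_char_frame(paths):
    """Hypothetical helper: one row per file, one column per character position."""
    paths = [Path(p) for p in paths]
    # Split each file's text into a list of single characters
    rows = [list(p.read_text(encoding="utf-8")) for p in paths]
    # Rows of unequal length are padded with missing values by pandas
    return pd.DataFrame(rows, index=[p.name for p in paths])
```

You could then call something like `files_to_char_frame(Path(".").glob("*.txt"))` and write the result with `to_excel`, keeping the index this time so each row shows its file name. Keep in mind that an Excel worksheet holds at most 16,384 columns, so very long files will not fit on one row.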

Disclaimer: All copyrights and trademarks belong to their respective companies and owners. The purpose of this article is educational only, and the views herein are my own.