How to Import a CSV File into Python using Pandas
CSV (comma-separated values) is one of the most widely used file formats for sharing and storing data. Any data scientist or business analyst must be able to read, manipulate, and write data to and from CSV files using Python. We'll discuss different ways to read CSV files into Pandas DataFrames.
1. Pandas' read_csv() is the most basic way to load data from a CSV file into a Pandas DataFrame, and the one most readers will already know.
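As a minimal sketch of the basic call, the snippet below reads a small in-memory CSV. The sample rows and column names are made up for illustration; with a real file you would pass its path to read_csv() instead of the StringIO object.

```python
import io

import pandas as pd

# Hypothetical sample data standing in for a CSV file on disk.
csv_text = """PassengerId,Name,Age
1,Alice,29
2,Bob,35
3,Carol,41
"""

# read_csv accepts a file path or any file-like object.
df = pd.read_csv(io.StringIO(csv_text))

print(df.shape)   # (rows, columns)
print(df.head())  # first rows of the DataFrame
```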
2. Read Specific Columns from CSV File
To import only specific columns from the CSV, use the usecols argument and pass the column names (or their integer positions) in a list.
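A short sketch of usecols, again on made-up in-memory data. The column names are assumptions for illustration only.

```python
import io

import pandas as pd

# Hypothetical sample data; column names are illustrative.
csv_text = """PassengerId,Name,Age
1,Alice,29
2,Bob,35
3,Carol,41
"""

# Load only the Name and Age columns; PassengerId is never parsed.
df = pd.read_csv(io.StringIO(csv_text), usecols=["Name", "Age"])

print(df.columns.tolist())
```

Selecting columns at read time can also reduce memory use on wide files, since the unused columns are not loaded.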
3. Loading data from a TSV file into a Pandas dataframe
The file fruits.tsv is tab-separated. To load a TSV file into a pandas DataFrame, pass the sep argument with a tab character (sep="\t"), as shown in screenshot 2.
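The following sketch shows the sep argument on tab-separated text. The contents used here for the fruits data are invented for illustration; with the real file you would pass "fruits.tsv" as the first argument.

```python
import io

import pandas as pd

# Hypothetical tab-separated content standing in for fruits.tsv.
tsv_text = "Fruit\tColor\tPrice\nApple\tRed\t1.2\nBanana\tYellow\t0.5\n"

# sep="\t" tells read_csv to split fields on tabs instead of commas.
fruits = pd.read_csv(io.StringIO(tsv_text), sep="\t")

print(fruits)
```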
4. index_col
The index_col argument specifies which column (or columns) will be used as the DataFrame's index. By default, pandas does not use any column from the file and instead creates an integer index starting at 0.
Here the PassengerId column is used as the index column.
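A minimal sketch of index_col using PassengerId as the index. Apart from the PassengerId column name, the data is made up for illustration.

```python
import io

import pandas as pd

# Hypothetical sample data; only the PassengerId name comes from the text.
csv_text = """PassengerId,Name,Age
1,Alice,29
2,Bob,35
3,Carol,41
"""

# Use the PassengerId column as the row index instead of the default 0..n-1.
df = pd.read_csv(io.StringIO(csv_text), index_col="PassengerId")

print(df.index.name)
print(df.loc[2])  # look up a row by its PassengerId
```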
5. Add Header While Reading CSV into pandas DataFrame
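The pandas read_csv() method lets you supply a custom header while reading a CSV into a pandas DataFrame by passing a list of column names to the names argument. This is most useful when the file has no header row of its own.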
Dataframe without the header
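As a sketch, the snippet below reads header-less data twice: once with header=None, which gives integer column labels, and once with names, which assigns custom labels. The rows and the chosen column names are assumptions for illustration.

```python
import io

import pandas as pd

# Hypothetical CSV content with no header row.
csv_no_header = "1,Alice,29\n2,Bob,35\n"

# Without names, header=None keeps the first line as data and
# labels the columns 0, 1, 2.
plain = pd.read_csv(io.StringIO(csv_no_header), header=None)

# With names, the given labels become the header and every line is data.
named = pd.read_csv(
    io.StringIO(csv_no_header),
    names=["PassengerId", "Name", "Age"],
)

print(plain.columns.tolist())
print(named.columns.tolist())
```

If the file already has a header row that you want to replace, passing header=0 together with names discards the original header instead of treating it as data.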
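6. Skipping rows while importing
The skiprows argument ignores certain rows while loading a CSV file into a DataFrame. It accepts either an integer (the number of lines to skip from the top of the file) or a list of 0-based line numbers to skip.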
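The sketch below shows both forms of skiprows on made-up data that begins with two non-data lines. The file contents are hypothetical.

```python
import io

import pandas as pd

# Hypothetical file: two junk lines, then the header, then three data rows.
csv_text = (
    "report line one\n"
    "report line two\n"
    "PassengerId,Name,Age\n"
    "1,Alice,29\n"
    "2,Bob,35\n"
    "3,Carol,41\n"
)

# Integer form: skip the first two lines, so the header is read correctly.
skip_top = pd.read_csv(io.StringIO(csv_text), skiprows=2)

# List form: skip specific 0-based line numbers. Line 4 is Bob's row.
skip_some = pd.read_csv(io.StringIO(csv_text), skiprows=[0, 1, 4])

print(skip_top)
print(skip_some)
```

Note that the line numbers in the list form count from the top of the file, including the header line, not from the first data row.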
7. parse_dates
The parse_dates argument converts the listed columns into proper datetime types while reading.
In the screenshot below we can see that pandas reads dates as strings, so the data type of the Joining date field is object. To convert the date column to a datetime type, pass its name in a list to the parse_dates argument.
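A minimal sketch comparing a read with and without parse_dates. The Joining date column name follows the text above; the rows themselves are invented, and ISO-formatted dates are assumed so that parsing is unambiguous.

```python
import io

import pandas as pd

# Hypothetical sample data with ISO-formatted dates.
csv_text = "Name,Joining date\nAlice,2021-03-15\nBob,2022-11-01\n"

# Without parse_dates the column is read as plain strings.
raw = pd.read_csv(io.StringIO(csv_text))

# With parse_dates the column is converted to a datetime type.
parsed = pd.read_csv(io.StringIO(csv_text), parse_dates=["Joining date"])

print(raw.dtypes)
print(parsed.dtypes)

# Datetime columns support date arithmetic and the .dt accessor.
print(parsed["Joining date"].dt.year.tolist())
```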