Data transfer between files,sql databases and dataframes
INTRODUCTION CSV(COMMA SEPARATED VALUES)is a format that stores data in separated forms or it refers to a tabular data saved as plaintext where data values are separated by commas. This chapter relates to transferring a CSV file from/into a dataframe and also to/from a database table from/into a dataframe. For example:converting a tabular data into CSV data. ROLL.NO NAME MARKS 101 TIA 67.8 102 RADHA 78.9 Roll no.,name,marks 101,tia,67.8 102,radha,78.9
1.Loading data from csv to dataframes 1.Reading from a csv file to a dataframe EXAMPLE:import pandas as pd df=pd.read_csv(‘d:\\data\\sample.csv’) In the above program read_csv() function is being used to read the data from sample.csv file in dataframe.And then its(df) contents are being displayed with the help of print ().
1.1 Reading csv file and specifying own column names EXAMPLE: df=pd.read_csv(‘d:\sample.csv’,names=[‘A’, ‘B’ ,‘C’]) (CSV files only contain data,not the column names) A B C 1 SARAH KAPUR 2 REET KAUR 3 ROBERT DOUGLAS 4 SHAHIDA ALI 5 JIA NEPALAN 1,SARAH,KAPUR 2,REET,KAUR 3,ROBERT,DOUGLAS 4,SHAHHIDA,ALI 5,JIA,NEPALAN
1.2 Reading csv files with no headers EXAMPLE:df=pd.read_csv(‘d:\sample.csv’,header=none) 1 2 sarah Kapur reet Kaur 3 robert douglas 4 jia nepalan
1.3 Reading a csv file while skipping the rows Example:df=pd.read_csv(‘d:\sample.csv’,names=‘A’, ‘B’ , ‘C’,skiprows=1) (This time it has skipped 1 row and taken data from 2nd row onwards.) A B C 1 SARAH KAPUR 2 REET KAUR 3 ROBERT DOUGLAS 4 SHAHIDA ALI 5 JIA NEPALAN
1.4 Reading specified no. of rows from csv file Example:df=pd.read_csv(‘d:\sample.csv,names=[‘A’, ‘B’, ‘C’],nrows=1) (because of argument nrows=1,only one row have been read from csv file) A B C 1 SARAH KAPUR
Reading from csv files having separator different from commas Sep=<separator character> Example:pd.csv_read(‘d:\sample.csv,sep=‘;’) denmark;beat;peru Mexico;beat;germany Australia;draw;denmark
2.STORING DATAFRAMES DATA TO CSV FILE Example: Import pandas as pd df={‘roll no.’:[101,102,103], ‘name’=[‘ani’, ‘amna’, ‘arjit’]} df1=pd.DataFrame(df,columns=[‘roll no.’, ‘names’]) df1.to_csv(‘d:\\sample.csv’) Above program creates a dataframe and then with the help of to_csv(),this dataframe(df1) is being stored as in ‘d:\\sample.csv’.
Csv file: ,roll no.,names 0,101,ani 1,102,amna 2,103,arjit (the data from dataframe df1 is stored in this csv file with separator character as comma by default) ,roll no.,names 0,101,ani 1,102,amna 2,103,arjit
2.1 Handling NaN Values with to_csv() Example:df1.loc[2, ‘Roll no.’]=np.NaN df1.loc[0, ‘names’]=np.NaN ROLL NO. NAMES 101 NaN 1 102 AMNA 2 ARJIT
3.TRANSFERRING DATA BETWEEN DATAFRAMES AND SQL DATABASES BRIEF INTRODUTION TO SQLite DATABASE SQLite is an embedded SQL database engine which implement RDBMS that is a self-contain,serverless and requires zero configuration. 3.1BRINGING DATA FROM SQL DATABASE TABLE INTO A DATAFRAME Example:import pandas as pd import sqlite3 as sq conn=sq.connect(‘d:\\sqlite3\\new.db’) df=pd.read_sql(‘select*from st;’,conn)
3.2 STORING A DATAFRAME’S DATA AS A TABLE IN AN SQL DATABASE EXAMPLE:import pandas as pd import sqlite3 as sq conn=sqlite3.connect (‘d:\\data\\new.db’) df.to_sql(‘st’, conn) NOTE:if you run to_sql() that has a table which already exist then you must spcify argument if_exists=‘append’ or if_exists=‘replace’ otherwise phyton,if you set the value as append,then new data will be appended to existing table and if you set replace then new data will replace the old in the given table.
Assignments 1.which argument would you give to read.csv() if you only want to read top ten rows of data? 2.write a command to store data of dataframe mdf into a csv file ,with separator character as ‘@’. 3.What additional argument do you need to specify in to_sql() so that old data of sql table is retained? 4.By default, read_csv() uses the value of 1st row as column headers in dataframes.which argument will you give to ensure that the top/first row’s data is used as data and not as column headers?