
Python: Reading data from local CSV file and Processing the data


In this post, we will load data from a CSV file stored on the local machine, do some basic processing on it, and then display the output.


Programming Language: Python 3
Software: Anaconda Jupyter Notebook


Here, the CSV file contains temperature data, and the format of the CSV file is as follows:

1.csv:
created_at,entry_id,field1
2017-12-14 18:34:48 UTC,1,22.42
2017-12-14 18:35:04 UTC,2,22.42
2017-12-14 18:35:21 UTC,3,22.42
2017-12-14 18:35:37 UTC,4,26.67
2017-12-14 18:35:54 UTC,5,27.27
2017-12-14 18:36:10 UTC,6,24.85
2017-12-14 18:36:27 UTC,7,24.24
2017-12-14 18:36:44 UTC,8,23.64
2017-12-14 18:37:00 UTC,9,23.33
2017-12-14 18:37:17 UTC,10,23.33
2017-12-14 18:37:34 UTC,11,23.33
2017-12-14 18:37:50 UTC,12,23.33
2017-12-14 18:38:07 UTC,13,23.33
2017-12-14 18:38:24 UTC,14,23.03
2017-12-14 18:38:40 UTC,15,23.33
2017-12-14 18:38:58 UTC,16,23.03
2017-12-14 18:39:15 UTC,17,22.73
2017-12-19 14:59:38 UTC,18,123
2017-12-19 15:01:28 UTC,19,12
2017-12-20 18:49:02 UTC,20,0.61
2017-12-20 18:49:19 UTC,21,0.61
2017-12-20 18:49:35 UTC,22,0.61
2017-12-20 18:49:52 UTC,23,0.61
2017-12-20 18:50:09 UTC,24,0.61
2017-12-20 18:50:25 UTC,25,0.61
2017-12-20 18:50:42 UTC,26,0.61
2017-12-20 18:50:58 UTC,27,0.30
2017-12-20 18:51:15 UTC,28,0.61
2017-12-20 18:51:31 UTC,29,0.30
2017-12-20 18:51:48 UTC,30,0.30
2017-12-20 18:52:04 UTC,31,0.61
2017-12-20 18:52:21 UTC,32,0.61
2017-12-20 18:52:37 UTC,33,0.61
2017-12-20 18:52:54 UTC,34,0.61
2017-12-20 18:53:10 UTC,35,0.61
2017-12-20 18:53:27 UTC,36,0.30
2017-12-31 11:09:14 UTC,38,25
2017-12-31 11:10:22 UTC,39,30
2017-12-31 11:10:59 UTC,40,20
2017-12-31 11:40:32 UTC,41,125
2017-12-31 12:51:59 UTC,42,0.61
2017-12-31 12:52:20 UTC,43,0.30
2017-12-31 12:52:40 UTC,44,0.61
2017-12-31 12:53:00 UTC,45,0.61
2017-12-31 12:53:20 UTC,46,0.30
2017-12-31 12:53:40 UTC,47,0.61
2017-12-31 12:54:00 UTC,48,0.61
2017-12-31 12:54:20 UTC,49,0.61
2017-12-31 12:54:40 UTC,50,0.30
2017-12-31 12:55:00 UTC,51,1.21
2017-12-31 12:55:20 UTC,52,0.30
2017-12-31 12:55:40 UTC,53,0.61
2017-12-31 12:56:00 UTC,54,0.30
2017-12-31 12:56:20 UTC,55,0.61
2017-12-31 12:56:40 UTC,56,0.30
2017-12-31 12:57:00 UTC,57,0.91
2017-12-31 12:57:20 UTC,58,0.30
2017-12-31 12:57:40 UTC,59,1.21
2017-12-31 12:58:00 UTC,60,1.52
2017-12-31 12:58:20 UTC,61,0.61
2017-12-31 12:58:40 UTC,62,0.91




Run the following code in an Anaconda Jupyter Notebook.
Each code block and its expected output are shown below:

Code In[1]:
# coding: utf-8
# In[1]:
## EXAMPLE: Reading data from CSV file kept on local computer

## Read data from local csv file > clean the data > Process the Data (Find out Minimum, Maximum & Average, etc)

## DATA FORMAT in csv file is as follows:
#
#  created_at,entry_id,field1
#  2018-01-04 17:00:54 UTC,1,21.52
#  2018-01-04 17:02:44 UTC,2,21.52
#  2018-01-04 17:03:05 UTC,3,21.82


## Run this code in Anaconda Jupyter (Python 3)
import csv
exampleFile = open('/Users/apdaga/Downloads/1.csv') # csv file downloaded from the ThingSpeak server
print (exampleFile)

exampleReader = csv.reader(exampleFile)
print (exampleReader)

exampleData = list(exampleReader)                   # create list of the data
print (exampleData)
Output In[1]:
<_io.TextIOWrapper name='/Users/apdaga/Downloads/1.csv' mode='r' encoding='UTF-8'>
<_csv.reader object at 0x10ad377b8>         
[['created_at', 'entry_id', 'field1'], ['2017-12-14 18:34:48 UTC', '1', '22.42'], ['2017-12-14 18:3         
5:04 UTC', '2', '22.42'], ['2017-12-14 18:35:21 UTC', '3', '22.42'], ['2017-12-14 18:35:37 UTC',         
'4', '26.67'], ['2017-12-14 18:35:54 UTC', '5', '27.27'], ['2017-12-14 18:36:10 UTC', '6', '24.85'],         
['2017-12-14 18:36:27 UTC', '7', '24.24'], ['2017-12-14 18:36:44 UTC', '8', '23.64'], ['2017-12-14 1         
8:37:00 UTC', '9', '23.33'], ['2017-12-14 18:37:17 UTC', '10', '23.33'], ['2017-12-14 18:37:34 UTC',         
'11', '23.33'], ['2017-12-14 18:37:50 UTC', '12', '23.33'], ['2017-12-14 18:38:07 UTC', '13', '23.3         
3'], ['2017-12-14 18:38:24 UTC', '14', '23.03'], ['2017-12-14 18:38:40 UTC', '15', '23.33'], ['2017-         
12-14 18:38:58 UTC', '16', '23.03'], ['2017-12-14 18:39:15 UTC', '17', '22.73'], ['2017-12-19 14:59:         
38 UTC', '18', '123'], ['2017-12-19 15:01:28 UTC', '19', '12'], ['2017-12-20 18:49:02 UTC', '20',         
'0.61'], ['2017-12-20 18:49:19 UTC', '21', '0.61'], ['2017-12-20 18:49:35 UTC', '22', '0.61'], ['201         
7-12-20 18:49:52 UTC', '23', '0.61'], ['2017-12-20 18:50:09 UTC', '24', '0.61'], ['2017-12-20 18:50:         
25 UTC', '25', '0.61'], ['2017-12-20 18:50:42 UTC', '26', '0.61'], ['2017-12-20 18:50:58 UTC', '27',         
'0.30'], ['2017-12-20 18:51:15 UTC', '28', '0.61'], ['2017-12-20 18:51:31 UTC', '29', '0.30'], ['201         
7-12-20 18:51:48 UTC', '30', '0.30'], ['2017-12-20 18:52:04 UTC', '31', '0.61'], ['2017-12-20 18:52:         
21 UTC', '32', '0.61'], ['2017-12-20 18:52:37 UTC', '33', '0.61'], ['2017-12-20 18:52:54 UTC', '34',         
'0.61'], ['2017-12-20 18:53:10 UTC', '35', '0.61'], ['2017-12-20 18:53:27 UTC', '36', '0.30'], ['201         
7-12-31 11:09:14 UTC', '38', '25'], ['2017-12-31 11:10:22 UTC', '39', '30'], ['2017-12-31 11:10:59 U         
TC', '40', '20'], ['2017-12-31 11:40:32 UTC', '41', '125'], ['2017-12-31 12:51:59 UTC', '42', '0.6         
1'], ['2017-12-31 12:52:20 UTC', '43', '0.30'], ['2017-12-31 12:52:40 UTC', '44', '0.61'], ['2017-12         
-31 12:53:00 UTC', '45', '0.61'], ['2017-12-31 12:53:20 UTC', '46', '0.30'], ['2017-12-31 12:53:40 U         
TC', '47', '0.61'], ['2017-12-31 12:54:00 UTC', '48', '0.61'], ['2017-12-31 12:54:20 UTC', '49', '0.         
61'], ['2017-12-31 12:54:40 UTC', '50', '0.30'], ['2017-12-31 12:55:00 UTC', '51', '1.21'], ['2017-1         
2-31 12:55:20 UTC', '52', '0.30'], ['2017-12-31 12:55:40 UTC', '53', '0.61'], ['2017-12-31 12:56:00         
UTC', '54', '0.30'], ['2017-12-31 12:56:20 UTC', '55', '0.61'], ['2017-12-31 12:56:40 UTC', '56',         
'0.30'], ['2017-12-31 12:57:00 UTC', '57', '0.91'], ['2017-12-31 12:57:20 UTC', '58', '0.30'], ['201         
7-12-31 12:57:40 UTC', '59', '1.21'], ['2017-12-31 12:58:00 UTC', '60', '1.52'], ['2017-12-31 12:58:         
20 UTC', '61', '0.61'], ['2017-12-31 12:58:40 UTC', '62', '0.91'], [], []]
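As a side note, the same read can be done with a `with` statement so the file is closed automatically, even if an error occurs mid-read. The sketch below uses `io.StringIO` with a couple of sample rows as a stand-in for the real file, so it runs even without `1.csv` present:

```python
import csv
import io

# Sample rows in the same format as 1.csv (a stand-in for the real file)
sample = ("created_at,entry_id,field1\n"
          "2017-12-14 18:34:48 UTC,1,22.42\n"
          "2017-12-14 18:35:04 UTC,2,22.42\n")

# io.StringIO behaves like an open text file, so the same csv.reader
# pattern applies; with the real file you would write:
#   with open('/Users/apdaga/Downloads/1.csv') as exampleFile:
with io.StringIO(sample) as exampleFile:
    exampleData = list(csv.reader(exampleFile))  # file is closed automatically here

print(exampleData)
```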


Code In[2]:
# In[2]:
HLsData = exampleData[1:len(exampleData)-2]         # remove the header row and the 2 empty rows at the end
print (HLsData)
Output In[2]:
[['2017-12-14 18:34:48 UTC', '1', '22.42'], ['2017-12-14 18:35:04 UTC', '2', '22.42'], ['2017-12-14
18:35:21 UTC', '3', '22.42'], ['2017-12-14 18:35:37 UTC', '4', '26.67'], ['2017-12-14 18:35:54 UTC',
'5', '27.27'], ['2017-12-14 18:36:10 UTC', '6', '24.85'], ['2017-12-14 18:36:27 UTC', '7', '24.24'],
['2017-12-14 18:36:44 UTC', '8', '23.64'], ['2017-12-14 18:37:00 UTC', '9', '23.33'], ['2017-12-14 1
8:37:17 UTC', '10', '23.33'], ['2017-12-14 18:37:34 UTC', '11', '23.33'], ['2017-12-14 18:37:50 UT
C', '12', '23.33'], ['2017-12-14 18:38:07 UTC', '13', '23.33'], ['2017-12-14 18:38:24 UTC', '14', '2
3.03'], ['2017-12-14 18:38:40 UTC', '15', '23.33'], ['2017-12-14 18:38:58 UTC', '16', '23.03'], ['20
17-12-14 18:39:15 UTC', '17', '22.73'], ['2017-12-19 14:59:38 UTC', '18', '123'], ['2017-12-19 15:0
1:28 UTC', '19', '12'], ['2017-12-20 18:49:02 UTC', '20', '0.61'], ['2017-12-20 18:49:19 UTC', '21',
'0.61'], ['2017-12-20 18:49:35 UTC', '22', '0.61'], ['2017-12-20 18:49:52 UTC', '23', '0.61'], ['201
7-12-20 18:50:09 UTC', '24', '0.61'], ['2017-12-20 18:50:25 UTC', '25', '0.61'], ['2017-12-20 18:50:
42 UTC', '26', '0.61'], ['2017-12-20 18:50:58 UTC', '27', '0.30'], ['2017-12-20 18:51:15 UTC', '28',
'0.61'], ['2017-12-20 18:51:31 UTC', '29', '0.30'], ['2017-12-20 18:51:48 UTC', '30', '0.30'], ['201
7-12-20 18:52:04 UTC', '31', '0.61'], ['2017-12-20 18:52:21 UTC', '32', '0.61'], ['2017-12-20 18:52:
37 UTC', '33', '0.61'], ['2017-12-20 18:52:54 UTC', '34', '0.61'], ['2017-12-20 18:53:10 UTC', '35',
'0.61'], ['2017-12-20 18:53:27 UTC', '36', '0.30'], ['2017-12-31 11:09:14 UTC', '38', '25'], ['2017-
12-31 11:10:22 UTC', '39', '30'], ['2017-12-31 11:10:59 UTC', '40', '20'], ['2017-12-31 11:40:32 UT
C', '41', '125'], ['2017-12-31 12:51:59 UTC', '42', '0.61'], ['2017-12-31 12:52:20 UTC', '43', '0.3
0'], ['2017-12-31 12:52:40 UTC', '44', '0.61'], ['2017-12-31 12:53:00 UTC', '45', '0.61'], ['2017-12
-31 12:53:20 UTC', '46', '0.30'], ['2017-12-31 12:53:40 UTC', '47', '0.61'], ['2017-12-31 12:54:00 U
TC', '48', '0.61'], ['2017-12-31 12:54:20 UTC', '49', '0.61'], ['2017-12-31 12:54:40 UTC', '50', '0.
30'], ['2017-12-31 12:55:00 UTC', '51', '1.21'], ['2017-12-31 12:55:20 UTC', '52', '0.30'], ['2017-1
2-31 12:55:40 UTC', '53', '0.61'], ['2017-12-31 12:56:00 UTC', '54', '0.30'], ['2017-12-31 12:56:20
UTC', '55', '0.61'], ['2017-12-31 12:56:40 UTC', '56', '0.30'], ['2017-12-31 12:57:00 UTC', '57',
'0.91'], ['2017-12-31 12:57:20 UTC', '58', '0.30'], ['2017-12-31 12:57:40 UTC', '59', '1.21'], ['201
7-12-31 12:58:00 UTC', '60', '1.52'], ['2017-12-31 12:58:20 UTC', '61', '0.61'], ['2017-12-31 12:58:
40 UTC', '62', '0.91']]
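An alternative to slicing the header off afterwards is to skip it while reading, by calling `next()` on the reader object. A minimal sketch with inline sample data standing in for `1.csv`, which also drops the empty rows that the export ends with:

```python
import csv
import io

# Inline sample data shaped like 1.csv, ending with a blank line
sample = ("created_at,entry_id,field1\n"
          "2017-12-14 18:34:48 UTC,1,22.42\n"
          "\n")

exampleReader = csv.reader(io.StringIO(sample))
next(exampleReader)                       # skip the header row instead of slicing
# keep only non-empty rows (the export ends with blank lines)
HLsData = [row for row in exampleReader if row]
print(HLsData)
```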




Code In[3]:
# In[3]:
a,b,tData = zip(*HLsData)                           # unpack the columns (transpose the rows)
#print (a)
#print (b)
print (tData)
Output In[3]:
('22.42', '22.42', '22.42', '26.67', '27.27', '24.85', '24.24', '23.64', '23.33', '23.33', '23.33',
'23.33', '23.33', '23.03', '23.33', '23.03', '22.73', '123', '12', '0.61', '0.61', '0.61', '0.61',
'0.61', '0.61', '0.61', '0.30', '0.61', '0.30', '0.30', '0.61', '0.61', '0.61', '0.61', '0.61', '0.3
0', '25', '30', '20', '125', '0.61', '0.30', '0.61', '0.61', '0.30', '0.61', '0.61', '0.61', '0.30',
'1.21', '0.30', '0.61', '0.30', '0.61', '0.30', '0.91', '0.30', '1.21', '1.52', '0.61', '0.91')
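`zip(*HLsData)` transposes the table: each column of the row list becomes one tuple. A small toy example with three sample rows makes this visible:

```python
# Toy rows shaped like HLsData: [timestamp, entry_id, temperature]
rows = [
    ['2017-12-14 18:34:48 UTC', '1', '22.42'],
    ['2017-12-14 18:35:04 UTC', '2', '22.42'],
    ['2017-12-14 18:35:21 UTC', '3', '26.67'],
]

# zip(*rows) transposes the table: each column becomes one tuple
timestamps, ids, temps = zip(*rows)
print(temps)    # ('22.42', '22.42', '26.67')
```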


Code In[4]:
# In[4]:
sData = list(tData)                                 # tuple to list conversion
print (sData)
Output In[4]:
['22.42', '22.42', '22.42', '26.67', '27.27', '24.85', '24.24', '23.64', '23.33', '23.33', '23.33',
'23.33', '23.33', '23.03', '23.33', '23.03', '22.73', '123', '12', '0.61', '0.61', '0.61', '0.61',
'0.61', '0.61', '0.61', '0.30', '0.61', '0.30', '0.30', '0.61', '0.61', '0.61', '0.61', '0.61', '0.3
0', '25', '30', '20', '125', '0.61', '0.30', '0.61', '0.61', '0.30', '0.61', '0.61', '0.61', '0.30',
'1.21', '0.30', '0.61', '0.30', '0.61', '0.30', '0.91', '0.30', '1.21', '1.52', '0.61', '0.91']


Code In[5]:
# In[5]:
data = [float(i) for i in sData]                    # String to float conversion
print (data)
Output In[5]:
[22.42, 22.42, 22.42, 26.67, 27.27, 24.85, 24.24, 23.64, 23.33, 23.33, 23.33, 23.33, 23.33, 23.03, 
23.33, 23.03, 22.73, 123.0, 12.0, 0.61, 0.61, 0.61, 0.61, 0.61, 0.61, 0.61, 0.3, 0.61, 0.3, 0.3, 0.61,
0.61, 0.61, 0.61, 0.61, 0.3, 25.0, 30.0, 20.0, 125.0, 0.61, 0.3, 0.61, 0.61, 0.3, 0.61, 0.61, 0.61,
0.3, 1.21, 0.3, 0.61, 0.3, 0.61, 0.3, 0.91, 0.3, 1.21, 1.52, 0.61, 0.91]
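Steps In[3] to In[5] can also be collapsed into a single list comprehension that picks the third column out of each row and converts it to float in one pass. A sketch with two sample rows standing in for `HLsData`:

```python
# Two sample rows standing in for HLsData
rows = [
    ['2017-12-14 18:34:48 UTC', '1', '22.42'],
    ['2017-12-14 18:35:37 UTC', '4', '26.67'],
]

# Pick column 3 (index 2) and convert to float in one pass
data = [float(row[2]) for row in rows]
print(data)   # [22.42, 26.67]
```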


Code In[6]:
# In[6]:
## Prompt the user for the lower and upper bounds
## and convert the input strings to float:
lower = float(input("Enter Lower Bound: "))
upper = float(input("Enter Upper Bound: "))
Output In[6]:
Enter Lower Bound: 0
Enter Upper Bound: 23
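If you want to guard against the user entering the bounds in the wrong order, the prompt logic can be wrapped in a small helper. This is only a sketch: the `parse_bounds` name is made up here, and the inputs are hard-coded strings (what `input()` would return) so the example runs non-interactively:

```python
# parse_bounds is a hypothetical helper, not part of the original code
def parse_bounds(lower_text, upper_text):
    lower = float(lower_text)
    upper = float(upper_text)
    if lower > upper:
        raise ValueError("Lower bound must not exceed upper bound")
    return lower, upper

lower, upper = parse_bounds("0", "23")
print(lower, upper)   # 0.0 23.0
```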




Code In[7]:
# In[7]:
## Create and initialize variables for max, min, sum and count.
## Initialize max/min to -/+ infinity so the first in-range value sets them,
## and avoid shadowing Python's built-in max, min and sum functions.
maxVal = float('-inf')
minVal = float('inf')
total = 0
count = 0

## Processing on the Data:
# Use a for loop to examine each element in the data set
for v in data:
    # Only process data if in [lower, upper]
    if v >= lower and v <= upper:
        # update maximum if larger
        if v > maxVal:
            maxVal = v
        # update minimum if smaller
        if v < minVal:
            minVal = v

        total += v
        count += 1

# Print the max, min, and average of the values in the range
#print (data)
print ("Maximum: ", maxVal)
print ("Minimum: ", minVal)
print ("Average: ", total/count)
Output In[7]:
Maximum:  22.73
Minimum:  0.3
Average:  3.283409090909096
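For comparison, the same statistics can be computed with Python's built-in `min`, `max` and `sum` functions on a filtered list. The `values` list below is a small stand-in for the `data` list built above:

```python
# A small stand-in for the 'data' list built above
values = [22.42, 26.67, 123.0, 0.61, 0.3, 1.52]
lower, upper = 0.0, 23.0

# Filter once, then apply the built-in functions
in_range = [v for v in values if lower <= v <= upper]
print("Maximum:", max(in_range))   # 22.42
print("Minimum:", min(in_range))   # 0.3
print("Average:", sum(in_range) / len(in_range))
```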


--------------------------------------------------------------------------------

Click here to see more codes for Raspberry Pi 3 and similar Family.
&
Click here to see more codes for NodeMCU ESP8266 and similar Family.
&
Click here to see more codes for Arduino Mega (ATMega 2560) and similar Family.

Feel free to ask doubts in the comment section. I will try my best to solve them.
If you find this post helpful, please like, comment and share it.
That is the simplest way to encourage me to keep doing such work.

Thanks and Regards,
-Akshay P. Daga

