2020年11月22日 星期日

[Python]Dataframe GruopBy sum

In [19]:
import pandas as pd
In [20]:
df = pd.read_csv('weather_by_cities.csv')
Out[20]:
day city temperature windspeed event
0 1/1/2017 new york 32 6 Rain
1 1/2/2017 new york 36 7 Sunny
2 1/3/2017 new york 28 12 Snow
3 1/4/2017 new york 33 7 Sunny
4 1/1/2017 mumbai 90 5 Sunny
5 1/2/2017 mumbai 85 12 Fog
6 1/3/2017 mumbai 87 15 Fog
7 1/4/2017 mumbai 92 5 Rain
8 1/1/2017 paris 45 20 Sunny
9 1/2/2017 paris 50 13 Cloudy
10 1/3/2017 paris 54 8 Cloudy
11 1/4/2017 paris 42 10 Cloudy
In [55]:
g = df.groupby('city')
g
Out[55]:
<pandas.core.groupby.generic.DataFrameGroupBy object at 0x00000200CAE08588>

Group BY

Group BY

In [56]:
for city, city_df in g:
    print(city)
    print(city_df)
mumbai
        day    city  temperature  windspeed  event
4  1/1/2017  mumbai           90          5  Sunny
5  1/2/2017  mumbai           85         12    Fog
6  1/3/2017  mumbai           87         15    Fog
7  1/4/2017  mumbai           92          5   Rain
new york
        day      city  temperature  windspeed  event
0  1/1/2017  new york           32          6   Rain
1  1/2/2017  new york           36          7  Sunny
2  1/3/2017  new york           28         12   Snow
3  1/4/2017  new york           33          7  Sunny
paris
         day   city  temperature  windspeed   event
8   1/1/2017  paris           45         20   Sunny
9   1/2/2017  paris           50         13  Cloudy
10  1/3/2017  paris           54          8  Cloudy
11  1/4/2017  paris           42         10  Cloudy
In [48]:
g = df.groupby('city').sum()
Out[48]:
temperature windspeed
city
mumbai 354 37
new york 129 32
paris 191 51
In [23]:
# SELECT * from city_data GROUP BY city
g.get_group('mumbai')
Out[23]:
day city temperature windspeed event
4 1/1/2017 mumbai 90 5 Sunny
5 1/2/2017 mumbai 85 12 Fog
6 1/3/2017 mumbai 87 15 Fog
7 1/4/2017 mumbai 92 5 Rain
In [33]:
g.max()
Out[33]:
day temperature windspeed event
city
mumbai 1/4/2017 92 15 Sunny
new york 1/4/2017 36 12 Sunny
paris 1/4/2017 54 20 Sunny
In [35]:
g.mean()
Out[35]:
temperature windspeed
city
mumbai 88.50 9.25
new york 32.25 8.00
paris 47.75 12.75
In [38]:
g.describe()
Out[38]:
temperature windspeed
count mean std min 25% 50% 75% max count mean std min 25% 50% 75% max
city
mumbai 4.0 88.50 3.109126 85.0 86.50 88.5 90.50 92.0 4.0 9.25 5.057997 5.0 5.00 8.5 12.75 15.0
new york 4.0 32.25 3.304038 28.0 31.00 32.5 33.75 36.0 4.0 8.00 2.708013 6.0 6.75 7.0 8.25 12.0
paris 4.0 47.75 5.315073 42.0 44.25 47.5 51.00 54.0 4.0 12.75 5.251984 8.0 9.50 11.5 14.75 20.0

[Python]Datafram merge (inner join, outer join, left, right)

Pandas Merge Tutorial

Basic Merge Using a Dataframe Column

In [28]:
import pandas as pd
df1 = pd.DataFrame({
    "city": ["new york","chicago","orlando"],
    "temperature": [21,14,35],
})
df1
Out[28]:
city temperature
0 new york 21
1 chicago 14
2 orlando 35
In [29]:
df2 = pd.DataFrame({
    "city": ["chicago","new york","orlando"],
    "humidity": [65,68,75],
})
df2
Out[29]:
city humidity
0 chicago 65
1 new york 68
2 orlando 75
In [30]:
df3 = pd.merge(df1, df2, on="city")
df3
Out[30]:
city temperature humidity
0 new york 21 68
1 chicago 14 65
2 orlando 35 75

Type Of DataBase Joins

<img src="db_joins.jpg" height="800", width="800">

In [31]:
df1 = pd.DataFrame({
    "city": ["new york","chicago","orlando", "baltimore"],
    "temperature": [21,14,35, 38],
})
df1
Out[31]:
city temperature
0 new york 21
1 chicago 14
2 orlando 35
3 baltimore 38
In [32]:
df2 = pd.DataFrame({
    "city": ["chicago","new york","san diego"],
    "humidity": [65,68,71],
})
df2
Out[32]:
city humidity
0 chicago 65
1 new york 68
2 san diego 71
In [33]:
df3=pd.merge(df1,df2,on="city",how="inner")
df3
Out[33]:
city temperature humidity
0 new york 21 68
1 chicago 14 65
In [34]:
df3=pd.merge(df1,df2,on="city",how="outer")
df3
Out[34]:
city temperature humidity
0 new york 21.0 68.0
1 chicago 14.0 65.0
2 orlando 35.0 NaN
3 baltimore 38.0 NaN
4 san diego NaN 71.0
In [35]:
df3=pd.merge(df1,df2,on="city",how="left")
df3
Out[35]:
city temperature humidity
0 new york 21 68.0
1 chicago 14 65.0
2 orlando 35 NaN
3 baltimore 38 NaN
In [36]:
df3=pd.merge(df1,df2,on="city",how="right")
df3
Out[36]:
city temperature humidity
0 new york 21.0 68
1 chicago 14.0 65
2 san diego NaN 71

indicator flag

In [37]:
df3=pd.merge(df1,df2,on="city",how="outer",indicator=True)
df3
Out[37]:
city temperature humidity _merge
0 new york 21.0 68.0 both
1 chicago 14.0 65.0 both
2 orlando 35.0 NaN left_only
3 baltimore 38.0 NaN left_only
4 san diego NaN 71.0 right_only

suffixes

In [38]:
df1 = pd.DataFrame({
    "city": ["new york","chicago","orlando", "baltimore"],
    "temperature": [21,14,35,38],
    "humidity": [65,68,71, 75]
})
df1
Out[38]:
city humidity temperature
0 new york 65 21
1 chicago 68 14
2 orlando 71 35
3 baltimore 75 38
In [39]:
df2 = pd.DataFrame({
    "city": ["chicago","new york","san diego"],
    "temperature": [21,14,35],
    "humidity": [65,68,71]
})
df2
Out[39]:
city humidity temperature
0 chicago 65 21
1 new york 68 14
2 san diego 71 35
In [40]:
df3= pd.merge(df1,df2,on="city",how="outer", suffixes=('_first','_second'))
df3
Out[40]:
city humidity_first temperature_first humidity_second temperature_second
0 new york 65.0 21.0 68.0 14.0
1 chicago 68.0 14.0 65.0 21.0
2 orlando 71.0 35.0 NaN NaN
3 baltimore 75.0 38.0 NaN NaN
4 san diego NaN NaN 71.0 35.0

join

In [58]:
df1 = pd.DataFrame({
    "city": ["new york","chicago","orlando"],
    "temperature": [21,14,35],
})
df1.set_index('city',inplace=True)
df1
Out[58]:
temperature
city
new york 21
chicago 14
orlando 35
In [59]:
df2 = pd.DataFrame({
    "city": ["chicago","new york","orlando"],
    "humidity": [65,68,75],
})
df2.set_index('city',inplace=True)
df2
Out[59]:
humidity
city
chicago 65
new york 68
orlando 75
In [60]:
df1.join(df2,lsuffix='_l', rsuffix='_r')
Out[60]:
temperature humidity
city
new york 21 68
chicago 14 65
orlando 35 75

2020年11月5日 星期四

python-dotenv

安裝python-dotenv套件
$ pip install python-dotenv

.env檔 內容↓↓↓
IP=10.15.10.47
PORT=8080
USER_NAME="ADMIN"
PASSWORD=P@ssw0rd


test_ditenv.py
import os
from dotenv import load_dotenv
load_dotenv()
IP = os.getenv('IP')
PORT = os.getenv('PORT')
USER_NAME = os.getenv('USER_NAME')
PASSWORD = os.getenv('PASSWORD')

print(f'IP => {IP}   ', type(IP))
print(f'PORT => {PORT}   ', type(PORT))
print(f'USER_NAME => {USER_NAME}   ', type(USER_NAME))
print(f'PASSWORD => {PASSWORD}   ', type(PASSWORD))
IP => 10.15.10.47 <class 'str'>
PORT => 8080 <class 'str'>
USER_NAME => ADMIN <class 'str'>
PASSWORD => P@ssw0rd <class 'str'>