#biverie
Text
Data Management and Visualization
Week 4: Creating graphs for your data
Code
Univariate plots
Bivariate plots
Text
Data Management & Visualization: Week 4
Output / Figures:
Code
Import Libraries
import pandas
import numpy
import seaborn
import matplotlib.pyplot as plt
bug fix for display formats to avoid run time errors
pandas.set_option('display.float_format', lambda x:'%f'%x)
Set pandas to show all columns and rows in DataFrames
pandas.set_option('display.max_columns', None)
pandas.set_option('display.max_rows', None)
Import gapminder.csv
data = pandas.read_csv('gapminder.csv', low_memory=False)
Replace all empty entries with NaN
data = data.replace(r'^\s*$', numpy.nan, regex=True)
Extract relevant variables from original dataset and save it in subdata set
print('List of extracted variables in subset')
subdata = data[['incomeperperson', 'lifeexpectancy', 'suicideper100th']]
Save a backup copy of the reduced dataset
subdata2 = subdata.copy()
Convert all entries to numeric format
subdata2['incomeperperson'] = pandas.to_numeric(subdata2['incomeperperson'])
subdata2['lifeexpectancy'] = pandas.to_numeric(subdata2['lifeexpectancy'])
subdata2['suicideper100th'] = pandas.to_numeric(subdata2['suicideper100th'])
All rows containing NaN (entries that were previously empty) are deleted from the subdata set
subdata2 = subdata2.dropna()
print(subdata2)
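The cleaning steps above can be tried on a small example before running them on the full file. The sketch below is only an illustration: the `raw` frame and its values are made up and stand in for gapminder.csv, and the names `cleaned` and `complete` are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for gapminder.csv (values are made up).
raw = pd.DataFrame({
    "incomeperperson": ["1200.5", " ", "8300"],
    "lifeexpectancy": ["61.2", "74.0", ""],
})

# Blank or whitespace-only strings become NaN (not 0).
cleaned = raw.replace(r"^\s*$", np.nan, regex=True)

# Convert every column to a numeric dtype; NaN stays NaN.
cleaned = cleaned.apply(pd.to_numeric)

# Keep only the rows in which every variable is present.
complete = cleaned.dropna()
print(complete)
```

Only the first row survives here, because each of the other two rows has one blank entry.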
Describe statistical distribution of variable values
print('Statistics on "Income per Person"')
desc_income = subdata2['incomeperperson'].describe()
print(desc_income)
print('Statistics on "Life Expectancy"')
desc_lifeexp = subdata2['lifeexpectancy'].describe()
print(desc_lifeexp)
print('Statistics on "Suicide Rate per 100th"')
desc_suicide = subdata2['suicideper100th'].describe()
print(desc_suicide)
Identify min & max values within each column
print('Minimum & Maximum Income')
min_income = min(subdata2['incomeperperson'])
print(min_income)
max_income = max(subdata2['incomeperperson'])
print(max_income)
print('')
print('Minimum & Maximum Life Expectancy')
min_lifeexp = min(subdata2['lifeexpectancy'])
print(min_lifeexp)
max_lifeexp = max(subdata2['lifeexpectancy'])
print(max_lifeexp)
print('')
print('Minimum & Maximum Suicide Rate')
min_srate = min(subdata2['suicideper100th'])
print(min_srate)
max_srate = max(subdata2['suicideper100th'])
print(max_srate)
print('')
Split income into 11 equal-sized quantile groups (despite the variable name and labels, 11 groups are not deciles)
subdata2['INCGROUPS10'] = pandas.qcut(subdata2.incomeperperson, 11, labels=["1=0%tile","2=10%tile","3=20%tile","4=30%tile","5=40%tile","6=50%tile","7=60%tile","8=70%tile","9=80%tile","10=90%tile","11=100%tile"])
inc_dist_percent = subdata2['INCGROUPS10'].value_counts(sort=False, normalize=True, dropna=True)
subdata2['INCGROUPS10'] = subdata2['INCGROUPS10'].astype('category')
print(inc_dist_percent)
print(subdata2)
Split income into fixed groups of 5,000, each labelled with its midpoint
subdata2['INCGROUPS5K'] = pandas.cut(subdata2.incomeperperson, [0, 5000, 10000, 15000, 20000, 25000, 30000, 35000, 40000, 45000, 50000, 55000, 60000])
subdata2['INCGROUPS5K'] = subdata2['INCGROUPS5K'].astype('category')
subdata2['INCGROUPS5K'] = subdata2['INCGROUPS5K'].cat.rename_categories(["2.5k", "7.5k", "12.5k", "17.5k", "22.5k", "27.5k", "32.5k", "37.5k", "42.5k", "47.5k", "52.5k", "57.5k"])
inc_dist_dollar = subdata2['INCGROUPS5K'].value_counts(sort=False, normalize=True, dropna=True)
subdata2['INCGROUPS5K'] = subdata2['INCGROUPS5K'].astype('category')
print(inc_dist_dollar)
Split life expectancy into groups of 5 years
subdata2['LIFEEXPGROUPS5Y'] = pandas.cut(subdata2.lifeexpectancy, [45, 50, 55, 60, 65, 70, 75, 80, 85, 90])
lifeexp_dist = subdata2['LIFEEXPGROUPS5Y'].value_counts(sort=False, normalize=True, dropna=True)
subdata2['LIFEEXPGROUPS5Y'] = subdata2['LIFEEXPGROUPS5Y'].astype('category')
print(lifeexp_dist)
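The grouping steps above use two different binning functions, and the difference is easy to miss. As a rough sketch on made-up incomes (the `income` values and the variable names below are hypothetical, not taken from the dataset): `pandas.qcut` picks bin edges so that each group holds the same number of rows, while `pandas.cut` uses the edges you give it and lets the counts fall where they may.

```python
import pandas as pd

# Hypothetical incomes for ten countries (made-up, right-skewed values).
income = pd.Series([300, 500, 800, 1200, 2000, 3500, 6000, 9000, 14000, 19000])

# qcut: five quantile groups, each holding the same number of countries.
quintiles = pd.qcut(income, 5, labels=False)
quintile_counts = quintiles.value_counts(sort=False)

# cut: fixed 5,000-wide intervals; the counts depend on the data.
fixed = pd.cut(income, [0, 5000, 10000, 15000, 20000])
fixed_counts = fixed.value_counts(sort=False)

print(quintile_counts)
print(fixed_counts)
```

With skewed data like this, the fixed-width groups are very uneven, which is worth keeping in mind when reading bar charts built on them: the rightmost bars may each rest on only one or two countries.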
The following cross table compares income and life expectancy of different groups
print('First, simplified comparison of income and life expectancy')
comparison = pandas.crosstab(subdata2['INCGROUPS5K'], subdata2['LIFEEXPGROUPS5Y'])
print(comparison)
Univariate plots
seaborn.countplot(x='INCGROUPS5K', data=subdata2)
plt.xlabel('Average Income of Income Group')
plt.title('Count distribution of Income per Person')
Bivariate plots
Bar charts of group means without error bars (in seaborn 0.12 and later, use errorbar=None in place of ci=None)
seaborn.catplot(x='INCGROUPS5K', y='lifeexpectancy', data=subdata2, kind="bar", ci=None)
plt.xlabel('Income Group')
plt.ylabel('Life Expectancy')
seaborn.catplot(x='INCGROUPS5K', y='suicideper100th', data=subdata2, kind="bar", ci=None)
plt.xlabel('Income Group')
plt.ylabel('Suicide Rate per 100,000 persons')
Text
someonE PLEASE SEND ME SOME BEVCHIE OR BILLVERLY OR BICHIE OR BIVERIE SMUT ID APPRECIATE YOU FOREVER
Text
Week four assignment
SAS code for assignment week four.
LIBNAME mydata "/courses/d1406ae5ba27fe300 " access=readonly;
data new;
set mydata.gapminder;
keep country Hivrate lifeexpectancy employrate emp hiv le;
/* data management for lifeexpectancy */
if lifeexpectancy < 45 then le = "0-45";
if lifeexpectancy >= 45 and lifeexpectancy < 55 then le = "1-55";
if lifeexpectancy >= 55 and lifeexpectancy < 65 then le = "2-65";
if lifeexpectancy >= 65 and lifeexpectancy < 75 then le = "3-75";
if lifeexpectancy >= 75 then le = "5-80";
/* data management for hivrate */
if hivrate < 0.1 then hiv = "1";
if hivrate >= 1 and hivrate < 5 then hiv = "2";
if hivrate >= 5 and hivrate < 10 then hiv = "3";
if hivrate >= 10 and hivrate < 30 then hiv = "4";
if hivrate >= 30 then hiv = "5";
/* data management for employrate */
if employrate < 48 then emp = "1";
if employrate >= 68 and employrate < 80 then emp = "2";
if employrate >= 80 then emp = "3";
run;
/* frequency tables */
proc freq; table le hiv emp; run;
proc gchart; vbar le / discrete type=pct width=8; run;
proc gchart; vbar hiv / discrete type=pct width=8; run;
proc gchart; vbar emp / discrete type=pct width=8; run;
/* bivariate graphs */
proc gplot; plot employrate*lifeexpectancy; run;
Please click the link below for a clearer view of my output. Thanks.
https://drive.google.com/file/d/0B_yd05qSDRPEMkFaVVh1ZnNhb28/view
The univariate graph of life expectancy rate:
The graph above is unimodal, with its highest peak at the median category of 65 to 70 years of life expectancy. It appears to be skewed to the left, as the higher categories have higher frequencies than the lower categories and the tail runs toward the low end.
The univariate graph of HIV rate:
The graph above is unimodal, with its highest peak at the median category, an HIV rate of 0.1 to 1%. It appears to be skewed to the right, as the lower categories have higher frequencies than the higher categories.
The univariate graph of employment rate:
This graph is unimodal, with its highest peak at the category of 50 to 60% employment. It appears to be skewed to the left, but the skew is too slight to state with confidence.
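When the direction of skew is hard to judge by eye, a sample skewness statistic can serve as a check. The sketch below is only an illustration on made-up numbers (both series are hypothetical, not values from the Gapminder data) and uses pandas' `Series.skew()`: a negative result means the long tail is on the left, a positive result means it is on the right.

```python
import pandas as pd

# Made-up values bunched near the top with a few low ones: long left tail.
left_tailed = pd.Series([48, 55, 62, 68, 70, 71, 72, 73, 74, 75])

# Made-up values bunched near zero with a few large ones: long right tail.
right_tailed = pd.Series([0.1, 0.1, 0.2, 0.3, 0.4, 0.5, 0.8, 1.5, 5.0, 25.0])

left_skew = left_tailed.skew()    # negative for a left-skewed sample
right_skew = right_tailed.skew()  # positive for a right-skewed sample

print(left_skew, right_skew)
```

A value close to zero would support describing a distribution as roughly symmetric rather than skewed in either direction.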
The graph below plots the HIV rate of a country against the country's corresponding life expectancy. We can see that the scatter graph does show a clear relationship/trend between the two variables.
The graph below plots the employment rate of a country against the country's corresponding life expectancy. We can see that the scatter graph does not show a clear relationship/trend between the two variables.
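A visual impression from a scatter graph can be backed up with a correlation coefficient. The following is a minimal sketch, assuming a small made-up table (the `toy` frame and its values are hypothetical and are not the Gapminder figures); it uses pandas' `Series.corr()`, which computes the Pearson correlation by default.

```python
import pandas as pd

# Hypothetical values for six countries (made up for illustration only).
toy = pd.DataFrame({
    "hivrate": [0.1, 0.5, 1.0, 5.0, 12.0, 25.0],
    "lifeexpectancy": [78, 74, 70, 60, 52, 48],
    "employrate": [60, 45, 70, 52, 66, 58],
})

# Pearson r close to -1 or +1 indicates a strong linear trend;
# r close to 0 indicates no clear linear trend.
r_hiv = toy["hivrate"].corr(toy["lifeexpectancy"])
r_emp = toy["employrate"].corr(toy["lifeexpectancy"])

print(r_hiv, r_emp)
```

In this toy table the HIV rate is strongly and negatively correlated with life expectancy, while the employment rate is only weakly correlated with it. Pearson's r only measures linear association, so a curved pattern in the scatter graph still needs to be checked by eye.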
For a clearer view of the output, you can follow the link above.
Thanks in advance for your cooperation.
Photo
#Repost @julia__brauer with @repostapp ・・・ Sooo, as the saying goes, "while someone else is dragging their feet, we come in on the upbeat") One day in my native Novosibirsk and then I'm off to warm my flippers in the sun!☀️🌴 May these 24 hours on the road fly by in an instant so that I end up right next to @bivery !) I have also realized how good it is when your favorite friends really know what they are doing with their hands! Thanks to @smirnova_muah and @olga_beauty_lashes I can skip makeup for the whole vacation, because their golden hands gave me wonderful brows and lashes!!! Thank you, my sorceresses❤💋 . . . You're welcome, my dear😘 Enjoy your vacation❤️ #smirnova_brows #smirnova_отзывы #browmaster #бровинск (at Ploshchad Lenina)