Data Management
jeremypjwtest · 5 years ago
Week 4
LIBNAME mydata "/courses/d1406ae5ba27fe300" access=readonly;
DATA new; set mydata.gapminder;
LABEL country="Country"
incomeperperson="Income (per person)"
suicideper100th="Suicide Rate"
relectricperperson="Electricity (per person)";
Country_Category="                    ";
Income_Category="                    ";
Electricity_Category="                    ";
Suicide_Category="                    ";
/* Asian Countries */
IF country="China" or country="India" or country="Indonesia" or country="Pakistan" or country="Bangladesh" or country="Japan" or country="Philippines" or country="Vietnam"
or country="Turkey" or country="Iran" or country="Thailand" or country="Myanmar" or country="Korea, Dem. Rep." or country="Iraq" or country="Afghanistan" or country="Saudi Arabia"
or country="Uzbekistan" or country="Malaysia" or country="Yemen" or country="Nepal" or country="Korea, Rep." or country="Sri Lanka" or country="Kazakhstan" or country="Syria" or country="Cambodia"
or country="Jordan" or country="Azerbaijan" or country="United Arab Emirates" or country="Tajikistan" or country="Israel" or country="Laos" or country="Lebanon" or country="Krygyzstan"
or country="Turkmenistan" or country="Singapore" or country="Oman" or country="State of Palestine" or country="Kuwait" or country="Georgia" or country="Mongolia" or country="Armenia" or country="Qatar"
or country="Bahrain" or country="Timor-Leste" or country="Cyprus" or country="Bhutan" or country="Maldives" or country="Brunei" THEN Country_Category="Asian Countries";
ELSE Country_Category="Other Countries";
/* Income Category */
if incomeperperson=. then Income_Category="1. Unknown";
else if incomeperperson LE 1035 then Income_Category= "2. $0 - $1,035";
else if incomeperperson LE 4045 then Income_Category= "3. $1,036 - $4,045 ";
else if incomeperperson LE 12535 then Income_Category= "4. $4,046 - $12,535";
else if incomeperperson GT 12535 then Income_Category= "5. Over $12,535";
/* Suicide Category */
if suicideper100th=. then Suicide_Category="1. Unknown";
else if suicideper100th LE 10 then Suicide_Category="2. Very Low";
else if suicideper100th LE 20 then Suicide_Category="3. Low";
else if suicideper100th LE 30 then Suicide_Category="4. High";
else if suicideper100th GT 30 then Suicide_Category="5. Very High";
/* Electricity_Category */
if relectricperperson=. then Electricity_Category="1. Unknown";
else if relectricperperson LE 2000 then Electricity_Category="2. Very Low";
else if relectricperperson LE 4000 then Electricity_Category="3. Low";
else if relectricperperson LE 6000 then Electricity_Category="4. Moderate";
else if relectricperperson LE 8000 then Electricity_Category="5. High";
else if relectricperperson GT 8000 then Electricity_Category="6. Very High";
PROC SORT; by Country_Category;
PROC SORT; by Income_Category;
PROC SORT; by Suicide_Category;
PROC SORT; by Electricity_Category;
PROC FREQ; TABLES Country_Category Income_Category Suicide_Category Electricity_Category;
PROC UNIVARIATE; VAR incomeperperson suicideper100th relectricperperson;
/* Univariate Graphs */
PROC GCHART; VBAR Income_Category Suicide_Category Electricity_Category;
/* Bivariate Graphs */
PROC GCHART; VBAR Income_Category/DISCRETE TYPE=MEAN SUMVAR=suicideper100th;
PROC GCHART; VBAR Income_Category/DISCRETE TYPE=MEAN SUMVAR=relectricperperson;
RUN;
LOG: No error
Univariate Graphs
1. Income Category
As seen in the graph, just over 50 countries each fall into the low-income and lower-middle-income categories, with lower-middle income being the largest group. About 40 countries each fall into the upper-middle-income and high-income categories.
2. Suicide Category
This graph shows that most countries have a very low or low suicide rate, whereas very few countries have a very high suicide rate.
3. Electricity Category
The graph shows that the majority of the countries have very low electricity consumption.
Bivariate Graphs
1. Income (per person) & Suicide rate
My hypothesis was that there is a negative relationship between these variables. The chart is consistent with my hypothesis, in that there is a negative relationship between income per person and suicide rate. Nonetheless, the relationship is not very strong. As seen in the chart, the mean suicide rate is highest for the low-income category, the lower-middle-income category is second, and the suicide rate decreases as income increases. However, the mean suicide rate does not differ greatly between income levels.
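To put numbers on how small these differences are, the means behind the bars can be tabulated directly. The following is a minimal sketch, assuming the data set new created by the DATA step above. The statistics requested (N, MEAN, STD) are my own choice for illustration and are not part of the original analysis.

```sas
/* Sketch: tabulate the mean suicide rate for each income category. */
/* Assumes the data set NEW built by the DATA step above.           */
PROC MEANS DATA=new N MEAN STD MAXDEC=2;
  CLASS Income_Category;   /* one row per income category      */
  VAR suicideper100th;     /* suicide rate per 100,000 people  */
RUN;
```

Reading the MEAN column from top to bottom gives the same comparison as the bar heights, with the number of countries in each category alongside.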
2. Income (per person) & Electricity consumption (per person)
My hypothesis was that there is a positive relationship between these variables. The chart supports this, as it shows a strong positive relationship between income per person and the amount of electricity consumed per person. As seen in the chart, as the income level increases, the amount of electricity consumed increases as well.
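The strength of this relationship can also be checked on the original quantitative variables rather than on the income categories. The following is a minimal sketch, assuming the same data set new. PROC CORR reports the Pearson correlation by default, and countries missing either value do not contribute to it.

```sas
/* Sketch: Pearson correlation between income and electricity use. */
/* Assumes the data set NEW built by the DATA step above.          */
PROC CORR DATA=new;
  VAR incomeperperson relectricperperson;
RUN;
```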
jeremypjwtest · 5 years ago
Week 3
I continuously updated and improved the code as I learned more from the course.
My code:
LIBNAME mydata "/courses/d1406ae5ba27fe300" access=readonly;
DATA new; set mydata.gapminder;
LABEL incomeperperson="Income (per person)"
suicideper100th="Suicide Rate"
relectricperperson="Electricity (per person)";
Country_Category="                    ";
Income_Category="                    ";
Electricity_Category="                    ";
Suicide_Category="                    ";
/* Asian Countries */
IF country="China" or country="India" or country="Indonesia" or country="Pakistan" or country="Bangladesh" or country="Japan" or country="Philippines" or country="Vietnam" 
or country="Turkey" or country="Iran" or country="Thailand" or country="Myanmar" or country="Korea, Dem. Rep." or country="Iraq" or country="Afghanistan" or country="Saudi Arabia"
or country="Uzbekistan" or country="Malaysia" or country="Yemen" or country="Nepal" or country="Korea, Rep." or country="Sri Lanka" or country="Kazakhstan" or country="Syria" or country="Cambodia"
or country="Jordan" or country="Azerbaijan" or country="United Arab Emirates" or country="Tajikistan" or country="Israel" or country="Laos" or country="Lebanon" or country="Krygyzstan"
or country="Turkmenistan" or country="Singapore" or country="Oman" or country="State of Palestine" or country="Kuwait" or country="Georgia" or country="Mongolia" or country="Armenia" or country="Qatar"
or country="Bahrain" or country="Timor-Leste" or country="Cyprus" or country="Bhutan" or country="Maldives" or country="Brunei" THEN Country_Category="Asian Countries";
ELSE Country_Category="Other Countries";
/* Income Category */
if incomeperperson=. then Income_Category="1. Unknown";
else if incomeperperson LE 1035 then Income_Category= "2. $0 - $1,035";
else if incomeperperson LE 4045 then Income_Category= "3. $1,036 - $4,045 ";
else if incomeperperson LE 12535 then Income_Category= "4. $4,046 - $12,535";
else if incomeperperson GT 12535 then Income_Category= "5. Over $12,535";
/* Suicide Category */
if suicideper100th=. then Suicide_Category="1. Unknown";
else if suicideper100th LE 10 then Suicide_Category="2. Very Low";
else if suicideper100th LE 20 then Suicide_Category="3. Low";
else if suicideper100th LE 30 then Suicide_Category="4. High";
else if suicideper100th GT 30 then Suicide_Category="5. Very High";
/* Electricity_Category */
if relectricperperson=. then Electricity_Category="1. Unknown";
else if relectricperperson LE 2000 then Electricity_Category="2. Very Low";
else if relectricperperson LE 4000 then Electricity_Category="3. Low";
else if relectricperperson LE 6000 then Electricity_Category="4. Moderate";
else if relectricperperson LE 8000 then Electricity_Category="5. High";
else if relectricperperson GT 8000 then Electricity_Category="6. Very High";
PROC SORT; by Country_Category;
PROC SORT; by Income_Category;
PROC SORT; by Suicide_Category;
PROC SORT; by Electricity_Category;
PROC FREQ; TABLES Country_Category Income_Category Suicide_Category Electricity_Category;
LOG: No Error 
Building on my previous work, I added a second country category, “Other Countries”, to represent countries outside of Asia.
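The long chain of OR comparisons can be written more compactly with the IN operator. The following is a sketch of an equivalent way to assign the two categories, not the code that produced the tables in this post. It assumes the libref mydata is assigned as in the code above. The data set name new_in is hypothetical, and the list is shortened for illustration, so a real version would need all 48 names.

```sas
/* Sketch: assign the country category with the IN operator.   */
/* NEW_IN is a hypothetical data set name used for this demo.  */
DATA new_in;
  SET mydata.gapminder;
  LENGTH Country_Category $ 20;
  /* Shortened list for illustration; the full list has 48 names. */
  IF country IN ("China", "India", "Indonesia", "Japan")
    THEN Country_Category = "Asian Countries";
  ELSE Country_Category = "Other Countries";
RUN;
```

Declaring the length with a LENGTH statement also replaces the blank-string initialization used to reserve 20 characters.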
Income (per person)
As seen in the table, 54 countries (25.35%) are low-income economies and 55 countries (25.82%) are lower-middle-income economies. 41 countries (19.25%) are upper-middle-income economies and 40 countries (18.78%) are high-income economies. Income per person is missing for 23 countries (10.80%).
Suicide Rate
The majority of the countries have a relatively low suicide rate. Specifically, 114 countries (53.52%) have a very low suicide rate and 63 countries (29.58%) have a low suicide rate. On the other hand, only 12 countries (5.63%) have a high suicide rate and 2 countries (0.94%) have a very high suicide rate. Suicide rate data is missing for the other 22 countries (10.33%).
Electricity Consumption (per person)
113 countries (53.05%) have very low electricity consumption. 14 countries (6.57%) have low electricity consumption. 5 countries (2.35%) have moderate electricity consumption. 2 countries (0.94%) have high electricity consumption. 2 countries (0.94%) have very high electricity consumption. Surprisingly, a large share of countries (77, or 36.15%) have no data for electricity consumption.
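The missing counts reported in these tables can be cross-checked on the original quantitative variables. The following is a minimal sketch, assuming the data set new from the code above. N is the number of countries with a value and NMISS is the number without one.

```sas
/* Sketch: count non-missing and missing values per variable. */
/* Assumes the data set NEW built by the DATA step above.     */
PROC MEANS DATA=new N NMISS;
  VAR incomeperperson suicideper100th relectricperperson;
RUN;
```

The NMISS values should match the "1. Unknown" rows in the frequency tables, since that category is assigned exactly when the variable is missing.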
jeremypjwtest · 5 years ago
Week 2
After learning both SAS and Python, I decided to use SAS to run my code.
My code:
LIBNAME mydata "/courses/d1406ae5ba27fe300" access=readonly;
DATA new; set mydata.gapminder;
LABEL incomeperperson="Income (per person)"
suicideper100th="Suicide Rate"
relectricperperson="Electricity (per person)";
Income_Category="                        ";
Electricity_Category="                    ";
Suicide_Category="                    ";
/* Asian Countries: this subsetting IF keeps only the rows for the listed countries */
IF country="China" or country="India" or country="Indonesia" or country="Pakistan" or country="Bangladesh" or country="Japan" or country="Philippines" or country="Vietnam"
or country="Turkey" or country="Iran" or country="Thailand" or country="Myanmar" or country="Korea, Rep." or country="Iraq" or country="Afghanistan" or country="Saudi Arabia"
or country="Uzbekistan" or country="Malaysia" or country="Yemen" or country="Nepal" or country="Korea, Dem. Rep." or country="Sri Lanka" or country="Kazakhstan" or country="Syria" or country="Cambodia"
or country="Jordan" or country="Azerbaijan" or country="United Arab Emirates" or country="Tajikistan" or country="Israel" or country="Laos" or country="Lebanon" or country="Krygyzstan"
or country="Turkmenistan" or country="Singapore" or country="Oman" or country="State of Palestine" or country="Kuwait" or country="Georgia" or country="Mongolia" or country="Armenia" or country="Qatar"
or country="Bahrain" or country="Timor-Leste" or country="Cyprus" or country="Bhutan" or country="Maldives" or country="Brunei";
/* Income Category */
if incomeperperson LE 1035 then Income_Category= "1. $0 - $1,035";
else if incomeperperson LE 4045 then Income_Category= "2. $1,036 - $4,045 ";
else if incomeperperson LE 12535 then Income_Category= "3. $4,046 - $12,535";
else if incomeperperson GT 12535 then Income_Category= "4. Over $12,535";
/* Suicide Category */
if suicideper100th LE 10 then Suicide_Category="1. Low";
else if suicideper100th LE 20 then Suicide_Category="2. Moderate";
else if suicideper100th GT 20 then Suicide_Category="3. High";
/* Electricity_Category */
if relectricperperson LE 2000 then Electricity_Category="1. Very Low";
else if relectricperperson LE 4000 then Electricity_Category="2. Low";
else if relectricperperson LE 6000 then Electricity_Category="3. Moderate";
else if relectricperperson LE 8000 then Electricity_Category="4. High";
else if relectricperperson GT 8000 then Electricity_Category="5. Very High";
PROC SORT; by country;
PROC SORT; by Income_Category;
PROC SORT; by Suicide_Category;
PROC SORT; by Electricity_Category;
PROC FREQ; TABLES country Income_Category Suicide_Category Electricity_Category;
LOG: No Error
Asian Countries
Officially, there are 48 countries in Asia, so I listed all 48 in my code. However, only 45 countries appear in the results, which indicates that 3 of the names in my list have no matching record in the Gapminder data set, either because the country is not included or because its name is written differently there.
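One way to see which names have no matching record is to compare the list against the country names in the data set. The following is a minimal sketch, assuming the libref mydata is assigned as in the code above. The table name wanted is hypothetical, and only three names from the list are included for illustration.

```sas
/* Sketch: list the names that do not occur in the Gapminder data. */
/* WANTED is a hypothetical helper table holding names to check.   */
DATA wanted;
  LENGTH country $ 40;
  INPUT country $ 1-40;   /* column input keeps embedded spaces and commas */
  DATALINES;
Krygyzstan
State of Palestine
Korea, Dem. Rep.
;
RUN;

PROC SQL;
  /* Names in WANTED with no exact match in the data set */
  SELECT w.country
  FROM wanted AS w
  WHERE w.country NOT IN (SELECT country FROM mydata.gapminder);
QUIT;
```

Any name the query returns is one the IF condition can never match, so it is worth comparing its spelling with the country values in the data set.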
Income (per person)
As displayed in the table above, 16 countries (35.56%) have an income per person of $1,035 or less, while 15 countries have an income per person between $1,036 and $4,045. According to the World Bank, these countries are classified as low-income and lower-middle-income economies respectively. The remaining 14 countries are better off economically: 6 are upper-middle-income economies and 8 are high-income economies.
Suicide Rate
Based on the table, the majority of the Asian countries, 27 (60%) to be specific, have a low suicide rate. Another 13 countries (28.89%) have a moderate suicide rate and only 5 countries (11.11%) have a high suicide rate.
Electricity Consumption (per person)
As seen in the table, the vast majority (77.78%) of the Asian countries have very low electricity consumption. There are 6 countries (13.33%) with low electricity consumption, 1 country (2.22%) with moderate electricity consumption, 1 country (2.22%) with high electricity consumption and 2 countries (4.44%) with very high electricity consumption.
jeremypjwtest · 5 years ago
Week 1
The relationship between Income and Suicide rate
Topic Scope 
After reviewing the codebooks provided in the course, I chose the Gapminder codebook. Upon reading through the codebook, I realized I was most keen to see whether there is an association between income per person and the suicide rate. I also wanted to determine whether there is a relationship between income per person and electricity usage.
These are the variables from the Gapminder codebook that I chose to include in my study:
incomeperperson : 2010 Gross Domestic Product per capita in constant 2000 US$. Inflation, but not the differences in the cost of living between countries, has been taken into account.
suicideper100TH : 2005 Suicide, age adjusted, per 100 000. Mortality due to self-inflicted injury, per 100 000 standard population, age adjusted.
relectricperperson : 2008 residential electricity consumption, per person (kWh). The amount of residential electricity consumption per person during the given year, counted in kilowatt-hours (kWh).
Existing Research Conducted and Hypothesis
Research conducted by Cantor, Slater and Najman in 2010 found a positive association between socioeconomic indices and the suicide rate in Queensland, Australia. This research gives me grounds to believe that my hypothesis is correct. However, further analysis is needed to strengthen my position.
Furthermore, Barnes, Khandker and Samad (2012) collected data sets to compare and contrast income poverty with energy poverty in the context of India. This study presents evidence from urban India that further supports my hypothesis on the relationship between a person's income and their electricity consumption.
Barnes, D.F., Khandker, S.R., & Samad, H.A. (2012). Are the energy poor also income poor? Evidence from India. Retrieved from
Cantor, C.H., Najman, J.M., & Slater, P.J. (2010). Socioeconomic indices and suicide rate in Queensland. Retrieved from