#sub2
Note
SUBNAUTICA 2 TRAILER CLYDE
I KNOW I SAW!!!!!!!!! Just casually browsing my feed when that lovely alert popped up~
In all seriousness, that teaser makes me really hopeful/hyped for Sub2. I never joined the hate bandwagon for Below Zero, but I found it to just be a solid game, not one that recaptured the magic of the original. This though? This feels like og Subnautica. Even with the arrival of a friend (which might just be a nod to the promised co-op?), the whole teaser has that isolated, alienated feel to it. The world is gorgeous, but every beautiful thing possesses a hidden danger. You are just a speck in the deep, dark, terrifying hole that is a foreign ocean. You're alone. The only voice for miles around is the semi-cold AI system trying to keep you alive. You're vulnerable.
There are just so many details here for a minute of info. The loud sound of your breathing. The blinking red alert light. The increasingly frantic beeping. The soft edges as your groggy brain remembers the pretty, safe surface. The fact that the crab-thing appears harmless but can still kill you with its presence alone. The "Oh shit, if all the fish are swimming away there's something COMING" callback. The relief of seeing a friend show up before they're snatched by a GIANT BIOLUMINESCENT CREATURE THAT'S JUST PURSUING ITS NATURAL INSTINCTS, BUT ALSO ITS NATURAL INSTINCTS ARE TERRIFYING. Perfect for spooky season.
(Although I'm ngl, my first thought was that Purple Dude might secretly be a friend like the Emperor Leviathan. It would fit the other parallels of that ending :D)
So yeah I'm PUMPED. Haven't been this excited for a sequel since Hades II, which admittedly wasn't that long ago given its EA release, but STILL. I do have to decide though if I'm going to experience Sub2's EA. A part of me really wants to because a) I missed the initial EA experience and b) I'm impatient, but on the other hand the Subnautica series doesn't lend itself to incomplete play like a rogue-like does, and I suspect it will be a lot more bug-ridden than Hades II was when it came out... but we'll see. I'll probably join EA if only to avoid spoilers lol
6 notes
Text
The way I need the new character in Sub2 to be connected to Below Zero. Listen, give me the weird alien child from Below Zero. Since it seems like instead of "oh, update your machine" it's more "update yourself, evolution 2".
It's not gonna happen but I'm actually super excited for Subnautica 2!
4 notes
Text
I love to see a Woman Win
(Amanda Nunes)
September 19, 2024 Rookie's Playbook
I’m back at it again, deep diving into things I have no concept of, and today it’s the world of the UFC! For my first read-through I like something that serves as an explanation of sorts, so I can try and figure out exactly what is happening, and today I'm diving into the beauty of women champions.
From my understanding there are four women's championship divisions: featherweight, bantamweight, flyweight, and strawweight. I'm not gonna lie, half of these sound to me like the same thing with AI-generated names. After doing further research, it all depends on the weight of the fighter, which used to confuse the hell out of me when I was younger because I thought it was so rude to segregate people into weight categories; now, however, I know it's so you don't get absolutely murdered.
Featherweight (145 pounds) has no current champion, but it does have three past champions. Amanda Nunes won in 2018, with two defenses, via a round-1 TKO of her opponent Cris Cyborg. I'm not going to sit here and lie to you either: if my last name was Cyborg, no one could talk to me. That is just so badass. The only knowledge I have of a TKO (technical knockout) is that when it's executed the other fighter is left unable to complete the fight, so that is insanely hardcore to me. Currently, the Brazilian fighter has insane stats: 13-2 by TKO, 4-2 by sub, and a W-L-D of 23-5-0. Now, again, I will not lie to you, I only know what the TKO is.
Sub is a submission, when a fighter forces their opponent to 'tap out' and yield, ending the fight. So, that takes the badass levels down some, but hey, who knows, this could actually be the low point for someone and she is still the most dominant woman to have ever lived. Then W-L-D is wins, losses, and draws. The thing that I'm getting at for Nunes is that she doesn't believe in draws and eats up her opponent every time, 23 wins! Insane!
Now that I know the terms, let's move on to bantamweight, which has a 135-pound limit. The current champion of this category is Raquel Pennington, with an outcome of a UD over Mayra Bueno Silva. Now comes my next question: what the hell is a UD? This is a term I have heard and often yell out when I want to look knowledgeable; it means unanimous decision. This is when all the judges are in agreement on which fighter is the winner, so good for Miss Pennington!
Pennington’s stats are W-L-D 16-9-0, TKO 1-1, and sub 1-1; not bad, but not my-new-favorite-fighter-Amanda-Nunes good.
Women’s flyweight is next on our list and its poundage (?) is 125. I’m going to be honest, with all this searching, the rules about the fighters' weight are insane to me; what is a one-pound difference in the grand scheme of it all, you know? Anyway, the current champion of this category is Valentina Shevchenko, who won in a unanimous decision against Alexa Grasso! Look at me knowing my terms, this is what we call development, guys. Shevchenko’s stats are W-L-D 24-4-1, TKO 8-1, and sub 7-1. I’m gonna be honest, Miss Nunes has my heart, but Shevchenko scares the shit outta me the most.
The final category is strawweight and the weight is 115 pounds; again, the names just all seem too similar for my taste and I find them very confusing, but nonetheless I digress. The current champion is the lady with the best cheekbones I have ever seen, Zhang Weili, who won against Carla Esparza with a Sub2 (a second-round submission). Her stats rival Nunes's to me, with a W-L-D 25-3-0, TKO 11-1, and Sub 8-0. She is a Chinese fighter whose career I feel I will be following closely.
After all that research I can confirm that this is the most intriguing of the sports I have covered so far, and I simply cannot wait to learn more about it.
2 notes
Text
okay sub legacy thoughts w spoilers (by 'spoilers' i mean comments on changes made from the OG not plot spoilers) (i am a little way into the lab)
keeping the coin but moving the code to get the fuse elsewhere? DICK move. took me ages to find it. not sure if it's randomised but if it is then i think i got unlucky with the second 2 digits being 11 bcos it wasn't actually obvious that they were numbers when i found them
oughhhhh took me till sub2 to recognise the dimensional co-ordinates i was running into that's really neat and im so hype to see where we're going with that
where is the spoon and the fork... where is my classic submachine cutlery puzzle :(
i appreciate that there's more explanation for what you're doing w the chemicals in the lab (that's always been a bit of a ?? puzzle as it seems to hinge on knowledge that the player doesn't actually have so you're basically just combining items till you hit on the right combo). however i always liked the detail that you're straight up breaking in rather than hunting around for the code for entry so i wish that had been kept!!
!!!!! when re-opening the game and spotting sub 0 added to the level map
ok i need to stop now bcos it's late and also im supposed to be reducing screen time bcos of my eyes. off to bed shortly zzz good night.
14 notes
Text
i remember like. literally two years ago talking to someone about how sub2:30 seemed almost impossible for AA and now fein has a sub2:15. that is fucking insane but also if anyone could do it it would be feinberg.
#sorry i have mcsrtism i just dont talk about it very much these days#vodwatching his 2:14 rn because i didnt see it live. and i am just so in awe of his talents
3 notes
Text
it is a good day bc i got a sub2 time in set yayyyyyyy
7 notes
Text
TWIN PEAKS
s01 - sUB
s02 - sUB1 / sUB2
Fuego Camina Conmigo (Fire Walk with Me, 1992)
s03 - sUB
5 notes
Link
This is an example of religious rules/laws made humorous.
#atheism#christianity#judaism#islam#catholicism#mormonism#baptist#jehovah witness#scientology#Religion
9 notes
Text
Get your drive back and feel great again
Testogen boosts your testosterone naturally and reverses the symptoms of low testosterone.
So you can feel better, every day.
Complete testosterone support for male health and wellness
100% safe and natural ingredients backed by clinical studies
Improves energy, performance, muscle growth, libido and fat loss.
https://testogen.com/?_ef_transaction_id=&oid=10&affid=8664&source_id=Facebook%20&sub1=Tumblr%20&sub2=Instagram%20&sub3=Alnnamir
9 notes
Text
https://opphustle.com/?utm_campaign={replace}&sub2=&sub3=&sub4=171725351713&sub5=720901565753&sub6=21901344475&sub7=m&sub8=&sub9=&sub10=&utm_source=Google&customsource={acc2-vb-AMZNPR}&wbraid=&gbraid=&ref_id=Cj0KCQiA_9u5BhCUARIsABbMSPui--op1Cz1y6xDLcJa0L5lpIW5dBKJzElKgD7hsoTtR4K_uKUhH9oaAlxNEALw_wcB&gclid=Cj0KCQiA_9u5BhCUARIsABbMSPui--op1Cz1y6xDLcJa0L5lpIW5dBKJzElKgD7hsoTtR4K_uKUhH9oaAlxNEALw_wcB
0 notes
Text
Exploring Statistical Interactions
This assignment aims to statistically assess the evidence, provided by the NESARC codebook, in favour of or against the association between cannabis use and major depression in U.S. adults. More specifically, I examined the statistical interaction between frequency of cannabis use (10-level categorical explanatory variable "S3BD5Q2E") and major depression diagnosis in the last 12 months (categorical response variable "MAJORDEP12"), moderated by the categorical variable "S1Q231", which indicates whether the respondent lost a family member or a close friend in the last 12 months. This effect is characterised statistically as an interaction: a third variable that affects the direction and/or the strength of the relationship between the explanatory and the response variable, and helps us understand the moderation.

Since I have a categorical explanatory variable (frequency of cannabis use) and a categorical response variable (major depression), I ran a Chi-square Test of Independence (crosstab function) to examine the patterns of the association between them (C->C), directly measuring the chi-square value and the p-value. In addition, in order to visualise this association graphically, I used the catplot function (formerly factorplot, seaborn library) to produce a bivariate graph. Furthermore, in order to determine which frequency groups differ from the others, I performed a post hoc test using the Bonferroni adjustment approach, since my explanatory variable has more than 2 levels. With ten groups I would actually need to conduct 45 pairwise comparisons, but I indicatively examined two and compared their p-values with the Bonferroni-adjusted p-value, which is calculated by dividing p=0.05 by 45. In this way it is possible to identify the situations where the null hypothesis can be safely rejected without making an excessive type 1 error.
Regarding the third variable, I examined whether the fact that a family member or a close friend died in the last 12 months moderates the significant association between cannabis use frequency and major depression diagnosis. Put another way: is frequency of cannabis use related to major depression at each level of the moderating variable (1=Yes and 2=No), that is, both for those who lost a family member or a close friend in the last 12 months and for those who did not? Therefore, I set up new data frames (sub1 and sub2) that include the individuals who fell into each category (Yes or No) and ran a Chi-square Test of Independence for each subgroup separately, measuring both chi-square values and p-values. Finally, with the catplot function (seaborn library) I created two bivariate line graphs, one for each level of the moderating variable, in order to visualise the differences and the effect of the moderator upon the statistical relationship between frequency of cannabis use and major depression diagnosis. For the code and the output I used Spyder (IDE).
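Since the program below doesn't spell out the post hoc step, here is a minimal sketch of one Bonferroni-adjusted pairwise comparison, assuming the subsetc1 data frame and the CUFREQ variable defined in the program below; the pairwise_chisq helper and the particular pair of groups are illustrative, not part of the original assignment.

import pandas
import scipy.stats

bonferroni_alpha = 0.05 / 45  # 10 groups -> 45 pairwise comparisons

def pairwise_chisq(df, group_a, group_b):
    # Keep only the two frequency groups being compared
    pair = df[df['CUFREQ'].isin([group_a, group_b])].copy()
    # Drop the unused categories so the crosstab has exactly two columns
    pair['CUFREQ'] = pair['CUFREQ'].cat.remove_unused_categories()
    ct = pandas.crosstab(pair['MAJORDEP12'], pair['CUFREQ'])
    chi2, p, dof, expected = scipy.stats.chi2_contingency(ct)
    return chi2, p

# One of the 45 comparisons; reject the null only if p < 0.05/45 (about 0.0011)
chi2, p = pairwise_chisq(subsetc1, '2 times/year', 'Every day')
print(chi2, p, p < bonferroni_alpha)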
The moderating variable that I used for the statistical interaction is S1Q231 (1=Yes, 2=No).
THE FOLLOWING IS A PYTHON PROGRAM:
import pandas
import numpy
import seaborn
import scipy.stats
import matplotlib.pyplot as plt

nesarc = pandas.read_csv('nesarc_pds.csv', low_memory=False)

# Set PANDAS to show all columns in DataFrame
pandas.set_option('display.max_columns', None)

# Set PANDAS to show all rows in DataFrame
pandas.set_option('display.max_rows', None)

nesarc.columns = map(str.upper, nesarc.columns)

pandas.set_option('display.float_format', lambda x: '%f' % x)
# Change my variables to numeric (convert_objects was removed from pandas,
# so to_numeric with errors='coerce' is used instead)
nesarc['AGE'] = pandas.to_numeric(nesarc['AGE'], errors='coerce')
nesarc['MAJORDEP12'] = pandas.to_numeric(nesarc['MAJORDEP12'], errors='coerce')
nesarc['S1Q231'] = pandas.to_numeric(nesarc['S1Q231'], errors='coerce')
nesarc['S3BQ1A5'] = pandas.to_numeric(nesarc['S3BQ1A5'], errors='coerce')
nesarc['S3BD5Q2E'] = pandas.to_numeric(nesarc['S3BD5Q2E'], errors='coerce')

# Subset my sample: ages 18-30, cannabis users
subset1 = nesarc[(nesarc['AGE']>=18) & (nesarc['AGE']<=30) & (nesarc['S3BQ1A5']==1)]
subsetc1 = subset1.copy()

# Setting missing data
subsetc1['S1Q231'] = subsetc1['S1Q231'].replace(9, numpy.nan)
subsetc1['S3BQ1A5'] = subsetc1['S3BQ1A5'].replace(9, numpy.nan)
subsetc1['S3BD5Q2E'] = subsetc1['S3BD5Q2E'].replace(99, numpy.nan)
subsetc1['S3BD5Q2E'] = subsetc1['S3BD5Q2E'].replace('BL', numpy.nan)  # already NaN after the coerce above; kept from the original

# Frequency of cannabis use variable reverse-recode
recode1 = {1: 9, 2: 8, 3: 7, 4: 6, 5: 5, 6: 4, 7: 3, 8: 2, 9: 1}
subsetc1['CUFREQ'] = subsetc1['S3BD5Q2E'].map(recode1)  # Change the variable name from S3BD5Q2E to CUFREQ

subsetc1['CUFREQ'] = subsetc1['CUFREQ'].astype('category')

# Rename graph labels for better interpretation
subsetc1['CUFREQ'] = subsetc1['CUFREQ'].cat.rename_categories(["2 times/year","3-6 times/year","7-11 times/year","Once a month","2-3 times/month","1-2 times/week","3-4 times/week","Nearly every day","Every day"])
# Contingency table of observed counts of major depression diagnosis (response
# variable) within frequency of cannabis use groups (explanatory variable), in ages 18-30
contab1 = pandas.crosstab(subsetc1['MAJORDEP12'], subsetc1['CUFREQ'])
print(contab1)

# Column percentages
colsum = contab1.sum(axis=0)
colpcontab = contab1/colsum
print(colpcontab)

# Chi-square calculations for major depression within frequency of cannabis use groups
print('Chi-square value, p value, expected counts, for major depression within cannabis use status')
chsq1 = scipy.stats.chi2_contingency(contab1)
print(chsq1)

# Bivariate bar graph for major depression percentages within each cannabis smoking frequency group
plt.figure(figsize=(12,4))  # Change plot size
ax1 = seaborn.catplot(x="CUFREQ", y="MAJORDEP12", data=subsetc1, kind="bar", ci=None)  # factorplot was renamed catplot in newer seaborn
ax1.set_xticklabels(rotation=40, ha="right")  # X-axis labels rotation
plt.xlabel('Frequency of cannabis use')
plt.ylabel('Proportion of Major Depression')
plt.show()
# Frequency of cannabis use variable reverse-recode, keeping all 10 levels this time
recode2 = {1: 10, 2: 9, 3: 8, 4: 7, 5: 6, 6: 5, 7: 4, 8: 3, 9: 2, 10: 1}
subsetc1['CUFREQ2'] = subsetc1['S3BD5Q2E'].map(recode2)  # Change the variable name from S3BD5Q2E to CUFREQ2

sub1 = subsetc1[(subsetc1['S1Q231']==1)]
sub2 = subsetc1[(subsetc1['S1Q231']==2)]

print('Association between cannabis use status and major depression for those who lost a family member or a close friend in the last 12 months')
contab2 = pandas.crosstab(sub1['MAJORDEP12'], sub1['CUFREQ2'])
print(contab2)

# Column percentages
colsum2 = contab2.sum(axis=0)
colpcontab2 = contab2/colsum2
print(colpcontab2)

# Chi-square
print('Chi-square value, p value, expected counts')
chsq2 = scipy.stats.chi2_contingency(contab2)
print(chsq2)

# Line graph for major depression percentages within each frequency group,
# for those who lost a family member or a close friend
plt.figure(figsize=(12,4))  # Change plot size
ax2 = seaborn.catplot(x="CUFREQ", y="MAJORDEP12", data=sub1, kind="point", ci=None)
ax2.set_xticklabels(rotation=40, ha="right")  # X-axis labels rotation
plt.xlabel('Frequency of cannabis use')
plt.ylabel('Proportion of Major Depression')
plt.title('Association between cannabis use status and major depression for those who lost a family member or a close friend in the last 12 months')
plt.show()
#
print('Association between cannabis use status and major depression for those who did NOT lose a family member or a close friend in the last 12 months')
contab3 = pandas.crosstab(sub2['MAJORDEP12'], sub2['CUFREQ2'])
print(contab3)

# Column percentages
colsum3 = contab3.sum(axis=0)
colpcontab3 = contab3/colsum3
print(colpcontab3)

# Chi-square
print('Chi-square value, p value, expected counts')
chsq3 = scipy.stats.chi2_contingency(contab3)
print(chsq3)

# Line graph for major depression percentages within each frequency group,
# for those who did NOT lose a family member or a close friend
plt.figure(figsize=(12,4))  # Change plot size
ax3 = seaborn.catplot(x="CUFREQ", y="MAJORDEP12", data=sub2, kind="point", ci=None)
ax3.set_xticklabels(rotation=40, ha="right")  # X-axis labels rotation
plt.xlabel('Frequency of cannabis use')
plt.ylabel('Proportion of Major Depression')
plt.title('Association between cannabis use status and major depression for those who did NOT lose a family member or a close friend in the last 12 months')
plt.show()
OUTPUT:
A Chi-Square test of independence revealed that among cannabis users aged between 18 and 30 years old (subsetc1), the frequency of cannabis use (explanatory variable collapsed into 9 ordered categories) and past-year depression diagnosis (binary categorical response variable) were significantly associated: χ² = 29.83, df = 8, p = 0.00022.
In the bivariate graph (C->C) presented above, we can see the correlation between frequency of cannabis use (explanatory variable) and major depression diagnosis in the past year (response variable). The proportions clearly rise toward the heavier-use categories, which indicates that the more an individual (18-30) smoked cannabis, the greater the chances of having experienced depression in the last 12 months.
In the first place, for the moderating variable equal to 1, that is, those who lost a family member or a close friend in the last 12 months (sub1), a Chi-Square test of independence revealed that among cannabis users aged between 18 and 30 years old, the frequency of cannabis use (explanatory variable) and past-year depression diagnosis (response variable) were not significantly associated: χ² = 4.61, df = 9, p = 0.86. As a result, since the chi-square value is quite small and the p-value is large, we can assume that there is no statistical relationship between these two variables within the subgroup of individuals who lost a family member or a close friend in the last 12 months.
In the bivariate line graph (C->C) presented above, we can see the correlation between frequency of cannabis use (explanatory variable) and major depression diagnosis in the past year (response variable) in the subgroup of individuals who lost a family member or a close friend in the last 12 months (sub1). In fact, the fluctuating direction of the distribution does not indicate a positive relationship between these two variables for those who experienced a family/close death in the past year.
Subsequently, for the moderating variable equal to 2, that is, those who did not lose a family member or a close friend in the last 12 months (sub2), a Chi-Square test of independence revealed that among cannabis users aged between 18 and 30 years old, the frequency of cannabis use (explanatory variable) and past-year depression diagnosis (response variable) were significantly associated: χ² = 37.02, df = 9, p = 2.6e-05 (written in scientific notation). As a result, since the chi-square value is quite large and the p-value is very small, we can assume that there is a positive relationship between these two variables within the subgroup of individuals who did not lose a family member or a close friend in the last 12 months.
In the bivariate line graph (C->C) presented above, we can see the correlation between frequency of cannabis use (explanatory variable) and major depression diagnosis in the past year (response variable) in the subgroup of individuals who did not lose a family member or a close friend in the last 12 months (sub2). Clearly, the direction of the distribution indicates a positive relationship between these two variables, meaning that the frequency of cannabis use closely tracks the proportions of major depression for individuals who did not experience a family/close death in the last 12 months.
Summary
It seems that both the direction and the size of the relationship between frequency of cannabis use and major depression diagnosis in the last 12 months are heavily affected by the death of a family member or a close friend in the same period. In other words, when a family/close death is present the correlation is considerably weak, whereas when it is absent the correlation is strong and positive. Thus, the third variable moderates the association between cannabis use frequency and major depression diagnosis.
0 notes
Text
Variable Data Management
Based on the observations from the previous post we can code the missing data for crater depth and number of layers.
sub2 = sub1.copy()

# recode missing values to python missing (NaN)
sub2['DEPTH_RIMFLOOR_TOPOG'] = sub2['DEPTH_RIMFLOOR_TOPOG'].replace(0, numpy.nan)
sub2['NUMBER_LAYERS'] = sub2['NUMBER_LAYERS'].replace(0, numpy.nan)

print('counts for DEPTH_RIMFLOOR_TOPOG with 0 set to NAN and number of missing requested')
c4 = sub2['DEPTH_RIMFLOOR_TOPOG'].value_counts(sort=False, dropna=False)
print(c4)

print('counts for NUMBER_LAYERS with 0 set to NAN and number of missing requested')
c5 = sub2['NUMBER_LAYERS'].value_counts(sort=False, dropna=False)
print(c5)

counts for DEPTH_RIMFLOOR_TOPOG with 0 set to NAN and number of missing requested
NaN     57269
0.50      524
1.00       78
1.26       23
0.38      737
...
0.33      749
1.21       23
1.16       46
0.41      651
2.07        1
Name: DEPTH_RIMFLOOR_TOPOG, Length: 182, dtype: int64

counts for NUMBER_LAYERS with 0 set to NAN and number of missing requested
3.0       465
2.0      3136
4.0        31
1.0     14720
NaN    101469
Name: NUMBER_LAYERS, dtype: int64
Let's group the craters based on their depth. It seems like most of the craters are below 1 km, so we can divide that range into 4 groups, plus two coarser bins for the deeper craters.
# Group depths
print('Depths split in groups')
sub2['DEPTH_RIMFLOOR_TOPOG'] = pd.cut(sub2.DEPTH_RIMFLOOR_TOPOG, [0, 0.25, 0.5, 0.75, 1, 2, 3])
c6 = sub2['DEPTH_RIMFLOOR_TOPOG'].value_counts(sort=False, dropna=True)
print(c6)

Depths split in groups
(0.0, 0.25]    31632
(0.25, 0.5]    17171
(0.5, 0.75]     8618
(0.75, 1.0]     3602
(1.0, 2.0]      1521
(2.0, 3.0]         2
Name: DEPTH_RIMFLOOR_TOPOG, dtype: int64
For the next steps I would limit the data to the craters where depth and layer information is present. This significantly reduces the amount of information to be managed and will provide more accurate insights in future observations.
# subset data to craters with depth and layer information and a diameter of less than 30 km
sub3 = data[(data['DEPTH_RIMFLOOR_TOPOG'] > 0) & (data['NUMBER_LAYERS'] > 0) & (data['DIAM_CIRCLE_IMAGE'] <= 30)]
print('counts for Crater Diameter')
c7 = sub3['DIAM_CIRCLE_IMAGE'].value_counts(sort=True)
print(c7)

counts for Crater Diameter
3.71     42
3.16     41
3.26     40
3.54     39
3.04     38
         ..
14.48     1
1.35      1
21.57     1
23.31     1
21.33     1
Name: DIAM_CIRCLE_IMAGE, Length: 2150, dtype: int64
0 notes
Text
ACTIVITY 4
CODE
# Univariate bar graph for categorical variables
sub2["S2DQ1"] = sub2["S2DQ1"].astype('category')
seaborn.countplot(x="S2DQ1", data=sub2)
plt.xlabel("Father's problems with liquor")
plt.title("Father's problems with liquor in young people between 17 and 28 years old who spend a lot of time drinking (NESARC Study)")
sub2["S2DQ2"] = sub2["S2DQ2"].astype('category')
seaborn.countplot(x="S2DQ2", data=sub2)
plt.xlabel("Mother's problems with liquor")
plt.title("Mother's problems with liquor in young people between 17 and 28 years old who spend a lot of time drinking (NESARC Study)")

# Univariate histogram for quantitative variable
# (distplot was removed from newer seaborn; histplot draws the same histogram)
seaborn.histplot(sub2["S2BQ2D"].dropna())
plt.xlabel('Age of onset of liquor dependence')
plt.title('Age of onset of liquor dependence in young people between 17 and 28 years old who spend a lot of time drinking (NESARC Study)')

sub2['S2BQ2D'] = pandas.cut(sub2.S2BQ2D, [7, 13, 17, 20, 23, 27])
c5 = sub2['S2BQ2D'].value_counts(sort=False, dropna=True)
print(c5)
sub2['S2BQ2D'] = sub2['S2BQ2D'].astype('category')
sub2['S2DQ1'] = pandas.to_numeric(sub2['S2DQ1']==1)
seaborn.catplot(x="S2BQ2D", y="S2DQ1", data=sub2, kind="bar", ci=None)
plt.xlabel('Age of onset of liquor dependence')
plt.ylabel("Proportion father's problems with liquor")

sub2['S2DQ2'] = pandas.to_numeric(sub2['S2DQ2']==1)
seaborn.catplot(x="S2BQ2D", y="S2DQ2", data=sub2, kind="bar", ci=None)
plt.xlabel('Age of onset of liquor dependence')
plt.ylabel("Proportion mother's problems with liquor")  # fixed copy-paste: this plot shows the mother's variable
OUTPUT
DESCRIPTION
Regarding the graph of the quantitative variable that represents the age of onset of liquor dependence: it has a symmetrical unimodal distribution where the mode, mean, and median are approximately 18 years of age. Regarding the categorical variables, 337 young people stated that their father had problems with liquor and 141 stated that their mother suffered from alcoholism. I made two bivariate graphs taking the age of onset of alcohol dependence as the explanatory variable and comparing it with the proportion of fathers and mothers who had problems with alcohol. In the case of the father, the data follow a bimodal distribution where the modes are found at ages 7-13 and 23-27. On the other hand, the graph representing the mother's condition shows a right-skewed unimodal behavior. In conclusion, it can be observed that children of mothers and fathers who suffered from alcoholism mainly developed problems with liquor at early ages (7-13). For future analyses, the influence that other family members, such as grandparents, uncles, or romantic partners, have on alcohol consumption could be taken into account.
0 notes
Text
Analyzing Alcohol Consumption: Data Management and Frequency Distributions in Python
Python Program:
import pandas
import numpy as np
data = pandas.read_csv('nesarc_pds.csv', low_memory=False)
print(len(data))          # Number of observations (rows)
print(len(data.columns))  # Number of variables (columns)
# Drank at least 12 alcoholic drinks in the last 12 months
# (make a copy of the column so the raw data stays untouched)
sub2 = data["S2AQ2"].copy()

print("counts for original S2AQ2")
c1 = sub2.value_counts(sort=False)
print(c1)

# 9 codes 'unknown', so treat it as missing before counting
sub2 = sub2.replace(9, np.nan)

print('counts for S2AQ2 with 9 set to nan')
c2 = sub2.value_counts(sort=False, dropna=False)
print(c2)

# Drank at least 1 alcoholic drink in the last 12 months
sub3 = data["S2AQ3"].copy()

print("counts for original S2AQ3")
c3 = sub3.value_counts(sort=False)
print(c3)

sub3 = sub3.replace(9, np.nan)

print('counts for S2AQ3 with 9 set to nan')
c4 = sub3.value_counts(sort=False, dropna=False)
print(c4)

# Family or friends told them to cut down on drinking
sub5 = data["S2AQ18"].copy()

print("counts for original S2AQ18")
c5 = sub5.value_counts(sort=False)
print(c5)

sub6 = sub5.replace(9, np.nan)

print('counts for S2AQ18 with 9 set to nan')
c6 = sub6.value_counts(sort=False, dropna=False)
print(c6)
Interpretation of Results
Variable 1: This variable primarily took numeric values. The most frequent value was 1, indicating a "yes" answer to the question, and a value of 9 was used for data that should be treated as missing, which this code replaces with NaN using the NumPy library.
Variable 2: This distribution reflects that most of the survey population had, at some point in their life, drunk alcohol. Most of the population answered "1" (yes) to drinking; 1 was the most common answer in the data set.
Variable 3: The distribution showed a clear grouping, with most of the survey population having drunk alcohol at some point in their life. However, an anomaly was noticed in S2AQ2, where more people answered "2" (no) when asked if they had consumed at least 12 alcoholic drinks in the last 12 months. Summary: In this post, I explored alcohol consumption data, managing variables in Python to create meaningful insights. By handling missing data, recoding variables, and analyzing frequency distributions, I highlighted key trends. Most respondents had consumed alcohol at some point (indicated by a frequent "yes" answer), and missing data was coded as NaN to ensure clarity. Interestingly, while many had consumed alcohol, fewer had done so in the past year, as reflected by a higher count of "no" answers to recent drinking. This assignment emphasized how strategic data management can unveil important behavioral patterns and anomalies within survey data.
#coursera#data management#data visualization#python#data analysis#datascience#assignment#datamanagement
0 notes
Text
code for coursera
import numpy
import pandas
import statsmodels.formula.api as smf
import statsmodels.stats.multicomp as multi
data = pandas.read_csv(r'C:\Users\Downloads_cf16dab6c94262cc58a6bd4e0f753e56_nesarc_pds.csv', low_memory=False)
# Converting all column names to uppercase
data.columns = map(str.upper, data.columns)

# Converting variable values to numeric
data['S2DQ1'] = pandas.to_numeric(data['S2DQ1'])
data['S2DQ2'] = pandas.to_numeric(data['S2DQ2'])
data['S2BQ2D'] = pandas.to_numeric(data['S2BQ2D'], errors='coerce')
data['S2BQ1A12'] = pandas.to_numeric(data['S2BQ1A12'], errors='coerce')
data['AGE'] = pandas.to_numeric(data['AGE'])
# Subset data to youths age 17 to 28 who spend a lot of time drinking
sub1 = data[(data['AGE']>=17) & (data['AGE']<=28) & (data['S2BQ1A12']==1)]
print(sub1)
sub2 = sub1.copy()

# Counts for the original variables
print('counts for original S2DQ1')
c1 = sub2['S2DQ1'].value_counts(sort=False, dropna=False)
print(c1)
print('counts for original S2DQ2')
d1 = sub2['S2DQ2'].value_counts(sort=False, dropna=False)
print(d1)  # fixed: this previously printed c1 again
print('counts for original S2BQ2D')
e1 = sub2['S2BQ2D'].value_counts(sort=False, dropna=False)
print(e1)
# Set missing data to NAN
sub2['S2DQ1'] = sub2['S2DQ1'].replace(9, numpy.nan)
sub2['S2DQ2'] = sub2['S2DQ2'].replace(9, numpy.nan)
sub2['S2BQ2D'] = sub2['S2BQ2D'].replace(99, numpy.nan)

# Counts with missing values set to NAN
print('counts for S2DQ1 with 9 set to NAN and number of missing requested')
c2 = sub2['S2DQ1'].value_counts(sort=False, dropna=False)
print(c2)
print('counts for S2DQ2 with 9 set to NAN and number of missing requested')
d2 = sub2['S2DQ2'].value_counts(sort=False, dropna=False)
print(d2)
print('counts for S2BQ2D with 99 set to NAN and number of missing requested')
e2 = sub2['S2BQ2D'].value_counts(sort=False, dropna=False)
print(e2)
# Categorize in quartiles
print('S2BQ2D - 4 categories - quartiles')
# use sub2 (with 99 set to NaN) rather than sub1, so the missing code
# doesn't end up inside a quartile
sub2['S2BQ2D'] = pandas.qcut(sub2.S2BQ2D, 4, labels=["1=0%tile","2=25%tile","3=50%tile","4=75%tile"])
c4 = sub2['S2BQ2D'].value_counts(sort=False, dropna=True)
print(c4)
0 notes
Text
UCS548 - UCS538 - Data Science Fundamentals Solved
Assignment 7
c. Create a function st.err(x) = sd(x)/sqrt(length(x)) to find the standard error of SUB1, SUB2, and SUB3.
3. Create a vector TOTAL_SUM that holds the sums of V1, V2, and V3, computed with sapply().
6. Create a function f(x,y) = x/y, where x is V1 and y is V2. Use mapply() to compute this function.
7. Practice all the apply functions on the "Seatbelts" data set given in R.
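The assignment itself is an R exercise, but as a rough illustration, here is the same standard-error formula sketched in Python (the language used by the other code posts in this tag); the function name and the sample list are just for demonstration:

import math
import statistics

def st_err(x):
    # standard error = sample standard deviation / sqrt(sample size)
    return statistics.stdev(x) / math.sqrt(len(x))

print(st_err([1, 2, 3, 4, 5]))  # ~0.7071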
0 notes