#linestyle
Explore tagged Tumblr posts
Text
Chrysalis anthro voted by patreon. Accept no substitutes.
#Love this bitch#my linestyle doesnt fit her well#bc straight hair looks odd on my style#but still#banger#love her sm#chrysalis#queen chrysalis#mlp#my little pony#pony posting#anthro#friendship is magic#Best villan by presentation alone
978 notes
·
View notes
Text
I see... You use underhanded tactics, too. Huh, Shuichi?
#Poisned.art#my art#kokichi#kokichi fanart#danganronpa#danganronpa fanart#kokichi ouma#ouma kokichi#danganronpa spoilers#danganronpa v3#danganronpa v3 spoilers#I spent a couple days on this for one simple reason#I fucking hate line art#Any attempt at any linestyle#even no lines at all#looked absolutely dreadful#so i just said fuck it and used the sketch#finished it in an afternoon my god I hate lineart-
66 notes
·
View notes
Text
So fun fact i started drawing this gojo and halfway through gege confirmed that he was more of a dog person. ... 💀💀💀 idk if i'll end up finishing this but i still liked the colors and cute kitty and linestyle i went with!!
#artists on tumblr#digital art#trans artist#small artist#drawing#sketch#my art#fanart#anime fanart#anime art#jujutsu kaisen art#jujutsu gojo#jjk gojo#gojo satoru#jjk satoru#jjk fanart
23 notes
·
View notes
Text
opinions for cel shade artstyle
7 notes
·
View notes
Text
'Yavanna' - Traditional painting - pencil and watercolor paint. The color version of Yavanna is ready now. Soon available in my shops:
(Music in Video by www.frametraxx.de)
#tolkienfanart#valar#yavanna#fantasyart#traditionalpainting#illustration#linestyleartwork#annajaegerhauer#aquarell#drawing#fantasyartist#fantasy#traditionalart#noai#humanartist#femaleartist#artist#jrrtolkien#elves#artistsoninstagram#art#tolkienelben#tolkienelves#lordoftherings#lotr#silmarillion#fee#elfe#watercolor#artistsontumblr
32 notes
·
View notes
Note
HOW DO GOU EVEN MAKE YOUR ART LOOK LIKE SOFT COOKIES WITH LOTS OF CHOCOLATE CHIPS,,,, HOW??????¿??
(btw i left yir server becuaz i realized your server waz 16+,, but il be bac on my birthday!!! im turning 16 on january 19!!!((
(ty for respecting the rules and my boundaries! i generally keep all my socials 16+ for my own comfort and im glad you can understand <3) but awa nonetheless thankyou for the compliments! for anyone curious about how i create the soft look in my art, i think theres a few things that contribute to it:
- i rarely use fully opaque lineart. i draw all my lines using a transparency/pressure brush, and i often use a thicker linestyle than others too. i think this helps lend itself to line weight and depth, but also helps accentuate softer areas where the lines are really light and round!
- i use a soft-edged brush for shading. i turn the hardness down to pretty much 0% and paint soft shapes and gradients to map out the general lighting with blend modes
- i ALSO use a soft-edged brush for rendering. after laying out the lights and darks i go in with a smaller but still soft brush to add details on a separate layer above the lineart. this can often result in me overlapping the lines a little and creating that fluffy look!
i think these things are maybe subtle, but they help a lot to generally reduce hard edges in the piece and reduce line intensity. lineart is of course important to a piece, but i dont treat it as the focus; its more of a guide to the edges and shadows than anything, if that makes sense :3
#zilluask#art tips#maybe?#soft art#how to#art advice#im not sure how much this qualifies as advice but its an insight into my process atleast!!#if people are interested id be happy to do more in depth guides with visuals and stuff :3
16 notes
·
View notes
Text
feel like different material could maybe elevate my linestyle stuff but it already takes foreeeeever and im worried something slower would render it stale. like stagnant water
4 notes
·
View notes
Text
Multiple Regression model
To successfully complete the assignment on testing a multiple regression model, you'll need to conduct a comprehensive analysis using Python, summarize your findings in a blog entry, and include necessary regression diagnostic plots.
Here’s a structured example to guide you through the process:
### Example Code
```python
import pandas as pd
import numpy as np
import statsmodels.api as sm
import matplotlib.pyplot as plt
import seaborn as sns
from statsmodels.graphics.gofplots import qqplot
from statsmodels.stats.outliers_influence import OLSInfluence

# Sample data creation (replace with your actual dataset loading)
np.random.seed(0)
n = 100
depression = np.random.choice(['Yes', 'No'], size=n)
age = np.random.randint(18, 65, size=n)
# More symptoms with depression and age
nicotine_symptoms = np.random.randint(0, 20, size=n) + (depression == 'Yes') * 10 + age * 0.5

data = {
    'MajorDepression': depression,
    'Age': age,
    'NicotineDependenceSymptoms': nicotine_symptoms
}
df = pd.DataFrame(data)

# Recode the categorical explanatory variable MajorDepression:
# 'Yes' is coded as 1 and 'No' as 0
df['MajorDepression'] = df['MajorDepression'].map({'Yes': 1, 'No': 0})

# Multiple regression model
X = df[['MajorDepression', 'Age']]
X = sm.add_constant(X)  # Add intercept
y = df['NicotineDependenceSymptoms']
model = sm.OLS(y, X).fit()

# Print regression results summary
print(model.summary())

# Regression diagnostic plots
# Q-Q plot
residuals = model.resid
fig, ax = plt.subplots(figsize=(8, 5))
qqplot(residuals, line='s', ax=ax)
ax.set_title('Q-Q Plot of Residuals')
plt.show()

# Standardized residuals plot
influence = OLSInfluence(model)
std_residuals = influence.resid_studentized_internal
plt.figure(figsize=(8, 5))
plt.scatter(model.predict(), std_residuals, alpha=0.8)
plt.axhline(y=0, color='r', linestyle='-', linewidth=1)
plt.title('Standardized Residuals vs. Fitted Values')
plt.xlabel('Fitted values')
plt.ylabel('Standardized Residuals')
plt.grid(True)
plt.show()

# Leverage plot
fig, ax = plt.subplots(figsize=(8, 5))
sm.graphics.plot_leverage_resid2(model, ax=ax)
ax.set_title('Leverage-Residuals Plot')
plt.show()

# Blog entry summary
summary = """
### Summary of Multiple Regression Analysis

1. **Association between Explanatory Variables and Response Variable:**
   The multiple regression analysis revealed significant associations:
   - Major Depression (Beta = {:.2f}, p = {:.4f}): significant and positive association with Nicotine Dependence Symptoms.
   - Age (Beta = {:.2f}, p = {:.4f}): older participants reported a greater number of Nicotine Dependence Symptoms.
2. **Hypothesis Testing:** The results supported the hypothesis that Major Depression is positively associated with Nicotine Dependence Symptoms.
3. **Confounding Variables:** Age was identified as a potential confounding variable. Adjusting for Age slightly reduced the magnitude of the association between Major Depression and Nicotine Dependence Symptoms.
4. **Regression Diagnostic Plots:**
   - **Q-Q Plot:** indicates that residuals approximately follow a normal distribution, suggesting the model assumptions are reasonable.
   - **Standardized Residuals vs. Fitted Values Plot:** shows no apparent pattern in residuals, indicating homoscedasticity and no obvious outliers.
   - **Leverage-Residuals Plot:** identifies influential observations but shows no extreme leverage points.

### Output from Multiple Regression Model
(paste the output of print(model.summary()) here)

### Regression Diagnostic Plots
![Q-Q Plot of Residuals](insert_url_to_image_qq_plot)
![Standardized Residuals vs. Fitted Values](insert_url_to_image_std_resid_plot)
![Leverage-Residuals Plot](insert_url_to_image_leverage_plot)
"""

# Generate and upload images of the plots to your blog, then
# print the summary for submission
print(summary)
```
### Explanation:
1. **Sample Data Creation**: Simulates a dataset with `MajorDepression` as a categorical explanatory variable, `Age` as a quantitative explanatory variable, and `NicotineDependenceSymptoms` as the response variable.
2. **Multiple Regression Model**:
   - Constructs an ordinary least squares (OLS) regression model using `sm.OLS` from the statsmodels library.
   - Adds an intercept to the model using `sm.add_constant`.
   - Fits the model to predict `NicotineDependenceSymptoms` using `MajorDepression` and `Age` as predictors.
3. **Regression Diagnostic Plots**:
   - Q-Q Plot: checks the normality assumption of residuals.
   - Standardized Residuals vs. Fitted Values: examines homoscedasticity and identifies outliers.
   - Leverage-Residuals Plot: detects influential observations that may affect model fit.
4. **Blog Entry Summary**: Provides a structured summary including the results of the regression analysis, hypothesis testing, a discussion of confounding variables, and the regression diagnostic plots.

### Blog Entry Submission
Adapt the code and summary to your specific dataset and analysis. Upload the regression diagnostic plots as images to your blog entry and provide the URL to your completed assignment. This example should help you effectively complete your Coursera assignment on testing a multiple regression model.
0 notes
Text
Lasso Regression Project
#from pandas import Series, DataFrame
import pandas as pd
import numpy as np
import os
import matplotlib.pylab as plt
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import classification_report
import sklearn.metrics
from sklearn.linear_model import LassoLarsCV
# standardize predictors to have mean=0 and sd=1
#predictors=predvar.copy()
from sklearn import preprocessing
for col in ['CODPERING', 'CODPRIPER', 'CODULTPER', 'CICREL', 'CRDKAPRACU',
            'PPKAPRACU', 'CODPER5', 'RN', 'MODALIDADC', 'SEXOC']:
    predictors.loc[:, col] = preprocessing.scale(predictors[col].astype('float64'))
# split data into train and test sets
pred_train, pred_test, tar_train, tar_test = train_test_split(predictors, targets,
test_size=.3, random_state=123)
# specify the lasso regression model
model=LassoLarsCV(cv=10, precompute=False).fit(pred_train,tar_train)
# print variable names and regression coefficients
dict(zip(predictors.columns, model.coef_))
{'CODPERING': 0.0057904161622687605, 'CODPRIPER': 0.6317522521193139, 'CODULTPER': -0.15191539575581153, 'CICREL': 0.07945048661923974, 'CRDKAPRACU': -0.3022282694810491, 'PPKAPRACU': 0.15702206999868978, 'CODPER5': -0.11697786485114721, 'RN': -0.03802582617592532, 'MODALIDADC': 0.017655346467683828, 'SEXOC': 0.10597063961395894}
# plot coefficient progression
m_log_alphas = -np.log10(model.alphas_)
ax = plt.gca()
plt.plot(m_log_alphas, model.coef_path_.T)
plt.axvline(-np.log10(model.alpha_), linestyle='--', color='k',
label='alpha CV')
plt.ylabel('Regression Coefficients')
plt.xlabel('-log(alpha)')
plt.title('Regression Coefficients Progression for Lasso Paths')
# plot mean square error for each fold
m_log_alphascv = -np.log10(model.cv_alphas_)
plt.figure()
plt.plot(m_log_alphascv, model.mse_path_, ':')
plt.plot(m_log_alphascv, model.mse_path_.mean(axis=-1), 'k',
label='Average across the folds', linewidth=2)
plt.axvline(-np.log10(model.alpha_), linestyle='--', color='k',
label='alpha CV')
plt.legend()
plt.xlabel('-log(alpha)')
plt.ylabel('Mean squared error')
plt.title('Mean squared error on each fold')
# MSE from training and test data
from sklearn.metrics import mean_squared_error
train_error = mean_squared_error(tar_train, model.predict(pred_train))
test_error = mean_squared_error(tar_test, model.predict(pred_test))
print ('training data MSE')
print(train_error)
print ('test data MSE')
print(test_error)
training data MSE
10.81377398206078
test data MSE
10.82396461754075
# R-square from training and test data
rsquared_train=model.score(pred_train,tar_train)
rsquared_test=model.score(pred_test,tar_test)
print ('training data R-square')
print(rsquared_train)
print ('test data R-square')
print(rsquared_test)
training data R-square
0.041399684741574516
test data R-square
0.04201223667290355
Results Explanation:
A lasso regression analysis was conducted to identify a subset of variables from a pool of 10 quantitative predictor variables that best predicted a quantitative response variable. All predictor variables were standardized to have a mean of zero and a standard deviation of one.
Data were randomly split into a training set that included 70% of the observations and a test set that included 30% of the observations. The least angle regression algorithm with k=10 fold cross validation was used to estimate the lasso regression model in the training set, and the model was validated using the test set. The change in the cross validation average (mean) squared error at each step was used to identify the best subset of predictor variables.
The MSE values for both training and test data are quite similar, indicating that the model performs consistently across both datasets. However, the R-square values are quite low (around 0.04), suggesting that the model does not explain much of the variance in the data.
Of the 10 predictor variables, 6 were retained in the selected model.
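The count of six retained predictors is not obvious from the coefficient dict above, since every coefficient is technically nonzero; one reading consistent with the numbers is that only coefficients with magnitude at least 0.1 were counted. A small standalone check (coefficients copied from the output above; the 0.1 cutoff is my assumption, not part of the original analysis):

```python
# Coefficients reported by the fitted LassoLarsCV model (copied from the output above)
coefs = {
    'CODPERING': 0.0057904161622687605,
    'CODPRIPER': 0.6317522521193139,
    'CODULTPER': -0.15191539575581153,
    'CICREL': 0.07945048661923974,
    'CRDKAPRACU': -0.3022282694810491,
    'PPKAPRACU': 0.15702206999868978,
    'CODPER5': -0.11697786485114721,
    'RN': -0.03802582617592532,
    'MODALIDADC': 0.017655346467683828,
    'SEXOC': 0.10597063961395894,
}

# Count predictors whose coefficient magnitude clears the (assumed) 0.1 cutoff
retained = sorted(name for name, c in coefs.items() if abs(c) >= 0.1)
print(len(retained), 'retained:', retained)
```

With that cutoff, exactly six predictors survive, matching the count reported above.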
1 note
·
View note
Text
Linear Regression
So in my first proper post i'd like to talk a bit about Linear Regression.
In the Data Science course I am studying, we are given an example dataset (real_estate_price_size.csv) to play that contains one column with house price and another with house size. In one exercise, we were asked to create a simple linear regression using this dataset. Here is the code for it:
------------------------------------------------------------------------------
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import statsmodels.api as sm
import seaborn
seaborn.set()
import unicodeit

data = pd.read_csv('real_estate_price_size.csv')
# using the data.describe() method gives us nice descriptive statistics!
print(data.describe())

# y is the dependent variable
y = data['price']
# x1 is the independent variable
x1 = data['size']

# add a constant
x = sm.add_constant(x1)
# fit the model according to the OLS (ordinary least squares) method,
# with a dependent variable y and independent x
results = sm.OLS(y, x).fit()
print(results.summary())

plt.scatter(x1, y)
# coefficients obtained from the OLS summary. In this case, the property size
# has the multiplier of 223.1787 and the intercept of 101900 (the constant)
yhat = 223.1787 * x1 + 101900
fig = plt.plot(x1, yhat, lw=2, color='red', linestyle='dotted', label='Regression line')
plt.xlabel(unicodeit.replace('Size (m^2)'), fontsize=15)
plt.ylabel('Price', fontsize=15)
plt.title('House size and price', fontsize=20, loc='center')
plt.savefig('LinearRegression_PropertyPriceSize.png')

plt.show()
------------------------------------------------------------------------------
And here is the scatter graph of the data with a regression line!
So overall I'm very happy with this. After a bit more playing around I'd like to be able to sort out the y-axis so that the title isn't cut off!
But what does that code actually do?
A couple of things in the code that I'd like to explain:
The first 7 lines (i.e. the `import numpy` etc.) import the relevant libraries. Writing something like 'import matplotlib.pyplot as plt' means that each time you want to call on matplotlib.pyplot, you only have to write 'plt'!
The next line of code, "data = pd.read_csv('real_estate_price_size.csv')", uses the pandas library and syntax to load in the real estate data (in comma-separated value [.csv] format).
By using "print(data.describe())" we can see a variety of statistics as shown below:
4. After this, the next lines of code declare the dependent variable (y) and the independent variable (x1). With regards to the '1' after the 'x' - it's a good habit to get into labelling them this way (we will see later that we can make our regression more sophisticated [or in some cases not!] by using more than one x term [i.e. x1, x2, x3, ..., xk]).
5. The next bit is slightly more complicated - so it's probably worth taking a step back. Remember that a straight line has the equation y = mx + c, where y = dependent variable, x = independent variable, m = slope and c = intercept (where the line cuts the y-axis). So now imagine that we're trying to fit this line to some data points - we want to allow a constant term c that is non-zero.
A bit of a digression here - but hopefully this will make sense. Imagine that we are trying to predict someone's salary based on their years of experience. We collect data on both, and we want to fit a line to this data to understand how salary changes with experience.
Even if someone has zero years of experience, they still might have a base salary.
So in regression analysis, we often include a constant term (or intercept) to account for this baseline value. When you add this constant term to your data, it ensures that the regression model considers this starting salary even if someone has zero years of experience.
This constant term is like the starting point on the salary scale, just like the y-intercept is the starting point on the graph of a straight line.
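To make the intercept concrete with some hypothetical numbers (a 30,000 base salary and 2,500 per year, both invented for illustration):

```python
# Hypothetical numbers: a base salary (the intercept c) plus a per-year slope (m).
base_salary = 30000   # c: what the line predicts at zero years of experience
per_year = 2500       # m: salary increase per extra year of experience

def predicted_salary(years):
    # y = m*x + c
    return per_year * years + base_salary

print(predicted_salary(0))   # 30000 -> the intercept IS the baseline salary
print(predicted_salary(4))   # 40000
```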
x = sm.add_constant(x1) will add a constant (a column of 1s) next to our 'x1', which in this example is 'size'
So this column of ones corresponds to x0 in the equation
y_hat = b0 * x0 + b1 * x1.
So this means that x0 is always going to be 1, which in turn yields
y_hat = b0 + b1*x1
So now that we've got that cleared up, it's on to the next bit. This is the following section of code:
results = sm.OLS(y, x).fit()
print(results.summary())
sm.OLS (Ordinary Least Squares) is a method for estimating the parameters in a linear regression model
(y, x) represent the dependent variable we're trying to predict and the independent variable(s) we believe are influencing y.
.fit() is the function that fits the linear regression model to your data, estimating the coefficients that best describe the relationship between the variables.
results.summary() returns this:
6. The next section is relatively straightforward! This is the bit where we actually plot the data on the graph!
So first of all we have plt.scatter(x1, y) - which is fairly self explanatory!
Next, we declare a new variable yhat. The highlighted sections in the previous image give us the values we need for the equation. So we have 223.1787 (this is our Beta1 that is multiplying x1) and we have 1.019e+05 (which is 101,900) which is our Beta0. I suppose we could have arranged it (so that it is in the same format as the diagram) as yhat = 101900 + 223.1787*x1
So the next bit - I have to be honest - I'm not entirely sure of the arguments and their order, but essentially it looks as if
fig = plt.plot(x1, yhat, lw=2, color='red', linestyle='dotted', label='Regression line')
is determining the attributes of the graph (i.e the variables for the x and y axes, the line width (lw), colour, the style of the line etc).
The next 3 to 4 lines are quite straightforward. Essentially just giving the axes a title, specifying the font size, giving the chart a title and using the plt.savefig() method to save the image!
6 python libraries to make beautiful maps
At some point any data scientist faces the need to analyze or model geospatial data, and it can't be done without the crucial visual part. As I'm a huge map-lover, I'm glad to share with you these 6 great libraries for making informative and stylish maps. Some of the libraries shared here are more suitable for static visualizations, others for interactive ones, so the range of problems they can tackle is wide.
1. Cartopy
Cartopy is a powerful well-known library perfect for plotting static maps with scalar or polygon data. It provides many built-in layers for land, water and administrative borders. It’s really easy-to-use and has an intuitive set of commands.
As an example, let’s try to visualize MODIS data.If you want to follow along, you can find the code here.
To install the package, you can use pip:

!pip install cartopy
Now let’s load the data:import numpy as np import matplotlib.pyplot as plt
lats = np.load('lats.npy') lons = np.load('lons.npy') data = np.load('data.npy')
After this we can plot the data right away:

import cartopy.crs as ccrs
import cartopy.feature as cfeature

proj = ccrs.PlateCarree()  # let's set the map's projection

fig, ax = plt.subplots(subplot_kw=dict(projection=proj), figsize=(10, 20))  # now we need to create a figure with the pre-set projection and a size

ax.set_extent([-160, -105, 40, 70], crs=ccrs.PlateCarree())  # let's limit the coordinates to have only the region of the MODIS product

plt.contourf(lons, lats, data, transform=ccrs.PlateCarree(), cmap='summer')  # let's add a contour of the data using matplotlib

# Adding nice cartopy features
ax.add_feature(cfeature.BORDERS, edgecolor='black', linewidth=1)
ax.add_feature(cfeature.LAKES, alpha=0.5)
ax.add_feature(cfeature.LAND)
ax.add_feature(cfeature.COASTLINE, edgecolor='black', linewidth=1)
ax.add_feature(cfeature.RIVERS, edgecolor='blue', linewidth=0.5)
states_provinces = cfeature.NaturalEarthFeature(
    category='cultural',
    name='admin_1_states_provinces',
    scale='10m',
    facecolor='none')
ax.add_feature(states_provinces, edgecolor='black', zorder=10, linestyle='-', linewidth=0.5)

ax.gridlines(draw_labels=True)  # formatting the grid
As you can judge from the result, cartopy provides a great abundance of ways to customize your map, as you can manually set colors, linewidth, density and other parameters of your layers. Additionally, the code itself is really intuitive and easy to understand.
Cartopy is one of the tools I regularly use in my work, and I hope that you’ll find it extremely helpful as well!
2. Folium
Now let’s add the world’s countries. To do that I’ll use the defualt geopandas dataframe:import geopandas as gpd df = gpd.read_file(gpd.datasets.get_path('naturalearth_lowres'))
import folium

map = folium.Map(zoom_start=4, tiles="Cartodb Positron")
gdf_json = df.to_json()
3. Plotly
Plotly is another famous library known for its beautiful interactive charts. Among many features, it has several functions to plot maps, such as px.choropleth, px.choropleth_mapbox, px.scatter_mapbox, px.scatter_geo and some others. You can find more details here.
As a demonstration let’s plot the same geo pandas dataset, but this time visualizing gdp_md_est variable. It can be done very easily within the following lines:import plotly.express as px
4. ipyleaflet
The fourth library I want to show you is ipyleaflet. It's another great JS-based library for interactive mapping. One of my favorite things about this package is the amount of tiles it has. So let's start with the basic one:

from ipyleaflet import Map
You can play around with other options, there are literally dozens of them!
Now let’s plot some real world data. One of the most impressive features of the library I found is Velocity visualization. For this purpose, we can use NCEP(WMC) forecast data, which has 1° resolution. The dataset is acquired for 2016–04–30 06:00 and was provided in the docs of the ipyleaflet library. To read it we will use xarray, which is perfect to read netCDF4 files.from ipyleaflet.velocity import Velocity import xarray as xr import os import requests
if not os.path.exists('wind-global.nc'):
    url = 'https://github.com/benbovy/xvelmap/raw/master/notebooks/wind-global.nc'
    r = requests.get(url)
    wind_data = r.content
    with open('wind-global.nc', 'wb') as f:
        f.write(wind_data)
from ipyleaflet import basemaps

m = Map(center=(45, 2), zoom=4, interpolation='nearest', basemap=basemaps.CartoDB.DarkMatter)
ds = xr.open_dataset('wind-global.nc')
wind = Velocity(data=ds, zonal_speed='u_wind', meridional_speed='v_wind', latitude_dimension='lat', longitude_dimension='lon', velocity_scale=0.01, max_velocity=20) m.add(wind)
As you can see the output is not simply an interactive map, it’s an animated one. So it definitely enhances data representativeness and makes your data speak!
5. geemap
Geemap is a package for interactive mapping integrated with Google Earth Engine (GEE). So obviously it's really convenient when you're working with the python ee library, which is the Python package for GEE.
As a demonstration, let’s collect land cover data from Dynamic World product for an island in the Northern Europe:import ee
radius = 1250 point=ee.Geometry.Point([19.9, 60.2]) roi = point.buffer(radius) #setting a circle around the point of interest
DW = ee.ImageCollection("GOOGLE/DYNAMICWORLD/V1")\
    .filterDate(start='2022-07-08', end='2022-08-30')\
    .filterBounds(roi)  # getting the data
DW_list = DW.toList(DW.size())  # converting the data to a GEE List
Now we can do the plotting:m = geemap.Map(center=[60.2, 19.9], zoom=14)
I’d say that geemap is a really great tool to work with GEE. It has tons of different function, which can solve a solid range of tasks. The main and only disadvantage is it’s not super user-friendly. You need to know ee library syntaxis and have a general understanding of how GEE works before using geemap.
6. ridgemap
This library is the last and truly my favorite one, since it allows to make truly unique plots, which are pieces of art.
Before plotting, let’s install two libs:!pip install ridge_map mplcyberpunk
Now let’s create a map:import matplotlib.pyplot as plt from ridge_map import FontManager, RidgeMap import ridge_map as rm import mplcyberpunk import matplotlib.font_manager as fm
plt.style.use("cyberpunk") plt.rcParams["figure.figsize"] = (16,9)
fm = FontManager('https://github.com/google/fonts/blob/main/ofl/arbutusslab/ArbutusSlab-Regular.ttf?raw=true')
r = RidgeMap(bbox=(-15, 32, 45,90), font=fm.prop) #creating a map
values = r.get_elevation_data(num_lines=200)  # getting elevation data
values = r.preprocess(values=values,  # setting parameters
                      water_ntile=70,
                      vertical_ratio=40,
                      lake_flatness=3)
From my point of view, this is awesome! You can check out the library, find plenty of other visualizations and post your own:)
Hopefully you’ll find these libraries helpful and worth including in your toolbox.