
Regression Analysis

1. The document uses ordinary least squares (OLS) regression to analyze the Boston housing dataset. It loads the data, selects the predictor ('RM') and the response ('target'), imports statsmodels to run the OLS regression, prints the regression results, and records the R-squared value (0.90).
2. It then applies multiple linear regression (MLR) to the same Boston housing dataset. It drops the 'target' column to form the predictors, calculates the correlations between predictors, runs OLS on all predictors with statsmodels, prints the results, records the R-squared value (0.96) and a correlation value (0.29), and saves MD5 hashes of the recorded values to pickle files.

Uploaded by

Ragnar Main

1. OLS (Ordinary Least Squares, the algorithm used here)

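Cell 1:- (Just press Shift + Enter; no need to write the code below)

# Note: load_boston was removed in scikit-learn 1.2, so this cell needs an older version.

from sklearn.datasets import load_boston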

import pandas as pd

boston = load_boston()

dataset = pd.DataFrame(data=boston.data, columns=boston.feature_names)

dataset['target'] = boston.target

print(dataset.head())

Cell 2:- 

###Start code here

X = dataset['RM']

Y = dataset['target']

###End code(approx 2 lines)

(Shift + Enter)

Cell 3:- 

###Start code here


import statsmodels.api as sm

###End code(approx 1 line)

(Shift + Enter)

Cell 4:-

###Start code here

X = sm.add_constant(X)

statsModel = sm.OLS(Y,X)

fittedModel = statsModel.fit()

###End code(approx 2 lines)

(Shift + Enter)
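
The call to sm.add_constant appends a column of ones to X so that the model estimates an intercept as well as a slope; without it, the fitted line is forced through the origin. The sketch below illustrates the difference in plain Python on made-up toy numbers (not the Boston data). The function names fit_line and fit_through_origin are hypothetical helpers, not part of statsmodels.

```python
def fit_line(x, y):
    """Closed-form least-squares intercept and slope for y ~ b0 + b1 * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def fit_through_origin(x, y):
    """Least-squares slope for y ~ b1 * x (no intercept term)."""
    return sum(xi * yi for xi, yi in zip(x, y)) / sum(xi ** 2 for xi in x)


# Toy data generated from y = 1 + 2x (an assumption for illustration).
x = [1.0, 2.0, 3.0, 4.0]
y = [3.0, 5.0, 7.0, 9.0]

intercept, slope = fit_line(x, y)          # recovers the intercept and slope
slope_no_const = fit_through_origin(x, y)  # slope is distorted by the missing intercept

print(intercept, slope, slope_no_const)
```

With the intercept included the fit recovers the generating line; without it the slope absorbs the offset and comes out larger.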

Cell 5:-

###Start code here

print(fittedModel.summary())

###End code(approx 1 line)

(Shift + Enter)

Cell 6:-

###Start code here

r_squared = 0.90

###End code(approx 1 line)


(Shift + Enter)
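
The exercise asks for the R-squared value to be typed in after reading it from the printed summary. As a cross-check, the fitted statsmodels results object also exposes the same number as its rsquared attribute (here, fittedModel.rsquared). For a model with an intercept, R-squared is 1 minus the ratio of the residual sum of squares to the total sum of squares. A minimal sketch of that definition in plain Python follows, on made-up toy data; the function name r_squared_simple is hypothetical.

```python
def r_squared_simple(x, y):
    """R-squared of a one-predictor least-squares fit with an intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Closed-form slope and intercept of the least-squares line.
    slope = (sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
             / sum((xi - mean_x) ** 2 for xi in x))
    intercept = mean_y - slope * mean_x
    # Residual and total sums of squares.
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# Toy data (an assumption for illustration, not the Boston data).
toy_x = [1.0, 2.0, 3.0, 4.0, 5.0]
toy_y = [2.0, 4.0, 5.0, 4.0, 5.0]

toy_r2 = r_squared_simple(toy_x, toy_y)
print(round(toy_r2, 2))  # 0.6
```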

Cell 7:- (Just press Shift + Enter; no need to write the code below)

import hashlib

import pickle

def gethex(ovalue):

  hexresult=hashlib.md5(str(ovalue).encode())

  return hexresult.hexdigest()

def pickle_ans1(value):

  # Hash the answer once, then print it and store it in the pickle file.

  hexresult=gethex(value)

  with open('ans/output1.pkl', 'wb') as file:

    print(hexresult)

    pickle.dump(hexresult,file)

pickle_ans1(r_squared)
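
The cell above stores the MD5 hash of str(r_squared), not the number itself, so the saved answer depends on the exact string form of the value. The sketch below shows a round trip under stated assumptions: it writes to a temporary directory instead of the ans folder, and the value 0.5 is arbitrary. Note that open() does not create missing directories, so the ans folder must already exist when the original cell runs.

```python
import hashlib
import os
import pickle
import tempfile


def gethex(ovalue):
    """MD5 hex digest of the value's string form (same logic as the cell above)."""
    return hashlib.md5(str(ovalue).encode()).hexdigest()


value = 0.5  # arbitrary illustrative value, not an exercise answer

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, 'output1.pkl')
    # Write the hash, then read it back to confirm what was stored.
    with open(path, 'wb') as file:
        pickle.dump(gethex(value), file)
    with open(path, 'rb') as file:
        stored = pickle.load(file)

round_trip_ok = stored == gethex(value)
# str(0.90) and str(0.9) are both '0.9', so trailing zeros do not change the hash.
same_hash = gethex(0.90) == gethex(0.9)
# Any extra digit changes the string and therefore the hash.
different_hash = gethex(0.9) != gethex(0.901)

print(round_trip_ok, same_hash, different_hash)
```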

2. MLR (Multiple Linear Regression Analysis)

To execute each cell, press Shift + Enter.

Cell 1:-

from sklearn.datasets import load_boston

import pandas as pd

boston = load_boston()

dataset = pd.DataFrame(data=boston.data, columns=boston.feature_names)

dataset['target'] = boston.target

print(dataset.head())

Cell 2:-

X = dataset.drop('target',axis=1)

Y = dataset['target']

Cell 3:-

print(X.corr())

corr_value = 0.29
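
X.corr() prints the full matrix of pairwise Pearson correlations, from which a single value is then read off by eye. A single entry can also be selected by row and column label. The sketch below uses a made-up toy DataFrame with hypothetical column names a, b and c; it does not assume which pair of Boston columns the recorded value refers to.

```python
import pandas as pd

# Toy data (an assumption for illustration, not the Boston predictors).
toy = pd.DataFrame({
    'a': [1.0, 2.0, 3.0, 4.0, 5.0],
    'b': [2.0, 4.0, 5.0, 4.0, 5.0],
    'c': [5.0, 3.0, 4.0, 1.0, 2.0],
})

corr_matrix = toy.corr()               # Pearson correlation by default
corr_ab = corr_matrix.loc['a', 'b']    # one entry, selected by label
corr_ab_direct = toy['a'].corr(toy['b'])  # same value without the full matrix

print(round(corr_ab, 2))
```

The matrix is symmetric with ones on the diagonal, so the row and column labels can be given in either order.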

Cell 4:-

import statsmodels.api as sm

X = sm.add_constant(X)

fitted_model = sm.OLS(Y,X).fit()

print(fitted_model.summary())
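
With several predictors, the same least-squares idea applies: the column of ones from sm.add_constant becomes the intercept column of the design matrix, and one coefficient is estimated per column. The sketch below reproduces this with NumPy's np.linalg.lstsq on made-up toy data; it is a stand-in for illustration, not how statsmodels is implemented, and the names features, response and design are hypothetical.

```python
import numpy as np

# Toy predictors (two columns) and a response generated exactly from
# response = 1 + 2 * x1 - 3 * x2 (an assumption for illustration).
features = np.array([
    [1.0, 0.0],
    [0.0, 1.0],
    [1.0, 1.0],
    [2.0, 1.0],
    [1.0, 3.0],
])
response = 1.0 + 2.0 * features[:, 0] - 3.0 * features[:, 1]

# Prepend a column of ones, which plays the role of sm.add_constant.
design = np.column_stack([np.ones(len(features)), features])

# Solve the least-squares problem; coef holds [intercept, b1, b2].
coef, *_ = np.linalg.lstsq(design, response, rcond=None)

print(np.round(coef, 6))
```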

Cell 5:-

r_squared = 0.96

Cell 6:-

import hashlib

import pickle

def gethex(ovalue):

  hexresult=hashlib.md5(str(ovalue).encode())

  return hexresult.hexdigest()

def pickle_ans1(value):

  # Hash the answer once, then print it and store it in the pickle file.

  hexresult=gethex(value)

  with open('ans/output1.pkl', 'wb') as file:

    print(hexresult)

    pickle.dump(hexresult,file)

def pickle_ans2(value):

  # Same as pickle_ans1, but writes to the second output file.

  hexresult=gethex(value)

  with open('ans/output2.pkl', 'wb') as file:

    print(hexresult)

    pickle.dump(hexresult,file)

pickle_ans1(corr_value)

pickle_ans2(r_squared)
