Swapnil Saurav

QualityThought Learn ML – Code

JULY 31 2022: SQL Programming

 

select * from olym.olym_events

select * from olym.olym_base_events

select * from olym.olym_disciplines

select ID, SPORT from olym.olym_sports

select S.ID, S.SPORT, D.Discipline from olym.olym_sports S, olym.olym_disciplines D WHERE S.ID = D.sport_id

select * from olym.olym_medals_view where Edition=1996 and discipline='Tennis' and Gender='Men' and event = 'singles'
select * from olym.olym_medals_view where Edition>1950 and NOC='IND' and athlete like '%KU%' order by edition desc

Date:  AUGUST 1  2022


import sqlite3
import pymysql
con = sqlite3.connect('library.db') #1. Make connection to your Database
#con = pymysql()
dbobj = con.cursor()
command = '''
Create Table Books(
BOOKID INTEGER PRIMARY KEY,
TITLE TEXT,
PRICE REAL,
COPIES INTEGER
)
'''
command = '''
Insert Into BOOKS(
BookID, Title, Price,Copies
) values(3, 'Practice Machine Learning',410.25, 18)
'''
#dbobj.execute(command)
#con.commit()

command = '''
Delete from Books where Bookid=2
'''
#dbobj.execute(command)
#con.commit()

command = '''
Update Books set copies = 9 where Bookid=3
'''
#dbobj.execute(command)
command = '''
Select * from Books
'''
dbobj.execute(command)
records = dbobj.fetchall()
for r in records:
    current_count = r[3]
    print(r[3]) #Tuple

if current_count >0:
    # "?" placeholder lets sqlite3 substitute the value safely
    command = '''
    Update Books set copies = ? where Bookid=3
    '''
    dbobj.execute(command, (current_count-1,))
    con.commit()
else:
    print("Sorry, we do not have any copies left")
print("After removing one value: ")
command = '''
Select * from Books where BookID=3
'''
dbobj.execute(command)
records = dbobj.fetchall()
for r in records:
    #current_count = r[3]
    print(r)
    #print(r[3]) #Tuple

while True:
    print("1. Add a New Book")
    print("2. Issue a Book")
    print("3. Display all books")
    print("4. Return a book")
    print("5. Exit")
    ch=int(input("Enter your choice: "))
    if ch == 5:
        break
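
The insert, select and update steps above can also be written with `?` placeholders instead of hand-built SQL strings. This is only a minimal sketch: the in-memory database and the helper name `issue_book` are assumptions for illustration and are not part of the class code.

```python
import sqlite3

# In-memory database so the sketch leaves no file behind
# (an assumption; the notes use 'library.db').
con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.execute(
    "CREATE TABLE Books (BOOKID INTEGER PRIMARY KEY, TITLE TEXT, PRICE REAL, COPIES INTEGER)"
)
cur.execute(
    "INSERT INTO Books (BOOKID, TITLE, PRICE, COPIES) VALUES (?, ?, ?, ?)",
    (3, "Practice Machine Learning", 410.25, 18),
)
con.commit()


def issue_book(connection, book_id):
    """Hypothetical helper: reduce COPIES by one if any copy is left."""
    c = connection.cursor()
    c.execute("SELECT COPIES FROM Books WHERE BOOKID = ?", (book_id,))
    row = c.fetchone()
    if row is None or row[0] <= 0:
        return False  # unknown book, or no copies left
    c.execute("UPDATE Books SET COPIES = ? WHERE BOOKID = ?", (row[0] - 1, book_id))
    connection.commit()
    return True


issued = issue_book(con, 3)
copies_left = cur.execute("SELECT COPIES FROM Books WHERE BOOKID = 3").fetchone()[0]
print(issued, copies_left)
```

A function like this could sit behind option 2 of the menu, with the book id read from `input()`.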

AUGUST  2  2022

 

# File operations:
## Read - r / r+
## Write - w / w+
## Append - a
fileobj = open("files\\myfile.txt","a")
my_poem = '''HI
How are you today
You should be fine today
lets have a great day today
Enjoy your day today'''
print(fileobj.writable())
fileobj.write(my_poem)
fileobj.close()
fileobj = open("files\\myfile.txt","r")
output = fileobj.read(100)
fileobj.seek(0)
print(output)
output = fileobj.readline()
#output = fileobj.readlines()
print(output)
print("---------------")
fileobj.seek(49)
output = fileobj.read(10)
print(output)
fileobj.close()
## Read and remove vowels and save back

 

 

# File operations:
## Read - r / r+
## Write - w / w+
## Append - a
fileobj = open("files\\myfile.txt","w")
my_poem = '''HI
How are you today
You should be fine today
lets have a great day today
Enjoy your day today'''
fileobj.write(my_poem)
fileobj.close()
fileobj = open("files\\myfile.txt","r")
output = fileobj.read()
fileobj.close()
for i in "aeiouAEIOU":
    new_content = output.split(i)
    output = "".join(new_content)
fileobj = open("files\\myfile.txt","w")
fileobj.write(output)
fileobj.close()
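
The same read, remove vowels, save back exercise can be sketched with `with` blocks (which close the file automatically) and `str.translate`. The temporary folder and the shortened poem are assumptions made so the sketch runs anywhere; the notes write to `files\\myfile.txt`.

```python
import os
import tempfile

poem = "HI\nHow are you today\nEnjoy your day today"

# A temporary directory stands in for the 'files' folder used in the notes.
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "myfile.txt")
    with open(path, "w") as fileobj:
        fileobj.write(poem)
    with open(path, "r") as fileobj:
        content = fileobj.read()
    # str.translate deletes every character listed in the third argument.
    no_vowels = content.translate(str.maketrans("", "", "aeiouAEIOU"))
    with open(path, "w") as fileobj:
        fileobj.write(no_vowels)
    with open(path, "r") as fileobj:
        saved = fileobj.read()

print(saved)
```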
#JSON

{
    "Name": "Sachin Tendulkar",
    "Teams": ["Mumbai","MI","India"],
    "Kids": {
        "Name": ["Arjun", "Saara"],
        "Age": [23,25]
    }
}

#load / loads = read JSON from a file / from a string
#dump / dumps = write JSON to a file / to a string
import json
txt = '{ "Name": "Sachin Tendulkar", "Teams": ["Mumbai","MI","India"] , "Branch":["A","B","C"] }'

jsonobj = json.loads(txt)

print(json.dumps(jsonobj, indent=5, sort_keys=True))
jsonfile = open("myjson.json","w")
json.dump(jsonobj, jsonfile,indent=5)
jsonfile.close()

#Program to read a dictionary using loop and save the content as json in a file

txt = '{"name":"Rohit","teams":["MI","IND","M"]}'
data = json.loads(txt) #convert the JSON string into a dictionary first
f = open("files\\jsonfile.txt",'w')
json.dump(data, f,indent=5)
f.close()
f = open("files\\jsonfile.txt",'r')
content = json.load(f)
f.close()
print("After loading \n",content)
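
A short sketch of the difference between the four JSON functions, reusing the dictionary from the notes. `io.StringIO` stands in for an open file here; that is an assumption made only to keep the example self-contained.

```python
import io
import json

player = {"name": "Rohit", "teams": ["MI", "IND", "M"]}

text = json.dumps(player, indent=2)   # dict -> JSON string
restored = json.loads(text)           # JSON string -> dict

# dump / load work on file objects; StringIO behaves like an open text file.
buffer = io.StringIO()
json.dump(player, buffer)
buffer.seek(0)
from_file = json.load(buffer)

# Dumping a string (instead of a dict) stores a single quoted JSON string,
# so loading it back gives a str, not a dict.
as_string = json.loads(json.dumps('{"name": "Rohit"}'))
print(type(restored).__name__, type(from_file).__name__, type(as_string).__name__)
```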


AUGUST 4,  2022

 

try:
    num = int(input("Enter a number: "))
    a = 5
    b = 0
    val = a / b

except ValueError:
    print("Ending the execution of program because you have not entered a valid number")
except ZeroDivisionError:
    print("Zero division error, please retry")
except Exception:
    print("Not sure what but some error occurred")
finally:
    print("I am in finally")
print("Thank you")

# Errors:
#1. Syntax error
#2. Logical error
#3. Exception or runtime
#4. Exceptions: ZeroDivisionError, ValueError

while True:
    try:
        num1 = int(input("Enter first number: "))
        break
    except ValueError:
        print("Invalid number, please try again")

while True:
    try:
        num2 = int(input("Enter second number: "))
        break
    except ValueError:
        print("Invalid number, please try again")

sum = num1 + num2
print("Total: ",sum)


#WAP to input marks in 5 subjects and calculate total and average- use exception where necessary
class NegativeNumber(Exception): #user-defined exceptions should derive from Exception
    pass
try:
    num_marks = int(input("Total number of subjects: "))
    if num_marks <0:
        raise NegativeNumber
    sum = 0
    for i in range(num_marks):
        while True:
            try:
                marks = int(input("Enter marks in subject " + str(i + 1) + ": "))
                break
            except ValueError:
                print("Invalid marks, try again!")
        sum += marks
    avg = sum / num_marks
    print("Total avg = ", avg)
except ValueError:
    print("Invalid input, exiting...")
except NegativeNumber:
    print("Sorry you are not allowed to enter Negative numbers, exiting...")
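
The marks exercise can also be sketched as a function that raises exceptions and leaves the handling to the caller, which makes it testable without `input()`. The helper name `average_marks` and the sample marks are assumptions for illustration.

```python
class NegativeNumber(Exception):
    """Raised when a mark is negative."""


def average_marks(raw_marks):
    """Hypothetical helper: convert strings to ints and return (total, average)."""
    marks = []
    for text in raw_marks:
        value = int(text)          # raises ValueError for non-numeric text
        if value < 0:
            raise NegativeNumber(text)
        marks.append(value)
    if not marks:
        raise ValueError("at least one subject is required")
    total = sum(marks)
    return total, total / len(marks)


print(average_marks(["70", "80", "90", "60", "50"]))
for bad in (["70", "abc"], ["70", "-5"], []):
    try:
        average_marks(bad)
    except NegativeNumber:
        print("negative marks rejected")
    except ValueError:
        print("invalid input rejected")
```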

6 AUGUST  2022

#lambda function
#anonymous function
l1 = lambda x,y : x*y
print(l1(5,4))

#map:
ls1 = [2,4,8,16,32,64]
ls2=[]
for i in ls1:
    ls2.append(i**2)
print(ls2)

#map in list
result = map(lambda x: x**2, ls1)
print(list(result))

#filter
ls3 = [2,4,6,8,10,12,15,18,20,25,28,30,40]
#I want multiples of 5- that means
#filter out those which are not multiples of five
filtered_val = list(filter(lambda x: x%5==0,ls3))
print(filtered_val)
filtered_val = list(filter(lambda x: x>=18,ls3))
print(filtered_val)

#reduce
ls3 = [2,4,6,8,10,12,15,18,20,25,28,30,40]
#cumulative sum:
sum=0
for i in ls3:
    sum+=i
print("Sum: ",sum)

from functools import reduce
sol = reduce(lambda x,y: x+y, ls3)
print(sol)

# take a list of values (c) and using map convert them into F

#take a list of values and filter out values which are multiples of 3 and 7 only

# take a list of values (c) and using map convert them into F
ls1 = [2,3,54,6,7,87,65]
print(list(map(lambda x : (x*(9/5)+32),ls1)))

#take a list of values and filter out values which are multiples of 3 and 7 only
ls2 = [3,6,21,34,42,63,65,78,189]
print(list(filter(lambda x : x%3==0 and x%7==0,ls2)))

8 AUGUST 2022

Program to read content from wikipedia page:

 

import requests
link = "https://en.wikipedia.org/wiki/List_of_Indian_people_by_net_worth"
website_content = requests.get(url=link).text
#print(website_content)
from bs4 import BeautifulSoup
s = BeautifulSoup(website_content,'lxml')
#print(s.prettify())
print(s.title.string)
#tables = s.find_all('table')
my_table = s.find('table', class_ = "wikitable sortable")
table_links = my_table.find_all('a')
#print(table_links)
rich_indians =[]
for l in table_links:
    rich_indians.append(l.get('title'))
rich_indians.pop(0)
rich_indians.pop(0)
print(rich_indians)

9 AUGUST 2022

#NUMPY - matrix like datastructure
import numpy as np
x = range(9)
print(type(x))
x = np.reshape(x,(3,3))
print(x)
print(type(x))
print("Shape of the numpy: ",x.shape)
y=[[2,3,4],[5,6,2],[3,7,4]]
y = np.array(y)
print(y)
print(y[0])
print(y[0,2])
print(y[:,2])

#dummy values to the numpy
z = np.zeros((4,4))
print(z)
z = np.ones((4,4))
print(z)
z = np.full((4,4),2)
print(z)
idm1 = np.identity(3, dtype=int)
print(idm1)
print("Operation")
x=[[5,1,0],[1,1,2],[3,0,4]]
x = np.array(x)
y=[[2,3,4],[5,6,2],[3,7,4]]
y = np.array(y)
print(x)
print(y)
#print(x+y)
#print(x-y)
#print(x*y)
print(x/y)
#for above operations both matrices should have same shape
#MATRIX MULTIPLICATION
## condition a *b matmul m * n => b should be equal to m
x=[[5,1,0],[1,1,2],[3,0,4]]
x = np.array(x)
y=[[2,3,4],[5,6,2],[3,7,4]]
y = np.array(y)
print(x)
print(y)
z = np.matmul(x,y)
print(z)

#determinant
a= np.array([[23,14],[37,28]])
det_a = np.linalg.det(a)
print(det_a)
inv_b = np.linalg.inv(a)
print(inv_b)
print(np.matmul(a,inv_b))

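a= np.array([[23,28],[23,28]])
det_a = np.linalg.det(a)
print(det_a)
#A matrix with zero determinant is a singular matrix: it has no inverse
try:
    inv_b = np.linalg.inv(a)
    print(inv_b)
    print(np.matmul(a,inv_b))
except np.linalg.LinAlgError:
    print("Singular matrix: inverse does not exist")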

10 AUGUST 2022

# 3x +4y - 7z = 2
# -2x +y -z = -6
# x +y + z = 2
#form 3 matrices:
## Coefficient matrix
## Variable matrix
## Constant matrix
### Coefficient matrix X Variable Matrix = Constant Matrix
# 5X = 15 => X = 15/5
# => variable matrix = inverse of Coefficient matrix * Constant matrix
import numpy as np
coeff_matrix = np.array([[3,4,-7],[-2,1,-1],[1,1,1]])
cont_matrix = np.array([[2],[-6],[2]])
det_coeff = np.linalg.det(coeff_matrix)
if np.isclose(det_coeff, 0): #determinant is a float, so avoid comparing with == 0
    print("Solution is not possible")
else:
    variable_mat = np.matmul(np.linalg.inv(coeff_matrix) , cont_matrix)
    print(variable_mat)
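
As a cross-check of the inverse-based solution, the same system can be passed to `np.linalg.solve`, which solves it without forming the inverse explicitly. This is a sketch using the coefficients from the three equations above; the variable names are assumptions.

```python
import numpy as np

# 3x + 4y - 7z = 2 ; -2x + y - z = -6 ; x + y + z = 2
coeff = np.array([[3, 4, -7], [-2, 1, -1], [1, 1, 1]], dtype=float)
const = np.array([2, -6, 2], dtype=float)

solution = np.linalg.solve(coeff, const)   # values of x, y, z
print(solution)

# Substituting back should reproduce the constants (up to rounding).
print(np.allclose(coeff @ solution, const))
```

Solving by hand gives x = 76/31, y = -24/31, z = 10/31, which is what the printed values should approximate.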

AUGUST 11, 2022

# Permutation & Combination
# => selecting r things from n things
## in Permutation Order Matters - 2 cases: with or without replacement
## in Combination Order Doesnt Matter - 2 cases: with or without replacement

### P = n! / (n-r)!
### C = n! / [(n-r)! r!]
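
The two formulas above (the without-replacement case) can be checked against the standard library. A small sketch, assuming Python 3.8+ for `math.perm` and `math.comb`, with n = 10 and r = 4 taken from the student example:

```python
import math

n, r = 10, 4

# Without replacement: the formulas from the notes
p_formula = math.factorial(n) // math.factorial(n - r)
c_formula = math.factorial(n) // (math.factorial(n - r) * math.factorial(r))
print(p_formula, math.perm(n, r))
print(c_formula, math.comb(n, r))

# With replacement
p_with_repl = n ** r                     # ordered, repetition allowed
c_with_repl = math.comb(n + r - 1, r)    # unordered, repetition allowed
print(p_with_repl, c_with_repl)
```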

# 10 students - 4students->

#4 Coats, 3 hats, 2 umbrellas

from scipy.special import perm,comb
result = comb(10,4)
print(result)
# 6B , 4 G => 4 Students:
#1. 4B+0G, 3B +1G, 2B + 2G, 1B + 3G, 0B+4G
# choices within one case multiply; the five cases add up
c1 = comb(6,4)                # 4B + 0G
c2 = comb(6,3) * comb(4,1)    # 3B + 1G
c3 = comb(6,2) * comb(4,2)    # 2B + 2G
c4 = comb(6,1) * comb(4,3)    # 1B + 3G
c5 = comb(4,4)                # 0B + 4G
result = c1+c2+c3+c4+c5
print(result)

##4 Coats, 3 hats, 2 umbrellas
## 2
c1=perm(4,2)
c2 = perm(3,2)
c3 = perm(2,2)
result = c1 * c2 * c3
print(result)

###################
#Own a factory: 2 kinds of products: desktop & laptops
#each desktop gives you Rs 1000
# each laptop gives you Rs 2000
### How much is your profit?
# profit: 1000 * D + 2000 * L =========> OBJECTIVE
# manpower: 5000 min: D= 100 L= 41
##50 min 120 min <= Total of 5000 min
##1 2 <= 1000
# D = 1000 , L=500

##HDD: 1000
# 1 1:

# F -Full worker, P , R
#Obj: 200*F + 80 * P + 40*R
#Constraints:
## 200*F + 80 * P + 40*R <=4000

14 AUGUST  2022

## Scipy
import numpy as np
from scipy.optimize import minimize, LinearConstraint, linprog

x = 1;y = 1
profit_desktop, profit_notebook = 1000, 750
profit = profit_desktop*x + profit_notebook*y

obj_function = [-profit_notebook, -profit_desktop] #converting maximize to minimize
## constraints
lhs_contraint = [[1,1],[1,2],[4,3]]
rhs_constraint = [10000,15000,25000]
bounds =[(0,float("inf")),
(0,10000)]
opt_sol = linprog(c=obj_function, A_ub=lhs_contraint, b_ub=rhs_constraint,bounds=bounds,
                  method="highs") # "revised simplex" was removed in newer SciPy releases
if opt_sol.success:
    print("Solution is ",opt_sol)

# x + y +2000 =10000
# x+2y +0 =15000 #
# 4x + 3y + 0 <= 25000 #
# 1000 7000


### Pandas: library
### data type is called dataframe
data = [[1,"Rohit"],[2,"Pant"],[3,"Surya"],[4,"Dhawan"],[5,"Kohli"]]
import pandas as pd
data_df = pd.DataFrame(data, columns=["Position","Player"],index=["First","Second","Third","Fourth","Fifth"])
print(data_df)

#fruit production
data = {
"Apples": [100,200,150,250],
"Oranges":[250,200,300,200],
"Mangoes":[150,700,800,50]
}
data_df = pd.DataFrame(data,index=["Q1 2021","Q2 2021","Q3 2021","Q4 2021"])
print(data_df)

16  AUGUST  2022


# Monday to Friday - 10am to 12 noon
# online class

## Saturday - only offline class- practice
## Sunday - only - practice
## #######################
#Pandas
import pandas as pd
link="https://raw.githubusercontent.com/swapnilsaurav/Dataset/master/hotel_bookings.csv"
hotel_df = pd.read_csv(link)
#print(hotel_df)
df_shape = hotel_df.shape
print("Shape: ",df_shape)
print("Total rows = ",df_shape[0])
print("Data types: ", hotel_df.dtypes)
print(hotel_df['hotel'])
#filter numeric column
import numpy as np
numericval_df = hotel_df.select_dtypes(include=[np.number])
print(numericval_df)
numeric_cols =numericval_df.columns.values
print("Numeric columns in Hotel df is \n",numeric_cols)
#get non-numeric values
#exclude
nonnumericval_df = hotel_df.select_dtypes(exclude=[np.number])
print(nonnumericval_df)
nonnumeric_cols =nonnumericval_df.columns.values
print("Non-numeric columns in Hotel df is \n",nonnumeric_cols)

import matplotlib.pyplot as plt
#from matplotlib.pyplot import figure
#plt.figure((6,3))
import seaborn as sns
cols_25 = hotel_df.columns[:25]
colors = ['#FF5733','#3333FF']
sns.heatmap(hotel_df[cols_25].isnull(), cmap=sns.color_palette(colors))
plt.show()

for c in hotel_df.columns:
    pct_missing = (np.mean(hotel_df[c].isnull()))*100
    if pct_missing>85:
        print(f"{c} - {pct_missing}%")

17 AUGUST  2022


#Pandas
import pandas as pd
link="https://raw.githubusercontent.com/swapnilsaurav/Dataset/master/hotel_bookings.csv"
hotel_df = pd.read_csv(link)
#print(hotel_df)
df_shape = hotel_df.shape
#print("Shape: ",df_shape)
#print("Total rows = ",df_shape[0])
#print("Data types: ", hotel_df.dtypes)
#print(hotel_df['hotel'])
#filter numeric column
import numpy as np
numericval_df = hotel_df.select_dtypes(include=[np.number])
print(numericval_df)
numeric_cols =numericval_df.columns.values
print("Numeric columns in Hotel df is \n",numeric_cols)
#get non-numeric values
#exclude
nonnumericval_df = hotel_df.select_dtypes(exclude=[np.number])
print(nonnumericval_df)
nonnumeric_cols =nonnumericval_df.columns.values
#print("Numeric non-columns in Hotel df is \n",nonnumeric_cols)

import matplotlib.pyplot as plt
#from matplotlib.pyplot import figure
#plt.figure((6,3))
import seaborn as sns
cols_25 = hotel_df.columns[:25]
colors = ['#FF5733','#3333FF']
sns.heatmap(hotel_df[cols_25].isnull(), cmap=sns.color_palette(colors))
plt.show()

for c in hotel_df.columns:
    missing = hotel_df[c].isnull()
    num_missing = np.sum(missing)

    pct_missing = (np.mean(hotel_df[c].isnull())) * 100
    if pct_missing > 85:
        print(f"{c} - {pct_missing}%")

for c in hotel_df.columns:
    missing = hotel_df[c].isnull()
    num_missing = np.sum(missing)
    if num_missing >0:
        hotel_df[f'{c}_missing'] = missing
#print(hotel_df.shape)
#create missing total column
missing_col_list = [c for c in hotel_df.columns if '_missing' in c]
print(missing_col_list)
hotel_df['_missing'] = hotel_df[missing_col_list].sum(axis=1)
#create bar graph
#plot the counts directly: the column names produced by reset_index() differ between pandas versions
hotel_df['_missing'].value_counts().sort_index().plot.bar()
plt.show()
# delete the not required columns and rows
print("Before row dropping: ",hotel_df.shape)
row_missing = hotel_df[hotel_df['_missing'] > 10].index
print("========== ROW MISSING: \n",row_missing)
hotel_df = hotel_df.drop(row_missing, axis=0) #axis = 0: look for each row
hotel_df = hotel_df.drop(['company'],axis=1)
print("After row & column dropping: ",hotel_df.shape)

for c in hotel_df.columns:
    missing = hotel_df[c].isnull()
    num_missing = np.sum(missing)

    pct_missing = (np.mean(hotel_df[c].isnull())) * 100
    if pct_missing > 0:
        print(f"{c} - {pct_missing}%")

med = hotel_df['babies'].median()
hotel_df['babies'] = hotel_df['babies'].fillna(med)
med = hotel_df['children'].median()
hotel_df['children'] = hotel_df['children'].fillna(med)

mode = hotel_df['meal'].describe()['top']
hotel_df['meal'] = hotel_df['meal'].fillna(mode)

mode = hotel_df['country'].describe()['top']
hotel_df['country'] = hotel_df['country'].fillna(mode)

med = hotel_df['agent'].median()
hotel_df['agent'] = hotel_df['agent'].fillna(med)

mode = hotel_df['deposit_type'].describe()['top']
hotel_df['deposit_type'] = hotel_df['deposit_type'].fillna(mode)

print("Missing values after all replacement:")
for c in hotel_df.columns:
    missing = hotel_df[c].isnull()
    num_missing = np.sum(missing)

    pct_missing = (np.mean(hotel_df[c].isnull())) * 100
    if pct_missing > 0:
        print(f"{c} - {pct_missing}%")
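
The missing-value steps above (measure, drop, fill with median or mode) can be sketched on a tiny made-up table. The values in `toy_df` and the 70% threshold are assumptions for illustration and are not taken from the hotel dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical mini table with the same kinds of gaps as the hotel data
toy_df = pd.DataFrame({
    "children": [0, 1, np.nan, 2],
    "meal": ["BB", None, "BB", "HB"],
    "company": [np.nan, np.nan, np.nan, 40.0],
})

# Percentage of missing values per column, without an explicit loop
pct_missing = toy_df.isnull().mean() * 100
print(pct_missing)

# Drop columns that are mostly empty, then fill the rest
toy_df = toy_df.drop(columns=pct_missing[pct_missing > 70].index)
toy_df["children"] = toy_df["children"].fillna(toy_df["children"].median())
toy_df["meal"] = toy_df["meal"].fillna(toy_df["meal"].mode()[0])
print(toy_df)
```

Numeric columns are filled with the median and text columns with the most frequent value, matching the choices made for `babies`, `children`, `meal` and `country` in the notes.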

22 AUGUST 2022

import pandas as pd
datadf = pd.read_csv("https://raw.githubusercontent.com/swapnilsaurav/Dataset/master/Mall_Customers.csv",index_col=0)
#print(datadf)
#slicing
#print(datadf['Gender'])
#print(datadf.iloc[:3,:])
#print(datadf.iloc[:3,-2:])
#print(datadf.loc[[2,4],['Age','Gender']])

#Conditions
print(datadf['Age'].mean())
print(datadf.groupby('Gender').mean())
print(datadf.groupby('Gender')['Age'].mean())
print(datadf.groupby('Gender')['Annual Income (k$)'].sum())
datadf = datadf.drop(['Spending Score (1-100)'],axis=1) #axis=1: dropping a column
print(datadf)

24 AUGUST 2022

import pandas as pd
datadf1 = pd.read_csv("https://raw.githubusercontent.com/swapnilsaurav/Dataset/master/user_usage.csv",index_col=0)
datadf2 = pd.read_csv("https://raw.githubusercontent.com/swapnilsaurav/Dataset/master/user_device.csv",index_col=0)

#Merge:
print("Size of d1: ",datadf1.shape)
print("Size of d2: ",datadf2.shape)
result = pd.merge(datadf1, datadf2,
                  on='use_id',
                  how="left")
print("result df size: ",result.shape)
result = pd.merge(datadf1, datadf2,
                  on='use_id',
                  how="right")
print("result df size: ",result.shape)

result = pd.merge(datadf1, datadf2,
                  on='use_id',
                  how="inner")
print("result df size: ",result.shape)
result = pd.merge(datadf1, datadf2,
                  on='use_id',
                  how="outer")
print("result df size: ",result.shape)  # 159 + 81 + 113 = 353
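
How the four `how=` options change the row count is easier to see on two tiny tables. The frames `usage` and `device` below are made-up stand-ins for the two CSV files (an assumption); only the key name `use_id` comes from the notes.

```python
import pandas as pd

# Hypothetical tables: ids 2 and 3 appear in both, 1 and 4 in only one
usage = pd.DataFrame({"use_id": [1, 2, 3], "monthly_mb": [100, 200, 300]})
device = pd.DataFrame({"use_id": [2, 3, 4], "device": ["A", "B", "C"]})

sizes = {}
for how in ["left", "right", "inner", "outer"]:
    sizes[how] = pd.merge(usage, device, on="use_id", how=how).shape[0]
print(sizes)

# indicator=True adds a _merge column saying where each row came from
outer = pd.merge(usage, device, on="use_id", how="outer", indicator=True)
merge_counts = outer["_merge"].value_counts()
print(merge_counts)
```

The outer join row count is the sum of the `left_only`, `right_only` and `both` counts, which is the same bookkeeping as the 159 + 81 + 113 = 353 comment above.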
####  Machine Learning example
import pandas as pd
df = pd.read_csv("https://raw.githubusercontent.com/swapnilsaurav/MachineLearning/master/3_Startups.csv")

#divide this into X (input variables) and y (Output variable)
X = df.iloc[:,:-1].values
y = df.iloc[:,-1].values

#To perform Machine Learning: we need Python Library: Scikit-learn
# Step 1 of Preprocessing : Missing Value handling
# no missing values

#Step 2: Handling categorical values
from sklearn.preprocessing import LabelEncoder
#2.1: Encode
lb = LabelEncoder()
X[:,3] = lb.fit_transform(X[:,3])

#2.2: Column Transform: 1 to many (#of unique values)
#print(X)
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder
transform = ColumnTransformer([('one_hot_encoder',OneHotEncoder(),[3])],remainder='passthrough')
X = transform.fit_transform(X)
#2.3 drop anyone new column
X=X[:,1:]
print(X)

AUGUST  25  2022 (Machine Learning) 



####  Machine Learning example
import pandas as pd
df = pd.read_csv("https://raw.githubusercontent.com/swapnilsaurav/MachineLearning/master/3_Startups.csv")

#divide this into X (input variables) and y (Output variable)
X = df.iloc[:,:-1].values
y = df.iloc[:,-1].values

#To perform Machine Learning: we need Python Library: Scikit-learn
# Step 1 of Preprocessing : Missing Value handling
# no missing values

#Step 2: Handling categorical values
from sklearn.preprocessing import LabelEncoder
#2.1: Encode
lb = LabelEncoder()
X[:,3] = lb.fit_transform(X[:,3])

#2.2: Column Transform: 1 to many (#of unique values)
#print(X)
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder
transform = ColumnTransformer([('one_hot_encoder',OneHotEncoder(),[3])],remainder='passthrough')
X = transform.fit_transform(X)
#2.3 drop anyone new column
X=X[:,1:]

from sklearn.model_selection import train_test_split
X_train, X_test,y_train,y_test =train_test_split(X,y, test_size=0.2)

#selection of algorithm
from sklearn.linear_model import LinearRegression
lm =LinearRegression()
lm.fit(X_train, y_train)  #training
y_pred = lm.predict(X_test)
result_df = pd.DataFrame({'Actual': y_test, 'Predicted': y_pred})
print(result_df)
# RMSE
#mse
from sklearn import metrics
mse = metrics.mean_squared_error(y_test,y_pred)
rmse = mse **0.5
print("Root Mean Squared Error is: ",rmse)

#R2

#MAE
# Different phases on ML modeling
## 1. Preprocessing the dataset and making it ready for modeling
## 2. Choosing the right model – Regression / classification / clustering
## 2A. Breaking the dataset into Training and Test data
## 3. Run the model (choosing the algo and running): Training the algo 
## 4. Test your algorithm – parameter tuning 

29 AUGUST 2022

 

import pandas as pd
txt_df = pd.read_csv("https://raw.githubusercontent.com/swapnilsaurav/OnlineRetail/master/order_reviews.csv")

#NLP: Natural Language Processing - NLP Analysis
#1. entire text to lowercase
#2. non-english, decomposition - convert non-english characters into English
#3. converting utf8
#4. Tokenize: converting sentence into words
#5. Remove stop words - words which don't carry meaning
######################
import unicodedata
import nltk
#nltk.download('punkt')
#nltk.download('stopwords')

#function to normalize text
def normalize_text(word):
    return unicodedata.normalize('NFKD',word).encode('ascii', errors = 'ignore').decode('utf-8')
#get stop words database
STOP_WORDS = set(normalize_text(word) for word in nltk.corpus.stopwords.words('portuguese'))
#STOP_WORDS =

## function to perform all the analysis
def convert_into_lowercase(comments):
    lower_case = comments.lower()
    unicode = unicodedata.normalize('NFKD',lower_case).encode('ascii', errors = 'ignore').decode('utf-8')
    words = nltk.tokenize.word_tokenize(unicode)
    words = tuple(word for word in words if word not in STOP_WORDS and word.isalpha())
    return words

analysis_txt = txt_df[txt_df['review_comment_message'].notnull()].copy()
#print(analysis_txt['review_comment_message'])
analysis_txt['review_txt'] =analysis_txt['review_comment_message'].apply(convert_into_lowercase)
#print(analysis_txt['review_txt'])


# Dont buy now
# unigram => Dont, buy, now
# bigram=> Dont buy, buy now
# trigram => Dont buy now
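
The unigram, bigram and trigram breakdown above can be reproduced in plain Python. The helper `ngrams` below is a hypothetical stand-in written for illustration; the class code uses `nltk.bigrams` and `nltk.trigrams` instead.

```python
def ngrams(tokens, n):
    """Return every run of n consecutive tokens, joined by spaces."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


tokens = "Dont buy now".split()
unigrams = ngrams(tokens, 1)
bigrams = ngrams(tokens, 2)
trigrams = ngrams(tokens, 3)
print(unigrams)
print(bigrams)
print(trigrams)
```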

# create 2 datasets
rating_5 = analysis_txt[analysis_txt['review_score']==5]
rating_1 = analysis_txt[analysis_txt['review_score']==1]

def word_to_grams(words):
    unigrams,bigrams,trigrams = [],[],[]
    for w in words:
        unigrams.extend(w)
        bigrams.extend(" ".join(bigram) for bigram in nltk.bigrams(w))
        trigrams.extend(" ".join(trigram) for trigram in nltk.trigrams(w))
    return unigrams,bigrams,trigrams

unigram_5,bigram_5,trigram_5 = word_to_grams(rating_5['review_txt'])
unigram_1,bigram_1,trigram_1 = word_to_grams(rating_1['review_txt'])

#print(unigram_1)
#input()
#print(bigram_1)
#input()
print(trigram_1)
#input()

29 AUG 2022 - ClassWork

Z Score and Empirical Rule

SEPTEMBER 9, 2022 CLASS NOTES

import pandas as pd
import numpy as np

data_df = pd.read_csv("https://raw.githubusercontent.com/swapnilsaurav/MachineLearning/master/3_Startups.csv")
X = data_df.iloc[:,:-1].values
y = data_df.iloc[:,-1].values

from sklearn.preprocessing import LabelEncoder, OneHotEncoder
le_obj = LabelEncoder()
X[:,3] = le_obj.fit_transform(X[:,3])
from sklearn.compose import ColumnTransformer
transform = ColumnTransformer([('one_hot_encoder',OneHotEncoder(),[3])],remainder='passthrough')
X=np.array(transform.fit_transform(X), dtype=float) #np.float was removed from NumPy; use the builtin float
################### ABOVE THIS COMMON FOR ALL
#drop one column
X = X[:,1:]
#print(X)
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X,y, test_size=0.25)
############################## REGRESSION OR CLASSIFICATION
from sklearn.linear_model import LinearRegression
regressor = LinearRegression()

### POLYNOMIAL REGRESSION
from sklearn.preprocessing import PolynomialFeatures
from sklearn.pipeline import Pipeline
parameter = [('polynomial',PolynomialFeatures(degree=2)),('modal',LinearRegression())]
Pipe = Pipeline(parameter)
Pipe.fit(X_train,y_train) #train only on the training split so the test scores are meaningful
from sklearn import metrics
y_prep_poly = Pipe.predict(X_test)
mse = metrics.mean_squared_error(y_test,y_prep_poly)
rmse = np.sqrt(mse)
r2 = metrics.r2_score(y_test,y_prep_poly)
print("POLYNOMIAL: R2 and RMSE: ", r2,rmse)
########### BELOW IS COMMON FOR ALL REGRESSION
regressor.fit(X_train, y_train)
y_pred = regressor.predict(X_test)

from sklearn.svm import SVR
svr_obj = SVR(kernel='linear')
svr_obj = SVR(kernel='poly',degree=3, C=100)

mse = metrics.mean_squared_error(y_test,y_pred)
rmse = np.sqrt(mse)
r2 = metrics.r2_score(y_test,y_pred)
print("R2 and RMSE: ", r2,rmse)

import statsmodels.api as sm
from statsmodels.api import OLS
X = sm.add_constant(X)
summary = OLS(y,X).fit().summary()
print(summary)

#First elimination
X_select = X[:,[0,3,5]]
summary = OLS(y,X_select).fit().summary()
print(summary)


import pandas as pd
import numpy as np
from sklearn import metrics
data_df = pd.read_csv("https://raw.githubusercontent.com/swapnilsaurav/MachineLearning/master/3_Startups.csv")
X = data_df.iloc[:,:-1].values
y = data_df.iloc[:,-1].values

from sklearn.preprocessing import LabelEncoder, OneHotEncoder
le_obj = LabelEncoder()
X[:,3] = le_obj.fit_transform(X[:,3])
from sklearn.compose import ColumnTransformer
transform = ColumnTransformer([('one_hot_encoder',OneHotEncoder(),[3])],remainder='passthrough')
X=np.array(transform.fit_transform(X), dtype=float) #np.float was removed from NumPy; use the builtin float
################### ABOVE THIS COMMON FOR ALL
#drop one column
X = X[:,1:]
#print(X)
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X,y, test_size=0.25)
############################## REGRESSION OR CLASSIFICATION



from sklearn.svm import SVR
svr_obj = SVR(kernel='linear')
svr_obj = SVR(kernel='poly',degree=3, C=100)
i=0.03
while i<=0.06:
    i+=0.005
    for j in range(10,1000,200):
        svr_obj = SVR(kernel='rbf', C=j,gamma=i)
        y_pred = svr_obj.fit(X_train, y_train).predict(X_test)
        mse = metrics.mean_squared_error(y_test,y_pred)
        rmse = np.sqrt(mse)
        r2 = metrics.r2_score(y_test,y_pred)
        print(f"gamma = {i}, C = {j}, RMSE = {rmse} ")

from sklearn.tree import DecisionTreeRegressor
regressor = DecisionTreeRegressor()
regressor.fit(X_train,y_train)
y_pred = regressor.predict(X_test)
mse = metrics.mean_squared_error(y_test,y_pred)
rmse = np.sqrt(mse)
r2 = metrics.r2_score(y_test,y_pred)
print(f" RMSE = {rmse} and R2 = {r2} ")

from sklearn.ensemble import RandomForestRegressor
print("Performing Random Forest regressor")
for i in range(50,1000,75):
    regressor = RandomForestRegressor(n_estimators=i)
    regressor.fit(X_train,y_train)
    y_pred = regressor.predict(X_test)
    mse = metrics.mean_squared_error(y_test,y_pred)
    rmse = np.sqrt(mse)
    r2 = metrics.r2_score(y_test,y_pred)
    print(f" RMSE = {rmse} and R2 = {r2} ")

#Ridge and Lasso regression as assignment

SEPTEMBER 11, 2022

import numpy as np
import pandas as pd
dataset = pd.read_csv("https://raw.githubusercontent.com/swapnilsaurav/MachineLearning/master/5_Ads_Success.csv")
X =dataset.iloc[:,1:4].values
y =dataset.iloc[:,4].values

from sklearn.preprocessing import LabelEncoder, OneHotEncoder
from sklearn.compose import ColumnTransformer
label = LabelEncoder()
X[:,0] = label.fit_transform(X[:,0])
transform = ColumnTransformer([('one_hot_encoder',OneHotEncoder(),[0])],remainder='passthrough')
X=np.array(transform.fit_transform(X), dtype=float) #np.float was removed from NumPy; use the builtin float
X= X[:,1:]
print(X)

from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X = sc.fit_transform(X)

from sklearn.model_selection import train_test_split
X_train,X_test,y_train,y_test = train_test_split(X,y, test_size=0.25, random_state=1)
##############################
##classifier
from sklearn.linear_model import LogisticRegression
classifier = LogisticRegression()
classifier.fit(X_train, y_train)
y_pred = classifier.predict(X_test)

####################################
#Model Evaluation: build confusion matrix
from sklearn.metrics import classification_report, accuracy_score, confusion_matrix
cm_test = confusion_matrix(y_test, y_pred)
y_train_pred = classifier.predict(X_train)
cm_train = confusion_matrix(y_train, y_train_pred)
accuracy_test = accuracy_score(y_test, y_pred)
accuracy_train = accuracy_score(y_train, y_train_pred)

print("CONFUSION MATRIX:\n-------------------")
print("TEST: \n",cm_test)
print("\nTRAINING: \n",cm_train)
print("\n ACCURACY SCORE OF TEST: ",accuracy_test)
print("\nACCURACY SCORE OF TRAINING: ",accuracy_train)

#############################

12 SEPTEMBER 2022: CLASSIFICATION – SVC, DECISION TREE

import numpy as np
import pandas as pd
dataset = pd.read_csv("https://raw.githubusercontent.com/swapnilsaurav/MachineLearning/master/5_Ads_Success.csv")
X =dataset.iloc[:,1:4].values
y =dataset.iloc[:,4].values

from sklearn.preprocessing import LabelEncoder, OneHotEncoder
from sklearn.compose import ColumnTransformer
label = LabelEncoder()
X[:,0] = label.fit_transform(X[:,0])
transform = ColumnTransformer([('one_hot_encoder',OneHotEncoder(),[0])],remainder='passthrough')
X=np.array(transform.fit_transform(X), dtype=float) #np.float was removed from NumPy; use the builtin float
X= X[:,1:]
print(X)

from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X = sc.fit_transform(X)

from sklearn.model_selection import train_test_split
X_train,X_test,y_train,y_test = train_test_split(X,y, test_size=0.25, random_state=1)
##############################
##classifier
#from sklearn.linear_model import LogisticRegression
#classifier = LogisticRegression()
#from sklearn.svm import SVC
#classifier = SVC(kernel='rbf',gamma=0.1,C=100)
from sklearn.tree import DecisionTreeClassifier
classifier = DecisionTreeClassifier(criterion='entropy')
classifier.fit(X_train, y_train)
y_pred = classifier.predict(X_test)

####################################
#Model Evaluation: build confusion matrix
from sklearn.metrics import classification_report, accuracy_score, confusion_matrix
cm_test = confusion_matrix(y_test, y_pred)
y_train_pred = classifier.predict(X_train)
cm_train = confusion_matrix(y_train, y_train_pred)
accuracy_test = accuracy_score(y_test, y_pred)
accuracy_train = accuracy_score(y_train, y_train_pred)

print("CONFUSION MATRIX:\n-------------------")
print("TEST: \n",cm_test)
print("\nTRAINING: \n",cm_train)
print("\n ACCURACY SCORE OF TEST: ",accuracy_test)
print("\nACCURACY SCORE OF TRAINING: ",accuracy_train)

#############################
# Complete the visualization step

13 SEPTEMBER 2022

import numpy as np
import pandas as pd
dataset = pd.read_csv("https://raw.githubusercontent.com/swapnilsaurav/MachineLearning/master/5_Ads_Success.csv")
X =dataset.iloc[:,1:4].values
y =dataset.iloc[:,4].values

from sklearn.preprocessing import LabelEncoder, OneHotEncoder
from sklearn.compose import ColumnTransformer
label = LabelEncoder()
X[:,0] = label.fit_transform(X[:,0])
transform = ColumnTransformer([('one_hot_encoder',OneHotEncoder(),[0])],remainder='passthrough')
X=np.array(transform.fit_transform(X), dtype=float) #np.float was removed from NumPy; use the builtin float
X= X[:,1:]
print(X)

from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X = sc.fit_transform(X)

from sklearn.model_selection import train_test_split
X_train,X_test,y_train,y_test = train_test_split(X,y, test_size=0.25, random_state=1)
##############################
##classifier
from sklearn.ensemble import RandomForestClassifier
#classifier = RandomForestClassifier(n_estimators=100,criterion='entropy')
from sklearn.linear_model import SGDClassifier
classifier = SGDClassifier(max_iter=5000, tol=0.01,penalty="elasticnet")
classifier.fit(X_train, y_train)
y_pred = classifier.predict(X_test)

####################################
#Model Evaluation: build confusion matrix
from sklearn.metrics import classification_report, accuracy_score, confusion_matrix
cm_test = confusion_matrix(y_test, y_pred)
y_train_pred = classifier.predict(X_train)
cm_train = confusion_matrix(y_train, y_train_pred)
accuracy_test = accuracy_score(y_test, y_pred)
accuracy_train = accuracy_score(y_train, y_train_pred)

print("CONFUSION MATRIX:\n-------------------")
print("TEST: \n",cm_test)
print("\nTRAINING: \n",cm_train)
print("\n ACCURACY SCORE OF TEST: ",accuracy_test)
print("\nACCURACY SCORE OF TRAINING: ",accuracy_train)

#############################
# Complete the visualization step

SEPTEMBER 15 2022

Practice project from below link:

1. Predict future sales:  https://thecleverprogrammer.com/2022/03/01/future-sales-prediction-with-machine-learning/

2. Predict Tip for the waiter: https://thecleverprogrammer.com/2022/02/01/waiter-tips-prediction-with-machine-learning/

SEPTEMBER 16 2022

1. NLP – Flipkart Review analysis:  https://thecleverprogrammer.com/2022/02/15/flipkart-reviews-sentiment-analysis-using-python/

2. Cryptocurrency Price Prediction: https://thecleverprogrammer.com/2021/12/27/cryptocurrency-price-prediction-with-machine-learning/

 

SEPTEMBER 17 2022

1. Demand Prediction: https://thecleverprogrammer.com/2021/11/22/product-demand-prediction-with-machine-learning/

SEPTEMBER 19 2022

from sklearn.datasets import make_blobs
import matplotlib.pyplot as plt
x,y = make_blobs(n_samples= 300, n_features=2,centers=3, random_state=88)
plt.scatter(x[:,0],x[:,1])
plt.show()
from sklearn.cluster import KMeans
cluster_obj = KMeans(n_clusters=2,init='random',max_iter=500)
Y_val = cluster_obj.fit_predict(x)
print(Y_val)
#plotting the points of each cluster
plt.scatter(x[Y_val==0,0],x[Y_val==0,1],c="blue",label="Cluster 0")
plt.scatter(x[Y_val==1,0],x[Y_val==1,1],c="red",label="Cluster 1")
#plt.scatter(x[Y_val==2,0],x[Y_val==2,1],c="black",label="Cluster 2")
#plt.scatter(x[Y_val==3,0],x[Y_val==3,1],c="green",label="Cluster 3")
#plt.scatter(x[Y_val==4,0],x[Y_val==4,1],c="Yellow",label="Cluster 4")
plt.legend() #show the cluster labels
plt.show()
#Measure Distortion for elbow graph
distortion = [] #save distortion from each k value
for i in range(1,50):
    cluster_obj = KMeans(n_clusters=i, init='random', max_iter=500)
    cluster_obj.fit(x)
    distortion.append(cluster_obj.inertia_)
print(distortion)
plt.plot(range(1,50),distortion)
plt.show()
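The distortion stored above comes from KMeans' inertia_ attribute: the sum of squared distances from every point to the centre of its own cluster. As a rough illustration only, here is a pure-Python sketch of that quantity. The helper name inertia and the four points are made up for this example and are not part of the class code.

```python
# Hypothetical helper: within-cluster sum of squared distances
# (the quantity scikit-learn's KMeans reports as inertia_).
def inertia(points, labels):
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    total = 0.0
    for members in clusters.values():
        # centre of the cluster = mean of its members
        cx = sum(p[0] for p in members) / len(members)
        cy = sum(p[1] for p in members) / len(members)
        total += sum((p[0] - cx) ** 2 + (p[1] - cy) ** 2 for p in members)
    return total

points = [(0, 0), (0, 2), (10, 0), (10, 2)]   # made-up data
one_cluster = inertia(points, [0, 0, 0, 0])   # everything in one cluster
two_clusters = inertia(points, [0, 0, 1, 1])  # left pair and right pair
print(one_cluster, two_clusters)
```

The value drops sharply when k matches the natural grouping and only slowly after that, which is the "elbow" the plot is meant to reveal.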


from sklearn.datasets import make_blobs
import matplotlib.pyplot as plt
x,y = make_blobs(n_samples= 20, n_features=2,centers=3, random_state=88)
plt.scatter(x[:,0],x[:,1])
plt.show()
from sklearn.cluster import KMeans
cluster_obj = KMeans(n_clusters=2,init='random',max_iter=500)
Y_val = cluster_obj.fit_predict(x)
print(Y_val)
#plotting the points of each cluster
plt.scatter(x[Y_val==0,0],x[Y_val==0,1],c="blue",label="Cluster 0")
plt.scatter(x[Y_val==1,0],x[Y_val==1,1],c="red",label="Cluster 1")
#plt.scatter(x[Y_val==2,0],x[Y_val==2,1],c="black",label="Cluster 2")
#plt.scatter(x[Y_val==3,0],x[Y_val==3,1],c="green",label="Cluster 3")
#plt.scatter(x[Y_val==4,0],x[Y_val==4,1],c="Yellow",label="Cluster 4")
plt.legend() #show the cluster labels
plt.show()
#Measure Distortion for elbow graph
distortion = [] #save distortion from each k value
for i in range(1,21): #only 20 samples here, so k cannot exceed 20
    cluster_obj = KMeans(n_clusters=i, init='random', max_iter=500)
    cluster_obj.fit(x)
    distortion.append(cluster_obj.inertia_)
print(distortion)
plt.plot(range(1,21),distortion)
plt.show()

SEPTEMBER 20, 2022

import pandas as pd
dataset = pd.read_csv("https://raw.githubusercontent.com/swapnilsaurav/MachineLearning/master/USArrests.csv")

data_df = dataset.iloc[:,1:]
print(data_df)

import scipy.cluster.hierarchy as sch
import matplotlib.pyplot as plt
plt.figure(figsize=(9,6))
dendo_obj = sch.dendrogram(sch.linkage(data_df))
plt.axhline(y=26)
plt.show()

from sklearn.cluster import AgglomerativeClustering
cluster = AgglomerativeClustering(n_clusters=3)
Y_pred = cluster.fit_predict(data_df)
print(Y_pred)
plt.figure(figsize=(9,6))
plt.scatter(data_df.iloc[:,0],data_df.iloc[:,1], c=cluster.labels_)
plt.show()

Next class on Sunday 25th

Practice below 8 projects during that time.

SEPTEMBER 28, 2022

import pandas as pd
from apyori import apriori
data = pd.read_csv("https://raw.githubusercontent.com/swapnilsaurav/MachineLearning/master/Market_Basket_Optimisation.csv")
print(data.shape)
products = []
cols = 20
for i in range(len(data)):
    #build each basket as a list of item names, skipping the empty (NaN) cells
    products.append([str(data.values[i,j]) for j in range(cols) if pd.notna(data.values[i,j])])

#print(products)
association = apriori(products,min_support=0.001,min_confidence=0.1,min_lift=2)
print("Associated Products are: \n",list(association))
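The three thresholds passed to apriori() above (min_support, min_confidence, min_lift) can be computed by hand on a tiny example. This is only a sketch with made-up baskets; the helper name support is hypothetical and this is not how apyori is implemented internally.

```python
# Made-up baskets, used only to illustrate support, confidence and lift
baskets = [
    {"bread", "butter"},
    {"bread", "butter", "jam"},
    {"bread"},
    {"milk"},
]

def support(items):
    """Fraction of baskets that contain every item in `items`."""
    return sum(1 for basket in baskets if items <= basket) / len(baskets)

# Rule: bread -> butter
sup_both = support({"bread", "butter"})       # how often both appear together
confidence = sup_both / support({"bread"})    # of the bread baskets, how many have butter
lift = confidence / support({"butter"})       # > 1 means butter is more likely given bread
print(sup_both, confidence, lift)
```

A rule is reported only if all three numbers clear their minimum values, which is why lowering min_support produces many more rules.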

############################

SEPTEMBER 29, 2022

import pandas as pd
import matplotlib.pyplot as plt
from statsmodels.tsa.stattools import adfuller
import numpy as np
#from statsmodels.tsa.arima_model import ARIMA - removed
from statsmodels.tsa.arima.model import ARIMA
from statsmodels.tsa.seasonal import seasonal_decompose

data_df = pd.read_csv("https://raw.githubusercontent.com/swapnilsaurav/Dataset/master/AirPassengers.csv",
index_col=['Month'],parse_dates=['Month'])
rolling_mean = data_df.rolling(window=12).mean()
rolling_std = data_df.rolling(window=12).std()

plt.plot(data_df, label="Original Data")
plt.plot(rolling_mean, color="red", label="Rolling Mean")
plt.plot(rolling_std, color="green", label="Rolling StdDev")
plt.show()

afduller_result = adfuller(data_df['#Passengers'])
print("ADF Stats = ",afduller_result[0])
print("P-Value = ",afduller_result[1]) #<0.05 then its stationary
for k,v in afduller_result[4].items():
    print(k," : ",v)

#to move towards stationarity - take the log of the values (this stabilises the growing variance)
data_log = np.log(data_df)
mean_log = data_log.rolling(window=12).mean()
std_log = data_log.rolling(window=12).std()
plt.plot(data_log, color="blue",label="Log of Original Data")
#plt.plot(data_df, color="black",label="Original Data")
plt.plot(mean_log, color="red", label="Rolling Mean")
plt.plot(std_log, color="green", label="Rolling StdDev")
plt.title("All information")
plt.legend()
plt.show()

#Now we will perform TSA using ARIMA model
#Prediction
order_val = (2,1,2)
#tsa_model = ARIMA(data_log, order = order_val) #old
tsa_model = ARIMA(data_df['#Passengers'].values, order=(2, 1, 2))
tsa_result = tsa_model.fit()
print("Summary: \n",tsa_result.summary())

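# we have 12 * 12 observed months + 12 * 10 future months = 264 values
data = tsa_result.predict(start=0, end=263) #fitted values plus the next 10 yrs
plt.plot(data_df['#Passengers'].values,color="blue",label="Original Data")
plt.plot(data,color="red",label="Fitted and Predicted Values")
plt.title("Original Data and Predicted Values")
plt.legend()
plt.show()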



#new
# make predictions
predictions = tsa_result.forecast(120)
plt.plot(predictions,color="red")
plt.title("Using Forecast Method")
plt.show()

NOTE: from statsmodels.tsa.arima_model import ARIMA has been removed from statsmodels, instead use:

statsmodels.tsa.arima.model.ARIMA

With the new class, predict() takes start and end positions, so it can cover both the observed period and future periods.
forecast(steps) returns only the given number of future periods after the end of the data.

OCTOBER 8 2022


Machine Learning with R

Learning ML – July 2022 by Digital

DAY 1: 31 JULY 2022

print(5+6+3)
print('5+6+3')
print('DDDDDDDDDDDDD')
print("DDDDDDD\tDDDDDDD",4+3+2,"This is it")
print("\\\\n is for newline")
# \ escape sequence
print("Good Morning",end="! ")
print("How are you", end="? ")
print("I am fine")
print(5+3)
a=5 # sPDSJIJFGDFSJGKDFGM print()
b=3
print(a+b)
a=50
#comment
'''
This is a sample comment
Please ignore
another line of
comment
'''

# Data Types - basic: int, float, str, bool, complex
# int - integer: non decimal values: 8 -5 -4 99 8989999
# float - +/- values with decimal point: 5.0 -5.0 8.989
# str - string "hello" 'morning' '''How''' """Thanks"""
#bool - boolean: True / False
#complex - uses the square root of -1 (i in maths, j in python): 5j; e.g. sqrt(-100) = sqrt(100 * -1) = 10j

val1 = 5
print(val1, " : ",type(val1)) #type
val1 = -3.0
print(val1, " : ",type(val1))
val1 = "Hello"
print(val1, " : ",type(val1))
val1 = True
print(val1, " : ",type(val1))
val1 = 4+2j
print(val1, " : ",type(val1))
val2 = 4-2j
print(val1 * val2) #(a+b)(a-b) = a sq - b sq: 16 - (2j) sq = 16 + 4 = (20+0j)
a,b,c = 3,8.6,True
print(a,type(b),c)
# 5.6 + 4.4 = 10.0

DAY 2:  1 AUGUST  2022

Download the Python Installer from here (we will install Python 3.9.9):

https://www.python.org/downloads/release/python-399/

You can follow the instructions to install from here:
http://learn.swapnil.pw/python/pythoninstallation/


Watch installation video from here:   https://www.youtube.com/watch?v=mhpu9AsZNiQ

Download & Install IDE for Python Programming

Follow the steps from here: https://learn.swapnil.pw/python/pythonides/


DAY 3: 2 AUGUST 2022

#Basic data types

#int
print("Working with Integer: ARITHMETIC OPERATIONS")
var1 = 7
var2 = 3
print(var1 + var2) #add
print(var1 - var2) #difference
print(var1 * var2) #multiply
print(var1 / var2) #divide (float value)
print(var1 // var2) #integer divide (only integer part)
print(var1 ** var2) #var1 to the power of var2
print(var1 % var2) #modulo - this gives us remainder
print( 4+5/2+4-5*3+2*5)
print( 5.5)
#float
print("Working with Float now")
var1 = 7.0
var2 = 3.0
print(var1 + var2) #add
print(var1 - var2) #difference
print(var1 * var2) #multiply
print(var1 / var2) #divide (float value)
print(var1 // var2) #integer divide (only integer part)
print(var1 ** var2) #var1 to the power of var2
print(var1 % var2) #modulo - this gives us remainder
#str
print("Working with Strings now")
var1 = 'Good Morning'
var2 = "How are you?"
var3 = '''How are you today?
Whats your plan?
Are you doing well?'''
var4 = """I am fine"""
print(var1)
print(var3)
print(var1 + " "+var2)
print((var1 + " ")* 3)
#bool
var1 = True
var2 = False
#AND OR NOT XOR - operations for boolean values
# AND : If any one of the values is False then the result will be False
#- otherwise it will be True
print("AND OPERATION")
print(True and True)
print(True and False)
print(False and True)
print(False and False)
#Prediction: Rohit and Surya will open the batting
#Actual: Rohit and Pant opened the batting
print("OR OPERATION")
print(True or True)
print(True or False)
print(False or True)
print(False or False)
#Prediction: Rohit or Surya will open the batting
#Actual: Rohit and Pant opened the batting

#complex (imaginary numbers)
print("Working with Complex now")
var1 = 6j
print(var1 **2) #6j*6j = 36* -1= -36 + 0j
print(var1 * var1)

#Comparison Operator: Output is always boolean
print(5 > 6) #False
print(6 < 6) #False
print(5 <= 6) #True
print(6 >= 6) #True
print(6==6) #True
print(5!=5) #False

DAY 4  :  AUGUST  3, 2022

#WAP to find the total cost when quantity and price of each item is given
quant = 19
price_each_item = 46
total_cost = price_each_item * quant
print("Total cost for",quant,"quantities with each costing Rs",price_each_item,"is Rs",total_cost)
print(f"Total cost for {quant} quantities with each costing Rs {price_each_item} is Rs {total_cost}")

quant = 3
total_cost = 100
price_each_item = total_cost / quant
print(f"Total cost for {quant} quantities with each costing Rs {price_each_item:0.2f} is Rs {total_cost}")

#Format string values
name = "Kohli"
country = "India"
position = "One Down"
print(f"Player {name:.<15} represents {country:^10} and plays at {position:>10} for international matches")

name = "Ombantabawa"; country = "Zimbabwe"; position = "opener"
print(f"Player {name:<15} represents {country:_^10} and plays at {position:X>10} for international matches")
# Add padding - we will fix the number of spaces for each variable
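To see exactly what each padding specifier produces, the pieces of the f-strings above can be built one at a time. This is a small sketch that reuses the values from the notes; the variable names left, centre and right are introduced here only for illustration.

```python
name, country, position = "Kohli", "India", "One Down"

left = f"{name:.<15}"       # left-aligned, padded with dots up to width 15
centre = f"{country:^10}"   # centred inside width 10 (padded with spaces)
right = f"{position:X>10}"  # right-aligned, padded with X up to width 10

# repr() makes the padding spaces visible
print(repr(left), repr(centre), repr(right))
```

The general pattern is {value:fill align width}, where align is < (left), ^ (centre) or > (right) and the fill character is optional.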

#logical line vs physical line
#; marks the end of a statement; it is not mandatory and is only needed to put several statements on one physical line

a,b,c,d = 10,20,30,40 #assign multiple values
print(c) #30
print("Hello\nThere")
print("Good Morning",end=". ")
print("How are you?")

# wap to take side of a square as input from the user and calculate area and perimeter
#area = side ** 2
#perimeter = 4 * side
side = input("Enter the side value: ") # is used to take input (dynamic value)
side = int(side) # to convert into integer, we use int(). float -> float(), string -> str()
print(side, ":",type(side))

perimeter = 4 * side
area = side ** 2
print(f"Square of {side} has a perimeter of {perimeter} and the area is {area}")

## Use input() and formatting for below assignments
#1. Input the length and breadth of a rectangle and calculate area and perimeter
#2. Input radius of a circle and calculate area and circumference
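One possible way to approach the two assignments is sketched below. The function names rectangle and circle are made up for this example, and the values are hard-coded where the assignment would use int(input(...)) or float(input(...)), so that the sketch runs without typing anything.

```python
import math

def rectangle(length, breadth):
    """Return (area, perimeter) of a rectangle."""
    return length * breadth, 2 * (length + breadth)

def circle(radius):
    """Return (area, circumference) of a circle."""
    return math.pi * radius ** 2, 2 * math.pi * radius

# In the assignment these values would come from input()
area, perimeter = rectangle(4, 3)
print(f"Rectangle of 4 x 3 has a perimeter of {perimeter} and the area is {area}")

c_area, c_circ = circle(2)
print(f"Circle of radius 2 has an area of {c_area:0.2f} and a circumference of {c_circ:0.2f}")
```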

 

DAY 5: AUGUST 4, 2022

#Conditions -  IF, IF-ELSE,  IF - ELIF - ELSE, IF - ELIF............ ELSE
avg = 40
#if avg > 50: pass
if avg >50:
    print("Congratulations")
    print("You have passed")
    print("Great job")
else:
    print("Sorry, you have failed")

# avg >= 70: Grade A, avg >= 50: Grade B, otherwise Grade C
avg = 60
if avg >=70:
    print("Grade A")
elif avg >=50:
    print("Grade B")
else:
    print("Grade C")

#avg >= 80: A, >= 70: B, >= 60: C, >= 50: D, >= 40: E, < 40: F
avg = 90
#Nested IF- if inside another if
if avg>=80:
    if avg >=90:
        print("AWESOME PERFORMANCE")
        if avg >=95:
            print("You win Presidents Medal")

    print('grade A')
elif avg>=70:
    print('grade B')
elif avg>=60:
    print('grade C')
elif avg>=50:
    print('grade D')
##########3
avg=90
if avg>=80:
    print ('grade A')
elif avg>=70:
    print('grade B')
elif avg>=60:
    print('grade C')
elif avg>=50:
    print('grade D')
elif avg>=40:
    print('grade E')
else:
    print('grade F')

#WAP to check if a person is eligible to vote in India or not
#Age >=18 then you will check nationality - yes

age = 18
nationality = "indian"
if age >=18:
    if nationality =="Indian": #comparison is case-sensitive, so "indian" goes to the else part
        print("You are eligible to vote in India")
    else:
        print("Sorry, only Indians are eligible to vote")
else:
    print("Sorry you do not meet the required criteria")

age =20
if age>=18:
    print(" awesome you are eligible to vote ")


#Assignment: Take 3 numbers and print the highest, second highest and the lowest value

 

DAY 6:  11 AUGUST  2022

#Strings
val1 = "Hello"
val2 = 'Good Morning'
val3 = '''Hello
how are
you'''
val4 = """I am fine thank you"""

# what's your name?
print("what's your name?")
print('what\'s your name?') #escape character - \ it works only for 1 character after
print("\\n will give new line")
print("\\\\n will give new line")
val1 = "Hellodsogjidaoioadpif orgpoaitpoaigtpoafdifgpo poergpadigpifgpi igopigof oprgiodfigdofig"

#indexing /slicing - []
print(val1[0]) #first character
print(val1[4])
print(len(val1))
tot_char = len(val1)
print(val1[tot_char - 1]) #last character
print(val1[tot_char - 3]) #3rd last character
print(val1[- 1]) #last character
#series of characters in continuation
val1 = "GOOD DAY"
print(val1[5:8])
print(val1[0:4])
print(val1[:4])
#negative indexes
print(val1[-5:-2])
print(val1[-7:-5])
print(val1[-3:]) #DAY
print(val1[:]) #GOOD DAY (the whole string)
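The slices above are easier to follow when each result is written next to the expression. A short sketch on the same string, where the variable names a to e are introduced only for this illustration:

```python
val1 = "GOOD DAY"
# positions:  G  O  O  D  ' '  D  A  Y
# index:      0  1  2  3   4   5  6  7
# negative:  -8 -7 -6 -5  -4  -3 -2 -1

a = val1[5:8]    # start at 5, stop before 8
b = val1[:4]     # from the beginning, stop before 4
c = val1[-5:-2]  # same as val1[3:6]
d = val1[-7:-5]  # same as val1[1:3]
e = val1[-3:]    # last three characters
print(a, b, repr(c), d, e)
```

The end index is always excluded, which is why val1[5:8] is valid even though the last index is 7.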

Day 7 – AUGUST 16 , 2022     STRING -2 and Basic Intro to IF Condition


#Methods in String
txt1 = "Good Morning"
print(txt1[-5:-1])
#len(txt1)
fname="Sachin"
print(fname.isalpha())
print(fname.isupper()) #SACHIN
print(fname.islower()) #sachin tendulkar
print(fname.istitle()) #Sachin Tendulkar
print(fname.isdigit())
print(fname.isalnum())
print(fname.upper())
print(fname.lower())
print(fname.title())
fname = "Sachin Tendulkar" #input("Enter your name: ")
print(fname.upper().count("S"))
print(fname.upper().count("S",1,7))
txt1 = "First Second Thirty first thirty fifth sixty first sixty ninth"
print("Total first are: ",txt1.upper().count("ST "))
print(fname.lower().index("ten"))
print(fname.lower().replace("ten","eleven"))

############# IF Condition ###########
fname = "Sachin Tendulkar"
ten_count = fname.lower().count("ten")
print(ten_count)
if ten_count >0:
    print("Counting the number of ten(s)")
    print(fname.lower().index("ten"))
print("Thank You")

DAY 8 : AUGUST 18 , 2022

# Conditions
avg = 30
if avg >=40:
    print("You have passed")
    print("Congratulations")
else:
    print("I am in else")
    print("Sorry, you havent passed")
print("Thank you")

num = 0
if num >0:
    print(f"{num} is positive")
elif num==0:
    print("Zero is neither positive nor negative")
else:
    print(f"{num} is negative")

### Take marks in 5 subjects, calculate total and avg, based on avg assign grades
marks1 = int(input("Enter marks in subject 1: "))
marks2 = int(input("Enter marks in subject 2: "))
marks3 = int(input("Enter marks in subject 3: "))
marks4 = int(input("Enter marks in subject 4: "))
marks5 = int(input("Enter marks in subject 5: "))
total = marks1 + marks2 + marks3 + marks4 + marks5
avg = total / 5
print(f"Student has scored total marks of {total} and average of {avg}")
#avg>=80: A, avg>=70: B, avg>=60: C, avg>=50: D, avg>=40: E, avg<40: Failed
#avg >=90: win school medal / avg>95: President Medal
if avg>=80:
    print("You have scored Grade A")
    if avg>=90:
        if avg>=95:
            print("You win President Medal")
        else:
            print("You win School Medal")
elif avg>=70:
    print("You have scored Grade B")
elif avg>=60:
    print("You have scored Grade C")
elif avg>=50:
    print("You have scored Grade D")
elif avg>=40:
    print("You have scored Grade E")
else:
    print("You have scored Failed Grade")
    if avg>=35:
        print("You just missed, try harder next time")
    elif avg>=20:
        print("Please study hard")
    else:
        print("You are too far behind")
##########
#WAP to find the bigger of the 2 numbers
num1,num2 = 30,50
if num1>=num2:
    print(f"{num1} is greater than or equal to {num2}")
else:
    print(f"{num2} is greater than {num1}")

#WAP to find the bigger of the 3 numbers
num1,num2,num3 = 90,50,140
b1 = num1
if num1>=num2: #between num1 and num2 we know num1 is greater
    if num1 >= num3:
        b1=num1
    else: #num1 >= num2 and num3 > num1
        b1=num3

else: #num2 is greater than num1
    if num2 > num3:
        b1=num2
    else: #num2 > num1 and num3 >= num2
        b1=num3
print(f"{b1} is greatest")

## get the order of 3 numbers (decreasing order)
num1,num2,num3 = 9,50,40
b1,b2,b3 = num1,num1,num1
if num1>=num2: #between num1 and num2 we know num1 is greater
    if num1 >= num3:
        b1=num1
        if num2 >=num3:
            b2,b3=num2, num3
        else:
            b2, b3 = num3, num2
    else: #num1 >= num2 and num3 > num1
        b1,b2,b3=num3,num1,num2

else: #num2 is greater than num1
    if num2 > num3:
        b1=num2
        if num1>=num3:
            b2,b3=num1,num3
        else:
            b2, b3 = num3, num1
    else: #num2 > num1 and num3 >= num2
        b1,b2,b3=num3,num2,num1

print(f"{b1} >= {b2} >= {b3}")

DAY 9:  AUGUST 20, 2022

#Loops : repeat set of lines of code multiple times
# for - for loop when we know how many times
# while - used for repetition based on conditions
for i in range(0,5,1): #generate values starting from zero upto 5(excluded), increment is 1
    print(i+1)

for i in range(2,15,4): #generate values starting from 2 upto 15(excluded), increment is 4
    #print(i+100) #102, 106, 110, 114
    print("Hello")
    for j in range(3,7): #start & end - default is increment = 1
        print(j)
for i in range(4): #its ending value, default is start(=0) & increment (=1)
    print(i+1)

n = 5
for i in range(n):
    print("*",end=" ")
'''
* * * * *
* * * * *
* * * * *
* * * * *
* * * * *
'''
print("\n2...........")
for j in range(n):
    print()
    for i in range(n):
        print("*",end=" ")
print()

'''
*
* *
* * *
* * * *
* * * * *
'''
print("\n3...........")
for j in range(n):
    print()
    for i in range(j+1):
        print("*",end=" ")
print()

'''
* * * * *
* * * *
* * *
* *
*
'''
print("\n4...........")
for j in range(n):
    print()
    for i in range(n-j):
        print("*",end=" ")
print()

#Assignment
'''
*
* *
* * *
* * * *
* * * * *
'''

 

DAY 10: AUGUST 21, 2022

 

#While
##repeat block of code

 

n=1
while n<=5:
    print(n)
    n+=1

 

#wap to read marks in 3 subjects and calculate sum and average till user wants
choice = "y"

while choice=="y": #entry check is important here
    sum=0
    for i in range(3):
        marks = int(input("Enter the marks in subject "+str(i+1)+": "))
        sum+=marks
    avg = sum/3
    print(f"Sum is {sum} and average is {avg}")
    choice = input("Type y to continue, any other key to stop: ")

 

# instances where entry check is not important, you can create an infinite loop
while True: #entry check is not important
    sum=0
    for i in range(3):
        marks = int(input("Enter the marks in subject "+str(i+1)+": "))
        sum+=marks
    avg = sum/3
    print(f"Sum is {sum} and average is {avg}")
    choice = input("Type y to continue, any other key to stop: ")
    if choice!='y':
        break

 

#wap to generate numbers between given input values
sn = int(input("Enter the start number: "))
en = int(input("Enter the end number: "))
for i in range(sn,en+1):
    print(i, end="   ")
print()

while sn <= en:  #entry check is important
    print(sn, end="   ")
    sn+=1
print()

#Assignment 1: WAP to check if a number is prime or not
#Assignment 2: WAP to generate first 10 multiples of given value
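A possible starting point for both assignments is sketched below. The function names is_prime and first_multiples are made up for this example; a version for the class would read the number with int(input(...)) instead of hard-coding it.

```python
def is_prime(n):
    """Return True if n is a prime number."""
    if n < 2:
        return False
    # it is enough to test divisors up to the square root of n
    for divisor in range(2, int(n ** 0.5) + 1):
        if n % divisor == 0:
            return False
    return True

def first_multiples(value, count=10):
    """Return the first `count` multiples of value."""
    return [value * i for i in range(1, count + 1)]

for number in (1, 2, 91, 97):
    print(number, "is prime" if is_prime(number) else "is not prime")
print(first_multiples(7))
```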

 

DAY 11: AUGUST 22, 2022
 

 

#wap to read menu options
import getpass
dict_username = {}
while True:
    print("Select your options: ")
    print("1. Register \n2. Add Member\n3. Add Books\n4. Issue Books \n5. Return Books")
    print("6. Display Username")
    print("\n11. Quit")


    ch=int(input("Your Option: "))
    if ch==1:
        uname = input("Enter username: ")
        passwd = getpass.getpass("Enter Password: ")  #input("Enter password: ")
        t_dict = {uname:passwd}
        dict_username.update(t_dict)
    elif ch==2:
        pass
    elif ch==3:
        pass
    elif ch==4:
        pass
    elif ch==5:
        pass
    elif ch==11:
        break
    elif ch==6:
        #Displaying username
        print("Usernames are:")
        for i in dict_username.keys(): #keys hold the usernames; values() would print the passwords
            print(i)
    else:
        print("Invalid option! Please try again...  ")
        continue


    print("Your Option has been successfully completed!")

#wap to guess the number thought by the computer
import random
comp_num = random.randint(1,100) #int(input("Enter a number: "))
counter = 0
while True:
    guess_num = int(input("Guess the number: "))
    counter+=1
    if guess_num == comp_num:
        print(f"You have guessed the number correctly in {counter} attempts!")
        break
    else:
        print("You have not correctly guessed the number!")
        if guess_num > comp_num:
            print("HINT: You have guessed a higher number!!!")
        else:
            print("HINT: You have guessed a lower number!!!")
    
#wap to guess the number thought by the computer
import random
comp_num = random.randint(1,100) #int(input("Enter a number: "))
counter = 0
low,high=1,100
while True:
    guess_num = random.randint(low,high) #int(input("Guess the number: "))
    counter+=1
    if guess_num == comp_num:
        print(f"You have guessed the number {comp_num} correctly in {counter} attempts!")
        break
    else:
        print(f"{guess_num} is not correct")
        if guess_num > comp_num:
            #print("HINT: You have guessed a higher number!!!")
            high = guess_num-1
        else:
            #print("HINT: You have guessed a lower number!!!")
            low = guess_num+1
    

#wap to guess the number thought by the computer
import random
comp_num = random.randint(1,100) #int(input("Enter a number: "))
counter = 0
low,high=1,100
while True:
    guess_num = (low+high)//2  #random.randint(low,high) #int(input("Guess the number: "))
    counter+=1
    if guess_num == comp_num:
        print(f"You have guessed the number {comp_num} correctly in {counter} attempts!")
        break
    else:
        print(f"{guess_num} is not correct")
        if guess_num > comp_num:
            #print("HINT: You have guessed a higher number!!!")
            high = guess_num-1
        else:
            #print("HINT: You have guessed a lower number!!!")
            low = guess_num+1
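The last version always guesses the middle of the remaining range, which is a binary search. To check how many attempts it can need, the same loop can be wrapped in a function and run for every possible secret number. The function name attempts_needed is made up for this sketch.

```python
def attempts_needed(secret, low=1, high=100):
    """Count the guesses the midpoint strategy needs to find `secret`."""
    counter = 0
    while True:
        guess = (low + high) // 2
        counter += 1
        if guess == secret:
            return counter
        if guess > secret:
            high = guess - 1   # secret is in the lower half
        else:
            low = guess + 1    # secret is in the upper half

worst = max(attempts_needed(s) for s in range(1, 101))
print("Worst case attempts for 1 to 100:", worst)
```

Each wrong guess roughly halves the range, so 100 numbers never need more than 7 attempts, while the random strategy from the previous version has no such guarantee.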
Day 12: AUGUST 25 2022 – STRING – 2 and LIST – 1

 #Strings
txt1 = "Hello"
txt2 = 'Good Morning Good Day'
txt3 = '''How are you?
where are you going
when will you be back'''
txt4 = """I am fine
I am going to school
I will be back in the evening"""
print(type(txt1), type(txt2),type(txt3),type(txt4))
print(txt3)
print(txt2)

print(txt1 + txt2)
print("7" + "8")
print(txt1 * 4)
num = 78
num = int(str(78)*4)
print(num)
print("Fine" in txt4) #membership test (case-sensitive, so this is False)
for i in txt1:
    print(i, end=" ")
print()

txt1 = "Good Morning"
#Strings are immutable
#txt1[0]="H" - you cant edit/ overwrite
txt1 = "H" + txt1[1:]
print(txt1)
#reverse the text
reverse_str = ""
for i in txt1:
    reverse_str = i+reverse_str
print("Reversed String: ",reverse_str)

given_txt = "Sachin;Kohli;Rohit;Kapil;Dhoni;"
if ";" in given_txt:
    given_txt = given_txt.replace(";"," ")
    print(given_txt)
else:
    print("Sorry, text doesnt have ; as separator")

# WAP to read a string and find sum of all the digits only
#and keep doing till you find single number
txt1 = "sifdsdi43250934ur934ur09csdi43250934ur09c"

while len(txt1)!=1:
    sum=0
    for i in txt1:
        if i.isdigit():
            sum+=int(i)
    txt1 = str(sum)

print("Final Sum is ",sum)

#List
list1 = [2,4,5.5,"Hello",True,[3,6,9]]
print(type(list1))
print(len(list1))
print(list1[0])
print(list1[-1])
print(list1[-3:-1])
print(list1[-2:])
print(type(list1[-2]))
print(list1[-3][-3:])
print(list1[1]+list1[0])
list2 = [4,8,12]
print("list addition: ",list1 +list2)
for i in list2:
    print(i)

##############
list1 = []
#adding members using append
list1.append(5) #added at the back
list1.append(15)
list1.append(25)
list1.append(35)
print(list1)
#adding using insert(position,value)
list1.insert(1,10)
list1.insert(3,20)
print(list1)

#WAP to read marks in 5 subjects and calculate the total
sum=0
marks = []
for i in range(5):
    m = int(input("Enter marks for subject "+str(i+1)+": "))
    sum+=m
    marks.append(m)
print(f"Marks obtained are {marks} and the total is {sum}")

# modify the above program to read marks of 5 students
#all_marks = [[],[],[],[]]

DAY 13 : AUGUST 27, 2022

#27 AUGUST 2022

list1 = []
print(len(list1))
#append() – adds at the last
#insert() – inserts at given position
#pop() – removes from given position
#remove() – removes given value

#Queues: First In First Out (FIFO)
my_queue = []
while True:
    print("Select following options:")
    print("1. Display the content of the Queue\n2. Add a new member")
    print("3. Remove the member\n4. Exit")
    ch=input("Enter your choice:")
    if ch=="2":
        inp = input("Enter the member to be added: ")
        my_queue.append(inp)
    elif ch=="3":
        if len(my_queue)<=0:
            print("Sorry, there is no one in the queue!")
            continue
        my_queue.pop(0)
    elif ch=="1":
        print("Current members in the queue: \n",my_queue)
    elif ch=="4":
        break
    else:
        print("Invalid Option, try again!")
#Stack: Last In First Out (LIFO)

#Implement Stack as assignment
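One way the stack assignment could start is sketched below, without the menu loop so that it runs on its own. The function names push and pop are choices made for this sketch; the only real change from the queue is that members are removed from the same end they were added to.

```python
my_stack = []

def push(value):
    my_stack.append(value)          # new members go on top (the end of the list)

def pop():
    if len(my_stack) <= 0:
        print("Sorry, the stack is empty!")
        return None
    return my_stack.pop()           # pop() without an index removes the last member

push("A")
push("B")
push("C")
first_out = pop()                   # the last member added comes out first
print(first_out, my_stack)
```

Compare this with the queue above, where my_queue.pop(0) removes the oldest member instead.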

#Strings are immutable
str1 = "Hello"
#str1[1]= "E"
list1 = ["H","E","L","L","O"]
list1[1] = "K"
print(list1)
#Lists are MUTABLE
list1 = ["H","E","L","L","O"]
list2 = list1   #no copy: list2 is a second name for the same list
list3 = list1.copy()   #shallow copy: a new, independent list object
print("1. List 1", list1)
print("1. List 2", list2)
print("1. List 3", list3)
list1.append("K")
list2.append("L")
list3.append("M")
print("2. List 1", list1)
print("2. List 2", list2)
print("2. List 3", list3)
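The difference between plain assignment, a shallow copy and a deep copy only shows up once the list contains other lists. A small sketch using the standard copy module, with made-up nested data and variable names chosen for this example:

```python
import copy

original = [["H", "E"], ["L", "L", "O"]]
alias = original                  # no copy at all: same list object
shallow = original.copy()         # new outer list, but the inner lists are shared
deep = copy.deepcopy(original)    # new outer list and new inner lists

original.append(["K"])            # change the outer list
original[0].append("Y")           # change one of the inner lists

print("alias  :", alias)          # sees both changes
print("shallow:", shallow)        # sees only the inner change
print("deep   :", deep)           # sees neither change
```

For a flat list of strings or numbers, as in the notes above, list1.copy() is enough because there are no inner lists to share.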

list3.clear()
print(list3)
del list3  #delete the variable
#print(list3)
count = list1.count("L")
print(count)
list3 = list1 + list2
#list1 = list1 + list2
list1.extend(list2)
print(list1)
print(list1.index("O"))
#index can take 2 other values:
## 1. start value - it will search in the list from this index onwards
## 2. end value - search till this index
value_to_search = "L"
count = list1.count(value_to_search)
print(f"The indexes of {value_to_search} are: ",end="")
start_search = 0
for i in range(count):
    ind = list1.index(value_to_search,start_search)
    print(ind,end="  ")
    start_search= ind + 1
print()
# reverse() – reverse the list  values
list1.reverse()
print(list1)
list1.sort()
print(list1)
#list1.reverse()
list1.sort(reverse = True)
print(list1)

 

DAY 14 : AUGUST 28, 2022

 

#TUPLE
#Its immutable version of list
t1 =(3,5,7,9,11,3,5,7,9,3,5)
print(type(t1))
print(t1.count(7))
print(t1.index(5))

t1 = list(t1)
t1=()
t2=(2,3)
#just one value in tuple:
t3 = (3,)

if (23,54) > (23,54,99,89):
  print(23,54)

#unpacking
t1 = (3,5,7)
a,b,c = t1
print(a,b,c)


#Dictionary
dict1 = {9:"Sachin","Name":"Rohit",True : "Cricket"}
#key can be any immutable (hashable) value but they have to be unique
print(dict1[True])
temp = {5.6:"Mumbai"}
dict1.update(temp)

print(dict1)
#wap to input marks in 3 subjects and save under rollno
all_info = {}
for i in range(3):
  temp_dict ={}
  t_list = []
  rollno = int(input("Enter the Roll No.: "))
  for j in range(3):
    m = int(input("Enter the marks: "))
    t_list.append(m)
  temp_dict = {rollno: t_list}
  all_info.update(temp_dict)

print("Marks of all students are: ",all_info)

all_info = {101: [76,78,69], 102: [98,56,71], 68: [52,89,88]}
print(all_info.keys())
print(all_info.values())
print(all_info.items())

for i,j in all_info.items():
  print(i," : ",j)


 

DAY 15 : AUGUST 29, 2022


#WAP where we input date, month and year in numbers
# and display as – date(st/nd/rd/th) Month_in_text Year
# eg. date = 25  month = 8 year = 2022
# output would be 25th August 2022
month_txt = ['January','February','March','April','May',
             'June','July','August','September','October',
             'November','December']
date_th = ['st','nd','rd'] + 17*['th'] + ['st','nd','rd'] +7*['th'] +['st']
date = int(input("Enter the Date: "))
month = int(input("Enter the Month: "))
year = input("Enter the Year: ")
result = str(date)+date_th[date-1]+" " + month_txt[month-1]+" " + year
print(result)


#Assignment: input marks of 5 subjects for 5 students and display
#the highest marks in each subject and also for overall and name the student


### Assignment – rewrite the below program  by using list
#wap to arrange given 3 numbers in increasing order
#enter numbers as: 45, 75, 35 => 35 45 75
a,b,c = 85, 75,95
l1,l2,l3 = a,a,a
if a < b: #when a is less than b
    if a<c: # a is less than b and a is less than c
        l1 = a
        if b<c:
            l2,l3=b,c
        else: #c is less than b
            l2,l3 = c,b
    else: #a is less than b and greater than c [e.g. 3 5 2]
        l1,l2,l3=c,a,b

else: #when b is less than a
    if b <c:
        l1 =b
        if a <c:
            l2,l3=a,c
        else:
            l2,l3 = c,a
    else: # c <b
        l1,l2,l3 = c,b,a

print(f"{l1} <= {l2} <={l3}")

DAY 16: SEPTEMBER 3, 2022

#Dictionary
list1 = [4,5,6,7] #automatic position
dict1 = {"Harsh": 8.4,"Manish":8.12}
print(dict1["Harsh"])
#update to add another dictionary
dict2 = {"Sachin": 5.6, "Laxman": 7.2}
dict1.update(dict2)
print(dict1)
# FUT, QY, HY, SUT, FI (5%, 20%, 25%, 5%, 45%)
all_info ={}
for i in range(2): #2 students
    name = input("Enter Name: ")
    main_list=[] #list of list
    for k in range(5): # 5 types of exams
        t_list = []
        for j in range(3):
            marks = int(input("Enter marks: "))
            t_list.append(marks)

        main_list.append(t_list)
    t_dict = {name: main_list}
    all_info.update(t_dict)

#{"Manish": [[],[],[],[],[]]} - final output template
#Now we have the data

#to get all the keys:
keys = all_info.keys()
values = all_info.values()
items = all_info.items() # (key, value)

apply_std = [5, 20, 25, 5, 45]
#updated marks will be stored in another dictionary:
final_marks ={} #{name: []}
for k,v in all_info.items(): #[[5,7,8],[55,55,55],[66,66,66],[9,8,7],[88,88,88]]
    updated_marks=[]
    for i in range(3): #for the 3 subjects
        add = 0
        for j in range(5): #5 exams
            add = add + v[j][i] * apply_std[j]/100 #v[0][0] * apply[0] + v[1][0]* apply[1] +
        updated_marks.append(add)
    final_marks.update({k:updated_marks})
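The weighting step can be checked without typing 30 marks by using the sample values from the comment in the loop above. This is only a sketch; the variable names sample and weighted are introduced here for illustration.

```python
apply_std = [5, 20, 25, 5, 45]   # weight of each exam in percent (FUT, QY, HY, SUT, FI)

# one student: 5 exams x 3 subjects, values taken from the comment in the notes
sample = [[5, 7, 8], [55, 55, 55], [66, 66, 66], [9, 8, 7], [88, 88, 88]]

weighted = []
for i in range(3):               # for each subject
    add = 0
    for j in range(5):           # add up the weighted marks of the 5 exams
        add = add + sample[j][i] * apply_std[j] / 100
    weighted.append(add)

print([round(m, 2) for m in weighted])
```

Because the weights add up to 100, the result stays on the same scale as the original marks.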

 

DAY 17: 11 SEP 2022

#Function
def mystatements():
    print("Hello")
    print("How are you doing?")
    print("Good morning")
def mystatements2(name,greeting): #required & positional
    print("Hello",name)
    print("How are you doing?")
    print(greeting)

def mystatements3(name, greeting="Good Morning"): #default & positional
    #name is required and greeting is default
    #required parameters are given before default
    print("Hello",name)
    print("How are you doing?")
    print(greeting)
    return 100

mystatements()
result = mystatements2("Sachin","Good Morning")
print(result) #None is returned
result = mystatements3("Sachin")
print(result)

#function to take 2 numbers as input and
# perform add, sub,multiplication & division
# create - 2 functions: 1)required positional
# 2)default where numbers are 99 & 99
#return 4 answers as tuple
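A possible shape for this exercise is sketched below. The function names calc and calc_default are made up for this example, and the numbers are passed directly instead of being read with input().

```python
def calc(a, b):  # required & positional
    """Return (sum, difference, product, quotient) as a tuple."""
    return a + b, a - b, a * b, a / b

def calc_default(a=99, b=99):  # default values are used when nothing is passed
    return calc(a, b)

add, sub, mul, div = calc(10, 5)   # the returned tuple can be unpacked
print(add, sub, mul, div)
print(calc_default())              # uses 99 and 99
print(calc_default(b=3))           # a stays 99, b is overridden by keyword
```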
OSError errno22 invalid argument in Python

What is OSError?
OSError is the exception type named in the message OSError: [Errno 22] Invalid argument. It is a built-in Python exception (it is not limited to the os module) that is raised when a system-level operation fails. I/O failures also give rise to OSErrors.

For example, OSError is raised when the disk is full or a file cannot be found. The subclasses of OSError are BlockingIOError, ChildProcessError, ConnectionError, FileExistsError, FileNotFoundError, etc. Since Python 3.3, EnvironmentError and IOError are simply aliases of OSError.
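Because FileNotFoundError and the other classes listed above are subclasses of OSError, a single except OSError clause catches all of them, and the errno attribute tells which system error occurred. A minimal sketch, assuming a made-up path that does not exist on the machine:

```python
import errno

caught = None
try:
    # hypothetical path, chosen only because it should not exist
    open("no_such_directory_for_demo/missing.txt")
except OSError as exc:
    caught = exc

print(type(caught).__name__)               # the specific subclass that was raised
print(caught.errno == errno.ENOENT)        # error number for "No such file or directory"
print(errno.EINVAL)                        # the error number behind "Invalid argument"
```

The number 22 in [Errno 22] is the value of errno.EINVAL, in the same way that a missing file reports errno.ENOENT.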

What is errno 22 invalid argument?
Errno 22 is the operating system's EINVAL code, "Invalid argument". As the name suggests, invalid argument errors occur when an invalid argument is passed to a function. With OSError this usually means that the operating system rejected a value it was given, most commonly a file path that contains characters it does not allow.

import tensorflow as tf
tf.reshape(1,2)

This code raises an invalid argument error because tf.reshape() cannot carry out the request: the first argument holds a single value, and a single value cannot be reshaped into 2 elements.
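
To see where the number 22 comes from, the standard errno module can be inspected. The sketch below only looks up the code and builds an OSError by hand for illustration; the file name example.txt is a placeholder, and no real file operation is performed.

```python
import errno
import os

print(errno.EINVAL)                   # numeric code for EINVAL on this platform
print(errno.errorcode[errno.EINVAL])  # symbolic name for that code
print(os.strerror(errno.EINVAL))      # the platform's message for the code

# Build the same kind of exception by hand to see how its message is put together
err = OSError(errno.EINVAL, os.strerror(errno.EINVAL), "example.txt")
print(err)
```

The printed exception has the same three parts as the errors discussed here: the error number, the message, and the file name that was passed in.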

'OSError: [Errno 22] Invalid argument' while using read_csv()
read_csv() is a pandas function for reading CSV files in Python. It can read a file from a URL or from the local disk. When reading a local file, read_csv() can raise OSError: [Errno 22] Invalid argument.

Let us understand it with the help of an example. The code below was run in the Python shell to read a local file. First, we import pandas so that we can use read_csv().

import pandas as pd
file = pd.read_csv("C:\textfile.csv")

The above line of code will raise the error below.

OSError: [Errno 22] Invalid argument: 'C:\textfile.csv'
The reason behind the error is that Python treats a backslash inside an ordinary string as the start of an escape sequence. Here \t is read as a tab character, so the path no longer names a valid file and the operating system rejects it. The fix is to use a forward slash instead of the backslash. A raw string (r"C:\textfile.csv") or a doubled backslash ("C:\\textfile.csv") also works.

Correct method:
file = pd.read_csv("C:/textfile.csv")
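
The effect of the backslash can be checked without touching any file. The sketch below only compares strings; the variable names are made up for the example, and PureWindowsPath is used because it handles Windows-style paths on any operating system.

```python
from pathlib import PureWindowsPath

bad = "C:\textfile.csv"       # \t is turned into a tab character
raw = r"C:\textfile.csv"      # raw string: the backslash is kept as typed
doubled = "C:\\textfile.csv"  # escaped backslash: same text as the raw string

print(repr(bad))              # the tab is displayed as \t again, which hides the problem
print("\t" in bad)            # check whether a tab character is inside the path
print(len(bad), len(raw))     # the two strings do not even have the same length
print(raw == doubled)
print(PureWindowsPath(raw).as_posix())  # the same path written with forward slashes
```

Because repr() displays a tab as \t, the path in the error message looks exactly like the one that was typed, which is why this mistake is easy to miss.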

'OSError: [Errno 22] Invalid argument' while using open()
We can also get OSError: [Errno 22] Invalid argument while opening files with the open() function. The open() function in Python opens a file and returns a file object. The file can be opened in read, write, create or append mode.
Let us understand the error with an example. We shall try to open a .txt file in read mode using open(). The file object would be returned and saved in the variable 'f'.

f = open("C:\textfile.txt","r")

The code will throw the error below.

Traceback (most recent call last):
  File "", line 1, in
    f = open("C:\textfile.txt","r")
OSError: [Errno 22] Invalid argument: 'C:\textfile.

The error is raised for the same reason as before: Python reads \t in the path as a tab character instead of a backslash followed by the letter t. Replacing the backslash with a forward slash resolves the error.

Correct format:
f = open("C:/textfile.txt","r")
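
The open() modes mentioned above can be tried safely in a temporary folder. This is a minimal sketch: the file name textfile.txt is reused from the example, the folder is created and deleted by the tempfile module, and the with statement closes each file automatically.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as folder:
    path = Path(folder) / "textfile.txt"  # pathlib joins the parts, no backslashes needed

    with open(path, "w") as f:  # "w" creates the file (or empties an existing one)
        f.write("first line\n")

    with open(path, "a") as f:  # "a" adds to the end of the file
        f.write("second line\n")

    with open(path, "r") as f:  # "r" reads the file
        lines = f.read().splitlines()

print(lines)
```

Building the path with pathlib also avoids the escape-sequence problem, because no backslash is ever typed inside a string.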

'OSError: [Errno 22] Invalid argument' while reading an image using open()
The same error can appear while opening an image with the open() function even though the backslash has already been replaced with a forward slash. Let us see the error using an example.

image = open("C:/image1.jpg")

The error thrown would be:

Traceback (most recent call last):
  File "", line 1, in
    image = open("C:/image1.jpg")
OSError: [Errno 22] Invalid argument: '\u202aC:/image1.jpg'

This error mainly occurs when the file path has been copied and pasted. Invisible Unicode characters sometimes get copied along with the path when we copy it from our local system or from the internet.

The Unicode character '\u202a' in the above example is not visible in the pasted path, but it is still part of the string. '\u202a' is the Unicode control character LEFT-TO-RIGHT EMBEDDING. Since it is not allowed in a file path, it causes the invalid argument error above.

The solution is straightforward: type the path manually instead of copying it. The hidden Unicode character will then no longer be in the path and the error will be resolved.
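
When retyping the path is not practical, the hidden character can also be detected and removed in code. The sketch below is one possible approach, not the only one: the helper name strip_invisible is made up, and it assumes that dropping every character in the Unicode "Cf" (format) category is acceptable for the path in question.

```python
import unicodedata


def strip_invisible(path):
    """Return path without Unicode format characters (category Cf), such as U+202A."""
    return "".join(ch for ch in path if unicodedata.category(ch) != "Cf")


pasted = "\u202aC:/image1.jpg"  # path as it might arrive from copy and paste
cleaned = strip_invisible(pasted)

print(unicodedata.name(pasted[0]))  # official name of the hidden character
print(len(pasted), len(cleaned))    # the pasted path is one character longer
print(repr(cleaned))
```

Printing a suspicious path with repr() is a quick first check, since it shows hidden characters as escape codes such as \u202a.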


How to add a Machine Learning Project to GitHub

Maintaining a data science portfolio on GitHub is valuable for data science professionals and students, because it showcases their skills and projects.

Steps to add an existing Machine Learning Project to GitHub

Step 1: Install GIT on your system

We will use the git command-line interface which can be downloaded from:

https://git-scm.com/book/en/v2/Getting-Started-Installing-Git

Step 2: Create a GitHub account here:

https://github.com/

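Step 3: Now we create a repository for our project. Initializing a repository with a README file is generally good practice, but since we are pushing an existing local project here, create the repository empty. GitHub shows the "…or push an existing repository from the command line" instructions used in Step 9 only for an empty repository, and the first push can be rejected if the remote already contains a README commit.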

Step 4: Go to the Git folder located in Program Files\Git and open the git-bash terminal.

Step 5: Now navigate to the Machine Learning project folder using the following command.

cd PATH_TO_ML_PROJECT

Step 6: Type the following git initialization command to initialize the folder as a local git repository.

git init

We should get the message "Initialized empty Git repository in" followed by the path of the project folder. A .git folder, which is hidden by default, will be created inside it.

Step 7: Add files to the staging area for committing using this command which adds all the files and folders in your ML project folder.

git add .

Note: git add filename.extension can also be used to add individual files.

Step 8: We will now commit the files from the staging area and add a message to our commit. It is always good practice to write meaningful commit messages, which help us understand the commits when we revisit or revise the project later. Type the following command for your first commit.

git commit -m "Initial project commit"

Step 9: The commit only saves our files in the local repository on our system, so we still have to link it with our remote repository on GitHub. To link them, go to the GitHub repository we created earlier and copy the remote link under "…or push an existing repository from the command line".

First, get the URL of the GitHub project:

Now, in the git-bash window, type the command below, replacing YOUR_REMOTE_REPOSITORY_URL with your remote repository's URL.

git remote add origin YOUR_REMOTE_REPOSITORY_URL

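Step 10: Finally, we have to push the local repository to the remote repository on GitHub. If your local branch is named main rather than master (the default depends on your Git version and configuration), use that name in the command below.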

git push -u origin master

Sign in to your GitHub account when prompted

Authorize Git Credential Manager

After this, the Machine Learning project will be added to your GitHub with the files.

We have successfully added an existing Machine Learning Project to GitHub. Now is the time to create your GitHub portfolio by adding more projects to it.
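
Since the rest of these notes use Python, the local part of the workflow (Steps 6 to 8) can also be scripted with the subprocess module. This is only a sketch under a few assumptions: git must be installed and on the PATH, the helper names run_git and commit_project are made up, the name and email are placeholders passed only for this one commit, and a temporary folder with a single README.md stands in for the real ML project. Linking the remote and pushing (Steps 9 and 10) are left out because they need your own repository URL and login.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path


def run_git(args, cwd):
    """Run one git command inside cwd and return its standard output."""
    result = subprocess.run(
        ["git", *args], cwd=cwd, check=True, capture_output=True, text=True
    )
    return result.stdout


def commit_project(project_dir, message="Initial project commit"):
    """Initialize a repository, stage every file and create the first commit."""
    run_git(["init"], project_dir)      # Step 6
    run_git(["add", "."], project_dir)  # Step 7
    # Placeholder identity so the commit works even if git is not configured yet
    identity = ["-c", "user.name=Example User", "-c", "user.email=user@example.com"]
    run_git([*identity, "commit", "-m", message], project_dir)  # Step 8
    return run_git(["log", "--oneline"], project_dir)


log = None
if shutil.which("git"):  # only run the demo when git is available
    with tempfile.TemporaryDirectory() as folder:
        Path(folder, "README.md").write_text("Demo ML project\n")
        log = commit_project(folder)
    print(log)
```

For a real project, call commit_project with the path of the project folder and then continue with Steps 9 and 10 in git-bash.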