To read from a file, you can use the open() function in Python, which opens a file and returns a file object. The read() method is used to read the contents of the file.
Syntax for Reading:
python
# Reading from a file
file = open('file.txt', 'r')  # Opens the file in read mode ('r')
content = file.read()         # Reads the entire file content
print(content)
file.close()                  # Close the file after reading
Writing to Files (open() and write()):
To write to a file, open it with the appropriate mode (‘w’ for write, ‘a’ for append). The write() method is used to write content to the file.
Syntax for Writing:
python
# Writing to a file
file = open('file.txt', 'w')   # Opens the file in write mode ('w')
file.write('Hello, World!\n')  # Writes content to the file
file.close()                   # Close the file after writing
B. File Modes and Operations:
File Modes:
Read Mode (‘r’): Opens a file for reading. Raises an error if the file does not exist.
Write Mode (‘w’): Opens a file for writing. Creates a new file if it doesn’t exist or truncates the file if it exists.
Append Mode (‘a’): Opens a file for appending new content. Creates a new file if it doesn’t exist.
Read and Write Mode (‘r+’): Opens an existing file for both reading and writing without truncating it. Raises an error if the file does not exist.
Binary Mode (‘b’): Used in conjunction with other modes (e.g., ‘rb’, ‘wb’) to handle binary files.
File Operations:
read(): Reads the entire content of the file or, when given a size argument, at most that many characters (bytes in binary mode).
readline(): Reads a single line from the file.
readlines(): Reads all the lines of a file and returns a list.
write(): Writes content to the file.
close(): Closes the file when finished with file operations.
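The modes and methods listed above can be tried together in a short sketch. The file name demo_notes.txt and the temporary directory are assumptions made for illustration only, and the example uses the with statement (covered in the next part) so that each file is closed automatically.

```python
import os
import tempfile

# Hypothetical file inside a temporary directory, so nothing real is overwritten
path = os.path.join(tempfile.mkdtemp(), "demo_notes.txt")

with open(path, "w") as f:      # 'w' creates the file (or truncates an existing one)
    f.write("first line\n")
    f.write("second line\n")

with open(path, "a") as f:      # 'a' keeps the existing content and adds to the end
    f.write("third line\n")

with open(path, "r") as f:
    first = f.readline()        # one line, including its trailing newline
    rest = f.readlines()        # the remaining lines as a list of strings

with open(path, "r") as f:
    head = f.read(5)            # at most 5 characters in text mode

print(repr(first))
print(rest)
print(head)
```

Because readline() advances the file position, the readlines() call that follows returns only the lines that have not been read yet.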
Using with Statement (Context Manager):
The with statement in Python is used to automatically close the file when the block of code is exited. It’s a good practice to use it to ensure proper file handling.
Syntax:
python
with open('file.txt', 'r') as file:
    content = file.read()
    print(content)
# File is automatically closed outside the 'with' block
VII. Object-Oriented Programming (OOP) Basics
A. Classes and Objects:
Classes:
Classes are blueprints for creating objects in Python. They encapsulate data (attributes) and behaviors (methods) into a single unit.
Syntax for Class Declaration:
python
# Class declaration
class MyClass:
    # Class constructor (initializer)
    def __init__(self, attribute1, attribute2):
        self.attribute1 = attribute1
        self.attribute2 = attribute2

    # Instance method
    def my_method(self):
        return "This is a method in MyClass"
Objects:
Objects are instances of classes. They represent real-world entities and have attributes and behaviors defined by the class.
Creating Objects from a Class:
python
# Creating an object of MyClass
obj = MyClass("value1", "value2")
B. Inheritance and Polymorphism:
Inheritance:
Inheritance allows a class (subclass/child class) to inherit attributes and methods from another class (superclass/parent class).
Syntax for Inheritance:
python
# Parent class
class Animal:
    def sound(self):
        return "Some sound"

# Child class inheriting from Animal
class Dog(Animal):
    def sound(self):  # Overriding the method
        return "Woof!"
Polymorphism:
Polymorphism allows objects of different classes to be treated as objects of a common superclass. It enables the same method name to behave differently for each class.
Example of Polymorphism:
python
# Polymorphism example
def animal_sound(animal):
    return animal.sound()  # Same method name, different behaviors

# Creating instances of classes
animal1 = Animal()
dog = Dog()

# Calling the function with different objects
print(animal_sound(animal1))  # Output: Some sound
print(animal_sound(dog))      # Output: Woof!
VIII. Exception Handling
A. Understanding Exceptions:
Exceptions are errors that occur during the execution of a program, disrupting the normal flow of the code.
Examples include dividing by zero, trying to access an undefined variable, or attempting to open a non-existent file.
Types of Exceptions:
Python has built-in exception types that represent different errors that can occur during program execution, like ZeroDivisionError, NameError, FileNotFoundError, etc.
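As a rough illustration of the exception types named above, the sketch below triggers each one on purpose and reports its type name. The helper describe_error, the three small functions, and the file name are hypothetical names chosen for this example, and the try-except syntax it relies on is explained in the next part.

```python
def describe_error(action):
    """Run a zero-argument callable and return the name of any exception it raises."""
    try:
        action()
    except Exception as exc:    # broad on purpose: we only want the type name
        return type(exc).__name__
    return "no error"


def divide_by_zero():
    return 10 / 0


def use_undefined_name():
    return undefined_variable   # deliberately never defined


def open_missing_file():
    # Hypothetical file name that is assumed not to exist
    return open("this_file_should_not_exist_12345.txt", "r")


print(describe_error(divide_by_zero))
print(describe_error(use_undefined_name))
print(describe_error(open_missing_file))
```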
B. Using Try-Except Blocks:
Handling Exceptions with Try-Except Blocks:
Try-except blocks in Python provide a way to handle exceptions gracefully, preventing the program from crashing when errors occur.
Syntax:
python
try:
    # Code that might raise an exception
    result = 10 / 0  # Example: division by zero
except ZeroDivisionError as e:  # Name the exception type you expect here
    # Code to handle the exception
    print("An exception occurred:", e)
Handling Specific Exceptions:
You can catch specific exceptions by specifying the exception type after the except keyword.
Example:
python
try:
    file = open('nonexistent_file.txt', 'r')
except FileNotFoundError as e:
    print("File not found:", e)
Using Multiple Except Blocks:
You can use multiple except blocks to handle different types of exceptions separately.
Example:
python
try:
    result = 10 / 0
except ZeroDivisionError as e:
    print("Division by zero error:", e)
except Exception as e:
    print("An exception occurred:", e)
Handling Exceptions with Else and Finally:
The else block runs if no exceptions are raised in the try block, while the finally block always runs, whether an exception is raised or not.
Example:
python
try:
    result = 10 / 2
except ZeroDivisionError as e:
    print("Division by zero error:", e)
else:
    print("No exceptions occurred!")
finally:
    print("Finally block always executes")
IX. Introduction to Python Libraries
A. Overview of Popular Libraries:
NumPy:
Description: NumPy is a fundamental package for scientific computing in Python. It provides support for arrays, matrices, and mathematical functions to operate on these data structures efficiently.
Key Features:
Multi-dimensional arrays and matrices.
Mathematical functions for array manipulation.
Linear algebra, Fourier transforms, and random number capabilities.
Example:
python
import numpy as np

# Creating a NumPy array
arr = np.array([1, 2, 3, 4, 5])
Pandas:
Description: Pandas is a powerful library for data manipulation and analysis. It provides data structures like Series and DataFrame, making it easy to handle structured data.
Key Features:
Data manipulation tools for reading, writing, and analyzing data.
Data alignment, indexing, and handling missing data.
Time-series functionality.
Example:
python
import pandas as pd

# Creating a DataFrame
data = {'Name': ['Alice', 'Bob', 'Charlie'], 'Age': [25, 30, 35]}
df = pd.DataFrame(data)
Matplotlib:
Description: Matplotlib is a comprehensive library for creating static, interactive, and animated visualizations in Python. It provides functionalities to visualize data in various formats.
Key Features:
Plotting 2D and 3D graphs, histograms, scatter plots, etc.
Customizable visualizations.
Integration with Jupyter Notebook for interactive plotting.
Example:
python
import matplotlib.pyplot as plt

# Plotting a simple line graph
x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]
plt.plot(x, y)
plt.xlabel('X-axis')
plt.ylabel('Y-axis')
plt.title('Simple Line Graph')
plt.show()
B. Installing and Importing Libraries:
Installing Libraries using pip:
Open a terminal or command prompt and use the following command to install libraries:
pip install numpy pandas matplotlib
Importing Libraries in Python:
Once installed, import the libraries in your Python script using import statements:
python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
After importing, you can use the functionalities provided by these libraries in your Python code.
X. Real-life Examples and Projects
A. Simple Projects for Practice:
To-Do List Application:
Create a command-line to-do list application that allows users to add tasks, mark them as completed, delete tasks, and display the list.
Temperature Converter:
Build a program that converts temperatures between Celsius and Fahrenheit or other temperature scales.
Web Scraper:
Develop a web scraper that extracts information from a website and stores it in a structured format like a CSV file.
Simple Calculator:
Create a basic calculator that performs arithmetic operations such as addition, subtraction, multiplication, and division.
Hangman Game:
Implement a command-line version of the Hangman game where players guess letters to reveal a hidden word.
Address Book:
Develop an address book application that stores contacts with details like name, phone number, and email address.
File Organizer:
Write a script that organizes files in a directory based on their file extensions or other criteria.
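As a starting point for one of the projects above, here is a minimal sketch of the temperature converter. The function names are hypothetical, and only the Celsius and Fahrenheit scales are covered; a fuller version would read the value from the user and support other scales.

```python
def celsius_to_fahrenheit(celsius):
    """Convert degrees Celsius to degrees Fahrenheit."""
    return celsius * 9 / 5 + 32


def fahrenheit_to_celsius(fahrenheit):
    """Convert degrees Fahrenheit to degrees Celsius."""
    return (fahrenheit - 32) * 5 / 9


# Print a small conversion table
for c in (-40, 0, 37, 100):
    print(f"{c:>5} C = {celsius_to_fahrenheit(c):>6.1f} F")
```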
B. Exploring Python’s Applications in Different Fields:
Web Development (Django, Flask):
Python is widely used for web development. Explore frameworks like Django or Flask to build web applications, REST APIs, or dynamic websites.
Data Science and Machine Learning:
Use libraries like NumPy, Pandas, Scikit-learn, or TensorFlow to perform data analysis, create machine learning models, or work on predictive analytics projects.
Scientific Computing:
Python is used extensively in scientific computing for simulations, modeling, and solving complex mathematical problems. Use libraries like SciPy or SymPy for scientific computations.
Natural Language Processing (NLP):
Explore NLP with Python using libraries like NLTK or spaCy for text processing, sentiment analysis, or language translation tasks.
Game Development:
Develop simple games using Python libraries like Pygame, allowing you to create 2D games and learn game development concepts.
Automation and Scripting:
Create scripts to automate repetitive tasks like file manipulation, data processing, or system administration using Python’s scripting capabilities.
IoT (Internet of Things) and Raspberry Pi Projects:
Experiment with Python for IoT projects by controlling sensors, actuators, or devices using Raspberry Pi and Python libraries like GPIO Zero.
XI. Conclusion
A. Recap of Key Points:
Python Basics: Python is a high-level, versatile programming language known for its simplicity, readability, and vast ecosystem of libraries and frameworks.
Core Concepts: Understanding Python’s syntax, data types, control structures, functions, and handling exceptions is crucial for effective programming.
Popular Libraries: Libraries like NumPy, Pandas, Matplotlib, etc., offer specialized functionalities for data manipulation, scientific computing, visualization, and more.
Project Ideas: Simple projects, such as to-do lists, calculators, web scrapers, etc., provide practical experience and reinforce learning.
Real-world Applications: Python’s applications span diverse fields like web development, data science, machine learning, scientific computing, automation, IoT, and more.
B. Encouragement for Further Exploration:
Continuous Learning: Python’s versatility and vast ecosystem offer endless opportunities for learning and growth.
Practice and Projects: Build upon your knowledge by working on more complex projects, contributing to open-source, and experimenting with different libraries and domains.
Community Engagement: Engage with the Python community through forums, meetups, conferences, and online platforms to learn, share experiences, and collaborate.
Stay Curious: Python evolves continuously, and exploring new libraries, updates, or trends keeps your skills up-to-date and opens doors to new possibilities.
Persistence: Embrace challenges as learning opportunities. Persistence and dedication in learning Python will yield rewarding results in the long run.
C. Final Thoughts:
Python is an exceptional programming language renowned for its simplicity, readability, and versatility. Its applications span across numerous fields, from web development to scientific computing, data analysis, machine learning, and beyond. Whether you’re a beginner starting your programming journey or an experienced developer seeking new avenues, Python offers a rich ecosystem and a supportive community to aid your exploration and growth.
Python is an interpreted language
print(5*4, end=" and ")   # the expression is evaluated
print('5*4')              # a string is printed as it is
print("5*6")
print("5*6=", '\n' + str(5*6))
# functions have arguments - they are separated by ,
print("20" + "30", 20 + 30, 20, 30)
print("5*6=" + str(5*6))

# This isn't right!
print("This isn't right!")
# He asked,"What's your name"?
print('''He asked,"What's your name"?''')
print("""He asked,"What's your
name"?""")
print('This isn\'t right!')
print("He asked,\"What\'s your name\"?")

# \ - is called the ESCAPE CHARACTER; it starts an escape sequence
# \ adds or removes the special meaning of the character that follows it
print("\\n is used for newline in Python")
print("\\\\n will result in \\n")
print(r"\\n will result in \n")   # r"..." is a raw string: backslashes are kept as typed
print("HELLO"); print("HI")

## datatypes
# numeric: integer (int), float (float), complex (complex)
# text: string (str) - written with single, double or triple quotes
# boolean: boolean (bool) - True and False
x = 1275
y = 6
print(x + y)
print(type(x))
# basic data types:
var1 = 5
print(type(var1))   # <class 'int'>
var1 = 5.0
print(type(var1))   # <class 'float'>
var1 = "5.0"
print(type(var1))   # <class 'str'>
var1 = """5.0"""
print(type(var1))   # <class 'str'>
var1 = True
print(type(var1))   # <class 'bool'>
var1 = 5j
print(type(var1))   # <class 'complex'>

length = 100
breadth = 15
area = length * breadth
peri = 2 * (length + breadth)
print("Area of a rectangle with length", length, "and breadth", breadth, "is", area, "and perimeter is", peri)
# f-string
print(f"Area of a rectangle with length {length} and breadth {breadth} is {area} and perimeter is {peri}")
# float value
tot_items = 77
tot_price = 367
price_item = tot_price / tot_items
print(f"Cost of each item when total price paid is {tot_price} for {tot_items} items is {price_item:.1f} currency")
'''
Assignment submission process:
1. Create a Google Drive folder and share it with the instructor
2. Within this folder, add your .py files
'''
'''
Assignment 1:
1. Write a program to calculate the area and circumference of a circle and display the info in a formatted manner
2. WAP to calculate the area and perimeter of a square
3. WAP to calculate the simple interest to be paid when the principal amount, rate of interest and time are given
4. WAP to take degrees Celsius as input and give Fahrenheit as output
'''
# Format specifiers: < left-aligns, > right-aligns, ^ centres within the given width
name, country, position = "Virat", "India", "Opening"
print(f"Player {name:<10} plays for {country:>12} as a/an {position:^15} in the cricket.")
name, country, position = "Mangwaba", "Zimbabwe", "Wicket-keeper"
print(f"Player {name:<10} plays for {country:>12} as a/an {position:^15} in the cricket.")
# Comparison operators - compare the values
# asking, is ...
# your output is always a bool value - True or False
val1, val2, val3 = 20, 20, 10
print(val1 > val2)    # val1 greater than val2? F
print(val1 >= val2)   # T
print(val1 > val3)    # val1 greater than val3? T
print(val1 >= val3)   # T
print("Second set:")
print(val1 < val2)    # F
print(val1 <= val2)   # T
print(val1 < val3)    # F
print(val1 <= val3)   # F
print("third set:")
print(val1 == val2)   # T
print(val2 == val3)   # F
print(val1 != val2)   # F
print(val2 != val3)   # T
'''
a = 5    # assign value 5 to the variable a
a == 5   # is the value of a 5?
a != 5   # is value of a not equal to 5?
'''
## Logical operators: and or not
'''
Commitment 1: I am going to cover Python and SQL in this course
    Actual 1: I covered Python and SQL  (commitment kept)
    Actual 2: I covered SQL             (not kept)
    Actual 3: I covered Python          (not kept)
Commitment 2: I am going to cover Python or SQL in this course
    Actual 1: I covered Python and SQL  (kept)
    Actual 2: I covered SQL             (kept)
    Actual 3: I covered Python          (kept)
'''
# logical operators take bool values as input and the output is another bool
print(True and True)     # T
print(False and True)    # F
print(True and False)    # F
print(False and False)   # F
print("OR:")
print(True or True)      # T
print(False or True)     # T
print(True or False)     # T
print(False or False)    # F
print("NOT")
print(not True)          # F
print(not False)         # T

# "and" is evaluated before "or"
val1, val2, val3 = 20, 20, 10
print(val1 > val2 and val1 >= val2 or val1 > val3 and val1 >= val3 or val1 < val2 and val2 != val3)
# F and T or T and T or F and T
# F or T or F
# T

# Self Practice: output is True - solve it manually
print(val1 <= val2 or val1 < val3 and val1 <= val3 and val1 == val2 or val2 == val3 or val1 != val2)
# Bitwise operator : & | >> <<
print(bin(50))          # bin() converts into a binary number: 0b110010
print(int(0b110010))    # int() converts back into a decimal number: 50
print(oct(50))          # octal number system: 0o62
print(hex(50))          # hexadecimal: 0x32
# Assignments (3 programs) - refer below
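The notes above name the bitwise operators & | >> << but only demonstrate the number-system conversions. A minimal sketch of the four operators follows; the values 50 and 10 are chosen here purely for illustration.

```python
a, b = 50, 10           # 0b110010 and 0b001010

and_result = a & b      # a bit is 1 only where both inputs have 1
or_result = a | b       # a bit is 1 where either input has 1
right_shift = a >> 1    # drop the lowest bit: floor division by 2
left_shift = a << 2     # append two zero bits: multiplication by 4

print(and_result, bin(and_result))
print(or_result, bin(or_result))
print(right_shift, bin(right_shift))
print(left_shift, bin(left_shift))
```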
total_marks = 150
if total_marks >= 200:
    print("Congratulations! You have passed the exam")
    print("You have 7 days to reserve your admission")
else:
    print("Sorry, You have not cleared the exam")
    print("Try again after 3 months")
print("Thank you")
#
marks = 75
'''
>=85: Grade A
>=75: B
>=60: C
>=50: D
<50: E
'''
if marks >= 85:
    print("Grade A")
elif marks >= 75:
    print("Grade B")
elif marks >= 60:
    print("Grade C")
elif marks >= 50:
    print("Grade D")
else:
    print("Grade E")
print("Done")
###
# The same grading with independent if statements: every condition is checked
marks = 85
'''
>=85: Grade A
>=75: B
>=60: C
>=50: D
<50: E
'''
if marks >= 85:
    print("Grade A")
if marks >= 75 and marks < 85:
    print("Grade B")
if marks >= 60 and marks < 75:
    print("Grade C")
if marks >= 50 and marks < 60:
    print("Grade D")
if marks < 50:
    print("Grade E")
print("Done")
### NESTED IF
marks = 98.0001
'''
>=85: Grade A
>=75: B
>=60: C
>=50: D
<50: E
>=90: award them with medal
'''
if marks >= 85:
    print("Grade A")
    if marks >= 90:
        print("You win the medal")
        if marks > 98:
            print("Your photo will be on the wall of fame")
elif marks >= 75:
    print("Grade B")
elif marks >= 60:
    print("Grade C")
elif marks >= 50:
    print("Grade D")
else:
    print("Grade E")
'''
Practice basic programs from here:
https://www.scribd.com/document/669472691/Flowchart-and-C-Programs
'''
# check if a number is odd or even
# (negative integers are odd or even too, so no separate check is needed)
num1 = int(input("Enter the number: "))
if num1 % 2 == 0:
    print("Its Even")
else:
    print("Its Odd")
## check the greater of the given two numbers:
num1, num2 = 20, 20
if num1 > num2:
    print(f"{num1} is greater than {num2}")
elif num2 > num1:
    print(f"{num2} is greater than {num1}")
else:
    print("They are equal")
## check the greatest of the given three numbers:
num1, num2, num3 = 29, 49, 29
if num1 > num2:   # n1 > n2
    if num1 > num3:
        print(f"{num1} is greater")
    else:
        print(f"{num3} is greater")
else:             # n2 is greater than or equal to n1
    if num2 > num3:
        print(f"{num2} is greater")
    else:
        print(f"{num3} is greater")
##
# enter 3 sides of a triangle and check if they are:
# equilateral, isosceles, scalene, right angled triangle
side1, side2, side3 = 90, 60, 30
if side1 == side2:
    if side1 == side3:
        print("Equilateral")
    else:
        print("Isosceles")
else:
    if side1 == side3:
        print("Isosceles")
    else:
        if side2 == side3:
            print("Isosceles")
        else:
            print("Scalene")
# modify the above code to handle Right Angled triangle logic
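One possible way to approach the right-angled exercise is sketched below. The function name classify_triangle and the extra "Not a triangle" check are assumptions added for this illustration; they are not part of the original exercise.

```python
def classify_triangle(a, b, c):
    """Return a short description of the triangle with sides a, b and c."""
    x, y, z = sorted((a, b, c))     # after sorting, z is the longest side
    if x <= 0 or x + y <= z:        # assumed extra check: sides must form a triangle
        return "Not a triangle"
    if x == y == z:
        kind = "Equilateral"
    elif x == y or y == z:
        kind = "Isosceles"
    else:
        kind = "Scalene"
    # Pythagoras on the longest side; the comparison is exact only for integer sides
    if x * x + y * y == z * z:
        kind += ", right angled"
    return kind


print(classify_triangle(3, 4, 5))
print(classify_triangle(5, 5, 8))
print(classify_triangle(90, 60, 30))
```

Sorting the sides first means only one Pythagoras comparison is needed instead of three.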
# loops -
# FOR : know how many times you need to repeat
# WHILE : don't know how many times but you have the condition
# range(start, stop, step): starts with start, goes up to stop (not including)
#   step: each time the value is increased by step
#   range(10,34,6): 10, 16, 22, 28
# range(start, stop) : default step is 1
#   range(10,17): 10,11,12,13,14,15,16
# range(stop): default start is zero, default step is 1
#   range(5): 0,1,2,3,4

# generate values from 1 to 10
for counter in range(1, 11):   # 1,2,3...10
    print(counter, end=", ")
print()
print("Thank You")
# generate first 10 odd numbers
for odd_num in range(1, 20, 2):   # 1,3,5...19
    print(odd_num, end=", ")
print()
print("----------")
for counter in range(10):
    print(2 * counter + 1, end=", ")
print()
print("----------")
# generate even numbers till 50
for even_num in range(0, 51, 2):   # 0,2,4...50
    print(even_num, end=", ")
print()
##############
# WHILE: is always followed by a condition and only if the condition is true, you get in
# WAP to print hello till user says so
user = "y"
while user == "y":
    print("Hello")
    user = input("Enter y to continue or any other key to stop: ")
##
print("method 2")
while True:
    user = input("Enter y to continue or any other key to stop: ")
    if user != "y":
        break
    print("Hello")

print("Thank you")
count = int(input("How many times you want to print: "))
while count > 0:
    print("Hello")
    count -= 1   # count = count-1
# For loops
'''
* * * * *
* * * * *
* * * * *
* * * * *
* * * * *
'''
n = 5
for j in range(n):
    for i in range(n):
        print("*", end=" ")
    print()

'''
*
* *
* * *
* * * *
* * * * *
'''
n = 5
num_stars = 1
for j in range(n):
    for i in range(num_stars):
        print("*", end=" ")
    print()
    num_stars += 1
# same pattern without the extra counter
n = 5
for j in range(n):
    for i in range(j + 1):
        print("*", end=" ")
    print()

'''
* * * * *
* * * *
* * *
* *
*
'''
for j in range(n):
    for i in range(n - j):
        print("*", end=" ")
    print()

'''
* * * * *
 * * * *
  * * *
   * *
    *
'''
for j in range(n):
    for k in range(j):
        print("", end=" ")   # one leading space per row number
    for i in range(n - j):
        print("*", end=" ")
    print()
'''
Print prime numbers between 5000 and 10,000
10 - prime or not
2   10%2==0 => not a prime
3
4
'''
for num in range(5000, 10000):
    isPrime = True
    for i in range(2, num // 2 + 1):
        if num % i == 0:
            isPrime = False
            break
    if isPrime:
        print(num, end=", ")
'''
Trace for num = 11:
isPrime = T
i in range(2,6): no divisor found, so isPrime stays T
'''
# WAP to create a menu option to perform arithmetic operations
'''
Before you use a while loop, decide:
1. Should the loop run at least once (Exit Controlled), or
2. Should we check the condition even before running the loop (Entry controlled)
'''
# method 1: Exit controlled
while True:
    num1 = int(input("Enter first number: "))
    num2 = int(input("Enter second number: "))
    print("Your Option: ")
    print("1. Add")
    print("2. Subtract")
    print("3. Multiply")
    print("4. Divide")
    print("5. Exit")
    ch = input("Enter your choice: ")
    if ch == "1":
        print("Addition = ", num1 + num2)
    elif ch == "2":
        print("Difference = ", num1 - num2)
    elif ch == "3":
        print("Multiplication = ", num1 * num2)
    elif ch == "4":
        if num2 == 0:
            print("Cannot divide by zero")
        else:
            print("Division = ", num1 / num2)
    elif ch == "5":
        break
    else:
        print("Invalid Option")
#
# Generate odd numbers from 1 till user wants to continue
num1 = 1
while True:
    print(num1)
    num1 += 2
    ch = input("Enter y to generate next number or any other key to stop: ")
    if ch != 'y':
        break

# Generate fibonacci numbers from 1 till user wants to continue
num1 = 0
num2 = 1
while True:
    num3 = num1 + num2
    print(num3)
    num1, num2 = num2, num3
    ch = input("Enter y to generate next number or any other key to stop: ")
    if ch != 'y':
        break

# Generate fibonacci numbers from 1 till user wants to continue
print("Hit Enter key to continue or any other key to stop! ")
num1 = 0
num2 = 1
while True:
    num3 = num1 + num2
    print(num3, end="")
    num1, num2 = num2, num3
    ch = input()
    if ch != '':
        break
import random
print(random.random())            # float in the range [0.0, 1.0)
print(random.randint(100, 1000))  # integer from 100 to 1000, both included

from random import randint
print(randint(100, 1000))
# guess the number game - computer (has the number) v human (attempting)
from random import randint

num = randint(1, 100)
attempt = 0
while True:
    guess = int(input("Guess the number (1-100): "))
    if guess < 1 or guess > 100:
        print("Invalid attempt!!!")
        continue
    attempt += 1   # attempt=attempt+1
    if guess == num:
        print(f"Congratulations! You got it right in {attempt} attempts.")
        break
    elif guess < num:
        print("Sorry, that's incorrect. Please try again with a higher number!")
    else:
        print("Sorry, that's incorrect. Please try again with a lower number!")
###
###
# guess the number game - computer (has the number) v computer (attempting)
from random import randint
start, stop = 1, 100
num = randint(1, 100)
attempt = 0
while True:
    # guess = int(input("Guess the number (1-100): "))
    guess = randint(start, stop)
    if guess < 1 or guess > 100:
        print("Invalid attempt!!!")
        continue
    attempt += 1   # attempt=attempt+1
    if guess == num:
        print(f"Congratulations! You got it right in {attempt} attempts.")
        break
    elif guess < num:
        print(f"Sorry, {guess} that's incorrect. Please try again with a higher number!")
        start = guess + 1
    else:
        print(f"Sorry, {guess} that's incorrect. Please try again with a lower number!")
        stop = guess - 1
##
# guess the number game - computer v computer, repeated 10000 times to find the average
from random import randint
total_attempts = 0
for i in range(10000):
    start, stop = 1, 100
    num = randint(1, 100)
    attempt = 0
    while True:
        guess = randint(start, stop)
        if guess < 1 or guess > 100:
            print("Invalid attempt!!!")
            continue
        attempt += 1   # attempt=attempt+1
        if guess == num:
            print(f"Congratulations! You got it right in {attempt} attempts.")
            total_attempts += attempt
            break
        elif guess < num:
            print(f"Sorry, {guess} that's incorrect. Please try again with a higher number!")
            start = guess + 1
        else:
            print(f"Sorry, {guess} that's incorrect. Please try again with a lower number!")
            stop = guess - 1
print("========================================")
print("Average number of attempts = ", total_attempts / 10000)
print("========================================")
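The computer player above guesses at random inside the remaining range. For comparison, here is a sketch of a different strategy, not used in the notes above: always guess the midpoint of the remaining range (binary search). The function name attempts_with_halving is hypothetical.

```python
def attempts_with_halving(secret, low=1, high=100):
    """Count the guesses needed when the guess is always the midpoint of the range."""
    attempts = 0
    while True:
        guess = (low + high) // 2
        attempts += 1
        if guess == secret:
            return attempts
        if guess < secret:
            low = guess + 1     # secret is higher: discard the lower half
        else:
            high = guess - 1    # secret is lower: discard the upper half


# Try every possible secret number instead of sampling at random
counts = [attempts_with_halving(n) for n in range(1, 101)]
print("Worst case:", max(counts))
print("Average:", sum(counts) / len(counts))
```

Because every guess removes at least half of the remaining numbers, the worst case for the range 1 to 100 is 7 attempts.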
'''
Multi line
text of comments
which can go into multiple lines
'''
# Strings
str1 = 'Hello'
str2 = "Hello there"
print(type(str1), type(str2))
str3 = '''How are you?
Where are you from?
Where do you want to go?'''
str4 = """I am fine
I live here
I am going there"""
print(type(str3), type(str4))
print(str3)
print(str4)
# one line of comment
# what's your name?
print('what\'s your name?')
# counting in Python starts from zero
str1 = 'Hello there how are you?'
print("Number of characters in str1 is", len(str1))
print("First character: ", str1[0], str1[-len(str1)])
print("Second character: ", str1[1])
print("Last character: ", str1[len(str1) - 1])
print("Last character: ", str1[-1])
print("Second Last character: ", str1[-2])
print("5th 6th 7th char: ", str1[4:7])
print("First 4 char: ", str1[0:4], str1[:4])
print("first 3 alternate char: ", str1[0:5:2])   # indexes 0, 2 and 4
print("last 3 characters:", str1[-3:])
print("last 4 but one characters:", str1[-4:-1])
print(str1[5:1:-1])   # negative step: indexes 5, 4, 3 and 2
txt1 = "HiiH"
txt2 = txt1[-1::-1]   # reversing the text
print(txt2)
if txt2 == txt1:
    print("Its palindrome")
else:
    print("Its not a palindrome")
txt3 = str1[-1:-7:-1]   # last six characters of str1, reversed
print(txt3)
var1 = 5
# print(var1[0])   # TypeError: 'int' object is not subscriptable
# add two strings
print("Hello" + ", " + "How are you?")
print("Hello", "How are you?")
print(("Hello" + " ") * 5)
print("* " * 5)
# for loop - using strings
str1 = "hello"
for i in str1:
    print(i)

for i in range(len(str1)):
    print(i, str1[i])

print(type(str1))   # <class 'str'>
str2 = "HOW Are You?"
up_count, lo_count, sp_count = 0, 0, 0
for i in str2:
    if i.islower():
        lo_count += 1
    if i.isupper():
        up_count += 1
    if i.isspace():
        sp_count += 1
print(f"Number of spaces={sp_count}, uppercase letters={up_count} and lower case letters={lo_count}")

# input values:
val1 = input("Enter a number: ")
if val1.isdigit():
    val1 = int(val1)
    print(val1 * 5)
else:
    print("Invalid value")

str3 = "123af ds"
print(str3.isalnum())   # False, because of the space

# str1 = "How are You"
# docs.python.org
help(str.isascii)
help(help)
str1 = "HOw are YOU today?"
print(str1.upper())
print(str1.lower())
print(str1.title())
# str1 = str1.title()
# strings are immutable - you can't edit them in place
# str1[3] = "A"   # TypeError: 'str' object does not support item assignment
str1 = str1[0:3] + "A" + str1[4:]   # build a new string instead
print(str1)
cnt = str1.lower().count('o')
print(cnt)
cnt = str1.count('O', 3, 15)   # x, start, end
print(cnt)
# Strings - methods
str1 = "Hello how are you doing today"
var1 = str1.split()
print("Var 1 =", var1)
var2 = str1.split('o')
print("Var 2 =", var2)
str2 = "1,|Sachin,|Mumbai,|Cricket"
var3 = str2.split(',|')
print(var3)
str11 = " ".join(var1)
print("Str11 = ", str11)
str11 = "".join(var2)
print("Str11 = ", str11)
str11 = "--".join(var3)
print("Str11 = ", str11)

# Strings - methods
str1 = "Hello how are you doing today"
str2 = str1.replace('o', 'ooo')
print(str2)
cnt = str1.count('z')
print("Number of z in the str1 =", cnt)
find_cnt = str1.find('ow')   # find() returns -1 when the substring is absent
if find_cnt == -1:
    print("Given substring is not in the main string")
else:
    print("Substring in the str1 found at =", find_cnt)

find_cnt = str1.find('o', 5, 6)   # search only between index 5 and 6
print("Substring in the str1 found at =", find_cnt)

str2 = str1.replace('z', 'ooo', 3)   # third argument: maximum number of replacements
print(str2)
################
## LIST = Linear Ordered Mutable Collection
l1 = [55, 'Hello', False, 45.9, [2, 4, 6]]
print("type of l1 = ", type(l1))
print("Number of members in the list=", len(l1))
print(l1[0], l1[4], l1[-1])
print("type of l1[0] =", type(l1[0]))
print("type of l1[-1] =", type(l1[-1]))
l2 = l1[-1]
print(l2[0], l1[-1][0], type(l1[-1][0]))
l1[0] = 95
print("L1 =", l1)

## LIST = Linear Ordered Mutable Collection
l1 = [55, 'Hello', False, 45.9, [2, 4, 6]]
for member in l1:
    print(member)

print(l1 + l1)
print(l1 * 2)

print("count = ", l1.count(False))
print("count = ", l1.count('Hello'))
# remove second last member - pop takes position
l1.pop(-2)
print("L1 after Pop: ", l1)
l1.pop(-2)
print("L1 after Pop: ", l1)
# delete the element - remove takes value
cnt = l1.count('Helloo')
if cnt > 0:
    l1.remove('Helloo')
    print("L1 after Remove: ", l1)
else:
    print("'Helloo' not in the list")
# Collections - Lists - linear mutable ordered collection
l1 = [10, 50, 90, 20, 90]
# add and remove members
l1.append(25)   # append will add at the end
l1.append(45)
print("L1 after append: ", l1)
# insert takes position and the value to add
l1.insert(2, 35)
l1.insert(2, 65)
print("L1 after insert: ", l1)
l1.remove(35)   # takes value to delete
l1.remove(90)   # removes only the first 90
print("L1 after remove: ", l1)
cnt_90 = l1.count(90)
print("Number of 90s: ", cnt_90)
l1.pop(2)   # index at which you want to delete
print("L1 after pop: ", l1)
# Collections - Lists - linear mutable ordered collection
l1 = [10, 50, 90, 20, 90]
l2 = l1.copy()   # shallow copy - a photocopy: a new, separate list
l3 = l1          # not a copy at all - the same list with two names
print("1. L1 = ", l1)
print("1. L2 = ", l2)
print("1. L3 = ", l3)
l1.append(11)
l2.append(22)
l3.append(33)    # also changes l1, because l3 is l1
print("2. L1 = ", l1)
print("2. L2 = ", l2)
print("2. L3 = ", l3)
print("Index of 90:", l1.index(90, 3, 7))   # search from index 3 up to (not including) 7
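The difference between an extra name, a shallow copy and a deep copy only shows up once a list contains another list. The sketch below uses copy.deepcopy from the standard library; the variable names and values are chosen for this illustration.

```python
import copy

original = [10, [20, 30]]
alias = original                  # same list object, second name
shallow = original.copy()         # new outer list, but the inner list is shared
deep = copy.deepcopy(original)    # new outer list and new inner list

original.append(40)               # seen through alias only
original[1].append(99)            # seen through alias and shallow, not deep

print(alias)    # [10, [20, 30, 99], 40]
print(shallow)  # [10, [20, 30, 99]]
print(deep)     # [10, [20, 30]]
```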
# Extend: l1 = l1+l2
l2 = [1, 2, 3]
l1.extend(l2)
print("L1 after extend:", l1)
l1.reverse()
print("L1 after reverse: ", l1)
l1.sort()   # sort in ascending order
print("L1 after sort: ", l1)
l1.sort(reverse=True)   # sort in descending order
print("L1 after reverse sort: ", l1)
l1.clear()
print("L1 after clear: ", l1)
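######## question from Vivek: ###########
# Find the positions in l1 of two numbers that add up to target
l1 = [9, 5, 7, 2]
target = 12
l2 = l1.copy()
l2.sort()   # [2, 5, 7, 9]
# Two-pointer scan over the sorted copy (checking only adjacent pairs can miss a match)
left, right = 0, len(l2) - 1
while left < right:
    total = l2[left] + l2[right]
    if total == target:
        print(l1.index(l2[left]), l1.index(l2[right]))
        break
    elif total > target:
        right -= 1   # sum too large: move to a smaller number
    else:
        left += 1    # sum too small: move to a larger number
else:
    print("No pair adds up to", target)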
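# Tuple - ordered, immutable collection
t1 = (2, 4, 6, 2)   # example values (assumed; the original definition is missing from these notes)
# t1[1] = 14   # TypeError: 'tuple' object does not support item assignment
print("Index of 2 =", t1.index(2))
print("Count of 2 =", t1.count(2))
print(t1, type(t1))
t1 = list(t1)    # convert to a list to make the change
t1[1] = 14
t1 = tuple(t1)   # and convert back to a tuple
print(t1, type(t1))
for i in t1:
    print(i)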
# Dictionary
'''
WAP to input marks of three students in three subjects
marks = {'Sachin': [78, 87, 69], 'Kapil': [59, 79, 49], 'Virat': [88, 68, 78]}
'''
students = ['Sachin', 'Kapil', 'Virat']
subjects = ['Maths', 'Science', 'English']
marks = {}
num_students, num_subjects = 3, 3
for i in range(num_students):
    marks_list = []   # a fresh list for every student
    for j in range(num_subjects):
        m = int(input("Enter the marks in subject " + subjects[j] + " : "))
        marks_list.append(m)
    temp = {students[i]: marks_list}
    marks.update(temp)
print("Marks entered are: ", marks)
# Dictionary ”’ WAP to input marks of three students in three subjects. calculate total and average of marks for all the 3 students find who is the highest scorer in total and also for each subject marks = {‘Sachin’: [78, 87, 69], ‘Kapil’: [59, 79, 49], ‘Virat’: [88, 68, 78]} ”’ students = [‘Sachin’, ‘Kapil’, ‘Virat’] subjects = [‘Maths’, ‘Science’, ‘English’] marks = {‘Sachin’: [78, 87, 69], ‘Kapil’: [59, 79, 49], ‘Virat’: [88, 68, 78]} topper = {‘Total’: –1, ‘Name’: []} subject_highest = [-1, –1, –1]
num_students, num_subjects = 3, 3 for i in range(num_students): tot, avg = 0, 0 key = students[i] for j in range(num_subjects): tot = tot + marks[key][j] # checking the highest values for each subject # … avg = tot / 3 print(f”Total marks obtained by {students[i]} is {tot} and average is {avg:.1f}“) # check highest total if tot >= topper[‘Total’]: topper[‘Total’] = tot topper[‘Name’].append(key)
print(f”{topper[‘Name’]} has topped the class with total marks of {topper[‘Total’]}“)
# Dictionary
'''
WAP to input marks of three students in three subjects.
calculate total and average of marks for all the 3 students
find who is the highest scorer in total and also for each subject
marks = {'Sachin': [78, 87, 69], 'Kapil': [59, 79, 49], 'Virat': [88, 68, 78]}
'''
students = ['Sachin', 'Kapil', 'Virat']
subjects = ['Maths', 'Science', 'English']
marks = {'Sachin': [78, 87, 69], 'Kapil': [59, 79, 49], 'Virat': [88, 68, 78]}
topper = {'Total': -1, 'Name': []}
subject_highest = [-1, -1, -1]

num_students, num_subjects = 3, 3
for i in range(num_students):
    tot, avg = 0, 0
    key = students[i]
    for j in range(num_subjects):
        tot = tot + marks[key][j]
        # checking the highest values for each subject
        if marks[key][j] > subject_highest[j]:
            subject_highest[j] = marks[key][j]

    avg = tot / num_subjects
    print(f"Total marks obtained by {students[i]} is {tot} and average is {avg:.1f}")
    # check highest total: a new best replaces the old names, a tie adds a name
    if tot > topper['Total']:
        topper['Total'] = tot
        topper['Name'] = [key]
    elif tot == topper['Total']:
        topper['Name'].append(key)

print(f"{topper['Name']} has topped the class with total marks of {topper['Total']}")
print(f"Highest marks for subjects {subjects} is {subject_highest}")
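The same totals, toppers and per-subject maxima can also be computed with built-ins instead of index-based loops. The following is only a sketch of one alternative; the helper name `summarize` is an assumption introduced here and is not part of the original notes.

```python
marks = {'Sachin': [78, 87, 69], 'Kapil': [59, 79, 49], 'Virat': [88, 68, 78]}
subjects = ['Maths', 'Science', 'English']

def summarize(marks):
    """Return (totals, toppers, subject_highest) for a name -> marks mapping."""
    totals = {name: sum(scores) for name, scores in marks.items()}
    best_total = max(totals.values())
    # every student whose total equals the best total is a topper (handles ties)
    toppers = [name for name, tot in totals.items() if tot == best_total]
    # zip(*...) regroups the per-student lists into per-subject columns
    subject_highest = [max(column) for column in zip(*marks.values())]
    return totals, toppers, subject_highest

totals, toppers, subject_highest = summarize(marks)
for name, tot in totals.items():
    print(f"Total marks obtained by {name} is {tot} and average is {tot / len(subjects):.1f}")
print(f"{toppers} topped the class with total marks of {totals[toppers[0]]}")
print(f"Highest marks for subjects {subjects} is {subject_highest}")
```

Because ties are collected in one pass over the totals, this version does not depend on the order in which students are processed.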
# sets, lists, tuples -> they are convertible into each other's form
l1 = ['Apple', 'Apple', 'Apple', 'Apple', 'Apple']
l1 = list(set(l1))  # converting to a set drops the duplicates
print(l1)
s1 = {4, 2, 3}
print(s1)
# Functions
def smile():
    txt = '''
A smile, a curve that sets all right,
Lighting days and brightening the night.
In its warmth, hearts find their flight,
A silent whisper of pure delight.'''
    print(txt)

smile()
smile()
smile()
#==================
# function to calculate gross pay
def calc_grosspay():
    basic_salary = 5000
    hra = 0.1 * basic_salary
    da = 0.4 * basic_salary
    gross_pay = basic_salary + hra + da
    print("Your gross pay is", gross_pay)

def calc_grosspay_withreturn():
    basic_salary = 5000
    hra = 0.1 * basic_salary
    da = 0.4 * basic_salary
    gross_pay = basic_salary + hra + da
    return gross_pay

def calc_grosspay_return_input(basic_salary):
    hra = 0.1 * basic_salary
    da = 0.4 * basic_salary
    gross_pay = basic_salary + hra + da
    return gross_pay

bp_list = [3900, 5000, 6500, 9000]
gp_1 = calc_grosspay_return_input(bp_list[3])
print("Gross Pay for this month is", gp_1)

gp = calc_grosspay_withreturn()
print("Total gross pay for ABC is", gp)
gp_list = []
gp_list.append(gp)
calc_grosspay()
# function to check prime numbers
'''
divisors to test: 10 -> 2 to 5, 7 -> 2 to 3, 9 -> 2 to 4
'''
def gen_prime(num):
    '''
    This function takes a parameter and checks if it's a prime number or not
    :param num: number (int)
    :return: True/False (True for prime number)
    '''
    if num < 2:
        return False  # 0, 1 and negative numbers are not prime
    isPrime = True
    for i in range(2, num // 2 + 1):  # + 1 so that num // 2 itself is tested (catches 4)
        if num % i == 0:
            isPrime = False
            break
    return isPrime

if __name__ == "__main__":
    num = 11
    print(num, " : ", gen_prime(num))
    num = 100
    print(num, " : ", gen_prime(num))

# generate prime numbers between given range
start, end = 1000, 5000
for i in range(start, end):
    check = gen_prime(i)
    if check:
        print(i, end=", ")

# doc string: multi line comment added at the beginning of the function
help(gen_prime)
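Testing divisors up to half of the number works, but it is enough to test up to the square root, because any factor larger than the square root must pair with one smaller than it. The sketch below shows that variant; the name `is_prime` is an assumption chosen here so that it does not clash with `gen_prime` above.

```python
import math

def is_prime(num):
    """Return True if num is prime, using trial division up to the square root."""
    if num < 2:
        return False
    # math.isqrt gives the integer square root (Python 3.8+)
    for i in range(2, math.isqrt(num) + 1):
        if num % i == 0:
            return False
    return True

print([n for n in range(1, 30) if is_prime(n)])
```

For the range 1000 to 5000 used above, this cuts the number of divisions per candidate from thousands to a few dozen.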
#import infy_apr as ia
from infy_apr import gen_prime

def product_val(n1, n2):
    return n1 * n2

if __name__ == "__main__":
    num1 = 1
    num2 = 3
    print("Sum of two numbers is", num2 + num1)
    # generate prime numbers between 50K to 50.5K
    for i in range(50000, 50500):
        check = gen_prime(i)
        if check:
            print(i, end=", ")
res = calculate(5, 10, plus)
print("1. Result = ", res)
res = calculate(5, 10, diff)
print("2. Result = ", res)
###############
# in-built functions()
# user defined functions()
# anonymous / one line / lambda
def myfunc1(a, b):
    return a ** b

# above myfunc1() can also be written as:
myfunc2 = lambda a, b: a ** b
print("5 to power of 4 is", myfunc2(5, 4))
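The definitions of `calculate`, `plus` and `diff` are not shown in these notes. A plausible reading is that `calculate` receives a function as its third argument and applies it to the two numbers. The sketch below is written under that assumption only; the bodies are hypothetical and may differ from the ones used in class.

```python
# Hypothetical definitions: the originals are not shown in the notes.
def plus(a, b):
    return a + b

def diff(a, b):
    return a - b

def calculate(a, b, operation):
    """Apply the function passed in as `operation` to a and b."""
    return operation(a, b)

res = calculate(5, 10, plus)
print("1. Result = ", res)
res = calculate(5, 10, diff)
print("2. Result = ", res)
# a lambda can be passed in exactly the same way
print("3. Result = ", calculate(5, 10, lambda a, b: a * b))
```

Passing functions around like this is what makes `map` and `filter` work: they take the logic to apply as an argument.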
'''
map: apply same logic on all the values of the list: multiply all the values by 76
filter: filter out values in a list based on a condition: remove -ve values
reduce: reduce multiple values in a list to a single value
'''
# a = 11, b = 12, c = 13…
calc = 0
list1 = ['a', 'b', 'c', 'd']
word = input()
for i in word:
    calc = calc + list1.index(i) + 11
# e.g. "acd" -> 11 + 13 + 14
print(calc)
'''
map: apply same logic on all the values of the list: multiply all the values by 76
filter: filter out values in a list based on a condition: remove -ve values
reduce: reduce multiple values in a list to a single value
'''
value_usd = [12.15, 34.20, 13, 8, 9, 12, 45, 87, 56, 78, 54, 34]
value_inr = []
# 1 usd = 78 inr
for v in value_usd:
    value_inr.append(v * 78)
print("Value in INR: ", value_inr)

value_inr = list(map(lambda x: 78 * x, value_usd))
print("Value in INR: ", value_inr)

# filter: filter out the values
new_list = [12, 7, 0, -5, -6, 15, 18, 21, -44, -90, -34, 56, 43, 12, 7, 0, -5, -6, 15, 18, 21, -44, -90, -34, 56, 43]
output_list = list(filter(lambda x: x >= 0, new_list))
print("Filtered: ", output_list)

output_list = list(filter(lambda x: x % 3 == 0 and x >= 0, new_list))
print("Filtered: ", output_list)
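The notes describe `reduce` but only show `map` and `filter` in action. A minimal sketch of `reduce` follows, using a shortened list chosen here for illustration. In Python 3, `reduce` lives in the `functools` module.

```python
from functools import reduce

new_list = [12, 7, 0, -5, -6, 15, 18, 21]
non_negative = list(filter(lambda x: x >= 0, new_list))

# reduce folds the list into one value, carrying an accumulator from left to right
total = reduce(lambda acc, x: acc + x, non_negative, 0)
largest = reduce(lambda a, b: a if a > b else b, non_negative)
print("Sum:", total, "Largest:", largest)
```

For plain sums and maxima the built-ins `sum()` and `max()` are simpler; `reduce` is mainly useful when the combining step is custom.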
##################################
## class & objects
'''
car - class
number of wheels - 4, color, make
driving
parking
'''
class Book:
    number_of_books = 0

    def reading(self):
        print("I am reading a book")

b1 = Book()  # creating object of class Book
b2 = Book()
b3 = Book()
b4 = Book()
print(b1.number_of_books)
b1.reading()
'''
class level variables and methods
object level variables and methods
'''
'''
__init__() : will automatically be called when object is created
'''
class Book:
    book_count = 0  # class level variable

    def __init__(self, title):  # object level method
        self.title = title  # object level variable
        total = 0  # normal variable
        Book.book_count += 1

    @classmethod
    def output(cls):
        print("Total book now available = ", Book.book_count)
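The class above is defined but never used in the notes. The sketch below repeats it in compact form and creates two objects to show how the class level counter and the object level titles behave. The book titles are made up for illustration.

```python
class Book:
    book_count = 0  # class level variable, shared by all objects

    def __init__(self, title):
        self.title = title      # object level variable, one per object
        Book.book_count += 1    # every new object updates the shared counter

    @classmethod
    def output(cls):
        print("Total book now available = ", cls.book_count)

b1 = Book("Python Basics")      # hypothetical titles
b2 = Book("Data Structures")
print(b1.title, "|", b2.title)
Book.output()                   # a class method can be called on the class itself
```

Each object keeps its own `title`, while `book_count` has a single value that all objects see.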
    def check_prime(self):
        # check if n1 is prime or not
        self.checkPrime = True
        for i in range(2, self.n1 // 2 + 1):
            if self.n1 % i == 0:
                self.checkPrime = False

m1 = MyMathOp(15, 10)
print(m1.n1)
m1.check_prime()
print(m1.checkPrime)

s1 = Shape()
#s1.myarea()
#s1.myarea(10)
#s1.area(10,20)
as1 = AnotherShape()
as1.test1()
'''
public: anyone can call public members of a class
protected (_var): (concept exists but practically it doesn't exist) - behaves like public
    concept: only the derived class can call
private (__var): available only within the given class
'''
#s1.__dummy3()
#r1.__dummy3()
s1.dummy4()
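The public, protected and private rules above can be checked directly. The sketch below uses a small stand-in class named `Sample` (an assumption, since the `Shape` class itself is not shown in the notes) to show that a single underscore is only a convention, while a double underscore triggers name mangling.

```python
class Sample:
    def __init__(self):
        self.name = "sample"   # public
        self._sides = 4        # protected by convention only, still accessible
        self.__secret = 42     # private: stored as _Sample__secret

    def reveal(self):
        return self.__secret   # works inside the class

s1 = Sample()
print(s1.name, s1._sides, s1.reveal())
try:
    s1.__secret                # fails outside the class
except AttributeError as err:
    print("AttributeError:", err)
print(s1._Sample__secret)      # the mangled name is still reachable
```

So private members are hidden by renaming rather than truly locked away, which is why the commented-out `s1.__dummy3()` call above would fail.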
# logical error
# runtime errors - exceptions
a = 50
try:
    b = int(input("Enter the denominator: "))
except ValueError:
    print("You have provided invalid value for B, changing the value to 1")
    b = 1
try:
    print(a / b)  # ZeroDivisionError
    print("A by B is", a / b)
except ZeroDivisionError:
    print("Sorry, we cant perform the analysis as denominator is zero")

print("thank you")

################
a = 50
b = input("Enter the denominator: ")
try:
    print("A by B is", a / int(b))  # ZeroDivisionError & ValueError
except ValueError:
    print("You have provided invalid value for B, changing the value to 1")
    b = 1
except ZeroDivisionError:
    print("Sorry, we cant perform the analysis as denominator is zero")
except Exception:
    print("An error has occurred, hence skipping this section")
else:
    print("So we got the answer now!")
finally:
    print("Not sure if there was an error but we made it through")
print("thank you")
# File handling
'''
Working with Text files:
1. read: read(), readline(), readlines()
2. write: write(), writelines()
3. append
Modes: r, r+, w, w+, a, a+
Accessing the file:
1. Absolute path:
2. Relative path:
'''
path = "C:/Folder1/Folder2/txt1.txt"
path = "C:\\Folder1\\Folder2\\txt1.txt"
path = "ptxt1.txt"
content = '''Twinkle twinkle little star
How I wonder what you are
Up above the world so high
like a diamond in the sky
'''
file_obj = open(path, "a+")
file_obj.write(content)
file_obj.seek(0)  # go to the beginning of the content
read_cnt = file_obj.read()
file_obj.close()

file_obj = open(path, "w")
write_nt = ['Hello how are you?\n', 'I am fine\n', 'Where are you going\n', 'sipdfjisdjisdjf\n']
file_obj.writelines(write_nt)
file_obj.close()
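The listing above mentions `readline()` and `readlines()` but only calls `read()`. The sketch below exercises the other two, together with append mode, using the `with` statement so each file is closed automatically. Writing into a temporary folder is a choice made here so the example does not touch any real files.

```python
import os
import tempfile

lines = ['Hello how are you?\n', 'I am fine\n', 'Where are you going\n']
path = os.path.join(tempfile.mkdtemp(), "ptxt1.txt")  # throwaway location

with open(path, "w") as file_obj:
    file_obj.writelines(lines)

with open(path, "r") as file_obj:
    first = file_obj.readline()   # one line, trailing newline included
    rest = file_obj.readlines()   # the remaining lines as a list

with open(path, "a") as file_obj:  # append mode keeps the existing content
    file_obj.write("See you soon\n")

with open(path) as file_obj:
    line_count = sum(1 for _ in file_obj)  # iterating a file yields its lines

print(repr(first), rest, line_count)
```

Note that `readlines()` continues from wherever the previous read stopped, which is why `rest` does not contain the first line.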
A GitHub portfolio is valuable for data science professionals and students at any stage of their career, because it showcases their skills and projects.
Steps to add an existing Machine Learning Project to GitHub
Step 1: Install Git on your system
We will use the git command-line interface which can be downloaded from:
Step 2: Create GitHub account here:
Step 3: Now we create a repository for our project. It’s always a good practice to initialize the project with a README file.
Step 4: Go to the Git folder located in Program Files\Git and open the git-bash terminal.
Step 5: Now navigate to the Machine Learning project folder using the following command.
cd PATH_TO_ML_PROJECT
Step 6: Type the following git initialization command to initialize the folder as a local git repository.
git init
We should get the message “Initialized empty Git repository in <your path>”, and a .git folder, which is hidden by default, will be created.
Step 7: Add files to the staging area for committing using this command which adds all the files and folders in your ML project folder.
git add .
Note: git add filename.extension can also be used to add individual files.
Step 8: We will now commit the files from the staging area and add a message to our commit. It is always good practice to write meaningful commit messages, which help us understand the commits during future visits and revisions. Type the following command for your first commit.
git commit -m "Initial project commit"
Step 9: The commit only records our files in the local repository on our system, so we still have to link it with our remote repository on GitHub. To link them, go to the GitHub repository we created earlier and copy the remote link under “..or push an existing repository from the command line”.
First, get the URL of the GitHub project:
Now, in the git-bash window, paste the command below followed by your remote repository’s URL.
git remote add origin YOUR_REMOTE_REPOSITORY_URL
Step 10: Finally, we have to push the local repository to the remote repository on GitHub. If your default branch is named main rather than master, use that name in the command below.
git push -u origin master
Sign into your github account
Authorize GitCredentialManager
After this, the Machine Learning project will be added to your GitHub with the files.
We have successfully added an existing Machine Learning Project to GitHub. Now is the time to create your GitHub portfolio by adding more projects to it.
A restricted Boltzmann machine (RBM) is a generative stochastic artificial neural network that can learn a probability distribution over its set of inputs. Restricted Boltzmann machines can also be used in deep learning networks. In particular, deep belief networks can be formed by “stacking” RBMs and optionally fine-tuning the resulting deep network with gradient descent and backpropagation. This deep learning algorithm became very popular after the Netflix Competition where RBM was used as a collaborative filtering technique to predict user ratings for movies and beat most of its competition. It is useful for regression, classification, dimensionality reduction, feature learning, topic modelling and collaborative filtering.
Restricted Boltzmann Machines are stochastic two-layered neural networks that belong to the category of energy-based models. They can detect inherent patterns in the data automatically by reconstructing the input. They have two layers: visible and hidden. The visible layer has input nodes (nodes which receive input data), and the hidden layer is formed by nodes which extract feature information from the data; the output at the hidden layer is a weighted sum of the inputs. They don’t have any output nodes, and they don’t have the typical binary output through which patterns are learnt. The learning process happens without that capability, which makes them different. We only take care of the input nodes and don’t worry about the hidden nodes. Once the input is provided, RBMs automatically capture the patterns, parameters and correlations in the data.
What is a Boltzmann Machine?
Let’s first understand what a Boltzmann Machine is. The Boltzmann Machine was invented in 1985 by Geoffrey Hinton, later a professor at the University of Toronto, together with Terry Sejnowski. Hinton is a leading figure in the deep learning community and is referred to by some as the “Godfather of Deep Learning”.
Boltzmann Machine is a generative unsupervised model, which involves learning a probability distribution from an original dataset and using it to make inferences about never before seen data.
Boltzmann Machine has an input layer (also referred to as the visible layer) and one or several hidden layers.
Boltzmann Machine uses neural networks with neurons that are connected not only to other neurons in other layers but also to neurons within the same layer.
Everything is connected to everything. Connections are bidirectional: visible neurons are connected to each other, and hidden neurons are also connected to each other.
Boltzmann Machine doesn’t expect input data; it generates data. Neurons generate information regardless of whether they are hidden or visible.
For a Boltzmann Machine all neurons are the same; it doesn’t discriminate between hidden and visible neurons. To a Boltzmann Machine the whole thing is one system, and it generates the states of that system.
In a Boltzmann Machine, we feed our training data in as input to help the system adjust its weights. The result models our particular system, not such systems in general. It learns from the input what the possible connections between all these parameters are and how they influence each other, and therefore it becomes a machine that represents our system. The neurons in the network make stochastic decisions about whether to turn on or off, based on the data we feed during training and the cost function the Boltzmann Machine is trying to minimize. By doing so, the Boltzmann Machine discovers interesting features about the data, which help model the complex underlying relationships and patterns present in the data.
Because neurons are connected not only to neurons in other layers but also to neurons within the same layer, training an unrestricted Boltzmann machine is very inefficient, and the Boltzmann Machine had very little commercial success. Boltzmann Machines belong to the family of Energy-Based Models (EBMs), and the Restricted Boltzmann Machine (RBM) is a variant that is much easier to train. When RBMs are stacked on top of each other, they are known as Deep Belief Networks (DBN). Our focus of discussion here is the RBM.
Restricted Boltzmann Machines (RBM)
What makes RBMs different from Boltzmann machines is that visible nodes aren’t connected to each other, and hidden nodes aren’t connected to each other. Other than that, RBMs are exactly the same as Boltzmann machines.
It is a probabilistic, unsupervised, generative deep machine learning algorithm.
RBM’s objective is to find the joint probability distribution that maximizes the log-likelihood function.
RBM is undirected and has only two layers: the input (visible) layer and the hidden layer.
All visible nodes are connected to all the hidden nodes. Because an RBM has only these two layers and its connections are undirected, its structure is a bipartite graph with symmetric connections.
No intralayer connection exists between the visible nodes. There is also no intralayer connection between the hidden nodes. There are connections only between input and hidden nodes.
The original Boltzmann machine had connections between all the nodes. Since RBM restricts the intralayer connection, it is called a Restricted Boltzmann Machine.
Since RBMs are undirected and have no output layer, they don’t adjust their weights by backpropagating an output error. They adjust their weights through a process called contrastive divergence, which approximates the gradient of the log-likelihood. At the start of this process, the weights are randomly initialized and used to compute the hidden nodes from the visible nodes. These hidden nodes then use the same weights to reconstruct the visible nodes. The weights used to reconstruct the visible nodes are the same throughout. However, the reconstructed nodes are generally not the same as the original input, because the visible nodes aren’t connected to each other.
Simple Understanding of RBM
Problem Statement: Let’s take the example of a small café across the street where people come in the evening to hang out. Three people visit frequently: Geeta, Meeta and Paavit. They do not always show up together, so every combination is possible: any one of them alone, any two of them together, all three, or none of them on some days. All the possibilities are valid.
Let’s say you watch them every day and make a note of who comes. On the first day, Meeta and Geeta come and Paavit doesn’t. On the second day, Paavit comes but Geeta and Meeta don’t. After observing for 15 days, you find that only these two possibilities repeat, as represented in the table.
That’s an interesting finding, all the more so when we learn that these three people do not know each other at all. You also find out that there are two café managers: Ratish and Satish. Let’s tabulate it again, now with 5 people (3 visitors and 2 managers).
We find that Geeta and Meeta like Ratish, so they show up when Ratish is on duty. Paavit likes Satish, so he shows up only when Satish is on duty. Looking at the data, we might say that Geeta and Meeta went to the café on the days Ratish was on duty and Paavit went when Satish was on duty. Let’s add some weights.
Since the customers appear in our dataset, we call them the visible layer. The managers do not appear in the dataset, so we call them the hidden layer. This is an example of a Restricted Boltzmann Machine (RBM).
(… to be continued…)
Working of RBM
RBM is a stochastic neural network, which means that each neuron has some random behavior when activated. Besides the weights, an RBM has two sets of bias units (hidden bias and visible bias). This is what makes RBMs different from autoencoders. The hidden bias helps the RBM produce the activations on the forward pass, and the visible bias helps it reconstruct the input during the backward pass. The reconstructed input generally differs from the actual input, as there are no connections among the visible units and therefore no way of transferring information among them.
The above image shows the first step in training an RBM with multiple inputs. The inputs are multiplied by the weights and then added to the bias. The result is then passed through a sigmoid activation function, and the output determines whether the hidden state gets activated or not. The weights form a matrix with the number of input nodes as the number of rows and the number of hidden nodes as the number of columns. The first hidden node receives the dot product of the input vector with the first column of weights, after which the corresponding bias term is added.
Here is the formula of the sigmoid function shown in the picture:
$$S(x) = \frac{1}{1 + e^{-x}}$$
So the equation that we get in this step would be
$$h^{(1)} = S\left(W^{T} v^{(0)} + a\right)$$
where $h^{(1)}$ and $v^{(0)}$ are the corresponding vectors (column matrices) for the hidden and the visible layers, with the superscript as the iteration ($v^{(0)}$ means the input that we provide to the network), and $a$ is the hidden layer bias vector.
(Note that we are dealing with vectors and matrices here and not one-dimensional values.)
Now this image shows the reverse phase or the reconstruction phase. It is similar to the first pass but in the opposite direction. The equation comes out to be:
$$v^{(1)} = S\left(W h^{(1)} + b\right)$$
where $v^{(1)}$ and $h^{(1)}$ are the corresponding vectors (column matrices) for the visible and the hidden layers, with the superscript as the iteration, and $b$ is the visible layer bias vector.
Now, the difference $v^{(0)} - v^{(1)}$ can be considered as the reconstruction error that we need to reduce in subsequent steps of the training process. So the weights are adjusted in each iteration so as to minimize this error, and this is what the learning process essentially is.
In the forward pass, we are calculating the probability of the output $h^{(1)}$ given the input $v^{(0)}$ and the weights $W$, denoted by:
$$p\left(h^{(1)} \mid v^{(0)}; W\right)$$
And in the backward pass, while reconstructing the input, we are calculating the probability of the output $v^{(1)}$ given the input $h^{(1)}$ and the weights $W$, denoted by:
$$p\left(v^{(1)} \mid h^{(1)}; W\right)$$
The weights used in both the forward and the backward pass are the same. Together, these two conditional probabilities lead us to the joint distribution of inputs and the activations:
$$p(v, h)$$
Reconstruction is different from regression or classification in that it estimates the probability distribution of the original input instead of associating a continuous/discrete value to an input example. This means it is trying to guess multiple values at the same time. This is known as generative learning as opposed to discriminative learning that happens in a classification problem (mapping input to labels).
Let us try to see how the algorithm reduces loss or simply put, how it reduces the error at each step. Assume that we have two normal distributions, one from the input data (denoted by p(x)) and one from the reconstructed input approximation (denoted by q(x)). The difference between these two distributions is our error in the graphical sense and our goal is to minimize it, i.e., bring the graphs as close as possible. This idea is represented by a term called the Kullback–Leibler divergence.
KL-divergence measures the non-overlapping areas under the two graphs and the RBM’s optimization algorithm tries to minimize this difference by changing the weights so that the reconstruction closely resembles the input. The graphs on the right-hand side show the integration of the difference in the areas of the curves on the left.
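To make the idea concrete, the sketch below computes the KL divergence for small discrete distributions rather than the continuous curves described above. The probability values are made up for illustration, and the helper `kl_divergence` is a name introduced here.

```python
import math

def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions given as equal-length lists.

    Assumes q is non-zero wherever p is non-zero.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

p = [0.1, 0.4, 0.5]          # stands in for the input distribution
q_far = [0.8, 0.1, 0.1]      # a poor reconstruction
q_near = [0.15, 0.35, 0.5]   # a close reconstruction

print(kl_divergence(p, p))       # identical distributions
print(kl_divergence(p, q_far))
print(kl_divergence(p, q_near))
```

The divergence is zero when the two distributions match and grows as they drift apart, which is why driving it down makes the reconstruction resemble the input. It is also not symmetric: swapping `p` and `q` generally gives a different value.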
This gives us intuition about our error term. Now, to see how actually this is done for RBMs, we will have to dive into how the loss is being computed. All common training algorithms for RBMs approximate the log-likelihood gradient given some data and perform gradient ascent on these approximations.
Contrastive Divergence
Here is the pseudo-code for the CD algorithm:
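As a rough illustration, a single-step variant (often called CD-1) can be sketched in plain Python as follows. This is a minimal sketch under several assumptions, not the original pseudo-code: it uses binary units, updates after every training vector, and borrows the two café attendance patterns from the earlier example as data. The learning rate, the number of epochs and all function names are choices made here.

```python
import math
import random

def sigmoid(x):
    # numerically stable form, avoids overflow for large negative x
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)

def sample(probs, rng):
    """Turn activation probabilities into stochastic binary states."""
    return [1 if rng.random() < p else 0 for p in probs]

def hidden_probs(v, W, a):
    # p(h_j = 1 | v) = S(a_j + sum_i v_i * W[i][j])
    return [sigmoid(a[j] + sum(v[i] * W[i][j] for i in range(len(v))))
            for j in range(len(a))]

def visible_probs(h, W, b):
    # p(v_i = 1 | h) = S(b_i + sum_j W[i][j] * h_j)
    return [sigmoid(b[i] + sum(W[i][j] * h[j] for j in range(len(h))))
            for i in range(len(b))]

def cd1_update(v0, W, a, b, lr, rng):
    """One CD-1 step on a single binary vector v0; updates W, a, b in place."""
    ph0 = hidden_probs(v0, W, a)        # forward pass on the data
    h0 = sample(ph0, rng)
    pv1 = visible_probs(h0, W, b)       # reconstruction (backward pass)
    v1 = sample(pv1, rng)
    ph1 = hidden_probs(v1, W, a)        # forward pass on the reconstruction
    for i in range(len(v0)):
        for j in range(len(a)):
            W[i][j] += lr * (v0[i] * ph0[j] - v1[i] * ph1[j])
    for j in range(len(a)):
        a[j] += lr * (ph0[j] - ph1[j])
    for i in range(len(b)):
        b[i] += lr * (v0[i] - v1[i])
    return sum((x - y) ** 2 for x, y in zip(v0, pv1))  # squared reconstruction error

rng = random.Random(0)
n_visible, n_hidden = 3, 2              # 3 visitors, 2 managers
W = [[rng.gauss(0, 0.1) for _ in range(n_hidden)] for _ in range(n_visible)]
a = [0.0] * n_hidden                    # hidden layer bias
b = [0.0] * n_visible                   # visible layer bias
data = [[1, 1, 0], [0, 0, 1]] * 5       # [Geeta, Meeta, Paavit] attendance

for epoch in range(100):
    error = sum(cd1_update(v, W, a, b, 0.1, rng) for v in data)

print("reconstruction error in the last epoch:", round(error, 3))
```

The weight update is the difference between the correlations measured on the data and those measured on the reconstruction, which is the approximation to the log-likelihood gradient mentioned above. Practical implementations usually work on mini-batches with matrix operations and may run more than one sampling step.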
Applications:
* Pattern recognition: RBM is used for feature extraction in pattern recognition problems where the challenge is to understand handwritten text or a random pattern.
* Recommendation engines: RBM is widely used for collaborative filtering, where it predicts what should be recommended to the end user so that the user enjoys using a particular application or platform. For example: movie recommendation, book recommendation.
* Radar target recognition: Here, RBM is used to detect intra-pulse features in radar systems which have very low SNR and high noise.
Source: wikipedia (https://en.wikipedia.org/wiki/Restricted_Boltzmann_machine)
Cloud computing is a computing term or metaphor that evolved in the late 2000s, based on the utility-style consumption of computer resources. Cloud computing is about moving computing from the single desktop PC or data center to the Internet.
Cloud: The “Cloud” is the default symbol of the internet in diagrams
Computing: The broader term “Computing” encompasses computation, coordination logic and storage.
Let’s take an example. You wish to play the Ninja Fighters game with your friend on your smartphone. You go to the app store, download the app, log in, find your friend, and within five minutes you’re having fun. This ability to request services for yourself when you need them is known in cloud computing terms as on-demand self-service. You didn’t need to go to a physical store, you didn’t need to call someone to place an order, and you didn’t need to sit on hold or wait for anyone else to do anything for you.
Another example is Gmail. You don’t need to install any software, nor do you need hard disk space to save your emails. It’s all in the “cloud” managed by Google. In cloud computing, you don’t care what kind of software it is; all you care about is that the service offered is available and reliable.
As more users join the game, the cloud is able to quickly grow or shrink to meet the change in demand, which is called elasticity in techie terms. This is possible because a cloud provider, like IBM, has a massive number of servers pooled together that can be balanced between its various customers. But ultimately, you don’t care as long as it’s available for you.
1.1.1 Features of Cloud Computing
Figure 2: NIST Visual Model of Cloud Computing Definition
The generally accepted definition of Cloud Computing comes from the National Institute of Standards and Technology (NIST). The NIST definition runs to several hundred words but essentially says that:
Cloud computing is a model for enabling convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction.
Simply put, this means that end users can utilize parts of bulk resources and that these resources can be acquired quickly and easily. NIST also lists several characteristics that it sees as essential for a service to be considered “Cloud”. These characteristics include:
On-demand self-service: The ability for an end user to sign up and receive services without the long delays that have characterized traditional IT
Broad network access: Ability to access the service via standard platforms (desktop, laptop, mobile etc)
Resource pooling: Resources are pooled across multiple customers
Rapid elasticity: Capability can scale to cope with demand peaks
Measured Service: Billing is metered and delivered as a utility service
1.1.2 Types of Cloud
With cloud computing technology, large pools of resources can be connected through private or public networks.
What are the differences between these types of cloud computing, and how can you determine the right cloud path for your organization? Here are some fundamentals of each to help with the decision-making process.
Public
Public clouds are made available to the general public by a service provider who hosts the cloud infrastructure. Generally, public cloud providers like Amazon AWS, Microsoft and Google own and operate the infrastructure and offer access over the Internet. With this model, customers have no visibility or control over where the infrastructure is located. It is important to note that all customers on public clouds share the same infrastructure pool with limited configuration, security protections and availability variances.
Public Cloud customers benefit from economies of scale, because infrastructure costs are spread across all users, allowing each individual client to operate on a low-cost, “pay-as-you-go” model. Another advantage of public cloud infrastructures is that they are typically larger in scale than an in-house enterprise cloud, which provides clients with seamless, on-demand scalability. These clouds offer the greatest level of efficiency in shared resources; however, they are also more vulnerable than private clouds.
A public cloud is the obvious choice when:
Your standardized workload for applications is used by a lot of people, such as e-mail.
You need to test and develop application code.
You need incremental capacity i.e. the ability to add resources for peak times.
You’re doing collaboration projects.
Private
Private cloud is cloud infrastructure dedicated to a particular organization. Private clouds allow businesses to host applications in the cloud, while addressing concerns regarding data security and control, which is often lacking in a public cloud environment. It is not shared with other organizations, whether managed internally or by a third-party, and it can be hosted internally or externally.
There are two variations of private clouds:
On-Premise Private Cloud
This type of cloud is hosted within an organization’s own facility. A business’s IT department would incur the capital and operational costs for the physical resources with this model. On-Premise Private Clouds are best used for applications that require complete control and configurability of the infrastructure and security.
Externally Hosted
Externally hosted private clouds are also exclusively used by one organization, but are hosted by a third party specializing in cloud infrastructure. The service provider facilitates an exclusive cloud environment with full guarantee of privacy. This format is recommended for organizations that prefer not to use a public cloud infrastructure due to the risks associated with the sharing of physical resources.
Undertaking a private cloud project requires a significant level and degree of engagement to virtualize the business environment, and it will require the organization to reevaluate decisions about existing resources. Private clouds are more expensive but also more secure when compared to public clouds. An Info-Tech survey shows that 76% of IT decision-makers will focus exclusively on the private cloud, as these clouds offer the greatest level of security and control.
When to opt for a Private Cloud?
You need data sovereignty but want cloud efficiencies
You want consistency across services
You have more server capacity than your organization can use
Your data center must become more efficient
You want to provide private cloud services
Hybrid
Hybrid Clouds are a composition of two or more clouds (private, community or public) that remain unique entities but are bound together, offering the advantages of multiple deployment models. In a hybrid cloud, you can leverage third-party cloud providers in either a full or partial manner, increasing the flexibility of computing. Augmenting a traditional private cloud with the resources of a public cloud can be used to manage any unexpected surges in workload.
Hybrid cloud architecture requires both on-premise resources and off-site server based cloud infrastructure. By spreading things out over a hybrid cloud, you keep each aspect of your business in the most efficient environment possible. The downside is that you have to keep track of multiple cloud security platforms and ensure that all aspects of your business can communicate with each other.
Here are a couple of situations where a hybrid environment is best:
Your company wants to use a SaaS application but is concerned about security.
Your company offers services that are tailored for different vertical markets. You can use a public cloud to interact with the clients but keep their data secured within a private cloud.
You can provide public cloud to your customers while using a private cloud for internal IT.
Community
A community cloud is a multi-tenant cloud service model that is shared among several organizations and that is governed, managed and secured jointly by all the participating organizations or by a third-party managed service provider. Community clouds are a hybrid form of private clouds built and operated specifically for a targeted group. These communities have similar cloud requirements, and their ultimate goal is to work together to achieve their business objectives.
The goal of community clouds is to have participating organizations realize the benefits of a public cloud with the added level of privacy, security, and policy compliance usually associated with a private cloud. Community clouds can be either on-premise or off-premise.
Here are a couple of situations where a community cloud environment is best:
Government organizations within a state that need to share resources
A private HIPAA compliant cloud for a group of hospitals or clinics
Telco community cloud for telco DR to meet specific FCC regulations
Cloud computing is about shared IT infrastructure or the outsourcing of a company’s technology. It is essential to examine your current IT infrastructure, usage and needs to determine which type of cloud computing can help you best achieve your goals. Simply, the cloud is not one concrete term, but rather a metaphor for a global network and how to best utilize its advantages depends on your individual cloud focus.
1.1.3 Advantages & Disadvantages
Advantages of Cloud Computing
Cloud computing presents a huge opportunity for businesses. Let’s look at some of its advantages:
Cost Efficient
Cloud computing is probably the most cost-efficient approach to use, maintain, and upgrade. Traditional desktop software is expensive for companies: the licensing fees for multiple users add up quickly. The cloud is available at much cheaper rates and can significantly lower a company’s IT expenses. Besides, there are many one-time-payment, pay-as-you-go, and other scalable options available, which keeps costs reasonable.
Almost Unlimited Storage
Storing information in the cloud gives you almost unlimited storage capacity. Hence, you no longer need to worry about running out of storage space or increasing your current storage capacity.
Backup and Recovery
Since all your data is stored in the cloud, backing it up and restoring it is much easier than doing the same on a physical device. Furthermore, most cloud service providers are competent enough to handle recovery of information. This makes the entire process of backup and recovery much simpler than with traditional methods of data storage.
Automatic Software Integration
In the cloud, software integration is usually something that occurs automatically. This means that you do not need to take additional efforts to customize and integrate your applications as per your preferences. This aspect usually takes care of itself. You can also handpick just those services and software applications that you think will best suit your particular enterprise.
Easy Access to Information
Once you register yourself in the cloud, you can access your information from anywhere there is an Internet connection.
Quick Deployment
Cloud computing gives you the advantage of quick deployment. Once you opt for this method of functioning, your entire system can be fully functional in a matter of a few minutes, depending upon the exact kind of technology that you need for your business.
Disadvantages of Cloud Computing
Cloud computing also has some challenges, such as:
Technical Issues
Though it is true that information and data on the cloud can be accessed anytime and from anywhere at all, there are times when this system can have some serious dysfunction. Technology is always prone to outages and other technical issues. Even the best cloud service providers run into this kind of trouble, in spite of keeping up high standards of maintenance. Besides, you will need a very good Internet connection to be logged onto the server at all times. You will invariably be stuck in case of network and connectivity problems.
Security in the Cloud
The other major issue in the cloud is security. Before adopting this technology, you should know that you will be handing all your company’s sensitive information to a third-party cloud service provider. This could potentially put your company at great risk. Hence, you need to make absolutely sure that you choose the most reliable service provider, one who will keep your information secure.
Prone to Attack
Storing information in the cloud could make your company vulnerable to external attacks and threats. Nothing on the Internet is completely secure, so there is always a possibility that sensitive data will be stolen.
Python is a high-level programming language known for its simplicity. Guido van Rossum created Python in the late 1980s and released it in 1991. Python supports several programming paradigms, including procedural, object-oriented, and functional programming. It has a large standard library and an ecosystem of packages and frameworks with many practical applications, such as web development, data analysis, artificial intelligence (AI), scientific computing, and much more.
I. Introduction
Why learn Python?
There are several reasons to learn Python:
Ease of Learning: Python’s straightforward, clean syntax makes it accessible for beginners.
Versatility: It’s applicable in diverse domains like web development, data analysis, machine learning, artificial intelligence, scientific computing, etc.
Large Community and Libraries: Python has a massive community that contributes to its ecosystem by creating libraries and frameworks, allowing developers to accomplish tasks more efficiently.
Career Opportunities: Python is widely used across industries, and proficiency in Python opens up job opportunities in software development, data science, machine learning, and more.
High Demand: Due to its versatility and ease of use, Python developers are in high demand in the job market.
C. Brief history and popularity
History: Python was conceived in the late 1980s by Guido van Rossum, and its implementation began in December 1989. It was first released in 1991 as Python 0.9. Python 2.x and Python 3.x coexisted for some time, and Python 2.x was officially discontinued in 2020 in favor of Python 3.x.
Popularity: Python’s popularity has surged over the years due to its simplicity, readability, versatility, and an extensive community-driven ecosystem. It’s used by both beginners and experienced developers for various purposes, contributing to its widespread adoption across industries. Its popularity is evident in fields like web development (Django, Flask), data science (Pandas, NumPy), machine learning (TensorFlow, PyTorch), and more.
II. Setting Up Python
A. Installing Python:
Download Python: Visit the official Python website at python.org, navigate to the Downloads section, and select the version of Python suitable for your operating system (Windows, macOS, or Linux).
Install Python: Run the installer and follow the installation instructions. Make sure to check the box that says “Add Python to PATH” during installation on Windows. This makes it easier to run Python from the command line.
B. Using Integrated Development Environments (IDEs) or Text Editors:
IDEs: Integrated Development Environments like PyCharm, VSCode with Python extensions, Jupyter Notebook, or Spyder provide an all-in-one solution with features like code highlighting, debugging tools, and project management. Install an IDE of your choice by downloading it from the respective website and follow the setup instructions.
Text Editors: Text editors like Sublime Text, Atom, or Notepad++ are simpler compared to IDEs but still support Python development. You write code and execute it separately. After installing a text editor, create a new file and save it with a .py extension (e.g., hello.py) to write Python code.
C. Running the First Python Program (Hello, World!):
Using IDEs:
Open your IDE.
Create a new Python file.
Type the following code:
python
print("Hello, World!")
Save the file.
Run the code using the “Run” or “Execute” button in the IDE. You should see “Hello, World!” printed in the output console.
Using Text Editors:
Open your chosen text editor.
Create a new file and type:
print("Hello, World!")
Save the file with a .py extension (e.g., hello.py).
Open a command line or terminal.
Navigate to the directory where your Python file is saved using cd (change directory) command.
Type python hello.py (replace hello.py with your file name) and press Enter.
You should see “Hello, World!” printed in the terminal.
Congratulations! You’ve successfully installed Python, chosen an environment to write code (IDE or text editor), and executed your first Python program displaying “Hello, World!”
III. Basics of Python Programming
A. Syntax and Indentation:
Syntax: Python’s syntax is clear and readable. It uses indentation to define blocks of code instead of using curly braces {} or keywords like end in other languages. Proper indentation (usually four spaces) is crucial for Python to understand the code structure correctly.
Example:
if 5 > 2:
    print("Five is greater than two")
B. Variables and Data Types:
Variables: In Python, variables are used to store data. They can be assigned different data types and values during the program’s execution.
Data Types: Python has several data types:
Integers (int): Whole numbers without decimals.
Floats (float): Numbers with decimals.
Strings (str): Ordered sequences of characters enclosed in single (' ') or double (" ") quotes.
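As a quick illustration of the data types listed above, the following sketch assigns one value of each type and inspects it with the built-in type() function. The variable names and values are placeholders chosen for this example, and the conversion functions int(), float(), and str() are shown only as a common way to move between these types.

```python
# Illustrative values; the variable names are placeholders for this example
age = 25              # int: whole number
height = 1.75         # float: number with a decimal point
name = "Alice"        # str: sequence of characters

print(type(age))      # <class 'int'>
print(type(height))   # <class 'float'>
print(type(name))     # <class 'str'>

# A variable can be rebound to a value of a different type
age = "twenty-five"
print(type(age))      # <class 'str'>

# Explicit conversion between types
count = int("42")       # str -> int
price = float("9.99")   # str -> float
label = str(100)        # int -> str
print(count + 1, price, label)
```

Because Python is dynamically typed, the type belongs to the value rather than to the variable name, which is why age can hold an integer first and a string later.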
C. Operators:
Arithmetic Operators: Used for basic mathematical operations such as addition, subtraction, multiplication, division, etc.
python
# Examples of arithmetic operators
a = 10
b = 5
print(a + b)   # Addition
print(a - b)   # Subtraction
print(a * b)   # Multiplication
print(a / b)   # Division
print(a % b)   # Modulus (remainder)
print(a ** b)  # Exponentiation
Comparison Operators: Used to compare values and return True or False.
python
# Examples of comparison operators
x = 10
y = 5
print(x == y)  # Equal to
print(x != y)  # Not equal to
print(x > y)   # Greater than
print(x < y)   # Less than
print(x >= y)  # Greater than or equal to
print(x <= y)  # Less than or equal to
Logical Operators: Used to combine conditional statements.
python
# Examples of logical operators
p = True
q = False
print(p and q)  # Logical AND
print(p or q)   # Logical OR
print(not p)    # Logical NOT
D. Control Structures:
Conditionals (if, elif, else): Used to make decisions in the code based on certain conditions.
python
# Example of conditional statements
age = 18
if age >= 18:
    print("You are an adult")
elif age >= 13:
    print("You are a teenager")
else:
    print("You are a child")
Loops (for, while): Used for iterating over a sequence (for loop) or executing a block of code while a condition is True (while loop).
python
# Example of loops
# For loop
for i in range(5):
    print(i)

# While loop
count = 0
while count < 5:
    print(count)
    count += 1
IV. Data Structures in Python
A. Lists:
Definition: Lists are ordered collections of items or elements in Python. They are mutable, meaning the elements within a list can be changed or modified after the list is created.
Syntax: Lists are created by enclosing elements within square brackets [], separated by commas.
Example:
python
# Creating a list
my_list = [1, 2, 3, 4, 5]
B. Tuples:
Definition: Tuples are similar to lists but are immutable, meaning the elements cannot be changed once the tuple is created.
Syntax: Tuples are created by enclosing elements within parentheses (), separated by commas.
Example:
python
# Creating a tuple
my_tuple = (1, 2, 3, 4, 5)
C. Dictionaries:
Definition: Dictionaries are collections of key-value pairs. They are mutable and indexed by unique keys, and since Python 3.7 they preserve the order in which keys were inserted. Each key is associated with a value, similar to a real-life dictionary where words (keys) have definitions (values).
Syntax: Dictionaries are created by enclosing key-value pairs within curly braces {}, separated by commas and using a colon : to separate keys and values.
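A dictionary example in the same style as the other data structures might look like the following sketch. The keys and values are made up for illustration.

```python
# Creating a dictionary (keys and values are illustrative)
my_dict = {"name": "Alice", "age": 25, "city": "Paris"}

# Looking up a value by its key
print(my_dict["name"])        # Alice

# Adding a new pair and updating an existing one
my_dict["email"] = "alice@example.com"
my_dict["age"] = 26

# get() returns a default instead of raising KeyError for a missing key
print(my_dict.get("phone", "not available"))   # not available

# Iterating over key-value pairs
for key, value in my_dict.items():
    print(key, "->", value)
```

Using get() with a default is a common choice when a key may be absent; indexing with square brackets is the better fit when a missing key should be treated as an error.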
D. Sets:
Definition: Sets are unordered collections of unique elements. They do not allow duplicate elements.
Syntax: Sets are created by enclosing elements within curly braces {}, separated by commas.
Example:
python
# Creating a set
my_set = {1, 2, 3, 4, 5}
Key Points:
Lists and tuples are ordered collections, but lists are mutable while tuples are immutable.
Dictionaries use key-value pairs to store data, allowing quick retrieval of values using their associated keys.
Sets are unordered collections of unique elements; they are useful for mathematical set operations like union, intersection, etc., and do not allow duplicate elements.
These data structures provide flexibility in storing and manipulating data in Python, each with its own characteristics and best-use cases. Understanding how to use them effectively can greatly enhance your ability to work with data in Python programs.
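The key points above can be checked with a short sketch. It shows a list being modified, a tuple rejecting the same modification, and the set operations mentioned earlier. All values are illustrative.

```python
# Lists are mutable: an element can be replaced in place
my_list = [1, 2, 3]
my_list[0] = 10
print(my_list)                  # [10, 2, 3]

# Tuples are immutable: the same assignment raises TypeError
my_tuple = (1, 2, 3)
try:
    my_tuple[0] = 10
except TypeError as e:
    print("Cannot modify a tuple:", e)

# Sets support mathematical set operations
a = {1, 2, 3, 4}
b = {3, 4, 5}
union = a | b            # elements in either set
intersection = a & b     # elements in both sets
difference = a - b       # elements in a but not in b
print(union, intersection, difference)

# Converting a list to a set removes duplicates
unique = set([1, 2, 2, 3, 3, 3])
print(unique)
```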
V. Functions and Modules
A. Defining Functions:
Definition: Functions in Python are blocks of reusable code designed to perform a specific task. They improve code modularity and reusability.
Syntax: Functions are defined using the def keyword, followed by the function name and parentheses containing optional parameters. The block of code inside the function is indented.
Example:
python
# Defining a function
def greet():
    print("Hello, welcome!")
B. Passing Arguments and Returning Values:
Arguments: Functions can accept parameters (arguments) to perform their tasks dynamically.
Positional Arguments: Defined based on the order they are passed.
Keyword Arguments: Defined by specifying the parameter name when calling the function.
Return Values: Functions can return values using the return statement.
Example:
python
# Function with arguments and return value
def add(a, b):
    return a + b

result = add(3, 5)         # Passing arguments
print("Result:", result)   # Output: Result: 8
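To make the difference between positional and keyword arguments concrete, here is a small sketch. The function describe_pet and its parameters are hypothetical names used only for this illustration.

```python
def describe_pet(name, animal):
    """Return a short sentence describing a pet."""
    return f"{name} is a {animal}"

# Positional arguments: matched to parameters by their order
print(describe_pet("Rex", "dog"))              # Rex is a dog

# Keyword arguments: matched by parameter name, so the order does not matter
print(describe_pet(animal="cat", name="Tom"))  # Tom is a cat
```

Keyword arguments tend to make calls easier to read when a function takes several parameters of the same type.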
C. Working with Modules and Libraries:
Modules: Python modules are files containing Python code, which can define functions, classes, and variables. They can be imported into other Python scripts to reuse the code.
Libraries: Libraries are collections of modules that provide pre-written functionalities to ease development tasks.
Importing Modules/Libraries: Use the import keyword to import modules and libraries in your Python script.
Example:
python
# Importing a module
import math  # Importing the math module

# Using functions from the imported module
print(math.sqrt(16))  # Output: 4.0 (square root function from math module)
Creating and Using Your Own Modules: You can create your own modules by writing Python code in a separate file and importing it into your script.
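The following sketch shows one way this can work. The module name greetings and the function say_hello are hypothetical. The module file is written to a temporary directory only so that the example is self-contained; in practice you would simply save greetings.py next to your script and import it directly.

```python
import sys
import tempfile
from pathlib import Path

# Write a small module file (normally you would create it in your editor)
module_dir = tempfile.mkdtemp()
module_path = Path(module_dir) / "greetings.py"
module_path.write_text(
    "def say_hello(name):\n"
    "    return f'Hello, {name}!'\n"
)

# Make the directory importable; not needed when the file sits next to your script
sys.path.insert(0, module_dir)

# Import the module by its file name without the .py extension
import greetings

message = greetings.say_hello("Alice")
print(message)   # Hello, Alice!
```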
VI. File Handling in Python
A. Reading from and Writing to Files:
Reading from Files (open() and read()):
To read from a file, you can use the open() function in Python, which opens a file and returns a file object. The read() method is used to read the contents of the file.
Syntax for Reading:
python
# Reading from a file
file = open('file.txt', 'r')  # Opens the file in read mode ('r')
content = file.read()         # Reads the entire file content
print(content)
file.close()                  # Close the file after reading
Writing to Files (open() and write()):
To write to a file, open it with the appropriate mode (‘w’ for write, ‘a’ for append). The write() method is used to write content to the file.
Syntax for Writing:
python
# Writing to a file
file = open('file.txt', 'w')    # Opens the file in write mode ('w')
file.write('Hello, World!\n')   # Writes content to the file
file.close()                    # Close the file after writing
B. File Modes and Operations:
File Modes:
Read Mode (‘r’): Opens a file for reading. Raises an error if the file does not exist.
Write Mode (‘w’): Opens a file for writing. Creates a new file if it doesn’t exist or truncates the file if it exists.
Append Mode (‘a’): Opens a file for appending new content. Creates a new file if it doesn’t exist.
Read and Write Mode (‘r+’): Opens a file for both reading and writing.
Binary Mode (‘b’): Used in conjunction with other modes (e.g., ‘rb’, ‘wb’) to handle binary files.
File Operations:
read(): Reads the entire content of the file, or up to a specified number of characters (bytes in binary mode).
readline(): Reads a single line from the file.
readlines(): Reads all the lines of a file and returns a list.
write(): Writes content to the file.
close(): Closes the file when finished with file operations.
Using with Statement (Context Manager):
The with statement in Python is used to automatically close the file when the block of code is exited. It’s a good practice to use it to ensure proper file handling.
Syntax:
python
with open('file.txt', 'r') as file:
    content = file.read()
    print(content)
# File is automatically closed outside the 'with' block
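The modes and methods listed in this section can be combined as in the sketch below, which writes a file, appends to it, and reads it back line by line. The file name notes.txt is a placeholder, and a temporary directory is used so the example does not touch your own files.

```python
import os
import tempfile

# Temporary location for the example file (the name is a placeholder)
path = os.path.join(tempfile.mkdtemp(), "notes.txt")

# 'w' creates the file, or truncates it if it already exists
with open(path, "w") as f:
    f.write("first line\n")
    f.write("second line\n")

# 'a' appends to the end without removing existing content
with open(path, "a") as f:
    f.write("third line\n")

with open(path, "r") as f:
    first = f.readline()    # one line, including its trailing newline
    rest = f.readlines()    # the remaining lines as a list

print(repr(first))   # 'first line\n'
print(rest)          # ['second line\n', 'third line\n']
```

Note that readline() and readlines() continue from the current position in the file, which is why rest holds only the lines after the first.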
VII. Object-Oriented Programming (OOP) Basics
A. Classes and Objects:
Classes:
Classes are blueprints for creating objects in Python. They encapsulate data (attributes) and behaviors (methods) into a single unit.
Syntax for Class Declaration:
python
# Class declaration
class MyClass:
    # Class constructor (initializer)
    def __init__(self, attribute1, attribute2):
        self.attribute1 = attribute1
        self.attribute2 = attribute2

    # Instance method
    def my_method(self):
        return "This is a method in MyClass"
Objects:
Objects are instances of classes. They represent real-world entities and have attributes and behaviors defined by the class.
Creating Objects from a Class:
python
# Creating an object of MyClass
obj = MyClass("value1", "value2")
B. Inheritance and Polymorphism:
Inheritance:
Inheritance allows a class (subclass/child class) to inherit attributes and methods from another class (superclass/parent class).
Syntax for Inheritance:
python
# Parent class
class Animal:
    def sound(self):
        return "Some sound"

# Child class inheriting from Animal
class Dog(Animal):
    def sound(self):  # Overriding the method
        return "Woof!"
Polymorphism:
Polymorphism allows objects of different classes to be treated as objects of a common superclass. It enables the same method name to behave differently for each class.
Example of Polymorphism:
python
# Polymorphism example
def animal_sound(animal):
    return animal.sound()  # Same method name, different behaviors

# Creating instances of classes
animal1 = Animal()
dog = Dog()

# Calling the function with different objects
print(animal_sound(animal1))  # Output: Some sound
print(animal_sound(dog))      # Output: Woof!
VIII. Error Handling (Exceptions)
A. Understanding Exceptions:
What are Exceptions?
Exceptions are errors that occur during the execution of a program, disrupting the normal flow of the code.
Examples include dividing by zero, trying to access an undefined variable, or attempting to open a non-existent file.
Types of Exceptions:
Python has built-in exception types that represent different errors that can occur during program execution, like ZeroDivisionError, NameError, FileNotFoundError, etc.
B. Using Try-Except Blocks:
Handling Exceptions with Try-Except Blocks:
Try-except blocks in Python provide a way to handle exceptions gracefully, preventing the program from crashing when errors occur.
Syntax:
python
try:
    # Code that might raise an exception
    result = 10 / 0  # Example: Division by zero
except ZeroDivisionError as e:  # Name the exception type you expect here
    # Code to handle the exception
    print("An exception occurred:", e)
Handling Specific Exceptions:
You can catch specific exceptions by specifying the exception type after the except keyword.
Example:
python
try:
    file = open('nonexistent_file.txt', 'r')
except FileNotFoundError as e:
    print("File not found:", e)
Using Multiple Except Blocks:
You can use multiple except blocks to handle different types of exceptions separately.
Example:
python
try:
    result = 10 / 0
except ZeroDivisionError as e:
    print("Division by zero error:", e)
except Exception as e:
    print("An exception occurred:", e)
Handling Exceptions with Else and Finally:
The else block runs if no exceptions are raised in the try block, while the finally block always runs, whether an exception is raised or not.
Example:
python
try:
    result = 10 / 2
except ZeroDivisionError as e:
    print("Division by zero error:", e)
else:
    print("No exceptions occurred!")
finally:
    print("Finally block always executes")
IX. Introduction to Python Libraries
A. Overview of Popular Libraries:
NumPy:
Description: NumPy is a fundamental package for scientific computing in Python. It provides support for arrays, matrices, and mathematical functions to operate on these data structures efficiently.
Key Features:
Multi-dimensional arrays and matrices.
Mathematical functions for array manipulation.
Linear algebra, Fourier transforms, and random number capabilities.
Example:
python
import numpy as np

# Creating a NumPy array
arr = np.array([1, 2, 3, 4, 5])
Pandas:
Description: Pandas is a powerful library for data manipulation and analysis. It provides data structures like Series and DataFrame, making it easy to handle structured data.
Key Features:
Data manipulation tools for reading, writing, and analyzing data.
Data alignment, indexing, and handling missing data.
Time-series functionality.
Example:
python
import pandas as pd

# Creating a DataFrame
data = {'Name': ['Alice', 'Bob', 'Charlie'], 'Age': [25, 30, 35]}
df = pd.DataFrame(data)
Matplotlib:
Description: Matplotlib is a comprehensive library for creating static, interactive, and animated visualizations in Python. It provides functionalities to visualize data in various formats.
Key Features:
Plotting 2D and 3D graphs, histograms, scatter plots, etc.
Customizable visualizations.
Integration with Jupyter Notebook for interactive plotting.
Example:
python
import matplotlib.pyplot as plt

# Plotting a simple line graph
x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]
plt.plot(x, y)
plt.xlabel('X-axis')
plt.ylabel('Y-axis')
plt.title('Simple Line Graph')
plt.show()
B. Installing and Importing Libraries:
Installing Libraries using pip:
Open a terminal or command prompt and use the following command to install libraries:
pip install numpy pandas matplotlib
Importing Libraries in Python:
Once installed, import the libraries in your Python script using import statements:
python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
After importing, you can use the functionalities provided by these libraries in your Python code.
X. Real-life Examples and Projects
A. Simple Projects for Practice:
To-Do List Application:
Create a command-line to-do list application that allows users to add tasks, mark them as completed, delete tasks, and display the list.
Temperature Converter:
Build a program that converts temperatures between Celsius and Fahrenheit or other temperature scales.
Web Scraper:
Develop a web scraper that extracts information from a website and stores it in a structured format like a CSV file.
Simple Calculator:
Create a basic calculator that performs arithmetic operations such as addition, subtraction, multiplication, and division.
Hangman Game:
Implement a command-line version of the Hangman game where players guess letters to reveal a hidden word.
Address Book:
Develop an address book application that stores contacts with details like name, phone number, and email address.
File Organizer:
Write a script that organizes files in a directory based on their file extensions or other criteria.
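As a starting point for one of these projects, the temperature converter could be sketched as follows. The function names are hypothetical, and a full version would also read input from the user and handle other scales.

```python
def celsius_to_fahrenheit(celsius):
    """Convert a temperature from Celsius to Fahrenheit."""
    return celsius * 9 / 5 + 32

def fahrenheit_to_celsius(fahrenheit):
    """Convert a temperature from Fahrenheit to Celsius."""
    return (fahrenheit - 32) * 5 / 9

print(celsius_to_fahrenheit(100))   # 212.0
print(fahrenheit_to_celsius(32))    # 0.0
```

Keeping each conversion in its own function makes it straightforward to add a command-line menu or further scales later.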
B. Exploring Python’s Applications in Different Fields:
Web Development (Django, Flask):
Python is widely used for web development. Explore frameworks like Django or Flask to build web applications, REST APIs, or dynamic websites.
Data Science and Machine Learning:
Use libraries like NumPy, Pandas, Scikit-learn, or TensorFlow to perform data analysis, create machine learning models, or work on predictive analytics projects.
Scientific Computing:
Python is used extensively in scientific computing for simulations, modeling, and solving complex mathematical problems. Use libraries like SciPy or SymPy for scientific computations.
Natural Language Processing (NLP):
Explore NLP with Python using libraries like NLTK or spaCy for text processing, sentiment analysis, or language translation tasks.
Game Development:
Develop simple games using Python libraries like Pygame, allowing you to create 2D games and learn game development concepts.
Automation and Scripting:
Create scripts to automate repetitive tasks like file manipulation, data processing, or system administration using Python’s scripting capabilities.
IoT (Internet of Things) and Raspberry Pi Projects:
Experiment with Python for IoT projects by controlling sensors, actuators, or devices using Raspberry Pi and Python libraries like GPIO Zero.
XI. Conclusion
A. Recap of Key Points:
Python Basics: Python is a high-level, versatile programming language known for its simplicity, readability, and vast ecosystem of libraries and frameworks.
Core Concepts: Understanding Python’s syntax, data types, control structures, functions, and handling exceptions is crucial for effective programming.
Popular Libraries: Libraries like NumPy, Pandas, Matplotlib, etc., offer specialized functionalities for data manipulation, scientific computing, visualization, and more.
Project Ideas: Simple projects, such as to-do lists, calculators, web scrapers, etc., provide practical experience and reinforce learning.
Real-world Applications: Python’s applications span diverse fields like web development, data science, machine learning, scientific computing, automation, IoT, and more.
B. Encouragement for Further Exploration:
Continuous Learning: Python’s versatility and vast ecosystem offer endless opportunities for learning and growth.
Practice and Projects: Build upon your knowledge by working on more complex projects, contributing to open-source, and experimenting with different libraries and domains.
Community Engagement: Engage with the Python community through forums, meetups, conferences, and online platforms to learn, share experiences, and collaborate.
Stay Curious: Python evolves continuously, and exploring new libraries, updates, or trends keeps your skills up-to-date and opens doors to new possibilities.
Persistence: Embrace challenges as learning opportunities. Persistence and dedication in learning Python will yield rewarding results in the long run.
C. Final Thoughts:
Python is an exceptional programming language renowned for its simplicity, readability, and versatility. Its applications span across numerous fields, from web development to scientific computing, data analysis, machine learning, and beyond. Whether you’re a beginner starting your programming journey or an experienced developer seeking new avenues, Python offers a rich ecosystem and a supportive community to aid your exploration and growth.
# NUMPY
import numpy as np

# to give your own set of values, you need to provide in terms of list
l1 = [[1,5,7],[2,4,9],[1,1,3],[3,3,2]]
# array is a function to convert list into numpy
mat1 = np.array(l1)  # 4 * 3 - shape
print(mat1)

l2 = [[2,3,4],[2,1,2],[5,2,3],[3,2,2]]
# array is a function to convert list into numpy
mat2 = np.array(l2)
print(mat2)
# actual matrix multiplication is done using matmul()
l3 = [[2,3,4],[2,1,2],[5,2,3]]
# array is a function to convert list into numpy
mat3 = np.array(l3)
print(mat3)
print("Matrix Multiplication")
print(np.matmul(mat1, mat3))
print(mat1 @ mat3)

## calculating determinant
l4 = [[1,3,5],[1,3,1],[2,3,4]]
mat5 = np.array(l4)
det_mat5 = np.linalg.det(mat5)
print("Determinant of matrix 5 is", det_mat5)
print("Inverse of matrix 5 is: \n", np.linalg.inv(mat5))
'''
Linear Algebra Equation:
x1 + 5*x2 = 7
-2*x1 - 7*x2 = -5
Solution: x1 = -8, x2 = 3
'''
coeff_mat = np.array([[1,5],[-2,-7]])
# var_mat = np.array([[x1],[x2]])
result_mat = np.array([[7],[-5]])
# equation here is coeff_mat * var_mat = result_mat [eg: 5 * x = 10]
# which is, var_mat = coeff_mat inv * result_mat
det_coeff_mat = np.linalg.det(coeff_mat)
if det_coeff_mat != 0:
    var_mat = np.linalg.inv(coeff_mat) @ result_mat
    print("X1 = ", var_mat[0,0])
    print("X2 = ", var_mat[1,0])
else:
    print("Solution is not possible")
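To see what the inverse-based solution is doing for a 2x2 system, the sketch below writes the inverse out by hand using the formula inverse = 1/det * [[d, -b], [-c, a]]. It uses only the standard library, with Fraction for exact arithmetic. The function name solve_2x2 is hypothetical, and this is an illustration of the arithmetic rather than a description of how NumPy computes the inverse internally.

```python
from fractions import Fraction

def solve_2x2(a, b, c, d, r1, r2):
    """Solve [[a, b], [c, d]] @ [x1, x2] = [r1, r2] using the explicit 2x2 inverse."""
    det = a * d - b * c
    if det == 0:
        return None  # singular matrix: no unique solution
    # inverse = 1/det * [[d, -b], [-c, a]], multiplied by the right-hand side
    x1 = Fraction(d * r1 - b * r2, det)
    x2 = Fraction(-c * r1 + a * r2, det)
    return x1, x2

# Same system as above: x1 + 5*x2 = 7 and -2*x1 - 7*x2 = -5
solution = solve_2x2(1, 5, -2, -7, 7, -5)
if solution is None:
    print("Solution is not possible")
else:
    x1, x2 = solution
    print("X1 =", x1)   # X1 = -8
    print("X2 =", x2)   # X2 = 3
```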
# scipy = scientific python
# pip install scipy
'''
# Inequality = OPTIMIZATION or MAXIMIZATION / MINIMIZATION PROBLEM
Computer Parts Assembly: Laptops & Desktops
profit: 1000, 600
objective: either maximize profit or minimize cost
constraints:
1. Demand: 500, 600
2. Parts: Memory card: 5000 cards available
3. Manpower: 25000 minutes
'''
'''
Optimization using Scipy
let's assume d = desktop, n = notebooks
Constraints:
1. d + n <= 10000
2. 2d + n <= 15000
3. 3d + 4n <= 25000
profit: 1000 d + 750 n => maximize
-1000d - 750 n => minimize
'''
import numpy as np
from scipy.optimize import minimize, linprog

d = 1
n = 1
profit_d = 1000
profit_n = 750
profit = d * profit_d + n * profit_n
obj = [-profit_d, -profit_n]
lhs_con = [[1,1],[2,1],[3,4]]
rhs_con = [10000, 15000, 25000]
boundary = [(0, float("inf")),  # boundary condition for # of desktops
            (10, 200000)]       # we just added some limit for notebooks
# "revised simplex" was removed in SciPy 1.11; "highs" is the current default solver
opt = linprog(c=obj, A_ub=lhs_con, b_ub=rhs_con, bounds=boundary, method="highs")
print(opt)
if opt.success:
    print(f"Number of desktops = {opt.x[0]} and number of laptops = {opt.x[1]}")
    print("Maximum profit that can be generated = ", -1 * opt.fun)
else:
    print("Solution can not be generated")
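For a problem with only two variables, the result can be sanity-checked without SciPy by enumerating the corner points of the feasible region, since the optimum of a bounded linear program lies at a corner. The sketch below does this with the standard library only. This is a different technique from the solver used by linprog, and the helper names (intersection, feasible, profit) are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

# Every constraint written as a*d + b*n <= c, with the variable bounds included
constraints = [
    (1, 1, 10000),     # d + n <= 10000
    (2, 1, 15000),     # 2d + n <= 15000
    (3, 4, 25000),     # 3d + 4n <= 25000
    (-1, 0, 0),        # d >= 0
    (0, -1, -10),      # n >= 10
    (0, 1, 200000),    # n <= 200000
]

def intersection(c1, c2):
    """Point where two constraint boundaries cross, or None if they are parallel."""
    a1, b1, r1 = c1
    a2, b2, r2 = c2
    det = a1 * b2 - a2 * b1
    if det == 0:
        return None
    d = Fraction(r1 * b2 - r2 * b1, det)
    n = Fraction(a1 * r2 - a2 * r1, det)
    return d, n

def feasible(point):
    """True if the point satisfies every constraint."""
    d, n = point
    return all(a * d + b * n <= c for a, b, c in constraints)

def profit(point):
    d, n = point
    return 1000 * d + 750 * n

# Collect all feasible corner points and keep the most profitable one
corners = []
for c1, c2 in combinations(constraints, 2):
    p = intersection(c1, c2)
    if p is not None and feasible(p):
        corners.append(p)

best = max(corners, key=profit)
print(f"desktops = {int(best[0])}, notebooks = {int(best[1])}, profit = {int(profit(best))}")
```

Corner enumeration grows quickly with the number of variables and constraints, so it is useful here only as a cross-check on a small problem.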
df2.iloc[2,0] = 14000
print(df2)
print("========= DF1 =============")
df1['Avg'] = df1['Runs'] / df1["Wickets"]
print(df1)
print("Reading data from DF1: ")
df4 = df1[df1.Player != 'Sachin']  # filter where clause
print("\n\n New dataset without Sachin: \n", df4)
df1 = df1.drop("Player", axis=1)  # axis default is 0
# unlike pop() and del - drop() returns a new dataframe
print(df1)
print("Average Wickets of all the players = ", df1['Wickets'].mean())
print("Average Wickets of players by type = \n\n", df1.groupby('Type').mean())
# axis = 0 refers to rows
# axis = 1 refers to columns
print("\n\nDropping columns from DF1: ")
del df1['Wickets']  # dropping column Wickets using del
print(df1)
df1.pop('Runs')  # dropping column using pop
print(df1)
## Working with Pandas - Example ##
import pandas as pd
import numpy as np

df = pd.read_csv("D:/datasets/gitdataset/hotel_bookings.csv")
print(df.shape)
print(df.dtypes)
'''
numeric - int, float
categorical - 1) Nominal - there is no order
              2) Ordinal - here order is imp
'''
df_numeric = df.select_dtypes(include=[np.number])
print(df_numeric)
df_object = df.select_dtypes(exclude=[np.number])
print(df_object)  # categorical and date columns
print(df.columns)
for col in df.columns:
    missing = np.mean(df[col].isnull())
    if missing > 0:
        print(f"{col} - {missing}")
'''
Phases:
1. Business objective
2. Collect the relevant data
3. Preprocessing - making data ready for use
   a. Handle missing values
   b. Feature scaling - scale the values in the column to similar range
   c. Outliers / data correction
   d. Handling categorical data:
      i. Encode the data to convert text to number
         East = 0, North = 1, South = 2, West = 3
      ii. Column Transform into multiple columns
      iii. Delete any one column
4. EDA - Exploratory Data Analysis: to understand the data
5. MODEL BUILDING - Divide the train and test
'''
import pandas as pd
df = pd.read_csv("https://raw.githubusercontent.com/swapnilsaurav/MachineLearning/master/1_Data_PreProcessing.csv")
print(df)
Phases:
1. Business objective
2. Collect the relevant data
3. Preprocessing - making data ready for use
   a. Handle missing values
   b. Feature scaling - scale the values in the column to similar range
   c. Outliers / data correction
   d. Handling categorical data:
      i. Encode the data to convert text to number
         East = 0, North = 1, South = 2, West = 3
      ii. Column Transform into multiple columns
      iii. Delete any one column
4. EDA - Exploratory Data Analysis: to understand the data
5. MODEL BUILDING
   a. Divide the train and test
   b. Run the model
6. EVALUATE THE MODEL:
   a. Measure the performance of each algorithm on the test data
   b. Metric to compare: based on Regression (MSE, RMSE, R square) or classification (confusion matrix - accuracy, sensitivity..)
   c. Select the best performing model
7. DEPLOY THE BEST PERFORMING MODEL

Hypothesis test:
1. Null Hypothesis (H0): starting statement (objective)
   Alternative Hypothesis (H1): alternative to H0
   Z or T test:
   Chi square test: both are categorical
   e.g. North zone: 50 WIN 5 LOSS - p = 0.005
# simple (single value) vs composite (specifies range)
# two tailed test vs one tailed test [H0: mean = 0, H1 Left Tailed: mean < 0, H1 Right Tailed: mean > 0]
# level of significance: alpha value: confidence interval - 95%
p value: p value < 0.05 - we reject Null Hypothesis
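The decision rule in these notes (reject the null hypothesis when the p value is below 0.05) can be illustrated with a two-tailed one-sample Z test using only the standard library. The sample values, the hypothesized mean mu0, and the known standard deviation sigma are all made up for this sketch.

```python
from math import sqrt
from statistics import NormalDist, mean

# Hypothetical sample and parameters, chosen only for illustration
sample = [52, 48, 55, 51, 49, 53, 54, 50, 56, 52]
mu0 = 50        # H0: population mean = 50, H1: population mean != 50
sigma = 3       # population standard deviation, assumed known for a Z test
alpha = 0.05    # level of significance

# Test statistic: distance of the sample mean from mu0 in standard errors
z = (mean(sample) - mu0) / (sigma / sqrt(len(sample)))

# Two-tailed p value from the standard normal distribution
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

if p_value < alpha:
    print(f"z = {z:.3f}, p = {p_value:.4f}: reject the null hypothesis")
else:
    print(f"z = {z:.3f}, p = {p_value:.4f}: fail to reject the null hypothesis")
```

For a one-tailed test the p value would use only one side of the distribution, and when the population standard deviation is unknown a T test is the usual choice instead.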
'''
Regression: Output (Marks) is a continuous variable
Algorithm: Simple (as it has only 1 X column) Linear (assuming that dataset is linear) Regression
X - independent variable(s)
Y - dependent variable
'''
import pandas as pd
import matplotlib.pyplot as plt

link = "https://raw.githubusercontent.com/swapnilsaurav/MachineLearning/master/2_Marks_Data.csv"
df = pd.read_csv(link)
X = df.iloc[:,:1].values
y = df.iloc[:,1].values
'''
# 1. Replace the missing values with mean value
from sklearn.impute import SimpleImputer
import numpy as np
imputer = SimpleImputer(missing_values=np.nan, strategy='mean')
imputer = imputer.fit(X[:,1:3])
X[:,1:3] = imputer.transform(X[:,1:3])
#print(X)

# 2. Handling categorical values
# encoding
from sklearn.preprocessing import LabelEncoder, OneHotEncoder
lc = LabelEncoder()
X[:,0] = lc.fit_transform(X[:,0])
from sklearn.compose import ColumnTransformer
transform = ColumnTransformer([('one_hot_encoder', OneHotEncoder(),[0])],remainder='passthrough')
X = transform.fit_transform(X)
X = X[:,1:]  # dropped one column
#print(X)
'''

# EDA - Exploratory Data Analysis
plt.scatter(x=df['Hours'], y=df['Marks'])
plt.show()
'''
Scatter plots - shows relationship between X and Y variables. You can have:
1. Positive correlation
2. Negative correlation
3. No correlation
4. Correlation: 0 to +/- 1
5. Correlation value: 0 to +/- 0.5 : weak or no correlation
6. Strong correlation value will be closer to +/- 1
7. Equation: straight line => y = mx + c
'''
# 3. splitting it into train and test sets
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=100)
print(X_train)
”’ # 4. Scaling / Normalization from sklearn.preprocessing import StandardScaler scale = StandardScaler() X_train = scale.fit_transform(X_train[:,3:]) X_test = scale.fit_transform(X_test[:,3:]) print(X_train) ”’ ## RUN THE MODEL from sklearn.linear_model import LinearRegression regressor = LinearRegression() # fit – train the model regressor.fit(X_train, y_train) print(f”M/Coefficient/Slope = {regressor.coef_} and the Constant = {regressor.intercept_}“)
# y = 7.5709072 X + 20.1999196152844 # M/Coefficient/Slope = [7.49202113] and the Constant = 21.593606679699406 y_pred = regressor.predict(X_test) result_df =pd.DataFrame({‘Actual’: y_test, ‘Predicted’: y_pred}) print(result_df)
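To see where the slope (M) and the constant come from, here is a minimal sketch of the least-squares formulas for y = mx + c in plain Python. The `hours` and `marks` values are made up for illustration and are not taken from the Marks dataset; `fit_line` is a hypothetical helper, not a scikit-learn function.

```python
def fit_line(xs, ys):
    """Least-squares slope m and intercept c for y = m*x + c."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    m = sxy / sxx
    c = mean_y - m * mean_x
    return m, c

hours = [1.0, 2.0, 3.0, 4.0]   # hypothetical X values
marks = [3.0, 5.0, 7.0, 9.0]   # hypothetical y values
m, c = fit_line(hours, marks)
print(f"Slope = {m}, Constant = {c}")
predicted = [m * x + c for x in hours]
```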
# Analyze the output
from sklearn import metrics
mse = metrics.mean_squared_error(y_true=y_test, y_pred=y_pred)
print("Root Mean Squared Error (Variance) = ", mse**0.5)
mae = metrics.mean_absolute_error(y_true=y_test, y_pred=y_pred)
print("Mean Absolute Error = ", mae)
print("R Square is (Variance)", metrics.r2_score(y_test, y_pred))

## Bias is based on training data
# (here the test-set error is used as a proxy for variance and the training-set error as a proxy for bias)
y_pred_tr = regressor.predict(X_train)
mse = metrics.mean_squared_error(y_true=y_train, y_pred=y_pred_tr)
print("Root Mean Squared Error (Bias) = ", mse**0.5)
print("R Square is (Bias)", metrics.r2_score(y_train, y_pred_tr))
## Bias v Variance
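The three metrics printed above can be written out by hand to make their definitions explicit. This is a sketch with made-up numbers; `regression_metrics` is a hypothetical helper, not part of scikit-learn.

```python
def regression_metrics(y_true, y_pred):
    """Return RMSE, MAE and R square for two equal-length lists."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)                 # residual sum of squares
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    return {
        "rmse": (ss_res / n) ** 0.5,
        "mae": sum(abs(e) for e in errors) / n,
        "r2": 1 - ss_res / ss_tot,
    }

scores = regression_metrics([1.0, 2.0, 3.0], [1.0, 2.0, 4.0])  # hypothetical values
print(scores)
```

Running the same function on the training predictions and on the test predictions gives the two sets of numbers that are compared as "bias" and "variance" in these notes.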
import pandas as pd
import matplotlib.pyplot as plt
link = "https://raw.githubusercontent.com/swapnilsaurav/MachineLearning/master/3_Startups.csv"
df = pd.read_csv(link)
print(df.describe())
X = df.iloc[:, :4].values
y = df.iloc[:, 4].values

'''
# 1. Replace the missing values with the mean value
from sklearn.impute import SimpleImputer
import numpy as np
imputer = SimpleImputer(missing_values=np.nan, strategy='mean')
imputer = imputer.fit(X[:, 1:3])
X[:, 1:3] = imputer.transform(X[:, 1:3])
#print(X)
'''
# 2. Handling categorical values
# encoding
from sklearn.preprocessing import LabelEncoder, OneHotEncoder
lc = LabelEncoder()
X[:, 3] = lc.fit_transform(X[:, 3])

from sklearn.compose import ColumnTransformer
transform = ColumnTransformer([('one_hot_encoder', OneHotEncoder(), [3])], remainder='passthrough')
X = transform.fit_transform(X)
X = X[:, 1:]  # dropped one column
print(X)
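What the encoding step does can be sketched without scikit-learn: each category becomes its own 0/1 column, and one column is dropped because it can be inferred from the others. The `states` list is a made-up sample and `one_hot_drop_first` is a hypothetical helper.

```python
def one_hot_drop_first(values):
    """One-hot encode a list of labels, dropping the first (sorted) category."""
    categories = sorted(set(values))
    kept = categories[1:]  # the dropped category is encoded as all zeros
    rows = [[1 if v == cat else 0 for cat in kept] for v in values]
    return kept, rows

states = ["New York", "California", "Florida", "California"]  # hypothetical sample
kept, rows = one_hot_drop_first(states)
print(kept)
for state, row in zip(states, rows):
    print(state, row)
```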
# 3. Splitting into train and test sets
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=100)
print(X_train)

'''
# 4. Scaling / Normalization
from sklearn.preprocessing import StandardScaler
scale = StandardScaler()
X_train = scale.fit_transform(X_train[:, 3:])
X_test = scale.transform(X_test[:, 3:])  # reuse the scaler fitted on the training data
print(X_train)
'''
## RUN THE MODEL
from sklearn.linear_model import LinearRegression
regressor = LinearRegression()
# fit - train the model
regressor.fit(X_train, y_train)
print(f"M/Coefficient/Slope = {regressor.coef_} and the Constant = {regressor.intercept_}")

# y = -3791.2 x Florida - 3090.1 x California + 0.82 R&D - 0.05 Admin + 0.022 Marketing + 56650
y_pred = regressor.predict(X_test)
result_df = pd.DataFrame({'Actual': y_test, 'Predicted': y_pred})
print(result_df)

# Analyze the output
from sklearn import metrics
mse = metrics.mean_squared_error(y_true=y_test, y_pred=y_pred)
print("Root Mean Squared Error (Variance) = ", mse**0.5)
mae = metrics.mean_absolute_error(y_true=y_test, y_pred=y_pred)
print("Mean Absolute Error = ", mae)
print("R Square is (Variance)", metrics.r2_score(y_test, y_pred))

## Bias is based on training data
y_pred_tr = regressor.predict(X_train)
mse = metrics.mean_squared_error(y_true=y_train, y_pred=y_pred_tr)
print("Root Mean Squared Error (Bias) = ", mse**0.5)
print("R Square is (Bias)", metrics.r2_score(y_train, y_pred_tr))

'''
Case 1: All the columns are taken into account:
Mean Absolute Error = 8696.887641252619
R Square is (Variance) 0.884599945166969
Root Mean Squared Error (Bias) = 7562.5657508560125
R Square is (Bias) 0.9624157828452926
'''
## Testing
import statsmodels.api as sm
import numpy as np
X = np.array(X, dtype=float)
print("Y:\n", y)
# note: sm.OLS does not add an intercept on its own; use sm.add_constant(X) to include one
summ1 = sm.OLS(y, X).fit().summary()
print("Summary of All X \n----------------\n:", summ1)
'''
After running the backward elimination method, we found that the state columns
do not significantly impact the analysis, hence we remove those 2 columns too.
'''
X = X[:, 2:]  # after backward elimination
# EDA - Exploratory Data Analysis
plt.scatter(x=df['Administration'], y=df['Profit'])
plt.show()
plt.scatter(x=df['R&D Spend'], y=df['Profit'])
plt.show()
plt.scatter(x=df['Marketing Spend'], y=df['Profit'])
plt.show()
'''
Case 1: All the columns are taken into account:
Mean Absolute Error = 8696.887641252619
R Square is (Variance) 0.884599945166969
Root Mean Squared Error (Bias) = 7562.5657508560125
R Square is (Bias) 0.9624157828452926
'''
## Testing
import statsmodels.api as sm
import numpy as np
X = np.array(X, dtype=float)
#X = X[:, [2, 3, 4]]
print("Y:\n", y)
summ1 = sm.OLS(y, X).fit().summary()
print("Summary of All X \n----------------\n:", summ1)
## Test for linearity
# 1. All features (X) should be correlated with Y
# 2. Multicollinearity: there should not be any correlation among the X features;
#    if there is, keep only one of the correlated features for the analysis
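A simple way to check the second point is to compute the correlation between every pair of feature columns and flag the pairs above a threshold. The sketch below uses plain Python; the column names, values and the 0.8 threshold are assumptions chosen for illustration.

```python
from itertools import combinations

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

def correlated_pairs(features, threshold=0.8):
    """Return the feature pairs whose absolute correlation exceeds the threshold."""
    return [(a, b) for a, b in combinations(features, 2)
            if abs(pearson(features[a], features[b])) > threshold]

features = {                       # hypothetical feature columns
    "rd": [1.0, 2.0, 3.0, 4.0],
    "marketing": [2.0, 4.0, 6.0, 8.5],
    "admin": [5.0, 1.0, 4.0, 2.0],
}
flagged = correlated_pairs(features)
print(flagged)
```

For each flagged pair, one of the two features would be kept and the other dropped.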
import pandas as pd
import matplotlib.pyplot as plt
link = "https://raw.githubusercontent.com/swapnilsaurav/MachineLearning/master/4_Position_Salaries.csv"
df = pd.read_csv(link)
print(df.describe())
X = df.iloc[:, 1:2].values
y = df.iloc[:, 2].values

'''
# 1. Replace the missing values with the mean value
from sklearn.impute import SimpleImputer
import numpy as np
imputer = SimpleImputer(missing_values=np.nan, strategy='mean')
imputer = imputer.fit(X[:, 1:3])
X[:, 1:3] = imputer.transform(X[:, 1:3])
#print(X)
'''
'''
# 2. Handling categorical values
# encoding
from sklearn.preprocessing import LabelEncoder, OneHotEncoder
lc = LabelEncoder()
X[:, 3] = lc.fit_transform(X[:, 3])
from sklearn.compose import ColumnTransformer
transform = ColumnTransformer([('one_hot_encoder', OneHotEncoder(), [3])], remainder='passthrough')
X = transform.fit_transform(X)
X = X[:, 1:]  # dropped one column
print(X)
'''
'''
After running the backward elimination method, we found that the state columns
do not significantly impact the analysis, hence we remove those 2 columns too.
X = X[:, 2:]  # after backward elimination
'''
'''
# EDA - Exploratory Data Analysis
plt.scatter(x=df['Level'], y=df['Salary'])
plt.show()
'''
# 3. Splitting into train and test sets
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=100)
print(X_train)

from sklearn.linear_model import LinearRegression
from sklearn import metrics
'''
# 4. Scaling / Normalization
from sklearn.preprocessing import StandardScaler
scale = StandardScaler()
X_train = scale.fit_transform(X_train[:, 3:])
X_test = scale.transform(X_test[:, 3:])  # reuse the scaler fitted on the training data
print(X_train)
'''
'''
# Since the dataset is too small, let's take the entire data for training
X_train, y_train = X, y
X_test, y_test = X, y
'''
'''
## RUN THE MODEL
regressor = LinearRegression()
# fit - train the model
regressor.fit(X_train, y_train)
print(f"M/Coefficient/Slope = {regressor.coef_} and the Constant = {regressor.intercept_}")

# y =
y_pred = regressor.predict(X_test)
result_df = pd.DataFrame({'Actual': y_test, 'Predicted': y_pred})
print(result_df)

# Analyze the output
mse = metrics.mean_squared_error(y_true=y_test, y_pred=y_pred)
print("Root Mean Squared Error (Variance) = ", mse**0.5)
mae = metrics.mean_absolute_error(y_true=y_test, y_pred=y_pred)
print("Mean Absolute Error = ", mae)
print("R Square is (Variance)", metrics.r2_score(y_test, y_pred))

## Bias is based on training data
y_pred_tr = regressor.predict(X_train)
mse = metrics.mean_squared_error(y_true=y_train, y_pred=y_pred_tr)
print("Root Mean Squared Error (Bias) = ", mse**0.5)
print("R Square is (Bias)", metrics.r2_score(y_train, y_pred_tr))

# Plotting the data for output
plt.scatter(x=df['Level'], y=df['Salary'])
plt.plot(X, y_pred)
plt.xlabel("Level")
plt.ylabel("Salary")
plt.show()
'''

# 3. Model - Polynomial regression analysis
# y = C + m1 * X + m2 * X squared
from sklearn.preprocessing import PolynomialFeatures
from sklearn.pipeline import Pipeline

for i in range(1, 10):
    # prepare the pipeline steps: polynomial features of degree i, then a linear model
    parameters = [('polynomial', PolynomialFeatures(degree=i)), ('model', LinearRegression())]
    pipe = Pipeline(parameters)
    pipe.fit(X_train, y_train)
    y_pred = pipe.predict(X)

    ## Bias is based on training data
    y_pred_tr = pipe.predict(X_train)
    mse = metrics.mean_squared_error(y_true=y_train, y_pred=y_pred_tr)
    rmse_tr = mse ** 0.5
    print("Root Mean Squared Error (Bias) = ", rmse_tr)
    print("R Square is (Bias)", metrics.r2_score(y_train, y_pred_tr))

    ## Variance is based on validation data
    y_pred_tt = pipe.predict(X_test)
    mse = metrics.mean_squared_error(y_true=y_test, y_pred=y_pred_tt)
    rmse_tt = mse ** 0.5
    print("Root Mean Squared Error (Variance) = ", rmse_tt)
    print("R Square is (Variance)", metrics.r2_score(y_test, y_pred_tt))
    print("Difference between variance and bias = ", rmse_tt - rmse_tr)

    # Plotting the data for output
    plt.scatter(x=df['Level'], y=df['Salary'])
    plt.plot(X, y_pred)
    plt.title("Polynomial Analysis degree =" + str(i))
    plt.xlabel("Level")
    plt.ylabel("Salary")
    plt.show()
import pandas as pd
import matplotlib.pyplot as plt
#link = "https://raw.githubusercontent.com/swapnilsaurav/MachineLearning/master/4_Position_Salaries.csv"
link = "https://raw.githubusercontent.com/swapnilsaurav/MachineLearning/master/3_Startups.csv"
df = pd.read_csv(link)
print(df.describe())
X = df.iloc[:, 0:4].values
y = df.iloc[:, 4].values

'''
# 1. Replace the missing values with the mean value
from sklearn.impute import SimpleImputer
import numpy as np
imputer = SimpleImputer(missing_values=np.nan, strategy='mean')
imputer = imputer.fit(X[:, 1:3])
X[:, 1:3] = imputer.transform(X[:, 1:3])
#print(X)
'''
# 2. Handling categorical values
# encoding
from sklearn.preprocessing import LabelEncoder, OneHotEncoder
lc = LabelEncoder()
X[:, 3] = lc.fit_transform(X[:, 3])

from sklearn.compose import ColumnTransformer
transform = ColumnTransformer([('one_hot_encoder', OneHotEncoder(), [3])], remainder='passthrough')
X = transform.fit_transform(X)
X = X[:, 1:]  # dropped one column
print(X)

'''
After running the backward elimination method, we found that the state columns
do not significantly impact the analysis, hence we remove those 2 columns too.
X = X[:, 2:]  # after backward elimination
'''
'''
# EDA - Exploratory Data Analysis
plt.scatter(x=df['Level'], y=df['Salary'])
plt.show()
'''
# 3. Splitting into train and test sets
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=100)
print(X_train)

from sklearn.linear_model import LinearRegression
from sklearn import metrics
'''
# 4. Scaling / Normalization
from sklearn.preprocessing import StandardScaler
scale = StandardScaler()
X_train = scale.fit_transform(X_train[:, 3:])
X_test = scale.transform(X_test[:, 3:])  # reuse the scaler fitted on the training data
print(X_train)
'''
'''
# Since the dataset is too small, let's take the entire data for training
X_train, y_train = X, y
X_test, y_test = X, y
'''
## RUN THE MODEL - Support Vector Machine Regressor (SVR)
from sklearn.svm import SVR
#regressor = SVR(kernel='linear')
#regressor = SVR(kernel='poly', degree=2, C=10)
# Assignment - Best value for gamma: 0.01 to 1 (0.05)
regressor = SVR(kernel="rbf", gamma=0.1, C=10)
# fit - train the model
regressor.fit(X_train, y_train)
# predict on the test data (needed by the metrics below)
y_pred = regressor.predict(X_test)

# Analyze the output
mse = metrics.mean_squared_error(y_true=y_test, y_pred=y_pred)
print("Root Mean Squared Error (Variance) = ", mse**0.5)
mae = metrics.mean_absolute_error(y_true=y_test, y_pred=y_pred)
print("Mean Absolute Error = ", mae)
print("R Square is (Variance)", metrics.r2_score(y_test, y_pred))

## Bias is based on training data
y_pred_tr = regressor.predict(X_train)
mse = metrics.mean_squared_error(y_true=y_train, y_pred=y_pred_tr)
print("Root Mean Squared Error (Bias) = ", mse**0.5)
print("R Square is (Bias)", metrics.r2_score(y_train, y_pred_tr))

# Plotting the data for output
# (column 2 is the first numeric column after the two remaining state columns)
plt.scatter(X_train[:, 2], y_pred_tr)
#plt.plot(X_train[:, 2], y_pred_tr)
plt.show()
# Decision Tree & Random Forest
import pandas as pd
link = "https://raw.githubusercontent.com/swapnilsaurav/MachineLearning/master/3_Startups.csv"
#link = "D:\\datasets\\3_Startups.csv"  # local copy, if available
df = pd.read_csv(link)
print(df)

#X = df.iloc[:, :4].values
X = df.iloc[:, :1].values
y = df.iloc[:, -1].values  # the target is the last column
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=100)
# Classification algorithm: a supervised algorithm which predicts the class
'''
classifier: the algorithm that we develop
model: training and predicting the outcome
features: the input data (columns)
target: the class that we need to predict
classification: binary (2 class outcome) or multiclass (more than 2 classes)

Steps to run the model:
1. get the data
2. preprocess the data
3. eda
4. train the model
5. predict with the model
6. evaluate the model
'''
#1. Logistic regression
link = "https://raw.githubusercontent.com/swapnilsaurav/MachineLearning/master/5_Ads_Success.csv"
import pandas as pd
df = pd.read_csv(link)
X = df.iloc[:, 1:4].values
y = df.iloc[:, 4].values

# Splitting into train and test sets
# (this step is needed before scaling; test_size and random_state follow the earlier examples)
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=100)

# Scaling, as Age and Salary are in different ranges of values
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)  # reuse the scaler fitted on the training data

## Build the model
from sklearn.linear_model import LogisticRegression
classifier = LogisticRegression()
classifier.fit(X_train, y_train)
y_pred = classifier.predict(X_test)
## Build the model
'''
## LOGISTIC REGRESSION
from sklearn.linear_model import LogisticRegression
classifier = LogisticRegression()
classifier.fit(X_train, y_train)
y_pred = classifier.predict(X_test)
'''
from sklearn.svm import SVC
'''
## Support Vector Machine - Classifier
classifier = SVC(kernel='linear')
classifier = SVC(kernel='rbf', gamma=100, C=100)
'''
from sklearn.neighbors import KNeighborsClassifier
## Refer types of distances:
# https://designrr.page/?id=200944&token=2785938662&type=FP&h=7229
classifier = KNeighborsClassifier(n_neighbors=9, metric='minkowski')
classifier.fit(X_train, y_train)
y_pred = classifier.predict(X_test)

# Now we will plot the training data
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import ListedColormap
x_set, y_set = X_train, y_train
for i, j in enumerate(np.unique(y_set)):
    plt.scatter(x_set[y_set == j, 0], x_set[y_set == j, 1],
                color=ListedColormap(("red", "green"))(i), label=j)
plt.show()

## Model Evaluation using Confusion Matrix
from sklearn.metrics import classification_report, confusion_matrix, accuracy_score
cm = confusion_matrix(y_test, y_pred)
print("Confusion Matrix: \n", cm)
cr = classification_report(y_test, y_pred)
accs = accuracy_score(y_test, y_pred)
print("classification_report: \n", cr)
print("accuracy_score: ", accs)
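The numbers reported by the confusion matrix can be derived by hand for a binary problem. The sketch below is plain Python with made-up labels; `binary_confusion` is a hypothetical helper, with 1 taken as the positive class.

```python
def binary_confusion(y_true, y_pred):
    """Return (tn, fp, fn, tp) for 0/1 labels, with 1 as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tn, fp, fn, tp

y_true_demo = [1, 0, 1, 1, 0]   # hypothetical actual classes
y_pred_demo = [1, 0, 0, 1, 1]   # hypothetical predicted classes
tn, fp, fn, tp = binary_confusion(y_true_demo, y_pred_demo)

accuracy = (tp + tn) / (tp + tn + fp + fn)
sensitivity = tp / (tp + fn)    # also called recall
precision = tp / (tp + fp)
print(accuracy, sensitivity, precision)
```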
## Build the model
'''
from sklearn.tree import DecisionTreeClassifier
classifier = DecisionTreeClassifier(criterion="gini")
'''
from sklearn.ensemble import RandomForestClassifier
classifier = RandomForestClassifier(n_estimators=39, criterion="gini")
classifier.fit(X_train, y_train)
y_pred = classifier.predict(X_test)
# Plot and evaluate with the same code used for the KNN classifier above
'''
# Show the decision tree created
from sklearn import tree
output = tree.export_text(classifier)
print(output)
# visualize the tree
fig = plt.figure(figsize=(40, 60))
tree_plot = tree.plot_tree(classifier)
plt.show()
'''
'''
In Ensemble Algorithms we run multiple algorithms to improve the performance
for a given business objective:
1. Boosting: the same algorithm is run repeatedly; the input varies based on weights
2. Bagging: the same algorithm is run multiple times; the results are averaged
3. Stacking: different algorithms are run; their results are combined (for example averaged)
'''
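How the combined result is formed can be sketched in a few lines: a majority vote for classification and a plain average for regression. The prediction lists are made up, and `majority_vote` and `average_predictions` are hypothetical helpers rather than scikit-learn functions.

```python
from collections import Counter

def majority_vote(model_predictions):
    """Combine class predictions from several models, one vote per model."""
    return [Counter(votes).most_common(1)[0][0]
            for votes in zip(*model_predictions)]

def average_predictions(model_predictions):
    """Combine numeric predictions from several models by averaging."""
    return [sum(values) / len(values) for values in zip(*model_predictions)]

# hypothetical outputs of three classifiers on three samples
class_preds = [[1, 0, 1],
               [1, 1, 0],
               [0, 0, 1]]
combined_classes = majority_vote(class_preds)

# hypothetical outputs of two regressors on two samples
value_preds = [[10.0, 20.0],
               [20.0, 40.0]]
combined_values = average_predictions(value_preds)
print(combined_classes, combined_values)
```

An odd number of models is used in the vote so that a two-class vote cannot tie.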
## Build the model
'''
from sklearn.tree import DecisionTreeClassifier
classifier = DecisionTreeClassifier(criterion="gini")
from sklearn.ensemble import RandomForestClassifier
classifier = RandomForestClassifier(n_estimators=39, criterion="gini")
'''
from sklearn.ensemble import AdaBoostClassifier
classifier = AdaBoostClassifier(n_estimators=7)
classifier.fit(X_train, y_train)
y_pred = classifier.predict(X_test)
# Plot and evaluate with the same code used for the KNN classifier above
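# K-Means: record the distortion (inertia) for different numbers of clusters
# X: the feature matrix to cluster
from sklearn.cluster import KMeans
distortion = []
max_centers = 30
for i in range(1, max_centers):
    km = KMeans(n_clusters=i, init="random", max_iter=100)
    y_cluster = km.fit(X)
    distortion.append(km.inertia_)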
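The value collected in `distortion` is the sum of squared distances from each point to its closest cluster center. A one-dimensional sketch in plain Python shows why it drops as the number of centers grows; the points and centers are made up for illustration.

```python
def inertia(points, centers):
    """Sum of squared distances from each point to its nearest center (1-D)."""
    return sum(min((p - c) ** 2 for c in centers) for p in points)

points = [0.0, 1.0, 10.0, 11.0]            # hypothetical 1-D data with two groups
one_center = inertia(points, [5.5])
two_centers = inertia(points, [0.5, 10.5])
four_centers = inertia(points, [0.0, 1.0, 10.0, 11.0])
print(one_center, two_centers, four_centers)
```

The large drop from one to two centers, followed by only a small further gain, is the "elbow" one looks for when plotting distortion against the number of clusters.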
import pandas as pd
import matplotlib.pyplot as plt
link = "D:\\Datasets\\USArrests.csv"
df = pd.read_csv(link)
#print(df)
X = df.iloc[:, 1:]
from sklearn.preprocessing import normalize
data = normalize(X)
data = pd.DataFrame(data)
print(data)
link = "D:\\datasets\\Market_Basket_Optimisation.csv"
import pandas as pd
df = pd.read_csv(link)
print(df)
from apyori import apriori
transactions = []
for i in range(len(df)):
    if i % 100 == 0:
        print("I = ", i)
    transactions.append([str(df.values[i, j]) for j in range(20)])

## remove nan from the list
# (missing cells were converted to the string 'nan' by str() above)
transactions = [[item for item in t if item != 'nan'] for t in transactions]
print("Transactions:\n", transactions)
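Apriori ranks rules by support, confidence and lift. The sketch below computes these three measures by hand in plain Python on a made-up list of transactions; `support` and `confidence` are hypothetical helpers and do not use the apyori API.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= set(t)) / len(transactions)

def confidence(transactions, antecedent, consequent):
    """Of the transactions containing antecedent, the fraction also containing consequent."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)

baskets = [                       # hypothetical transactions
    ["milk", "bread"],
    ["milk"],
    ["bread", "eggs"],
    ["milk", "bread", "eggs"],
]
sup = support(baskets, ["milk", "bread"])
conf_rule = confidence(baskets, ["milk"], ["bread"])
lift = conf_rule / support(baskets, ["bread"])
print(sup, conf_rule, lift)
```

A lift below 1, as in this toy data, means the two items appear together less often than they would if they were independent.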
'''
Time Series Forecasting - ARIMA method
1. Read and visualize the data
2. Stationary series
3. Optimal parameters
4. Build the model
5. Prediction
'''
import pandas as pd
#Step 1: read the data
link = "D:\\datasets\\gitdataset\\AirPassengers.csv"
air_passengers = pd.read_csv(link)

'''
#Step 2: visualize the data
import plotly.express as pe
fig = pe.line(air_passengers, x="Month", y="#Passengers")
fig.show()
'''
# Cleaning the data
from datetime import datetime
air_passengers['Month'] = pd.to_datetime(air_passengers['Month'])
air_passengers.set_index('Month', inplace=True)

# converting to time series data
import numpy as np
ts_log = np.log(air_passengers['#Passengers'])
# creating rolling period - 12 months
import matplotlib.pyplot as plt
'''
moving_avg = ts_log.rolling(12).mean()
plt.plot(ts_log)
plt.plot(moving_avg)
plt.show()
'''
#Step 3: Decomposition into: trend, seasonality, error (or residual or noise)
'''
Additive decomposition: linear combination of the above 3 factors:
Y(t) = T(t) + S(t) + E(t)
Multiplicative decomposition: product of the 3 factors:
Y(t) = T(t) * S(t) * E(t)
'''
from statsmodels.tsa.seasonal import seasonal_decompose
# note: after a log transform the components combine additively,
# so model="additive" is the usual choice for ts_log
decomposed = seasonal_decompose(ts_log, model="multiplicative")
decomposed.plot()
plt.show()

# Step 4: Stationarity test
'''
To do time series analysis, the TS should be stationary. A time series is said to be
stationary if its statistical properties (mean, variance, autocorrelation) do not
change by a large value over a period of time.
Types of tests:
1. Augmented Dickey Fuller test (ADF test)
2. Kwiatkowski Phillips Schmidt Shin (KPSS) test
3. Phillips Perron (PP) test

For the ADF test:
Null Hypothesis: the time series is not stationary
Alternate Hypothesis: the time series is stationary
If p < 0.05 we reject the Null Hypothesis
'''
from statsmodels.tsa.stattools import adfuller
result = adfuller(air_passengers['#Passengers'])
print("ADF Stats: \n", result[0])
print("p value = ", result[1])
'''
To reject the Null Hypothesis, result[0] should be less than the 5% critical value
(result[4]['5%']) and p should be less than 0.05
'''

# Run the model
'''
ARIMA model: AutoRegressive Integrated Moving Average
AR: p is the number of past values used to predict the current value
I: d is the number of times the series is differenced to remove the trend
MA: q is the number of past forecast errors used (Moving Average part)

AIC - Akaike's Information Criterion: helps to find the optimal p, d, q values
BIC - Bayesian Information Criterion: an alternative to AIC
'''
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf
plot_acf(air_passengers['#Passengers'].diff().dropna())
plot_pacf(air_passengers['#Passengers'].diff().dropna())
plt.show()
'''
How to read the above graphs:
To find q (MA), we look at the Autocorrelation graph and see where there is a drastic change:
here, it is at 1, so q = 1 (or 2, as at 2 it goes negative)

To find p (AR), we look for a sharp drop in the Partial Autocorrelation graph:
here, it is at 1, so p = 1 (or 2, as at 2 it goes negative)

For d (I), we need to try multiple values; initially we will take 1
'''
from statsmodels.tsa.arima.model import ARIMA
model = ARIMA(air_passengers['#Passengers'], order=(1, 1, 1))
result = model.fit()
plt.plot(air_passengers['#Passengers'])
plt.plot(result.fittedvalues)
plt.show()
print("ARIMA Model Summary")
print(result.summary())
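The d in the order (p, d, q) is the number of times the series is differenced, which is also what `.diff()` does before the ACF and PACF plots. A minimal sketch in plain Python, with a made-up series and a hypothetical `difference` helper:

```python
def difference(series, lag=1):
    """Return the lag-differenced series: series[i] - series[i - lag]."""
    return [series[i] - series[i - lag] for i in range(lag, len(series))]

series = [1, 3, 6, 10, 15]          # hypothetical series with a growing trend
first_diff = difference(series)     # d = 1
second_diff = difference(first_diff)  # d = 2
print(first_diff, second_diff)
```

Here the first difference still trends upward while the second is constant, so this toy series would need d = 2. With lag=12 the same function removes a yearly pattern from monthly data.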
model = ARIMA(air_passengers['#Passengers'], order=(4, 1, 4))
result = model.fit()
plt.plot(air_passengers['#Passengers'])
plt.plot(result.fittedvalues)
plt.show()
print("ARIMA Model Summary")
print(result.summary())

# Prediction using the ARIMA model
air_passengers['Forecasted'] = result.predict(start=120, end=246)
air_passengers[['#Passengers', 'Forecasted']].plot()
plt.show()

# Predict using the SARIMAX model
import statsmodels.api as sm
model = sm.tsa.statespace.SARIMAX(air_passengers['#Passengers'], order=(7, 1, 1), seasonal_order=(1, 1, 1, 12))
result = model.fit()
air_passengers['Forecast_SARIMAX'] = result.predict(start=120, end=246)
air_passengers[['#Passengers', 'Forecast_SARIMAX']].plot()
plt.show()
”’ NLP – Natural Language Processing – analysing review comment to understand reasons for positive and negative ratings. concepts like: unigram, bigram, trigram Steps we generally perform with NLP data: 1. Convert into lowercase 2. decompose (non unicode to unicode) 3. removing accent: encode the content to ascii values 4. tokenization: will break sentence to words 5. Stop words: not important words for analysis 6. Lemmetization (done only on English words): convert the words into dictionary words 7. N-grams: set of one word (unigram), two words (bigram), three words (trigrams) 8. Plot the graph based on the number of occurrences and Evaluate ”’ ”’ cardboard mousepad. Going worth price! Not bad ”’ link=“https://raw.githubusercontent.com/swapnilsaurav/OnlineRetail/master/order_reviews.csv” import pandas as pd import unicodedata import nltk import matplotlib.pyplot as plt df = pd.read_csv(link) print(list(df.columns)) ”’ [‘review_id’, ‘order_id’, ‘review_score’, ‘review_comment_title’, ‘review_comment_message’, ‘review_creation_date’, ‘review_answer_timestamp’] ”’ df[‘review_creation_date’] = pd.to_datetime(df[‘review_creation_date’])
”’ NLP – Natural Language Processing – analysing review comment to understand reasons for positive and negative ratings. concepts like: unigram, bigram, trigram Steps we generally perform with NLP data: 1. Convert into lowercase 2. decompose (non unicode to unicode) 3. removing accent: encode the content to ascii values 4. tokenization: will break sentence to words 5. Stop words: not important words for analysis 6. Lemmetization (done only on English words): convert the words into dictionary words 7. N-grams: set of one word (unigram), two words (bigram), three words (trigrams) 8. Plot the graph based on the number of occurrences and Evaluate ”’ ”’ cardboard mousepad. Going worth price! Not bad ”’ link=“D:/datasets/OnlineRetail/order_reviews.csv” import pandas as pd import unicodedata import nltk import matplotlib.pyplot as plt df = pd.read_csv(link) print(list(df.columns)) ”’ [‘review_id’, ‘order_id’, ‘review_score’, ‘review_comment_title’, ‘review_comment_message’, ‘review_creation_date’, ‘review_answer_timestamp’] ”’ #df[‘review_creation_date’] = pd.to_datetime(df[‘review_creation_date’]) #df[‘review_answer_timestamp’] = pd.to_datetime(df[‘review_answer_timestamp’]) # data preprocessing – making data ready for analysis reviews_df = df[df[‘review_comment_message’].notnull()].copy() #print(reviews_df) # remove accents def remove_accent(text): return unicodedata.normalize(‘NFKD’,text).encode(‘ascii’,errors=‘ignore’).decode(‘utf-8’) #STOP WORDS LIST: STOP_WORDS = set(remove_accent(w) for w in nltk.corpus.stopwords.words(‘portuguese’))
'''
Write a function to perform basic preprocessing steps
'''
def basic_preprocessing(text):
    # converting to lower case
    txt_pp = text.lower()
    # remove the accent
    txt_pp = remove_accent(txt_pp)
    # tokenize
    txt_token = nltk.tokenize.word_tokenize(txt_pp)
    # removing stop words; return a tuple (not a generator) so the words
    # can be read more than once when building unigrams, bigrams and trigrams
    txt_token = tuple(w for w in txt_token if w not in STOP_WORDS and w.isalpha())
    return txt_token
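# apply the preprocessing to every review comment
reviews_df['review_comment_words'] = reviews_df['review_comment_message'].apply(basic_preprocessing)
# get positive reviews - all 5 ratings in review_score
reviews_5 = reviews_df[reviews_df['review_score']==5]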
#get negative reviews – all 1 ratings reviews_1 = reviews_df[reviews_df[‘review_score’]==1]
## write a function to create unigrams, bigrams, trigrams
def create_ngrams(words):
    unigrams, bigrams, trigrams = [], [], []
    for comment in words:
        unigrams.extend(comment)
        # join with a space so that the words of an n-gram stay readable
        bigrams.extend(' '.join(bg) for bg in nltk.bigrams(comment))
        trigrams.extend(' '.join(tg) for tg in nltk.trigrams(comment))
    return unigrams, bigrams, trigrams
#create ngrams for rating 5 and rating 1 uni_5, bi_5, tri_5 = create_ngrams(reviews_5[‘review_comment_words’]) print(uni_5) print(‘””””””””””””””””””‘) print(bi_5) print(” =========================================”) print(tri_5)
”’ Write a function to perform basic preprocessing steps ”’ def basic_preprocessing(text): #converting to lower case txt_pp = text.lower() #print(txt_pp) #remove the accent #txt_pp = unicodedata.normalize(‘NFKD’,txt_pp).encode(‘ascii’,errors=’ignore’).decode(‘utf-8’) txt_pp =remove_accent(txt_pp) #print(txt_pp) #tokenize txt_token = nltk.tokenize.word_tokenize(txt_pp) #print(txt_token) # removing stop words txt_token = tuple(w for w in txt_token if w not in STOP_WORDS and w.isalpha()) return txt_token
## write a function to create unigrams, bigrams, trigrams
def create_ngrams(words):
    unigrams, bigrams, trigrams = [], [], []
    for comment in words:
        unigrams.extend(comment)
        bigrams.extend(' '.join(bigram) for bigram in nltk.bigrams(comment))
        trigrams.extend(' '.join(trigram) for trigram in nltk.trigrams(comment))
    # return the three lists so the caller can unpack them
    return unigrams, bigrams, trigrams
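# apply the preprocessing to every review comment
reviews_df['review_comment_words'] = reviews_df['review_comment_message'].apply(basic_preprocessing)
# get positive reviews - all 5 ratings in review_score
reviews_5 = reviews_df[reviews_df['review_score']==5]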
#get negative reviews – all 1 ratings reviews_1 = reviews_df[reviews_df[‘review_score’]==1] #create ngrams for rating 5 and rating 1 uni_5, bi_5, tri_5 = create_ngrams(reviews_5[‘review_comment_words’]) print(uni_5) print(bi_5) print(tri_5)
# Assignment: perform similar tasks for reviews that are negative (review score = 1) #uni_1, bi_1, tri_1 = create_ngrams(reviews_1[‘review_comment_words’]) #print(uni_5) # distribution plot def plot_dist(words, color): nltk.FreqDist(words).plot(20,cumulative=False, color=color)
plot_dist(tri_5, “red”)
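As a text alternative to the frequency plot, n-grams can also be counted with the standard library. The sketch below does not use nltk; `make_ngrams` is a hypothetical helper and the two comments are invented, already-preprocessed examples.

```python
from collections import Counter

def make_ngrams(tokens, n):
    """Return the n-grams of a token sequence as space-joined strings."""
    return [' '.join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

# Invented comments, already lowercased and without stop words
comments = [
    ('produto', 'chegou', 'antes', 'prazo'),
    ('produto', 'chegou', 'rapido'),
]

bigram_counts = Counter()
for words in comments:
    bigram_counts.update(make_ngrams(words, 2))

print(bigram_counts.most_common(1))  # [('produto chegou', 2)]
```

`most_common(20)` would give the same top 20 entries that the distribution plot draws.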
#NLP – Natural Language processing: # sentiments: Positive, Neutral, Negative # ”’ we will use nltk library for NLP: pip install nltk ”’ import nltk #1. Convert into lowercase text = “Product is great but I amn’t liking the colors as they are worst” text = text.lower()
”’ 2. Tokenize the content: break it into words or sentences ”’ text1 = text.split() #using nltk from nltk.tokenize import sent_tokenize,word_tokenize text = word_tokenize(text) #print(“Text =\n”,text) #print(“Text =\n”,text1)
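'''
3. Removing stop words: words which are not significant for your analysis.
E.g. an, a, the, is, are
'''
my_stopwords = ['is', 'i', 'the']
# build a new list instead of removing items from the list being looped over,
# because removing during iteration makes the loop skip elements
text = [w for w in text if w not in my_stopwords]
print("Text after my stopwords:", text)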
nltk.download(“stopwords”) from nltk.corpus import stopwords nltk_eng_stopwords = set(stopwords.words(“english”)) #print(“NLTK list of stop words in English: “,nltk_eng_stopwords) ”’ Just for example: we see the word but in the STOP WORDS but we want to include it, then we need to remove the word from the set ”’ # removing but from the NLTK stop words nltk_eng_stopwords.remove(‘but’)
# again build a new list rather than removing items while looping
text = [w for w in text if w not in nltk_eng_stopwords]
print("Text after NLTK stopwords:", text)
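When filtering stop words, calling `remove()` on a list while looping over that same list makes the loop skip the element that follows each removal. A minimal sketch with an invented word list shows the difference from building a new list:

```python
words = ['the', 'the', 'product', 'is', 'is', 'great']
stop = {'the', 'is'}

# Pitfall: removing from the list being iterated skips the next element
buggy = list(words)
for w in buggy:
    if w in stop:
        buggy.remove(w)
print(buggy)  # ['the', 'product', 'is', 'great'] - stop words survive

# Safer: build a new list with a list comprehension
clean = [w for w in words if w not in stop]
print(clean)  # ['product', 'great']
```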
”’ 4. Stemming: changing the word to its root eg: {help: [help, helped, helping, helper]}
One of the method is Porter stemmer ”’ from nltk.stem import PorterStemmer stemmer = PorterStemmer() text = [stemmer.stem(w) for w in text] ”’ above line is like below: t_list=[] for w in text: a = stemmer.stem(w) t_list.append(a) ”’ print(“Text after Stemming:”,text) ”’ 5. Part of Speech Tagging (POS Tagging) grammatical word which deals with the roles they place like – 8 parts of speeches – noun, verb, …
Reference: https://www.educba.com/nltk-pos-tag/ POS Tagging will give Tags like
CC: Coordinating conjunction
CD: Cardinal number
DT: Determiner
EX: Existential "there"
FW: Foreign word
IN: Preposition or subordinating conjunction
JJ: Adjective
JJR, JJS: Comparative and superlative adjective
LS: List item marker
MD: Modal
NN: Singular noun
NNS, NNP, NNPS: Plural noun, singular proper noun, plural proper noun
PDT: Predeterminer
WRB: Wh-adverb
WP$: Possessive wh-pronoun
WP: Wh-pronoun
WDT: Wh-determiner
VBZ: Verb, 3rd person singular present
VBP, VBN, VBG, VBD, VB: Other verb forms (non-3rd person present, past participle, gerund, past tense, base form)
UH: Interjection
TO: The word "to"
RP: Particle
RB, RBR, RBS: Adverb, comparative adverb, superlative adverb
PRP, PRP$: Personal pronoun and possessive pronoun
But to perform this, we need to download any one tagger: e.g. averaged_perceptron_tagger nltk.download(‘averaged_perceptron_tagger’) ”’ nltk.download(‘averaged_perceptron_tagger’)
'''
6. Lemmatizing takes a word to its core meaning (its dictionary form)
We need to download: wordnet
'''
nltk.download('wordnet')
from nltk.stem import WordNetLemmatizer
lemmatizer = WordNetLemmatizer()
print("Very good = ", lemmatizer.lemmatize("very good"))
print("Halves = ", lemmatizer.lemmatize("halves"))
text = “Product is great but I amn’t liking the colors as they are worst” text = word_tokenize(text) text = [lemmatizer.lemmatize(w) for w in text] print(“Text after Lemmatizer: “,text)
# Sentiment analysis – read the sentiments of each sentence ”’ If you need more data for your analysis, this is a good source: https://github.com/pycaret/pycaret/tree/master/datasets We will use Amazon.csv for this program ”’ import pandas as pd from nltk.corpus import stopwords from nltk.tokenize import word_tokenize from nltk.stem import WordNetLemmatizer from nltk.sentiment.vader import SentimentIntensityAnalyzer
link = “https://raw.githubusercontent.com/pycaret/pycaret/master/datasets/amazon.csv” df = pd.read_csv(link) print(df)
# Let's create a function to perform all the preprocessing steps
# of an NLP analysis
# build the stop word set once, instead of reloading the list for every word
eng_stopwords = set(stopwords.words("english"))
def preprocess_nlp(text):
    text = text.lower()  # lowercase
    text = word_tokenize(text)  # tokenize
    text = [w for w in text if w not in eng_stopwords]  # remove stop words
    lemm = WordNetLemmatizer()  # lemmatize
    text = [lemm.lemmatize(w) for w in text]
    # now join all the words as we are predicting on each line of text
    text_out = ' '.join(text)
    return text_out
# NLTK Sentiment Analyzer # we will now define a function get_sentiment() which will return # 1 for positive and 0 for non-positive analyzer = SentimentIntensityAnalyzer() def get_sentiment(text): score = analyzer.polarity_scores(text) sentiment = 1 if score[‘pos’] > 0 else 0 return sentiment
# Visualization import matplotlib.pyplot as plt import numpy as np data = np.random.randn(1000) plt.hist(data, bins=30, histtype=‘stepfilled’, color=“red”) plt.title(“Histogram Display”) plt.xlabel(“Marks”) plt.ylabel(“Number of Students”) plt.show()
# Analyzing Hotel Bookings data # https://github.com/swapnilsaurav/Dataset/blob/master/hotel_bookings.csv link=“https://raw.githubusercontent.com/swapnilsaurav/Dataset/master/hotel_bookings.csv” import pandas as pd df = pd.read_csv(link) #print(“Shape of the data: “,df.shape) #print(“Data types of the columns:”,df.dtypes) import numpy as np df_numeric = df.select_dtypes(include=[np.number]) #print(df_numeric) numeric_cols = df_numeric.columns.values #print(“Numeric column names: “,numeric_cols) df_nonnumeric = df.select_dtypes(exclude=[np.number]) #print(df_nonnumeric) nonnumeric_cols = df_nonnumeric.columns.values #print(“Non Numeric column names: “,nonnumeric_cols) #### #preprocessing the data import seaborn as sns import matplotlib.pyplot as plt colors = [“#091AEA”,“#EA5E09”] cols = df.columns sns.heatmap(df[cols].isnull(), cmap=sns.color_palette(colors)) plt.show()
cols_to_drop = [] for col in cols: pct_miss = np.mean(df[col].isnull()) * 100 if pct_miss >80: #print(f”{col} -> {pct_miss}”) cols_to_drop.append(col) #column list to drop # remove column since it has more than 80% missing value df = df.drop(cols_to_drop, axis=1)
for col in df.columns:
    pct_miss = np.mean(df[col].isnull()) * 100
    if pct_miss > 80:
        print(f"{col} -> {pct_miss}")
    # check for rows to see the missing values
    missing = df[col].isnull()
    num_missing = np.sum(missing)
    if num_missing > 0:
        df[f'{col}_ismissing'] = missing
        # report the current column (col), not the whole column list (cols)
        print(f"Created Missing Indicator for {col}")
### keeping track of the missing values ismissing_cols = [col for col in df.columns if ‘_ismissing’ in col] df[‘num_missing’] = df[ismissing_cols].sum(axis=1) print(df[‘num_missing’])
# drop rows with > 12 missing values ind_missing = df[df[‘num_missing’] > 12].index df = df.drop(ind_missing,axis=0) # ROWS DROPPED #count for missing values for col in df.columns: pct_miss = np.mean(df[col].isnull()) * 100 if pct_miss >0: print(f”{col} -> {pct_miss}“)
”’ Still we are left with following missing values: children -> 2.0498257606219004 babies -> 11.311318858061922 meal -> 11.467129071170085 country -> 0.40879238707947996 deposit_type -> 8.232810615199035 agent -> 13.687005763302507 ”’
# Analyzing Hotel Bookings data # https://github.com/swapnilsaurav/Dataset/blob/master/hotel_bookings.csv link=“https://raw.githubusercontent.com/swapnilsaurav/Dataset/master/hotel_bookings.csv” import pandas as pd df = pd.read_csv(link) #print(“Shape of the data: “,df.shape) #print(“Data types of the columns:”,df.dtypes) import numpy as np df_numeric = df.select_dtypes(include=[np.number]) #print(df_numeric) numeric_cols = df_numeric.columns.values print(“Numeric column names: “,numeric_cols) df_nonnumeric = df.select_dtypes(exclude=[np.number]) #print(df_nonnumeric) nonnumeric_cols = df_nonnumeric.columns.values print(“Non Numeric column names: “,nonnumeric_cols)
#### #preprocessing the data import seaborn as sns import matplotlib.pyplot as plt colors = [“#091AEA”,“#EA5E09”] cols = df.columns sns.heatmap(df[cols].isnull(), cmap=sns.color_palette(colors)) plt.show()
cols_to_drop = [] for col in cols: pct_miss = np.mean(df[col].isnull()) * 100 if pct_miss >80: #print(f”{col} -> {pct_miss}”) cols_to_drop.append(col) #column list to drop # remove column since it has more than 80% missing value df = df.drop(cols_to_drop, axis=1)
for col in df.columns: pct_miss = np.mean(df[col].isnull()) * 100 if pct_miss >80: print(f”{col} -> {pct_miss}“) # check for rows to see the missing values missing = df[col].isnull() num_missing = np.sum(missing) if num_missing >0: df[f’{col}_ismissing’] = missing #print(f”Created Missing Indicator for {cols}”) ### keeping track of the missing values ismissing_cols = [col for col in df.columns if ‘_ismissing’ in col] df[‘num_missing’] = df[ismissing_cols].sum(axis=1) print(df[‘num_missing’])
# drop rows with > 12 missing values ind_missing = df[df[‘num_missing’] > 12].index df = df.drop(ind_missing,axis=0) # ROWS DROPPED #count for missing values for col in df.columns: pct_miss = np.mean(df[col].isnull()) * 100 if pct_miss >0: print(f”{col} -> {pct_miss}“)
”’ Still we are left with following missing values: children -> 2.0498257606219004 # numeric babies -> 11.311318858061922 #numeric meal -> 11.467129071170085 # non-numeric country -> 0.40879238707947996 # non-numeric deposit_type -> 8.232810615199035 # non-numeric agent -> 13.687005763302507 #numeric ”’ #HANDLING NUMERIC MISSING VALUES df_numeric = df.select_dtypes(include=[np.number]) for col in df_numeric.columns.values: pct_miss = np.mean(df[col].isnull()) * 100 if pct_miss > 0: med = df[col].median() df[col] = df[col].fillna(med)
#HANDLING non-NUMERIC MISSING VALUES df_nonnumeric = df.select_dtypes(exclude=[np.number]) for col in df_nonnumeric.columns.values: pct_miss = np.mean(df[col].isnull()) * 100 if pct_miss > 0: mode = df[col].describe()[‘top’] df[col] = df[col].fillna(mode)
print(“#count for missing values”) for col in df.columns: pct_miss = np.mean(df[col].isnull()) * 100 if pct_miss >0: print(f”{col} -> {pct_miss}“)
#drop duplicate values print(“Shape before dropping duplicates: “,df.shape) df = df.drop(‘id’,axis=1).drop_duplicates() print(“Shape after dropping duplicates: “,df.shape)
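The idea behind the two fill steps above (median for numeric columns, most frequent value for non-numeric columns) can be sketched without pandas. The column values below are invented, `None` stands for a missing entry, and `fill_missing` is a hypothetical helper.

```python
from statistics import median, mode

# Invented column values; None marks a missing entry
children = [0, 2, None, 1, 0, None]
meal = ['BB', None, 'HB', 'BB', None, 'BB']

def fill_missing(values, fill_value):
    """Replace every None in values with fill_value."""
    return [fill_value if v is None else v for v in values]

# Numeric column: fill with the median of the known values
children_median = median(v for v in children if v is not None)
children_filled = fill_missing(children, children_median)

# Non-numeric column: fill with the most frequent known value
meal_mode = mode(v for v in meal if v is not None)
meal_filled = fill_missing(meal, meal_mode)

print(children_median, children_filled)
print(meal_mode, meal_filled)
```

The median is preferred over the mean for numeric columns because a few extreme values do not pull it away from the typical value.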
Spark SQL is Apache Spark’s module for working with structured data. The SQL Syntax section describes the SQL syntax in detail along with usage examples when applicable. This document provides a list of Data Definition and Data Manipulation Statements, as well as Data Retrieval and Auxiliary Statements.
”’ print() – displays the content on the screen functions have () after the name python commands are case sensitive- Print is not same as print ”’ # idgjdsigjfigj # comments mean that you are asking computer to ignore them print(5) print(5+3) print(‘5+3’) print(“5+3”) print(‘5+2*3=’,5+2*3,“and 4*3=”,4*3)
print(“Hello How are you?”);print(‘Hello How are you?’) # print always starts from a new line # escape sequence: \n (newline) \t for tab spaces print(“How are you doing? \nWhere are you \tgoing?”); # What’s your name? print(“What’s your name?”) # He asked me,”What’s your name?” print(“He asked me,\”What’s your name?\”“,end=“\n“)
# He asked me,\”What’s your name?\” print(“He asked me,\\\”What’s your name?\\\”“,end=“\n“) print(“Hello”,end=” – “) print(“How are you?”)
print(“Basic data types in Python”) # numeric – int (integer)- -99, -4,0,5,888: no decimal values marks1 = 43 marks2 = 87 print(“Marks1 =”,marks1) marks1 = 99 print(marks1) # function: type() – it gives the datatype print(type(marks1)) #<class ‘int’> marks = 87.0 # <class ‘float’> print(type(marks))
# complex: square root of -1: j calc = 3j * 4j print(calc) # 12 j-square = -12 + 0j print(‘Data type of calc = ‘,type(calc))
# int float complex
a = -55
print(type(a))
a = -55.0
print(type(a))
a = -55j
print(type(a))
# str – string – text print(“HELLO”) name=“Sachin” print(name) print(“type = “,type(name)) name=‘Virat kohli leads \nbangalore team in IPL’ print(name) print(“type = “,type(name))
name=”’Rohit is the captain of Indian team He opens in the ODIs”’ print(name) print(“type = “,type(name))
name=“””Rohit led the Indian team in 2023 ODI World cup and reached finals””” print(name) print(“type = “,type(name))
#5th data type – Bool boolean – 2 values: True and False val1 = True # False print(type(val1))
# Formatting the print statement quantity = 12 price = 39 total = quantity * price print(“Total cost of”,quantity,“books which costs per copy Rs”,price,“will be Rs”,total) # f – string is used to format the output print(f”Total cost of {quantity} books which costs per copy Rs {price} will be Rs {total}“)
# f-string is used to format float values as well quantity, total = 12, 231.35 price = total/quantity print(f”Total cost of {quantity} books which costs per copy Rs {price:.1f} will be Rs {total}“)
# f-string for string values name,country,title=“Rohit”,“India”,“Captain” print(f”Player {name:<12} plays for {country:^10} and is the {title:>15} of the team”) name,country,title=“Mangbwabe”,“Zimbabwe”,“Wicket-keeper” print(f”Player {name:<12} plays for {country:^10} and is the {title:>15} of the team”)
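The alignment codes used in the f-strings above are `<` (left), `^` (center) and `>` (right), each followed by the field width. A short sketch with brackets added only to make the padding visible:

```python
name, score = "Rohit", 45.678

left = f"[{name:<8}]"     # pad on the right up to 8 characters
center = f"[{name:^9}]"   # pad on both sides up to 9 characters
right = f"[{name:>8}]"    # pad on the left up to 8 characters
rounded = f"{score:.1f}"  # round to 1 decimal place

print(left)     # [Rohit   ]
print(center)   # [  Rohit  ]
print(right)    # [   Rohit]
print(rounded)  # 45.7
```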
### INPUT ## to take input from the user ## input can take no or at max 1 parameter inp_val = int(input(“Enter first number: “)) print(inp_val) print(“Datatype of input=”,type(inp_val)) inp_val2 = int(input(“Enter second number: “)) print(“Sum of two numbers=”,inp_val+inp_val2)
## change below programs to accept the values from the user using input
# 1. write a program to calculate area and perimeter of a rectangle
l = 50
b = 20
area = l*b
peri = 2*(l+b)
print(f"Area and perimeter of a rectangle with length {l} and breadth {b} is {area} and {peri} respectively")
# 2. write a program to calculate area and perimeter of a square
#### Assignment ##
# 3. write a program to calculate volume and surface area of a cone
#### Assignment ##
# 4. write a program to calculate volume and surface area of a cylinder
#### Assignment ##
# 5. write a program to calculate area and circumference of a circle
r = 50
pi = 3.14  # approximate value of pi
area = pi*r**2
cir = 2*pi*r
print(f"Area and circumference of a circle with radius {r} is {area} and {cir} respectively")
# input() – read input from the user num1 = int(input(“Enter first number:”)) print(“type = “,type(num1)) num2 = int(input(“Enter second number:”)) print(“Sum is “,num1+num2)
# calculate area and perimeter for a rectangle length=float(input(“Enter length of the rectangle:”)) breadth=float(input(“Enter breadth of the rectangle:”)) perimeter = (length+breadth)*2 print(“Perimeter of the rectangle is”,perimeter)
# int() - to convert to int
# similarly you can use float(), str(), bool(), complex()

# operators:
# Arithmetic operators: + - * / ** // % (modulo - remainder)
num1 = 11  # assignment operator =: we are assigning value 11 to num1
num2 = 3
print(num1 + num2)
print(num1 - num2)
print(num1 * num2)
print(num1 / num2)
print(num1 ** num2)  # power
print(num1 // num2)  # integer division
print(num1 % num2)   # remainder

## relational operators (comparison)
## > >= < <= == != (is it?)
## output is always bool (True or False)
num1, num2, num3 = 11, 9, 11
print("Relational : ", num1 > num2)   # T
print("Relational : ", num1 >= num3)  # T
print("Relational : ", num1 < num2)   # F
print("Relational : ", num1 <= num3)  # T
print("Relational : ", num1 == num2)  # F
print("Relational : ", num1 == num3)  # T
print("Relational : ", num1 != num2)  # T
print("Relational : ", num1 != num3)  # F
print("Relational : ", num1 > num3)   # F
print("Relational : ", num1 < num3)   # F

# Logical operators: and or not
# input and output are both bool values
'''
Prediction 1: Rohit and Ishan will open the batting
Prediction 2: Rohit or Ishan will open the batting
Actual: Rohit and Gill opened the batting
Prediction 1 False
Prediction 2 True

Truth Table:
AND (*)
T and T = T
T and F = F
F and T = F
F and F = F

OR (+)
T or T = T
T or F = T
F or T = T
F or F = F

not T = F
not F = T
'''
num1, num2, num3 = 11, 9, 11
print(not(num1 > num2 and num1 >= num3 or num1 < num2 or num1 <= num3 and num1 == num2 and num1 == num3 or num1 != num2 or num1 != num3 and num1 > num3 or num1 < num3))
# expression inside not():
# T and T or F or T and F and T or T or F and F or F
# T or F or F or T or F or F
# T, so not(T) prints False

# int to binary and vice-versa
num1 = 34
print("Binary of num1=", bin(34))
num2 = 0b100010
print("Integer of num2=", int(num2))
print(oct(34))  # 0o42
print(hex(34))  # 0x22

# Bitwise: & (bitwise and) | (bitwise or) >> (right shift) << (left shift)
num1 = 23  # 0b10111
num2 = 31  # 0b11111
print(bin(num1), "and", bin(num2))
'''
bitwise &
10111
11111
-----
10111
'''
print(int(0b10111))  # 23
print("23 & 31 = ", 23 & 31)  # 23
'''
bitwise |
10111
11111
-----
11111
'''
print("23 | 31 = ", 23 | 31)  # 31
'''
THTO
54320
'''
print("23 << 2:", 23 << 2)  # 92
'''
10111 << 2 gives 1011100
'''
print(int(0b1011100))
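The right shift operator `>>` is listed above but not demonstrated. The sketch below shows it next to the left shift, and also the evaluation order of the logical operators (`not` first, then `and`, then `or`), which is what the long logical expression earlier relies on.

```python
num = 23  # 0b10111

# Right shift drops the lowest bits: 10111 >> 2 gives 101
print(bin(num >> 2), num >> 2)   # 0b101 5

# Shifting left by k multiplies by 2**k; shifting right floor-divides by 2**k
print(num << 2 == num * 2**2)    # True
print(num >> 2 == num // 2**2)   # True

# Precedence: not is evaluated before and, and is evaluated before or
print(True or False and False)    # True  (read as: True or (False and False))
print((True or False) and False)  # False (brackets change the order)
print(not True or True)           # True  (read as: (not True) or True)
```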
# conditions ”’ display message after checking if the student has passed or failed the exam condition is avg >= 40 to pass if command checks the condition is Python syntax: if condition : # perform things when the condition is true Title * sub o ss i. ii. ”’ avg =82 if avg >=40: print(“Congratulations!”) print(“You’ve passed!”)
print(“Thank you”) ”’ Check avg and print Pass or Fail ”’ avg = 19 if avg >=40: print(“Pass”) else: print(“Fail”)
# IF – condition – will always result into True or False num1 = 71.000000001 num2 = 71 # if num1 is greater than num2 then I want to print How are you? otherwise do nothing if num1 > num2: print(“How are you?”) print(“Where are you going?”)
print(“Thank you”)
# if num1 is greater than num2 then I want to print How are you? otherwise print Do nothing if num1 > num2: print(“How are you?”) print(“Where are you going?”) else: print(“Do Nothing”)
”’ Input a number from the user and check if its +ve, -ve or zero ”’ val = int(input(“Enter a number: “)) print(“Type of data =”,type(val))
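# IF - ELIF - ELSE
if val == 0:  # == is to check the equality
    print("Its Zero")
elif val < 0:
    print("Its -ve number")
else:
    print("Its +ve number")

# same check written as three separate if statements: every condition is
# tested, so the conditions must not overlap (use < and >, not <= and >=)
if val == 0:
    print("Its Zero")
if val < 0:
    print("Its -ve number")
if val > 0:
    print("Its +ve number")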
”’ Write a program to take 2 inputs from the user and check if the first number is greater, smaller or equal to the second one ”’ num1 = int(input(“Enter first number: “)) num2 = int(input(“Enter second number: “)) if num1 > num2: print(num1,“is greater than”,num2) elif num1 < num2: print(num1,“is less than”,num2) else: print(num1, “and”, num2,“are equal”)
”’ WAP to take marks in 5 subjects as input, calculate total and average and assign grade based on below condition: a. avg 85 – Grade A b. avg 70-85 – Grade B c. avg 60-70 – Grade C d. avg 50-60 – Grade D e. avg 40 -50 – Grade E f. avg <40 – Grade F ”’ marks1 = float(input(“Enter the marks in subject 1: “)) marks2 = float(input(“Enter the marks in subject 2: “)) marks3 = float(input(“Enter the marks in subject 3: “)) marks4 = float(input(“Enter the marks in subject 4: “)) marks5 = float(input(“Enter the marks in subject 5: “)) total = marks1 + marks5 + marks4 + marks3 + marks2 avg = total / 5 print(f”Total marks is {total:.2f} and average is {avg:.2f}“) if avg>=85: print(“Grade A”) elif avg>=70: print(“Grade B”) elif avg>=60: print(“Grade C”) elif avg>=50: print(“Grade D”) elif avg>=40: print(“Grade E”) else: print(“Grade F”)
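The grade rules work because the conditions are checked from the highest cutoff downwards, so the first match wins. The same idea can be sketched as a function that walks a list of cutoffs; `grade` is a hypothetical helper, not part of the program above.

```python
def grade(avg):
    """Return the grade letter for an average, using the cutoffs 85/70/60/50/40."""
    # Cutoffs must be listed from highest to lowest: the first one that
    # the average reaches decides the grade.
    cutoffs = [(85, 'A'), (70, 'B'), (60, 'C'), (50, 'D'), (40, 'E')]
    for cutoff, letter in cutoffs:
        if avg >= cutoff:
            return letter
    return 'F'

for avg in (92, 85, 84.99, 40, 39.9):
    print(avg, "->", "Grade", grade(avg))
```

Testing the boundary values (85, 84.99, 40, 39.9) is a quick way to confirm that each range starts where it should.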
”’ Let’s write a program to read length and breadth from the user check if its square or rectangle and calculate area and perimeter ”’ length = int(input(‘Enter the length: ‘)) breadth = int(input(‘Enter the breadth: ‘)) #and & or are logical operator which connects you conditonal statements # and: both the statements need to be true for True else its false # or: both the statements need to be false for False else its True if length>0 and breadth >0: print(“Rectangle and Square both possible”) if length==breadth: print(“Square”) print(f”Area is {length**2} and the perimeter is {4*length}“) else: print(“Rectangle”) print(f”Area is {length * breadth} and the perimeter is {2 * (length+breadth)}“) else: print(“Neither Rectangle nor Square possible”)
'''
check if a number is positive, negative or zero
if the number is -ve, find the square root
if the number is positive, check if it has 2 digits or not
if 2 digits then interchange the digits
otherwise, check if it is divisible by 15
'''
num1 = int(input("Enter a number: "))
if num1 < 0:
    print("This is negative")
    # the square root of a negative number is a complex number in Python
    print(f"Square root of {num1} is {num1**0.5}")
elif num1 == 0:
    print("This is zero")
else:
    print("This is positive")
    if num1 > 9 and num1 < 100:
        # interchange the digits: e.g. 35 becomes 53
        d = num1 // 10  # integer division by 10 gives the tens digit
        r = num1 % 10   # remainder gives the ones digit
        new_num1 = r*10 + d
        print(f"{num1} is now made into {new_num1}")
    else:
        if num1 % 15 == 0:  # % mod - will give you remainder
            print("Number is divisible by 15")
        else:
            print("Number is not divisible by 15")
# LOOPS - repeat the given block of code multiple times
# when you know exactly how many times to run - for
# when repetition is done based on a certain condition - while

# range(start,end,increment) - generates a range of values from start up to end
# (end is not included) by increasing each element by increment
# range(6,18,3): 6, 9, 12, 15
# range(start,end): increment is default 1
# range(15,19): 15, 16, 17, 18
# range(end): start = 0, increment = 1
# range(6): 0, 1, 2, 3, 4, 5
# print(), input(), type(), int(), str(), complex(), bool(), float()
for var in range(6,18,3):
    print("Hello from the loop!")
    print("Value of var is", var)
for count in range(15,19): print(“Hello from the loop2!”) print(“Value of var is”,count)
for count in range(4): print(“Hello from the loop3!”) print(“Value of var is”,count)
### for i in range(5): print(“*”,end=” “) print() for i in range(1,101): print(i,end=“, “) print() ”’ Generate odd numbers between 1 and 30 ”’ for i in range(1,30,2): print(i,end=“, “) print() ”’ Generate first 10 even numbers ”’ start = 0 for i in range(10): print(start,end=“, “) start=start+2 print()
# for loop examples
'''
Print all the numbers between 1 and 10000 which are perfectly
divisible by both 19 and 51
'''
start, end = 1, 10001
num1, num2 = 19, 51
for n in range(start, end):
    if n%num1 == 0 and n%num2 == 0:
        print(n, end=", ")
print()

'''
Generate prime numbers in a given range (here 40000 to 42000)
'''
start, end = 40000, 42000
for n in range(start, end):
    isPrime = True
    # a number with no divisor between 2 and n//2 is prime
    for num in range(2, n//2+1):
        if n % num == 0:
            isPrime = False
            break
    if isPrime:
        print(n, end=", ")
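The prime check above tries every divisor up to n//2. It is enough to try divisors up to the square root of n, because any factor larger than the square root is paired with one smaller than it. A minimal sketch, with `is_prime` as a hypothetical helper name:

```python
import math

def is_prime(n):
    """Return True if n is a prime number."""
    if n < 2:
        return False
    # math.isqrt(n) is the integer square root (Python 3.8+)
    for d in range(2, math.isqrt(n) + 1):
        if n % d == 0:
            return False
    return True

small_primes = [n for n in range(1, 30) if is_prime(n)]
print(small_primes)  # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```

For numbers around 40000 this means roughly 200 divisions per number instead of about 20000.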
”’ Print different * patterns ”’ for i in range(5): print(“*”)
'''
* * * * *
* * * * *
* * * * *
* * * * *
* * * * *
'''
for j in range(5):
    for i in range(5):
        print("*", end=" ")
    print()

'''
*
* *
* * *
* * * *
* * * * *
'''
for j in range(5):
    for i in range(1+j):
        print("*", end=" ")
    print()

'''
* * * * *
* * * *
* * *
* *
*
'''
for j in range(5):
    for i in range(5-j):
        print("*", end=" ")
    print()

'''
* * * * *
 * * * *
  * * *
   * *
    *
'''
for j in range(5):
    for i in range(j):
        print(" ", end="")
    for i in range(5-j):
        print("*", end=" ")
    print()
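The inner loops in these patterns only repeat a character, so they can also be written with string multiplication. A small sketch of the growing triangle, where `triangle` is a hypothetical helper that returns the lines instead of printing them directly:

```python
def triangle(rows):
    """Return the lines of a left-aligned star triangle with the given rows."""
    # "* " * k repeats the two-character string k times
    return ["* " * (j + 1) for j in range(rows)]

for line in triangle(5):
    print(line)
```

Returning the lines as a list keeps the pattern logic separate from the printing, which makes it easier to check.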
”’ Assignment: * * * * * * * * * * * * * * * Solve assignments from the website ”’ ## WHILE Loop ”’ WAP to print hello till user says no ”’ while True: print(“HELLO 1”) usr_inp=input(“Enter N to stop: “) if usr_inp.lower()==“n”: break print(“====”) usr_inp=input(“Enter N to stop: “) while usr_inp.lower() !=‘n’: print(“HELLO 2”) usr_inp = input(“Enter N to stop: “)
”’ A company offers dearness allowance (DA) of 40% of basic pay and house rent allowance (HRA) of 10% of basic pay. Input basic pay of an employee, calculate his/her DA, HRA and Gross pay (Gross = Basic Pay + DA+ HRA). a. Modify the above scenario, such that the DA and HRA percentages are also given as inputs. b. Update the program such that the program uses a user-defined function for calculating the Gross pay. The function takes Basic pay, DA percentage and HRA percentage as inputs and returns the gross pay. ”’ #Case 1 basic_pay = int(input(“Enter your basic pay:”)) da = basic_pay *0.4 hra = basic_pay*0.1 gross_pay = basic_pay + da + hra print(“Your gross pay for this month is Rs”,gross_pay)
#Case 2 basic_pay = int(input(“Enter your basic pay:”)) da = int(input(“Enter the dearness allowance (%): “)) da = da/100 hra = int(input(“Enter the House rent allowance (%): “)) hra = hra/100 gross_pay = basic_pay + basic_pay*da + basic_pay*hra print(“Your gross pay for this month is Rs”,gross_pay) #case 3 # defining a user defined function (udf) # input taken by the function – passing the value # and anything returned from the function – function returns the output def calc_gross_pay(bp,da,hra=10): hra = hra / 100 da = da / 100 gross_pay = bp + bp * da + bp * hra return gross_pay
basic_pay = int(input(“Enter your basic pay:”)) da = int(input(“Enter the dearness allowance (%): “)) hra = int(input(“Enter the House rent allowance (%): “))
result = calc_gross_pay(basic_pay,da,hra) print(“Your gross pay for this month is Rs”,result)
result = calc_gross_pay(basic_pay,da) print(“Your gross pay with default hra for this month is Rs”,result)
result = calc_gross_pay(da=da,bp=basic_pay,hra=hra) print(“Your gross pay with non-positional for this month is Rs”,result)
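The three calls above differ only in how the arguments are passed. The sketch below repeats a compact version of the function with fixed numbers instead of `input()`, so the results can be checked by hand.

```python
def calc_gross_pay(bp, da, hra=10):
    """bp and da are required; hra is optional and defaults to 10 (percent)."""
    return bp + bp * da / 100 + bp * hra / 100

# positional arguments: matched by order, default hra is used
print(calc_gross_pay(1000, 40))                # 1500.0
# positional arguments with hra supplied
print(calc_gross_pay(1000, 40, 20))            # 1600.0
# keyword arguments: matched by name, so the order does not matter
print(calc_gross_pay(hra=20, da=40, bp=1000))  # 1600.0

# leaving out a required argument raises a TypeError
try:
    calc_gross_pay(1000)
except TypeError as err:
    print("Missing required argument:", err)
```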
# required positional arguments
# default (non-required)
'''
You have a monthly income of Rs 1100. Your monthly outgoings are as follows.
• Rent - Rs 500
• Food - Rs 300
• Electricity - Rs 40
• Phone - Rs 60
• Cable TV - Rs 30
Calculate the Monthly Expenses and the remainder (what's left over each month).
a. Modify the above program by inputting the income as well as values for
expenses and calculate Monthly expense.
b. Include a function to check whether you will have savings or you have to
borrow money based on the monthly income and total expenses. The function
should print an appropriate message for each case.
'''
#case 1
income = 1100
Rent = 500
Food = 300
Electricity = 40
Phone = 60
Cable = 30
expenses = Rent+Food+Electricity+Phone+Cable
remainder = income-expenses
print("Your expenses for this month is", expenses)
print("Your remainder for this month is", remainder)

#case 2
income = int(input("Enter your Income:"))
Rent = int(input("Enter your rent:"))
Food = int(input("Enter your food expenses:"))
Electricity = int(input("Enter your Electricity charges:"))
Phone = int(input("Enter your Phone expenses:"))
Cable = int(input("Enter your Cable TV expenses:"))
expenses = Rent+Food+Electricity+Phone+Cable
remainder = income-expenses
print("Your expenses for this month is", expenses)
print("Your remainder for this month is", remainder)
# case 3
def check_remainder(income, expenses):
    remainder = income - expenses
    if remainder < 0:
        # remainder is negative here, so show the shortfall as a positive amount
        print(f"You need to borrow Rs {-remainder} for this month")
    elif remainder > 0:
        print(f"You have a savings of Rs {remainder} for this month")
    else:
        print("This month you neither have savings nor need to borrow any money")
income = int(input(“Enter your Income:”)) Rent= int(input(“Enter your rent:”)) Food= int(input(“Enter your food expenses:”)) Electricity= int(input(“Enter your Electricity charges:”)) Phone= int(input(“Enter your Phone expenses:”)) Cable= int(input(“Enter your Cable TV expenses:”)) expenses = Rent+Food+Electricity+Phone+Cable check_remainder(income,expenses)
########## PRACTICE #################
# defining a user defined function (udf)
# input taken by the function - passing the value
# and anything returned from the function - function returns the output
def calc_gross_pay(n1, n2):
    print("Hi, I am in calc_gross_pay_function")
    total = n1 + n2
    #print(total)
    return total

val1 = 100
val2 = 150
ret_val = calc_gross_pay(val1, val2)  # calling the function: pass the values
print("Value returned from the function is", ret_val)
val1 = 10
val2 = 50
result = calc_gross_pay(val1, val2)  # calling the function: pass the values
print("Value returned from the function is", result)
# Guessing the number game: Computer v Human
# computer will think of the number and we will guess it
import random

num1 = random.randint(1, 100)
attempts = 0
fouls = 0
while True:
    guess = int(input("Guess the number between 1 and 100: "))
    if guess < 1 or guess > 100:
        print("Your guess is outside the valid number range! ", end=" ")
        if fouls == 0:
            print("This is your first foul, so you can continue but another foul will make you lose.")
        else:
            print("This is your second foul, sorry you lose.")
            break
        fouls += 1
        continue
    attempts += 1
    if num1 == guess:
        print(f"You guessed it right in {attempts} attempts!")
        break
    elif num1 > guess:
        print("Sorry! It's incorrect! Guess a higher number")
    else:
        print("Sorry! It's incorrect! Guess a lower number")
#############
# Guessing the number game: Computer v Computer
# computer will think of the number and it will also guess it
import random
import time  # date, datetime

start = time.time()
### finding average attempts of running this program
total_attempts = 0
for i in range(10000):
    num1 = random.randint(1, 100)
    attempts = 0
    fouls = 0
    low, high = 1, 100
    while True:
        guess = random.randint(low, high)
        # guess is always drawn from low..high, so this foul check never triggers here
        if guess < 1 or guess > 100:
            print("Your guess is outside the valid number range! ", end=" ")
            if fouls == 0:
                print("This is your first foul, so you can continue but another foul will make you lose.")
            else:
                print("This is your second foul, sorry you lose.")
                break
            fouls += 1
            continue
        attempts += 1
        if num1 == guess:
            print(f"You guessed it right in {attempts} attempts!")
            total_attempts += attempts  # a+=b => a = a+b ; a/=c => a = a/c
            break
        elif num1 > guess:
            print("Sorry! It's incorrect! Guess a higher number")
            low = guess + 1
        else:
            print("Sorry! It's incorrect! Guess a lower number")
            high = guess - 1
end = time.time()
print(f"On average this program has taken {total_attempts/10000:.1f} attempts")
print(f"Total time taken by the program to run 10000 times is {end-start} seconds")
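The computer player above picks a random number inside the remaining range. A common alternative is to always guess the midpoint of the range, which halves the range on every miss. The sketch below uses a hypothetical function name, `attempts_with_midpoint`, and counts how many guesses that strategy needs for every possible secret number from 1 to 100:

```python
def attempts_with_midpoint(secret, low=1, high=100):
    """Count the guesses needed when always guessing the middle of the range."""
    attempts = 0
    while low <= high:
        guess = (low + high) // 2
        attempts += 1
        if guess == secret:
            return attempts
        if guess < secret:
            low = guess + 1   # secret is higher, discard the lower half
        else:
            high = guess - 1  # secret is lower, discard the upper half
    raise ValueError("secret is outside the given range")


counts = [attempts_with_midpoint(n) for n in range(1, 101)]
print("Worst case:", max(counts))  # Worst case: 7
print("Average:", sum(counts) / len(counts))
```

With 100 candidates the midpoint strategy never needs more than 7 guesses, because 2**7 = 128 is the first power of two above 100.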
#############
# LIST
# collections: list, tuple, sets, dictionary, numpy, pandas
l1 = [10, 20, "30", False, "Hello", [1, 3, 5]]
print("Type of the variable = ", type(l1))
print("Size/Length of the list = ", len(l1))
# read the values of a list:
print(l1[0])

# indexing - forward (left to right, starting at 0)
print("First value - ", l1[0])
print("Third value - ", l1[2])
l1[0] = 55.5
print(l1)
# backward indexing - right to left, starting at -1
print("Last element - indexed as -1: ", l1[-1])
print("Third value from the end - ", l1[-3])
print("1st, 3rd and 5th values - ", l1[0], l1[2], l1[4])
# slicing: [start:stop:step], the stop index is excluded
print("First to third values - ", l1[0:3], l1[:3])
print("1st, 3rd and 5th values using a step of 2 - ", l1[0:5:2])
print("First to last values - ", l1[:])
print("Last three values - ", l1[-3:])

## using list in a for loop
for counter in l1:
    print("HELLO : ", counter, "has a data type of", type(counter))
## Properties of a list
l1 = [2, 3, 4]
l1.pop()  # pop without index will remove last element from the list
print("1. l1 = ", l1)
l1.pop(0)  # pop will remove the element at the given index
print("2. l1 = ", l1)
l1.append(5)  # append always adds the value at the end of the list
print("3. l1 = ", l1)
l1.append(1)
print("4. l1 = ", l1)
l1.append(8)
print("5. l1 = ", l1)
l1.sort()  # by default it sorts in ascending order
print("6. l1 = ", l1)
l1.sort(reverse=True)  # will sort in descending order
print("7. l1 = ", l1)

## creating duplicate list
l2 = l1         # no copy at all: both names point to the same list object
l3 = l1.copy()  # shallow copy: a new, separate list object
print("11 L1 = ", l1)
print("11 L2 = ", l2)
print("11 L3 = ", l3)
l1.append(33)
l2.append(43)  # also visible through l1
l3.append(53)  # only changes l3
l1.append(3)
print("12 L1 = ", l1)
print("12 L2 = ", l2)
print("12 L3 = ", l3)

# index(value, start, stop) - what is the index of the value between start and stop
# start defaults to 0, stop defaults to the end of the list
print("Index of 3: ", l1.index(3, 3, 10))
# REMEMBER: index() raises ValueError when the value is not in the list
# count() counts the occurrences of a value, but it takes only the value
num = l1.count(3)
print("Number of 3 in the list is", num)
l1_dup = l1[3:11]
num = l1_dup.count(3)
# above 2 statements can be clubbed into one as shown below:
num = l1[3:11].count(3)
print("Number of 3 in the given range is", num)
print("List before reverse is: ", l1)
l1.reverse()
print("List after reverse is: ", l1)
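The difference between assigning a list to a second name, taking a shallow copy, and taking a deep copy only shows up once the list contains another mutable object. The following sketch, with made-up variable names, uses the standard `copy` module to show all three side by side:

```python
import copy

original = [1, [2, 3]]
alias = original                # same object under a second name
shallow = original.copy()       # new outer list, inner list still shared
deep = copy.deepcopy(original)  # new outer list and new inner list

original.append(4)       # changes the outer list
original[1].append(99)   # changes the shared inner list

print(alias)    # [1, [2, 3, 99], 4]
print(shallow)  # [1, [2, 3, 99]]
print(deep)     # [1, [2, 3]]
```

For a flat list of numbers, as in the notes above, `copy()` is enough; `copy.deepcopy()` matters when the elements themselves are lists or dictionaries.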
# + will perform: c = a + b
# extend will be like a = a + b
list4 = [11, 22, 33]
l1.extend(list4)
print("L1 after extend: ", l1)

# pop takes an index - remove takes the value to remove/delete from the list
l1.remove(3)
cnt = l1.count(18)
if cnt > 0:  # remove() raises ValueError if the value is missing, so check first
    l1.remove(18)
print("1. After remove: ", l1)
# append() will always add at the end, insert takes the position along with the value
# first it takes the index, then the value to add
l1.insert(2, 32)
print("1. INSERT 1=", l1)
l1.insert(2, 42)
print("2. INSERT 2=", l1)
l1.clear()  # will clear the data from the list
print("99 List1 = ", l1)
# I want to input marks of 5 students in 5 subjects
students_marks = []
for j in range(5):
    all_marks = []
    for i in range(5):
        marks = int(input("Enter the marks in subject " + str(i+1) + ": "))
        all_marks.append(marks)
    print(f"Marks obtained by student {j+1}: {all_marks}")
    students_marks.append(all_marks)
print("Marks obtained by students are:\n", students_marks)
students_marks = [[66, 55, 77, 88, 99], [45, 65, 76, 78, 98], [90, 80, 45, 55, 55],
                  [54, 64, 74, 84, 94], [34, 53, 99, 66, 76]]
subjects = ["Maths", "Stats", "Physics", "Programming", "SQL"]
for k in range(len(students_marks)):
    total = sum(students_marks[k])
    # divide by the number of subjects of this student, not by the number of students
    print(f"Total marks obtained by student {k+1} is {total} and average "
          f"is {total/len(students_marks[k])}")
    max_marks = max(students_marks[k])
    print(f"Highest marks obtained by student {k + 1} is {max_marks} "
          f"in subject {subjects[students_marks[k].index(max_marks)]}")
# TUPLE: linear ordered immutable collection
# tuple declared using ()
t1 = ()
print("Type of t1 = ", type(t1))
# a one-element tuple needs the trailing comma; (5+3)*2 uses () only for grouping
t1 = ("hello",)
print(t1)
print("Type of t1 = ", type(t1))

t1 = (5, 4, 6, 9, 1)
print(t1)
print("Type of t1 = ", type(t1))
# indexing is exactly the same as for a list
#t1[0]=8 - TypeError: 'tuple' object does not support item assignment
for i in t1:
    print("from tuple: ", i)
t1 = list(t1)   # converting tuple to list
t1 = tuple(t1)  # converting list to tuple

############
## STRING - str
###########
# there is no difference between declaring a string using ' or " quotes
# and there is no difference between ''' and """ strings
# ' or " declares only 1 line of text but ''' and """ can be used
# to declare multiple lines of text
str1 = "hello"
#str1[0]="H" - TypeError: 'str' object does not support item assignment
# strings are immutable
# strings are indexed the same way as a list or tuple:
# 0 to n-1 indexing and -1 to -n indexing
str2 = 'hi there'
str3 = """How are you?
Where are you?
What are you doing?"""
str4 = '''I am fine
I am here
I am doing nothing'''
print(type(str1), type(str2), type(str3), type(str4))
print("Str1 \n------------")
print(str1)
print("Str2 \n------------")
print(str2)
print("Str3 \n------------")
print(str3)
print("Str4 \n------------")
print(str4)

str11 = str1.upper()
print(str1, str11)
str22 = "Hello " + "There " * 2
print("Str22 = ", str22)
# strings are used in a for loop exactly the same way as a list or tuple
for i in str1:
    print("STR = ", i)
# Strings - in python
str1 = "HELLO"
str3 = '''How Are YoU?'''
str2 = "123456"
# methods with is...() - is it ... ?
print("isupper: ", str1.isupper())
print("islower: ", str3.islower())
print("istitle: ", str3.istitle())
print("isnumeric: ", str2.isnumeric())
print("isalnum: ", str2.isalnum())
print("title: ", str3.title())
print("lower: ", str3.lower())
print("upper: ", str3.upper())

str3 = '''How Are YoU?'''
print("startswith: ", str3.startswith("H"))
print("endswith: ", str3.endswith("?"))
usname = input("Enter your username (only text and numbers allowed): ")
if usname.isalnum():
    print("Username accepted")
else:
    print("Invalid username!")
num1 = input("Enter length: ")
if num1.isnumeric():
    num1 = int(num1)
else:
    print("Invalid number")

str4 = "abcdefghijklmnopqrstuvwxyz"
# I want to check if the starting
# character is A and the ending is Z
if str4.upper().startswith("A") and str4.upper().endswith("Z"):
    print("Your condition is true")
else:
    print("Incorrect condition")

while True:
    inp = input("Enter Yes to stop and any key to continue: ")
    if inp.title() == "Yes":
        break

str5 = "Enter Yes to stop and any key to continue: "
str_words = str5.split()
print(str_words)
# join() will take a list as input
print("JOIN: ", " ".join(str_words))
str_hyphen = "-:-".join(str_words)
print("New Statement: ", str_hyphen)
# need to split this special text
str_words = str_hyphen.split("-:-")
print("STR HYPHEN: ", str_words)
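str1 = "How are you going?"
str1 = str1.replace("g", "d")  # replaces every "g"
print("1. Str1 = ", str1)

str1 = "How are you going you?"
str1 = str1.replace("g", "d", 1)  # third argument limits the number of replacements
print("2. Str1 = ", str1)

# is "you" in str1 or not
# find(value, start, stop) returns -1 when the value is not found,
# otherwise the index (0 or higher) of the first match
print(str1)
print(str1.upper().find("YOU", 9, 21))

str1 = " How are you going you? "
print(str1.strip())  # removes leading and trailing whitespace
str1 = str1.split()
str1 = " ".join(str1)  # split + join also collapses repeated spaces inside the text
print(str1)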
######### DICTIONARY ##########
## mutable collection: pairs of key and value (key:value)
## values are looked up by key, not by position
## (since Python 3.7 a dictionary keeps its insertion order)
dict1 = {1: "Hello", "Name": "Sachin", "Runs": 35000}
print(dict1)
print(dict1[1])
print(dict1["Runs"])
print(dict1.values())
print(dict1.keys())
print(dict1.items())

'''
Write a program to store marks of 5 subjects along with names
'''
master_data = {}
for i in range(3):
    name = input("Enter the student's name: ")
    marks = []
    for j in range(5):
        m1 = int(input("Enter the marks in Subject " + str(j+1) + ": "))
        marks.append(m1)
    t_dict = {name: marks}
    master_data.update(t_dict)

# assignment vs shallow copy
data = master_data
data2 = data         # no copy: both names point to the same dictionary
data3 = data.copy()  # shallow copy - a photocopy, another dict object
data2.update({'Rohit': [66, 67, 78, 77, 82]})
print("Data: ", data)      # contains Rohit
print("Data 2: ", data2)   # contains Rohit
print("Data 3: ", data3)   # does not contain Rohit
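Once the marks are stored as a name-to-list dictionary, summaries can be computed directly from it. The sketch below assumes a small hand-written `master_data` (the marks for Sachin are made up for illustration; those for Rohit are the ones used above) and builds a second dictionary of averages with a dictionary comprehension:

```python
# Illustrative data in the same shape as master_data above
master_data = {
    "Sachin": [66, 55, 77, 88, 99],
    "Rohit": [66, 67, 78, 77, 82],
}

# name -> average marks
averages = {name: sum(marks) / len(marks) for name, marks in master_data.items()}

# max() over a dict iterates its keys; key=averages.get ranks them by value
topper = max(averages, key=averages.get)

print(averages)  # {'Sachin': 77.0, 'Rohit': 74.0}
print(topper)    # Sachin
```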
# Functions
'''
Write a function to check if the number is prime or not
and use this to generate prime numbers
'''
def check_prime(val=53):
    '''
    This is a user defined function which takes a value as input
    and checks if it's a prime or not
    @Written by Sachin Kohli
    :param val: number to test
    :return: True if val is prime, otherwise False
    '''
    if val < 2:  # 0, 1 and negative numbers are not prime
        return False
    isPrime = True
    for i in range(2, val//2 + 1):
        if val % i == 0:
            isPrime = False
            break
    # if isPrime:
    #     print(f"{val} is a prime number")
    # else:
    #     print(f"{val} is not a prime number")
    return isPrime

# required positional argument
# default
# keyword
# check if val1 is greater than val2 then subtract
# otherwise add them
# def myfunction(val1=50, val2) would fail with:
# SyntaxError: non-default argument follows default argument
def myfunction(val1, val2=50):
    '''
    :param val1: required positional argument
    :param val2: optional argument, defaults to 50
    :return: None, the result is printed
    '''
    print(f"input values are: {val1} and {val2}")
    if val1 > val2:
        print("Subtraction = ", val1 - val2)
    else:
        print("Addition = ", val1 + val2)

# write a function to add all the given numbers
# * against the argument makes it take values as a tuple
# ** against the argument makes it take values as a dictionary
def add_all_num(*values, **data):
    print("add_all_num: *values passed are: ", values)
    print("add_all_num: **data values passed are: ", data)
    print("add_all_num: sum of values = ", sum(values))

if __name__ == "__main__":  # current file is running
    res = check_prime(41)
    myfunction(43, 33)
    myfunction(val2=10, val1=20)  # keywords, use exact same variable name
    res = check_prime()
    add_all_num(5, 6, 10, 12, 15, name="Sachin", game="Cricket", runs=50000)

    ''' generate prime numbers between 10,000 to 15,000 '''
    for num in range(10000, 15001):
        res = check_prime(num)
        if res:
            print(num, end=", ")
    print()
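Checking divisors up to `val//2` works, but a divisor larger than the square root of a number always pairs with one smaller than it, so the search can stop at the square root. The sketch below is a variant under a hypothetical name, `is_prime`, using `math.isqrt` (available from Python 3.8), and is not the function from the notes:

```python
import math


def is_prime(n):
    """Return True if n is prime, testing divisors only up to sqrt(n)."""
    if n < 2:
        return False
    for divisor in range(2, math.isqrt(n) + 1):
        if n % divisor == 0:
            return False
    return True


primes = [n for n in range(10000, 15001) if is_prime(n)]
print(primes[0])    # 10007
print(len(primes))  # how many primes lie between 10,000 and 15,000
```

For a number near 15,000 this tests about 120 candidate divisors instead of about 7,500.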
# Class and Objects
class Books:
    # functions which are part of a class are called methods
    # members of class can be variables and methods
    # object level members & class level members
    total_books = 0

    def __init__(self, title, author, price):
        self.title = title
        self.author = author
        self.cost = price
        Books.total_books += 1

    def print_info(self):
        print("Title of the book is", self.title)
        print("Author of the book is", self.author)
        print("Cost of the book is", self.cost)

    @classmethod
    def print_total(cls):
        print("Total books in the library are", cls.total_books)

class Library:
    total_lib = 0

    def __init__(self, name, loc, pincode):
        self.name = name
        self.location = loc
        self.pin = pincode

'''
Create a class called MyMathOps and add functionalities for
Addition, Subtraction, Power, Multiplication and Division
You should have following methods:
1. __init__() - to get 2 values
2. calc_add() - perform addition
3. display_add() - to print the total
4. calc_sub() - perform subtraction
5. display_sub() - to print the difference
6. calc_power() - perform exponentiation
7. display_power() - to print the power
8. calc_mul() - perform multiplication
9. display_mul() - to print the product
10. calc_div() - perform division
11. display_div() - to print the quotient
'''

'''
Properties of class & objects:
1. Encapsulation
2. Inheritance
3. Polymorphism
4. Abstraction
#Accessibility: public (var), private (__var), protected (_var)
'''
# base class shared by books and magazines
class LibraryContent:
    def __init__(self, title, price):
        self.title = title
        self.cost = price

    def __print_data(self):  # private: the name is mangled outside the class
        print("data from Library Content")

    def print_info(self):
        print("info from Library content")

    def display_something(self):
        print("Do Nothing")

class Magazines(LibraryContent):
    total_mags = 0

    def __init__(self, title, issn, price):
        LibraryContent.__init__(self, title, price)
        self.issn = issn
        Magazines.total_mags += 1  # count magazines, not books

    def print_info(self):  # overrides LibraryContent.print_info
        print("Title of the magazine is", self.title)
        print("ISSN of the magazine is", self.issn)
        print("Cost of the magazine is", self.cost)

    @classmethod
    def print_total(cls):
        print("Total magazines in the library are", cls.total_mags)

class Books(LibraryContent):
    # functions which are part of a class are called methods
    # members of class can be variables and methods
    # object level members & class level members
    total_books = 0

    def __init__(self, title, author, price):
        LibraryContent.__init__(self, title, price)
        self.author = author
        Books.total_books += 1

    def print_info(self):
        print("Title of the book is", self.title)
        print("Author of the book is", self.author)
        print("Cost of the book is", self.cost)

    @classmethod
    def print_total(cls):
        print("Total books in the library are", cls.total_books)

class Library:
    total_lib = 0

    def __init__(self, name, loc, pincode):
        self.name = name
        self.location = loc
        self.pin = pincode

if __name__ == "__main__":
    m1 = Magazines("International Journal for Robotics", "247-9988", 19800)
    m1.print_info()
    b1 = Books("Python Book", "Virat", 299)
    b1.print_info()
    m1.print_info()
    #m1.__print_data()  # AttributeError: private method is not reachable by this name
    #b1.print_data()    # AttributeError: no such method
    m1.display_something()  # inherited from LibraryContent
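The commented-out call `m1.__print_data()` fails because Python rewrites names that start with two underscores inside a class body (name mangling). The sketch below uses a cut-down `LibraryContent` with a hypothetical extra method `show()` to illustrate what is and is not reachable from outside:

```python
class LibraryContent:
    def __init__(self, title):
        self.title = title

    def __print_data(self):          # stored as _LibraryContent__print_data
        return "data from Library Content"

    def show(self):
        # inside the class the short name works as written
        return self.__print_data()


item = LibraryContent("Python Book")
print(item.show())  # data from Library Content

try:
    item.__print_data()              # outside the class the name is not mangled
except AttributeError as err:
    print("Not accessible:", err)

# The mangled name still exists, so "private" is a convention, not a lock
print(item._LibraryContent__print_data())  # data from Library Content
```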
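############ ANOTHER FILE #################
# Working with files:
# modes: r (read), w (write), a (append)
# r+, w+, a+ (the + adds the other direction: read and write)
filename = "17DEC.txt"
fileobj = open(filename, "a+")  # append and read; if no mode is given, open() defaults to read mode
fileobj.write('''Twinkle Twinkle little star
How I wonder what you are''')
fileobj.seek(0)  # moves the read position; in append mode writes still go to the end
fileobj.write('''Twinkle X Twinkle X little star
How I wonder what you are''')
fileobj.seek(0)  # go back to the start, otherwise read() returns an empty string
cont = fileobj.read()
print(cont)
fileobj.close()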
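The `a+` mode has two properties that are easy to miss: writes land at the end of the file even after `seek(0)`, and `read()` returns nothing unless the position is moved back to the start first. The sketch below demonstrates both on a throwaway file in a temporary directory (the file name `demo.txt` is arbitrary); the append behaviour is what Python documents for common platforms, so treat it as an illustration rather than a guarantee for every file system:

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "demo.txt")

with open(path, "a+") as fileobj:
    fileobj.write("first\n")
    fileobj.seek(0)            # only affects where the next read starts
    fileobj.write("second\n")  # append mode: still written at the end
    at_end = fileobj.read()    # position is at the end, nothing left to read
    fileobj.seek(0)
    content = fileobj.read()   # now the whole file is returned

print(repr(at_end))   # ''
print(repr(content))  # 'first\nsecond\n'
```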
l1 = [10, 40, 20, 50, 30, 60]
# l1 = [10,20,30,40,50,60]
# comparisons per pass for 6 values: 5 4 3 2 1
# Bubble Sort: compare neighbours and swap; the largest value moves to the end in each pass
l1 = [60, 50, 40, 30, 20, 10]
for i in range(len(l1) - 1):
    for j in range(len(l1) - 1 - i):
        if l1[j] > l1[j+1]:
            l1[j], l1[j+1] = l1[j+1], l1[j]
print("Sorted L1 = ", l1)

l1 = [10, 40, 20, 50, 30, 60]
l1 = [60, 50, 40, 30, 20, 10]
# Selection Sort: position i ends up holding the smallest of the remaining values
for i in range(len(l1) - 1):
    for j in range(1 + i, len(l1)):
        if l1[i] > l1[j]:
            l1[i], l1[j] = l1[j], l1[i]
print("Sorted L1 = ", l1)
## check if 50 is in the list or not
element = 50
l1 = [10, 40, 20, 50, 30, 60]
found = False
for i in l1:
    if element == i:
        found = True
if found:
    print("Element is in the list!")
else:
    print("Element is not in the list")

# sequential (linear) search - stop as soon as the element is found
found = False
for i in range(len(l1)):
    if element == l1[i]:
        found = True
        break
if found:
    print("Element is in the list!")
else:
    print("Element is not in the list")

# Binary search works only on a sorted list
L1 = [10, 20, 30, 40, 50, 60]
low, high = 0, len(L1) - 1
element = 51
found = False
while low <= high:
    mid = (low + high) // 2
    if L1[mid] == element:
        found = True
        break
    else:
        if element > L1[mid]:
            low = mid + 1   # search the right half
        else:
            high = mid - 1  # search the left half
if found:
    print(f"Binary Search: {element} is in the list!")
else:
    print(f"Binary Search: {element} is not in the list")
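Python's standard library already ships a binary search in the `bisect` module. The sketch below wraps `bisect_left` in a hypothetical helper, `contains`, which gives the same yes/no answer as the hand-written loop above, again assuming the list is sorted:

```python
from bisect import bisect_left


def contains(sorted_values, target):
    """Binary search: True if target occurs in the sorted list."""
    # bisect_left returns the leftmost position where target could be inserted
    position = bisect_left(sorted_values, target)
    return position < len(sorted_values) and sorted_values[position] == target


L1 = [10, 20, 30, 40, 50, 60]
print(contains(L1, 50))  # True
print(contains(L1, 51))  # False
```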
'''
Two types:
1. OLTP - Online Transaction Processing
2. OLAP - Online Analytical Processing

RDBMS - Relational Database Management System
SQL - Structured Query Language - language of Database
SELECT, INSERT, UPDATE, DELETE
CREATE, DROP

Table: EMPLOYEES
EMPID  ENAME  EPHONE  EEMAIL     DEPT       DHEAD  DCODE  DLOC
1      AA     123     aa@aa.com  Executive  AA     E01    NY
2      AB     223     ab@aa.com  Executive  AA     E01    ML

Relationship:
1:1 - All the columns in the same table
1:M / M:1 - Put them in 2 tables and connect them using a Foreign Key
M:M - Put them in 2 different tables and connect them using a 3rd table

MYSQL database
https://dev.mysql.com/downloads/mysql/
Download and install 8.0.35
1. Server
2. Client
3. Workbench

Connection requires: you know the server location, username, password, database_name

# Data types in MYSQL: https://dev.mysql.com/doc/refman/8.0/en/data-types.html

use ouremployees;
create table departments (
    DID integer primary key,
    dname varchar(20),
    dhod varchar(20),
    dcode varchar(10));
insert into departments values(101,'PD','MR PD','234AWER439');
select * from departments;
select * from employees;
'''
# connect to MYSQL
import pymysql

hostname, dbname, username, password = "localhost", "ouremployees", "root", "learnSQL"
db_con = pymysql.connect(host=hostname, database=dbname,
                         user=username, password=password)
db_cursor = db_con.cursor()

sql1 = '''Create table employees(
    empid integer primary key,
    name varchar(30),
    phone varchar(10),
    did integer,
    Foreign key (DID) references Departments(DID))
'''
#db_cursor.execute(sql1)  # run once to create the table
sql1 = '''Insert into Employees values(
    101, 'Sachin T','3456555',101)'''
#db_cursor.execute(sql1)  # run once to insert the row
db_con.commit()  # inserts are saved only after commit

sql1 = '''Select * from Employees'''
db_cursor.execute(sql1)
results = db_cursor.fetchall()
for data in results:
    print(data)
db_con.close()
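The same create / insert / select cycle can be tried without installing a database server by using `sqlite3` from the standard library. This is a stand-in for the MySQL setup above, not the original code: the table and column names are copied from the notes, the column types are simplified, and values are passed as parameters instead of being written into the SQL text. Note that `sqlite3` uses `?` as the placeholder, whereas `pymysql` uses `%s`.

```python
import sqlite3

con = sqlite3.connect(":memory:")        # temporary in-memory database
con.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked

con.execute("""create table departments (
    did integer primary key, dname text, dhod text, dcode text)""")
con.execute("""create table employees (
    empid integer primary key, name text, phone text, did integer,
    foreign key (did) references departments(did))""")

# Values travel separately from the SQL text, which avoids quoting mistakes
con.execute("insert into departments values (?, ?, ?, ?)",
            (101, "PD", "MR PD", "234AWER439"))
con.execute("insert into employees values (?, ?, ?, ?)",
            (101, "Sachin T", "3456555", 101))
con.commit()

# Join the two tables through the foreign key
rows = con.execute("""select e.name, d.dname
                      from employees e join departments d on e.did = d.did""").fetchall()
print(rows)  # [('Sachin T', 'PD')]
con.close()
```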
# print - print the content on the screen
print("5 + 3")
print(5 + 3)
print("5 + 3 =", 5 + 3, "and 6+2 is also", 6 + 2)
# below we defined a variable called num1 and assigned a value
num1 = 30
num1 = 90
# there are 5 basic types of data:
# 1. integer (int): -infinity to +infinity, no decimal
# 45, -99, 0, 678
num1 = 30
# type() gives the type of the data (datatype)
print(type(num1))  # <class 'int'>
# 2. Float (float) - decimal data
# 45.8, -99.9, 0.0, 678.123456678
num1 = 30.0
print(type(num1))  # <class 'float'>
# 3. complex [square root of -1]: 3+2j, -7j
num1 = 30j
print(type(num1))  # <class 'complex'>
print(30j * 30j)   # (-900+0j)

# 4. string (str) - text data
# strings are defined using ' or " or ''' or """
word1 = 'hello'
print(type(word1))  # <class 'str'>
word1 = '5'
print(type(word1))  # <class 'str'>
# 5. boolean (bool) - just 2 values: True and False
b1 = 'True'
print(type(b1))  # <class 'str'>
b1 = True
print(type(b1))  # <class 'bool'>
###
####
print("Hello")
a = 45; print(a); print('a')  # ; separates statements written on one line
##
# variables - can have alphabets, numbers and _
# a variable name can't start with a number
cost_potato = 35
cost_tomato = 55
qty_potato = 37
total_cost_potato = cost_potato * qty_potato
print("Cost of potato is Rs", cost_potato, "/kg so for the",
      qty_potato, "kg cost would be Rs", total_cost_potato)

# f-string (format string)
print(f"Cost of potato is Rs {cost_potato} /kg so for the "
      f"{qty_potato} kg cost would be Rs {total_cost_potato}")

# escape sequence: \
print("Hello \\hi")
print("\n will generate newline")
print("\t will give tab spaces")

print("\\n will generate newline")
print("\\t will generate tab space")

# \\n does not generate a newline
print("\\\\n does not generate a newline")

# f-string
q = 50
p = 25
t = q * p
print(f"The cost of each pen is {p} so for {q} quantities, the total cost will be {t}.")

t = 200
q = 23
p = t / q
# how to format decimal places using f-string
print(f"The cost of each pen is {p:.2f} so for {q} quantities, the total cost will be {t}.")

# format your string values: < left align, ^ centre, > right align, 16 = width
pos, country, name = "Batsman", "India", "Virat"
print(f"Player {name:<16} plays for {country:^16} as a {pos:>16} in the team")
pos, country, name = "Wicket-keeper", "South Africa", "Mangabwabe"
print(f"Player {name:<16} plays for {country:^16} as a {pos:>16} in the team")
### relational operations
# comparators - comparison operators: > < <= >= == !=
# output is bool values
# 6 > 9 : is 6 greater than 9? False
n1, n2, n3 = 10, 5, 10
print(n1 > n2)   # T
print(n1 < n2)   # F
print(n1 >= n2)  # T
print(n1 <= n2)  # F
print(n1 == n2)  # F
print(n1 != n2)  # T
print("checking with n3....")
print(n1 > n3)   # F
print(n1 < n3)   # F
print(n1 >= n3)  # T
print(n1 <= n3)  # T
print(n1 == n3)  # T
print(n1 != n3)  # F
# Assignment operation =
n1 = 10
print(n1)  # value is 10
# logical operators: i/p and o/p both bool
# operators are: and or not
'''
Prediction: Rohit and Gill will open the batting
Actual: Rohit and Kishan opened the batting
Prediction ? False

Prediction: Rohit or Gill will open the batting
Actual: Rohit and Kishan opened the batting
Prediction ? True

AND: T and T => True and rest everything is False
OR: F or F => False and rest everything is True
NOT: not True is False and not False is True
'''
n1, n2, n3 = 10, 5, 10
print(n1 == n2 and n1 != n2 and n1 > n3 or n1 < n3 or n1 >= n3 and n1 <= n3 or
      n1 == n3 and n1 != n3)
# F and T and F or F or T and T or T and F
# F or F or T or F
# T
# like BODMAS in arithmetic: AND takes priority over OR

## converting one type to another:
# int() str() bool() complex() float()
n1 = 50
print(n1)
print(type(n1))
n2 = str(n1)
print(type(n2))
print(n2)
# input() - is to read the content from the user
# by default, input() will read as string value
val1 = input()
print(val1)
print(type(val1))

# WAP to add two numbers by taking input from the user
a = input("Enter first number: ")
b = input("Enter second number: ")
c = int(a) + int(b)
print("Sum is ", c)

# WAP to read length and breadth of a rectangle from the user
# and display area and perimeter.
# (the variable is called length, not len, so the built-in len() stays usable)
length = int(input("Enter length: "))
breadth = int(input("Enter breadth: "))
area = length * breadth
peri = 2 * (length + breadth)
print(f"Area = {area} and Perimeter = {peri}")

# calculate area and circumference of a circle
pi = 3.14
rad = float(input("Enter the radius of the circle: "))
area = pi * (rad**2)
cir = 2 * pi * rad
print(f"A circle with radius {rad:.1f} has an area of {area:.2f} and circumference of {cir:.2f}")

# program to calculate total and average of marks obtained in 5 subjects
sub1 = int(input("Enter the marks in subject 1: "))
sub2 = int(input("Enter the marks in subject 2: "))
sub3 = int(input("Enter the marks in subject 3: "))
sub4 = int(input("Enter the marks in subject 4: "))
sub5 = int(input("Enter the marks in subject 5: "))
total = sub1 + sub2 + sub3 + sub4 + sub5
avg = total / 5
print("Total marks obtained is", total, "with an average of", avg)
# WAP to indicate number of each value of currency note you will pay
# based on the total demand
'''
Currency notes available: 500, 100, 50, 20, 10, 5, 2, 1
Total bill = 537
500 - 1
20 - 1
10 - 1
5 - 1
2 - 1
'''
five00, one00, fifty, twenty, ten, five, two, one = 0, 0, 0, 0, 0, 0, 0, 0
total_amount = int(input("Enter total bill amount = "))
five00 = total_amount // 500        # // gives the number of notes
total_amount = total_amount % 500   # % gives the amount still to be paid
one00 = total_amount // 100
total_amount = total_amount % 100
fifty = total_amount // 50
total_amount = total_amount % 50
twenty = total_amount // 20
total_amount = total_amount % 20
ten = total_amount // 10
total_amount = total_amount % 10
five = total_amount // 5
total_amount = total_amount % 5
two = total_amount // 2
total_amount = total_amount % 2
one = total_amount
print("Total currency that would be paid:")
print(f"500s = {five00}, 100s = {one00}, 50s = {fifty}, 20s = {twenty}, "
      f"10s = {ten}, 5s = {five}, 2s = {two}, 1s = {one}")

# Conditions: the if command is used to check for conditions
# if is followed by a condition (conditional operator) and when
# the condition results in True, the program gets into the if block
num1 = 9
q = int(input("Enter the quantity: "))
if q > 0:
    print("Quantity accepted")
    print("Sales confirmed")
print("Thank You")
# but let's say you want to have an alternate condition, that means when
# it's True then print the True part and when it's not then you want to
# print the False part
if q > 0:  # True then go to below
    print("Given Quantity accepted")
else:  # if the condition is False then it comes here
    print("Quantity rejected")

num = 0
if num > 0:
    print("Number is positive")
else:
    print("Number is not positive")
# WAP to indicate number of each value of currency note you will pay # based on the total demand ”’ Currency notes available: 500, 100, 50, 20, 10, 5, 2, 1 Total bill = 537 500 – 1 20 – 1 10 – 1 5 – 1 2 – 1 ”’ five00, one00,fifty,twenty, ten,five,two,one = 0,0,0,0,0,0,0,0 total_amount= int(input(“Enter total bill amount = “)) five00 = total_amount // 500 total_amount = total_amount % 500 one00 = total_amount // 100 total_amount = total_amount % 100 fifty = total_amount // 50 total_amount = total_amount % 50 twenty = total_amount // 20 total_amount = total_amount % 20 ten = total_amount // 10 total_amount = total_amount % 10 five = total_amount // 5 total_amount = total_amount % 5 two = total_amount // 2 total_amount = total_amount % 2 one = total_amount print(“Total currency that would be paid:”) if five00>0: print(f”500s = {five00}“) if one00>0: print(f”100s = {one00}“) if fifty>0: print(f”50s ={fifty}“) if twenty>0: print(f”20s = {twenty}“) if ten>0: print(f”10s = {ten}“)
if five>0: print(f”5s = {five}“) if two>0: print(f”2s ={two}“) if one>0: print(f”1s={one}“)
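The note-counting program repeats the same two steps (`//` then `%`) for every denomination. A loop over the denominations with the built-in `divmod()` does both steps at once. The sketch below uses a hypothetical function name, `note_breakdown`, and returns only the notes that are actually needed:

```python
def note_breakdown(amount, notes=(500, 100, 50, 20, 10, 5, 2, 1)):
    """Return {note_value: count} for the notes needed to pay `amount`."""
    counts = {}
    for note in notes:  # notes must be listed from largest to smallest
        count, amount = divmod(amount, note)  # (amount // note, amount % note)
        if count > 0:
            counts[note] = count
    return counts


print(note_breakdown(537))  # {500: 1, 20: 1, 10: 1, 5: 1, 2: 1}
```

Adding or removing a denomination then means editing the tuple rather than adding another block of code.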
#program to calculate total and average of marks obtained in 5 subjects sub1 = int(input(“Enter the marks in subject 1: “)) sub2 = int(input(“Enter the marks in subject 2: “)) sub3 = int(input(“Enter the marks in subject 3: “)) sub4 = int(input(“Enter the marks in subject 4: “)) sub5 = int(input(“Enter the marks in subject 5: “)) total = sub1 + sub2 + sub3 + sub4 + sub5 avg = total / 5 print(“Total marks obtained is”,total,“with an average of”,avg)
# we need to check if the student has passed or failed
# avg >= 50 - pass otherwise fail
if avg >= 50:
    print("Student has passed")
else:
    print("Student has failed")

# check if a number is positive or negative or neither
num = int(input("Enter the number: "))
if num > 0:
    print("It's positive")
elif num < 0:
    print("It's negative")
else:
    print("It's neither - it's zero")
# Grading of the student:
'''
avg >= 90: Grade A
avg >= 80: Grade B
avg >= 70: Grade C
avg >= 60: Grade D
avg >= 50: Grade E
avg < 50: Grade F
'''
if avg >= 90:
    print("Grade A")
elif avg >= 80:
    print("Grade B")
elif avg >= 70:
    print("Grade C")
elif avg >= 60:
    print("Grade D")
elif avg >= 50:
    print("Grade E")
else:
    print("Grade F")

# input a number and check if the number is odd or even
num = int(input("Enter the number: "))
if num < 0:
    print("Invalid number!")
elif num % 2 == 0:
    print("Even number")
else:
    print("Odd Number")

# input a number and check if the number is odd or even
# example of nested condition
num = int(input("Enter the number: "))
if num > 0:
    if num % 2 == 0:
        print("Even number")
    else:
        print("Odd Number")
#program to calculate total and average of marks obtained in 5 subjects sub1 = int(input(“Enter the marks in subject 1: “)) sub2 = int(input(“Enter the marks in subject 2: “)) sub3 = int(input(“Enter the marks in subject 3: “)) sub4 = int(input(“Enter the marks in subject 4: “)) sub5 = int(input(“Enter the marks in subject 5: “)) total = sub1 + sub2 + sub3 + sub4 + sub5 avg = total / 5 print(“Total marks obtained is”,total,“with an average of”,avg) # we need to check if the student has passed or failed # avg > 50 – pass otherwise fail # and also assign Grades if avg>=50: print(“Student has passed”) if avg >= 90: print(“Grade A”) if avg>95: print(“You win President’s Medal”) if avg >98: print(“You win State Governor’s Award!”) elif avg >= 80: print(“Grade B”) elif avg >= 70: print(“Grade C”) elif avg >= 60: print(“Grade D”) else: print(“Grade E”) else: print(“Student has failed”) print(“Grade F”)
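The grade bands can also be kept as data and checked in a loop, which keeps the cut-offs in one place. The sketch below uses a hypothetical function, `grade_for`, with the same cut-offs as the notes (90, 80, 70, 60, 50); the medal and award messages are left out:

```python
def grade_for(avg):
    """Return the grade letter for an average, using the cut-offs from the notes."""
    bands = [(90, "A"), (80, "B"), (70, "C"), (60, "D"), (50, "E")]
    for cutoff, grade in bands:  # checked from the highest cut-off downwards
        if avg >= cutoff:
            return grade
    return "F"                   # below 50: failed


for marks in (95, 80, 69.5, 50, 49.9):
    print(marks, "->", grade_for(marks))
```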
########## ###########
## LOOPS
########## ###########
# Loops: repetition
# e.g. print hello 10 times
# for loop: used when we know how many times to repeat
# while loop: used when we know the condition for when to repeat and when not to
# range() works with for
# range(=start, <end, =step): start = 5, end = 11, step = 2:
#   5, 7, 9
# range(=start, <end): step default = 1
#   range(5,11): 5,6,7,8,9,10
# range(<end): step default = 1, start default = 0
#   range(5): 0,1,2,3,4
for i in range(5, 11, 2):
    print(i, "hello 1")

# Loops - repetitive tasks
# for loop and while loop
# generate numbers from 1 to 20
for i in range(1, 21):
    print(i, end=", ")
print()
# generate first 10 odd numbers
for i in range(10):
    print(i*2 + 1, end=", ")
print()
# generate even numbers between 10 and 20
for i in range(10, 21, 2):
    print(i, end=", ")
print()
# multiplication table of 8 up to 10 multiples
num = 8
for i in range(1, 11):
    print(f"{num} * {i} = {num*i}")

# multiplication tables of 1 to 10 up to 10 multiples
for num in range(1, 11):
    for i in range(1, 11):
        print(f"{num} * {i} = {i*num}")

# multiplication tables of 1 to 10 up to 10 multiples
# print them side by side
'''
1x1=1   2x1=2   ...  10x1=10
1x2=2   2x2=4   ...  10x2=20
...
1x10=10 2x10=20 ...  10x10=100
'''
for num in range(1, 11):
    for i in range(1, 11):
        print(f"{i} * {num} = {num*i}", end="\t")
    print()
print("Thank you")
#########
#######
## More examples of For Loop
#######
#######
for i in range(5):
    print("*", end=" ")
print()

print("\n#print square pattern of stars")
num = 5
for j in range(num):
    for i in range(num):
        print("*", end=" ")
    print()

print("\n#right angled triangle pattern")
num = 5
for j in range(num):  # tracking rows
    for i in range(j+1):  # column
        print("*", end=" ")
    print()

print("\n#inverted right angled triangle pattern")
num = 5
for j in range(num):  # tracking rows
    for i in range(num-j):  # column
        print("*", end=" ")
    print()

print("\n#Isosceles triangle pattern")
'''
    *
   * *
  * * *
 * * * *
* * * * *
'''
num = 5
for j in range(num):  # tracking rows
    for i in range(num-j-1):  # leading spaces shrink row by row
        print(" ", end="")
    for k in range(j+1):  # stars grow row by row
        print("*", end=" ")
    print()

print("\nASSIGNMENT: Inverted Isosceles triangle pattern")
'''
* * * * *
 * * * *
  * * *
   * *
    *
'''
# Write your code here

# Calculate total of 5 subjects marks
total = 0
for i in range(5):
    m1 = int(input("Marks in subject " + str(i+1) + ": "))
    total += m1  # total = total + m1
print(total)
# Loops - While loop: we know the condition when to start/stop
# generate numbers from 1 to 20
i = 1
while i < 21:
    print(i, end=", ")
    i = i + 1
print()
# generate first 10 odd numbers
i = 0
while i < 10:
    print(i*2 + 1, end=", ")
    i += 1
print()
# generate even numbers between 10 and 20
i = 10
while i < 21:
    print(i, end=", ")
    i += 2
print()

## ## ##
# Calculate total of 5 subjects marks for given number of students
cont = 1
while cont == 1:
    total = 0
    for i in range(5):
        m1 = int(input("Marks in subject " + str(i+1) + ": "))
        total += m1  # total = total + m1
    print(total)
    inp = input("Do you want to add more students (y for yes/any other key to stop):")
    if inp != "y":
        cont = 0

## rewriting above program
while True:
    total = 0
    for i in range(5):
        m1 = int(input("Marks in subject " + str(i+1) + ": "))
        total += m1  # total = total + m1
    print(total)
    inp = input("Do you want to add more students (y for yes/any other key to stop):")
    if inp != "y":
        break

'''
break: it will throw you out of the loop
'''
'''
Create a set of Menu options for a Library
'''
while True:
    print("Menu:")
    print("1. Add books to the library")
    print("2. Remove books from the library")
    print("3. Issue the book to the member")
    print("4. Take the book from the member")
    print("5. Quit")
    choice = input("Enter your option:")
    if choice == "1":
        print("Adding books to the library")
    elif choice == "2":
        print("Removing books from the library")
    elif choice == "3":
        print("Issuing the book to the member")
    elif choice == "4":
        print("Taking the book from the member")
    elif choice == "5":
        break
    else:
        print("Invalid option! try again...")

'''
Based on this program, create a Menu option for performing basic
arithmetic operations like + - * / // ** %
'''

'''
Develop a guessing number Game
'''
import random

num1 = random.randint(1, 100)
attempts = 0
while True:
    guess = int(input("Guess the number (1-100): "))
    if guess > 100 or guess < 1:
        print("INVALID NUMBER!")
        continue  # invalid guesses are not counted as attempts
    attempts += 1
    if num1 == guess:
        print(f"Good job! You guessed it correctly in {attempts} attempts.")
        break
    elif num1 < guess:
        print("Sorry! Try guessing lower number")
    else:
        print("Sorry! Try guessing higher number")
'''
Develop a guessing number game
'''
import random
num1 = random.randint(1, 100)
attempts = 0
low, high = 1, 100
while True:
    # let's make the computer guess the number
    guess = random.randint(low, high)
    if guess > 100 or guess < 1:
        print("INVALID NUMBER!")
        continue
    attempts += 1
    if num1 == guess:
        print(f"Good job! You guessed {guess} correctly in {attempts} attempts.")
        break
    elif num1 < guess:
        print(f"Sorry! Try guessing lower number than {guess}")
        high = guess - 1
    else:
        print(f"Sorry! Try guessing higher number than {guess}")
        low = guess + 1

# LIST
l1 = [25, 45.9, "1", [29, 41]]
print("Data type of L1 : ", type(l1))
print("Number of values in L1 : ", len(l1))
print("Values in the list = ", l1)
# Indexing
# Indexing starts at 0. The list used here is l1 = [25, 45.9, "1", [29, 41]] from above.
print("4th value from the list: ", l1[3])
l5 = l1[3]        # the 4th member is itself a list
print(l5[1])
print(l1[3][1])   # same value, reached by chained indexing
print("last member of the list = ", l1[len(l1)-1])
print("last member of the list = ", l1[-1])
print("first member of the list = ", l1[0])
print("first member of the list = ", l1[-len(l1)])
print("First 3 members of the list: ", l1[0:3])
print("First 3 members of the list: ", l1[:3])
print("Last 3 members of the list: ", l1[-3:])
'''
List - linear ordered mutable collection
'''
l1 = [5, 15, 25, 35]
l1[1] = 20  # mutable
print(l1)
print(l1[1:3])
print(l1[:])  # blank left of ':' means index 0; blank on the right means up to the end
print("Last 2 members of L1=", l1[2:4], "or", l1[2:], "or", l1[-2:])
var1 = 'hello'
print(var1[1])
# strings are immutable
# var1[1] = "E"  -> TypeError: 'str' object does not support item assignment
l1 = [100]
print(type(l1))
l1.append(10)  # append() adds a member at the end of the list
l1.append(20)
l1.append(30)
# insert(index, value) also adds a member, but it needs the position first
l1.insert(1, 40)
l1.insert(1, 50)
l1.insert(1, 20)
l1.insert(1, 20)
print(l1)
# remove(value_to_be_removed), pop(index_to_be_deleted)
# remove all 20s: first count the number of 20s
num_20 = l1.count(20)
for i in range(num_20):
    l1.remove(20)
print(l1)
l1.pop(1)
print(l1)
l1.append(100)
# check if 100 is in the list
count_100 = l1.count(100)
if count_100 > 0:
    print("100 is in the list and at position", l1.index(100))
    # index(value, start, stop) searches only within the given slice
    print("100 is in the list and at position", l1.index(100, 1, 7))
else:
    print("100 is not in the list")
## write a program to read marks of 5 subjects # and store the values and display total and avg marks=[] for i in range(5): m = int(input(“Enter marks: “)) marks.append(m)
print(“Total marks = “,sum(marks),“and avg=”,sum(marks)/5) l2 = [1,2,3,4,5,6,2,3,4] l2.reverse() print(“Reverse: “,l2) l2.sort() #sort in increasing order print(“Sort: “,l2) l2.sort(reverse=True) #sorting in decreasing order print(l2) #### print(“======= ========”)
l2.clear() # delete all the members from the list print(“After clear: “,l2)
'''
Stack - First In Last Out (the last value added is the first one removed).
Assignment - Implement the stack concept using a list. Use while True to create a menu with 4 options:
1. Add to the stack
2. Remove from the stack
3. Print the values from the stack
4. Quit
Start with an empty list and perform addition/deletion on the list based on the user's selection.
'''
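The core of a list-backed stack is small: append() pushes onto the top and pop() removes from the top. The sketch below shows only that core and leaves the menu loop to the assignment; the helper names push and pop_top are illustrative assumptions, not part of the notes.

```python
def push(stack, value):
    """Add a value to the top of the stack (the end of the list)."""
    stack.append(value)


def pop_top(stack):
    """Remove and return the top value, or None when the stack is empty."""
    if not stack:
        return None
    return stack.pop()


stack = []
push(stack, 10)
push(stack, 20)
push(stack, 30)
last_in = pop_top(stack)  # the last value added is the first one removed
print("Removed:", last_in, "Remaining:", stack)
```

Checking for an empty list before calling pop() avoids the IndexError that a bare pop() raises on an empty stack.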
# TUPLE val1 = (1, “hello”, [3, 4, 5]) # packing print(“Data type = “, type(val1)) # immutable l1 = [2, 3, 4] l1[1] = 33 print(“L1 = “, l1) # val1[1]=”There” TypeError: ‘tuple’ object does not support item assignment # so tuples are immutable # indexing – reading values from tuple is exactly like list for t in val1: print(t)
# unpacking: v1, v2, v3 = val1 print(v1, v2, v3) print(“Number of values in tuple=”, len(val1))
val2 = (39, 68) val3 = (14, 98, 254, 658) if val2 > val3: print(“Val2 is greater”) else: print(“Val3 is greater”)
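The comparison above may look surprising, since val3 has more members. Python compares tuples element by element from the left, and the first unequal pair decides the result; length only matters when one tuple is a prefix of the other. A small sketch (the variable names are illustrative):

```python
val2 = (39, 68)
val3 = (14, 98, 254, 658)

# 39 > 14 decides the comparison; the remaining members are never looked at
first_pair_decides = val2 > val3

# when every shared position is equal, the shorter tuple is the smaller one
prefix_is_smaller = (1, 2) < (1, 2, 0)

print(first_pair_decides, prefix_is_smaller)
```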
# converting tuple to list val1 = list(val1) val1 = tuple(val1) #STRINGS # handling text data str1 = “HELLO” str2 = ‘Hi there’ str3 = ”’Hello how are you? Are doing well? See you soon”’ str4 = “””I am fine I am doing well hope you are doing great too””” print(str3)
# indexing is very similar to list and tuple print(str1[0],str1[-1],str1[:]) print(str1+” “+str2) print(str1*3) # strings are also (like tuple) immutable # you cant edit the existing string value #str1[0]=”Z” TypeError: ‘str’ object does not support item assignment str1 = “Z”+str1[1:] print(“New Value = “,str1)
# you can use str in for loop just like List or Tuple for s in str1: print(s)
#methods in str class str1 = “heLLO how aRE YOU?” # case related – convering to a specific case print(str1.lower()) print(str1.upper()) print(str1.title())
#checking the existing str value str1 = “heLLO how aRE YOU?” str2= “helohowru” print(str1.isalpha()) print(str2.isalpha()) print(str1.isspace()) print(str2.islower()) print(str2.isalnum())
”’ NLP – Natural Language Processing – analysing review comment to understand reasons for positive and negative ratings. concepts like: unigram, bigram, trigram Steps we generally perform with NLP data: 1. Convert into lowercase 2. decompose (non unicode to unicode) 3. removing accent: encode the content to ascii values 4. tokenization: will break sentence to words 5. Stop words: not important words for analysis 6. Lemmetization (done only on English words): convert the words into dictionary words 7. N-grams: set of one word (unigram), two words (bigram), three words (trigrams) 8. Plot the graph based on the number of occurrences and Evaluate ”’ ”’ cardboard mousepad. Going worth price! Not bad ”’ link=“D:/datasets/OnlineRetail/order_reviews.csv” import pandas as pd import unicodedata import nltk import matplotlib.pyplot as plt df = pd.read_csv(link) print(list(df.columns)) ”’ [‘review_id’, ‘order_id’, ‘review_score’, ‘review_comment_title’, ‘review_comment_message’, ‘review_creation_date’, ‘review_answer_timestamp’] ”’ #df[‘review_creation_date’] = pd.to_datetime(df[‘review_creation_date’]) #df[‘review_answer_timestamp’] = pd.to_datetime(df[‘review_answer_timestamp’]) # data preprocessing – making data ready for analysis reviews_df = df[df[‘review_comment_message’].notnull()].copy() #print(reviews_df) # remove accents def remove_accent(text): return unicodedata.normalize(‘NFKD’,text).encode(‘ascii’,errors=‘ignore’).decode(‘utf-8’) #STOP WORDS LIST: STOP_WORDS = set(remove_accent(w) for w in nltk.corpus.stopwords.words(‘portuguese’))
”’ Write a function to perform basic preprocessing steps ”’ def basic_preprocessing(text): #converting to lower case txt_pp = text.lower() #print(txt_pp) #remove the accent #txt_pp = unicodedata.normalize(‘NFKD’,txt_pp).encode(‘ascii’,errors=’ignore’).decode(‘utf-8’) txt_pp =remove_accent(txt_pp) #print(txt_pp) #tokenize txt_token = nltk.tokenize.word_tokenize(txt_pp) #print(txt_token) # removing stop words txt_token = tuple(w for w in txt_token if w not in STOP_WORDS and w.isalpha()) return txt_token
## write a function to create unigrams, bigrams and trigrams
def create_ngrams(words):
    unigrams, bigrams, trigrams = [], [], []
    for comment in words:
        unigrams.extend(comment)
        bigrams.extend(' '.join(bigram) for bigram in nltk.bigrams(comment))
        trigrams.extend(' '.join(trigram) for trigram in nltk.trigrams(comment))
    # return the three lists so the caller can unpack them
    return unigrams, bigrams, trigrams
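To see what the n-gram step produces without installing nltk, the same idea can be sketched in plain Python: an n-gram is every run of n consecutive tokens. The function name ngrams and the sample tokens are assumptions made for illustration.

```python
def ngrams(tokens, n):
    """Return every run of n consecutive tokens, joined with a space."""
    # zip over n shifted copies of the token list; zip stops at the shortest one
    shifted = [tokens[i:] for i in range(n)]
    return [' '.join(group) for group in zip(*shifted)]


tokens = ("good", "product", "fast", "delivery")
bigrams = ngrams(tokens, 2)
trigrams = ngrams(tokens, 3)
print(bigrams)
print(trigrams)
```

A comment with fewer than n tokens simply yields an empty list.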
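# tokenize every comment once; create_ngrams() reads this column below
reviews_df['review_comment_words'] = reviews_df['review_comment_message'].apply(basic_preprocessing)
# get positive reviews - all 5 ratings in review_score
reviews_5 = reviews_df[reviews_df['review_score'] == 5]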
#get negative reviews – all 1 ratings reviews_1 = reviews_df[reviews_df[‘review_score’]==1] #create ngrams for rating 5 and rating 1 uni_5, bi_5, tri_5 = create_ngrams(reviews_5[‘review_comment_words’]) print(uni_5) print(bi_5) print(tri_5)
# Assignment: perform similar tasks for reviews that are negative (review score = 1) #uni_1, bi_1, tri_1 = create_ngrams(reviews_1[‘review_comment_words’]) #print(uni_5) # distribution plot def plot_dist(words, color): nltk.FreqDist(words).plot(20,cumulative=False, color=color)
plot_dist(tri_5, “red”)
# NLP - Natural Language Processing:
# sentiments: Positive, Neutral, Negative
'''
We will use the nltk library for NLP:
pip install nltk
'''
import nltk
# 1. Convert into lowercase
text = "Product is great but I amn't liking the colors as they are worst"
text = text.lower()
'''
2. Tokenize the content: break it into words or sentences
'''
text1 = text.split()
# using nltk
from nltk.tokenize import sent_tokenize, word_tokenize
text = word_tokenize(text)
# print("Text =\n", text)
# print("Text =\n", text1)
'''
3. Removing stop words: words which are not significant
for your analysis, e.g. an, a, the, is, are
'''
my_stopwords = ['is', 'i', 'the']
# build a new list instead of calling remove() while looping over the same list,
# which can skip the word that follows each removed one
text = [w for w in text if w not in my_stopwords]
print("Text after my stopwords:", text)
nltk.download("stopwords")
from nltk.corpus import stopwords
nltk_eng_stopwords = set(stopwords.words("english"))
# print("NLTK list of stop words in English: ", nltk_eng_stopwords)
'''
Just for example: the word "but" is in the stop word list, but
we want to keep it, so we need to remove it from the set
'''
# removing "but" from the NLTK stop words
nltk_eng_stopwords.remove('but')
text = [w for w in text if w not in nltk_eng_stopwords]
print("Text after NLTK stopwords:", text)
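One pitfall is worth seeing in isolation when filtering stop words: calling remove() on a list while a for loop is walking over that same list shifts the remaining items and can skip some of them. The sketch below uses a made-up word list to contrast that with building a new list.

```python
words = ["is", "i", "the", "product", "is", "great"]
stop = {"is", "i", "the"}

# removing from the list being iterated: items shift left and get skipped
buggy = list(words)
for w in buggy:
    if w in stop:
        buggy.remove(w)

# building a new list visits every word exactly once
safe = [w for w in words if w not in stop]

print("remove() while looping:", buggy)
print("list comprehension:    ", safe)
```

Here the stop word "i" survives the first approach because the loop index has already moved past it by the time it shifts into the first position.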
'''
4. Stemming: reducing a word to its root
e.g. {help: [help, helped, helping, helper]}
One of the methods is the Porter stemmer
'''
from nltk.stem import PorterStemmer
stemmer = PorterStemmer()
text = [stemmer.stem(w) for w in text]
'''
The line above is equivalent to:
t_list = []
for w in text:
    a = stemmer.stem(w)
    t_list.append(a)
'''
print("Text after Stemming:", text)
'''
5. Part of Speech Tagging (POS Tagging)
labels each word with the grammatical role it plays,
like the 8 parts of speech - noun, verb, ...
Reference: https://www.educba.com/nltk-pos-tag/
POS Tagging will give tags like
CC: Coordinating conjunction
CD: Cardinal number
DT: Determiner
EX: Existential "there"
FW: Foreign word
IN: Preposition or subordinating conjunction
JJ: Adjective
JJR and JJS: Comparative and superlative adjective
LS: List item marker
MD: Modal
NN: Singular noun
NNS, NNP, NNPS: Plural noun, proper noun, plural proper noun
PDT: Predeterminer
WRB: Wh-adverb
WP$: Possessive wh-pronoun
WP: Wh-pronoun
WDT: Wh-determiner
VBZ: Verb, 3rd person singular present
VBP, VBN, VBG, VBD, VB: Other verb forms
UH: Interjection
TO: The word "to"
RP: Particle
RBS, RB, RBR: Adverb (superlative, base, comparative)
PRP, PRP$: Personal and possessive pronoun
To perform this, we need to download a tagger,
e.g. averaged_perceptron_tagger:
nltk.download('averaged_perceptron_tagger')
'''
nltk.download('averaged_perceptron_tagger')
'''
6. Lemmatizing
takes a word to its core meaning
We need to download: wordnet
'''
nltk.download('wordnet')
from nltk.stem import WordNetLemmatizer
lemmatizer = WordNetLemmatizer()
print("Very good = ", lemmatizer.lemmatize("very good"))
print("Halves = ", lemmatizer.lemmatize("halves"))
text = "Product is great but I amn't liking the colors as they are worst"
text = word_tokenize(text)
text = [lemmatizer.lemmatize(w) for w in text]
print("Text after Lemmatizer: ", text)
# Sentiment analysis – read the sentiments of each sentence ”’ If you need more data for your analysis, this is a good source: https://github.com/pycaret/pycaret/tree/master/datasets We will use Amazon.csv for this program ”’ import pandas as pd from nltk.corpus import stopwords from nltk.tokenize import word_tokenize from nltk.stem import WordNetLemmatizer from nltk.sentiment.vader import SentimentIntensityAnalyzer
link = “https://raw.githubusercontent.com/pycaret/pycaret/master/datasets/amazon.csv” df = pd.read_csv(link) print(df)
# Let's create a function to perform all the preprocessing steps
# of an NLP analysis
ENGLISH_STOPWORDS = set(stopwords.words("english"))  # build the set once, not once per word
lemm = WordNetLemmatizer()
def preprocess_nlp(text):
    text = text.lower()         # lowercase
    text = word_tokenize(text)  # tokenize
    text = [w for w in text if w not in ENGLISH_STOPWORDS]  # remove stop words
    text = [lemm.lemmatize(w) for w in text]                # lemmatize
    # now join all the words as we are predicting on each line of text
    text_out = ' '.join(text)
    return text_out
# NLTK Sentiment Analyzer # we will now define a function get_sentiment() which will return # 1 for positive and 0 for non-positive analyzer = SentimentIntensityAnalyzer() def get_sentiment(text): score = analyzer.polarity_scores(text) sentiment = 1 if score[‘pos’] > 0 else 0 return sentiment
# Visualization import matplotlib.pyplot as plt import numpy as np data = np.random.randn(1000) plt.hist(data, bins=30, histtype=‘stepfilled’, color=“red”) plt.title(“Histogram Display”) plt.xlabel(“Marks”) plt.ylabel(“Number of Students”) plt.show()
# Analyzing Hotel Bookings data # https://github.com/swapnilsaurav/Dataset/blob/master/hotel_bookings.csv link=“https://raw.githubusercontent.com/swapnilsaurav/Dataset/master/hotel_bookings.csv” import pandas as pd df = pd.read_csv(link) #print(“Shape of the data: “,df.shape) #print(“Data types of the columns:”,df.dtypes) import numpy as np df_numeric = df.select_dtypes(include=[np.number]) #print(df_numeric) numeric_cols = df_numeric.columns.values print(“Numeric column names: “,numeric_cols) df_nonnumeric = df.select_dtypes(exclude=[np.number]) #print(df_nonnumeric) nonnumeric_cols = df_nonnumeric.columns.values print(“Non Numeric column names: “,nonnumeric_cols)
#### #preprocessing the data import seaborn as sns import matplotlib.pyplot as plt colors = [“#091AEA”,“#EA5E09”] cols = df.columns sns.heatmap(df[cols].isnull(), cmap=sns.color_palette(colors)) plt.show()
cols_to_drop = [] for col in cols: pct_miss = np.mean(df[col].isnull()) * 100 if pct_miss >80: #print(f”{col} -> {pct_miss}”) cols_to_drop.append(col) #column list to drop # remove column since it has more than 80% missing value df = df.drop(cols_to_drop, axis=1)
for col in df.columns: pct_miss = np.mean(df[col].isnull()) * 100 if pct_miss >80: print(f”{col} -> {pct_miss}“) # check for rows to see the missing values missing = df[col].isnull() num_missing = np.sum(missing) if num_missing >0: df[f’{col}_ismissing’] = missing #print(f”Created Missing Indicator for {cols}”) ### keeping track of the missing values ismissing_cols = [col for col in df.columns if ‘_ismissing’ in col] df[‘num_missing’] = df[ismissing_cols].sum(axis=1) print(df[‘num_missing’])
# drop rows with > 12 missing values ind_missing = df[df[‘num_missing’] > 12].index df = df.drop(ind_missing,axis=0) # ROWS DROPPED #count for missing values for col in df.columns: pct_miss = np.mean(df[col].isnull()) * 100 if pct_miss >0: print(f”{col} -> {pct_miss}“)
”’ Still we are left with following missing values: children -> 2.0498257606219004 # numeric babies -> 11.311318858061922 #numeric meal -> 11.467129071170085 # non-numeric country -> 0.40879238707947996 # non-numeric deposit_type -> 8.232810615199035 # non-numeric agent -> 13.687005763302507 #numeric ”’ #HANDLING NUMERIC MISSING VALUES df_numeric = df.select_dtypes(include=[np.number]) for col in df_numeric.columns.values: pct_miss = np.mean(df[col].isnull()) * 100 if pct_miss > 0: med = df[col].median() df[col] = df[col].fillna(med)
#HANDLING non-NUMERIC MISSING VALUES df_nonnumeric = df.select_dtypes(exclude=[np.number]) for col in df_nonnumeric.columns.values: pct_miss = np.mean(df[col].isnull()) * 100 if pct_miss > 0: mode = df[col].describe()[‘top’] df[col] = df[col].fillna(mode)
print(“#count for missing values”) for col in df.columns: pct_miss = np.mean(df[col].isnull()) * 100 if pct_miss >0: print(f”{col} -> {pct_miss}“)
#drop duplicate values print(“Shape before dropping duplicates: “,df.shape) df = df.drop(‘id’,axis=1).drop_duplicates() print(“Shape after dropping duplicates: “,df.shape)
# print() – print(“5+3=”,5+3,“and 6+4=”,6+4,end=” : “); # every print() statement has an invisible newline \n print( “5+5=”,5+5,“and 6+14=”,6+14);
print(“Twinkle Twinkle”, end=” ” ) # sample comment fgjdtjgdhjcghjgh print(“Little star”) # indentation is key! #semicolon ; exist in Python but its not mandatory print(“Hi there”); print(“How are you?”) # syntax : grammer of human language #variables – stores temporary values var1 = 5 # create a variable by name var1 which currently has the value 5 var2 = 10 var1 = 20 print(var1 + var2) #data which variables have # data types- what kind of values a variable has #basic data types: int (integer – no decimal part): -5, -2,0,5,999… # type() gives the current data types print(“Type of var1 = “,type(var1))
# complex numbers - square roots of negative numbers
# the square root of -1 is i (in maths) - j (in Python)
# square root of -16: sqrt(16) * sqrt(-1) = 4j
var4 = 3+4j
print("Data type of var4 = ", type(var4))
# (3+4j)*(3-4j) = 9 - (-16) = 9 + 16 = 25 + 0j
print((3+4j)*(3-4j))
# 3.2 + 2.8 = 6.0
# 4th data type - the text type is called str (string)
var5 = "hello"
# string data will always be in quotes: ' or "
print("datatype(var5) =", type(var5))
# 5th data type = boolean (bool) # bool = True or False only var6 = False print(“Data type of var6 =”,type(var6)) # var6 is an object of class bool – meaning var6 inherits all the # properties of class bool var7 = ‘FALSE’ # program is run successfully when exit code is 0 quantity = 17 price = 48 total_cost = quantity * price print(“Cost of each pen is”,price,“so the total cost of”,quantity,“pens would be”,total_cost) # format string print(f”Cost of each pen is {price} so the total cost of {quantity} pens would be {total_cost}“) ”’ refer to lines 51 to 56, write below programs: 1. WAP to calulcate area and perimeter of a rectangle 2. WAP to calculate area and circunference of a circle ”’ var1 = 5 print(var1) var1 = “Five” print(var1)
'''
# about variables
1. a variable name should start with a letter or an underscore
2. a variable name can also contain digits and _
'''
cost = 17
quantity = 5
price = cost * quantity
print("The cost of each pen is", cost, "so the total cost of", quantity, "pens will be", price)
# format string / f-string
print(f"The cost of each pen is {cost} so the total cost of {quantity} pens will be {price}")
price = 100 quantity = 33 cost = price / quantity print(“The cost of each pen is”,cost,“so the total cost of”,quantity,“pens will be”,price) # format string / f-string print(f”The cost of each pen is {cost:.1f} so the total cost of {quantity} pens will be {price}“)
player,country,position = “Kohli”,“India”,“Opener” print(f”Player {player:<12} plays for the country {country:>10} and is {position:^15} for the team.”)
player,country,position = “Manbwange”,“Zimbabwe”,“Wicket-Keeper” print(f”Player {player:<12} plays for the country {country:>10} and is {position:^15} for the team.”) print(f”Player {player:.<12} plays for the country {country:>10} and is {position:X^15} for the team.”)
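The part after the colon inside the braces is a format specification: an optional fill character, an alignment sign (< left, > right, ^ center), a width, and optionally a precision such as .1f. A short sketch of each piece, reusing the first player's values from above (the variable names on the left are illustrative):

```python
player, country, position = "Kohli", "India", "Opener"

left = f"{player:<12}"        # padded on the right to 12 characters
right = f"{country:>10}"      # padded on the left to 10 characters
center = f"{position:^15}"    # padded on both sides to 15 characters
dotted = f"{player:.<12}"     # '.' is used as the fill character instead of a space
one_decimal = f"{100 / 33:.1f}"  # rounded to one digit after the decimal point

for label, value in [("left", left), ("right", right), ("center", center),
                     ("dotted", dotted), ("one_decimal", one_decimal)]:
    print(f"{label:>12}: [{value}]")
```

Printing the values between square brackets makes the padding visible.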
# input dynamic value from the user a = input(“Enter a number: “) print(“You have entered:”,a) print(f”1. Data type of {a} is {type(a)}“) b= 100 #assigning an integer c=“100” #assigning a string # input() since it cant predict what value is being entered, # it assumes that all the input value is a string a = input(“Enter a number: “) a= int(a) #explicit conversion of data types print(f”2. Data type of {a} is {type(a)}“)
# int(), float(), str(), bool(), complex()
## escape sequence - \ works only for the one character that follows it,
## either giving that character a special meaning or taking it away
print("We are talking about\nis for new line")
print("We are talking about\tis for tab")
print("We are talking about\\tis for tab")
print("We are talking about\\nis for new line")
# \\n in python prints \n print(“\\\\n in python prints \\n”)
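When a string contains many backslashes, doubling each one gets hard to read. Python's raw string prefix r, which these notes do not cover, switches escape processing off for that literal. A small sketch comparing the two spellings (the path is a made-up example):

```python
escaped = "C:\\new\\table.txt"  # every backslash doubled by hand
raw = r"C:\new\table.txt"       # raw string: backslashes are kept as typed

newline_length = len("\n")           # one character: the newline itself
escaped_length = len("\\n")          # two characters: a backslash and the letter n
raw_length = len(r"\n")              # also two characters

print(escaped)
print(raw)
print(newline_length, escaped_length, raw_length)
```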
# Different operators
# Arithmetic operators
num1, num2 = 59, 10
print(num1 + num2)   # addition
print(num1 - num2)   # subtraction
print(num1 * num2)   # multiplication
print(num1 / num2)   # division - 5.9
print(2 ** 3)        # power
print(num1 // num2)  # integer division
print(num1 % num2)   # % mod - remainder
# comparison / relational operators
# > < >= <= == (is it equal?) !=
# anything as input - output is always bool
num1, num2 = 59, 10
print("Greater")
print(num1 > num2)   # True
print(num1 >= num1)  # True
print(num1 > num1)   # False
print("Smaller")
print(num1 < num2)   # False
print(num1 <= num2)  # False
print(num1 < num1)   # False
print("Equal")
print(num1 == num2)  # False
print(num1 == num1)  # True
print(num1 != num2)  # True
print(num1 != num1)  # False
# logical operators: and or not
# input and output are all bool values
# prediction: Rahul and Rohit will open the batting
# actual: Rohit and Gill opened the batting
# and (*): True and True = True; the other 3 combinations are False
# or (+): False or False = False; the other 3 combinations are True
# prediction: Rahul or Rohit will open the batting
# actual: Rohit and Gill opened the batting
print(num1 > num2 or num1 >= num1 and num1 > num1 or num1 < num2 and num1 <= num2 or num1 < num1 and num1 == num2 or num1 == num1 and num1 != num2 or num1 != num1)
'''
True
'''
print(not True)
print(not False)
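The long expression above evaluates to True because of operator precedence: not binds tighter than and, and and binds tighter than or, much like * is applied before + in arithmetic. A minimal sketch with made-up values shows how parentheses change the result:

```python
# 'and' is evaluated before 'or', so this reads as: True or (False and False)
without_parentheses = True or False and False

# parentheses force the 'or' to be evaluated first
with_parentheses = (True or False) and False

# 'not' is evaluated before 'and' and 'or': (not True) or True
negation_first = not True or True

print(without_parentheses, with_parentheses, negation_first)
```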
#### #### ”’ Condition and iterations Conditions are handled using if ”’ avg = 39 # if avg >= 40 then say pass otherwise say fail if avg >=40: print(“You have passed!”)
if avg >=40: print(“You have passed!”) print(“PASSSSSSSSSSSSS”) else: print(“You have failed!”)
# conditions in Python # WAP to find sum, avg of 3 subjects marks and check if pass or fail ”’ Any program is made up of: 3 parts: input, process, output ”’ # 1. input marks in 3 subjects m1 = int(input(“Enter the marks in subject 1: “)) m2 = int(input(“Enter the marks in subject 2: “)) m3 = int(input(“Enter the marks in subject 3: “)) #process: calculating total and avg total = m1 + m2 + m3 avg = total /3 if avg >=40: # output – when if condition becomes True print(“You have passed”) else: # when if is False print(“You have failed”)
#WAP to check if a number is positive or not num1 = int(input(“Enter a number: “)) if num1 >0: print(f”{num1} is positive”) elif num1 <0: print(f”{num1} is negative”) else: print(f”{num1} is neither positive nor negative”)
# WAP to check if a number is odd or even
num1 = int(input("Enter a number to check if its odd or even: "))
if num1 >= 0:
    print("Can be odd or even")
    if num1 % 2 == 0:
        print(f"{num1} is an even number")
    else:
        print(f"{num1} is an odd number")
else:
    print("Negative numbers are not fit")
'''
Assignment: input a number and check if it is positive or negative. If positive, check if the number is divisible by 2, 3, 7. If it is divisible by 7, check if it is even or odd. If it is odd, check if it is greater than 100 or not.
'''
'''
Take the average, check if pass or fail, and then assign a grade based on:
A: avg >= 85%
B: avg >= 75%
C: avg >= 60%
D: avg >= 40%
E: avg < 40%
'''
# 1. input marks in 3 subjects
m1 = int(input("Enter the marks in subject 1: "))
m2 = int(input("Enter the marks in subject 2: "))
m3 = int(input("Enter the marks in subject 3: "))
# process: calculating total and avg
total = m1 + m2 + m3
avg = total / 3
if avg >= 40:
    # output - when the if condition becomes True
    print("You have passed")
    if avg >= 85:
        print("Grade: A")
    elif avg >= 75:
        print("Grade: B")
    elif avg >= 60:
        print("Grade: C")
    else:
        print("Grade: D")
else:
    # when the if condition is False
    print("You have failed")
    print("Grade: E")
”’ Loops – execute same block of code more than once 1. For loop: when you know exactly how many times to run the loop 2. While loop: when you dont know exactly how many times but you know a perticular condition till when to run ”’ # working of range(): generates range of values ”’ range(start,end,increment): range() takes 3 values. Start indicates the starting value of the series. End indicates last but one value(upto end, end is not included). increment will increase the start value range(3,12,3) – 3,6,9 range(start,end): increment is default 1 range(3,8): 3,4,5,6,7 range(end): default start is 0 and increment is 1 range(5): 0,1,2,3,4 ”’ for i in range(3,12,3): print(“Hello and the value of i is”,i)
for i in range(3,8): print(“Hello there and the value of i is”,i)
for i in range(5): print(“Hello and the value of i is”,i)
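Wrapping range() in list() is a quick way to check which values a loop will visit before running it. The sketch below repeats the three forms described above and adds one case the notes do not mention, a negative increment, as an extra illustration:

```python
three_values = list(range(3, 12, 3))  # start, end, increment
two_values = list(range(3, 8))        # increment defaults to 1
one_value = list(range(5))            # start defaults to 0
countdown = list(range(5, 0, -1))     # a negative increment counts down; end is still excluded

print(three_values)
print(two_values)
print(one_value)
print(countdown)
```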
#while will have always have condition, as long as the condition # is true, while loop will repeat count = 0 while count < 5: print(“HELLO”) count=count+1 ch=“y” while ch==“y”: print(“Do you want to continue?”) ch=input(“Enter y to continue or any other key to stop: “) # for loop example using range for i in range(5): print(“*”,end=” “)
'''
* * * * *
* * * * *
* * * * *
* * * * *
* * * * *
'''
for j in range(5):
    for i in range(5):
        print("*", end=" ")
    print()
'''
*
* *
* * *
* * * *
* * * * *
'''
for j in range(5):
    for i in range(j+1):
        print("*", end=" ")
    print()
'''
* * * * *
* * * *
* * *
* *
*
'''
for j in range(5):
    for i in range(5-j):
        print("*", end=" ")
    print()
'''
    *
   * *
  * * *
 * * * *
* * * * *
'''
for j in range(5):
    for k in range(5-j-1):
        print(" ", end="")
    for i in range(j+1):
        print("*", end=" ")
    print()
'''
Assignment:
* * * * * * * * * * * * * * *
'''
# While loop in Python ”’ Write a program to keep checking if a number is odd or even until user enters a negative number ”’ num1 = int(input(“Enter the number: “)) while num1 >=0: if num1%2 ==0: print(“Its even!”) else: print(“Its odd!”) num1 = int(input(“Enter the number: “))
cont = “y” while cont ==“y”: num1 = int(input(“Enter first number: “)) num2 = int(input(“Enter second number: “)) print(“Menu:”) print(“1. Addition”) print(“2. Subtraction”) print(“3. Multiplication”) print(“4. Division”) print(“5. Quit”) op = input(“Enter the option from above menu (1 or 2 or 3 or 4 or 5): “) if op==“1”: print(“Sum of the given two numbers are: “,num1 + num2) elif op==“2”: print(“Difference of the given two numbers are: “,num1 – num2) elif op==“3”: print(“Product of the given two numbers are: “,num1 * num2) elif op==“4”: print(“Ratio of the given two numbers are: “,num1 / num2) elif op==“5”: cont = “n” else: print(“You have not given a valid option, start from beginning!”)
”’ Assignment: Modify the above program so that user is asked to enter numbers only when they choose options 1 to 4. ”’ while True: num1 = int(input(“Enter first number: “)) num2 = int(input(“Enter second number: “)) print(“Menu:”) print(“1. Addition”) print(“2. Subtraction”) print(“3. Multiplication”) print(“4. Division”) print(“5. Quit”) op = input(“Enter the option from above menu (1 or 2 or 3 or 4 or 5): “) if op==“1”: print(“Sum of the given two numbers are: “,num1 + num2) elif op==“2”: print(“Difference of the given two numbers are: “,num1 – num2) elif op==“3”: print(“Product of the given two numbers are: “,num1 * num2) elif op==“4”: print(“Ratio of the given two numbers are: “,num1 / num2) elif op==“5”: break else: print(“You have not given a valid option, start from beginning!”)
# checking the condition: Entry check & Exit check ”’ WAP to generate Fibonacci series numbers: 0,1,1,2,3,5,8,13…. 1. first 5 numbers of the series 2. till user wants ”’ print(“Fibonacci numbers are (using For loop):”) n = 10 f,s=0,1 for i in range(n): if i<2: print(i,end=” , “) else: t = s+f # 0,1, print(t, end=” , “) f=s s=t
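The same series can also be built without treating the first two numbers as a special case, by starting from the pair (0, 1) and shifting it one step each time. A minimal sketch; the function name fibonacci and the choice to return a list are assumptions made for illustration:

```python
def fibonacci(count):
    """Return the first `count` Fibonacci numbers as a list."""
    first, second = 0, 1
    series = []
    for _ in range(count):
        series.append(first)
        # shift the pair one step: (first, second) -> (second, first + second)
        first, second = second, first + second
    return series


first_ten = fibonacci(10)
print(first_ten)
```

The tuple assignment evaluates the right-hand side before either name is updated, so no temporary variable is needed.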
print(“\nFibonacci numbers are (using While loop):”) print(“Enter any key to stop: “) f,s=0,1 counter = 1 while True: if counter ==1: print(f, end=” , “) elif counter ==2: print(s, end=” , “) else: t = s+f print(t, end=” , “) f=s s=t
counter += 1 #counter=counter + 1 cont = input() if len(cont)>0: break ## ”’ Develop guessing number game: Computer (giving the number) v Human (Guessing the number) ”’ import random #random is a module which has functions related to random number generation num = random.randint(1,100) attempts = 0 while True: guess = int(input(“Enter the number to guess (1-100): “)) if guess <1 or guess >100: print(“Invalid guess!”) continue attempts+=1 #attempts= attempts+1 if guess == num: print(f”Congratulations! You have guessed it correctly in {attempts} attempts!”) break elif guess < num: print(“Try to guess a higher number!”) else: print(“Try to guess a lower number!”)
”’ Develop guessing number game: Human (giving the number) v Computer (Guessing the number) ”’ import random #random is a module which has functions related to random number generation num = 55 attempts = 0 low,high = 1,100 while True: #guess = int(input(“Enter the number to guess (1-100): “)) guess = random.randint(low,high) if guess <1 or guess >100: print(“Invalid guess!”) continue attempts+=1 #attempts= attempts+1 if guess == num: print(f”Congratulations! You have guessed it correctly in {attempts} attempts!”) break elif guess < num: print(f”Try to guess a higher number than {guess}!”) low = guess+1 else: print(f”Try to guess a lower number than {guess}!”) high=guess-1 ”’ Develop guessing number game: Computer (giving the number) v Computer (Guessing the number) ”’ import random #random is a module which has functions related to random number generation num = random.randint(1,100) attempts = 0 low,high = 1,100 while True: #guess = int(input(“Enter the number to guess (1-100): “)) guess = random.randint(low,high) if guess <1 or guess >100: print(“Invalid guess!”) continue attempts+=1 #attempts= attempts+1 if guess == num: print(f”Congratulations! You have guessed it correctly in {attempts} attempts!”) break elif guess < num: print(f”Try to guess a higher number than {guess}!”) low = guess+1 else: print(f”Try to guess a lower number than {guess}!”) high=guess-1
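The computer player above narrows low and high after every miss but still picks a random number inside that range. Guessing the midpoint instead (binary search, a different technique from the random guessing in the notes) halves the range on every attempt. A sketch, with guess_by_halving as a hypothetical helper name:

```python
def guess_by_halving(secret, low=1, high=100):
    """Return how many midpoint guesses are needed to find `secret`."""
    attempts = 0
    while low <= high:
        guess = (low + high) // 2  # always guess the middle of the remaining range
        attempts += 1
        if guess == secret:
            return attempts
        if guess < secret:
            low = guess + 1        # secret is higher: drop the lower half
        else:
            high = guess - 1       # secret is lower: drop the upper half
    return None                    # secret was outside the given range


attempts_for_55 = guess_by_halving(55)
worst_case = max(guess_by_halving(n) for n in range(1, 101))
print(attempts_for_55, worst_case)
```

For the range 1 to 100 this strategy never needs more than 7 guesses, whereas random guessing inside the range can take longer.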
# Strings – used for handling text data str1 = ‘HELLO’ \ ‘i’ str2 = “How are you” str3 = ”’I am fine I am doing alright I am super”’ str4 = “””I am here I am there I am everywhere””” print(str1, str2, str3, str4) print(type(str1),type(str2),type(str3),type(str4)) str5 = “How are” str6 = “You?” print(str5 +” “+ str6) print(“Hello ” + str(5)) print(“HELLO ” * 5)
#for loop for i in str6: print(“running the loop: “,i)
# indexing
str7 = "I am fine here how are you there"
print("Length: ", len(str7))
# position of each character starts from 0
print(str7)
print("First character of str7 = ", str7[0])
print("Second character of str7 = ", str7[1])
print("Last character of str7 = ", str7[len(str7) - 1])
print("Last character of str7 = ", str7[-1])
print("Second last character of str7 = ", str7[-2])
print("6th to 9th characters: ", str7[5:9])
# Strings
str1 = "I am fine"
# first 3
print(str1[0:3])
# last 3
print(str1[-3:])
#
print(type(str1))
# methods are the functions defined under a class
# is.. is for asking a question
# isalpha() asks: does the string contain only alphabetic characters?
print("Is alpha: ", str1.isalpha())  # False, because of the spaces
str2 = "helothere"
print("Is alpha: ", str2.isalpha())

num1 = input("Enter a number: ")
if num1.isdigit():
    num1 = int(num1)
else:
    print("Invalid number!")

print("Str1 is it titlecase:", str1.istitle())
print("Str1 convert to title case: ", str1.title())
print("Str1 is it uppercase: ", str1.isupper())
print("Str1 in uppercase: ", str1.upper())
print("Str1 is it lowercase: ", str1.islower())
print("Str1 in lowercase: ", str1.lower())

str2 = "I am fine I am good how are you am fine?"
print("count of am: ", str2.count("am"))
print("count of am between index 5 and 15: ", str2.count("am", 5, 15))
print("Split: ", str2.split('am'))
l1 = ['I ', ' fine I ', ' good how are you ', ' fine?']
print("Join: ", "am".join(l1))
start = 3
if str2.count("am", start) > 0:
    print("Index: ", str2.index("am", start))
else:
    print("Given text not found in str2")
print(str2.replace("am", "AM"))
# voting in India: you need to be at least 18 years old and an Indian national
nationality = ""  # initialize
age = input("Enter your age:")
if age.isdigit():
    age = int(age)
    if age >= 18:
        # age criteria matched
        nationality = input("Enter your country of nationality:")
        if nationality.lower() == "india":
            print("You are eligible to vote in India")
        else:
            print("Your nationality doesn't match")
    else:
        print("Your age criteria not matched")
else:
    print("Not a valid age, hence exiting the program")

print("Printing nationality: ", nationality)
# Strings are immutable - you can't edit the value of a string
# nationality[2] = "d"   TypeError: 'str' object does not support item assignment

# Collections - multiple values in one variable
# list - mutable ordered collection
list1 = [5, 10, "Hello", 6.5, True]
print(type(list1))
print("first member: ", list1[0])
print("last member: ", list1[-1])
print("last 3 members: ", list1[-3:])
print("first 3 members: ", list1[:3])
print([1, 2, 3] + ["First", "Second", "Third"])
print([1, 2, 3] * 4)

for i in list1:
    print(f"{i} => {type(i)}")

for i in range(len(list1)):
    print(f"{list1[i]} => {type(list1[i])}")

###
list1 = [5, 10, "Hello", 6.5, True]
# the search is case-sensitive: "hello" does not match "Hello"
if list1.count("hello") > 0:
    idx = list1.index("hello")
    list1.pop(idx)
else:
    print("hello is not in the list")

# pop() looks for index, remove() looks for the value
list1.remove("Hello")
print(list1)

# add value to the list:
# append() - will add values at the end of the list
# insert() - needs value and also index
list1.append([2, 4, 6, 8])
list1.insert(1, "Sachin Tendulkar")
print("After adding: ", list1)
print(list1[-1][2])  # how to access list inside another list
# LIST - 2
# list is an ordered mutable collection
# methods like append(), insert(), remove(), pop(),
# count(), index()

# WAP (write a program) to read marks of 5 students in their 5 subjects
marks = []
for i in range(5):
    print(f"Enter the marks for student {i+1} ")
    tlist = []
    for j in range(5):
        m = int(input(f"Enter the marks in subject {j+1}: "))
        tlist.append(m)
    marks.append(tlist)
# copy
l3 = l1         # l3 and l1 point to the same dataset (location in memory)
l4 = l1.copy()  # shallow copy - it creates a new list object
print("1. L1: ", l1)
print("1. L3: ", l3)
print("1. L4: ", l4)
l1.append(11)
l1.append(12)
l3.append(13)
l4.append(15)
print("2. L1: ", l1)
print("2. L3: ", l3)
print("2. L4: ", l4)

l1.clear()
print("L3 = ", l3)  # also empty, because l3 is the same object as l1
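The copy example above uses a flat list, where a shallow copy behaves like a fully independent copy. With nested lists the difference becomes visible. The following is a small sketch using the standard copy module; the variable names are chosen only for this illustration.

```python
import copy

original = [[1, 2], [3, 4]]
alias = original                 # same list object, just another name
shallow = original.copy()        # new outer list, but the inner lists are shared
deep = copy.deepcopy(original)   # new outer list and new inner lists

original[0].append(99)           # change an inner list
original.append([5, 6])          # change the outer list

print("alias:  ", alias)     # sees both changes
print("shallow:", shallow)   # sees only the inner-list change
print("deep:   ", deep)      # sees neither change
```

A shallow copy is usually enough for lists of numbers or strings. For lists that contain other lists or dictionaries, copy.deepcopy() is the safer choice when the copy must not be affected by later edits.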
### TUPLES ###
# Tuple - ordered immutable collection
t1 = ()
print(type(t1))
t1 = (1, 2, 3, 4, 1, 2, 3, 1, 2, 1)
print("Count = ", t1.count(2))
print("Index = ", t1.index(2))
t1 = (4,)  # the trailing comma makes this a one-element tuple
print(type(t1))
t1 = (4, 5)  # packing
print(type(t1))
a, b = t1  # unpacking
print(a, b)
t1 = list(t1)   # converting a tuple to list
t1 = tuple(t1)  # converting a list to tuple
# tuples are immutable and use a little less memory than lists
t1 = (10, 30, 20, 40)
# iteration (loop) and indexing is exactly like List/String
print(t1[1])
# Dictionary: collection of key:value pairs - it doesn't have a positional index,
# instead as a user you have to give the key
# (since Python 3.7 a dictionary keeps its insertion order)
dict1 = {1: "Five", "Eight": 80}
print(type(dict1))
print(dict1)
print(dict1['Eight'])

marks = {}
for i in range(3):
    n = input("Enter Student's name: ")
    tlist = []
    for j in range(3):
        m = int(input(f"Enter the marks in subject {j+1}: "))
        tlist.append(m)
    marks[n] = tlist  # assumed missing step: store the marks under the student's name
# pop() and popitem() both are used to remove data
# popitem() has no arguments - it removes the most recently added pair
# pop() takes the key and that particular key:value pair is deleted
dict2.popitem()
print("2. Dict4 - =: ", dict4)
dict2.pop("R")
print("3. Dict4 - =: ", dict4)

dict2 = {'S': [3, 6, 9], 'R': [1, 5, 9], 'P': [9, 5, 1]}
print(dict2['S'])
print("Get value for the given key: ", dict2.get("S"))  # another way of getting the value

dict2 = {'S': [3, 6, 9], 'R': [1, 5, 9], 'P': [9, 5, 1]}
x = dict2.setdefault("T", {0, 0, 0})  # {0, 0, 0} is a set, so it is stored as {0}
print(x)
print(dict2)
dict2["T"] = {10, 20, 30}
print(dict2)
print(set1.isdisjoint(set2))
# set1.copy() - shallow copy
set1.clear()
print("Set 1: ", set1)

# sets, lists and tuples can be converted into each other's form
set2 = {3, 4, 11, 8, 7}
set2 = list(set2)
set2.append(19)
set2 = set(set2)
## Functions ###
'''
1. inbuilt functions: print(), len(), set(), list() .....
2. user defined functions:
3. one line functions
'''
def list_of_questions():
    print("How are you?")
    print("Where are you?")
    print("Are you coming?")

list_of_questions()
print("printing list of questions once again!!!!!!")
list_of_questions()

# function definition: no input argument
def sum_two_nums():
    num1 = 45
    num2 = 38
    add = num1 + num2
    print("Sum of two numbers is", add)

# function definition: taking input arguments
def sum_two_nums1(num1, num2):
    print(f'num1 = {num1} and num2 = {num2}')
    add = num1 + num2
    print("Sum of two numbers is", add)

# function definition: taking input arguments
# and returning values
# 1. required positional arguments: both required and positional
def sum_two_nums2(num1, num2):
    print(f'num1 = {num1} and num2 = {num2}')
    add = num1 + num2
    # print("Sum of two numbers is", add)
    return add

# default positional
def sum_two_nums3(num1, num2=9):
    print(f'num1 = {num1} and num2 = {num2}')
    add = num1 + num2
    # print("Sum of two numbers is", add)
    return add

# calling functions
sum_two_nums()
x = 38
y = 67
# calling functions by passing two values
sum_two_nums1(x, y)

# calling functions by passing two values and catching the return value
result = sum_two_nums2(27, 37)
print('Function has returned ', result)

result = sum_two_nums3(27, 39)
print('Function has returned ', result)
result = sum_two_nums3(27)
print('Function has returned ', result)

# how to make arguments non-positional
# by using keywords while passing the value
result = sum_two_nums3(num2=27, num1=39)
print('Function has returned ', result)
# functions
'''
1. required: you have to provide value to this argument
2. positional: takes the value based on the position
3. default (non-required):
4. keyword (non-positional)
'''
def func1(n1, n2):
    print("N1 = ", n1)
    print("N2 = ", n2)
    add = n1 + n2
    return add

def func2(n1, n2=66):  # assigning a default value
    print("N1 = ", n1)
    print("N2 = ", n2)
    add = n1 + n2
    return add

def func3(n1=-1, n2=0):  # assigning default values
    print("N1 = ", n1)
    print("N2 = ", n2)
    add = n1 + n2
    return add

# *s collects extra positional arguments in a tuple,
# **info collects extra keyword arguments in a dictionary
def fun4(s1, s2, s3, *s, **info):
    print("Values of S are:", s)
    if len(s) == 0:
        print("We are dealing with a triangle")
    elif len(s) == 1:
        print("We are dealing with a square or a rectangle")
    elif len(s) == 2:
        print("We are dealing with a Pentagon")
    elif len(s) == 3:
        print("We are dealing with a Hexagon")
    elif len(s) == 4:
        print("We are dealing with a Heptagon")
    elif len(s) == 5:
        print("We are dealing with an Octagon")
    elif len(s) > 5:
        print("We are dealing with a shape greater than 8 sides")

    print("INFO = ", info)
    if "name" in info.keys():
        print(f"Name of the player is {info['name']}")
    if "game" in info.keys():
        print(f"The player plays {info['game']}")
    if "city" in info.keys():
        print(f"The player lives in {info['city']}")

# docstring is the multi line comment added in the first line in a function
def checkprime(num):
    '''
    This is a function that takes parameter and returns
    True for Prime and False for non Prime
    :param num: value to check
    :return: True for Prime and False for Non-Prime
    '''
    isPrime = True
    if num < 2:
        isPrime = False
    elif num > 2:
        for i in range(2, num // 2 + 1):
            if num % i == 0:
                isPrime = False
                break
    return isPrime
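checkprime() above tries every divisor up to num // 2. A common refinement, shown here only as a sketch and not as the notes' own method, is to stop at the square root, because any factor larger than the square root is paired with a smaller one. The name is_prime is chosen for this illustration, and math.isqrt needs Python 3.8 or later.

```python
import math


def is_prime(num):
    """Return True if num is prime, testing divisors only up to sqrt(num)."""
    if num < 2:
        return False
    for i in range(2, math.isqrt(num) + 1):
        if num % i == 0:
            return False   # found a divisor, so num is not prime
    return True


primes = [n for n in range(10, 20) if is_prime(n)]
print(primes)
```

For a range such as 5000 to 6000 this checks at most about 77 divisors per number instead of about 3000.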
result = func1(10, 20)
print("Total is", result)
result = func2(10)
print("Total is", result)
result = func2(20, 50)
print("Total is", result)
result = func3(2, 5)
print("Total is", result)
result = func3()
print("Total is", result)

result = func3(n2=6)
print("Total is", result)

fun4(1, 2, 3, 4, 5, 6, 7, 8, name="Sachin", game="Cricket", city="Mumbai")
fun4(1, 2, 3, name="Virat", game="Cricket")

for i in range(10, 20):
    result = checkprime(i)
    if result:
        print(f"{i} is Prime")
    else:
        print(f"{i} is not Prime")

# using the same above function to generate prime numbers between 5000 and 6000
print("Prime numbers are:")
for i in range(5000, 6001):
    result = checkprime(i)
    if result:
        print(i, end=", ")
print()
result = recurfacto(10)
print("10 factorial is", result)
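The definition of recurfacto() does not appear in these notes. The name suggests a recursive factorial, so the following is only a guess at what it might look like, written as a minimal sketch.

```python
def recurfacto(n):
    """Recursive factorial: n! = n * (n-1)!, with 0! = 1! = 1 (assumed behaviour)."""
    if n < 0:
        raise ValueError("factorial is not defined for negative numbers")
    if n <= 1:
        return 1                     # base case stops the recursion
    return n * recurfacto(n - 1)     # recursive case


result = recurfacto(10)
print("10 factorial is", result)
```

Every recursive function needs a base case like the one above. Without it the calls would continue until Python raises RecursionError.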
import p1 as MyOwnModule
from p1 import checkprime, fun4

if __name__ == "__main__":
    print("Running PY18.py")
    # printing doc string
    print(checkprime.__doc__)
    # result = p1.checkprime(97)
    MyOwnModule.checkprime(97)
    result = checkprime(97)
    if result:
        print("97 is prime")
    else:
        print("97 is not a prime")
'''
Properties of class & objects:
1. Inheritance:
2. Polymorphism:
3. Abstraction: hiding implementation detail
4. Encapsulation: hiding information
'''
# program to implement addition, subtraction, multiplication, division
class Super_Op:
    def __init__(self):
        print("Super_Op is initialized")

    def Class_Output1(self):
        print("Output from Super_Op")

class Math_Op(Super_Op):
    def __init__(self, n):
        self.n = n

    def val_square(self):
        return self.n ** 2

    def val_squarert(self):
        return self.n ** 0.5

    def Class_Output1(self):
        print("Output from Math_Op")

class B2:
    def __init__(self):
        print("B2 is initialized")

    def Class_Output1(self):
        print("Output from B2")

class Arith_Op(B2, Math_Op):
    def __init__(self, n1, n2):
        Math_Op.__init__(self, n1)  # call Math_Op init to make sure the data is initialized
        self.n1 = n1
        self.n2 = n2

    def Add(self):
        return self.n1 + self.n2

    def Sub(self):
        return self.n1 - self.n2

    def Mul(self):
        return self.n1 * self.n2

    def Div(self):
        return self.n1 / self.n2

    def Class_Output1(self):
        print("Output from Arith_OP")
'''
Properties of class & objects:
1. Inheritance:
2. Polymorphism:
3. Abstraction: hiding implementation detail
4. Encapsulation: hiding information:
   3 types of accessibility provided are:
   4.1: public
   4.2: protected
   4.3: private
'''
# program to implement addition, subtraction, multiplication, division
class Super_Op:
    def __init__(self):
        print("Super_Op is initialized")

    def Class_Output(self):
        print("Output from Super_Op")

    def _Class_Output2(self):  # protected
        print("Output from Super_Op - Protected")

    def __Class_Output2(self):  # private
        print("Output from Super_Op - Private")

class Math_Op(Super_Op):
    def __init__(self, n):
        self.n = n

    def val_square(self):
        return self.n ** 2

    def val_squarert(self):
        return self.n ** 0.5

    def Class_Output(self):
        print("Output from Math_Op")

class B2:
    def __init__(self):
        print("B2 is initialized")

    def Class_Output(self):
        print("Output from B2")

class Arith_Op(B2, Math_Op):
    def __init__(self, n1=0, n2=0):
        Math_Op.__init__(self, n1)  # call Math_Op init to make sure the data is initialized
        self.n1 = n1
        self.n2 = n2

    def Add(self):
        return self.n1 + self.n2

    def Sub(self):
        return self.n1 - self.n2

    def Mul(self):
        return self.n1 * self.n2

    def Div(self):
        return self.n1 / self.n2

    def Class_Output1(self):
        print("Output from Arith_OP")
class testClass:
    def testMethod(self):
        s1 = Super_Op()
        s1.Class_Output()
        s1._Class_Output2()
        '''
        A protected member is being called here. By convention it should not
        be called from outside the class hierarchy, but Python does not enforce this.
        '''
        # s1.__Class_Output2()
        # above method of Super_Op is not accessible.

'''
testClass is in no way connected to any of the classes above, but it can still
call Super_Op class members.
- public: public members can be called by any class.
- protected: to make a member protected, add _ (single underscore) before the name.
  Protected members are meant to be called by derived classes only.
  BUT THIS IS A CONVENTION ONLY, IT IS NOT ENFORCED BY PYTHON!
- private: to make a member private, add __ (double underscore) before the name.
  Private members can be accessed only by the members of the same class.
'''
t1 = testClass()
t1.testMethod()

a1 = Arith_Op(10, 20)
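The notes above say that private members cannot be called from outside the class. What Python actually does is rename them, a mechanism called name mangling: inside class Base, __private becomes _Base__private. The sketch below uses a made-up class Base to show this. It is an illustration of the language rule, not a recommendation to call private members this way.

```python
class Base:
    def public(self):
        return "public"

    def _protected(self):        # single underscore: a convention only
        return "protected by convention"

    def __private(self):         # double underscore: name is mangled
        return "private (name-mangled)"


obj = Base()
print(obj.public())
print(obj._protected())          # works, Python does not block it

try:
    obj.__private()              # fails: no attribute with this exact name
except AttributeError as err:
    print("AttributeError:", err)

print(obj._Base__private())      # the mangled name is still reachable
```

So "private" in Python is a strong hint to other programmers rather than an access barrier.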
m1 = Math_Op(4)
m1._Class_Output2()
# m1.__Class_Output2() - you can't call it since it's private

##################
'''
handling external files: .txt, .json, .csv
How to handle file reading: import os
'''
import os

print("Operating system in use: ", os.name)
# nt platform for windows
# posix platform for Mac, Unix, Linux
# java etc...
if os.name == "nt":
    # windows related commands
    print("You are working with Windows machine")
    # to clear the screen - cls
    os.system('cls')
elif os.name == "posix":
    print("You are working on UNIX/Linux/Mac machine")
    # those related commands you use
    os.system('clear')
else:
    print("Some other platform. Please check")

# create directory on any platform (no error if it already exists)
os.makedirs("Test1", exist_ok=True)
############
'''
Modes of file:
r  : read
w  : write
a  : append (writing additional content to existing file)
r+ : read and write mode: file should be existing
w+ : write and read mode: file needn't be there
a+ : append and read
'''
poem = '''Twinkle Twinkle little star
How I wonder what you are
Up above the world so high
like a diamond in the sky'''
print(poem)

# write - all the content
# write line - adding line by line
# write lines - multiple lines - content has to be in []
# step 1: create a file pointer - needs where and how
file_ptr = open("Test1/abc.txt", "a")
file_ptr.write(poem)
file_ptr.close()

file_ptr = open("Test1/abc.txt", "r")
read_content = file_ptr.read()
file_ptr.close()
print("Content read from the file:\n", read_content)
# reading external files - .txt file
# read and write to a file
filptr = open("Test1/abc.txt", "r")
# read
content = filptr.read()
print("1 Content of the file is: \n", content)
filptr.seek(10)             # move the file pointer to position 10
content = filptr.read(30)   # read the next 30 characters
print("2 Content of the file is: \n", content)
content = filptr.read(30)   # continues from where the previous read stopped
print("3 Content of the file is: \n", content)
filptr.seek(0)              # back to the beginning of the file

# readline
content = filptr.readline()
print("1. Line: Content of the file is: \n", content)
content = filptr.readline(50000)  # reads at most 50000 characters of the next line
print("2. Line: Content of the file is: \n", content)
filptr.seek(0)

# readlines
content = filptr.readlines()
print("1. Lines: Content of the file is: \n", content)

# closing the file
filptr.close()
filptr = open("Test1/abc.txt", "w")  # same operations as append, but old content is erased
# write
content = """Ba Ba Black sheep
Do you have any wool
Yes sir yes sir
three bags full"""
filptr.write(content)
## writelines
content = ['\nHello there\n', 'How are you doing?\n',
           'I am fine\n', 'Doing awesome\n']
filptr.writelines(content)
# close
filptr.close()
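The examples above call close() by hand, which is easy to forget and is skipped if an error occurs in between. The same write, append and read steps can be sketched with the with statement, which closes the file automatically. The sketch uses a temporary folder from the standard tempfile module so that it does not touch Test1; that choice is made only for this illustration.

```python
import os
import tempfile

lines = ["Ba Ba Black sheep\n", "Do you have any wool\n"]

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "abc.txt")

    with open(path, "w") as f:     # "w" creates the file or erases old content
        f.writelines(lines)

    with open(path, "a") as f:     # "a" adds to the end of the file
        f.write("Yes sir yes sir\n")

    with open(path, "r") as f:     # "r" reads the file back
        read_back = f.readlines()

print(read_back)
print("File closed after the block:", f.closed)
```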
'''
Mini Project: Write a .txt file based storage program to store data about
Indian batsman for the world cup 2023. And pick up the highest score of each
player through the python code.
example: store player and their highest score
Rohit 43
Gill 55
Rahul 21
Iyer 5
Virat 86
'''
# reading external files - .csv file
import csv

header = ['SNO', 'NAME', 'COUNTRY', 'HIGHEST']
info = [['1', 'Rohit', 'India', 43], ['2', 'Gill', 'India', 67],
        ['3', 'Iyer', 'India', 4], ['4', 'Virat', 'India', 93]]
fileptr = open("Test1/abc.csv", mode="w", newline='')
fileptr_csv = csv.writer(fileptr, delimiter=",")
fileptr_csv.writerow(header)

for i in info:
    fileptr_csv.writerow(i)

fileptr.close()

fileptr = open("Test1/abc.csv", mode="r", newline='')
fileptr_csv = csv.reader(fileptr, delimiter=",")
for i in fileptr_csv:
    print(i[1], " - ", i[3])  # the first row printed is the header
fileptr.close()
'''
Modify the above program to print the name and the score of the
highest-scoring player only
'''
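One possible approach to the exercise above is sketched here. It assumes the same columns (SNO, NAME, COUNTRY, HIGHEST) and the same four rows as the writer example, and it reads them from an in-memory string with io.StringIO instead of Test1/abc.csv so that it runs on its own. csv.DictReader uses the header row as dictionary keys.

```python
import csv
import io

# Same data as the CSV written above, kept in memory for this sketch
csv_text = (
    "SNO,NAME,COUNTRY,HIGHEST\n"
    "1,Rohit,India,43\n"
    "2,Gill,India,67\n"
    "3,Iyer,India,4\n"
    "4,Virat,India,93\n"
)

reader = csv.DictReader(io.StringIO(csv_text))
# csv returns every value as a string, so convert the score before comparing
top = max(reader, key=lambda row: int(row["HIGHEST"]))
print(top["NAME"], "-", top["HIGHEST"])
```

To use a real file, replace io.StringIO(csv_text) with the file object returned by open(), keeping newline='' as in the example above.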
{"Name": "Siraj", "Type": "Bowler", "City": "Hyderabad"}]}'''
# loads() reads JSON from a string (here the text typed above), not from a file;
# the result is a Python dictionary, not a string
json_content2 = json.loads(jason_txt)
print(json_content2)
print(json_content2['Players'])
print(json_content2['Players'][1]['Name'])
for i in json_content2['Players']:
    print(i['Name'])
fileptr.close()
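Because the start of the JSON example is cut off in these notes, here is a small self-contained sketch of the same idea. The two player records are sample data made up for this illustration. json.loads() turns a JSON string into Python objects and json.dumps() does the reverse.

```python
import json

# Sample data for illustration only
json_txt = '''{"Players": [
    {"Name": "Rohit", "Type": "Batsman", "City": "Mumbai"},
    {"Name": "Siraj", "Type": "Bowler", "City": "Hyderabad"}]}'''

data = json.loads(json_txt)            # JSON string -> dict
names = [player["Name"] for player in data["Players"]]
print(type(data), names)

back_to_text = json.dumps(data, indent=2)   # dict -> JSON string
print(back_to_text)
```

For files, json.load(file_object) and json.dump(data, file_object) work the same way without the intermediate string.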
'''
Errors:
1. Syntax error: rule not correctly written
2. Logical error: made error in writing the logic
3. Exceptions: runtime errors
'''
num1 = 5
num2 = 2
# logical error: the message says "Sum" but the code subtracts
print(f"Sum of {num1} and {num2} is {num1-num2}")

num1 = int(input("Enter a number: "))  # raises ValueError for non-numeric input
print("Value entered is ", num1)

# handling runtime errors is called exception handling
try:
    num1 = int(input("Enter a number: "))
except ValueError:
    print("Since you have entered an invalid number, we are stopping here!")
else:
    print("This is a valid number")
finally:
    print("Thank you for using my application")
'''
WAP to divide 2 numbers
'''
num1, num2 = 0, 0
while True:
    try:
        num1 = int(input("Enter first number: "))
    except ValueError:
        print("Invalid number! Try again...")
    else:
        break

while True:
    try:
        num2 = int(input("Enter second number: "))
    except ValueError:
        print("Invalid number! Try again...")
    else:
        try:
            div = num1 / num2
        except ZeroDivisionError:
            print("Denominator can't be Zero! Try again...")
        else:
            break

div = num1 / num2
print("Division value is:", div)
print("Thank you for using my application")
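The two while loops above repeat the same try/except pattern. One way to avoid the repetition, offered as a sketch rather than the notes' own solution, is to move it into helper functions. The names read_int and safe_divide, and the reader parameter that lets the example run without keyboard input, are assumptions made for this illustration.

```python
def read_int(prompt, reader=input):
    """Keep asking until the reader returns text that converts to an int."""
    while True:
        try:
            return int(reader(prompt))
        except ValueError:
            print("Invalid number! Try again...")


def safe_divide(numerator, denominator):
    """Return the quotient, or None when the denominator is zero."""
    try:
        return numerator / denominator
    except ZeroDivisionError:
        return None


# Simulated keyboard input: first an invalid entry, then a valid one
answers = iter(["abc", "12"])
value = read_int("Enter first number: ", reader=lambda _prompt: next(answers))
print("Value read:", value)
print("12 / 4 =", safe_divide(value, 4))
print("12 / 0 =", safe_divide(value, 0))
```

In a real program, calling read_int("Enter first number: ") without the reader argument uses the normal input() function.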
if __name__ == "__main__":
    myfun1(99, 87)
    myfun1(99, 7)
    s1 = Sample()

=====================

# import p11 as RocketScience
from p11 import myfun1

# RocketScience.myfun2()
print(myfun1(5, 10))

import random
random.random()
########### FILE HANDLING
# mode of file handling: r (read), w (write - delete old content and write new), a (append)
## r+ (read & write), w+ (write & read), a+ (append and read)
filename = "abc.txt"
fileobj = open(filename, "r+")  # r+ needs the file to exist already
content = '''This is a sample content
story about a king
and a queen
who lived in a jungle
so I am talking about Lion
the king of the jungle'''
fileobj.write(content)
content2 = ['This is sample line 1\n', 'line 2 content \n', 'line 3 content \n']
fileobj.writelines(content2)

fileobj.seek(20)
output = fileobj.read()
print("Content from the file:\n", output)

fileobj.seek(10)
output = fileobj.read()
fileobj.seek(10)
content3 = fileobj.read(15)     # 15 characters starting at position 10
content4 = fileobj.readline()   # rest of the current line
print("Content from the file:\n", output)

fileobj.seek(0)
content5 = fileobj.readlines()
print("Content from the file:\n", content5)

fileobj.close()
## Exception handling
# SyntaxError: print("Hello)
# logical error: you make error in the logic - very difficult to find
# runtime errors (exceptions):
num1 = int(input("Enter a number: "))  # ValueError exception

except ZeroDivisionError:
    print("Denominator is zero hence stopping the program from executing")
except TypeError:
    print("Invalid numbers, hence exiting...")
except NameError:
    print("One of the values has not been defined. Try again")
except Exception:
    print("Not sure but some error has occurred, we need to stop")
else:
    print("Answer is", c)
finally:
    print("We have completed division process")

# user-defined exception
class InvalidLength(Exception):
    def __init__(self, value=0):
        self.value = value
length, breadth = -1, -1
while True:
    try:
        length = int(input("Enter length: "))
    except ValueError:
        print("Invalid number, try again...")
    else:
        # assert length > 0, "Rectangle with this dimension is not possible"
        if length <= 0:
            try:
                raise InvalidLength
            except InvalidLength:
                print("Invalid value for Length hence resetting the value to 1")
                length = 1
        break

while True:
    try:
        breadth = int(input("Enter breadth: "))
    except ValueError:
        print("Invalid number, try again...")
    else:
        assert breadth > 0, "Rectangle with this dimension is not possible"
        break

area = length * breadth
print("Area of the rectangle is", area)

## ### ##
# datetime, date, time
from datetime import datetime, timedelta
import time
from pytz import timezone  # third-party package, needs to be installed separately

curr_time = datetime.now()
print("Current time is", curr_time)
print("Current date is", curr_time.strftime("%d / %m /%Y"))
print(curr_time.year, curr_time.day, curr_time.date())
for i in range(5):
    time.sleep(1)  # sleep for 1 second
    print("Time left:", 5 - i, "seconds")
print("Good Morning")
print("Current time is", datetime.now())
print("Date 2 days back was", (curr_time - timedelta(days=2)).strftime("%d/%m/%Y"))
print("UTC Time is", datetime.now(timezone('UTC')))
print("US/Eastern Time is", datetime.now(timezone('US/Eastern')))
print("Asia/Kolkata Time is", datetime.now(timezone('Asia/Kolkata')))
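The time zone lines above rely on the third-party pytz package. For zones with a fixed offset from UTC, the standard library alone is enough. The sketch below uses datetime.timezone with a fixed moment instead of now() so that the result is reproducible. The name IST and the sample date are chosen only for this illustration. Zones with daylight saving time, such as US/Eastern, still need a time zone database.

```python
from datetime import datetime, timedelta, timezone

# India Standard Time is a fixed 5 hours 30 minutes ahead of UTC
IST = timezone(timedelta(hours=5, minutes=30), name="IST")

moment_utc = datetime(2023, 10, 8, 12, 0, tzinfo=timezone.utc)
moment_ist = moment_utc.astimezone(IST)   # same instant, shown in another zone

print("UTC time is", moment_utc.strftime("%d/%m/%Y %H:%M %Z"))
print("IST time is", moment_ist.strftime("%d/%m/%Y %H:%M %Z"))
print("Same instant:", moment_utc == moment_ist)
```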