Data Analytics – November 2022
Assignment
Develop
1. Library Management System – think of different tables it can have and also add columns to those tables
2. Add constraints to columns as applicable
3. Add data types to each column
4. Add atleast 5 sample data to each table
MS SQL Server Authentication method:
- Windows login
- Sa
- Both: select both
SQL Server
DATATYPES:
.Approximate numeric data types (Float)
This data type is used to store floating-point and real values. It is mainly used in scientific calculations.
Date and Time data types
We use these data types to hold the temporal values such as date and time, including time offset in a column.
Character string data type
This data type allows us to define the character data type only, which can be fixed or variable in length.
Unicode character string data types
This data type allows us to define the full range of Unicode character sets encoded in the UTF-16 character set.
Binary data types
This data type allows storing image, audio, and video files of fixed and variable length into a database location. It stores information in 0 and 1 format.
Creating a table
— Creating a table
Create Table Library.dbo.Members(
MemberID INT IDENTITY PRIMARY KEY,
Full_name Varchar(40) NOT NULL,
Phone char(10),
emailid Varchar(20)
)
— Create table databasename.schemaname.tablename()
— Within the bracket we need to give the column names and the data types
— Add constraints if any: Primary Key
— We can use Identity to generate unique integer values: SQL SERVER
USE [Library]
GO
/****** Object: Table [dbo].[BOOKS] Script Date: 24-11-2022 07:04:04 ******/
IF EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[dbo].[BOOKS]’) AND type in (N’U’))
DROP TABLE [dbo].[BOOKS]
GO
/****** Object: Table [dbo].[BOOKS] Script Date: 24-11-2022 07:03:22 ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[BOOKS](
[BOOKID] [int] IDENTITY PRIMARY KEY,
[BTITLE] [nchar](10) NULL,
[AUTHOR] [nchar](10) NULL,
[COST] [real] NULL
)
GO
Codd’s Rules (defines RDBMS systems)
- Rule 1 Information rule: information needs to be stored in a table format and each cell has to have one unique data => Table format
- Rule 2 Guaranteed access: Each single data has to be accessible => each table has a primary key
- Rule 3 Handling NULL values uniformly for all tables: data missing/not known/not applicable
- Rule 4 Active online catalog: Data Dictionary (managed by system) should have all the information about metadata. => sys.objects in MSSS
- Rule 5 comprehensive sub-language rule: database language to access data in a linear syntax format => SQL
- Rule 6 Updating Rule (about Table): data needs to be updatable => Update/Delete/Alter
- Rule 7 Insert, Update, Delete available (Database) => UNION,INTERSECTION etc
- Rule 8 Physical data independence: can only be changed from within the database. No external factor can change the stored data.
- Rule 9 Logical data independence: Application should not be impacted by any change in the database
- Rule 10 Integrity independence: database design can not be influenced by anyother team
- Rule 11 Distribution independence: User need/should not know where the data is stored.
- Rule 12 Non subversion: access to the lowest level => if you have access to a table, you should be able to access all the records/tuples in that table
Design
- Table
- Tuple
- Record
- Relation instance: group of columns that gives the relationship
- Employee table: EMP_NAME & EMP_PHONE
- Schema: Group of tables and other database objects
- Relation key (KEY): set of columns that will uniquely identify each row
- EMP_ID, NAME & DOB, Phonenumber & NAME
- Referential Integrity constraint (FOREIGN KEY): link 2 tables, in current table its called FOREIGN KEY and in Other table, it has to be a Primary Key
E R Diagram
Example: (Member entity)
Handling Many to Many relationship:
Click here to Watch the Video
Home, Assignments and Projects
Homework 1:
Image Search for 3 ER diagrams from internet (preferably related to your domain of interest)
Re-draw it and just add 3-4 lines about that ER
Assignment 1:
E R diagram: Library Management System and We will later convert these into database
Project 1: Develop the conceptual model with an EntityRelationship Diagram (ERD)
1. What are the entities in our database (nouns – these will be the tables) – Draw these as rectangles
2. What are their attributes? (properties/characteristics of an entity that we want to collect and store in the DB) – think about what would uniquely identify a particular instance of the entity)
– Draw these as bubbles off the square
– Underline the attribute(s) that uniquely identifies instance
3. What are the relationships between entities? (what is the cardinality of that relationship? (1-1? 1-many? Etc.)
– Draw lines between entities – put a label on the line in a diamond
– Indicate by crow’s feet (or #’s) the cardinality at each end of the relationship
E R Diagram Refresher :
Rectangle — Entity
Ellipses — Attribute (underlined attributes are [part of] the primary key)
Double ellipses — multi-valued attribute
Dashed ellipses– derived attribute, e.g. age is derivable from birthdate and current date
Keys:
– Superkey: an attribute or set of attributes that uniquely identifies an entity–there can be many of these
– Composite key: a key requiring more than one attribute
– Candidate key: a superkey such that no proper subset of its attributes is also a superkey (minimal superkey – has no unnecessary attributes)
– Primary key: the candidate key chosen to be used for identifying entities and accessing records. Unless otherwise noted “key” means “primary key”
– Alternate key: a candidate key not used for primary key
– Secondary key: attribute or set of attributes commonly used for accessing records, but not necessarily unique
– Foreign key: term used in relational databases (but not in the E-R model) for an attribute that is the primary key of another table and is used to establish a relationship with that table where it appears as an attribute also. So a foreign key value occurs in the table and again in the other table. This conflicts with the idea that a value is stored only once; the idea that a fact is stored once is not undermined.
Using Management Studio:
USE [Library]
GO
/****** Object: Table [dbo].[Transactions] Script Date: 29-11-2022 06:36:52 ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[Transactions](
[TID] [smallint] IDENTITY(1,1) NOT NULL,
[MemberID] [int] NOT NULL,
[BookID] [int] NOT NULL,
[IssueDate] [date] NULL,
[ReturnDate] [date] NULL,
CONSTRAINT [PK_Transactions] PRIMARY KEY CLUSTERED
(
[TID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF) ON [PRIMARY]
) ON [PRIMARY]
GO
ALTER TABLE [dbo].[Transactions] WITH CHECK ADD CONSTRAINT [FK_Transactions_Transactions] FOREIGN KEY([MemberID])
REFERENCES [dbo].[Members] ([MemberID])
GO
ALTER TABLE [dbo].[Transactions] CHECK CONSTRAINT [FK_Transactions_Transactions]
GO
ALTER TABLE [dbo].[Transactions] WITH CHECK ADD CONSTRAINT [FK_Transactions_Transactions1] FOREIGN KEY([BookID])
REFERENCES [dbo].[BOOKS] ([BOOKID])
GO
ALTER TABLE [dbo].[Transactions] CHECK CONSTRAINT [FK_Transactions_Transactions1]
GO
USING QUERY EDITOR:
Create Table BOOKDETAILS (
BDID SMALLINT PRIMARY KEY,
BOOKID INT FOREIGN KEY REFERENCES BOOKS(BOOKID),
CITY varchar(15),
Publisher varchar(15)
)
Go
— Insert data into the tables
Insert into dbo.Members(Full_name,Phone,emailid) values(‘Sachin’,’23456′,’sachin@sachin.com’)
GO
Insert into dbo.Members(Full_name,Phone,emailid) values(‘Kapil’,’45456′,’kapil@sachin.com’)
GO
— Reading is done by Select query
Select MemberID, Full_name from dbo.Members
— Practice https://livesql.oracle.com/
— ALTER is used to change the structure of your table – add, drop, Alter (modify in some versions) table members
ALTER TABLE dbo.BOOKS
ADD ISBN char(13)
GO
ALTER TABLE dbo.BOOKS
ADD ISBN_10 char(10)
Go
— Drop column ISBN_10
ALTER TABLE dbo.BOOKS
DROP COLUMN ISBN_10
Go
— Change the datatype of a column
ALTER TABLE dbo.BOOKS ALTER COLUMN ISBN int
— ####################################
— Properties of a Foreign Key
— On DELETE OR UPDATE:
— 1. NO ACTION
— 2. CASCADE
— 3. SET TO NULL
— 4. DEFAULT
— DELETE – when value in main table is deleted, it performs above selected option in the secondary table
— UPDATE – when value in main table is updated, it performs above selected option in the secondary table
— BOOKS : 1 , 2, 3
— Delete : SET TO NULL
— UPDATE : CASCADE
ALTER TABLE dbo.BOOKS ALTER COLUMN BTITLE nchar(30)
GO
ALTER TABLE dbo.BOOKS ALTER COLUMN AUTHOR nchar(30)
GO
Insert into dbo.Books(BTITLE,AUTHOR) values (‘SQL Programming’,’Sachin’)
select * from dbo.BOOKS
go
select * from dbo.members
go
select * from dbo.Transactions
go
Insert into dbo.Transactions (MemberID,BookID,IssueDate) Values (1,2,GETDATE())
Insert into dbo.Transactions (MemberID,BookID,IssueDate) Values (2,2,GETDATE())
–Update the existing value in table
Update dbo.Transactions Set BookID = 3 where TID=2
Update dbo.Members Set MemberID = 100 where MemberID=1
Delete from dbo.Members where MemberID=1
DAY 16: using SELECT in SQL (livesql.oracle.com)
select * from hr.employees;
Select EMPLOYEE_ID, LAST_NAME, FIRST_NAME, EMAIL, HIRE_DATE from hr.Employees;
Select EMPLOYEE_ID, LAST_NAME “LASTNAME”, FIRST_NAME, EMAIL, SALARY, HIRE_DATE from hr.Employees;
Select EMPLOYEE_ID, LAST_NAME AS LASTNAME, FIRST_NAME, EMAIL, SALARY, HIRE_DATE from hr.Employees;
Select EMPLOYEE_ID, LAST_NAME “LASTNAME”, FIRST_NAME, EMAIL, SALARY, SALARY*12 “ANNUAL SALARY”, HIRE_DATE from hr.Employees;
Select Distinct last_name , first_name from hr.Employees
Select Distinct last_name , first_name, COMMISSION_PCT from hr.Employees ORDER BY COMMISSION_PCT
Select Distinct last_name , first_name, COMMISSION_PCT from hr.Employees ORDER BY COMMISSION_PCT NULLS FIRST
Select first_name || ‘ ‘ || last_name “Employee_Name” from hr.Employees
Select first_name || ‘ ‘ || last_name || ‘ Earns $’ || Salary*12 || ‘ Annually’ “Message” from hr.Employees
Select FIRST_NAME || ‘ ‘ || last_name || ‘ Earns $’ || Salary*12 || ‘ Annually’ “Message” from hr.Employees Where Department_ID = 60 AND first_name = ‘Alexander’
Select first_name || ‘ ‘ || last_name || ‘ Earns $’ || Salary*12 || ‘ Annually’ “Message” , Department_ID from hr.Employees Where Department_ID = 60 OR Department_ID = 90
select * from hr.employees;
select first_name, last_name, salary, COMMISSION_PCT from hr.employees where COMMISSION_PCT is not null;
select first_name, last_name, salary, COMMISSION_PCT from hr.employees where COMMISSION_PCT >= .2;
— order by to sort
select first_name, last_name, salary, COMMISSION_PCT from hr.employees where COMMISSION_PCT >= .2 order by COMMISSION_PCT desc;
select first_name, last_name, salary, COMMISSION_PCT, HIRE_DATE from hr.employees where COMMISSION_PCT >= .2 order by HIRE_DATE
select first_name, last_name, salary, COMMISSION_PCT from hr.employees where COMMISSION_PCT is not null and rownum <=5;
select first_name, last_name, salary, COMMISSION_PCT from hr.employees where COMMISSION_PCT is not null and rownum <=5 order by COMMISSION_PCT ;
select first_name, last_name, salary, COMMISSION_PCT from hr.employees where COMMISSION_PCT is not null order by COMMISSION_PCT ASC , FIRST_NAME ASC
— You can also refer column using a number – 1 indicate first column which is first_name
select first_name, last_name, salary, COMMISSION_PCT from hr.employees where COMMISSION_PCT is not null order by COMMISSION_PCT ASC, 1;
select first_name, last_name, salary, COMMISSION_PCT from hr.employees where first_name like ‘%d%’ order by COMMISSION_PCT NULLS FIRST
— AND : first_name like ‘D%’ and salary > 5000 – MIN RECORD 0 MAX RECORD 10 – COUNT = 5 – INTERSECTION
— OR : first_name like ‘D%’ or salary > 5000 – MIN RECORD 20 MAX RECORD 30 – COUNT = 25 – UNION
— 10 records where names which starts with D , 20 records which has salary greater than 5000 / 5 rows which are common
select count(*) from hr.employees where first_name like ‘D%’ ; –9
select count(*) from hr.employees where salary >=5000; –58
select count(*) from hr.employees where salary >=5000 and first_name like ‘D%’ ; — 5
select first_name, last_name from hr.employees where salary >=5000
INTERSECT
select first_name, last_name from hr.employees where first_name like ‘D%’;
select count(*) from hr.employees where salary >=5000 or first_name like ‘D%’;
select first_name from hr.employees where salary >=5000
UNION
select first_name from hr.employees where first_name like ‘D%’;
— OPERATORS in SQL
— Identity / negation + –
— Multiplication Division * /
— Addition, Subtraction, Concatenation + – ||
— Comparison operators: = != < > <= >= IS NULL LIKE BETWEEN IN
— Logical Operators: NOT AND OR
— Set operators: UNION INTERSECT MINUS
select * from hr.employees where salary <0
select * from hr.employees where -salary < 0 — negation
select FIRST_NAME, LAST_NAME, Salary*12 Annual_Salary from hr.employees — Multiplication
select FIRST_NAME ||’ ‘|| LAST_NAME Full_Name, Salary+ Salary* COMMISSION_PCT Total_Salary , COMMISSION_PCT, Salary from hr.employees
select FIRST_NAME ||’ ‘|| LAST_NAME Full_Name, Salary+ Salary* coalesce(COMMISSION_PCT,0) Total_Salary , COMMISSION_PCT, Salary from hr.employees
select FIRST_NAME ||’ ‘|| LAST_NAME Full_Name, Salary+ Salary* coalesce(COMMISSION_PCT,0) Total_Salary , COMMISSION_PCT, Salary from hr.employees where salary between 5000 and 10000
— IT_PROG AD_VP (60,90)
select FIRST_NAME ||’ ‘|| LAST_NAME Full_Name, Salary+ Salary* coalesce(COMMISSION_PCT,0) Total_Salary , COMMISSION_PCT, Salary from hr.employees where DEPARTMENT_ID IN (90,60)
select FIRST_NAME ||’ ‘|| LAST_NAME Full_Name, Salary+ Salary* coalesce(COMMISSION_PCT,0) Total_Salary , COMMISSION_PCT, Salary from hr.employees where JOB_ID IN (‘IT_PROG’,’AD_VP’)
select * from hr.departments
select DEPARTMENT_ID from hr.departments where LOCATION_ID in (1700, 1800, 1500)
select FIRST_NAME ||’ ‘|| LAST_NAME Full_Name from hr.employees where DEPARTMENT_ID IN (select DEPARTMENT_ID from hr.departments where LOCATION_ID in (1700, 1800, 1500))
select FIRST_NAME, LAST_NAME, LOCATION_ID , emp.DEPARTMENT_ID from hr.employees emp, hr.departments dept where COMMISSION_PCT is not null and emp.DEPARTMENT_ID = dept.DEPARTMENT_ID
— OPERATORS in SQL
— Identity / negation + –
— Multiplication Division * /
— Addition, Subtraction, Concatenation + – ||
— Comparison operators: = != < > <= >= IS NULL LIKE BETWEEN IN ANY SOME ALL
— Logical Operators: NOT AND OR
— Set operators: UNION INTERSECT MINUS
Select * from hr.employees
Select * from hr.employees where salary = ANY (3000,6000)
Select * from hr.employees where salary = 3000 or salary =6000
Select * from hr.employees where salary = SOME (3000,6000)
Select * from hr.employees where salary >= ALL (3000,6000)
Select * from hr.employees where salary = 3000 and salary =6000
Select First_name, last_name from hr.employees where salary = 3000 or salary =6000
UNION
Select First_name, last_name from hr.employees where DEPARTMENT_ID = 30 or DEPARTMENT_ID =60
Select First_name, last_name from hr.employees where salary = 3000 or salary =6000
UNION ALL
Select First_name, last_name from hr.employees where DEPARTMENT_ID = 30 or DEPARTMENT_ID =60
Select First_name, last_name from hr.employees where salary = 3000 or salary =6000
INTERSECT
Select First_name, last_name from hr.employees where DEPARTMENT_ID = 30 or DEPARTMENT_ID =60
Select First_name, last_name from hr.employees where DEPARTMENT_ID = 30 or DEPARTMENT_ID =60
INTERSECT
Select First_name, last_name from hr.employees where salary = 3000 or salary =6000
Select First_name, last_name from hr.employees where DEPARTMENT_ID = 30 or DEPARTMENT_ID =60
MINUS
Select First_name, last_name from hr.employees where salary = 3000 or salary =6000
— (1,3,5,6) – (2,4,6,7) = (1,3,5)
Select First_name, last_name from hr.employees where salary = 3000 or salary =6000
MINUS
Select First_name, last_name from hr.employees where DEPARTMENT_ID = 30 or DEPARTMENT_ID =60
DEFINE salary_val = 3000
Select First_name, last_name from hr.employees where salary = &salary_val
UNDEFINE salary_val
— FUNCTIONS – String/text
Select lower(first_name), upper(last_name), initcap(email) from hr.employees
select concat(concat(first_name,’ ‘), last_name) from hr.employees — 2 strings
— Substring : SUBSTR returns the position of a char: INSTR
select first_name, substr(first_name,1,5), instr(upper(first_name),’A’) from hr.employees
— RPAD LPAD
select RPAD(first_name,12,’.’) || ‘ ‘ || LPAD(last_name,10,’x’) from hr.employees
select RPAD(first_name,9,’ ‘) || ‘ ‘ || LPAD(last_name,9,’ ‘) || ‘ works as ‘ || LPAD(JOB_ID,8,’ ‘) MESSAGE from hr.employees
— — Number Functions
select * from dual
select sqrt(36) from dual
select distinct sqrt(salary) from hr.employees
select salary + COMMISSION_PCT TOTAL_SALARY from hr.employees where COMMISSION_PCT is not null
select salary + COMMISSION_PCT TOTAL_SALARY, round(salary + COMMISSION_PCT) ROUND_SALARY from hr.employees where COMMISSION_PCT is not null
select salary + COMMISSION_PCT TOTAL_SALARY, trunc(salary + COMMISSION_PCT) TRUNC_SALARY from hr.employees where COMMISSION_PCT is not null
select salary + COMMISSION_PCT TOTAL_SALARY, ceil(salary + COMMISSION_PCT) TRUNC_SALARY from hr.employees where COMMISSION_PCT is not null
select salary + COMMISSION_PCT TOTAL_SALARY, floor(salary + COMMISSION_PCT) TRUNC_SALARY from hr.employees where COMMISSION_PCT is not null
select salary * -COMMISSION_PCT TOTAL_SALARY, ABS(salary * -COMMISSION_PCT) TRUNC_SALARY from hr.employees where COMMISSION_PCT is not null
Select MOD(19,5) from dual
— — DATE FUNCTIONS
select * from hr.employees
select sysdate from dual
select FIRST_NAME, LAST_NAME, HIRE_DATE, floor(sysdate-hire_date) from hr.employees
select FIRST_NAME, LAST_NAME, HIRE_DATE, floor(sysdate-hire_date) DAYS_SERIVICE, MONTHS_BETWEEN(sysdate,hire_date) MONTHS_SERVICE from hr.employees
select ADD_MONTHS(sysdate,5) from dual
select NEXT_DAY(sysdate,1) from dual — date of the first weekday 1 for Sunday, 2 for Monday ….
select NEXT_DAY(sysdate,’FRIDAY’) from dual — date of the first weekday
select last_day(sysdate) from dual
select last_day(‘1-Feb-2023’) from dual
select next_day(‘1-Feb-2023’,2) from dual
select ADDDATE(day,1,sysdate) from dual
— year, quarter, month, day
select sysdate + 1 from dual
select DateAdd(sysdate, “1”, 10) from dual
select to_char(sysdate+100, ‘Dy DD-Month-YYYY HH24:MI:SS’) as “Current Time” from dual;
select dayofweek(sysdate) from dual
select now() from dual
Visit part 2 of the blog:
https://swapnil.pw/learn/data-analytics-nov-2022-part-2/