Useful SQL*Plus headers
Dino eats frog
Oracle with Active Record
When is he free?
Gentle dive into analytic functions
« Bitflux XML Editor
» Seven Tenets for Business Prosperity
I wanted to make some notes about a common mistake people make when making database enabled websites. Mind you, this has nothing to with web, it is really ignorance with principles of RDBMS design.
Let us take an example. Say, there needs to be a database to track employees and their performance levels on a yearly basis. Every year, an employee is given a rating from 1 to 5. This scenario can be repeated for example - may be every year employees get a hike. You need to track their salaries too.
I’ve seen this very often from web developers who think database design is a simple matter of creating tables and queries in MSAccess. It is indeed that simple, but you need to know underlying principles.
TABLE EMPLOYEES
employeeid NUMBER(10) NOT NULL PRIMARY KEY,
employee_name VARCHAR(100) NOT NULL,
...
grade_1998 NUMBER(1),
grade_1999 NUMBER(1),
grade_2000 NUMBER(1),
grade_2001 NUMBER(1),
grade_2002 NUMBER(1)
salary_1998, salary_1999
etc.This is typically what I call a spreadsheet design. To call this a relational database design is a really big stretch.
Note: The code below is for illustrative purposes only and does not particularly follow syntax for any RDBMS. Think of it like pseudo-code.
A good design in this case, could be:
TABLE EMPLOYEES
employeeid NUMBER(10) NOT NULL PRIMARY KEY,
employee_name VARCHAR(100) NOT NULL,
TABLE GRADES
employeeid NUMBER(10) NOT NULL,
grade_year NUMBER(4) NOT NULL,
grade NUMBER(1),
PRIMARY KEY (employeeid, grade_year)
CONSTRAINT fk_grades_employeeid
(employeeid FOREIGN KEY employees.employeeid)
To get the information, you just do a join.
SELECT
e.employee_id, e.employee_name,
g.grade_year, g.grade
FROM grades G, employees E
WHERE G.employee_id = E.employee_id
Oracle tip: For better performance in a join, ALWAYS put the table
that returns the least number of records after the WHERE clause is considered,
to the right of the join. For example, if you write FROM employees E, grades G
this query is going to be much slower.
Very simple. Instead of keeping only the year, we keep a date column.
Let us call this effective_date
. What this means is that,
the grade or salary you should consider for any given date, is determined
by the record with an effective_date nearest (in the past) to your given
date.
TABLE EMPLOYEES
employeeid NUMBER(10) NOT NULL PRIMARY KEY,
employee_name VARCHAR(100) NOT NULL,
TABLE GRADES
employeeid NUMBER(10) NOT NULL,
effective_date DATE NOT NULL,
grade NUMBER(1),
PRIMARY KEY (employeeid, effective_date)
CONSTRAINT fk_grades_employeeid
(employeeid FOREIGN KEY employees.employeeid)
TABLE SALARIES
employeeid NUMBER(10) NOT NULL,
effective_date DATE NOT NULL,
salary NUMBER(10,2) NOT NULL,
PRIMARY KEY (employeeid, effective_date)
CONSTRAINT fk_salaries_employeeid
(employeeid FOREIGN KEY employees.employeeid)
So, how do you get an employee’s current grade and salary? Assuming we are using Oracle, this can be done with the following query. SYSDATE is a pseudo-column in Oracle that returns current date and time.
SELECT
e.employee_id,
e.employee_name,
g.grade,
s.salary,
MAX(g.effective_date) grade_effective_since,
MAX(s.effective_date) salary_effective_since
FROM salaries S, grades G, employees E
WHERE G.employee_id = E.employee_id
AND S.employee_id = E.employee_id
AND S.effective_date <= SYSDATE
AND G.effective_date <= SYSDATE
GROUP BY
e.employee_id,
e.employee_name,
g.grade,
s.salary
Motto: Put the RELATIONAL back into Relational Database Design.
So, next time you get completely wow'ed by full-color, animated presentations from "web development firms" about their ability to make database driven sites, ask these questions: