- SECTION A (APPLIED DATABASE THEORY) (30 marks)
You are required to write an opinion article (also known as an -op-ed- piece) for a technology magazine on the following topic:
The Role of Relational Databases and SQL in the Era of Big Data
This is a highly debated topic in the technology industry. While some experts believe that the reign of relational databases is finally over now that we are in the big data world; others think that big data analysis still requires the concepts of relational databases and SQL.
What do you think? Present your thoughts and arguments in support of, against the use of, or changing roles of relational databases and SQL for -Big Data- applications.
Big data refers to a broad term for comprehensive data sets that are huge in size, unstructured in format and highly recurring in frequency. Due to these complexities in these data sets, it is widely believed that traditional data management applications including relational databases and SQL are inadequate to capture, store, analyse and visualise big data.
A number of alternative technologies, such as NoSQL, Hadoop framework and Predictive Analytics applications are poised to facilitate big data management. Research and development in these areas are still ongoing.
The concept of Relational Database was developed in 1970s. The relational databases and SQL have been around for more than 20 years and when these were developed; data was assumed to be neat, structured and static. However organisations today are confronting the reality of big, fast and varied enterprise data measuring in petabytes [1 petabyte = 1000 Terabytes (TB)]. In this context, where do you think the relational databases and SQL are heading? Do you think the non-relational database technologies will replace relational databases and SQL in future? Justify your position with relevant desktop research.
Your opinion piece (op-ed) should be no longer than 500 words.
You are required to extensively research on the relevant topics, take a stance (role of relational databases in big data) and present concise and workable arguments.
There is no requirement for formal in-text referencing in this question. However, it is recommended to put a list of references at the end of the article showing the published materials that you researched while writing this article.
SECTION B (SQL) (40 marks)
For each question, three marks will be awarded for the SQL and one mark for the correct output.
The following E-R diagram represents the JustLee Books database. The script for the table creation is located on the Moodle website for the CIS2002 course under the assignment specifications. You will need to make sure that you re-run the script to reset the tables to their default state.
In this question, you will use the JustLee books database. You must use this data.
Full description of the JustLee database is found in Appendix A in the textbook on page 511.
Write SQL queries to solve the following specifications. Include the query AND THE OUTPUT. A screen dump of the output is acceptable. Show as many rows as you can. A screen dump is usually done via the ALT + PRNT SCRN command sequence or use the ‘Snipping Tool’ under windows to capture parts of the screen.
While the output helps to understand your solution, you should not be analyzing the output of the query in detail. As long as you are confident that your query corresponds to the question completely, your output may not be significant. You can have an output that says NO ROWS FOUND and it could be a perfectly valid output as long as your query fulfils the requirement. Explain why the query may not contain any output in this instance.
1. Display full details of all the attributes in the AUTHOR table.
2. Display the last name and first name for all the authors in the AUTHOR table and sort all records by last name order ascending.
3. Display full name all the customers in a single column (by concatenating firstname and lastname) that have been referred by existing customers identified by customer numbers 1003, 1004, 1005 and 1006.
4. List the names of all the states of customers currently living in the North region (i.e. region is ‘N’). Each state should only appear once and the result should be by state name in ascending order.
5. Display the ISBN and title of books in the BOOKS table where the category starts with the letter C and where the book has a discount.
6. Display the book title, cost, retail and calculate the profit and name the new field profit for all the books where the profit from the sale of the book is more than 65% of the cost of the book.
7. Display the ISBN, book title, cost and retail price of all the books published during the month of December in any year. Order the list by the book title.
8. Display the ISBN, quantity, price paid for the book (paideach column) and the total price paid (i.e. quantity * paideach) for all orders in the ORDERITEMS table when the quantity is greater than 1.
9. Display the book title, retail and price of the book after subtracting any available discount for that book, only for those books that have a discount. Name the new field ‘DISCOUNTED PRICE’ and order by the discounted price of the book in descending order.
10. List ISBN, title, retail, category and published date of all the books that have the (category of ‘COMPUTER’ or where the category contains the string ‘FAMILY’ anywhere in the category description) AND where (price of the books is more than 20 dollars and where the published date is after the end of the financial year for 2003/2004 – i.e. 30 June 2004).
SECTION C (Data Modelling) (30 marks)
Construct data models for the following specifications. Include an ERD and a list of relations (entity list). Your relations must show all attributes, primary keys and foreign keys. You must use the Clive Finkelstein methodology.
Question 1 (15)
A specialist clinic records patient details and their appointments. Each patient record consists of the patient id, name, address, contact number, date of birth and referring doctor name. When a patient books an appointment, the system records the date and time of the appointment, specialist doctor name, consultation duration and fees. Other details of specialist doctor include area of specialty and registration number.
Question 2 (15)
A pilot may be assigned to several flights as a crew member over time. The crew for a flight comprises a minimum of two pilots but could have up to four. For a pilot, we store the name, flying license number and birth date. For a flight, we store the flight code, origin location, destination location and type of aircraft. We also need to store the date a pilot is assigned to a flight, the role of the pilot and the total kilometres and hours the pilot has been flying in a flight.
Prepare the following for both questions:
1 An ER diagram: Show all entities, relationships, cardinalities and optionalities. Also, include all intersection entities but do not create entities that are not specifically covered by the specification. You must use the Finkelstein methodology as per the study book and tutorials.
2 A list of relations (entity list). Produce complete relations for all entities and attributes. Show all primary and foreign keys. Include all attributes that are specifically mentioned and all key attributes. You may need to create primary and foreign keys that are not specifically mentioned but do not create any other additional attributes.