CS 236: Advanced Databases
Course Description:
In this course, we will discuss various issues arising in the context of
data management. The course will begin with a review of such issues as
file systems, architecture of database management systems,
data models, and relational databases. We will also examine logical and
physical design of databases, hardware and software implementation of database
systems, and distributed databases. The bulk of the class will consist of
reading papers drawn from the research literature.
Prerequisites:
Students must have taken a course in databases.
Class times:
Mondays and Wednesdays, 3:30pm - 4:50pm. The class meets in Bourns A125.
Office hours:
5pm - 6pm, MW, or by appointment. Tel: 827-2451.
E-mail: ravi@cs.ucr.edu.
Grading:
Class participation: 15%, project: 50%, exams: 35%.
Project or Research Paper
You will need to complete a research paper or a systems project for the class. Please see the
"Assignments" section in Canvas for details.
Books Useful to this Class
The bulk of the readings are expected to be from the research literature. A
list of readings from the literature will be made available. No textbook is
specifically required, but the following books are likely to be useful:
- ``Database Management Systems'', R. Ramakrishnan and J. Gehrke, McGraw Hill.
- ``Fundamentals of Database Systems'', R. Elmasri and S. Navathe, Pearson
Publishing.
Database Conferences
Here is a list of conferences with papers of relevance to this class.
The conferences have been ranked as "Tier-1" (highest prestige), "Tier-2", etc. Database conferences
are prefixed with (DB). However, one often finds relevant papers in conferences on Data Mining (DM),
Machine Learning (AI), Information Retrieval (IR), the World-Wide Web (W3), etc.
These conferences will give you a good idea of the nature of current research in the field of
databases.
Paper Readings in the Class
Here is a preliminary list of papers we will read in this class.
Indexing
R-tree indices:
Antonin Guttman: R-Trees: A Dynamic Index Structure for Spatial Searching.
SIGMOD Conference 1984: 47-57, R-tree.pdf
N. Beckmann, H.-P. Kriegel, R. Schneider, and B. Seeger. The R*-tree: An
Efficient and Robust Access Method For Points and Rectangles. SIGMOD Conference
1990, rstar.pdf
The Grid File:
J. Nievergelt, H. Hinterberger, K.C. Sevcik. The Grid File: An Adaptable,
Symmetric Multikey File Structure. ACM Trans. Database Syst. 9(1): 38-71
(1984), grid-file.pdf
see also this summary.
Space Filling Curves:
H.V. Jagadish. Linear clustering of objects with multiple attributes. SIGMOD
Conference 1990, hilbert-curve.pdf
Atinder's slides on R-Trees: rtree-slides
Here are slides on R-Trees, grid-file and space filling curves from G.
Kollios: Kollios-NTUA-structures-slides
You can find a framework (implemented by Marios Hadjieleftheriou) to create
spatial indices here.
Spatial Queries
Join Processing:
Leonard D. Shapiro: Join Processing in Database Systems with Large Main
Memories. TODS 11(3): 239-264, join.pdf
Donghui's slides on join processing: join-slides
Spatial Joins:
T. Brinkhoff, H-P Kriegel, B. Seeger: Efficient Processing of Spatial Joins
using R-trees. Proc. SIGMOD, 1993, r-tree-join.pdf
Ming-Ling Lo, Chinya V. Ravishankar: Spatial Joins using Seeded Trees.
SIGMOD Conference 1994: 209-220, seeded.trees.pdf
Ming-Ling Lo, Chinya V. Ravishankar: Spatial Hash-Joins. SIGMOD Conference
1996: 247-258, shj.pdf
Nick Koudas, Kenneth C. Sevcik: Size Separation Spatial Join. SIGMOD
Conference 1997: 324-335, ssj.pdf
Donghui's slides on spatial joins: spatial-join-slides
Ravi's slides on seeded-tree joins: seeded-trees-join slides
Nearest Neighbors:
N. Roussopoulos, S. Kelley, F. Vincent: Nearest Neighbor Queries. SIGMOD
Conference 1995: 71-79, roussopoulosNN95.pdf
G.R. Hjaltason, H. Samet: Ranking in Spatial Databases. SSD 1995: 83-95,
hjaltason95ranking.pdf
NN slides from G. Kollios: slides1 and from Y. Tao: slides2
Skyline Queries:
Stephan Börzsönyi, Donald Kossmann, Konrad Stocker: The Skyline Operator. ICDE
2001: 421-430, skyline-operator.pdf
Jan Chomicki, Parke Godfrey, Jarek Gryz, Dongming Liang: Skyline with
Presorting. ICDE 2003: 717-719, skyline-presorting.pdf
Dimitris Papadias, Yufei Tao, Greg Fu, Bernhard Seeger: An Optimal and
Progressive Algorithm for Skyline Queries. SIGMOD Conference 2003:
467-478, skyline-bbs.pdf
Skyline slides from Y. Tao: skyline slides
Data Intensive Applications
Dean, J. and Ghemawat, S. 2008. MapReduce: simplified data processing
on large clusters. Commun. ACM 51, 1 (Jan. 2008), 107-113,
MapReduce.pdf
The map-reduce slides from Cloudera.
Aggregation for Data Intensive Applications:
Jian Wen, Vinayak R. Borkar, Michael J. Carey, Vassilis J. Tsotras:
Revisiting Aggregation for Data Intensive Applications: A Performance
Study. CoRR abs/1311.0059 (2013), aggregation.pdf
Here are the slides on aggregation: aggregation-slides
Top-K Queries
R. Fagin. "Combining fuzzy information: an overview." SIGMOD
Record, Vol. 31, No. 2, June 2002, pp. 109-118, fagin-sigrec02.pdf
Here are the Top-k slides
Temporal Databases And Indexing
Slides on Temporal DBs and Indexing: temporal databases, snapshot
index, MVB-Tree.
B. Salzberg and V.J. Tsotras: Comparison of Access Methods for
Time-Evolving Data. ACM Comput. Surv. 31(2): 158-221 (1999),
tempDB-survey.
V.J. Tsotras, N. Kangelaris: The Snapshot Index: An I/O-optimal
access method for timeslice queries. Inf. Syst. 20(3): 237-260
(1995), SI-index.
B. Becker, S. Gschwind, T. Ohler, B. Seeger, P. Widmayer: An
Asymptotically Optimal Multiversion B-Tree. VLDB J. 5(4): 264-275
(1996), MVB-Tree
Data Outsourcing and Security
H. Hacigümüs, B. Iyer, C. Li, and S. Mehrotra. Executing
SQL over encrypted data in the database-service-provider
model. In Proc. ACM SIGMOD, pages 216-227, 2002.
B. Hore, S. Mehrotra, M. Canim, and M. Kantarcioglu.
Secure multidimensional range queries over outsourced data.
The VLDB Journal, pages 1-26, 2011.
Jonathan L. Dautrich and Chinya V. Ravishankar, ``Compromising Privacy in
Precise Query Protocols'', Proc. of the 16th International Conference on
Extending Database Technology (EDBT 2013), Genoa, Italy, March 2013.
Peng Wang and Chinya V. Ravishankar, ``Secure and Efficient Range Queries on
Outsourced Databases Using #-trees'', Proc. 29th International Conference on
Data Engineering (ICDE 2013), Brisbane, Australia, April 2013.