CIS-552 Database Design
Fall 2009

Homework #2
Due: Monday November 16

Introduction.

In this assignment you will be working with Java implementations of a rudimentary DBMS file manager and buffer manager. To begin with, you should read the documentation of all the classes in the cis552 package, paying special attention to the DBFile, BufferManager, and SlottedPage classes. We will also discuss the design of the cis552 package in class.

In addition to the cis552 Java library, you are given a number of database tables to work with. A zip file with the library and database tables can be downloaded here. The database schema is given in schema.html. It is a database that records information used by an automobile insurance company. Each table is given in two formats. The *.db format is what is actually used by the file manager. These use a slotted page structure to efficiently store variable length records. As far as the file manager is concerned a record is simply an array of bytes. The records used in this database are all tab delimted text.

The *.txt files have the same data as the corresponding *.db files, but in a plain text format for human readability. In this assignment you will need files "people.db" and "people-ssn.idx". The "people.db" file is a heap (unordered) file containing the people relation. The "people-ssn.idx" file is an ordered sequential index on people.ssn. The index uses the same slotted page structure as the data files. Each record in the index file contains three fields:

  1. The SSN of a person.
  2. A block number of the people file.
  3. A record ID, relative to the block.

Assignment

  1. Write a class called Test with a method called query1WithoutIndex to solve the following query without using the index. Write a method called query1WithIndex to solve the query using the index. Each method should take parameters for the buffer size and buffer replacement policy to use. Each method should print the results of the query and the number of disk io's (block reads) that the buffer manager performed.
    select fname, lname from people where ssn = '881-37-5925'
  2. Write a method called query2WithoutIndex to solve the following query without using the index. Write a method called query2WithIndex to solve the query using the index. Each method should take parameters for the buffer size and buffer replacement policy to use. Each method should print the results of the query and the number of disk io's (block reads) that the buffer manager performed.
    select fname, lname from people where ssn < '4'
  3. Note that ssn is a char field, so ssn < 4 is a string comparison which is true for any ssn starting with a 0, 1, 2, or 3.

  4. For each of the queries above, experiment to determine the number of disk io's that are required under a variety of conditions. Write a main method that calls each of the four methods above with buffer sizes of 2, 10, 20, and 40 pages, and replacement policies of LRU and MRU. Altogether you should collect 32 data points.
  5. Complete the following tables to record your results:

SSN = '881-37-5925'
 Without IndexWith Index
BufferLRUMRULRUMRU
2    
10    
20    
40    
 
SSN < '4'
 Without IndexWith Index
BufferLRUMRULRUMRU
2    
10    
20    
40    

What to turn in

In hard copy, turn in a printout of your source code, your test results, and a brief report that explains your results. The report is the most important part. It should explain the effects of index use, buffer size, and buffer replacement policy on the efficiency of processing each of the test queries. What are your conclusions?

Also turn in a soft copy of your source code by email.