Data Storage

The data in a database can be physically stored in different ways, each offering a particular set of advantages.

Serial Data Files

In a serial data file each record is exactly as large as the data to be stored and there are no empty records. Serial data files are therefore often the most space-efficient way to store data.

In this type of file structure the computer has to read through the data record-by-record until it finds the record that it needs to access. This makes accessing data from a serial file relatively slow.
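The record-by-record search described above can be sketched as follows. This is a minimal illustration, assuming a hypothetical text layout with one record per line in the form "id,name"; the function name find_record and the sample data are made up for the example.

```python
import io

def find_record(lines, wanted_id):
    """Read a serial file record by record until the wanted id is found.

    Returns the matching record (or None) and how many records were read.
    """
    records_read = 0
    for line in lines:
        records_read += 1
        record_id, name = line.rstrip("\n").split(",", 1)
        if record_id == wanted_id:
            return (record_id, name), records_read
    return None, records_read

# Hypothetical serial file: records are in no particular order.
serial_file = io.StringIO("17,MOUSE\n4,COMPUTER\n9,KEYBOARD\n")
print(find_record(serial_file, "9"))
```

In this sample the wanted record happens to be last, so every record in the file has to be read before it is found.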

If a record is deleted or edited, the complete altered file has to be re-written to the storage medium. This is relatively slow and may involve writing to a temporary file until the process is completed. The original file is then replaced by the altered temporary file.
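A rough sketch of deleting a record by re-writing the file through a temporary file is shown below. It assumes the same hypothetical "id,name" line layout; the function name delete_record and the file name records.txt are illustrative only.

```python
import os
import tempfile

def delete_record(path, unwanted_id):
    """Copy every record except the unwanted one to a temporary file,
    then replace the original file with the temporary file."""
    directory = os.path.dirname(os.path.abspath(path))
    with open(path, "r") as source, tempfile.NamedTemporaryFile(
        "w", dir=directory, delete=False
    ) as temp:
        for line in source:
            if line.split(",", 1)[0] != unwanted_id:
                temp.write(line)
    os.replace(temp.name, path)  # the altered temporary file becomes the data file

# Demonstration in a throwaway folder.
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "records.txt")
    with open(path, "w") as f:
        f.write("17,MOUSE\n4,COMPUTER\n9,KEYBOARD\n")
    delete_record(path, "4")
    with open(path) as f:
        remaining = f.read()
print(remaining)
```

Note that every surviving record is copied, even though only one record changed, which is why this operation is slow on large files.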

Sequential Data Files

Sequential data files are basically the same as serial files, except that the records are stored in order of a designated key field. This means records can be found more quickly. Similar issues remain when new records need to be inserted or old records need to be deleted, because the file still has to be re-written.
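One way the ordering can speed up a search is that the computer can stop as soon as it passes the place where the record would be. The sketch below illustrates this, assuming records are held as (key, data) pairs sorted by key; the function name and sample data are hypothetical.

```python
def find_in_sequential(records, wanted_key):
    """Search records that are sorted by key, stopping early once the
    keys read are already larger than the wanted key."""
    records_read = 0
    for key, data in records:
        records_read += 1
        if key == wanted_key:
            return data, records_read
        if key > wanted_key:
            break  # the record cannot appear later in a sorted file
    return None, records_read

# Hypothetical records in key order.
records = [(4, "COMPUTER"), (9, "KEYBOARD"), (17, "MOUSE"), (23, "PRINTER")]
print(find_in_sequential(records, 10))  # gives up at key 17 without reading key 23
```

In a serial file the same failed search would have to read every record to be sure the key was absent.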

Indexed Sequential Data Files

Indexed sequential files are used for very large sequential data files. Records are stored in order of their key field, and the file also has an index, which allows a search to jump straight to the correct block of records in the file. Think of it like a book with an index that specifies the page number for each chapter, or a bookshelf where books are ordered A-Z by author, but there are also markers placed on the shelf for the start of each letter of the alphabet.
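The idea of jumping to the correct block can be sketched as below. This is only an illustration under assumptions: the records are held in a sorted list, a block is a fixed number of records (BLOCK_SIZE is a made-up value), and the index stores the first key of each block, like the shelf markers in the analogy.

```python
from bisect import bisect_right

BLOCK_SIZE = 3  # hypothetical number of records per block

def build_index(records, block_size):
    """Record the first key of each block and where that block starts."""
    return [(records[i][0], i) for i in range(0, len(records), block_size)]

def lookup(records, index, wanted_key, block_size):
    """Use the index to pick one block, then search only that block."""
    first_keys = [key for key, _ in index]
    slot = bisect_right(first_keys, wanted_key) - 1
    if slot < 0:
        return None  # wanted key is smaller than every key in the file
    start = index[slot][1]
    for key, data in records[start:start + block_size]:
        if key == wanted_key:
            return data
    return None

# Hypothetical records, already sorted by key.
records = [(2, "A"), (5, "B"), (8, "C"), (11, "D"), (14, "E"), (17, "F"), (20, "G")]
index = build_index(records, BLOCK_SIZE)
print(index)
print(lookup(records, index, 14, BLOCK_SIZE))
```

Only the index and one block are examined, however many blocks the file contains.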

Random Access Data Files

In a random-access data file each record is identified by a sequential record number and each record takes up the same amount of storage space.

The computer therefore can quickly access any record by simply calculating its position in the data file based on the record number and then directly accessing that record. However, random access files can be wasteful of disk space if the data stored in each field varies in length because space is allocated for the longest possible field in every record (the field storing the data item MOUSE would occupy the same number of bytes as one storing the data item COMPUTER).
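The position calculation can be illustrated with a short sketch. It assumes a hypothetical record made of a single 10-byte text field, with record numbers starting at 0, and it uses an in-memory file in place of a file on disk.

```python
import io
import struct

RECORD_FORMAT = "<10s"  # hypothetical layout: one text field of 10 bytes
RECORD_SIZE = struct.calcsize(RECORD_FORMAT)

def write_record(f, record_number, text):
    """Jump to the calculated position and write one fixed-length record."""
    f.seek(record_number * RECORD_SIZE)
    # Shorter values are padded with zero bytes up to the full field width.
    f.write(struct.pack(RECORD_FORMAT, text.encode("ascii")))

def read_record(f, record_number):
    """Jump straight to a record without reading the ones before it."""
    f.seek(record_number * RECORD_SIZE)
    (raw,) = struct.unpack(RECORD_FORMAT, f.read(RECORD_SIZE))
    return raw.rstrip(b"\x00").decode("ascii")

data_file = io.BytesIO()  # stands in for a file opened in binary mode
for number, item in enumerate(["MOUSE", "COMPUTER", "KEYBOARD"]):
    write_record(data_file, number, item)

print(read_record(data_file, 1))
print(len(data_file.getvalue()))  # every record occupies RECORD_SIZE bytes
```

MOUSE needs only 5 bytes but still occupies a full 10-byte record, which is the wasted space described above.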

Each time a new record is created it is simply added (appended) to the end of the file and the record number is increased by one. However, deleting a record would upset the numbering system, so this cannot be done. Instead, the record is kept but its contents are deleted. Before long, quite a number of records will be 'blank' and a growing proportion of the data file becomes wasted space.
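The sketch below illustrates appending and 'blanking' under some assumptions: records are 10 bytes long, record numbers start at 0, and a blank record is represented by a record filled with spaces. All names are hypothetical.

```python
import io

RECORD_SIZE = 10             # hypothetical fixed record length in bytes
BLANK = b" " * RECORD_SIZE   # assumed representation of a deleted record

def append_record(f, text):
    """Add a record at the end of the file and return its record number."""
    f.seek(0, io.SEEK_END)
    record_number = f.tell() // RECORD_SIZE
    f.write(text.encode("ascii").ljust(RECORD_SIZE)[:RECORD_SIZE])
    return record_number

def delete_record(f, record_number):
    """Keep the record's slot but overwrite its contents with blanks."""
    f.seek(record_number * RECORD_SIZE)
    f.write(BLANK)

def wasted_fraction(f):
    """Return the proportion of records in the file that are blank."""
    data = f.getvalue()
    total = len(data) // RECORD_SIZE
    blanks = sum(
        data[i * RECORD_SIZE:(i + 1) * RECORD_SIZE] == BLANK for i in range(total)
    )
    return blanks / total if total else 0.0

data_file = io.BytesIO()  # stands in for a file on disk
numbers = [append_record(data_file, item)
           for item in ["MOUSE", "COMPUTER", "KEYBOARD", "PRINTER"]]
delete_record(data_file, 1)
print(numbers, wasted_fraction(data_file))
```

The file stays the same size after the deletion, and the later records keep their original record numbers.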

Indexed Data Files

In an indexed data file the records do not have a fixed length. They match the length of the data, in the same way as in a serial data file, so there is no wasted space. However, the start position of each record is stored in an index which is part of the file. This allows the computer to quickly access any record in the same way as with a random-access data file.

If a record is deleted it is left in the data file but the index entry is deleted. If a record is altered, then the updated record is simply added to the end of the data file and the index updated.
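A minimal sketch of this behaviour is given below. As a simplification it keeps the index in a dictionary in memory rather than inside the file itself, and it stores each record's length alongside its start position; the class and method names are hypothetical.

```python
import io

class IndexedFile:
    """Variable-length records plus an index of where each one starts."""

    def __init__(self):
        self.data = io.BytesIO()  # stands in for the data file on disk
        self.index = {}           # key -> (start position, length in bytes)

    def put(self, key, text):
        """Add or update a record by appending it and pointing the index at it."""
        raw = text.encode("utf-8")
        start = self.data.seek(0, io.SEEK_END)
        self.data.write(raw)
        self.index[key] = (start, len(raw))

    def get(self, key):
        """Look up the start position in the index and read the record directly."""
        start, length = self.index[key]
        self.data.seek(start)
        return self.data.read(length).decode("utf-8")

    def delete(self, key):
        """Remove only the index entry; the record's bytes stay in the file."""
        del self.index[key]

store = IndexedFile()
store.put(1, "MOUSE")
store.put(2, "COMPUTER")
store.put(1, "TRACKBALL")  # updated record is appended, the old bytes remain
store.delete(2)
print(store.get(1), store.index, len(store.data.getvalue()))
```

After these operations only one record is indexed, but the data file still holds the bytes of all three records that were written.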

The data file will eventually contain a lot of wasted data that is not indexed because it has been deleted or edited. However, it is relatively easy for a computer to re-write an indexed data file, deleting all the un-indexed data and re-indexing the rest.
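The re-write can be sketched as copying only the indexed records into a fresh file and building a new index as it goes. The example assumes the index maps each key to a (start position, length) pair; the function name compact and the sample bytes are illustrative.

```python
def compact(data, index):
    """Copy only the indexed records into new data and re-index them."""
    new_data = bytearray()
    new_index = {}
    for key, (start, length) in index.items():
        new_index[key] = (len(new_data), length)  # record's new start position
        new_data += data[start:start + length]
    return bytes(new_data), new_index

# Hypothetical file contents: MOUSE and COMPUTER are no longer indexed.
data = b"MOUSECOMPUTERTRACKBALL"
index = {1: (13, 9)}

new_data, new_index = compact(data, index)
print(new_data, new_index)
```

The un-indexed bytes are simply never copied, so the new file contains no wasted data.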

Summary