Friday 27 January 2012

17. Using a Spreadsheet as a Simple Database.


In addition to using spreadsheets for their original purpose of inter-related calculations, they serve as simple databases by their very nature of structured cell contents.

Many people use their Spreadsheet application as a convenient method of creating lists, in effect a mini database. This is a perfectly acceptable use of a spreadsheet especially when the more advanced features of a genuine database application (e.g. MS Access) are not required.

In the form we would use, i.e. one sheet with a table of values which does not relate to other sheets, it is called a FLAT FILE DATABASE. Strictly, a flat file database should consist of nothing but data. More broadly, the term refers to any database which exists in a single file in the form of rows and columns, with no relationships or links between records and fields except the table structure.

As we work through these examples we will use both Microsoft Excel 2010, and Open Office Calc. The latter works in a very similar manner to Excel 2003 so this tutorial should cover most spreadsheet types currently in use by most people. A few points to begin:-

All Spreadsheet applications work in the same way and look similar.

The sheet is made up of cells, rows and columns and into each cell can be entered data in the form of text or numbers.

Just like in a Word Processor, the text may be formatted in type, size, or colour etc.

Numbers may also be formatted in order to display them with or without decimal places, as dates, as percentages etc.

The main reason for the existence of Spreadsheets is for the manipulation and calculation of numbers.
This is achieved with the use of formulas in the cells.

We will leave formulas, calculations, and interactive number manipulation for another time and concentrate on simple database use.

For a successful database you should observe the following advice:-

Each cell should contain the smallest possible data element. For instance if you are storing a person’s name, use one cell for the first name and another cell for the second name. For the same reason split the details of an address into as many smaller parts as possible so that later you could search or sort by city, state/region , by post code or even by country if you are maintaining an international list.

Lists should not contain blank rows or columns within the data. This can cause problems when later sorting your data.

Ensure that cells bordering your list are blank.

The top row of your data will contain the headers of each column. These are called field names.

Below we have a very simple spreadsheet showing sales by representative, in which area, and the value of each sale.

We are going to see how easy it is to sort the data using the various field names.

Please click on any of the graphics below to view them full size.
Click on the white X just to the right of the enlarged graphics to return to the Blog.

The two spreadsheets side by side.
On the left, Microsoft Excel 2010, on the right, Open Office Calc.
In blue are the column headings/field names.
As you can see, initially the two sheets look slightly different so we will tidy them up for appearance sake.

On the left of the above screen shot is the spreadsheet (ss) from Open Office and on the right is the ss from MS Excel.

The Excel sheet looks neater and less confusing because of the alignment in the cells. All the text cells are left-adjusted and all numeric cells are right-adjusted. This is what might be called standard but you can have your cells aligned any way you want but I would advise consistency.

The Open Office ss has had  the Unit Cost figures changed to right-aligned for appearance
The main reason for storing data in a Flat File Database is so that the database may be consulted at a later date to see (e.g.) who has paid their subs in a club, where items are stored in a physical filing system such as boxes or shelves, where you can find a particular piece of music within your own storage system whether it be in the form of computer files or on CD etc. In order to be able to do that you need to be able to search the database for individual items (e.g. Let It Be by The Beatles) or items of the same or similar genre (e.g. show me all the rock songs I have by Pink Floyd).

This highlights two essential functions namely search, and sort.

At this point I must re-emphasise that a database is only as good as the information it contains and the way the fields are set up. Be as detailed as you can. For instance if you are setting up a database for your music collection it would be no good having a field containing both artist and song title as you could only use this for searching not sorting. By keeping these separate we could sort the collection by song title and then see all the different people we have singing it within the collection. This is just one example. Keep each field containing one piece of information in order to keep the database flexible.

The following are examples of searching through a database.

A simple Search in Open Office
The same Search in Excel
As you can see the methods are almost identical in each ss type.

This is an example of Search and Replace.

This is Search and Replace in Excel. It is almost identical in Open Office.
As it says in the graphic be careful when replacing words to be sure to type in exactly how the new word should look. Also be extra careful if using the "Replace All" function. Say you have stored the names of books and authors and have discovered that you have spelled an author named Jonson as Johnson. You could do a search and replace on this but if you choose Replace All you could find yourself changing other words or book titles which were actually spelled correctly.

Now moving on to the Sort function.

First of all highlight all the data in the database. You must highlight ALL of the database, not just the columns you wish to sort by, otherwise records will become jumbled (a record is one row of the database). There are many ways to highlight the cells. If you click on the first cell (in this case A1), hold down the shift key, then click on the last, (H44), those two cells and all of the intervening cells will be highlighted. Another way is to click on the first cell in the range, hold down the shift key, press on the down-cursor-movement-key until you reach the last cell in the column, press on the right-cursor-movement-key until all the columns are highlighted, then release the shift key. There are other ways but you can find them by experimentation. This applies to both Spreadsheet types.

Starting with Excel. This version of Excel has far more functionality on the menu ribbon so looks different from the process in Open Office or Excel 2003. With all the database highlighted, click on the Data menu item (if not already chosen),

then click on the Sort icon.

This in turn throws up another dialogue box and this is the essential core to sorting in Excel. Now in this exercise we are going to check on the activities of the Reps, and then sort their sales by Item. When we set up the database we gave each column a heading and when the database was highlighted the headings were included. In the dialogue box we first ensure that the "My data has headers" box is ticked. Now the "first level" sort is by Rep so we select this from the drop down box in the first line. Click on Add Level and in the second line choose Item.


Finally click on OK and the list is almost instantly resorted in the manner requested.


A glance through the list will now show the Reps sorted alphabetically, and the Items they have sold also sorted alphabetically against each Rep.

Moving on to Open Office Calc. Start off by highlighting the whole of the database including the column headers. The actions differ from Excel 2010 and are very similar to Excel 2003 and earlier. Click on Data on the menu and choose Sort.


A dialogue box will open showing three lines of "sort levels".


In the first line choose Rep and in the second line choose Item. Click on OK and the list is almost instantly resorted in the manner requested.


A glance through the list will now show the Reps sorted alphabetically, and the Items they have sold also sorted alphabetically against each Rep.

The above Sorting exercises were quite simple to carry out and were carried out on very small databases with only a few field and records. It is no more difficult on larger databases as long as you follow the advice given above, especially with reference to setting up the databases initially.

The database could easily have contained records of a collection of CDs and where we chose Rep our chosen field could have been Singer and instead of Item it could have been Album. We would have ended up with a list of Albums by Singer enabling selection of an evening's entertainment.

The choices and uses of Flat Field Databases are endless.

----------------------------------