Anagram Database

Many people believe there are hidden meanings in words and text and that is where anagrams come into play. An anagram is the rearrangement of some text into another arrangement that is significant to the reader.  Nowadays, anagrams are often found in crossword puzzles.

But how to represent anagrams in terms of a database? At first I though of a simple single table design. However, upon reflection it is clear that a word or phrase may actually have more than one anagram to it. So immediately we require a one to many relationship.

A possible Anagram Database Design

A possible Anagram Database Design

So, a one to many design reflects the reality of the database relationships better than a single table design. Because an objective of good database design is to model the reality of our process, we will use this design.

However, what about indexes?Of course, automatically we must use a primary key for both tables. Why? We must use a primary key to ensure each row is unique and to aid the engine in searching out values. But why an integer as the value of the primary key? This is a preference of mine based on two factors, one it is generally better to use integers for keys, they are easier for the engine to work with. Also, integer columns can be automatically incremented. So,  both tables have auto incremented integers for columns however the anagram table will reference the word table via a foreign key constraint. This requires that any anagram must have a word associated with it.

This database design must also consider indexes. Indexes are helpful to find data, for example if we have feature to allow users to search out particular anagrams from particular words. However, with indexes, it tends to slow down data modifications like updates, inserts or deletes. But this is a good trade off. We fully anticipate most users will search the database and only a few will actually be adding records.

There is one other thing to mention. The size of the text fields are pretty large, So when we create the index we want to limit the number of characters it uses, otherwise the index will become very large. One must try to keep the indexes small so they don’t take up much space in order to improve database performance.