Nominex - How it Works

How does Nominex work?


The algorithms behind Nominex are based on the phonetic rules described by Edward Carney in A Survey of English Spelling (1994).

A specially written "workshop" program performs the various steps involved. It has been in development since 2008 and has been continually improved.

An important part of the process is translating each surname spelling into a phonetic version using the symbols of the IPA (International Phonetic Alphabet). Carney’s list of 225 spelling-to-pronunciation rules was used as the basis of these IPA versions, supplemented by his documented exceptions and by many further exceptions found in the corpus of British surnames. There are now more than 2,400 pronunciation rules.
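The general shape of a rule-based spelling-to-IPA step can be sketched as follows. This is a minimal illustration only: the rule table, the exception table and the function name `to_ipa` are invented for the example and are not Carney's rules or the Nominex rule set. It assumes a whole-word exception lookup followed by longest-match-first replacement of letter groups.

```python
# Toy illustration only: these rules and exceptions are invented for this
# example and are NOT Carney's rules or the rules used by Nominex.
TOY_RULES = {
    "sh": "ʃ",
    "th": "θ",
    "ph": "f",
    "ck": "k",
    "ee": "iː",
    "a": "æ",
    "e": "ɛ",
    "i": "ɪ",
    "o": "ɒ",
    "u": "ʌ",
    "c": "k",
}

# Whole-word exceptions take priority over the general rules (hypothetical entry).
TOY_EXCEPTIONS = {"cholmondeley": "tʃʌmli"}


def to_ipa(surname, rules=TOY_RULES, exceptions=TOY_EXCEPTIONS):
    """Convert a spelling to a rough IPA string using the toy tables above."""
    word = surname.lower()
    if word in exceptions:
        return exceptions[word]

    max_len = max(len(grapheme) for grapheme in rules)
    out = []
    i = 0
    while i < len(word):
        # Try the longest letter group first, then shorter ones.
        for size in range(min(max_len, len(word) - i), 0, -1):
            chunk = word[i:i + size]
            if chunk in rules:
                out.append(rules[chunk])
                i += size
                break
        else:
            # No rule matched: pass the letter through unchanged.
            out.append(word[i])
            i += 1
    return "".join(out)


print(to_ipa("Smith"))         # smɪθ
print(to_ipa("Cholmondeley"))  # tʃʌmli (from the exception table)
```

A real rule set would also need context-sensitive rules (position in the word, neighbouring letters), which this sketch leaves out.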

The similarity scores created by Nominex rely on comparing each surname with every other surname in the database. This comparison could be performed on the fly in a live system, at the cost of a performance hit. Alternatively, the scores can be pre-calculated, at the expense of some disk space. Names whose scores fall in the top 20% or 25% may be regarded as variants that are close enough to be useful, while lower-ranking pairs would generally be discarded.
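The pre-calculation option can be sketched roughly as below. Several things here are assumptions made for illustration: the similarity measure is Python's `difflib.SequenceMatcher` standing in for the Nominex scoring method (which is described on later pages), "top 25%" is read as the top quarter of all ranked pairs, and the names `score` and `precompute_variants` are hypothetical.

```python
from difflib import SequenceMatcher
from itertools import combinations


def score(a, b):
    # Stand-in similarity measure (an assumption); not the Nominex score.
    return SequenceMatcher(None, a, b).ratio()


def precompute_variants(names, keep_fraction=0.25):
    """Score every pair once and keep only the top-ranking fraction of pairs."""
    scored = sorted(
        ((score(a, b), a, b) for a, b in combinations(names, 2)),
        key=lambda item: item[0],
        reverse=True,
    )
    keep = max(1, int(len(scored) * keep_fraction)) if scored else 0

    # Lookup table: name -> list of (variant, score), stored in both directions.
    table = {}
    for s, a, b in scored[:keep]:
        table.setdefault(a, []).append((b, s))
        table.setdefault(b, []).append((a, s))
    return table


variants = precompute_variants(["smith", "smyth", "jones", "brown"])
print(variants.get("smith"))
```

The number of pairs grows with the square of the number of names, which is why the trade-off between live comparison and stored results matters for a large surname database.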

The steps involved are described in more detail on the next few pages.



© 2009-19 Archer Software