Data

Please read the licence information carefully when downloading data.

Orthographic Neighborhoods for over 111,000 English words



Processing: Orthographic frequencies were counted in the multi-billion word Westbury Lab USENET corpus. The Westbury Lab's freely available LINGUA software was then used to tabulate the orthographic neighbors of every word in a large dictionary of English words. The standard output of LINGUA includes the following fields for each word in the lexicon:

    WORD:              The word in question.
    ONSIZE:            The number of orthographic neighbors of the word in question.
    ONFREQ:            The average of the orthographic frequencies of all the orthographic neighbors of the word in question.
    [NEIGHBOURFREQS]:  A list of the orthographic frequencies of all the orthographic neighbors of the word in question. [Variable in length]
    [NEIGHBOURS]:      A list of the orthographic neighbors of the word in question. [Variable in length]
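
To make the relationship between these fields concrete, here is a minimal Python sketch that derives them for a toy lexicon. It is not the LINGUA implementation. It assumes the common definition of an orthographic neighbor (a word of the same length that differs by exactly one letter), which this page does not state, and the frequencies and function names are invented for illustration.

```python
from collections import defaultdict


def orthographic_neighbors(freqs):
    """Map each word to a sorted list of its one-letter-substitution neighbors.

    Assumption: a neighbor has the same length and differs in exactly one
    letter position. This page does not state the definition LINGUA uses.
    """
    # Group words that become identical once a single position is masked.
    buckets = defaultdict(set)
    for word in freqs:
        for i in range(len(word)):
            buckets[(i, word[:i], word[i + 1:])].add(word)

    neighbors = {word: set() for word in freqs}
    for group in buckets.values():
        for word in group:
            neighbors[word] |= group - {word}
    return {word: sorted(found) for word, found in neighbors.items()}


def neighborhood_record(word, freqs, neighbors):
    """Build one record with the five fields described above."""
    found = neighbors[word]
    found_freqs = [freqs[n] for n in found]
    # Returning 0.0 for a word with no neighbors is a choice made here,
    # not documented LINGUA behaviour.
    onfreq = sum(found_freqs) / len(found_freqs) if found_freqs else 0.0
    return {
        "WORD": word,
        "ONSIZE": len(found),
        "ONFREQ": onfreq,
        "NEIGHBOURFREQS": found_freqs,
        "NEIGHBOURS": found,
    }


# Hypothetical frequencies, invented for this example only.
freqs = {"cat": 120, "cot": 30, "cut": 60, "bat": 90, "dog": 75}
neighbors = orthographic_neighbors(freqs)

for word in sorted(freqs):
    print(neighborhood_record(word, freqs, neighbors))
```

In this toy lexicon "cat" has the neighbors "bat", "cot" and "cut", so its ONSIZE is 3 and its ONFREQ is (90 + 30 + 60) / 3 = 60.0, while "dog" has no neighbors at all. Masking one position at a time avoids comparing every pair of words, which matters for a lexicon of more than 111,000 entries.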

List size:   111,624 words

Citation: Westbury, C. & Shaoul, C. (2007). Orthographic Neighborhoods for over 111,000 English words. Edmonton, AB: University of Alberta (downloaded from http://www.psych.ualberta.ca/~westburylab/downloads/ON.download.html).


Acknowledgments: This work would not have been possible without the hardware and software provided by the TaPoR project. This research is also supported by NSERC.

If you have any questions about this data, please contact Cyrus Shaoul.

PLEASE NOTE:
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 2.5 License.


©2005, 2006, 2007 WestburyLab   chrisw at ualberta dot ca