Data

Please read the licence information carefully when downloading data.

Neighborhood Density Measures for 57,153 English Words.
[Version 0.1, released on May 20th, 2010]


This list contains two measures of Neighborhood Density produced by HiDEx, an implementation of the HAL model. The list contains 57,153 words, with both measures given for each word. For more background information about HiDEx and how it works, please see Shaoul & Westbury (2010) and Shaoul & Westbury (2006).

Processing: We used a snapshot of all the English documents in Wikipedia from April 2010. We processed this corpus using HiDEx with the following settings:
  • Context size = 10000 words
  • Window Length Behind = 5 words
  • Window Length Ahead = 5 words
  • Weighting Scheme = Inverse Ramp
  • Normalization Method = PPMI
  • Similarity Metric = Cosine
  • Use Zscore Thresholds = 1
  • Percent to sample for Zscore Thresholds = 0.02
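For readers unfamiliar with the PPMI and Cosine settings above, the following is a minimal sketch of the textbook definitions of positive pointwise mutual information and cosine similarity, applied to a small co-occurrence count matrix. It is not HiDEx code: the function names `ppmi` and `cosine` are hypothetical, the base-2 logarithm is an assumption, and HiDEx's own implementation may differ in detail (see Shaoul & Westbury, 2010).

```python
import math


def ppmi(counts):
    """Convert raw co-occurrence counts (list of rows) to PPMI weights.

    Textbook definition: PPMI = max(0, log2(p(w, c) / (p(w) * p(c)))).
    This is an illustrative assumption, not HiDEx's actual code.
    """
    total = sum(sum(row) for row in counts)
    row_sums = [sum(row) for row in counts]
    col_sums = [sum(col) for col in zip(*counts)]
    weighted = []
    for i, row in enumerate(counts):
        out = []
        for j, c in enumerate(row):
            if c == 0:
                # Zero counts have undefined PMI; PPMI treats them as 0.
                out.append(0.0)
            else:
                pmi = math.log2(c * total / (row_sums[i] * col_sums[j]))
                out.append(max(0.0, pmi))
        weighted.append(out)
    return weighted


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention chosen here for all-zero vectors
    return dot / (norm_u * norm_v)


# Toy example: two words, two context columns (made-up counts).
toy_counts = [[2, 0], [0, 2]]
toy_vectors = ppmi(toy_counts)
toy_similarity = cosine(toy_vectors[0], toy_vectors[1])
```

In this toy matrix the two words never share a context, so their weighted vectors are orthogonal and the similarity is zero.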

Wikipedia corpus size:   971,819,808 words

Format:   The data are in a tab-separated text file, suitable for use with any software that supports this format, including Microsoft Excel. Please remove the text header before loading the file.
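As an alternative to removing the text header by hand, the file can be read with a short script that skips any line that does not look like a data row. The sketch below is only an illustration under stated assumptions: it assumes each data row has a word followed by two numeric measures (three tab-separated fields), and the names `parse_rows`, `load_file`, and `n_measures` are hypothetical. Check the actual column layout and text encoding of the downloaded file before relying on it.

```python
import csv


def parse_rows(lines, n_measures=2):
    """Return (word, [measures]) tuples from tab-separated lines.

    Lines that do not have exactly 1 + n_measures fields, or whose
    measure fields are not numeric, are treated as header text and
    skipped. The three-column layout is an assumption.
    """
    rows = []
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    for fields in reader:
        if len(fields) != n_measures + 1:
            continue  # free-text header or blank line
        try:
            values = [float(x) for x in fields[1:]]
        except ValueError:
            continue  # e.g. a row of column names
        rows.append((fields[0], values))
    return rows


def load_file(path, n_measures=2):
    """Read a file from disk; UTF-8 is assumed here."""
    with open(path, newline="", encoding="utf-8") as handle:
        return parse_rows(handle, n_measures)


# Small in-memory example with a made-up header and made-up values.
sample = [
    "Some descriptive header text\n",
    "\n",
    "WORD\tMEASURE_A\tMEASURE_B\n",
    "cat\t0.5\t12\n",
    "dog\t0.25\t7\n",
]
parsed = parse_rows(sample)
```

Skipping rows by shape rather than by a fixed line count avoids having to know how long the header is, at the cost of silently dropping any malformed data rows.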

Citation: Shaoul, C. & Westbury, C. (2010). Neighborhood Density Measures for 57,153 English Words. Edmonton, AB: University of Alberta (downloaded from http://www.psych.ualberta.ca/~westburylab/downloads/westburylab.arcs.ncounts.html)


Acknowledgments: This work would not have been possible without the hardware and software provided by the TaPoR project and the support of Dr. Harald Baayen. This research is also supported by NSERC.

If you have any questions about this data, please contact Cyrus Shaoul.

PLEASE NOTE:
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 2.5 License.

To download the file, please fill out this form so that we can keep track of who has downloaded this file. We will only use this information to notify you of any future updates to this resource. Please enter a valid e-mail address.

Full Name:
Email Address:
Organization:
What do you intend to use the data for?
Comments/Questions:

©2005,2006,2007  WestburyLab   chrisw at ualberta dot ca