Wordnik provides a hyphenation API backed by data licensed from traditional dictionaries. However, more than half of the unique words of English aren't in any dictionary (see http://www.sciencemag.org/content/331/6014/176). To hyphenate these unknown words, we'd like to implement the Liang algorithm (https://www.tug.org/docs/liang/) and combine it with a model, based on known dictionary hyphenations, that gives a confidence score for each hyphenation it proposes.
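
As a rough illustration of the pattern-matching step in Liang's algorithm, here is a minimal Python sketch. It is not Wordnik's implementation: the function names (`parse_pattern`, `hyphenate`), the `left_min`/`right_min` defaults, and the tiny `TOY_PATTERNS` list are assumptions made for this example, and a real system would load a full pattern file instead. Each pattern interleaves letters with digits; for every substring of the dot-padded word that matches a pattern, the digits are merged by taking the maximum at each inter-letter gap, and a gap with an odd final value is a permitted break.

```python
import re

# Toy pattern list for illustration only (assumption: a real system would
# load a full pattern file rather than hard-code a handful of entries).
TOY_PATTERNS = ["hy3ph", "he2n", "hena4", "hen5at",
                "1na", "n2at", "1tio", "2io", "o2n"]


def parse_pattern(pattern):
    """Split a pattern such as 'hy3ph' into its letters ('hyph') and its
    inter-letter levels ([0, 0, 3, 0, 0], one more entry than letters)."""
    letters = re.sub(r"\d", "", pattern)
    levels = [0] * (len(letters) + 1)
    pos = 0  # number of letters seen so far = index of the current gap
    for ch in pattern:
        if ch.isdigit():
            levels[pos] = int(ch)
        else:
            pos += 1
    return letters, levels


def hyphenate(word, patterns, left_min=2, right_min=3):
    """Return the word split into pieces at the breaks the patterns allow.

    left_min / right_min are the minimum number of letters kept before the
    first break and after the last break (hypothetical defaults).
    """
    table = dict(parse_pattern(p) for p in patterns)
    padded = "." + word.lower() + "."        # dots mark the word boundaries
    scores = [0] * (len(padded) + 1)         # scores[i] = gap before padded[i]

    # Try every substring; where it matches a pattern, keep the max level.
    for start in range(len(padded)):
        for end in range(start + 1, len(padded) + 1):
            levels = table.get(padded[start:end])
            if levels is not None:
                for offset, level in enumerate(levels):
                    idx = start + offset
                    scores[idx] = max(scores[idx], level)

    # The gap before word[k] is scores[k + 1]; odd values permit a break.
    pieces, last = [], 0
    for k in range(left_min, len(word) - right_min + 1):
        if scores[k + 1] % 2 == 1:
            pieces.append(word[last:k])
            last = k
    pieces.append(word[last:])
    return pieces


print("-".join(hyphenate("hyphenation", TOY_PATTERNS)))  # hy-phen-ation
```

The confidence model is a separate piece and is not shown here. One possible starting point, offered only as an assumption, would be to run a function like `hyphenate` over words whose dictionary hyphenation is known and measure how often each kind of proposed break agrees with the dictionary.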

View our work here.

Term: Fall 2020
Topic: Humanities, Platforms/Infrastructure