Claim 9 of the word2vec patent
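The word2vec patent has been getting some attention, so I laid out claim 9 in a form that is easier to read. The labels (C1) through (C12) and the braces are my own additions to show the structure; the wording of the claim itself is unchanged.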
9. (C1) A method for assigning a respective point in a high-dimensional space to each word in a vocabulary of words, the method comprising: {
    (C2) obtaining a set of training data, wherein {
        (C3) the set of training data comprises sequences of words
    } ;
    (C4) training a plurality of classifiers and an embedding function on the set of training data, wherein {
        (C5) the embedding function receives an input word and
        (C6) maps the input word to a numeric representation in the high-dimensional space in accordance with a set of embedding function parameters,
    } wherein {
        (C7) each of the classifiers corresponds to a respective position surrounding the input word in a sequence of words,
    } and wherein {
        (C8) each of the classifiers processes the numeric representation of the input word to generate a respective word score for each word in a pre-determined set of words, wherein {
            (C9) each of the respective word scores represents a predicted likelihood that the corresponding word will be found in the corresponding position relative to the input word,
        } and wherein {
            (C10) training the embedding function comprises obtaining trained values of the embedding function parameters
        }
    } ;
    (C11) processing each word in the vocabulary using the embedding function in accordance with the trained values of the embedding function parameters to generate a respective numerical representation of each word in the vocabulary ;
    and (C12) associating each word in the vocabulary with the respective numeric representation of the word in the high-dimensional space.
}
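To make the structure of the claim a little more concrete, here is a minimal sketch in plain Python of what steps (C2) through (C12) could look like. It is my own illustration, not code from the patent or from the word2vec tool. The function names (`train_embeddings`, `word_scores`), the softmax classifiers, the plain gradient-descent update, and all default values are assumptions chosen for readability.

```python
import math
import random


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def train_embeddings(sequences, dim=8, offsets=(-1, 1), epochs=50, lr=0.1, seed=0):
    """Hypothetical trainer following the shape of claim 9 (illustration only)."""
    rng = random.Random(seed)
    # (C2)/(C3): the training data is a set of word sequences.
    vocab = sorted({w for seq in sequences for w in seq})
    index = {w: i for i, w in enumerate(vocab)}
    size = len(vocab)
    # (C5)/(C6): embedding function parameters, one dim-sized vector per word.
    emb = [[rng.uniform(-0.5, 0.5) for _ in range(dim)] for _ in range(size)]
    # (C7): one classifier (a size x dim weight matrix) per relative position.
    clf = {o: [[0.0] * dim for _ in range(size)] for o in offsets}

    # (C4): train the classifiers and the embedding function together.
    for _ in range(epochs):
        for seq in sequences:
            for t, word in enumerate(seq):
                i = index[word]
                for o in offsets:
                    if not 0 <= t + o < len(seq):
                        continue  # no word at this position
                    target = index[seq[t + o]]
                    x = emb[i][:]  # copy of the input word's vector
                    weights = clf[o]
                    # (C8)/(C9): a score for every word at this position.
                    probs = softmax(
                        [sum(w * v for w, v in zip(row, x)) for row in weights]
                    )
                    grad_x = [0.0] * dim
                    for j in range(size):
                        err = probs[j] - (1.0 if j == target else 0.0)
                        for k in range(dim):
                            grad_x[k] += err * weights[j][k]
                            weights[j][k] -= lr * err * x[k]
                    for k in range(dim):
                        emb[i][k] -= lr * grad_x[k]

    # (C10)-(C12): use the trained parameters to map each word to its vector.
    vectors = {w: emb[index[w]] for w in vocab}
    return vectors, clf, index


def word_scores(vectors, clf, index, word, offset):
    """(C8)/(C9): predicted likelihood of each word at `offset` from `word`."""
    x = vectors[word]
    probs = softmax([sum(w * v for w, v in zip(row, x)) for row in clf[offset]])
    return {w: probs[j] for w, j in index.items()}


if __name__ == "__main__":
    data = [["the", "cat", "sat"], ["the", "dog", "sat"]]
    vectors, clf, index = train_embeddings(data)
    print(vectors["cat"])
    print(word_scores(vectors, clf, index, "cat", 1))
```

In this sketch each relative position (here -1 and +1) gets its own classifier, which mirrors the wording of (C7). The dictionary returned as `vectors` corresponds to the association in (C12).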