PRV - Parallel Routing Vectors


 

Word Vectors

my original oneAPI contest entry, PRT,
basically implemented simple CAM (Content Addressable Memory) functionality,
which returns a result index (a reference pointer into the actual result table)
based upon a direct match of an input key that is searched for in parallel;
 
the input search keys are sentences of words,
 
this extension attempts to generalize the concept:
instead of a single exact-match index, it provides a result vector indicating the statistically most probable match.
 
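a rough sketch of the difference, in Python (the keys, tables, and scores below are made up for illustration, and are not taken from the actual PRT code):
 
    # exact-match CAM style lookup (PRT): a key either matches, or it does not
    result_table = ["result A", "result B", "result C"]
    cam = {"the quick brown fox": 0, "hello world": 1}

    def prt_lookup(sentence):
        # returns an index into result_table, or None when there is no exact match
        return cam.get(sentence)

    # PRV style: instead of a single index, produce a vector of probabilities,
    # one per result_table entry, and take the statistically best one
    def prv_best(probabilities):
        return max(range(len(probabilities)), key=lambda i: probabilities[i])

    print(prt_lookup("hello world"))      # 1
    print(prv_best([0.15, 0.80, 0.05]))   # 1 (the most probable result)
 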
the thing is that computers process numbers (not words),
so word inputs need to be converted into numerical representations
 
Google, being somewhat interested in "search engines",
developed the concept of Word2vec,
which mathematically vectorizes input natural language words
so that they can be processed by neural networks
 
semantic similarity between the words represented by those vectors
is expressed mathematically as a scalar value: the cosine similarity between the vectors;
 
https://en.wikipedia.org/wiki/Inner_product_space
 
https://en.wikipedia.org/wiki/Cosine_similarity
 
which, when the vectorization works well,
indicates the level of semantic similarity between the words those vectors represent.
 
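for instance, a minimal sketch of that cosine similarity computation, in Python (the vector values below are toy numbers, not real Word2vec output):
 
    import math

    def cosine_similarity(a, b):
        # inner product of the two vectors, divided by the product of their lengths
        dot = sum(x * y for x, y in zip(a, b))
        norm_a = math.sqrt(sum(x * x for x in a))
        norm_b = math.sqrt(sum(x * x for x in b))
        return dot / (norm_a * norm_b)

    king  = [0.80, 0.65, 0.10]   # toy vectors, standing in for real word embeddings
    queen = [0.75, 0.70, 0.20]
    apple = [0.10, 0.20, 0.90]

    print(cosine_similarity(king, queen))   # close to 1: semantically similar
    print(cosine_similarity(king, apple))   # much smaller: semantically distant
 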
the problem is that the original vectorization used "one hot" vectors,
wherein each natural language word was represented simply as a "1" at its vocabulary index position (and "0" everywhere else)
 
id est: every word is totally distinct from every other word (unique)
 
mathematically, the cosine between the vectors of any two distinct words then results in a scalar "0"
id est: there is no similarity at all between totally unique (orthogonal) vectors
 
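reusing the cosine_similarity function from the sketch above, "one hot" vectors make the problem obvious:
 
    # one-hot vectors over a 4 word vocabulary: every word is orthogonal to every other
    cat = [1, 0, 0, 0]
    dog = [0, 1, 0, 0]

    print(cosine_similarity(cat, dog))   # 0.0 -- no similarity, even for related words
 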
well, after much research effort into this concept,
Google developed BERT (Bidirectional Encoder Representations from Transformers),
which "is a Transformer-based machine learning technique for natural language processing (NLP) pre-training"
https://en.wikipedia.org/wiki/BERT_(language_model)
 
yeah, Google seems to have an interest in figuring out
what might be the best way to do computerized searches
 
what makes BERT much better is that it is deeply bidirectional:
the deep learning neural networks embed conceptual vectorization
(actually multidimensional arrays of vectors called "Tensors"),
which mathematically encodes connotational similarity between words based upon
the context of words occurring in a sentence both before, and after, the word being classified
 
after mathematical analysis of huge quantities of sentence structures,
the Tensor based mathematical models that result
tend to produce output vectors
with quite impressive connotational embeddings
 
the output vectors encode (encapsulate) the contextual meanings of the input words,
wherein word similarity is expressed as vectors whose numbers lie closer together
 
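a minimal sketch of obtaining such contextual output vectors, assuming the Hugging Face transformers and torch packages and the publicly released bert-base-uncased checkpoint (this is just one convenient way of running a BERT model, not necessarily how a onePRV co-processor would do it):
 
    import torch
    from transformers import BertTokenizer, BertModel

    tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
    model = BertModel.from_pretrained("bert-base-uncased")

    inputs = tokenizer("the river bank was flooded", return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)

    # one contextual vector (768 numbers) per input token;
    # here "bank" gets a river-flavored vector, not a finance-flavored one
    token_vectors = outputs.last_hidden_state[0]
    print(token_vectors.shape)   # torch.Size([number_of_tokens, 768])
 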
for example, to perform Sentiment Analysis:
once "Named Entities" are recognized,
since words of a positive connotation are classified with mathematical similarity,
and words of a negative connotation are also classified as being mathematically similar,
one can then mathematically determine whether any Named Entity
exists in a mathematically positive or negative sentiment environment
for any input sentence
 
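a minimal sketch of that idea (the seed words, the toy embedding values, and the context word are all hypothetical; a real pipeline would take these vectors from a BERT type model, and score them with the cosine_similarity function from the sketch further above):
 
    # toy embedding vectors standing in for real BERT output (hypothetical values)
    embedding = {
        "excellent": [0.90, 0.10],
        "wonderful": [0.85, 0.20],
        "terrible":  [0.10, 0.90],
        "awful":     [0.15, 0.85],
        "service":   [0.80, 0.25],   # a context word occurring near the Named Entity
    }

    positive_seeds = ["excellent", "wonderful"]
    negative_seeds = ["terrible", "awful"]

    def average_similarity(word, seeds):
        return sum(cosine_similarity(embedding[word], embedding[s]) for s in seeds) / len(seeds)

    # is the Named Entity's context closer to the positive cluster or the negative one?
    pos = average_similarity("service", positive_seeds)
    neg = average_similarity("service", negative_seeds)
    print("positive sentiment" if pos > neg else "negative sentiment")   # positive sentiment
 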
actually, the generic concept of BERT type models is that of a Transformer
 
the parallel benefit of Transformers is that sequential input data
does not need to be processed sequentially
(strange but true, mathematically)
 
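a minimal sketch of why, assuming numpy: the self-attention step at the heart of a Transformer is just matrix arithmetic over the whole sequence at once, so every word position can be processed in parallel (toy sizes, random weights):
 
    import numpy as np

    def self_attention(X, Wq, Wk, Wv):
        # X holds the vectors for ALL positions of the sequence at once;
        # no position has to wait for the previous one to be processed
        Q, K, V = X @ Wq, X @ Wk, X @ Wv
        scores = Q @ K.T / np.sqrt(K.shape[1])   # every position attends to every other
        weights = np.exp(scores) / np.exp(scores).sum(axis=1, keepdims=True)
        return weights @ V

    rng = np.random.default_rng(0)
    X = rng.standard_normal((5, 8))              # 5 word positions, 8 dimensions each
    Wq, Wk, Wv = (rng.standard_normal((8, 8)) for _ in range(3))
    print(self_attention(X, Wq, Wk, Wv).shape)   # (5, 8): all positions, in one pass
 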
the theory being:
the concept of a highly parallel onePRV co-processor
which, when given a sentence input,
produces quite an impressive connotational output vector
 
I believe that
Google has open sourced quite a lot of their Research and Development,
and I was thinking that a standards-based Intel onePRV
might be quite useful

 

 

this webpage was last updated on: February 27, 2021

 

PRV - ParallelRoutingVectors on GitHub