PRV - Parallel Routing Vectors


 

Word Vectors

my original oneAPI contest entry, PRT,
basically implemented simple CAM (Content Addressable Memory) functionality,
which returns a result index (a reference pointer into the actual result table)
based upon a direct match of an input key that is searched for in parallel;
 
the input search keys are sentences of words,
 
this extension attempts to generalize the concept:
instead of a single exact-match index, it provides a result vector indicating the statistically most probable match.
 
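a rough sketch of the difference, in Python (the keys, tables, and scores below are made up for illustration, and are not taken from the actual PRT code):
 
    # exact-match CAM style lookup (PRT): a key either matches, or it does not
    result_table = ["result A", "result B", "result C"]
    cam = {"the quick brown fox": 0, "hello world": 1}

    def prt_lookup(sentence):
        # returns an index into result_table, or None when there is no exact match
        return cam.get(sentence)

    # PRV style: instead of a single index, produce a vector of probabilities,
    # one per result_table entry, and take the statistically best one
    def prv_best(probabilities):
        return max(range(len(probabilities)), key=lambda i: probabilities[i])

    print(prt_lookup("hello world"))      # 1
    print(prv_best([0.15, 0.80, 0.05]))   # 1 (the most probable result)
 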
the thing is that computers process numbers (not words),
so word inputs need to be converted into numerical representations
 
Google, being somewhat interested in "search engines",
developed the concept of Word2vec,
which mathematically vectorizes input natural language words
so that they can be processed by neural networks
 
semantic similarity between the words represented by those vectors
is expressed mathematically as a scalar value: the cosine similarity between the vectors;
 
https://en.wikipedia.org/wiki/Inner_product_space
 
https://en.wikipedia.org/wiki/Cosine_similarity
 
which, when the vectorization works well,
indicates the level of semantic similarity between the words those vectors represent.
 
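for instance, a minimal sketch of that cosine similarity computation, in Python (the vector values below are toy numbers, not real Word2vec output):
 
    import math

    def cosine_similarity(a, b):
        # inner product of the two vectors, divided by the product of their lengths
        dot = sum(x * y for x, y in zip(a, b))
        norm_a = math.sqrt(sum(x * x for x in a))
        norm_b = math.sqrt(sum(x * x for x in b))
        return dot / (norm_a * norm_b)

    king  = [0.80, 0.65, 0.10]   # toy vectors, standing in for real word embeddings
    queen = [0.75, 0.70, 0.20]
    apple = [0.10, 0.20, 0.90]

    print(cosine_similarity(king, queen))   # close to 1: semantically similar
    print(cosine_similarity(king, apple))   # much smaller: semantically distant
 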
the problem is that the original vectorization used "one hot" vectors,
wherein each natural language word was represented simply as a "1" at its vocabulary index position (and "0" everywhere else)
 
id est: every word is totally distinct from every other word (unique)
 
mathematically, the cosine between the vectors of any two distinct words then results in a scalar "0"
id est: there is no similarity at all between totally unique (orthogonal) vectors
 
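reusing the cosine_similarity function from the sketch above, "one hot" vectors make the problem obvious:
 
    # one-hot vectors over a 4 word vocabulary: every word is orthogonal to every other
    cat = [1, 0, 0, 0]
    dog = [0, 1, 0, 0]

    print(cosine_similarity(cat, dog))   # 0.0 -- no similarity, even for related words
 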
well, after much research effort into this concept,
Google developed BERT (Bidirectional Encoder Representations from Transformers),
which "is a Transformer-based machine learning technique for natural language processing (NLP) pre-training"
https://en.wikipedia.org/wiki/BERT_(language_model)
 
yeah, Google seems to have an interest in figuring out
what might be the best way to do computerized searches
 
what makes BERT much better is that it is deeply bidirectional:
the deep learning neural networks embed conceptual vectorization
(actually multidimensional arrays of vectors called "Tensors"),
which mathematically encodes connotational similarity between words based upon
the context of words occurring in a sentence both before, and after, the word being classified
 
after mathematical analysis of huge quantities of sentence structures,
the Tensor based mathematical models that result
tend to produce output vectors
with quite impressive connotational embeddings
 
the output vectors encode (encapsulate) the contextual meanings of the input words,
wherein word similarity is expressed as vectors whose numbers lie closer together
 
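a minimal sketch of obtaining such contextual output vectors, assuming the Hugging Face transformers and torch packages and the publicly released bert-base-uncased checkpoint (this is just one convenient way of running a BERT model, not necessarily how a onePRV co-processor would do it):
 
    import torch
    from transformers import BertTokenizer, BertModel

    tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
    model = BertModel.from_pretrained("bert-base-uncased")

    inputs = tokenizer("the river bank was flooded", return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)

    # one contextual vector (768 numbers) per input token;
    # here "bank" gets a river-flavored vector, not a finance-flavored one
    token_vectors = outputs.last_hidden_state[0]
    print(token_vectors.shape)   # torch.Size([number_of_tokens, 768])
 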
for example, to perform Sentiment Analysis:
once "Named Entities" are recognized,
since words of a positive connotation are classified with mathematical similarity,
and words of a negative connotation are also classified as being mathematically similar,
one can then mathematically determine whether any Named Entity
exists in a mathematically positive or negative sentiment environment
for any input sentence
 
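a minimal sketch of that idea (the seed words, the toy embedding values, and the context word are all hypothetical; a real pipeline would take these vectors from a BERT type model, and score them with the cosine_similarity function from the sketch further above):
 
    # toy embedding vectors standing in for real BERT output (hypothetical values)
    embedding = {
        "excellent": [0.90, 0.10],
        "wonderful": [0.85, 0.20],
        "terrible":  [0.10, 0.90],
        "awful":     [0.15, 0.85],
        "service":   [0.80, 0.25],   # a context word occurring near the Named Entity
    }

    positive_seeds = ["excellent", "wonderful"]
    negative_seeds = ["terrible", "awful"]

    def average_similarity(word, seeds):
        return sum(cosine_similarity(embedding[word], embedding[s]) for s in seeds) / len(seeds)

    # is the Named Entity's context closer to the positive cluster or the negative one?
    pos = average_similarity("service", positive_seeds)
    neg = average_similarity("service", negative_seeds)
    print("positive sentiment" if pos > neg else "negative sentiment")   # positive sentiment
 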
actually, the generic concept of BERT type models is that of a Transformer
 
the parallel benefit of Transformers is that sequential input data
does not need to be processed sequentially
(strange but true, mathematically)
 
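a minimal sketch of why, assuming numpy: the self-attention step at the heart of a Transformer is just matrix arithmetic over the whole sequence at once, so every word position can be processed in parallel (toy sizes, random weights):
 
    import numpy as np

    def self_attention(X, Wq, Wk, Wv):
        # X holds the vectors for ALL positions of the sequence at once;
        # no position has to wait for the previous one to be processed
        Q, K, V = X @ Wq, X @ Wk, X @ Wv
        scores = Q @ K.T / np.sqrt(K.shape[1])   # every position attends to every other
        weights = np.exp(scores) / np.exp(scores).sum(axis=1, keepdims=True)
        return weights @ V

    rng = np.random.default_rng(0)
    X = rng.standard_normal((5, 8))              # 5 word positions, 8 dimensions each
    Wq, Wk, Wv = (rng.standard_normal((8, 8)) for _ in range(3))
    print(self_attention(X, Wq, Wk, Wv).shape)   # (5, 8): all positions, in one pass
 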
the theory being:
the concept of a highly parallel onePRV co-processor
which, when given a sentence input,
produces quite an impressive connotational output vector
 
I believe that
Google has open sourced quite a lot of their Research and Development,
and I was thinking that a standards-based Intel onePRV
might be quite useful

 

 

this webpage was last updated on: February 27, 2021

 

PRV - ParallelRoutingVectors on GitHub