- RecommendationsGeneratorScheduler runs at a regular interval and starts the process of generating recommendations.
- Data extraction begins by running the query passed by users via "AEM Recommendation Engine" OSGi Config
- A bagOfWords is generated for each product using DataCleaningUtil.java
- Properties to be read are passed in the DataCleaningUtil constructor.
- The generateBagOfWords() method returns a map from each ProductId to the bagOfWords associated with that product. A bagOfWords is a space-separated String made up of tags and the values of other properties such as jcr:title, cq:tags etc.
- The showCountVectorizer() method from GenerateRecommendations is called. Note: This class only calls the required methods from the Tokenizer, Dictionary & CountVectorizer classes to convert each product and its associated words into numeric values, which can then be passed to an algorithm for getting recommendations.
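The bagOfWords step above can be sketched roughly as follows. This is only an illustration, not the actual DataCleaningUtil: the class name BagOfWordsSketch, the nested-map input (standing in for the nodes returned by the query) and the lowercasing step are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical stand-in for DataCleaningUtil. The real class reads JCR nodes;
// here each product is a plain map of property name -> property values.
class BagOfWordsSketch {

    // Properties to be read, passed in the constructor as described above.
    private final List<String> propertiesToRead;

    BagOfWordsSketch(List<String> propertiesToRead) {
        this.propertiesToRead = propertiesToRead;
    }

    // Returns productId -> space-separated bag of words.
    Map<String, String> generateBagOfWords(Map<String, Map<String, List<String>>> products) {
        Map<String, String> result = new LinkedHashMap<>();
        for (Map.Entry<String, Map<String, List<String>>> product : products.entrySet()) {
            List<String> words = new ArrayList<>();
            for (String property : propertiesToRead) {
                List<String> values = product.getValue().get(property);
                if (values == null) {
                    continue; // the product does not have this property
                }
                for (String value : values) {
                    // Assumed cleaning step: trim and lowercase each value.
                    String cleaned = value.trim().toLowerCase(Locale.ROOT);
                    if (!cleaned.isEmpty()) {
                        words.add(cleaned);
                    }
                }
            }
            result.put(product.getKey(), String.join(" ", words));
        }
        return result;
    }
}
```

With jcr:title and cq:tags as the properties to read, a product whose title is "Running Shoe" and whose tags are "sports:running" and "color:red" would map to the string "running shoe sports:running color:red".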
Generating recommendations:
- Tokenizer object is created using custom SimpleTokenizer implementation
- SimpleTokenizer provides a method to generate an array of tokens from the bagOfWords linked to each product
- Dictionary object is initialized using custom SimpleTermDictionary implementation
- SimpleTermDictionary maintains a Vocabulary
- SimpleTermDictionary provides methods:
- getTermIndex(): returns the index (the number assigned to each term in the Dictionary) of the given term.
- getTotalTerms(): returns the total number of terms/items in Vocabulary
- CountVectorizer Object is initialized with dictionary & tokenizer objects initialized in previous steps. CountVectorizer provides 2 methods:
- getCountVector(): returns the SparseVector for the current product's bagOfWords.
- Returns a RealVector object
- The SparseVector has the same size as the Dictionary's Vocabulary
- The SparseVector holds 1 at the index of each Vocabulary word that is part of the product's bagOfWords, and 0 for each word that is not
- getCountMatrix(): returns a RealMatrix
- Returns a matrix containing one vector for each product's bagOfWords
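The tokenizer, dictionary and count vector steps above could look roughly like the sketch below. It is not the project's SimpleTokenizer, SimpleTermDictionary or CountVectorizer: the class name CountVectorSketch is hypothetical, whitespace splitting is an assumed tokenization rule, and plain double arrays stand in for RealVector and RealMatrix so that the example has no external dependencies.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; plain arrays replace RealVector / RealMatrix.
class CountVectorSketch {

    // Stand-in for SimpleTokenizer: splits a bag of words on whitespace.
    static String[] tokenize(String bagOfWords) {
        String trimmed = bagOfWords.trim();
        return trimmed.isEmpty() ? new String[0] : trimmed.split("\\s+");
    }

    // Stand-in for SimpleTermDictionary: assigns each distinct term an index
    // in first-seen order. The map size plays the role of getTotalTerms().
    static Map<String, Integer> buildDictionary(List<String> bags) {
        Map<String, Integer> termIndex = new LinkedHashMap<>();
        for (String bag : bags) {
            for (String token : tokenize(bag)) {
                termIndex.putIfAbsent(token, termIndex.size());
            }
        }
        return termIndex;
    }

    // One vector per product: 1 where a vocabulary term occurs in the bag, else 0.
    static double[] getCountVector(String bag, Map<String, Integer> termIndex) {
        double[] vector = new double[termIndex.size()];
        for (String token : tokenize(bag)) {
            Integer index = termIndex.get(token);
            if (index != null) {
                vector[index] = 1.0;
            }
        }
        return vector;
    }

    // One row per product, in the same order as the input list.
    static double[][] getCountMatrix(List<String> bags, Map<String, Integer> termIndex) {
        double[][] matrix = new double[bags.size()][];
        for (int i = 0; i < bags.size(); i++) {
            matrix[i] = getCountVector(bags.get(i), termIndex);
        }
        return matrix;
    }
}
```

For the two bags "red shoe" and "blue shoe" the dictionary is red=0, shoe=1, blue=2, so the rows of the matrix are [1, 1, 0] and [0, 1, 1].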
- In the generateSimilarityMatrix() method of the SimilarityMatrixGenerator class, the RealMatrix generated by getCountMatrix() is looped through:
- For each row (i.e. each product's vector), the cosine similarity with every other product's vector is calculated
- Each cosine value is stored in a dotProductMatrix
- Finally, a SimilarityMatrix object is created, which contains the dotProductMatrix and a NodeIdIndexMap (mapping nodeId to index, which is used for reading recommendations from the dotProductMatrix)
- A separate Map is maintained which maps each ProductId (String) to the index of that product's row in the RealMatrix returned by getCountMatrix().
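The similarity calculation above can be sketched as below. This is a minimal illustration rather than the actual SimilarityMatrixGenerator: the class name CosineSimilaritySketch is hypothetical, plain double arrays stand in for RealMatrix, and returning 0 for an all-zero vector is an assumption made here to avoid dividing by zero.

```java
// Hypothetical sketch of the pairwise cosine computation.
class CosineSimilaritySketch {

    // Cosine of the angle between two vectors of equal length.
    static double cosine(double[] a, double[] b) {
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0.0 || normB == 0.0) {
            return 0.0; // assumed convention for products with an empty bag of words
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Entry [i][j] holds the cosine similarity between product i and product j.
    static double[][] generateSimilarityMatrix(double[][] countMatrix) {
        int products = countMatrix.length;
        double[][] similarity = new double[products][products];
        for (int i = 0; i < products; i++) {
            for (int j = 0; j < products; j++) {
                similarity[i][j] = cosine(countMatrix[i], countMatrix[j]);
            }
        }
        return similarity;
    }
}
```

Two products with vectors [1, 1, 0] and [0, 1, 1] share one of their two words each, which gives a cosine of about 0.5. The result is symmetric, so an implementation could compute only the upper triangle and mirror it.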
- The SimilarityMatrix object is serialized & stored in JCR under /var, from where it can later be read when getting the similarities. Because recommendation generation is a heavy operation, RecommendationsGeneratorScheduler can generate & update the SimilarityMatrix at a fixed interval.
- RecommendationsReaderService can be used for reading the recommendations from the SimilarityMatrix serialized & stored in JCR.
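Storing and reading the matrix could work along the lines of the sketch below. It is not the project's SimilarityMatrix or RecommendationsReaderService: the class name SimilarityMatrixSketch and its methods are hypothetical, standard Java serialization to a byte array is an assumption (the notes above do not say which mechanism is used), and the JCR write under /var is left out.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: holds the similarity values plus the nodeId -> index map.
class SimilarityMatrixSketch implements Serializable {
    private static final long serialVersionUID = 1L;

    private final double[][] similarities;
    private final HashMap<String, Integer> nodeIdIndexMap;

    SimilarityMatrixSketch(double[][] similarities, Map<String, Integer> nodeIdIndexMap) {
        this.similarities = similarities;
        this.nodeIdIndexMap = new HashMap<>(nodeIdIndexMap);
    }

    // Serializes the object to bytes; these could then be stored as a binary property.
    byte[] toBytes() {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(this);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Restores the object from previously serialized bytes.
    static SimilarityMatrixSketch fromBytes(byte[] bytes) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return (SimilarityMatrixSketch) in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    // Returns up to `limit` other node IDs, most similar first.
    List<String> topRecommendations(String nodeId, int limit) {
        Integer row = nodeIdIndexMap.get(nodeId);
        List<String> result = new ArrayList<>();
        if (row == null) {
            return result; // unknown node: no recommendations
        }
        List<Map.Entry<String, Integer>> candidates = new ArrayList<>(nodeIdIndexMap.entrySet());
        candidates.removeIf(entry -> entry.getValue().equals(row)); // skip the node itself
        candidates.sort((x, y) -> Double.compare(
                similarities[row][y.getValue()], similarities[row][x.getValue()]));
        for (Map.Entry<String, Integer> entry : candidates) {
            if (result.size() == limit) {
                break;
            }
            result.add(entry.getKey());
        }
        return result;
    }
}
```

Keeping the nodeId-to-index map inside the serialized object means a reader needs only the stored bytes to answer a lookup, without rebuilding the count matrix.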
Note: There can be several RecommendationEngines configured in any AEM instance for generating different recommendations for different pages, products or items.