Integrating Apache Mahout with AEM


Created by Ankit Gubrani / @ankitgubrani90

& Rima Mittal / @rimamittal

Speakers Bio

Agenda


  • Introduction to Apache Mahout
  • Machine Learning
  • Recommendations
  • AEM with Apache Mahout
  • Demo
  • Extension Points



Introduction to Apache Mahout


  • Project of the Apache Software Foundation.
  • Producing free implementations of scalable machine learning algorithms, written in Java.


  • Started as a Lucene sub-project.
  • Became Apache TLP in April 2010.
  • Latest version – 0.12.2 – Released on 13th June 2016.

Why Apache Mahout?

  • Increasing volume of data!
  • Traditional data mining algorithms struggle to process very large datasets.
  • Apache Mahout to the rescue!

Traditional Machine Learning

Machine Learning with Mahout


  • Adobe, Facebook, LinkedIn, Twitter and Yahoo use Mahout internally.
  • Twitter uses Mahout for interest modelling.
  • Yahoo! uses Mahout for pattern mining.



Machine Learning


  • Programming computers to optimize a performance criterion using example data or past experience.
    • A branch of artificial intelligence.
    • Computers evolve their behavior based on empirical data.


  • Supervised Learning
    • Use labelled training data to create a classifier that can predict output for unseen inputs.
  • Unsupervised Learning
    • Use unlabelled training data to discover structure in the data, such as groups of similar items, without known outputs to learn from.

Machine Learning with Apache Mahout

  • Data Science use cases Mahout supports:
    • Collaborative Filtering
    • Clustering
    • Classification

Collaborative Filtering

  • User behavior mining to make product recommendations.



Clustering

  • Organizing items into naturally occurring groups, such that items belonging to the same group are similar to each other.



Classification

  • Learning from existing categorizations and assigning unclassified items to the best category.





Apache Mahout Recommendation Engine

  • Helps users find items they might like based on historical behavior and preferences.
  • Mahout provides a rich set of components from which a customized recommender system can be constructed using a selection of algorithms.


  • Top-level packages define interfaces for these key abstractions:
    • DataModel
    • UserSimilarity
    • ItemSimilarity
    • UserNeighborhood
    • Recommender



AEM with Apache Mahout


  • AEM 6.2
  • Mahout as a Maven Dependency


  • DataModel
    • Implementations representing a repository of information about users and their associated preferences.
      • AbstractDataModel, JDBCDataModel, FileDataModel, GenericBooleanPrefDataModel, GenericDataModel.
      • AEM - JCRDataModel.
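
To make the idea of a DataModel concrete, here is a minimal, dependency-free sketch of the information such a model holds: a mapping from users to their item preferences. It is not Mahout's implementation and not the JCRDataModel from this talk. The class name PreferenceStoreSketch and its factory method are hypothetical; the input layout "userID,itemID,preference" follows the CSV convention that FileDataModel reads, and the two accessor names only echo methods on Mahout's DataModel interface.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for the data a DataModel exposes (illustration only). */
class PreferenceStoreSketch {

    // userId -> (itemId -> preference value)
    private final Map<Long, Map<Long, Float>> prefsByUser = new HashMap<>();

    /** Parses lines of the form "userID,itemID,preference"; blank lines are skipped. */
    static PreferenceStoreSketch fromCsvLines(List<String> lines) {
        PreferenceStoreSketch store = new PreferenceStoreSketch();
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue;
            }
            String[] parts = trimmed.split(",");
            if (parts.length < 3) {
                throw new IllegalArgumentException("Expected userID,itemID,preference: " + line);
            }
            long userId = Long.parseLong(parts[0].trim());
            long itemId = Long.parseLong(parts[1].trim());
            float value = Float.parseFloat(parts[2].trim());
            store.prefsByUser.computeIfAbsent(userId, k -> new HashMap<>()).put(itemId, value);
        }
        return store;
    }

    /** Number of distinct users that have at least one preference. */
    int getNumUsers() {
        return prefsByUser.size();
    }

    /** Returns the stored preference, or null when the user has not rated the item. */
    Float getPreferenceValue(long userId, long itemId) {
        Map<Long, Float> prefs = prefsByUser.get(userId);
        return prefs == null ? null : prefs.get(itemId);
    }
}
```

A JCR-backed model would presumably fill the same kind of structure from rating nodes in the repository instead of CSV lines.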

Code (1/2)

  • Using The AEM JCRDataModel

  • public JSONArray getUserBasedRecommendations(ResourceResolver resourceResolver, String userId, int numberOfRecommendations) {
    //Creating JCRDataModel to fetch information from JCR
    DataModel model = JCRDataModel.createDataModel(resourceResolver);

Code (2/2)

  • AEM-Mahout Recommendation steps

  • UserSimilarity userSimilarity = getSimilarity(model);
    UserNeighborhood neighborhood = getNeighbourHood(N_NEIGHOBUR_HOOD, userSimilarity, model);
    GenericUserBasedRecommender recommender = new GenericUserBasedRecommender(model, neighborhood, userSimilarity);
    recommendations = recommender.recommend(userIdHash, numberOfRecommendations, null, false);

AEM Product Recommendation

  • User Based Recommendation
    • Takes user ratings into consideration
    • Based on PearsonCorrelationSimilarity
    • Uses NearestNUserNeighborhood
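
The three ingredients above can be illustrated without any library. The sketch below is a simplified, plain-Java illustration of the general technique, not Mahout's code: it computes a Pearson correlation over co-rated items, picks the N most similar users, and estimates a rating as a similarity-weighted average of the neighbours' ratings. The class UserBasedSketch and all method names are hypothetical, and ignoring neighbours with non-positive similarity is an assumption made here to keep the weighted average well defined.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical plain-Java illustration of user-based recommendation. */
class UserBasedSketch {

    /** Pearson correlation over items both users rated; NaN when undefined. */
    static double pearson(Map<Long, Double> a, Map<Long, Double> b) {
        double sumA = 0, sumB = 0, sumAA = 0, sumBB = 0, sumAB = 0;
        int n = 0;
        for (Map.Entry<Long, Double> e : a.entrySet()) {
            Double other = b.get(e.getKey());
            if (other == null) {
                continue; // only co-rated items count
            }
            double x = e.getValue();
            double y = other;
            sumA += x; sumB += y;
            sumAA += x * x; sumBB += y * y; sumAB += x * y;
            n++;
        }
        if (n < 2) {
            return Double.NaN;
        }
        double cov = sumAB - sumA * sumB / n;
        double varA = sumAA - sumA * sumA / n;
        double varB = sumBB - sumB * sumB / n;
        if (varA <= 0 || varB <= 0) {
            return Double.NaN; // a user with constant ratings has no defined correlation
        }
        return cov / Math.sqrt(varA * varB);
    }

    /** Ids of up to n other users, ordered by decreasing similarity to userId. */
    static List<Long> nearestN(long userId, int n, Map<Long, Map<Long, Double>> ratings) {
        Map<Long, Double> target = ratings.get(userId);
        if (target == null) {
            return new ArrayList<>();
        }
        Map<Long, Double> sims = new HashMap<>();
        for (Map.Entry<Long, Map<Long, Double>> e : ratings.entrySet()) {
            if (e.getKey() == userId) {
                continue;
            }
            double s = pearson(target, e.getValue());
            if (!Double.isNaN(s)) {
                sims.put(e.getKey(), s);
            }
        }
        List<Long> ids = new ArrayList<>(sims.keySet());
        ids.sort(Comparator.comparingDouble((Long id) -> sims.get(id)).reversed());
        return new ArrayList<>(ids.subList(0, Math.min(n, ids.size())));
    }

    /** Similarity-weighted average of neighbours' ratings for itemId; NaN if none apply. */
    static double estimate(long userId, long itemId, int n, Map<Long, Map<Long, Double>> ratings) {
        double weighted = 0, totalSim = 0;
        for (long neighbour : nearestN(userId, n, ratings)) {
            Double rating = ratings.get(neighbour).get(itemId);
            double sim = pearson(ratings.get(userId), ratings.get(neighbour));
            if (rating == null || sim <= 0) {
                continue; // assumption: skip neighbours that disagree or have not rated the item
            }
            weighted += sim * rating;
            totalSim += sim;
        }
        return totalSim == 0 ? Double.NaN : weighted / totalSim;
    }
}
```

In this sketch the neighbourhood size n plays the role of the N in NearestNUserNeighborhood: a small n keeps only the closest matches, while a larger n trades precision for coverage.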

Configuring JCRDataModel

  • Configurations
    • User Generated Content Path
      • /content/usergenerated/asi/jcr
    • Products Path
      • /etc/commerce/products/geometrixx-outdoors
    • Rating Resource Type
      • Defaults to social/tally/components/response











Thank you.