{"id":15124,"date":"2014-09-10T11:52:39","date_gmt":"2014-09-10T06:22:39","guid":{"rendered":"http:\/\/www.tothenew.com\/blog\/?p=15124"},"modified":"2016-12-19T14:54:44","modified_gmt":"2016-12-19T09:24:44","slug":"recommendation-engine-by-using-apache-mahout-cassandra","status":"publish","type":"post","link":"https:\/\/www.tothenew.com\/blog\/recommendation-engine-by-using-apache-mahout-cassandra\/","title":{"rendered":"Recommendation Engine by using Apache Mahout &amp; Cassandra"},"content":{"rendered":"<p style=\"text-align: justify\">As part of our R&amp;D, we are building recommendations for users with Apache Mahout &amp; Cassandra. There is a direct Cassandra <code>DataModel<\/code>, CassandraDataModel, based on a Cassandra keyspace, but it caused a lot of confusion because it involves so many methods and functions, so we tried to simplify things by using Cassandra indexes and Mahout's FileDataModel.<\/p>\n<p style=\"text-align: justify\">We have 1 million records in a Cassandra database, in the keyspace testkeyspace, spread over 2 tables with the following schemas:<\/p>\n<p style=\"text-align: justify\">CREATE TABLE SampleUser (<br \/>\nuser_id bigint,<br \/>\nratings float,<br \/>\nitem_id bigint,<br \/>\nPRIMARY KEY (user_id, ratings, item_id)<br \/>\n);<\/p>\n<p style=\"text-align: justify\">CREATE TABLE SampleItems (<br \/>\nitem_id bigint,<br \/>\nuser_id bigint,<br \/>\nratings float,<br \/>\nPRIMARY KEY (item_id, user_id)<br \/>\n);<\/p>\n<p style=\"text-align: justify\">We want to provide user-based and item-based recommendations with the help of similar users (for example, based on the ratings given by other users in a common locality or with some other similarity).<\/p>\n<p style=\"text-align: justify\">The recommendations are of two kinds:<\/p>\n<p style=\"text-align: justify\">1) Customers Who Bought This Item Also 
Bought<br \/>\n2) Customers Who Viewed This Item Also Viewed<\/p>\n<p style=\"margin-bottom: 0cm;font-weight: normal;line-height: 100%\">The Java code for Mahout recommendations from the Cassandra database:<\/p>\n<p>[html]<br \/>\nimport java.io.File;<br \/>\nimport java.io.FileWriter;<br \/>\nimport java.io.IOException;<br \/>\nimport java.util.List;<\/p>\n<p>import com.datastax.driver.core.Cluster;<br \/>\nimport com.datastax.driver.core.Metadata;<br \/>\nimport com.datastax.driver.core.ResultSet;<br \/>\nimport com.datastax.driver.core.Row;<br \/>\nimport com.datastax.driver.core.Session;<br \/>\nimport com.datastax.driver.core.SimpleStatement;<br \/>\nimport com.datastax.driver.core.Statement;<\/p>\n<p>import org.apache.mahout.cf.taste.common.TasteException;<br \/>\nimport org.apache.mahout.cf.taste.impl.model.file.FileDataModel;<br \/>\nimport org.apache.mahout.cf.taste.impl.neighborhood.NearestNUserNeighborhood;<br \/>\nimport org.apache.mahout.cf.taste.impl.recommender.GenericUserBasedRecommender;<br \/>\nimport org.apache.mahout.cf.taste.impl.similarity.PearsonCorrelationSimilarity;<br \/>\nimport org.apache.mahout.cf.taste.model.DataModel;<br \/>\nimport org.apache.mahout.cf.taste.neighborhood.UserNeighborhood;<br \/>\nimport org.apache.mahout.cf.taste.recommender.RecommendedItem;<br \/>\nimport org.apache.mahout.cf.taste.recommender.Recommender;<br \/>\nimport org.apache.mahout.cf.taste.similarity.UserSimilarity;<\/p>\n<p>public class RecommendationsFromCassandra {<br \/>\nprivate Cluster cluster;<br \/>\nprivate Session session;<\/p>\n<p>\/*<br \/>\n* Connect to the Cassandra database<br \/>\n*\/<\/p>\n<p>public void connect(String node) {<br \/>\ncluster = Cluster.builder().addContactPoint(node).build();<br \/>\nMetadata metadata = cluster.getMetadata();<br \/>\nsession = cluster.connect();<br \/>\n}<\/p>\n<p>\/*<br \/>\n* Run the first query to get the IDs of the users who have a particular item<br \/>\n*\/<\/p>\n<p>public void query1() throws IOException {<br \/>\nString userids = null;<br \/>\nResultSet results = session<br \/>\n.execute(&quot;SELECT user_id FROM testkeyspace.sampleitems WHERE item_id = 914&quot;);<br \/>\nfor (Row row : results) {<br \/>\nif (userids == null)<br \/>\nuserids = Long.toString(row.getLong(&quot;user_id&quot;));<br \/>\nelse<br \/>\nuserids = userids + &quot;,&quot; + row.getLong(&quot;user_id&quot;);<br \/>\n}<br \/>\n\/\/ Nothing to recommend from if no user has this item<br \/>\nif (userids == null)<br \/>\nreturn;<\/p>\n<p>\/*<br \/>\n* Pass all the user IDs to the second query to get those users with<br \/>\n* their item IDs and ratings<br \/>\n*\/<\/p>\n<p>String stmt = &quot;SELECT user_id,item_id,ratings FROM testkeyspace.sampleuser WHERE user_id in (&quot;<br \/>\n+ userids + &quot;) ;&quot;;<\/p>\n<p>Statement s1 = new SimpleStatement(stmt);<br 
\/>\ns1.setFetchSize(Integer.MAX_VALUE);<\/p>\n<p>ResultSet results1 = session.execute(s1);<br \/>\n\/\/ Write the ratings as tab-separated lines to a temporary file in the home directory<br \/>\nFile newTextFile = new File(System.getProperty(&quot;user.home&quot;), &quot;thetextfile.txt&quot;);<br \/>\nFileWriter fw = new FileWriter(newTextFile);<\/p>\n<p>for (Row row2 : results1) {<br \/>\nfw.write(row2.getLong(&quot;user_id&quot;) + &quot;\\t&quot; + row2.getLong(&quot;item_id&quot;)<br \/>\n+ &quot;\\t&quot; + row2.getFloat(&quot;ratings&quot;) + &quot;\\n&quot;);<br \/>\n}<br \/>\nfw.close();<\/p>\n<p>\/*<br \/>\n* Recommendation part<br \/>\n*\/<\/p>\n<p>UserSimilarity similarity;<br \/>\ntry {<br \/>\nDataModel datamodel = new FileDataModel(newTextFile);<br \/>\nsimilarity = new PearsonCorrelationSimilarity(datamodel);<br \/>\nUserNeighborhood neighbourhood = new NearestNUserNeighborhood(100,<br \/>\nsimilarity, datamodel);<\/p>\n<p>Recommender recommender = new GenericUserBasedRecommender(<br \/>\ndatamodel, neighbourhood, similarity);<br \/>\nlong start = System.currentTimeMillis();<br \/>\n\/\/ Recommend 10 items for the user with ID 10<br \/>\nList&lt;RecommendedItem&gt; recommendations = recommender.recommend(10,<br \/>\n10);<br \/>\nfor (RecommendedItem recommendation : recommendations) {<br \/>\nSystem.out.println(recommendation);<\/p>\n<p>}<br \/>\nlong stop = System.currentTimeMillis();<br \/>\nSystem.out.println(&quot;Took: &quot; + (stop - start) + &quot; millis&quot;);<\/p>\n<p>} catch (TasteException e) {<br \/>\ne.printStackTrace();<br \/>\n}<\/p>\n<p>}<\/p>\n<p>public void close() {<br \/>\ncluster.close();<br \/>\n}<\/p>\n<p>public static void main(String[] args) throws IOException {<br \/>\nRecommendationsFromCassandra client = new RecommendationsFromCassandra();<br \/>\nclient.connect(&quot;127.0.0.1&quot;);<br \/>\nclient.query1();<br \/>\n\/\/ CassandraRecommender r = new CassandraRecommender();<br \/>\nclient.close();<br \/>\n\/\/ Remove the temporary ratings file<br \/>\nFile file = new File(System.getProperty(&quot;user.home&quot;), &quot;thetextfile.txt&quot;);<br \/>\nfile.delete();<br \/>\n}<br \/>\n}<br 
\/>\n[\/html]<\/p>\n<p><b><span style=\"color: #000000\"><span style=\"font-family: Liberation Serif,serif\"><span style=\"font-size: medium\"><br \/>\nThe time taken to generate recommendations for a particular user is <\/span><\/span><\/span><span style=\"color: #000000\"><span style=\"font-family: Liberation Serif,serif\"><span style=\"font-size: medium\">2.40<\/span><\/span><\/span><span style=\"color: #000000\"><span style=\"font-family: Liberation Serif,serif\"><span style=\"font-size: medium\"> sec for 1 million records.<\/span><\/span><\/span><\/b><\/p>\n","protected":false},"excerpt":{"rendered":"<p>As part of our R&amp;D, we are building recommendations for users with Apache Mahout &amp; Cassandra. There is a direct Cassandra DataModel, CassandraDataModel, based on a Cassandra keyspace, but it caused a lot of confusion because it involves so many methods and functions, so we tried to simplify things by using Cassandra indexes and 
[&hellip;]<\/p>\n","protected":false},"author":114,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"iawp_total_views":2},"categories":[1395],"tags":[],"aioseo_notices":[],"_links":{"self":[{"href":"https:\/\/www.tothenew.com\/blog\/wp-json\/wp\/v2\/posts\/15124"}],"collection":[{"href":"https:\/\/www.tothenew.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.tothenew.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.tothenew.com\/blog\/wp-json\/wp\/v2\/users\/114"}],"replies":[{"embeddable":true,"href":"https:\/\/www.tothenew.com\/blog\/wp-json\/wp\/v2\/comments?post=15124"}],"version-history":[{"count":0,"href":"https:\/\/www.tothenew.com\/blog\/wp-json\/wp\/v2\/posts\/15124\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.tothenew.com\/blog\/wp-json\/wp\/v2\/media?parent=15124"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.tothenew.com\/blog\/wp-json\/wp\/v2\/categories?post=15124"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.tothenew.com\/blog\/wp-json\/wp\/v2\/tags?post=15124"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}