Building a Hybrid Recommendation Engine for E-Commerce
Recommendation systems are the backbone of modern e-commerce platforms, helping users discover products they might be interested in while increasing conversion rates and average order values. At Bank Alfalah, I led the development of a hybrid recommendation engine for Alfa Mall, our e-commerce platform. This post shares our approach and key learnings.
The Recommendation Challenge
When we started, Alfa Mall had a basic recommendation system that simply showed popular products or items from the same category. This approach had several limitations:
- It didn't account for individual user preferences
- It couldn't handle the cold start problem for new users or products
- It failed to capture complex relationships between products
We needed a more sophisticated approach that could provide personalized recommendations while addressing these challenges.
Our Hybrid Approach
We developed a hybrid recommendation system that combines three different techniques:
1. Collaborative Filtering
Collaborative filtering identifies patterns in user behavior to make recommendations. We implemented a matrix factorization approach using Alternating Least Squares (ALS) with implicit feedback.
from implicit.als import AlternatingLeastSquares
import scipy.sparse as sparse
# Create user-item matrix from interaction data
user_items = sparse.csr_matrix((data, (row, col)))
# Initialize ALS model
model = AlternatingLeastSquares(factors=100, regularization=0.01, iterations=50)
# Train model
model.fit(user_items)
# Get recommendations for user
recommendations = model.recommend(user_id, user_items[user_id])
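The snippet above takes `data`, `row`, and `col` as given. As a minimal sketch of where they might come from, the following aggregates raw interaction events into those three lists. The event types, their weights, and the function name `build_interaction_triples` are assumptions for illustration, not the values we used in production.

```python
from collections import defaultdict

# Hypothetical event weights: stronger signals count for more.
EVENT_WEIGHTS = {"view": 1.0, "add_to_cart": 3.0, "purchase": 5.0}


def build_interaction_triples(events):
    """Turn (user, product, event_type) tuples into COO-style triples.

    Returns (data, row, col, user_index, product_index), where row[i] and
    col[i] are the integer user and product positions for data[i].
    """
    user_index, product_index = {}, {}
    totals = defaultdict(float)
    for user, product, event_type in events:
        # Assign each new user/product the next free integer position
        u = user_index.setdefault(user, len(user_index))
        p = product_index.setdefault(product, len(product_index))
        # Unknown event types contribute nothing
        totals[(u, p)] += EVENT_WEIGHTS.get(event_type, 0.0)

    data, row, col = [], [], []
    for (u, p), weight in sorted(totals.items()):
        if weight > 0:
            data.append(weight)
            row.append(u)
            col.append(p)
    return data, row, col, user_index, product_index


events = [
    ("u1", "p1", "view"),
    ("u1", "p1", "purchase"),
    ("u1", "p2", "view"),
    ("u2", "p2", "add_to_cart"),
]
data, row, col, user_index, product_index = build_interaction_triples(events)
```

The three lists can then be passed to `sparse.csr_matrix((data, (row, col)))`. Keeping `user_index` and `product_index` around matters because the model works with integer positions, so its output has to be mapped back to real product IDs.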
2. Content-Based Filtering
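Content-based filtering recommends items similar to those a user has liked in the past. We created product embeddings using product attributes, descriptions, and images. The simplified example below covers only the text descriptions, using TF-IDF vectors and cosine similarity.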
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity
# Create TF-IDF vectors from product descriptions
vectorizer = TfidfVectorizer(max_features=5000, stop_words="english")
tfidf_matrix = vectorizer.fit_transform(product_descriptions)
# Compute similarity between products
cosine_sim = cosine_similarity(tfidf_matrix, tfidf_matrix)
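# Get recommendations based on product similarity.
# Assumes `products` is a DataFrame whose row order matches tfidf_matrix and
# `product_indices` maps each product ID to its row position.
def get_content_recommendations(product_id, top_n=10):
    idx = product_indices[product_id]
    sim_scores = list(enumerate(cosine_sim[idx]))
    sim_scores = sorted(sim_scores, key=lambda x: x[1], reverse=True)
    # Drop the query product itself rather than assuming it sorts first
    sim_scores = [s for s in sim_scores if s[0] != idx][:top_n]
    # Use a new name here: reassigning product_indices would make it local
    # to the function and break the lookup above
    top_indices = [i[0] for i in sim_scores]
    return products.iloc[top_indices]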
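The function above finds products similar to a single product. To recommend for a user, those item-to-item similarities have to be aggregated over everything the user has liked. Here is one possible sketch, assuming similarities are available as a plain dictionary; `score_candidates_for_user` and the `similar_items` structure are hypothetical names, and summing similarities is only one of several reasonable aggregation choices.

```python
from collections import defaultdict


def score_candidates_for_user(liked_products, similar_items, top_n=10):
    """Aggregate item-to-item similarities into per-user scores.

    liked_products: product IDs the user has interacted with positively.
    similar_items: dict mapping product ID -> list of (other_id, similarity).
    Returns {product_id: score} for the top_n products the user has not liked yet.
    """
    liked = set(liked_products)
    scores = defaultdict(float)
    for product_id in liked:
        for other_id, similarity in similar_items.get(product_id, []):
            if other_id not in liked:  # skip products the user already has
                scores[other_id] += similarity
    # Highest score first; ties broken by product ID for stable output
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return dict(ranked[:top_n])


similar_items = {
    "phone": [("case", 0.8), ("charger", 0.6), ("tablet", 0.5)],
    "tablet": [("case", 0.3), ("stylus", 0.7), ("phone", 0.9)],
}
user_scores = score_candidates_for_user(["phone", "tablet"], similar_items)
```

In this toy data the phone case ranks first because it is similar to both liked products, which is the behavior we want from a per-user content score.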
3. Real-Time User Behavior Analysis
We also incorporated real-time user behavior to capture immediate interests and intent.
# Simplified example of real-time recommendation logic
def get_realtime_recommendations(user_session):
    # Get recently viewed products
    recent_views = user_session.get_recent_views(limit=5)
    # Get products from cart
    cart_items = user_session.get_cart_items()
    # Get search queries
    search_terms = user_session.get_search_terms(limit=3)
    # Combine signals with different weights
    recommendations = []
    recommendations.extend(get_similar_products(recent_views, weight=0.5))
    recommendations.extend(get_complementary_products(cart_items, weight=0.3))
    recommendations.extend(get_products_by_search(search_terms, weight=0.2))
    # Sort by weighted score and return top recommendations
    return sort_and_deduplicate(recommendations)
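The last step above relies on a `sort_and_deduplicate` helper that the example does not show. A minimal sketch of what it might look like follows, assuming each signal function returns `(product_id, weighted_score)` pairs. That pair format, the `top_n` parameter, and the choice to sum duplicate scores rather than keep the maximum are all assumptions.

```python
from collections import defaultdict


def sort_and_deduplicate(recommendations, top_n=10):
    """Merge duplicate products by summing their weighted scores, then rank.

    recommendations: iterable of (product_id, weighted_score) pairs.
    Returns a list of (product_id, total_score), highest score first.
    """
    totals = defaultdict(float)
    for product_id, weighted_score in recommendations:
        totals[product_id] += weighted_score
    # Highest score first; ties broken by product ID for stable output
    ranked = sorted(totals.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_n]


candidates = [
    ("case", 0.5 * 0.8),     # similar to a recently viewed product
    ("charger", 0.3 * 0.9),  # complements a cart item
    ("case", 0.2 * 0.5),     # also matched a search term
    ("stylus", 0.2 * 0.4),
]
ranked = sort_and_deduplicate(candidates)
```

Summing rewards products that several signals agree on. Taking the maximum instead would stop a product from climbing just because it appears in many weak signals.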
Ensemble Method
The final step was to combine these three approaches into a unified recommendation system. We used a weighted ensemble method that adjusts weights based on user history and context.
def get_hybrid_recommendations(user_id, context):
    # Get {product_id: score} dicts from each model. These are assumed to be
    # per-user wrappers around the snippets above, which take a product ID or
    # a session object and return other shapes.
    collab_recs = get_collaborative_recommendations(user_id)
    content_recs = get_content_recommendations(user_id)
    realtime_recs = get_realtime_recommendations(user_id, context)
    # Determine weights based on user history
    user_history = get_user_history(user_id)
    if user_history.is_new_user:
        weights = {"collab": 0.2, "content": 0.3, "realtime": 0.5}
    else:
        weights = {"collab": 0.4, "content": 0.3, "realtime": 0.3}
    # Combine recommendations with weights
    final_recs = {}
    for source, recs in (("collab", collab_recs), ("content", content_recs), ("realtime", realtime_recs)):
        for product_id, score in recs.items():
            final_recs[product_id] = final_recs.get(product_id, 0.0) + score * weights[source]
    # Sort and return top recommendations
    sorted_recs = sorted(final_recs.items(), key=lambda x: x[1], reverse=True)
    return [product_id for product_id, score in sorted_recs[:10]]
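The weighted sum above adds scores from three sources that do not naturally share a scale: matrix factorization scores are unbounded, while cosine similarities fall between 0 and 1. One common way to make them comparable is to rescale each source before weighting. The sketch below uses min-max normalization; this is an illustrative choice rather than a description of our production scaling, and the `min_max_normalize` and `combine` names are hypothetical.

```python
def min_max_normalize(scores):
    """Rescale a {product_id: score} dict to the [0, 1] range."""
    if not scores:
        return {}
    low, high = min(scores.values()), max(scores.values())
    if high == low:
        # All candidates tie; treat them as equally strong
        return {product_id: 1.0 for product_id in scores}
    return {pid: (s - low) / (high - low) for pid, s in scores.items()}


def combine(sources, weights):
    """Weighted sum of normalized scores across recommendation sources."""
    combined = {}
    for name, scores in sources.items():
        for pid, s in min_max_normalize(scores).items():
            combined[pid] = combined.get(pid, 0.0) + weights[name] * s
    return combined


sources = {
    "collab": {"a": 12.0, "b": 4.0},   # e.g. raw factorization scores
    "content": {"b": 0.9, "c": 0.1},   # e.g. cosine similarities
    "realtime": {"c": 0.5},
}
weights = {"collab": 0.4, "content": 0.3, "realtime": 0.3}
combined = combine(sources, weights)
```

Without the rescaling step, product "a" would dominate purely because its source produces larger numbers, regardless of the weights.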
Results
After deploying our hybrid recommendation engine, we saw significant improvements:
- 22% increase in conversion rate from recommended products
- 15% increase in average order value
- 35% increase in click-through rate on recommendations
The system was particularly effective at cross-selling complementary products and helping users discover new items they wouldn't have found otherwise.
Challenges and Solutions
Cold Start Problem
For new users or products with limited interaction data, we relied more heavily on content-based filtering and popularity-based recommendations. As users interacted more with the platform, we gradually shifted toward collaborative filtering.
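The gradual shift can be sketched as a linear interpolation between two weight sets. The example below reuses the new-user and returning-user weights from the ensemble snippet; the `ramp` of 20 interactions and the `blended_weights` name are hypothetical, chosen only to make the idea concrete.

```python
NEW_USER_WEIGHTS = {"collab": 0.2, "content": 0.3, "realtime": 0.5}
ESTABLISHED_WEIGHTS = {"collab": 0.4, "content": 0.3, "realtime": 0.3}


def blended_weights(interaction_count, ramp=20):
    """Linearly interpolate from new-user to established-user weights.

    ramp is a hypothetical number of interactions after which a user is
    treated as fully established.
    """
    # t goes from 0.0 (no history) to 1.0 (ramp or more interactions)
    t = min(max(interaction_count, 0) / ramp, 1.0)
    return {
        name: (1 - t) * NEW_USER_WEIGHTS[name] + t * ESTABLISHED_WEIGHTS[name]
        for name in NEW_USER_WEIGHTS
    }


halfway = blended_weights(10)
```

A smooth ramp avoids the abrupt change in recommendations that a hard new-user/returning-user switch would cause the moment a user crosses the threshold.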
Scalability
As our user base grew, computing recommendations in real-time became challenging. We split the work between precomputed and on-demand components:
- Collaborative filtering models were retrained nightly and results cached
- Content-based similarities were precomputed and stored
- Real-time components were optimized for low-latency computation
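The caching side of this split can be illustrated with a small in-memory stand-in. This is a sketch only: a production system would more likely use an external key-value store, and the `RecommendationCache` class, its 24-hour default age, and the popular-products fallback are assumptions for illustration.

```python
import time


class RecommendationCache:
    """In-memory stand-in for a store of precomputed recommendations."""

    def __init__(self, max_age_seconds=24 * 60 * 60, clock=time.time):
        self.max_age_seconds = max_age_seconds
        self.clock = clock  # injectable so staleness can be tested
        self._entries = {}  # user_id -> (stored_at, [product_id, ...])

    def store(self, user_id, product_ids):
        """Save the output of a batch job for one user."""
        self._entries[user_id] = (self.clock(), list(product_ids))

    def lookup(self, user_id, fallback):
        """Return cached recommendations, or fallback if missing or stale."""
        entry = self._entries.get(user_id)
        if entry is None:
            return list(fallback)
        stored_at, product_ids = entry
        if self.clock() - stored_at > self.max_age_seconds:
            return list(fallback)
        return product_ids


popular_products = ["bestseller-1", "bestseller-2"]
cache = RecommendationCache()
cache.store("u1", ["case", "charger"])
cached = cache.lookup("u1", fallback=popular_products)
missing = cache.lookup("u2", fallback=popular_products)
```

The request path then only pays for a lookup plus the lightweight real-time component, and users without a fresh entry still get a sensible default.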
Evaluation Metrics
We used both offline metrics (precision, recall, NDCG) and online A/B testing to evaluate our recommendation system. Online metrics like click-through rate, conversion rate, and revenue proved most valuable for business impact assessment.
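For reference, the three offline metrics can be computed as follows. This is a minimal sketch using binary relevance (an item is either relevant or not), with hypothetical function names; graded relevance would need a different gain term in NDCG.

```python
import math


def precision_at_k(recommended, relevant, k):
    """Fraction of the top-k recommendations that are relevant."""
    top_k = recommended[:k]
    return sum(1 for item in top_k if item in relevant) / k


def recall_at_k(recommended, relevant, k):
    """Fraction of the relevant items that appear in the top k."""
    if not relevant:
        return 0.0
    top_k = recommended[:k]
    return sum(1 for item in top_k if item in relevant) / len(relevant)


def ndcg_at_k(recommended, relevant, k):
    """Binary-relevance NDCG: DCG of the ranking divided by the ideal DCG."""
    # Position 0 gets discount log2(2) = 1, position 1 gets log2(3), etc.
    dcg = sum(
        1.0 / math.log2(rank + 2)
        for rank, item in enumerate(recommended[:k])
        if item in relevant
    )
    # The ideal ranking places every relevant item at the top
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(rank + 2) for rank in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0


recommended = ["a", "b", "c", "d"]
relevant = {"a", "c", "e"}
precision = precision_at_k(recommended, relevant, k=4)  # 2 hits out of 4
recall = recall_at_k(recommended, relevant, k=4)        # 2 of 3 relevant items
ndcg = ndcg_at_k(recommended, relevant, k=4)
```

Precision and recall ignore ordering within the top k, while NDCG rewards placing relevant items higher, which is why it is useful to track alongside the other two.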
Conclusion
Building a hybrid recommendation engine for Alfa Mall significantly improved user engagement and business metrics. The combination of collaborative filtering, content-based filtering, and real-time behavior analysis provided robust, personalized recommendations that addressed the limitations of simpler approaches.
In future posts, I'll dive deeper into specific aspects like handling temporal dynamics in recommendation systems and incorporating contextual information for even more personalized recommendations.