Hashnode Creates Scalable Feed Architecture on AWS with Step Functions, EventBridge and Redis
Mar 15, 2024 2 min read
Hashnode created a scalable event-driven architecture (EDA) for composing feed data for thousands of users. The company built the solution on AWS using the serverless services Lambda, Step Functions, and EventBridge, together with a Redis cache. The solution relies on the Distributed Map feature of Step Functions, which enables high-concurrency processing.
The company's earlier implementation of personalized user feeds composed each feed on the fly. The team soon discovered that this approach slowed page loads and risked destabilizing the database, because composing a feed required executing expensive queries. Florian Fuchs, software engineer at Hashnode, describes the overall idea for optimizing feed calculations:
To optimize page speed, we found that pre-calculating feeds for users is the best option. This means we don't have to calculate the feed every time a user visits our feed page. Instead, we can return the feed from the cache and make page loading times faster. A crucial enabler for this is using a cache. With the fast access a cache offers, we can directly load the feed from there to be presented for our users.
Engineers implemented the feed calculation logic in AWS Step Functions with two workflows. The first workflow uses three Lambda functions to prepare user data for the feed calculation. These functions extract relevant data from the database and store it in an Amazon ElastiCache (Redis) cache. The second workflow is responsible for the actual feed calculation. If cached metadata is found for the user, the calculation relies entirely on metadata from the Redis cache; otherwise, it has to extract the user metadata from the database.
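The cache-hit and cache-miss paths described above can be sketched as follows. This is a minimal illustration, not Hashnode's implementation: `DictCache` is an in-memory stand-in for Redis, and the key format, the `fetch_from_db` callback, and the write-back on a miss are all assumptions made for the example.

```python
from typing import Callable, Dict, Optional, Tuple


class DictCache:
    """In-memory stand-in for the Redis cache (hypothetical, for illustration)."""

    def __init__(self) -> None:
        self._data: Dict[str, dict] = {}

    def get(self, key: str) -> Optional[dict]:
        return self._data.get(key)

    def set(self, key: str, value: dict) -> None:
        self._data[key] = value


def load_user_metadata(
    user_id: str,
    cache: DictCache,
    fetch_from_db: Callable[[str], dict],
) -> Tuple[dict, str]:
    """Return (metadata, origin), where origin is "cache" or "database"."""
    key = f"user-meta:{user_id}"  # assumed key format
    cached = cache.get(key)
    if cached is not None:
        # Cache hit: the feed can be calculated from cached metadata alone.
        return cached, "cache"
    # Cache miss: fall back to the (expensive) database query.
    metadata = fetch_from_db(user_id)
    # Assumption: store the result so the next run takes the cache path.
    cache.set(key, metadata)
    return metadata, "database"
```

Keeping the database access behind a callback keeps the branching logic easy to test without a live database or Redis instance.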
In the new architecture, feed recalculation is triggered either by post-creation or post-update events published to Amazon EventBridge, or periodically by EventBridge Scheduler.
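As a rough illustration of the event-based trigger, the sketch below defines an EventBridge-style event pattern and a simplified matcher. The `source` and `detail-type` values are hypothetical, since the article does not name Hashnode's events, and the matcher only covers exact matches on top-level string fields, a small subset of what EventBridge patterns support.

```python
from typing import Dict, List

# Hypothetical pattern: the event names below are assumptions, not Hashnode's.
POST_EVENT_PATTERN: Dict[str, List[str]] = {
    "source": ["example.posts"],
    "detail-type": ["PostCreated", "PostUpdated"],
}


def matches(pattern: Dict[str, List[str]], event: dict) -> bool:
    """Return True if every pattern field lists the event's value for that field.

    Simplified: handles only top-level fields with exact string matching.
    """
    return all(event.get(field) in allowed for field, allowed in pattern.items())


def should_recalculate(event: dict) -> bool:
    """Decide whether an incoming event should start a feed recalculation."""
    return matches(POST_EVENT_PATTERN, event)
```

In a real deployment the pattern would live in an EventBridge rule, and the service itself would perform the matching before invoking the workflow.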
The Hashnode team used the Map state in Step Functions, which orchestrates parallel workloads. The Map state supports two modes, depending on the processing requirements. The default inline mode offers limited concurrency and only accepts a JSON array as input. The distributed mode is suitable for large-scale parallel workloads and supports processing data sources stored in S3. In distributed mode, Step Functions can run up to 10,000 parallel child workflow executions.
Step Functions with Distributed Map State (Source: AWS Documentation)
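To make the distributed mode more concrete, the following sketch builds an Amazon States Language definition with a Distributed Map state that reads items from S3 and invokes a Lambda function per item. The state names, bucket, key, function name, and concurrency limit are placeholders chosen for the example; the article does not publish Hashnode's actual definition.

```python
import json


def build_feed_state_machine(
    bucket: str,
    key: str,
    function_name: str,
    max_concurrency: int = 1000,
) -> dict:
    """Build a state machine definition with one Distributed Map state."""
    return {
        "Comment": "Sketch: fan out feed calculation with a Distributed Map state",
        "StartAt": "CalculateFeeds",
        "States": {
            "CalculateFeeds": {
                "Type": "Map",
                # Read the list of items (for example, user IDs) from a JSON file in S3.
                "ItemReader": {
                    "Resource": "arn:aws:states:::s3:getObject",
                    "ReaderConfig": {"InputType": "JSON"},
                    "Parameters": {"Bucket": bucket, "Key": key},
                },
                "ItemProcessor": {
                    # Mode DISTRIBUTED runs each item as a child workflow execution.
                    "ProcessorConfig": {
                        "Mode": "DISTRIBUTED",
                        "ExecutionType": "STANDARD",
                    },
                    "StartAt": "CalculateFeedForUser",
                    "States": {
                        "CalculateFeedForUser": {
                            "Type": "Task",
                            "Resource": "arn:aws:states:::lambda:invoke",
                            "Parameters": {
                                "FunctionName": function_name,
                                "Payload.$": "$",
                            },
                            "End": True,
                        }
                    },
                },
                "MaxConcurrency": max_concurrency,
                "End": True,
            }
        },
    }


if __name__ == "__main__":
    definition = build_feed_state_machine(
        "example-bucket", "users.json", "example-calculate-feed"
    )
    print(json.dumps(definition, indent=2))
```

Capping `MaxConcurrency` below the service limit is a common way to protect downstream systems such as the database or cache from a sudden burst of parallel invocations.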
The solution employs two Step Functions workflows, each using the Map state in distributed mode: one for users with cached metadata and one for users for whom no metadata was found. Developers report that, for now, the full recalculation of feeds for thousands of users takes only 26 seconds. The team additionally implemented periodic cache-purge logic to ensure old cached data is removed regularly.
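The article does not describe how the purge works, so the following is only one possible shape for it: a periodic job that drops entries older than a maximum age. The in-memory dictionary, the timestamp layout, and the `max_age_seconds` parameter are assumptions for the example; with Redis, key expiry (TTL) could achieve a similar effect.

```python
import time
from typing import Dict, List, Optional, Tuple


def purge_stale(
    cache: Dict[str, Tuple[float, object]],
    max_age_seconds: float,
    now: Optional[float] = None,
) -> List[str]:
    """Delete entries stored more than max_age_seconds ago; return their keys.

    Each cache value is a (stored_at_timestamp, payload) pair.
    """
    current = time.time() if now is None else now
    stale = [
        key
        for key, (stored_at, _payload) in cache.items()
        if current - stored_at > max_age_seconds
    ]
    for key in stale:
        del cache[key]
    return stale
```

Passing `now` explicitly makes the function deterministic in tests, while a scheduled job would simply rely on the current time.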