
Atlassian's 4 Million PostgreSQL Database Migration: When Standard Cloud Strategies Fail

Jul 05, 2025 3 min read


Atlassian recently migrated 4 million Jira databases to Amazon Aurora, intending to reduce costs and improve the reliability of its Jira Cloud platform. Due to the large number of files involved and the constraints of managed services, the team developed a custom tool to orchestrate the process, as traditional cloud migration strategies were not viable.

In an article on the Atlassian engineering blog, the team describes the technical challenges and outcomes of migrating thousands of PostgreSQL clusters with up to 4000 databases each.

Atlassian’s architecture for Jira uses one database per tenant (an approach that is usually justified when the tenant count is small), which translates into over 4 million PostgreSQL databases. Pat Rubis, principal site reliability engineer at Atlassian, explains:

One database per tenant is an uncommon architecture, and we’ve opted for it in order to maximize isolation, scalability, and operational control at Atlassian’s massive scale. It makes it much easier to ensure that data from one tenant cannot accidentally or maliciously be accessed by another, and allows us to scale our fleet horizontally, balancing load and optimising performance for tenants of significantly different sizes.

Due to the specific architecture, the team has to occasionally rebalance the databases across instances to maintain an even spread of load. In late 2023, the team decided to perform a replatform of the entire fleet to Amazon Aurora, involving all the accounts of the Jira Cloud platform. The goals were to take advantage of Aurora's better SLA (99.99%), increase elasticity by autoscaling the reader instances, and achieve some cost optimizations.

The project was estimated to last a few months and aimed to minimize tenant downtime and migration costs. It was orchestrated using AWS Step Functions and relied on feature flags to immediately override the tenants' database endpoints on the application servers. While converting an Amazon RDS for PostgreSQL instance to Aurora is usually a simple task, the large number of databases per instance forces all of those tenants, each with its own connection endpoint and credentials, to cut over in unison.
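To illustrate the feature-flag mechanism described above, here is a minimal sketch of a per-tenant endpoint override. It is not Atlassian's implementation: the `EndpointResolver` class, the in-memory dictionaries standing in for a flag store, and the host names are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass(frozen=True)
class DbEndpoint:
    host: str
    port: int = 5432


@dataclass
class EndpointResolver:
    """Resolve a tenant's database endpoint, honoring per-tenant overrides.

    `defaults` stands in for the regular tenant-to-endpoint mapping and
    `overrides` for a feature-flag store. Both are hypothetical.
    """

    defaults: Dict[str, DbEndpoint]
    overrides: Dict[str, DbEndpoint] = field(default_factory=dict)

    def set_override(self, tenant_id: str, endpoint: DbEndpoint) -> None:
        # Flip the flag: new connections for this tenant go to `endpoint`.
        self.overrides[tenant_id] = endpoint

    def clear_override(self, tenant_id: str) -> None:
        # Roll back: the tenant falls back to its default endpoint.
        self.overrides.pop(tenant_id, None)

    def resolve(self, tenant_id: str) -> DbEndpoint:
        # An override, when present, wins over the default mapping.
        return self.overrides.get(tenant_id, self.defaults[tenant_id])


resolver = EndpointResolver({"tenant-1": DbEndpoint("rds-old.example.internal")})
resolver.set_override("tenant-1", DbEndpoint("aurora-new.example.internal"))
print(resolver.resolve("tenant-1").host)
```

Keeping the override separate from the default mapping means a cutover can be reverted by clearing the flag, without touching the underlying tenant configuration.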

Furthermore, a single Jira database corresponds to about 5000 files on disk, so the overall number of files per PostgreSQL instance ran into the millions. This hit a limitation on Aurora's side: the new replica instance timed out while performing a status check, which undermined Atlassian's ability to convert the clusters safely. The team devised a different approach, called "draining," to orchestrate the migration: it first reduced the number of tenants on the instances to be converted and controlled the number of databases moved across clusters.

Figure: Database draining (source: Atlassian Engineering Blog)
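The arithmetic behind draining can be sketched as follows. The figure of roughly 5000 files per database comes from the article; the `max_files` threshold is purely hypothetical, since the actual limit at which Aurora's status check times out is not documented.

```python
FILES_PER_DATABASE = 5_000  # approximate figure quoted in the article


def estimated_files(database_count: int) -> int:
    """Rough number of on-disk files for an instance hosting `database_count` databases."""
    return database_count * FILES_PER_DATABASE


def databases_to_drain(database_count: int, max_files: int) -> int:
    """Number of databases to move off an instance before converting it.

    `max_files` is an assumed safe ceiling, not a documented Aurora quota.
    """
    max_databases = max_files // FILES_PER_DATABASE
    return max(0, database_count - max_databases)


# An instance with 4000 databases holds about 20 million files.
print(estimated_files(4_000))

# With an assumed ceiling of 10 million files, half of them must be drained.
print(databases_to_drain(4_000, max_files=10_000_000))
```

Under these assumptions, instances already below the ceiling need no draining and can be converted directly.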

One of the project's challenges was controlling both source and destination concurrency to minimize the impact on normal operations during the migration. Rubis adds:

Ultimately, we had to find a balance between how much additional infrastructure we wanted in each region to perform the migrations (and how much that would cost), and how long we were comfortable with each region taking to complete.
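One way to cap concurrency on both sides is a semaphore per source instance and per destination cluster. The sketch below uses Python's `asyncio` and is an illustration only: Atlassian orchestrated its migration with AWS Step Functions, and the `run_moves` function, the `migrate_one` callback, and the limit values are assumptions.

```python
import asyncio
from collections import defaultdict
from typing import Awaitable, Callable, DefaultDict, Iterable, Tuple

# (database name, source instance, destination cluster)
Move = Tuple[str, str, str]


async def run_moves(
    moves: Iterable[Move],
    migrate_one: Callable[[str, str, str], Awaitable[None]],
    max_per_source: int,
    max_per_destination: int,
) -> None:
    """Run all moves while capping in-flight work per source and per destination."""
    source_limits: DefaultDict[str, asyncio.Semaphore] = defaultdict(
        lambda: asyncio.Semaphore(max_per_source)
    )
    destination_limits: DefaultDict[str, asyncio.Semaphore] = defaultdict(
        lambda: asyncio.Semaphore(max_per_destination)
    )

    async def guarded(move: Move) -> None:
        database, source, destination = move
        # Always acquire source first, then destination. A task holding a
        # destination permit never waits on anything else, so it finishes
        # and releases both permits.
        async with source_limits[source], destination_limits[destination]:
            await migrate_one(database, source, destination)

    await asyncio.gather(*(guarded(m) for m in moves))


async def fake_migrate(database: str, source: str, destination: str) -> None:
    # Placeholder for the real copy-and-cutover work.
    await asyncio.sleep(0.01)


example_moves = [(f"db{i}", "src-a", f"dst-{i % 2}") for i in range(6)]
asyncio.run(run_moves(example_moves, fake_migrate, max_per_source=2, max_per_destination=1))
```

Raising either limit shortens the overall migration at the price of more load on the instances involved, which is the trade-off the quote describes.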

At peak, Atlassian managed to migrate up to 90000 Jira databases per day, with an average of 38000 databases per day. Cassian Cox, senior engineering manager at Atlassian, comments on LinkedIn:

This was a huge piece of infrastructure work that's been a big part of my time at Atlassian. This unlocked huge improvements in scalability, reliability, and cost efficiency.

Figure: Migrations by day (source: Atlassian Engineering Blog)

The entire project involved 2403 RDS database instances to be converted, with 2.6 million databases migrated and 1.8 million databases drained from the source instances.

Overall, the team estimates the total number of database files used in Jira at over 27.4 billion but has not disclosed additional metrics or details on the cost savings achieved.

The startup timeout threshold experienced by Atlassian is currently not documented on the Amazon Aurora quotas and constraints page.

About the Author

Renato Losio
