We have implemented SQL Row-level security using SESSION_CONTEXT. From an architecture standpoint, this is a beautiful paradigm that allows our application to be clueless to the fact that only 1 tenant's data is being returned based on the logged in user.
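For readers unfamiliar with the pattern, a minimal sketch of RLS driven by SESSION_CONTEXT might look like the following. The table `dbo.Orders`, the `Security` schema, the function and policy names, and the `TenantId` key are all hypothetical; this is not our actual schema.

```sql
-- Hypothetical table; names are illustrative only.
CREATE TABLE dbo.Orders
(
    OrderId  int IDENTITY(1,1) NOT NULL PRIMARY KEY,
    TenantId int NOT NULL,
    Amount   decimal(18,2) NOT NULL
);
GO

CREATE SCHEMA Security;
GO

-- Inline table-valued predicate: returns a row only when the row's TenantId
-- matches the value stored in the session context.
-- SESSION_CONTEXT returns sql_variant, so it is cast to the column's type.
CREATE FUNCTION Security.fn_TenantPredicate (@TenantId int)
RETURNS TABLE
WITH SCHEMABINDING
AS
RETURN
    SELECT 1 AS fn_result
    WHERE @TenantId = CAST(SESSION_CONTEXT(N'TenantId') AS int);
GO

-- Bind the predicate to the table as a filter (reads) and a block (inserts).
CREATE SECURITY POLICY Security.TenantFilter
    ADD FILTER PREDICATE Security.fn_TenantPredicate(TenantId) ON dbo.Orders,
    ADD BLOCK PREDICATE Security.fn_TenantPredicate(TenantId) ON dbo.Orders AFTER INSERT
    WITH (STATE = ON);
GO

-- The application sets the tenant once per connection; queries stay tenant-agnostic.
EXEC sys.sp_set_session_context @key = N'TenantId', @value = 42;
SELECT OrderId, Amount FROM dbo.Orders;  -- only rows with TenantId = 42 are visible
```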
However, we've come across some performance issues related to pre-filtering joins that result in full table scans or index scans. When I explicitly add FORCESEEK, the queries run significantly faster, but SQL Server never chooses a seek on its own, and my understanding is that RLS is to blame. I ruled out parameter sniffing and cached statistics by forcing a recompute every time.
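To illustrate the kind of hint involved, here is a sketch with hypothetical tables (`dbo.Orders`, `dbo.Customers`) and columns. It assumes "forcing a recompute" is done with `OPTION (RECOMPILE)`, which is only one way to rule out a cached plan.

```sql
-- Hypothetical query; table and column names are assumptions for illustration.
SELECT o.OrderId, c.Name
FROM dbo.Orders AS o
INNER JOIN dbo.Customers AS c WITH (FORCESEEK)  -- require an index seek on Customers
    ON c.CustomerId = o.CustomerId
WHERE o.OrderDate >= '20240101'
OPTION (RECOMPILE);  -- compile a fresh plan on every execution
```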
Have you had this happen before? Is there a way around it with RLS? Our backup plan is to scrap RLS completely and move to an Azure SQL elastic pool and horizontal sharding based on customer id.
Thanks so much for any guidance you can provide.
UPDATE: Below are the query execution plans for: 1) With RLS enabled on all 3 tables 2) With RLS enabled on all 3 tables + FORCESEEK option 3) With RLS disabled
[Execution plan images: With RLS enabled; With RLS and FORCESEEK on joins; Without RLS]
1 Answer
We have implemented SQL Row-level ...[which] allows our application to be clueless to the fact that only 1 tenant's data is being returned based on the logged in user. ... However, we've come across some performance issues
You will continue to run into performance issues with multi-tenant databases. RLS is really neither here nor there. If you added your own tenant filters in your application code, you would have the same issues.
Managing a multi-tenant database schema is always a challenge. First you change every table to support tenant-based filtering. This typically means including the TenantId in the clustered index of every table, either as the leading column, or as a trailing column with a partition scheme. Then you have to manage query plans, which can't differ between tenants in a multi-tenant database. Then you're stuck upgrading all the tenants at the same time, which, critically, reduces your ability to quickly patch issues affecting a single tenant. Eventually you'll have to build processes to export and import a single tenant. You can work through this stuff, and many companies do. But it's fundamentally a challenging model, and one that only really makes sense when reducing your database footprint actually reduces your costs somehow, i.e. not in Azure SQL Database.
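As a rough sketch of the two clustered-index layouts mentioned above (all table names, columns, and partition boundary values here are hypothetical):

```sql
-- Option 1: TenantId as the leading column of the clustered index.
CREATE TABLE dbo.Invoices
(
    TenantId  int NOT NULL,
    InvoiceId int NOT NULL,
    Amount    decimal(18,2) NOT NULL,
    CONSTRAINT PK_Invoices PRIMARY KEY CLUSTERED (TenantId, InvoiceId)
);
GO

-- Option 2: TenantId as a trailing key column, with the table partitioned on it.
-- Boundary values are placeholders; real ranges depend on how tenants are numbered.
CREATE PARTITION FUNCTION pf_Tenant (int)
    AS RANGE RIGHT FOR VALUES (100, 200, 300);
GO

CREATE PARTITION SCHEME ps_Tenant
    AS PARTITION pf_Tenant ALL TO ([PRIMARY]);
GO

CREATE TABLE dbo.Payments
(
    PaymentId int NOT NULL,
    TenantId  int NOT NULL,
    Amount    decimal(18,2) NOT NULL,
    -- A unique index on a partitioned table must include the partitioning column.
    CONSTRAINT PK_Payments PRIMARY KEY CLUSTERED (PaymentId, TenantId)
) ON ps_Tenant (TenantId);
```

Either way, a tenant filter can be answered by touching only that tenant's key range or partition rather than the whole table.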
So you really might be better off to "scrap RLS completely and move to an Azure SQL elastic pool and horizontal sharding based on customer id".
Now the benefits of database-per-tenant are many, and performance is just the start. The best thing is that it greatly improves your data security story. It allows you to upgrade or patch on a tenant-by-tenant basis. It gives you tenant-level backup, restore, oops-recovery, and export. It provides tenant-level security and reporting. It gives you horizontal scalability, and the option of provisioning dedicated resources for a tenant, and so on.
And while it may appear that it's more complicated to manage, and does require some additional automation, in the long run it's actually easier too.
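The automation could start with something as small as provisioning one database per tenant into an existing pool. This is a sketch only: the database name `Tenant_42` and the pool name `TenantPool` are hypothetical, and the pool itself is assumed to have been created beforehand.

```sql
-- Run while connected to the master database of the Azure SQL logical server.
-- Assumes an elastic pool named TenantPool already exists on that server.
CREATE DATABASE [Tenant_42]
    ( SERVICE_OBJECTIVE = ELASTIC_POOL ( name = [TenantPool] ) );
```

The application would then pick the connection for a tenant from a tenant-to-database catalog instead of relying on a per-row filter.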
Can you post the CREATE TABLE statements and the RLS definitions? The pictures of the plans aren't very useful.