I first posted this question on Stack Overflow, but I think it has more of a place here.
In order to limit the size of my REST API responses, I want to implement Google's performance tip of using the `fields` query string parameter to return partial resources.
If I have a full response for `GET https://myapi.com/v1/users`:

```json
[
  {
    "id": 12,
    "first_name": "Angie",
    "last_name": "Smith",
    "address": {
      "street": "1122 Something St.",
      "city": "A city"
      ..and so on...
    }
  },
  ... and so on
]
```
I will be able to filter it with `GET https://myapi.com/v1/users?fields=first_name,address/city`:

```json
[
  {
    "id": 12,
    "first_name": "Angie",
    "address": {"city": "A city"}
  },
  ... and so on
]
```
The concept is pretty easy to understand, but I can't find an easy way to implement it!
My API resources are all designed the same way:
- use query string parameters for filtering, sorting, and paging;
- call a service with those parameters to build a SQL query (only the `WHERE` condition, the `ORDER BY` clause, and the `LIMIT` are dynamic);
- use a converter to format the data back to JSON.
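As a rough Java sketch of that pipeline (table, column, and class names are all assumptions), only the `WHERE`, `ORDER BY`, and `LIMIT` parts vary:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class UserQueryBuilder {
    // e.g. filters = {"city": "A city"}, sort = "last_name", limit = 20
    public static String build(Map<String, String> filters, String sort, int limit) {
        StringBuilder sql = new StringBuilder("SELECT * FROM users");
        if (!filters.isEmpty()) {
            List<String> conditions = new ArrayList<>();
            for (String column : filters.keySet()) {
                conditions.add(column + " = ?"); // bind values via PreparedStatement, never inline them
            }
            sql.append(" WHERE ").append(String.join(" AND ", conditions));
        }
        if (sort != null) {
            sql.append(" ORDER BY ").append(sort);
        }
        sql.append(" LIMIT ").append(limit);
        return sql.toString();
    }
}
```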
But when using this new `fields` parameter, what do I need to do? Where do I filter the data?

Do I filter only the JSON output? In that case I would still run (in this example) an unwanted `JOIN` on the address table and fetch unwanted columns from the users table.

Or do I build a dynamic SQL query that fetches exactly the requested fields and adds the `JOIN` only when the end user needs it? The converter would then have to be smart enough to convert only the fields present in the SQL result.

In my opinion, this second solution produces extremely dynamic, extremely complex code that is difficult to maintain.
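To make the second option concrete, here is a minimal sketch of the dynamic-SQL approach; the field-to-column map, table aliases, and join rule are all assumptions:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class PartialSelectBuilder {
    // Maps an API field name to the column backing it; hypothetical schema.
    private static final Map<String, String> COLUMNS = Map.of(
            "first_name", "u.first_name",
            "last_name", "u.last_name",
            "address/city", "a.city",
            "address/street", "a.street");

    public static String build(List<String> fields) {
        List<String> selected = new ArrayList<>();
        boolean needsAddressJoin = false;
        for (String field : fields) {
            String column = COLUMNS.get(field);
            if (column == null) {
                continue; // ignore unknown fields (or answer 400 Bad Request)
            }
            selected.add(column);
            if (field.startsWith("address/")) {
                needsAddressJoin = true;
            }
        }
        // "id" is always returned, as in the example responses above.
        StringBuilder sql = new StringBuilder("SELECT u.id, " + String.join(", ", selected));
        sql.append(" FROM users u");
        if (needsAddressJoin) {
            sql.append(" JOIN addresses a ON a.user_id = u.id");
        }
        return sql.toString();
    }
}
```

The converter then only has to walk the columns that actually came back from the query.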
So, how do you implement such a REST API with a partial-resource feature? What are your best practices in that case?
(I'm a Java and PHP developer, but I don't think that's relevant to the question, since I have the same problem in both languages.)
1 Answer
I think it's important to note that Google's recommendation here is about how to make your application more performant when calling Google's APIs.

Google's APIs have a lot of fields, and they have many different customers using them for many different reasons. Adding a field filter saves both Google and the client a lot of bandwidth.

However, the same might not be true for your API.
**If your objects are small**

Limiting the fields will have less of an effect on bandwidth, and bandwidth might not be your performance bottleneck.
**If you have few clients, or control your own clients**

You can customise the endpoint to return exactly the data the client needs, without the client having to specify a dynamic list of fields.
**If the client makes use of cached responses**

Sending all the fields may be preferable to making two or more calls to the same endpoint.
If you do want to go ahead and implement field filtering then, as you point out, you immediately run into the problem that filtering the fields may well increase the processor demand of your API.

Whether to filter in the database, in the output, or elsewhere will be a case-by-case decision.
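If you filter on the output side, a minimal Java sketch (class and method names are hypothetical) could keep only the requested fields after the full object has been fetched and converted to a map:

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class JsonFieldFilter {
    // Keeps only the requested fields; "address/city" addresses one level
    // into a nested map, matching the fields syntax used in the question.
    @SuppressWarnings("unchecked")
    public static Map<String, Object> filter(Map<String, Object> user, List<String> fields) {
        Map<String, Object> out = new LinkedHashMap<>();
        for (String field : fields) {
            String[] path = field.split("/", 2);
            Object value = user.get(path[0]);
            if (value == null) {
                continue; // unknown or absent field: skip it
            }
            if (path.length == 1) {
                out.put(path[0], value);
            } else if (value instanceof Map) {
                Object nestedValue = ((Map<?, ?>) value).get(path[1]);
                if (nestedValue != null) {
                    Map<String, Object> sub = (Map<String, Object>) out
                            .computeIfAbsent(path[0], k -> new LinkedHashMap<String, Object>());
                    sub.put(path[1], nestedValue);
                }
            }
        }
        return out;
    }
}
```

This is simple, but as noted in the question, the database still does the full `JOIN` and fetches every column.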
Comments:

- Where do you filter on `fields`? In the DB or in memory? Which is more complex to implement? I think making the projection in the DB is more common, easier, and more performant. But it depends on your stack. For example, with NodeJS this is easier to do in memory than in Java, because of Java's strong typing.
- So, replacing the `SELECT *` by a `SELECT first_name, city` ..., I guess? Or did I miss something?