Wednesday, 13 April 2016

Best Practices for Designing a Pragmatic RESTful API

Web APIs have become a very important topic in the last year. We at M-Way Solutions work with different backend systems every day, and therefore we know the importance of a clean API design.
Typically we use a RESTful design for our web APIs. The concept of REST is to separate the API structure into logical resources. The HTTP methods GET, DELETE, POST and PUT are used to operate on the resources.
These are 10 best practices to design a clean RESTful API:

1. Use nouns but no verbs

For easy understanding, use this structure for every resource:
Resource    GET (read)               POST (create)              PUT (update)             DELETE
/cars       Returns a list of cars   Creates a new car          Bulk update of cars      Deletes all cars
/cars/711   Returns a specific car   Method not allowed (405)   Updates a specific car   Deletes a specific car
Do not use verbs:
/getAllCars
/createNewCar
/deleteAllRedCars
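A minimal sketch of how the table above might translate into a dispatcher: the HTTP method selects the action and the URL only names the resource. The in-memory `CARS` store and the `dispatch` function are assumptions made for illustration and do not belong to any real framework; bulk PUT on the collection is left out to keep the sketch short.

```python
# Hypothetical in-memory store; a real API would sit on top of a database.
CARS = {711: {"id": 711, "manufacturer": "bmw", "model": "X5"}}


def dispatch(method, path, body=None):
    """Return (status_code, payload) for a request on /cars or /cars/<id>."""
    parts = [p for p in path.split("/") if p]
    if not parts or parts[0] != "cars" or len(parts) > 2:
        return 404, None

    if len(parts) == 1:  # the collection: /cars
        if method == "GET":
            return 200, list(CARS.values())
        if method == "POST":
            new_id = max(CARS, default=0) + 1
            CARS[new_id] = {**(body or {}), "id": new_id}
            return 201, CARS[new_id]
        if method == "DELETE":
            CARS.clear()
            return 204, None
        return 405, None  # bulk PUT is omitted in this sketch

    # a single resource: /cars/<id>
    if not parts[1].isdigit():
        return 404, None
    car_id = int(parts[1])
    if method == "POST":
        return 405, None  # creating under an existing ID is not allowed
    if car_id not in CARS:
        return 404, None
    if method == "GET":
        return 200, CARS[car_id]
    if method == "PUT":
        CARS[car_id] = {**(body or {}), "id": car_id}
        return 200, CARS[car_id]
    if method == "DELETE":
        del CARS[car_id]
        return 204, None
    return 405, None
```

A verb-style path such as /getAllCars simply has no route here and ends in a 404.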

2. GET method and query parameters should not alter the state

Use the PUT, POST and DELETE methods instead of the GET method to alter the state.
Do not use GET for state changes:
GET /users/711?activate or
GET /users/711/activate

3. Use plural nouns

Do not mix up singular and plural nouns. Keep it simple and use only plural nouns for all resources.
/cars instead of /car
/users instead of /user
/products instead of /product
/settings instead of /setting


4. Use sub-resources for relations

If a resource is related to another resource, use sub-resources.
GET /cars/711/drivers/ Returns a list of drivers for car 711
GET /cars/711/drivers/4 Returns driver #4 for car 711

5. Use HTTP headers for serialization formats

Both client and server need to know which format is used for the communication. The format has to be specified in the HTTP header.
Content-Type defines the request format.
Accept defines a list of acceptable response formats.
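As a rough sketch of what honouring the Accept header could look like on the server side, the function below picks the best supported media type. The function name, the default list of supported formats, and the simplified parsing (only the q parameter, no partial wildcards such as application/*) are assumptions for illustration.

```python
def choose_format(accept_header, supported=("application/json", "application/xml")):
    """Pick the best supported media type for an Accept header, or None."""
    if not accept_header.strip():
        return supported[0]  # no preference stated: use the default format

    candidates = []
    for item in accept_header.split(","):
        media_type, _, params = item.strip().partition(";")
        media_type = media_type.strip().lower()
        if not media_type:
            continue
        quality = 1.0
        for param in params.split(";"):
            name, _, value = param.strip().partition("=")
            if name.strip() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0
        candidates.append((quality, media_type))

    # Highest q-value first; sorted() is stable, so ties keep header order.
    for quality, media_type in sorted(candidates, key=lambda c: -c[0]):
        if quality <= 0:
            continue
        if media_type == "*/*":
            return supported[0]
        if media_type in supported:
            return media_type
    return None  # the caller would typically answer 406 Not Acceptable
```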

6. Use HATEOAS

Hypermedia as the Engine of Application State is a principle that hypertext links should be used to create a better navigation through the API.
{
  "id": 711,
  "manufacturer": "bmw",
  "model": "X5",
  "seats": 5,
  "drivers": [
    {
      "id": "23",
      "name": "Stefan Jauker",
      "links": [
        {
          "rel": "self",
          "href": "/api/v1/drivers/23"
        }
      ]
    }
  ]
}
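One way such links might be attached before serialization is sketched below. The helper name `with_self_link` and the `base_path` default are assumptions for illustration; only the shape of the `links` entry follows the response above.

```python
def with_self_link(resource, collection, base_path="/api/v1"):
    """Return a copy of `resource` with a `links` list containing a self link."""
    linked = dict(resource)
    linked["links"] = [
        {"rel": "self", "href": f"{base_path}/{collection}/{resource['id']}"}
    ]
    return linked


original = {"id": "23", "name": "Stefan Jauker"}
driver = with_self_link(original, "drivers")
```

The original dictionary is left untouched, so the stored representation stays free of presentation details.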


7. Provide filtering, sorting, field selection and paging for collections

Filtering:
Use a unique query parameter for all fields or a query language for filtering.
GET /cars?color=red Returns a list of red cars
GET /cars?seats<=2 Returns a list of cars with a maximum of 2 seats
Sorting:
Allow ascending and descending sorting over multiple fields.
GET /cars?sort=-manufacturer,+model
This returns a list of cars sorted by descending manufacturers and ascending models.
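A small sketch of how the sort parameter might be interpreted, assuming hypothetical helpers `parse_sort` and `sort_cars` that work on a list of dictionaries. One detail worth noting: most query-string parsers decode a literal "+" as a space, so the sketch treats a missing or blank prefix as ascending.

```python
def parse_sort(sort_param):
    """Turn '-manufacturer,+model' into (field, descending) pairs."""
    keys = []
    for raw in sort_param.split(","):
        # A decoded '+' arrives as a space, so strip whitespace first.
        field = raw.strip()
        if not field:
            continue
        descending = field.startswith("-")
        keys.append((field.lstrip("+-"), descending))
    return keys


def sort_cars(cars, sort_param):
    """Return a new list sorted by every key in the sort parameter."""
    result = list(cars)
    # Apply the keys from last to first; list.sort is stable,
    # so the first key ends up as the primary ordering.
    for field, descending in reversed(parse_sort(sort_param)):
        result.sort(key=lambda car: car[field], reverse=descending)
    return result
```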
Field selection:
Mobile clients display just a few attributes in a list. They don’t need all attributes of a resource. Give the API consumer the ability to choose returned fields. This will also reduce the network traffic and speed up the usage of the API.
GET /cars?fields=manufacturer,model,id,color
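Field selection can be sketched as a simple projection over the resource. The function name `select_fields` and the choice to silently ignore unknown field names are assumptions; an API could just as well reject unknown fields with a 400.

```python
def select_fields(resource, fields_param):
    """Keep only the attributes named in a comma-separated `fields` value."""
    if not fields_param:
        return dict(resource)  # no selection requested: return everything
    wanted = [name.strip() for name in fields_param.split(",") if name.strip()]
    return {name: resource[name] for name in wanted if name in resource}
```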
Paging:
Use limit and offset. It is flexible for the user and common in leading databases. The default should be limit=20 and offset=0.
GET /cars?offset=10&limit=5
To send the total number of entries back to the user, use the custom HTTP header X-Total-Count.
Links to the next or previous page should be provided in the HTTP Link header as well. It is important to follow these Link header values instead of constructing your own URLs.
Link: <https://blog.mwaysolutions.com/sample/api/v1/cars?offset=15&limit=5>; rel="next",
<https://blog.mwaysolutions.com/sample/api/v1/cars?offset=50&limit=3>; rel="last",
<https://blog.mwaysolutions.com/sample/api/v1/cars?offset=0&limit=5>; rel="first",
<https://blog.mwaysolutions.com/sample/api/v1/cars?offset=5&limit=5>; rel="prev",
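The two headers could be assembled roughly as follows. This is a sketch under stated assumptions: the function name `paging_headers` is made up, the total count is supplied by the caller, and every link keeps the requested limit.

```python
def paging_headers(base_url, offset, limit, total):
    """Build X-Total-Count and Link headers for an offset/limit page."""
    if limit <= 0:
        raise ValueError("limit must be positive")

    def page_url(page_offset):
        return f"<{base_url}?offset={page_offset}&limit={limit}>"

    links = []
    if offset + limit < total:  # there is at least one more page
        links.append(f'{page_url(offset + limit)}; rel="next"')
        # Offset of the last page, aligned to multiples of `limit`.
        last_offset = ((total - 1) // limit) * limit
        links.append(f'{page_url(last_offset)}; rel="last"')
    if offset > 0:  # we are not on the first page
        links.append(f'{page_url(0)}; rel="first"')
        links.append(f'{page_url(max(offset - limit, 0))}; rel="prev"')

    headers = {"X-Total-Count": str(total)}
    if links:
        headers["Link"] = ", ".join(links)
    return headers
```

When the whole collection fits on one page, no Link header is produced at all.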

8. Version your API

Make the API version mandatory and do not release an unversioned API. Use a simple ordinal number and avoid dot notation such as 2.5.
We use the URL for API versioning, starting with the letter "v":
/blog/api/v1

9. Handle Errors with HTTP status codes

It is hard to work with an API that ignores error handling. Simply returning an HTTP 500 with a stack trace is not very helpful.

Use HTTP status codes
The HTTP standard provides over 70 status codes to describe the return values. We don’t need them all, but at least about 10 of them should be used.
200 – OK – Everything is working
201 – Created – New resource has been created
204 – No Content – The resource was successfully deleted
304 – Not Modified – The client can use cached data
400 – Bad Request – The request was invalid or cannot be served. The exact error should be explained in the error payload. E.g. "The JSON is not valid"
401 – Unauthorized – The request requires user authentication
403 – Forbidden – The server understood the request, but is refusing it or the access is not allowed.
404 – Not found – There is no resource behind the URI.
422 – Unprocessable Entity – Should be used if the server cannot process the entity, e.g. if an image cannot be formatted or mandatory fields are missing in the payload.
500 – Internal Server Error – API developers should avoid this error. If an error occurs in the global catch block, the stack trace should be logged and not returned as response.
Use error payloads
All exceptions should be mapped to an error payload. Here is an example of what a JSON payload should look like:
{
  "errors": [
    {
      "userMessage": "Sorry, the requested resource does not exist",
      "internalMessage": "No car found in the database",
      "code": 34
    }
  ]
}
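A sketch of how exceptions might be mapped to such a payload in a global handler. The `ApiError` class, the `to_response` function, and the generic code 0 for unexpected failures are assumptions for illustration; the point is that the stack trace is logged on the server and never placed in the response.

```python
import json
import logging

logger = logging.getLogger(__name__)


class ApiError(Exception):
    """Hypothetical application exception carrying the payload fields."""

    def __init__(self, status, code, user_message, internal_message):
        super().__init__(internal_message)
        self.status = status
        self.code = code
        self.user_message = user_message
        self.internal_message = internal_message


def to_response(exc):
    """Map an exception to (http_status, json_body) without a stack trace."""
    if isinstance(exc, ApiError):
        status = exc.status
        error = {
            "userMessage": exc.user_message,
            "internalMessage": exc.internal_message,
            "code": exc.code,
        }
    else:
        # Unexpected failure: keep the details in the server log only.
        logger.error("Unhandled error", exc_info=exc)
        status = 500
        error = {
            "userMessage": "Sorry, something went wrong",
            "internalMessage": "Unexpected server error",
            "code": 0,  # assumed generic code, not from the original post
        }
    return status, json.dumps({"errors": [error]})
```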

10. Allow overriding HTTP method

Some proxies support only POST and GET methods. To support a RESTful API with these limitations, the API needs a way to override the HTTP method.
Use the custom HTTP header X-HTTP-Method-Override to override the POST method.
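On the server, the override could be resolved before routing, along these lines. The function name `effective_method` and the restriction to PUT and DELETE are assumptions; the sketch only honours the header on POST requests so that a GET can never be turned into a state change.

```python
def effective_method(method, headers):
    """Return the method to act on, honouring X-HTTP-Method-Override on POST."""
    allowed_overrides = {"PUT", "DELETE"}  # assumed whitelist
    method = method.upper()
    if method != "POST":
        return method
    # Header names are case-insensitive.
    for name, value in headers.items():
        if name.lower() == "x-http-method-override":
            override = value.strip().upper()
            if override in allowed_overrides:
                return override
    return method
```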

API Design Cheat Sheet

  1. Build the API with consumers in mind--as a product in its own right.
    • Not for a specific UI.
    • Embrace flexibility / tunability of each endpoint (see #5, 6 & 7).
    • Eat your own dogfood, even if you have to mock up an example UI.
  2. Use the Collection Metaphor.
    • Two URLs (endpoints) per resource:
      • The resource collection (e.g. /orders)
      • Individual resource within the collection (e.g. /orders/{orderId}).
    • Use plural forms (‘orders’ instead of ‘order’).
    • Alternate resource names with IDs as URL nodes (e.g. /orders/{orderId}/items/{itemId})
    • Keep URLs as short as possible. Preferably, no more than three nodes per URL.
  3. Use nouns as resource names (i.e. don’t use verbs in URLs).
  4. Make resource representations meaningful.
    • “No Naked IDs!” No plain IDs embedded in responses. Use links and reference objects.
    • Design resource representations. Don’t simply represent database tables.
    • Merge representations. Don’t expose relationship tables as two IDs.
  5. Support filtering, sorting, and pagination on collections.
  6. Support link expansion of relationships. Allow clients to expand the data contained in the response by including additional representations instead of, or in addition to, links.
  7. Support field projections on resources. Allow clients to reduce the number of fields that come back in the response.
  8. Use the HTTP method names to mean something:
    • POST - create and other non-idempotent operations.
    • PUT - update.
    • GET - read a resource or collection.
    • DELETE - remove a resource or collection.
  9. Use HTTP status codes to be meaningful.
    • 200 - Success.
    • 201 - Created. Returned on successful creation of a new resource. Include a 'Location' header with a link to the newly-created resource.
    • 400 - Bad request. Data issues such as invalid JSON, etc.
    • 404 - Not found. Resource not found on GET.
    • 409 - Conflict. Duplicate data or invalid data state would occur.
  10. Use ISO 8601 timepoint formats for dates in representations.
  11. Consider connectedness by utilizing a linking strategy. Some popular examples are:
  12. Use OAuth2 to secure your API.
    • Use a Bearer token for authentication.
    • Require HTTPS / TLS / SSL to access your APIs. OAuth2 Bearer tokens demand it. Unencrypted communication over HTTP allows for simple eavesdropping and impersonation.
  13. Use Content-Type negotiation to describe incoming request payloads.
    For example, let's say you're doing ratings, including a thumbs-up/thumbs-down and five-star rating. You have one route to create a rating: POST /ratings
    How do you distinguish the incoming data to the service so it can determine which rating type it is: thumbs-up or five star?
    The temptation is to create one route for each rating type: POST /ratings/five_star and POST /ratings/thumbs_up
    However, by using Content-Type negotiation we can use the same POST /ratings route for both types. By setting the Content-Type header on the request to something like Content-Type: application/vnd.company.rating.thumbsup or Content-Type: application/vnd.company.rating.fivestar, the server can determine how to process the incoming rating data.
  14. Evolution over versioning. However, if versioning, use the Accept header instead of versioning in the URL.
    • Versioning via the URL signifies a 'platform' version and the entire platform must be versioned at the same time to enable the linking strategy.
    • Versioning via the Accept header is versioning the resource.
    • Additions to a JSON response do not require versioning. However, additions to a JSON request body that are 'required' are troublesome--and may require versioning.
    • Hypermedia linking and versioning is troublesome no matter what--minimize it.
    • Note that a version in the URL, while discouraged, can be used as a 'platform' version. It should appear as the first node in the path and not version individual endpoints differently (e.g. api.example.com/v1/...).
  15. Consider Cache-ability. At a minimum, use the following response headers:
    • ETag - An arbitrary string for the version of a representation. Make sure to include the media type in the hash value, because that makes a different representation. (ex: ETag: "686897696a7c876b7e")
    • Date - Date and time the response was returned (in RFC1123 format). (ex: Date: Sun, 06 Nov 1994 08:49:37 GMT)
    • Cache-Control - The maximum number of seconds (max age) a response can be cached. However, if caching is not supported for the response, then no-cache is the value. (ex: Cache-Control: max-age=360 or Cache-Control: no-cache)
    • Expires - If max age is given, contains the timestamp (in RFC1123 format) for when the response expires, which is the value of Date (e.g. now) plus max age. If caching is not supported for the response, this header is not present. (ex: Expires: Sun, 06 Nov 1994 08:49:37 GMT)
    • Pragma - When Cache-Control is 'no-cache' this header is also set to 'no-cache'. Otherwise, it is not present. (ex: Pragma: no-cache)
    • Last-Modified - The timestamp that the resource itself was modified last (in RFC1123 format). (ex: Last-Modified: Sun, 06 Nov 1994 08:49:37 GMT)
  16. Ensure that your GET, PUT, and DELETE operations are all idempotent. There should be no adverse side effects from operations.
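The caching headers from item 15 could be produced roughly like this. It is a sketch under assumptions: the function name `cache_headers` is made up, the ETag is a truncated SHA-256 of media type plus body (any stable hash would do), and Last-Modified is left out because it needs the resource's own modification time.

```python
import hashlib
from email.utils import formatdate


def cache_headers(body, media_type, max_age, now):
    """Build caching headers for a response body given as bytes.

    `now` is a Unix timestamp and `max_age` is in seconds (0 disables caching).
    """
    # Include the media type in the hash so that the JSON and XML
    # representations of the same resource get different ETags.
    digest = hashlib.sha256(media_type.encode("utf-8") + b"\n" + body).hexdigest()
    headers = {
        "ETag": f'"{digest[:18]}"',
        "Date": formatdate(now, usegmt=True),  # RFC 1123 style, in GMT
    }
    if max_age > 0:
        headers["Cache-Control"] = f"max-age={max_age}"
        headers["Expires"] = formatdate(now + max_age, usegmt=True)
    else:
        headers["Cache-Control"] = "no-cache"
        headers["Pragma"] = "no-cache"
    return headers
```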
API development has been a very interesting and challenging type of web development in recent years. Scalability wasn't nearly as important early on in the Internet's history when there were fewer users, so standalone apps performed well for the most part. With the flood of different mobile devices, however, today's users are now able to use the same app from many different devices, which has resulted in developers needing to scale their apps.
With that said, it makes sense to create a common API that can easily and effectively scale to the ever-changing needs of the user rather than developing different standalone apps. Because there are several different technologies for creating APIs, there's no real standard or method for designing them. However, there are several agreed-upon best practices for designing APIs correctly. In the absence of a real standard, it's hard to create rigid conventions. Most of what's going to be discussed comes from real-world experience of moving legacy apps to an API model or creating a new API from scratch that's not only resilient and stable, but also scalable and fast.

Importance of Good API Design

No one likes to see a 404 error on a website, especially when someone has bookmarked a blog post or article for future use. It's a hassle for the user anytime a URL needs to be changed, and URL modification isn't good for SEO, either. Designing an API without the right abstraction might require changing the URLs later. Although redirects can solve this problem, it's an SEO best practice to avoid redirects. So, it makes sense to settle the standards before you delve into API design, and to focus on the specific details and business names later, before creating the URLs.
In this series, we'll keep ourselves focused on standards for API development. This series covers different technical standards, which vary depending on the specific business case. The assumption is that the API design is for RESTful services that are HTTP-based.

1. Nouns, No Verbs

The first rule of API design is to use nouns in the URL and to leave the action to the HTTP verbs. For example, the following URL embeds the verb "get" in its path:
chanderdhall.com/gettrainings
Because HTTP supports the GET method, the API needs to be designed in a way that the standard verb GET is used, which results in the following URL:
chanderdhall.com/trainings
What's the problem with the original URL? Although the URL looks good and is very intuitive, what if a user tries to use this with a different HTTP verb? Let's say the user wants to use the following URL:
POST chanderdhall.com/gettrainings
This is evidently self-defeating. We're trying to add a training, but the URL suggests the opposite, which might confuse the user about the URL's actual use. The same rule applies to the rest of the verbs; an HTTP DELETE request to a URL that says GetTrainings is just as misleading. The solution is to keep the URL free of verbs:
GET chanderdhall.com/trainings
POST chanderdhall.com/trainings
PUT chanderdhall.com/trainings
DELETE chanderdhall.com/trainings
As you might notice, the URL never changes, no matter what the verb is.

2. Two Base URLs per Resource

The next standard is based on the Keep it Simple, Stupid (KISS) principle. We really need two base URLs per resource: one for multiple values and one for the specific value. In this example, we need to either know about multiple trainings offered or we need to find a specific training, given that we have the required information to retrieve it. Again, that information could be the training ID or name. The following shows different results that use two base URLs with four HTTP verbs:
HTTP GET (read)
GET chanderdhall.com/trainings - Retrieves all trainings
GET chanderdhall.com/trainings/123 - Retrieves the training with ID=123
HTTP POST (insert)
POST chanderdhall.com/trainings - Creates a new training
POST chanderdhall.com/trainings/123 - Error
Why is there an error by using the POST method for a training with an ID? It's because the ID is something that gets generated by the database and then is passed back to the client. Why might you create this type of functionality if all we're going to do is return an error? This is because we're designing an API and we don't want to give the wrong impression to a user that his or her request got accepted when, in actuality, it didn't. So, it's important that we handle errors wherever needed.
HTTP PUT (update)
PUT chanderdhall.com/trainings - Bulk update of all trainings
PUT chanderdhall.com/trainings/123 - Updates a particular training if it exists. If it doesn't exist, then it throws an error. It does this because we don't want the PUT method to perform an insert; the POST method is the correct verb for inserts.
HTTP DELETE
DELETE chanderdhall.com/trainings - Deletes all trainings. Be careful to make sure the person is authenticated and authorized to delete the trainings.
DELETE chanderdhall.com/trainings/123 - Deletes the training with ID=123, given authorization permissions. It should check to see if the particular training actually exists. If the training doesn't exist, then it should throw an error stating that the resource wasn't found.

3. Associations

Your APIs should be very intuitive when you're developing them for associations. The following URL intuitively suggests that it's requesting the training with an ID of 114 that's delivered by the trainer with an ID of 14:
GET /trainers/14/training/114
We have traversed two levels in this URL. One level is the trainer, and the second level is the training that the trainer is providing.
How many levels should we traverse? That's a hard question to answer but it's better to keep it to a max of three levels unless a business case dictates to traverse higher.

4. Complexity

It's highly recommended to use a query string to reduce the complexity of a URL that has several different associations. The following URL, written without a query string, is somewhat complicated and unintuitive to the user:
GET /trainers/12/training/11/zip/75080/city/richardson/state/tx
The following shows a better way to write it, using a query string:
GET /trainers/12/training/11?zip=75080&city=richardson&state=tx
What goes into a query string? There's no hard and fast rule. You can use a Domain Driven Design approach to put the entities in the regular URL before the query string (in this case ‘trainers' and ‘training') and the value objects can be added as a part of the query string (in this case zipcode, city and state). Although value objects are similar to entities in Domain Driven Design, they might not have an ID associated with them. In this case, the address is a value object and assuming our address doesn't have an ID, then the properties of that value object can become the filtering criteria for the query string.
Is this a universal law?
No, it's not. This technique can be helpful in generic cases, but it's ultimately your business that determines the filtering criteria, so it's something that you can deviate from.
I would recommend going up to three levels on the parent-child relationships in the URL. If your URL is still complex, then consider adding a query string.
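To illustrate the split between entities in the path and value-object properties in the query string, here is a small sketch using Python's standard urllib.parse module. The function name `split_request` is an assumption, and the sketch expects the path to alternate resource names and IDs as in the example above.

```python
from urllib.parse import urlsplit, parse_qs


def split_request(url):
    """Split a URL into (resource, id) pairs and a dict of filters."""
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    # Pair up alternating resource names and IDs: trainers/12/training/11
    entities = list(zip(segments[0::2], segments[1::2]))
    # Keep the first value of each query parameter as the filter value.
    filters = {key: values[0] for key, values in parse_qs(parts.query).items()}
    return entities, filters


entities, filters = split_request(
    "/trainers/12/training/11?zip=75080&city=richardson&state=tx"
)
```

Here `entities` holds the two path levels and `filters` holds zip, city and state, which a handler could pass on as filtering criteria.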

5. Error Format

Error formatting can become especially interesting when we don't explicitly define a format. I believe an error format is very important and there's a need to have a particular error format across the board in a company. Let's consider an error format such as the following:
{
    "error_type": "Application Error",
    "error_details": "#500: Internal server error"
}
Although this isn't a bad error format, it's somewhat difficult to use. From the client's point of view, the HTTP error code isn't a separate field and needs to be inferred by splitting the string "error_details" at the ":" and then comparing the result with all possible HTTP codes. This leaves room for error, and such processing is unnecessary and unwanted at the client level. Here's another error format that I like:
{
    "http_code": "401",
    "message": "Authentication needed",
    "internal_error_code": "123",
    "details_url": "http://chanderdhall.com/wiki/errors/123"
}
The property names such as "internal_error_code" can be made shorter, such as "code". They are longer here so that they are self-explanatory. In the example above, 401 is the HTTP error code and 123 is an internal error code, details of which can be found at the details_url. This has the added advantage of helping the user understand the error better.
It's really up to you how many HTTP error codes should be supported. About 10 standard codes should be sufficient. My recommendation is to use HTTP error codes so that the API conforms to the standard. Anytime we see a 200 status code, we know that the request was successful, so it's essential for the user of the API to know the associated HTTP status code.
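The difference between the two formats shows up on the client side. The sketch below contrasts the work needed to get at the HTTP code in each case; both helper names are made up for illustration, and the regular expression assumes the "#500: ..." layout from the first example.

```python
import re


def http_code_from_details(error):
    """First format: the code must be dug out of a free-text string."""
    match = re.match(r"#(\d{3}):", error["error_details"])
    return int(match.group(1)) if match else None


def http_code_from_field(error):
    """Second format: the code is its own field and is read directly."""
    return int(error["http_code"])
```

The first helper quietly returns None as soon as the text layout changes, which is exactly the kind of fragility a dedicated field avoids.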
I hope you liked Part 1 of best practices for API design. Please let me know if you have any questions by sending me an email. In Part 2, we discuss versioning, pagination, partial responses, formatted results, multiple responses, search, and hypermedia.
