APIs

Lassi Haaranen, edited by Petri Ihantola

licensed under a Creative Commons Attribution 4.0 International license.

API Introduction

Not all web applications are meant directly for people. Different web services provide useful information primarily aimed for other applications to consume. Some examples are things like authentication, stock data, weather reports, Twitter feeds, etc.

These programmatic interfaces might not have a GUI but instead are meant to be used by other programs on client-side as well as on server-side.

Typically they are created in a way that allows the consumers to use different operating systems or languages.

Web Services

Most commonly they transfer the data through HTTP.

Three of commonly used options for web services:

  • SOAP Simple Object Access Protocol
  • XML-RPC XML (encoded) Remote Procedure Call
  • REST Representational State Transfer (probably most common nowadays?)

SOAP

Simple Object Access Protocol (SOAP) is an XML-based messaging protocol where client sends requests encoded in XML and gets reponses back as XML. It can use other protocols in addition to HTTP, such as SMTP.

Web Services Description Language (WSDL) is XML-based language used to describe the data/functionality offered by a web service. It's commonly used with SOAP.

You should not need to write SOAP yourself, let the API/Toolkit handle that for you.

SOAP Example

A call in some system like this:

...getFirstName("tteekkar")
        

Would be expressed in SOAP like this:

<SOAP-ENV:Envelope
  xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/" 
  SOAP-ENV:encodingStyle= 
  "http://schemas.xmlsoap.org/soap/encoding/">
  <SOAP-ENV:Body>
    <mtd:GetFirstName xmlns:mtd="an URI"
      <username xsi:type="xsd:string">
        tteekkar
      </username>
    </mtd:GetFirstName>
  </SOAP-ENV:Body>
</SOAP-ENV:Envelope>

XML-RPC

Remote Procedure Call (RPC) generally refers to collections of techniques to call a procedure typically on another computer. XML-RPC is a specification that is SOAP's predecessor.

Procedure calls, parameters and responses are encoded in XML. It's simpler than SOAP and doesn't have a WSDL.

Representational State Transfer

In RPCs the focus is on method. In Representational State Transfer (REST) the focus is on resources that can be accessed and acted upon in a standard way.

Roy Fielding described REST in his dissertation 'Architectural Styles and the Design of Network-based Software Architectures'.

Anything that is important enough to be separately referenced/ modified/retrieved is a resource that has a reference. Universal Resource Identifier (URI) is the reference. They should be constructed in an intuitive (human readable) manner and reflecting the underlying data structure.

Programmable Web

The web is full of different APIs, a good site to start looking for what is out there is programmableweb.com.

As an example New York Times' Most shared sports items from past 7 days.

RESTful Web Services

Resources are any type of data that can be pointed to with a hyperlink: images, documents, web services, etc.

Data is encoded in a way that it can represent the state of the resource and resources are referenced by URIs.

REST is a pattern, a way of doing things that is not formally codified.

The REST portion of this lecture is influenced by Beautiful REST & JSON APIs by Les Hazlewood (definitely worth watching if you need to create an API).

REST Buzzword Bingo

Scalability, Generality, Independence, Latency (caching), Security, Encapsulation, Separation of concerns, Statelessness, Uniform interface

REST and JSON

REST APIs often use JSON since it offers many benefits:

  • ubiquity
  • simplicity
  • readability
  • flexibility
  • etc...

However, XML/plaintext/html etc. is also used.

REST: Resources

In RPC, resources can be thought of as verbs. In REST, resources should be nouns.

Broadly speaking there are two types of resources:

  • Collection resource e.g. /blogposts/2014 These link and list multiple instance resources. And they might also have their own attributes (e.g. 'first', 'latest')
  • Instance resource e.g. /blogposts/rant-about-rest These resources describe one instance of something

Resources Example

Example of an instance resource:

// from: http://some-library-service/isbn/978-0596517748
{
  "title": "JavaScript: The Good Parts",
  "author": "Douglas Crockford",
  "published": "2008",
  "isbn": "978-0596517748"
  "href": "http://some-library-service/isbn/978-0596517748"
}
// from: http:/some-library-service/keyword/javascript
{
  "keywords": ["javascript"],
  "books": [ 
    {
      "href": "http://some-library-service/isbn/978-0596517748",
      "isbn": "978-0596517748"
    }, {
      "href": "http://some-library-service/isbn/978-1933988696",
      "isbn": "978-1933988696"
    },
    ...
  ]
}

REST: Statelessness

Any individual request should be understandable without knowledge of any preceding requests.

This also means that requests can be interpreted and acted upon individually, which makes parallel processing easier.

All required data needs to be present in the request itself.

REST: Addressability

Addressability (every resource has a unique identifier) gives the major benefit of using resources in unforeseen ways.

Related to this it is a good idea to version your APIs, for example:

http://my-cool-site.com/api/v1/some/stuff

The future is unknown, and you might need to change the API for one reason or another. Versioning enables you to leave the old behavior in place so those programs that depend on that version will not be broken.

REST: Connectedness

The idea of connectedness is that resources link to other resources

Following links traverses the API uncovering more information and structure.

REST: Uniform Interface

All the resources share a common uniform interface. Instead of doing something (URLs as a "method") set something (URLs as "variables").

Four main constraints of uniform interface:

Take advantage of HTTP "verbs":

  • GET fetch resource
  • PUT create/replace resource
  • DELETE remove resource
  • POST add/modify resource

POST is not idempotent, so it can be used for partial updates.

Browsers do support PUT and DELETE - HTML doesn't.

PUT and DELETE request can be initiated via Javascript (AJAX), but not via HTML form submission.

Uniform Interface Principles

  • Identification of Resources: Request identifies the resource, data could be returned in number of formats.
  • Manipulation of resources through representations: If a client has the resources, it knows enough to modify/manipulate it
  • Self-descriptive messages: Each message contains enough information on how to process it (what is its mime-type etc.)
  • HATEOAS

Wikipedia has a nice article about REST.

HATEOAS

Hypertext As The Engine Of Application State (HATEOAS).

"Clients make state transitions only through actions that are dynamically identified within hypermedia by the server (e.g., by hyperlinks within hypertext). Except for simple fixed entry points to the application, a client does not assume that any particular action is available for any particular resources beyond those described in representations previously received from the server." Wikipedia - REST (emphasis mine)

Still not widely used

HATEOAS Challenges

There is no standard way for describing the relationships of resources in RESTful interfaces. Couple of proposals:

  • HAL specification http://stateless.co/hal_specification.html
  • JSON API http://jsonapi.org/

REST: Good Ideas

API versioning. You might need to change it in the future. If you version, you will not break existing API consumers.

some-domain.com/api/v1/resources/item/775

Standard timeformats. Don't invent your own format for expressing times, ISO 8601 is a better option.

"time": "D2014/07/08T11:23pm" // Do NOT do this
"time": "2014-11-13T17:45:11.647Z" //Do this

Pagination. Use offset and limit, and use them consistently.

/api/v1/books/2014/?offset=50&limit=25

REST: Reference Expansion

Reference expansion (a.k.a. link expansion or entity expansion) is also a good idea: some/resource?expand=attribute+other_attribute.

// from: http:/some-library-service/keyword/javascript?expand=title
{
  "keywords": ["javascript"],
  "books": [ 
    {
      "href": "http://some-library-service/isbn/978-0596517748",
      "isbn": "978-0596517748",
      "title": "JavaScript: The Good Parts"
    }, {
      "href": "http://some-library-service/isbn/978-1933988696",
      "isbn": "978-1933988696"
      "title": "Secrets of the JavaScript Ninja"
    },
    ...
  ]
}

REST service composition

REST services can also invoke other (REST) services.

  • The idempotence and safeness must be maintained, i.e., an idemponent (safe) request should not cause a non-idemponent (non-safe) request to subservice

The stateless constraint somewhat limits service composition (creating task services)

  • Model a long-running task as a resource
    POST /pizzaOrder instead of POST /pizza
  • Or POST /pizza and return 202 In progress (STATE?)

REST Many-to-Many

Many-to-many mappings should be resources as well.

For example, consider API describing groups and projects that can have many-to-many relations.

/* A relationship between a group and a project */
{
  "group": {
    "id": 7,
    "href": "http://..."
  },
  "project": {
    "id": 3,
    "href": "http://..."
  },
  "createdBy": "someUserId",
  "created": "2014-10-28"
}

Deleting this resource will only remove the link, groups and projects will stay intact.

REST Security

Avoid session based authentication; keeping up with the principle of statelessness, prefer to authenticate every request. Also prefer API Keys (like in GitLab) instead of username/password.

Authorize based on the content of the resource, not URLs which might change.

OAuth is a good option.

HTTP and Caching

HTTP has built-in support for caching. HTTP headers contain metadata about fetched resources and can be used to make condiditonal requests.

Conditional GET requests are only executed, when a certain condition is fulfilled, e.g. If-Modified-Since is only executed if a resource has been modified since the given date.

Images, JS libraries, CSS files, static pages, etc. should stay in cache as long as possible. Similarly, on rapidly changing pages, caching should be disabled or expire fast.

Caching is also really beneficial for REST APIs.

Caching Metadata

ETag is calculated from the resource contents. If it has not changed, the resource hasn't changed either.

Expires informs when the resource expires and should be fetched again.

Last-modified informs when the resource was last modified on the server.

API Etiquette

Almost all APIs have some sort of usage policy. Typically the number of requests you make in a given time frame is limited, e.g. maximum of 10 requests in a minute.

Before you start utilizing some API in your web app, make sure you adhere to the usage policies.

Some challenges of REST

  • Transactions
    Manipulating multiple resources; commit iff all successful
  • Asynchronous tasks and batch processing
  • Links between resources

Thank you