ARTG2260: Programming Basics

Working with APIs

Resources

Shiffman - 10.1: Introduction to Data and APIs in JavaScript - p5.js
Shiffman - 10.2: What is JSON? Part I - p5.js
Shiffman - 10.3: What is JSON? Part II - p5.js
Shiffman - 10.4: Loading JSON data from a URL (Asynchronous Callbacks!) - p5.js
Shiffman - 10.5: Working with APIs in Javascript - p5.js
Shiffman - 10.7: API Query with JavaScript setInterval() - p5.js

Additional Resources:
Shiffman - 10.8: Wordnik API and Javascript - p5.js
Shiffman - 10.9: New York Times API and Javascript - p5.js
Shiffman - 10.10: Giphy API and Javascript - p5.js
Shiffman - Coding Challenge #57: Mapping Earthquake Data - p5.js
Shiffman - Coding Challenge #75: Wikipedia API - p5.js

Data Sources

Data.gov
Census.gov
Dataverse Network
Climate Data Sources
Climate Station Records
CDC Data (Disease Control and Prevention)
World Bank Catalog
Free SVG Maps
UK Office for National Statistics
StateMaster
Quandl

Working with APIs

An API (Application Programming Interface) is an interface through which one application can access the services of another. These can come in many forms. Wikipedia's API offers its data in JSON, XML, and HTML formats.

We can use data that's already on our domain, and loadJSON() will work pretty fast. We can also load JSON files from other domains, but we can't control how long that will take to load. Try loading this example from the USGS Earthquake API dataset. Loading the data directly from the USGS domain takes longer than loading it from your own domain, but this process is still important -- it's the first step towards working with APIs.

Now try clicking this URL, which tells English Wikipedia's web service API to send you the content of the main page: https://en.wikipedia.org/w/api.php?action=query&titles=Main%20Page&prop=revisions&rvprop=content&format=jsonfm&formatversion=2 . The key element that makes this service an API is exactly that offer; the Wikipedia API's sole purpose in life is to offer you its data. And not just to offer it, but to let you query it for specific data in a specific format.

For any API, we'll need to look up the documentation to understand how it expects queries to be formatted. Wikipedia's API documentation isn't the best, but it isn't the worst either. Learning how to use a particular API can be tedious and requires a lot of technical reading. The properties page is a good place to start. Let's look at a short list of sample queries.

https://en.wikipedia.org/w/api.php?action=query&titles=San_Francisco&prop=images&imlimit=20&format=jsonfm
A request for a list of images on the page for "San Francisco".

https://en.wikipedia.org/w/api.php?action=query&prop=links&format=json&titles=pizza
The links prop returns all of the links on a page.

https://en.wikipedia.org/w/api.php?action=query&prop=extracts&format=json&titles=pizza
The extracts prop returns the article's text, truncated. You can add the exintro parameter to limit the return to only the article introduction. For finer control, specify a maximum number of characters (exchars=) or sentences (exsentences=) to return.

A query consists of the API root followed by a query string. The query string starts with ? and lists key=value pairs separated by &, in the form ?key=value&key=value&key=value .
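As a rough sketch of that format, JavaScript's built-in URLSearchParams can assemble the query string from an object of key/value pairs. The helper name buildWikiUrl is made up for illustration; the root and the parameters are the ones from the sample queries above.

```javascript
// API root used by the sample queries above.
const WIKI_ROOT = 'https://en.wikipedia.org/w/api.php';

// buildWikiUrl is a hypothetical helper, not part of p5.js or the Wikipedia API.
// It joins the API root and a query string of the form ?key=value&key=value.
function buildWikiUrl(params) {
  const query = new URLSearchParams(params).toString();
  return WIKI_ROOT + '?' + query;
}

const url = buildWikiUrl({
  action: 'query',
  prop: 'links',
  format: 'json',
  titles: 'pizza'
});
console.log(url);
```

URLSearchParams also escapes characters that aren't allowed in a URL (a space becomes +), which is easy to forget when gluing strings together by hand.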

In the example below, we can use the preload() function to load data, which of course ensures that our data is loaded before beginning the program.
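A minimal sketch of that pattern, assuming a local file at assets/characters.json (the same placeholder path the later examples on this page use). Inside preload(), the result of loadJSON() can be assigned directly to a variable, because p5 waits for the load to finish before calling setup().

```javascript
let characters; // filled in by preload()

function preload() {
  // p5 holds off on setup() until every load call made here has finished.
  characters = loadJSON('assets/characters.json'); // assumed path
}

function setup() {
  noCanvas();
  print(characters); // the data is already available here
}
```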

But what if we have an application that wants to display a continuous stream of data? We can automate API calls to keep retrieving data well into the future, but we can't use preload() to load data that doesn't exist yet. Instead, we'll need to load data repeatedly (say, every 30 seconds), and we don't want our program to freeze up every time we do. That's where asynchronous callbacks come in. Asynchronous means "not happening at the same time": the request runs in the background, so our application doesn't have to wait while the data comes in. Once the data finally loads, a callback function is executed.

The syntax for doing this is:


    loadJSON('path/to/file.json', gotData);

Where gotData is the name of the callback function. Accordingly, we should have a callback function:


    function gotData(data) {
      // do something with the data
    }

In this case, data is a parameter of the gotData function. We never call gotData ourselves; p5 calls it once loadJSON() has finished and passes in the loaded data as the argument. We can then use that variable within the scope of the callback function.


    function setup() {
      noCanvas();
      loadJSON('assets/characters.json', gotData);
    }

    function gotData(data) {
      print(data);
    }

This syntax is specific to p5.js, but this concept of asynchronous loading applies to almost any JavaScript environment -- jQuery, Angular, etc. all have their own ways of doing the same thing.

We can also load JSON files from other domains in the same way. In some cases, passing 'jsonp' as a third argument to loadJSON() lets you get around CORS errors, though only if the server supports JSONP.

We can now make a very simple visualization of this data. Notice that we put the drawing code in the gotData() function, because the data we need to visualize is in that scope, not in setup().
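A minimal sketch of drawing inside the callback. It assumes the loaded JSON has a numeric number field (the property named in the error message discussed further down) and uses a made-up file path; circleXs is a hypothetical helper, not a p5 function.

```javascript
// Hypothetical helper: x positions for `count` circles spread evenly across `w` pixels.
function circleXs(count, w) {
  const xs = [];
  for (let i = 0; i < count; i++) {
    xs.push((i + 0.5) * (w / count));
  }
  return xs;
}

function setup() {
  createCanvas(400, 100);
  loadJSON('assets/space.json', gotData); // assumed path; any JSON with a `number` field works
}

function gotData(data) {
  // The drawing happens here because `data` only exists inside this function.
  background(0);
  for (const x of circleXs(data.number, width)) {
    ellipse(x, height / 2, 16, 16);
  }
}
```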

But there are limitations to this. The gotData function only runs once. What if we want to use the data in the draw() function? Currently we can't, because we can't access that variable from the scope of draw(). Storing the data in a global variable (call it spaceData) inside gotData() and reading it in draw() might seem like a good solution.

But, oh no, we get an error. Cannot read property 'number' of undefined? What does that mean? This has to do with asynchronous loading. We got to the draw() function before data had a chance to load. We can address this by adding a condition to check if spaceData exists yet:
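One way to sketch that check, assuming a JSON file with a numeric number field at a made-up path. spaceData is declared globally so that draw() can see it, and draw() skips the drawing until the callback has filled it in.

```javascript
let spaceData; // global, so gotData() and draw() can both reach it; undefined until loaded

function setup() {
  createCanvas(400, 100);
  loadJSON('assets/space.json', gotData); // assumed path
}

function gotData(data) {
  spaceData = data;
}

function draw() {
  background(0);
  // draw() starts running before the data arrives, so check first.
  if (spaceData) {
    for (let i = 0; i < spaceData.number; i++) {
      ellipse(20 + i * 30, height / 2, 16, 16);
    }
  }
}
```

Without the if, the first frames would try to read spaceData.number while spaceData is still undefined, which is exactly the error above.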

APIs

Most APIs require authentication in the form of an API key. That's because when companies give out their data, they want you to identify yourself. Wikipedia's API is an exception. A lot of cool APIs that I'd like to use as examples (like OpenWeatherMap) take a little while to process authentication.

Try navigating to this link in your browser:

https://en.wikipedia.org/w/api.php?action=query&prop=links&format=json&titles=pizza

You should hopefully see a lot of data in a format that is hard to read. You can head to a JSON formatter and paste the data in, or install a browser extension that does the same thing. Now let's do the same thing in code:

On closer inspection, we can see that the actual array of links is buried in the hierarchy: wikiData.query.pages[24768].links . Wikipedia has an interesting data structure here: pages is not an array but an object whose keys are page IDs, so pages[24768] looks up the page with ID 24768. It happens to be that our query for 'pizza' returned a hit for page 24768, "Pizza". Let's render the array of links as text.
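A sketch of pulling the link titles out of that response. The sample object is a hand-written stand-in shaped like the API's reply (the link titles are illustrative, not real output), and linkTitles is a hypothetical helper. Because the page ID is different for every article, the helper takes whichever page comes first instead of hard-coding 24768.

```javascript
// Hypothetical helper: returns the titles of all links on the first page in the response.
function linkTitles(wikiData) {
  const pages = Object.values(wikiData.query.pages); // pages is keyed by page ID
  const links = pages[0].links || [];                // guard in case the key is missing
  return links.map((link) => link.title);
}

// Hand-written sample in the shape of the API response (not real data).
const sample = {
  query: {
    pages: {
      24768: {
        pageid: 24768,
        title: 'Pizza',
        links: [{ ns: 0, title: 'Cheese' }, { ns: 0, title: 'Tomato' }]
      }
    }
  }
};

console.log(linkTitles(sample)); // logs the two sample titles

// In a p5 sketch, the callback could render each title as a line of text:
function gotData(wikiData) {
  const titles = linkTitles(wikiData);
  for (let i = 0; i < titles.length; i++) {
    text(titles[i], 10, 20 + i * 16);
  }
}
```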

We can use JavaScript's setInterval() function to call an API at a regular interval. It takes two parameters: the function to call and the interval (in milliseconds) at which to call it. In this case, let's use the New York Times API (it's probably okay to just use my API key for now). We want to get an array of the most recent articles, pick one at random, and print its headline. The request will look like https://api.nytimes.com/svc/search/v2/articlesearch.json?q=processing&sort=newest&api-key=13bb404ad62a4606beabaab654e18cfc . We can paste the response into a JSON formatter to better understand it. How can we extract a random headline?

Solution: To access a headline, we first select a random article from the array at newsData.response.docs, which holds 10 articles (docs) by default. We can then read the headline using newsData.response.docs[i].headline.main , where i is the index of the chosen article.
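Put together, the polling and the random pick might look like the sketch below. randomHeadline and askNYT are hypothetical names, the API key is a placeholder, and the 30-second interval is only an example value.

```javascript
// Hypothetical helper: pick one article at random and return its headline.
// `rng` defaults to Math.random; it is a parameter only so the choice can be made predictable.
function randomHeadline(newsData, rng = Math.random) {
  const docs = newsData.response.docs;
  const i = Math.floor(rng() * docs.length);
  return docs[i].headline.main;
}

const url = 'https://api.nytimes.com/svc/search/v2/articlesearch.json' +
  '?q=processing&sort=newest&api-key=YOUR_API_KEY'; // placeholder key

function setup() {
  noCanvas();
  askNYT();                   // first request right away
  setInterval(askNYT, 30000); // then again every 30 seconds (30000 ms)
}

function askNYT() {
  loadJSON(url, gotData);
}

function gotData(newsData) {
  print(randomHeadline(newsData));
}
```

Note that setInterval() is given the function itself (askNYT), not the result of calling it (askNYT()).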

The following examples (adapted from Daniel Shiffman) outline dealing with other APIs. The JSON served at http://api.open-notify.org/iss-now.json is continuously updated with the current position of the International Space Station. This example uses the map() function to convert latitude and longitude values to canvas coordinates. We can adjust the parameters of the map() function so that we can actually see it moving.
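A rough sketch of the coordinate conversion, assuming the response has the shape { iss_position: { latitude, longitude } } with both values given as strings. mapRange spells out the same linear formula as p5's map() so the conversion can be checked on its own; issToCanvas and askISS are hypothetical names, and the one-second interval is an arbitrary choice.

```javascript
// Same linear formula as p5's map(value, start1, stop1, start2, stop2).
function mapRange(value, start1, stop1, start2, stop2) {
  return start2 + ((value - start1) / (stop1 - start1)) * (stop2 - start2);
}

// Hypothetical helper: convert the ISS position to canvas coordinates.
function issToCanvas(issData, w, h) {
  const lat = Number(issData.iss_position.latitude);
  const lon = Number(issData.iss_position.longitude);
  return {
    x: mapRange(lon, -180, 180, 0, w), // west edge -> 0, east edge -> w
    y: mapRange(lat, 90, -90, 0, h)    // north pole -> 0, south pole -> h
  };
}

function setup() {
  createCanvas(600, 300);
  setInterval(askISS, 1000); // poll once per second
}

function askISS() {
  // If the browser reports a CORS error, 'jsonp' can be tried as a third argument.
  loadJSON('http://api.open-notify.org/iss-now.json', gotData);
}

function gotData(issData) {
  const pos = issToCanvas(issData, width, height);
  background(0);
  ellipse(pos.x, pos.y, 12, 12);
}
```

Narrowing the input ranges (for example, mapping only a few degrees of longitude across the full canvas width) magnifies the motion, which is what makes the station visibly move.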

Wordnik API

The Wordnik API is great for giving us information about word usage.

Giphy API

The Giphy API is great for finding GIFs! We need to run this on its own webpage so that it can render the GIFs as new HTML image elements. That's because p5 draws on an HTML canvas, which doesn't like GIFs. We also need to add the p5.dom library, which provides createImg(). We'll talk about that more next week. You can run the following as a complete example.

  
<!doctype html>
<html>
<head>
<meta charset="utf-8" />

<script src="https://cdnjs.cloudflare.com/ajax/libs/p5.js/0.6.0/p5.js"></script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/p5.js/0.6.0/addons/p5.dom.min.js"></script>
</head>
<body>

<script>
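// Giphy search endpoint; the pieces below are joined into one query string.
let api = "https://api.giphy.com/v1/gifs/search?";
let apiKey = "api_key=dc6zaTOxFJmzC";
let query = "&q=rainbow";

function setup() {
  let url = api + apiKey + query;
  loadJSON(url, gotData);
}

// Called by p5 once the search results have loaded.
function gotData(giphy) {
  // Add one <img> element to the page per result.
  for (let i = 0; i < giphy.data.length; i++) {
    createImg(giphy.data[i].images.original.url);
  }
}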
</script>
</body>
</html>