Makers Festival: Working with APIs

by Mubs
June 7th, 2019

While I won't be writing a lot about the development of ForSaleByMaker, I wanted to share some thoughts about working with APIs, since that is the focus of the Makers Festival.

The vast majority of projects I build these days use an API of some kind, whether for validation, for data, or to provide a service I don't want to build myself.

In recent projects I've used APIs to collect details of soccer matches, posts on social media, and logs of financial transactions.

In building ForSaleByMaker, I'll be using Product Hunt's new API to collect data on makers and the products they've built and posted on Product Hunt. That data will be used to build and publish listings on ForSaleByMaker.

When working with APIs there are several things to keep in mind as you build your MVP.

Rate Limits

Almost all APIs have some kind of rate limit that defines how often you can query data.

In most cases this won't be a problem (unless your project becomes very popular), but when it is, you'll need to build ways into your MVP to monitor how close you're getting to these limits and to wait when necessary.

When building 500Makers, for example, I needed to pull down essentially the entire list of products and makers from Product Hunt. To ensure I didn't hit the rate limit, I added a delay after each query of products so that I wouldn't go over my allowed number of requests.
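As a rough sketch of that approach, assuming a paginated endpoint (the URL, parameter names, and response shape here are placeholders, not Product Hunt's actual API):

```python
import time
import requests

API_URL = "https://api.example.com/posts"  # placeholder endpoint, not the real Product Hunt URL
DELAY_SECONDS = 2  # a conservative pause between pages

def fetch_all_pages(token):
    """Fetch every page of results, pausing between requests to stay under the rate limit."""
    posts, page = [], 1
    while True:
        response = requests.get(
            API_URL,
            headers={"Authorization": f"Bearer {token}"},
            params={"page": page},
        )
        response.raise_for_status()
        batch = response.json().get("posts", [])
        if not batch:
            break  # no more pages left
        posts.extend(batch)
        page += 1
        time.sleep(DELAY_SECONDS)  # wait before the next query so we don't exceed the limit
    return posts
```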

With the new GraphQL API, this kind of fixed delay becomes more difficult to implement, as different queries now affect your rate limit in different ways. To combat this problem, I started using information provided by the API itself.

Each response the Product Hunt API sends includes three rate-limit headers. These headers tell you how many complexity points (or requests) you have remaining, and when the rate limit will reset.

Using this data, I can delay the next request just long enough to avoid exceeding the rate limit.
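Here's a minimal sketch of that header-driven approach. The header names (X-Rate-Limit-Remaining and X-Rate-Limit-Reset) match my reading of the Product Hunt v2 API docs at the time; verify them against the current documentation:

```python
import time
import requests

GRAPHQL_URL = "https://api.producthunt.com/v2/api/graphql"

def throttled_query(query, token, min_remaining=50):
    """Run a GraphQL query, then sleep until the limit resets if we're nearly out of points."""
    response = requests.post(
        GRAPHQL_URL,
        json={"query": query},
        headers={"Authorization": f"Bearer {token}"},
    )
    response.raise_for_status()
    # Header names as documented for the Product Hunt v2 API; check the current docs.
    remaining = int(response.headers.get("X-Rate-Limit-Remaining", min_remaining))
    reset_in = int(response.headers.get("X-Rate-Limit-Reset", 0))  # seconds until the window resets
    if remaining < min_remaining:
        time.sleep(reset_in)  # wait out the window instead of burning the last few points
    return response.json()
```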

Realtime Data

In some cases you need real-time information in your project; other times, delayed information is just fine.

In some instances you can write a process that runs at a specified time and collects the data you need, which is then used until the next time the process runs. This is what happens in 500Makers: the data is collected once per day, and the leaderboard is built and displayed from it. The leaderboard doesn't update in real time, so I added a note to the page indicating the day the data was collected.
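As a sketch of that batch pattern (the file name, cron schedule, and data shape are all illustrative, not 500Makers' actual setup):

```python
# daily_refresh.py -- run once a day, e.g. via cron: 0 6 * * * python daily_refresh.py
import json
from datetime import date

def collect_data():
    """Placeholder for the API calls that pull down products and makers."""
    return {"products": [], "makers": []}  # illustrative only

def main():
    snapshot = collect_data()
    snapshot["collected_on"] = date.today().isoformat()  # shown on the page as the data's date
    with open("leaderboard_data.json", "w") as f:
        json.dump(snapshot, f)

if __name__ == "__main__":
    main()
```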

In other situations you need more up-to-date information. In reality, unless it's user-specific information, you'll probably still need to implement some kind of delay because of the rate limits discussed above.

When I built Product Hunt Daily River, which displays products submitted to Product Hunt in chronological (rather than popularity) order, I needed the list to update throughout the day. In this case I request today's products from Product Hunt once every minute.
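The polling version of that is just a loop with a pause; a minimal sketch, with fetch_todays_posts and render standing in for the real API call and page update:

```python
import time

def run_river(fetch_todays_posts, render):
    """Poll for today's posts once a minute and re-render the list."""
    while True:
        render(fetch_todays_posts())  # fetch_todays_posts wraps the actual API request
        time.sleep(60)
```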

In other cases you genuinely need to pull data in real time, and ForSaleByMaker is one of them: in particular, the list of products a user has featured on Product Hunt. Before a product can be listed, we need to verify that the user is a maker of that product, and we do this in real time to ensure we get an accurate representation of the data on Product Hunt.
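A sketch of that real-time check, assuming the v2 GraphQL schema exposes a user's made posts roughly as below (the user and madePosts field names are my reading of the schema, and pagination past the first page is omitted for brevity):

```python
import requests

GRAPHQL_URL = "https://api.producthunt.com/v2/api/graphql"

# Field names (user, madePosts) reflect the v2 schema as I understand it; verify against the docs.
MAKER_QUERY = """
query MadePosts($username: String!) {
  user(username: $username) {
    madePosts(first: 50) {
      edges { node { id name } }
    }
  }
}
"""

def is_maker_of(username, product_name, token):
    """Check in real time whether the given user is listed as a maker of the product."""
    response = requests.post(
        GRAPHQL_URL,
        json={"query": MAKER_QUERY, "variables": {"username": username}},
        headers={"Authorization": f"Bearer {token}"},
    )
    response.raise_for_status()
    edges = response.json()["data"]["user"]["madePosts"]["edges"]
    return any(edge["node"]["name"] == product_name for edge in edges)
```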

Data Storage

When you query data from an API, where does that data live, if it lives anywhere at all?

In some cases you query the data, manipulate it, show it to the user, and then discard it.

In other cases you want to query data now but use it later, so you may wish to store it in some fashion.

Exactly how you store that data is determined by what you need it for.

In some cases I'm storing the data just so I don't need to query the API again (those pesky rate limits usually dictate this), especially if I don't think it will change very often. In that case how the data is stored isn't very important; usually some kind of caching mechanism that your framework of choice provides is enough.
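As a minimal illustration, here's a hand-rolled in-memory cache with a time-to-live; in a real project your framework's caching layer would play this role:

```python
import time

_cache = {}

def cached_fetch(key, fetch_fn, ttl_seconds=3600):
    """Return a cached value if it's still fresh; otherwise call the API and cache the result."""
    entry = _cache.get(key)
    if entry and time.time() - entry["fetched_at"] < ttl_seconds:
        return entry["value"]
    value = fetch_fn()  # the actual API call, only made when the cache is stale
    _cache[key] = {"value": value, "fetched_at": time.time()}
    return value
```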

In other instances, however, I want to be able to query, filter, and sort the data. This is very much the case with 500Makers. There, all the data I query is saved to a database in a well-structured way. This allows me to build the leaderboard and query the data to find products built by specific users in specific years. I'm also able to build variations of the leaderboard, for example ranked by number of votes instead of number of products.
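To make that concrete, here's a sketch of the structured approach using SQLite; the schema and queries are illustrative, not 500Makers' actual ones:

```python
import sqlite3

conn = sqlite3.connect("makers.db")
conn.execute("""
    CREATE TABLE IF NOT EXISTS products (
        id INTEGER PRIMARY KEY,
        name TEXT,
        maker TEXT,
        year INTEGER,
        votes INTEGER
    )
""")

# Products built by a specific maker in a specific year.
by_maker_and_year = conn.execute(
    "SELECT name FROM products WHERE maker = ? AND year = ?", ("mubs", 2019)
).fetchall()

# A leaderboard variation: rank makers by total votes instead of product count.
by_votes = conn.execute(
    "SELECT maker, SUM(votes) AS total_votes FROM products "
    "GROUP BY maker ORDER BY total_votes DESC LIMIT 500"
).fetchall()
```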

Since I was storing all this information, I was able to build other related projects. 500Hunters, for example, uses the same data queried in a different way.


I'm in the midst of building ForSaleByMaker at the moment; the next post will discuss hosting the site!
