Create a Custom API, or update the parameters and ruleset of an existing one
A Custom API starts either as an extension of an existing Extract API (allowing you to override or correct the data it returns) or as a blank-canvas API that you define entirely with your own rules.
Custom APIs can be created and updated on the dashboard or via API. For a detailed walkthrough, follow the Getting Started with Custom API guide. This doc will focus on some quickstart examples, best practices, and the API reference.
Custom APIs use Diffbot's cloud-based rendering engine and fully execute most page-level JavaScript in order to access Ajax-delivered elements.
Quickstart — Creating a Custom API
This example creates a Custom API that extends the Article API to extract an author field from a blog post page on diffbot.com.
# Python: create a Custom API that extends the Article API with an "author" rule.
import requests

url = "http://api.diffbot.com/v3/custom?token=YOUR_DIFFBOT_TOKEN"

payload = {
    "rules": [
        {
            "name": "author",
            "selector": ".mb-1.text-dark strong"
        }
    ],
    "api": "/api/article",
    "urlPattern": "(http(s)?://)?(.*\\.)?www.diffbot.com.*",
    "testUrl": "http://www.diffbot.com/insights/build-a-sanctions-tracker/"
}

headers = {
    "Content-Type": "application/json",
    "Accept": "application/json"
}

# POST the rules; requests serializes the payload to JSON.
response = requests.request("POST", url, headers=headers, json=payload)
print(response.text)
// JavaScript: the same create request using fetch.
const headers = new Headers();
headers.append("Content-Type", "application/json");
headers.append("Accept", "application/json");

const payload = {
  "rules": [
    {
      "name": "author",
      "selector": ".mb-1.text-dark strong"
    }
  ],
  "api": "/api/article",
  "urlPattern": "(http(s)?://)?(.*\\.)?www.diffbot.com.*",
  "testUrl": "http://www.diffbot.com/insights/build-a-sanctions-tracker/"
};

const requestOptions = {
  method: "POST",
  headers: headers,
  body: JSON.stringify(payload),
  redirect: "follow"
};

fetch("http://api.diffbot.com/v3/custom?token=YOUR_DIFFBOT_TOKEN", requestOptions)
  .then((response) => response.text())
  .then((result) => console.log(result))
  .catch((error) => console.error(error));
# cURL: the same create request from the command line.
curl --location --globoff 'http://api.diffbot.com/v3/custom?token=YOUR_DIFFBOT_TOKEN' \
  --header 'Content-Type: application/json' \
  --header 'Accept: application/json' \
  --data '{
    "rules": [
      {
        "name": "author",
        "selector": ".mb-1.text-dark strong"
      }
    ],
    "api": "/api/article",
    "urlPattern": "(http(s)?://)?(.*\\.)?www.diffbot.com.*",
    "testUrl": "http://www.diffbot.com/insights/build-a-sanctions-tracker/"
  }'
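Before sending a payload like the ones above, it can help to confirm locally that the urlPattern actually matches the testUrl. The following is a minimal sketch, not part of the Diffbot API: `pattern_matches` is a hypothetical helper, and it assumes the pattern is applied to the whole URL with semantics close to Python's `re` module. Diffbot's own regex engine and matching mode may differ, so treat a local match as a sanity check only.

```python
import re

# Values copied from the quickstart payload above.
URL_PATTERN = r"(http(s)?://)?(.*\.)?www.diffbot.com.*"
TEST_URL = "http://www.diffbot.com/insights/build-a-sanctions-tracker/"


def pattern_matches(url_pattern: str, url: str) -> bool:
    """Return True if url_pattern matches the entire url.

    Assumption: whole-URL matching with Python `re` semantics, which
    may not be identical to how Diffbot evaluates urlPattern.
    """
    return re.fullmatch(url_pattern, url) is not None


# Sanity check before submitting the payload.
print(pattern_matches(URL_PATTERN, TEST_URL))
```

Under these assumptions the quickstart pattern only matches URLs that contain www.diffbot.com, so a URL on another host would be rejected by the same check.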
On success, you should receive a JSON response with a hashes attribute. The value of this attribute can be safely ignored or stored for identification. It is not used by any public Diffbot APIs.
{"hashes":["1234567"]}
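If you choose to store the hashes value for identification, a small parsing step like the one below may be enough. This is a sketch under stated assumptions: `extract_hashes` is a hypothetical helper, and it treats any response without a non-empty hashes list as a failure, since the shape of an error response is not described here.

```python
import json
from typing import List


def extract_hashes(response_text: str) -> List[str]:
    """Return the list stored under "hashes" in a create/update response.

    Assumption: a response without a non-empty "hashes" list is treated
    as a failure; the real error format is not documented in this section.
    """
    body = json.loads(response_text)
    hashes = body.get("hashes") if isinstance(body, dict) else None
    if not isinstance(hashes, list) or not hashes:
        raise ValueError(f"no hashes in response: {response_text!r}")
    return [str(h) for h in hashes]


# Example using the sample response shown above.
print(extract_hashes('{"hashes":["1234567"]}'))
```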
Quickstart — Updating a Custom API
To update an existing Custom API, the values of urlPattern and api in the JSON payload must exactly match those of an existing Custom API. If either value is changed, a new Custom API is created instead. Furthermore, the rules object is replaced completely rather than merged with the existing rules.
The following example will edit the quickstart Custom API created above and update the rules object to extract the subheader of the same page.
# Python: update the quickstart Custom API by replacing its rules with a "subheader" rule.
import requests

url = "http://api.diffbot.com/v3/custom?token=YOUR_DIFFBOT_TOKEN"

# "api" and "urlPattern" are unchanged so that the existing Custom API is updated.
payload = {
    "rules": [
        {
            "name": "subheader",
            "selector": "#slice-hero p.lead"
        }
    ],
    "api": "/api/article",
    "urlPattern": "(http(s)?://)?(.*\\.)?www.diffbot.com.*",
    "testUrl": "http://www.diffbot.com/insights/build-a-sanctions-tracker/"
}

headers = {
    "Content-Type": "application/json",
    "Accept": "application/json"
}

response = requests.request("POST", url, headers=headers, json=payload)
print(response.text)
// JavaScript: the same update request using fetch.
const headers = new Headers();
headers.append("Content-Type", "application/json");
headers.append("Accept", "application/json");

const payload = {
  "rules": [
    {
      "name": "subheader",
      "selector": "#slice-hero p.lead"
    }
  ],
  "api": "/api/article",
  "urlPattern": "(http(s)?://)?(.*\\.)?www.diffbot.com.*",
  "testUrl": "http://www.diffbot.com/insights/build-a-sanctions-tracker/"
};

const requestOptions = {
  method: "POST",
  headers: headers,
  body: JSON.stringify(payload),
  redirect: "follow"
};

fetch("http://api.diffbot.com/v3/custom?token=YOUR_DIFFBOT_TOKEN", requestOptions)
  .then((response) => response.text())
  .then((result) => console.log(result))
  .catch((error) => console.error(error));
# cURL: the same update request from the command line.
curl --location --globoff 'http://api.diffbot.com/v3/custom?token=YOUR_DIFFBOT_TOKEN' \
  --header 'Content-Type: application/json' \
  --header 'Accept: application/json' \
  --data '{
    "rules": [
      {
        "name": "subheader",
        "selector": "#slice-hero p.lead"
      }
    ],
    "api": "/api/article",
    "urlPattern": "(http(s)?://)?(.*\\.)?www.diffbot.com.*",
    "testUrl": "http://www.diffbot.com/insights/build-a-sanctions-tracker/"
  }'
On success, you should receive a JSON response with a hashes attribute. As further confirmation that a Custom API was updated (and not created), the value of the hashes attribute will exactly match the hashes value returned when the Custom API was created earlier.
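The hash comparison described above can be automated. The sketch below assumes you kept the raw response text from the original create call; `hashes_of` and `was_update` are hypothetical helper names, and the second hash value in the example is made up purely for illustration.

```python
import json


def hashes_of(response_text):
    """Return the "hashes" list from a response body, or [] if absent."""
    return json.loads(response_text).get("hashes", [])


def was_update(create_response_text, update_response_text):
    """Return True if the second call updated the Custom API from the first.

    Per the description above, an update returns exactly the same hashes
    value as the original create call; a different value suggests that a
    new Custom API was created instead.
    """
    created = hashes_of(create_response_text)
    updated = hashes_of(update_response_text)
    return bool(created) and created == updated


# Same hashes: the existing Custom API was updated.
print(was_update('{"hashes":["1234567"]}', '{"hashes":["1234567"]}'))
# Different hashes (illustrative value): a new Custom API was likely created.
print(was_update('{"hashes":["1234567"]}', '{"hashes":["7654321"]}'))
```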
Note that the updated Custom API will no longer extract an author attribute, because the rule for it was not included in the new rules object.
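Because the rules object is replaced wholesale, one way to keep the author rule while adding the subheader rule is to send both in the same payload. The sketch below is one possible approach, not a Diffbot feature: `merge_rules` is a hypothetical helper, and it assumes you keep your own local copy of the existing rules and that each rule is identified by its name.

```python
def merge_rules(existing_rules, new_rules):
    """Combine two rule lists keyed by "name".

    A rule in new_rules replaces an existing rule with the same name;
    all other existing rules are kept, in their original order.
    """
    merged = {rule["name"]: rule for rule in existing_rules}
    for rule in new_rules:
        merged[rule["name"]] = rule
    return list(merged.values())


# Rules taken from the two quickstart examples above.
author_rule = {"name": "author", "selector": ".mb-1.text-dark strong"}
subheader_rule = {"name": "subheader", "selector": "#slice-hero p.lead"}

# Payload that keeps the author rule and adds the subheader rule.
# "api" and "urlPattern" stay unchanged so the existing Custom API is updated.
payload = {
    "rules": merge_rules([author_rule], [subheader_rule]),
    "api": "/api/article",
    "urlPattern": "(http(s)?://)?(.*\\.)?www.diffbot.com.*",
    "testUrl": "http://www.diffbot.com/insights/build-a-sanctions-tracker/",
}
```

Posting this payload in place of the one in the update example would then send both rules in a single request.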