Code examples for bulk requests: Pseudocode, PHP, Ruby, Java

Scrape that site, they said. It will be fun, they said.

If you have tried scraping yourself, or your company is currently doing it, you might have noticed that getting first results is not that hard. Retrieving large quantities regularly and reliably is a different story.

Things can and will go wrong. Crawlers get blocked by firewalls and target sites change their HTML structure. And then you need a fix, asap. Probably on a weekend.

Scraping probably isn't your core business and with priceAPI.com there will be less hassle for you, and your company will even pay less.

Best Practice: Things will go wrong, but it will be fine

Our core business is to extract information reliably. However, there are things beyond our influence: sites will change their structure and block our crawlers — just not as often as others'.

Of course, now we are the ones fixing these asap.

Nevertheless, you have to design your application in a way that is robust when a product or a whole source cannot be accessed for a certain time span.

Let's say you use priceAPI.com for repricing. When we cannot return results for one of your products, you should probably leave the price unchanged, instead of not listing it anymore.
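A minimal sketch of this fallback, assuming a hypothetical pricing_rule callback and a deliberately simple rule (match the lowest offer). Neither is part of priceAPI.com; the sketch only relies on the success flag and the offers list shown in the result format further down.

```python
from decimal import Decimal

def next_price(old_price, product, pricing_rule):
    """Return the price to list; fall back to old_price without a usable result."""
    if not product.get("success") or not product.get("offers"):
        return old_price  # lookup failed or no offers: keep the current price
    return pricing_rule(old_price, product)

def match_lowest(old_price, product):
    # Hypothetical rule for illustration: match the lowest offer price
    return min(Decimal(offer["price"]) for offer in product["offers"])

ok = {"success": True, "offers": [{"price": "284.99"}, {"price": "289.82"}]}
failed = {"success": False, "reason": "failed to access source"}

print(next_price(Decimal("299.00"), ok, match_lowest))      # 284.99
print(next_price(Decimal("299.00"), failed, match_lowest))  # 299.00
```

The product stays listed at its old price until the next successful lookup.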

Bulk Requests (asynchronous)

Bulk requests can gather information for multiple products in parallel. We limit to a maximum of 1,000 products for each bulk request. It is possible to create multiple bulk requests in parallel.

Results for a bulk request can be accessed for 72 hours after the request was issued.

The bulk request process is split into 3 separate HTTP requests:

  1. Post request
  2. Query status
  3. Get results
See code examples: Pseudocode, PHP, Ruby, Java

1. Post “Bulk Request”

This returns a job_id that you need in steps 2 and 3.

Endpoint

HTTP verb
POST
URL
https://api.priceapi.com/jobs

Parameters

Required
token, source, country, key, values (separated with \n)
Optional
completeness, currentness
For bulk requests, the parameter is called values (plural), not value.

Limit

A maximum of 1,000 values can be used in one bulk request.
You can issue multiple bulk requests in parallel.
If you need more, please contact support.
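If you have more than 1,000 values, one way to stay within the limit is to split them into several bulk requests. The helper below is a sketch; the name chunk_values is made up for this example, and the generated values are placeholders rather than valid GTINs.

```python
def chunk_values(values, limit=1000):
    """Split values into lists of at most `limit` entries each."""
    if limit < 1:
        raise ValueError("limit must be at least 1")
    return [values[i:i + limit] for i in range(0, len(values), limit)]

# Placeholder values, only to show the batch sizes
values = [f"{n:014d}" for n in range(2500)]
batches = chunk_values(values)
print([len(batch) for batch in batches])  # [1000, 1000, 500]
```

Each batch can then be posted as its own bulk request.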

Example request via HTTP POST:

curl -X POST https://api.priceapi.com/jobs \
    -d "token=YOUR_TOKEN" \
    -d "source=google-shopping" \
    -d "country=de" \
    -d "key=gtin" \
    --data-urlencode "values=00885909666966
08718108041581"

Example response

{
  "job_id": "1234567890abcdef12345678", // save this id for steps 2 and 3
  "status": "new",
  "requested": 2                        // Number of different values
}
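The same request body can be built in code. This sketch uses only the Python standard library; the function name build_job_body is an assumption of this example. It joins the values with a newline and form-encodes all parameters, so the newline is sent as %0A.

```python
from urllib.parse import urlencode

def build_job_body(token, source, country, key, values):
    """Form-encode the parameters of a bulk request; values are joined by newlines."""
    params = {
        "token": token,
        "source": source,
        "country": country,
        "key": key,
        "values": "\n".join(values),  # plural parameter name for bulk requests
    }
    return urlencode(params)

body = build_job_body(
    "YOUR_TOKEN", "google-shopping", "de", "gtin",
    ["00885909666966", "08718108041581"],
)
print(body)
```

The resulting string can be sent as the POST body to https://api.priceapi.com/jobs, for example with urllib.request.urlopen and body.encode() as the data argument.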

2. Query Status

The status query shows the progress of the bulk request.

Endpoint

HTTP verb
GET
URL
https://api.priceapi.com/jobs/job_id

Parameters

Required
token

The response reports the progress of the given bulk request. Possible values for status are: new, enqueued, working, finishing, and finished.
Once the status is finished, the results can be downloaded. Even when progress is 100, it can take up to 1 minute until the job changes its status from working to finished.

Example request via HTTP GET:

https://api.priceapi.com/jobs/1234567890abcdef12345678?token=YOUR_TOKEN

Example responses

{
  "job_id": "1234567890abcdef12345678",
  "status": "new",      // Not scheduled, yet
  "requested": 2,
  "completed": 0,
  "progress": 0         // from 0 to 100
}
{
  "job_id": "1234567890abcdef12345678",
  "status": "enqueued", // Other bulk requests will be processed before this - our queue works FIFO
  "requested": 2,
  "completed": 0,
  "progress": 0
}
{
  "job_id": "1234567890abcdef12345678",
  "status": "working",  // Crawlers are currently working to fulfill this bulk request
  "requested": 2,
  "completed": 1,
  "progress": 50
}
{
  "job_id": "1234567890abcdef12345678",
  "status": "working",
  "requested": 2,
  "completed": 2,
  "progress": 100       // Almost ready, but not finished, yet
}
{
  "job_id": "1234567890abcdef12345678",
  "status": "finishing", // Almost, but not quite...
  "requested": 2,
  "completed": 2,
  "progress": 100
}
{
  "job_id": "1234567890abcdef12345678",
  "status": "finished", // You can now download the results
  "requested": 2,
  "completed": 2,
  "progress": 100
}
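A polling loop for this step might look like the sketch below. It waits for the status finished rather than for progress 100, because a job can sit at 100 for a while. The HTTP call is passed in as a function so the loop itself stays testable; wait_until_finished, the interval, and the timeout are assumptions of this example, not recommendations from priceAPI.com.

```python
import time

def wait_until_finished(fetch_status, interval=5.0, timeout=3600.0, sleep=time.sleep):
    """Call fetch_status() until it reports status "finished" or the timeout passes."""
    waited = 0.0
    while True:
        status = fetch_status()
        if status["status"] == "finished":
            return status
        if waited >= timeout:
            raise TimeoutError("bulk request did not finish in time")
        sleep(interval)
        waited += interval

# Canned responses stand in for real GET requests to /jobs/job_id
canned = iter([
    {"status": "working", "progress": 50},
    {"status": "working", "progress": 100},
    {"status": "finishing", "progress": 100},
    {"status": "finished", "progress": 100},
])
final = wait_until_finished(lambda: next(canned), sleep=lambda seconds: None)
print(final["status"])  # finished
```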

3. Download Results

Once the status query returns finished, the results can be downloaded.

Results for a bulk request can be accessed for 72 hours after the request was issued.

Endpoint

HTTP verb
GET
URL for JSON
https://api.priceapi.com/products/bulk/job_id
URL with format
https://api.priceapi.com/products/bulk/job_id.format

Parameters

Required
token
Optional
format: either json, xml or csv
format_option: only for format csv; either row or column

Example request via HTTP GET:

https://api.priceapi.com/products/bulk/1234567890abcdef12345678?token=YOUR_TOKEN

Example response

{
  "job_id": "1234567890abcdef12345678",
  "status": "finished",
  "free_credits": 0, // free (test) credits used for this request
  "paid_credits": 2, // paid credits used for this request (included or over-usage)
  "products": [      // array including multiple entries
    {
      "source": "google-shopping",
      "country": "de",
      "key": "gtin",
      "value": "00885909666966",    // the value in the same format as you requested it
      "success": true,              // true or false
      "reason": null,               // if success is false, see the section about errors
      "id": "15858468867886681869",
      "name": "Apple iPad mini Wi-Fi 16 GB - Schwarz & Graphit",
      "brand_name": "Apple",
      "category_name": "Tablet",
      "review_count": 37,
      "review_rating": 100,         // On a scale of 0 to 100; "null" if no ratings
      "gtins": [                    // Array of all GTINs currently known to be associated
        "00047165900004",           // with this product on the data source
        "00885909575329",
        "00885909666881",
        "00885909666898",
        "00885909666966"
      ],
      "url": "https://www.google.de/shopping/product/15858468867886681869", // URL of the product on the data source
      // maybe more attributes, depending on the source
      "offers": [
        {
          "shop_name": "Some Shop",
          "price": "284.99",                // As a string to avoid floating point rounding errors
          "price_with_shipping": "284.99",  // Same here
          "shipping_costs": "0.00",         // Same here
          "currency": "EUR",                // ISO 4217 currency code
          // maybe more attributes, depending on the source
        },
        {
          "shop_name": "Another shop",
          "price": "289.82",
          "price_with_shipping": null,      // Not always given by shop
          "shipping_costs": null,           // Same here
          "currency": "EUR",
          // ...
        }
        // Several more offers...
      ]
    },
    {
      // Second product
    }
  ]
}
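Because prices arrive as strings, they can be parsed into exact decimals instead of floats. The sketch below picks the lowest-priced offer of one product; cheapest_offer is a name made up for this example, and the sample is a trimmed copy of the response above without the comments, which real JSON does not allow.

```python
import json
from decimal import Decimal

def cheapest_offer(product):
    """Return (shop_name, price) of the lowest-priced offer, or None if there is none."""
    offers = [o for o in product.get("offers", []) if o.get("price") is not None]
    if not product.get("success") or not offers:
        return None
    best = min(offers, key=lambda offer: Decimal(offer["price"]))
    return best["shop_name"], Decimal(best["price"])

payload = json.loads("""
{
  "job_id": "1234567890abcdef12345678",
  "status": "finished",
  "products": [
    {
      "value": "00885909666966",
      "success": true,
      "offers": [
        {"shop_name": "Another shop", "price": "289.82", "price_with_shipping": null},
        {"shop_name": "Some Shop", "price": "284.99", "price_with_shipping": "284.99"}
      ]
    }
  ]
}
""")
print(cheapest_offer(payload["products"][0]))
```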

3.1 Download Results in CSV format

Besides JSON, bulk request results are also available in CSV format. The CSV structure is in a test phase and may still change. A comma , is used as the separator, and quotation marks " enclose all values.

There are two CSV formats available, one row based (best for pivot tables) and one column based. You can choose between them with the format_option parameter.

Example request via HTTP GET for row based format:

https://api.priceapi.com/products/bulk/1234567890abcdef12345678.csv?token=YOUR_TOKEN&format_option=row

Example response

"source","country","key","value","success","reason","source_id","source_url","updated_at","gtins","eans","name","brand_name","category_name","review_count","review_rating","relevance","offer.url","offer.name","offer.price","offer.price_with_shipping","offer.shipping_costs","offer.position","offer.position_with_shipping","offer.availability_code","offer.availability_text","offer.shop_name"

Example request via HTTP GET for column based format:

https://api.priceapi.com/products/bulk/1234567890abcdef12345678.csv?token=YOUR_TOKEN&format_option=column

Example response

"source","country","key","value","success","reason","source_id","source_url","updated_at","gtins","eans","name","brand_name","category_name","review_count","review_rating","relevance","offers[0].url","offers[0].name","offers[0].price","offers[0].price_with_shipping","offers[0].shipping_costs","offers[0].availability_code","offers[0].availability_text","offers[0].shop_name","offers[1].url","offers[1].name","offers[1].price","offers[1].price_with_shipping","offers[1].shipping_costs","offers[1].availability_code","offers[1].availability_text","offers[1].shop_name","offers[2].url",...

3.2 Download Results in XML format

Example request via HTTP GET:

https://api.priceapi.com/products/bulk/1234567890abcdef12345678.xml?token=YOUR_TOKEN

Example response

<?xml version="1.0" encoding="UTF-8"?>
<hash>
  <job-id>1234567890abcdef12345678</job-id>
  <status type="symbol">finished</status>
  <free-credits type="integer">0</free-credits>
  <paid-credits type="integer">2</paid-credits>
  <products type="array">
    <product>
      <source>google-shopping</source>
      <country>de</country>
      <key>gtin</key>
      <value>00885909666966</value>
      <success type="boolean">true</success>
      <id>15858468867886681869</id>
      <name>Apple iPad mini Wi-Fi 16 GB - Schwarz &amp; Graphit</name>
      <brand-name>Apple</brand-name>
      <category-name>Tablet</category-name>
      <review-count>37</review-count>
      <review-rating>100</review-rating>
      <gtins type="array">
        <gtin>00047165900004</gtin>
        <gtin>00885909575329</gtin>
        <gtin>00885909666881</gtin>
        <gtin>00885909666898</gtin>
        <gtin>00885909666966</gtin>
      </gtins>
      <url>https://www.google.de/shopping/product/15858468867886681869</url>
      <offers type="array">
        <offer>
          <shop-name>Some Shop</shop-name>
          <price>284.99</price>
          <price-with-shipping>284.99</price-with-shipping>
          <shipping-costs>0.00</shipping-costs>
          <currency>EUR</currency>
          <!-- maybe more attributes, depending on the source -->
        </offer>
        <offer>
          <shop-name>Another shop</shop-name>
          <price>289.82</price>
          <price-with-shipping nil="true"/>
          <shipping-costs nil="true"/>
          <currency>EUR</currency>
          <!-- maybe more attributes, depending on the source -->
        </offer>
        <!-- ... -->
      </offers>
    </product>
    <product>
      <!-- second product -->
    </product>
  </products>
</hash>
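The XML variant can be read with a standard XML parser. This sketch uses Python's xml.etree.ElementTree on a trimmed sample of the response above and maps elements marked nil="true" to None; the function name offers_from_xml and the choice of fields are assumptions of this example.

```python
import xml.etree.ElementTree as ET

def offers_from_xml(xml_text):
    """Collect shop name and prices of every <offer>; nil elements become None."""
    root = ET.fromstring(xml_text)
    offers = []
    for offer in root.iter("offer"):
        def text(tag):
            node = offer.find(tag)
            if node is None or node.get("nil") == "true":
                return None
            return node.text
        offers.append({
            "shop_name": text("shop-name"),
            "price": text("price"),
            "price_with_shipping": text("price-with-shipping"),
        })
    return offers

sample = """<hash>
  <products type="array">
    <product>
      <offers type="array">
        <offer>
          <shop-name>Some Shop</shop-name>
          <price>284.99</price>
          <price-with-shipping>284.99</price-with-shipping>
        </offer>
        <offer>
          <shop-name>Another shop</shop-name>
          <price>289.82</price>
          <price-with-shipping nil="true"/>
        </offer>
      </offers>
    </product>
  </products>
</hash>"""
offers = offers_from_xml(sample)
print(offers)
```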

Daily and Monthly Usage

Usage today

This returns an overview of the credits used and the quota for the current day.

Endpoint

HTTP verb
GET
URL
https://api.priceapi.com/v2/me/usage_today

Parameters

Required
token

Example request via HTTP GET:

https://api.priceapi.com/v2/me/usage_today?token=YOUR_TOKEN

Example response

{
  "usage": {
    "date": "2016-10-17",
    "credits": 1000,
    "free_credits": 200,
    "paid_credits": 800,
    "quota": 10000
  }
}
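A client could use this response to check how much room is left before posting another job. The sketch below assumes that the quota is compared against the total credits field (free plus paid); the documentation does not state this explicitly, so treat it as an assumption.

```python
def remaining_quota(usage_response):
    """Credits left today, assuming quota is compared against total credits."""
    usage = usage_response["usage"]
    return max(0, usage["quota"] - usage["credits"])

sample = {
    "usage": {
        "date": "2016-10-17",
        "credits": 1000,
        "free_credits": 200,
        "paid_credits": 800,
        "quota": 10000,
    }
}
print(remaining_quota(sample))  # 9000
```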

Usage month

This returns an overview of used and included credits, aggregated for the current month, together with a list of the usage registered per day.

Endpoint

HTTP verb
GET
URL
https://api.priceapi.com/v2/me/usage_month

Parameters

Required
token

Example request via HTTP GET:

https://api.priceapi.com/v2/me/usage_month?token=YOUR_TOKEN

Example response

{
  "usage": {
    "from": "2016-10-01",
    "to": "2016-10-17",
    "credits": 56000,
    "free_credits": 1000,
    "paid_credits": 55000,
    "included_credits": 50000,
    "overusage": 5000
  },
  "usages": [
    {
      "date": "2016-10-01",
      "credits": 1000,
      "free_credits": 1000,
      "paid_credits": 0
    },
    {
      "date": "2016-10-17",
      "credits": 55000,
      "free_credits": 0,
      "paid_credits": 55000
    }
  ]
}
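The daily entries can be cross-checked against the aggregate. In the example above, the daily credits add up to the monthly total, and overusage equals paid credits minus included credits. The sketch below reproduces both numbers; that overusage is always computed this way is an assumption drawn from this one example.

```python
def summarize_month(response):
    """Sum the daily credits and compare paid credits with the included credits."""
    usage = response["usage"]
    daily_total = sum(day["credits"] for day in response["usages"])
    return {
        "daily_total": daily_total,
        "matches_aggregate": daily_total == usage["credits"],
        # Assumption: over usage is paid credits beyond the included credits
        "paid_over_included": max(0, usage["paid_credits"] - usage["included_credits"]),
    }

sample = {
    "usage": {"credits": 56000, "free_credits": 1000, "paid_credits": 55000,
              "included_credits": 50000, "overusage": 5000},
    "usages": [
        {"date": "2016-10-01", "credits": 1000, "free_credits": 1000, "paid_credits": 0},
        {"date": "2016-10-17", "credits": 55000, "free_credits": 0, "paid_credits": 55000},
    ],
}
summary = summarize_month(sample)
print(summary)
```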

Parameters for Requests

token

The personal security token is needed to gain access to priceAPI.com. The token consists of 64 characters and is shown on the Dashboard.

country

Defines the country for which information should be gathered. The country parameter expects an alpha-2 country code (ISO 3166-1 alpha-2) in lower case. All available countries can be found on the Dashboard.

Examples:

  • de (Germany)
  • gb (United Kingdom)
  • it (Italy)

source

Defines the data source — price comparison site or market place — from which information should be gathered. Every country has different sources. A list of available data sources can be found on the Dashboard.

Examples:

  • country: de, source: google-shopping (Google Shopping Germany)
  • country: gb, source: amazon (Amazon United Kingdom)
  • country: it, source: amazon (Amazon Italy)

key

Defines the search parameter for the lookup. Available key values are:

gtin
GTIN, EAN or UPC
keyword
Any search phrase as you would enter it in a search engine
source_id
A product's ID as given by the source, e.g. ASIN for amazon

values

Defines the search strings for the lookup. Depending on the chosen key, each value must be a valid GTIN, EAN or UPC, a keyword, or a product ID.

All 3 examples return the same result for a lookup on amazon:

  • key: gtin, value: 08902527431911
  • key: keyword, value: samsung galaxy s3 16gb white
  • key: source_id, value: B0080DJ6CM

completeness

Defines the number of result pages gathered for each product. For each gathered result page, 1 credit is deducted.

Available values:

one_page
Receive only first page with lowest prices
all_pages
Receive all pages
at_least_two_shops
Request as many pages as needed so that at least two different shops are listed in the offer results; most of the time that will be 1 page.

currentness

Defines how current the requested information must be. Realtime requests cost double the credits, but the data is guaranteed to be more up to date.

Available values:

daily_updated
Information is max. 20 hours old
realtime
Information is max. 15 minutes old; double amount of credits will be deducted
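Taken together, completeness and currentness determine the cost of a job: 1 credit per gathered result page, doubled for realtime. The estimate below is a sketch under that reading. The number of pages per product is only known up front for one_page, so pages_per_product is a guess you supply; estimate_credits is a name made up for this example.

```python
def estimate_credits(products, pages_per_product=1, currentness="daily_updated"):
    """Estimate credits: 1 per result page, doubled for realtime requests."""
    if currentness not in ("daily_updated", "realtime"):
        raise ValueError(f"unknown currentness: {currentness}")
    factor = 2 if currentness == "realtime" else 1
    return products * pages_per_product * factor

print(estimate_credits(1000))                           # 1000
print(estimate_credits(1000, currentness="realtime"))   # 2000
print(estimate_credits(500, 3, "realtime"))             # 3000
```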

job_id

The job_id is assigned by priceAPI.com as the identifier of a bulk request. It is needed to query the bulk request's status and, after completion, to download its results.

format

This parameter chooses the format in which results are sent to you for bulk requests. We currently support json, xml or csv. The default is json. For csv also see the format_option parameter.

format_option

There are currently two CSV formats and you can choose between them with the format_option parameter. Choose either row or column. The current default is column for legacy reasons, but we recommend using the newer row-based format.

With a format_option equal to row, a product is repeated in multiple lines, one line for each offer. This is a good format for pivot tables.

With column, there is only one line per product and additional columns are added for each offer. This may be easier to process for other use cases.
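In the row format, regrouping the lines per product is a small loop. The sketch below uses Python's csv module and groups offers by the requested value. The sample uses only a few of the documented columns, and its data rows are made up for illustration; offers_by_value is a hypothetical helper name.

```python
import csv
import io
from collections import defaultdict

def offers_by_value(csv_text):
    """Group (shop name, price) pairs of a row-based CSV by the requested value."""
    grouped = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        grouped[row["value"]].append((row["offer.shop_name"], row["offer.price"]))
    return dict(grouped)

# Shortened header and invented rows, only to show the shape of the row format
sample = (
    '"value","success","name","offer.price","offer.shop_name"\n'
    '"00885909666966","true","Apple iPad mini","284.99","Some Shop"\n'
    '"00885909666966","true","Apple iPad mini","289.82","Another shop"\n'
)
grouped = offers_by_value(sample)
print(grouped)
```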

conditions

This parameter filters the offers (if the source provides this information). The filter criteria can be either "new", "used" or "all" (this depends on the source). This filter can be applied to pre-filter used offers. The pre-filter feature is available on request only.

The 'product' of the result has a key 'condition' set to the 'conditions' parameter or the default 'conditions' parameter of the source. The 'conditions' key is only visible if the feature is enabled for the user.

Errors

Errors on Request Level

General errors can occur on the request itself, for example when using a wrong token, or when choosing an unknown data source or a country that is not available for the chosen source.

Possible values for reason:

unauthorized
token is not valid
missing parameter
not all required parameters were given
parameter value invalid
given value is not allowed for this parameter
unsupported source
chosen source is not supported (yet)
unsupported country
chosen country is not supported for the chosen source
unsupported key
chosen key is not supported by the chosen source
source disabled
combination of source and country is temporarily disabled
key disabled
key is temporarily disabled for this source and country
value disabled
value is temporarily disabled for this source, country and key
not enough free credits
no free credits left, a paid subscription is needed
daily quota exceeded
user defined daily quota is exceeded; you can change this under Quota settings.
too many values
requested more than 1,000 values in one job
job not found
no bulk request could be found for the given job_id and token
job not finished
bulk request is still in progress; wait until it is finished. See Query Status.

Example response:

{
  "success": false,
  "reason": "unauthorized"
}
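A client can turn such a response into an exception early. In this sketch, PriceApiError and raise_for_reason are names made up for the example. Treating the "disabled" reasons and job not finished as temporary is our reading of the descriptions above, not a documented classification.

```python
class PriceApiError(Exception):
    """Request-level error reported by the API (hypothetical client-side class)."""

    # Assumption: these reasons are described as temporary or as "wait and retry"
    TEMPORARY = {"source disabled", "key disabled", "value disabled", "job not finished"}

    def __init__(self, reason):
        super().__init__(reason)
        self.reason = reason
        self.temporary = reason in self.TEMPORARY

def raise_for_reason(payload):
    """Raise PriceApiError if the response reports success false; else return it."""
    if payload.get("success") is False:
        raise PriceApiError(payload.get("reason", "unknown"))
    return payload

try:
    raise_for_reason({"success": False, "reason": "unauthorized"})
except PriceApiError as error:
    print(error.reason, error.temporary)  # unauthorized False
```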

Errors on Product Level

Errors on product level can occur for different reasons. For example: the requested GTIN is not a valid GTIN number (check digit), the source could not be accessed or the source has changed its source code.

Possible values for reason:

not found
We have searched the source, but there were no matching results for the given request. Credits will be deducted as normal, because this has caused the same costs on our side as for found products.
invalid value
Requested value is not valid. For example: check digit for GTIN is wrong or source_id has a wrong format. No credits will be deducted.
value disabled
The requested value is temporarily disabled for this source, country and key.
failed to access source
priceAPI.com could not reach the data source after multiple retries. No credits will be deducted.
error
An error has occurred while crawling the source. For example: the source might have changed its HTML format. No credits will be deducted.

Example response:

{
  "job_id": "1234567890abcdef12345678",
  "status": "finished",
  "free_credits": 0,
  "paid_credits": 1, // Note that reason not found is not an error and will cost credits.
  "products": [      // Of course, actual errors on our side cost 0 credits.
    {
      "source": "google-shopping",
      "country": "de",
      "key": "gtin",
      "value": "00885909666966",
      "success": false,                    // as this is false, you might want to check reason
      "reason": "failed to access source"  // for all codes, see above
    },
    {
      "source": "google-shopping",
      "country": "de",
      "key": "gtin",
      "value": "00885909666966",
      "success": false,                    // as this is false, you might want to check reason
      "reason": "not found"                // for all codes, see above
    }
  ]
}
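For monitoring, it can help to count product-level failures by reason after each job. The sketch below does that for a products array shaped like the one above; summarize_products is a hypothetical helper name.

```python
from collections import Counter

def summarize_products(products):
    """Count successful products and group the failed ones by reason."""
    failures = Counter(p.get("reason") for p in products if not p.get("success"))
    succeeded = sum(1 for p in products if p.get("success"))
    return {"ok": succeeded, "failed": dict(failures)}

products = [
    {"value": "00885909666966", "success": False, "reason": "failed to access source"},
    {"value": "00885909666966", "success": False, "reason": "not found"},
]
summary = summarize_products(products)
print(summary)
```

A rising count of failed to access source or error points to a problem on the source side, while not found is a regular, charged result.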

Sources

Please log in to access the documentation of your sources.

Over Usage

If more credits are used than the chosen plan includes, the over usage will be charged at the end of the accounting period at the same conditions as the currently chosen plan. The current plan can be viewed and changed under Subscription.

Daily Quota

The daily quota is a personal protective mechanism: it prevents unnecessary expenses if your technical implementation is buggy. The maximum allowed daily quota can be changed under Quota settings.

As credits are only deducted after a job has finished, it is possible to effectively use more credits than the daily quota when you issue multiple bulk requests in parallel.

Also, with completeness values other than one_page, the number of credits cannot be foreseen at the time of job creation. In this case, too, it is possible to effectively use more credits than the daily quota.

We provide the daily quota to limit the potential harm that broken code could cause. Still, it is your responsibility to make sure your access to the API is sound.

GTIN, EAN, JAN, UPC: It's all related

GTIN
Global Trade Item Number; up to 14 digits long
EAN
International Article Number, formerly known as European Article Number and Japanese Article Number; up to 13 digits long
UPC
Universal Product Code; up to 12 digits long

Any UPC can be converted to an EAN by prefixing a 0.
Likewise, any EAN can be converted to a GTIN by prefixing a 0.

All three codes contain one check digit for error detection. See Wikipedia.
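Both rules can be applied before sending values, so that obviously invalid codes never reach the API. The sketch below pads a code to 14 digits and verifies the check digit with the standard GTIN scheme (weights 3 and 1 alternating from the right, excluding the check digit). The function names are made up for this example.

```python
def to_gtin14(code):
    """Left-pad a UPC, EAN or GTIN with zeros to 14 digits."""
    if not (code.isascii() and code.isdigit()) or len(code) > 14:
        raise ValueError(f"not a numeric code of at most 14 digits: {code!r}")
    return code.zfill(14)

def has_valid_check_digit(code):
    """Verify the last digit against the weighted sum of the other digits."""
    gtin = to_gtin14(code)
    body, check = gtin[:-1], int(gtin[-1])
    # Rightmost body digit gets weight 3, then weights alternate 1, 3, 1, ...
    total = sum(int(d) * (3 if i % 2 == 0 else 1) for i, d in enumerate(reversed(body)))
    return (10 - total % 10) % 10 == check

print(to_gtin14("885909666966"))                # 00885909666966
print(has_valid_check_digit("00885909666966"))  # True
print(has_valid_check_digit("00885909666967"))  # False
```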

API change policy

We regard some changes to the API as backwards compatible. These changes may, but need not, be communicated to end users:

  • adding new endpoints
  • adding parameters for endpoints
  • adding new fields to the responses
  • removing possible values (e.g. a status value that is never returned anyway)
  • any and all changes to undocumented endpoints

We consider all other changes as incompatible, e.g.:

  • removal of endpoints
  • removal of parameters for endpoints
  • removal of fields in responses
  • change of default values
  • adding new possible status values
  • changing the semantics of parameters/responses

Backwards compatible changes might be documented only after their release. Undocumented endpoints, parameters and return values might change incompatibly at any time without notice. Incompatible changes will be documented in advance, early enough to give users time to change their client code.

As soon as changes are documented on the priceapi documentation page the rules above apply.

Security

For added security, we deliver our service via the encrypted protocol HTTPS. HTTPS uses so-called ciphers for authentication and encryption. We may disable any ciphers that get deprecated over time. It is your responsibility to update your system so that it can communicate securely using up-to-date ciphers.

Failing to keep your system up to date may leave you unable to communicate with our servers.
