Scrape API

Introduction

The Ujeebu Scrape API returns a web page or parts of a web page after rendering it using the Google Chrome browser. It will run JavaScript, and manage headless browser instances for you. It also automatically routes your requests through rotating proxies to lower your chances of getting blocked.

To use the API, subscribe to a plan here and connect to:

GET https://api.ujeebu.com/scrape

POST https://api.ujeebu.com/scrape
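As a minimal sketch of calling the GET endpoint, the snippet below builds a request with Python's standard library. The `ApiKey` header name is taken from the request examples elsewhere on this page; the helper names `build_scrape_request` and `fetch` are illustrative assumptions, not part of an official client.

```python
import urllib.parse
import urllib.request

API_ENDPOINT = "https://api.ujeebu.com/scrape"


def build_scrape_request(api_key: str, target_url: str, **params) -> urllib.request.Request:
    """Build (but do not send) a GET request to the Scrape API.

    Hypothetical helper for illustration only.
    """
    query = {"url": target_url, **params}
    # urlencode percent-encodes the target URL and every parameter value.
    full_url = API_ENDPOINT + "?" + urllib.parse.urlencode(query)
    request = urllib.request.Request(full_url, method="GET")
    request.add_header("ApiKey", api_key)
    return request


def fetch(request: urllib.request.Request, timeout: float = 60.0) -> bytes:
    """Send the request and return the raw response body as bytes."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()
```

Calling `fetch(build_scrape_request("<API Key>", "https://ujeebu.com", response_type="html"))` would then return the rendered page as bytes.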

Parameters

Parameter	Type	Description	Default Value
url REQUIRED	`string`	URL to render.
response_type	`string`	indicates what to return. Possible values are: 'html', 'raw', 'pdf' or 'screenshot'.	'html'
json	`boolean`	when set to true, returns a JSON response instead of raw content as specified by `response_type`.	false
useragent	`string`	override default headless browser user agent.	null
cookies	`string`	indicates custom cookies to send with request.	null
timeout	`number`	maximum number of seconds before request timeout.	60
js	`boolean`	indicates whether to execute JavaScript or not.	true
js_timeout	`number`	when `js` is enabled, indicates how many seconds the API should wait for the JS engine to render the supplied URL.	`timeout`/2
custom_js	`string`, `base64-string`	JavaScript code to execute in page context when `js` is enabled	null
wait_for	`number`, `string`, `base64-string`	indicates number of milliseconds to wait before returning response, a selector to wait for, or custom JavaScript to handle the wait. Needs 'js' to be on	0
wait_for_timeout	`number`	indicates timeout (in milliseconds) for the `wait_for` param.	null
screenshot_fullpage	`boolean`	when `response_type` is `screenshot`, indicates whether to take a screenshot of the full page or just the visible viewport.	false
screenshot_partial	`string`	when `response_type` is `screenshot`, a valid selector of element to screenshot or json-string with coordinates (x, y, width, height ) of the rect to screenshot.	null
scroll_down	`boolean`	indicates whether to scroll down the page or not, this applies only when `js` is enabled.	false
scroll_wait	`number`	when `scroll_down` is enabled, indicates the duration of the wait (in milliseconds) between two scrolls.	100
progressive_scroll	`boolean`	indicates type of scroll. If set to `true`: progressively scrolls down until page height no longer increases or URL changes. If set to `false` (default), goes to `scroll_to_selector` or end of page.	false
proxy_type	`string`	indicates type of proxy to use. Possible values: 'rotating', 'advanced', 'premium', 'residential', 'mobile', 'custom'.	rotating
proxy_country	`string`	country `ISO 3166-1 alpha-2` code to proxy from. Valid only when `premium` proxy type is chosen.	US
custom_proxy	`string`	URI for your custom proxy in the following format: `scheme://host:port`. applicable and required only if `proxy_type=custom`	null
custom_proxy_username	`string`	custom proxy username if applicable.	null
custom_proxy_password	`string`	custom proxy password if applicable.	null
auto_proxy	`boolean`	enable a more advanced proxy by default when rotating proxy is not working. It will move to the next proxy option until it gets the content and will only stop when content is available or none of the options worked. Please note that you are billed only on the top option attempted.	false
session_id	`alphanumeric`	alphanumeric identifier with a length between 1 and 16 characters, used to route multiple requests from the same proxy instance. Sessions remain active for 30 minutes	null
scroll_callback	`string`, `base64-string`	defines a JavaScript function with boolean output that determines whether to stop scrolling or not.	null
scroll_to_selector	`string`	when `scroll_down` is enabled, indicates the element to scroll to in each scroll. If 'null' the scroll is performed until the end of the page.	null
device	`string`	indicates type of device to use to render page. Possible values: 'desktop', 'mobile'	desktop
window_width	`number`	indicates browser viewport width.	null
window_height	`number`	indicates browser viewport height.	null
block_ads	`boolean`	indicates whether to block ads or not.	false
block_resources	`boolean`	indicates whether to block resources (images, css, fonts...) or not.	`false` if `response_type` is `screenshot` or `pdf`, `true` otherwise
extract_rules	`json-string`	defines rules used to extract data from the supplied web page.	null
strip_tags	`csv-string`	indicates comma-separated list of tags to remove from page after rendering.	null
http_method	`string`	indicates the http method (`GET`, `POST`, `PUT`) to use to request the target web page	GET
post_data	`string`	data to forward to the target web page when the HTTP method is `POST` or `PUT`	null
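Several of the constraints in the table above can be checked on the client before a request is sent. The following is a rough sketch under that assumption; the function name `check_params` and the wording of the messages are hypothetical, and the API remains the authority on what it accepts.

```python
import re

# Allowed values transcribed from the parameter table.
RESPONSE_TYPES = {"html", "raw", "pdf", "screenshot"}
PROXY_TYPES = {"rotating", "advanced", "premium", "residential", "mobile", "custom"}
SESSION_ID_PATTERN = re.compile(r"[A-Za-z0-9]{1,16}")


def check_params(params: dict) -> list:
    """Return a list of problems found in `params` (empty when none are found)."""
    problems = []
    if not params.get("url"):
        problems.append("url is required")
    response_type = params.get("response_type", "html")
    if response_type not in RESPONSE_TYPES:
        problems.append(f"unknown response_type: {response_type}")
    proxy_type = params.get("proxy_type", "rotating")
    if proxy_type not in PROXY_TYPES:
        problems.append(f"unknown proxy_type: {proxy_type}")
    if proxy_type == "custom" and not params.get("custom_proxy"):
        problems.append("custom_proxy is required when proxy_type=custom")
    session_id = params.get("session_id")
    if session_id is not None and not SESSION_ID_PATTERN.fullmatch(str(session_id)):
        problems.append("session_id must be 1 to 16 alphanumeric characters")
    return problems
```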

Response

The response depends on the response_type and json parameters. It is binary data for 'pdf' and 'screenshot', text for 'raw' and 'html', or JSON when json=1. The possible values of response_type are as follows:

html: returns the HTML code of the page. If js=1, JavaScript is executed first.
raw: returns the source html (or file content if URL is not an HTML page) as received from the URL without running JavaScript. js=1 is ignored.
pdf: converts page to PDF and returns the PDF binary data.
- If the json parameter is set to 'true' a JSON response is returned with the base64 encoded value of the pdf file. e.g.:
```
{
  "success": true,
  "screenshot": null,
  "html_source": null,
  "pdf": "JVBERi0xLjQKJeLjz9MKNCAwIG9iaiAKPDwKL1N1YnR5cGUgL0xpbms...",
  "html": null
}
```
screenshot: produces a screenshot of the URL in PNG format and returns the binary data.
- If screenshot_fullpage is set to 'true', takes a screenshot of the full page. If set to 'false', takes a screenshot of the visible viewport only.
- If the json parameter is set to 'true', a JSON response is returned with the base64 encoded value of the image file. e.g.:
```
{
  "success": true,
  "screenshot": "iVBORw0KGgoAAAANSUhEUgAAA2oAACyOCA...",
  "html_source": null,
  "pdf": null,
  "html": null
}
```
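When json is enabled, the binary content arrives base64-encoded inside the JSON body. The sketch below decodes it back to bytes; the field names come from the sample responses above, while the function name `extract_binary` is an assumption made for illustration.

```python
import base64
import json


def extract_binary(json_body: str) -> bytes:
    """Return the decoded `pdf` or `screenshot` payload of a JSON response."""
    payload = json.loads(json_body)
    if not payload.get("success"):
        raise ValueError("response does not report success")
    # Only one of the two fields is expected to be set for a given request.
    encoded = payload.get("pdf") or payload.get("screenshot")
    if encoded is None:
        raise ValueError("response carries neither a pdf nor a screenshot")
    return base64.b64decode(encoded)
```

The returned bytes can be written to a `.pdf` or `.png` file opened in binary mode.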

Error Response

If an error occurs, the API returns a JSON response of the following form:

{
  "url": "string",
  "message": "string",
  "error_code": "string",
  "errors": ["string"]
}

Name	Type	Description
url	string	Given URL
message	string	Error message
error_code	string	Error code
errors	[string]	List of all errors
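One way to work with the error body is to parse it into a small typed record. This is a sketch only: the four fields are those listed above, whereas the class name `ApiError` and the function `parse_error` are hypothetical.

```python
import json
from dataclasses import dataclass, field


@dataclass
class ApiError:
    """Typed view of the error JSON described above."""
    url: str
    message: str
    error_code: str
    errors: list = field(default_factory=list)


def parse_error(json_body: str) -> ApiError:
    """Parse an error response body, tolerating missing fields."""
    data = json.loads(json_body)
    return ApiError(
        url=data.get("url", ""),
        message=data.get("message", ""),
        error_code=data.get("error_code", ""),
        errors=list(data.get("errors") or []),
    )
```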

Response Codes

Code	Billed	Meaning	Suggestion
200	Yes	Successful request	-
400	No	A required parameter is missing (URL)	Set the missing URL or refer to the request error message
401	No	Missing API key	Provide your API key
404	Yes	Provided URL not found	Provide a valid URL, or change `proxy_type`
408	Yes	Request timeout	Increase `timeout` value, change proxy type or use `auto_proxy` and/or enable JS
429	No	Too many requests	Upgrade your plan
500	No	Internal error	Try again, and contact us if still unsuccessful
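The table above can be encoded as a lookup so that a client reacts consistently to each status. This is a sketch: the billing flags and suggestions are transcribed from the table, but treating 408 and 500 as retry candidates is an interpretation of the suggestions, not documented behavior, and the function names are hypothetical.

```python
# status code -> (billed, suggestion), transcribed from the response code table.
STATUS_TABLE = {
    200: (True, None),
    400: (False, "Set the missing URL or refer to the request error message"),
    401: (False, "Provide your API key"),
    404: (True, "Provide a valid URL, or change proxy_type"),
    408: (True, "Increase timeout, change proxy type, use auto_proxy and/or enable JS"),
    429: (False, "Upgrade your plan"),
    500: (False, "Try again, and contact support if still unsuccessful"),
}

# Assumption: only timeouts and internal errors are worth an automatic retry.
RETRY_CANDIDATES = {408, 500}


def describe_status(status: int) -> tuple:
    """Return (billed, suggestion); billed is None for codes not in the table."""
    return STATUS_TABLE.get(status, (None, "Unknown status code"))


def is_retry_candidate(status: int) -> bool:
    return status in RETRY_CANDIDATES
```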

Credits

Credit cost per request:

Proxy Type	No JS	w/ JS or Screenshot	PDF	Geo Targeting	Metered
rotating	1	5	10	US	No
advanced	5	10	10	US	No
premium	8	12	17	US	No
residential(us)	20	30	35	US	No
residential	8	10	10	Multiple countries	+10 credits per MB after 1MB
custom	1	5	10	Custom	No

Note: Consumed credits are returned in the Ujb-credits header.
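To estimate spend before sending a batch, the credit table can be turned into a lookup. The sketch below covers the base cost only: the per-MB surcharge on the metered residential proxy is left out because the table does not say how partial megabytes are counted. The column keys `no_js`, `js_or_screenshot` and `pdf` and the function names are naming assumptions.

```python
# Base credits per request, transcribed from the credit table.
CREDIT_TABLE = {
    "rotating": {"no_js": 1, "js_or_screenshot": 5, "pdf": 10},
    "advanced": {"no_js": 5, "js_or_screenshot": 10, "pdf": 10},
    "premium": {"no_js": 8, "js_or_screenshot": 12, "pdf": 17},
    "residential(us)": {"no_js": 20, "js_or_screenshot": 30, "pdf": 35},
    "residential": {"no_js": 8, "js_or_screenshot": 10, "pdf": 10},
    "custom": {"no_js": 1, "js_or_screenshot": 5, "pdf": 10},
}


def base_credits(proxy_type: str, column: str) -> int:
    """Look up the base cost of one request (metered surcharges excluded)."""
    return CREDIT_TABLE[proxy_type][column]


def estimate_total(requests_plan) -> int:
    """Sum base credits over an iterable of (proxy_type, column) pairs."""
    return sum(base_credits(proxy, column) for proxy, column in requests_plan)
```

The value reported in the Ujb-credits header remains the authoritative figure for what was actually consumed.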

Waiting for JavaScript to execute

By default, the Scrape API stops when the 'DOMContentLoaded' event is fired, but you can use the wait_for parameter to implement additional waiting logic.

wait_for can take three types of values:

an integer: to specify the number of milliseconds the JS engine must wait before returning a response
a string CSS selector: the JS engine will wait for the element to appear in the page
a base64-string JS callable: the JS engine will try to evaluate the callable which implements its own waiting logic.

The following code is an example of a callable which will wait until the loader .loaderBox is hidden:


async () => {
    while (true) {
        // select the loader box
        let loader = document.querySelector('.loaderBox');
        if (!loader) {
            return;
        }
        // get the style of loader box
        let style = window.getComputedStyle(loader);

        // stop if box is hidden
        if (style.display === 'none') {
            return;
        }

        // wait 10ms to prevent app from hanging
        await new Promise(resolve => setTimeout(resolve, 10));

    }
}

// base64 string of the js callable above
wait_for=CmFzeW5jICgpID0+IHsKICAgIHdoaWxlICh0cnVlKSB7CiAgICAgICAgLy8gc2VsZWN0IHRoZSBsb2FkZXIgYm94CiAgICAgICAgbGV0IGxvYWRlciA9IGRvY3VtZW50LnF1ZXJ5U2VsZWN0b3IoJy5sb2FkZXJCb3gnKTsKICAgICAgICBpZiAoIWxvYWRlcikgewogICAgICAgICAgICByZXR1cm47CiAgICAgICAgfQogICAgICAgIC8vIGdldCB0aGUgc3R5bGUgb2YgbG9hZGVyIGJveAogICAgICAgIGxldCBzdHlsZSA9IHdpbmRvdy5nZXRDb21wdXRlZFN0eWxlKGxvYWRlcik7CgogICAgICAgIC8vIHN0b3AgaWYgYm94IGlzIGhpZGRlbgogICAgICAgIGlmIChzdHlsZS5kaXNwbGF5ID09PSAnbm9uZScpIHsKICAgICAgICAgICAgcmV0dXJuOwogICAgICAgIH0KCiAgICAgICAgLy8gd2FpdCAxMG1zIHRvIHByZXZlbnQgYXBwIGZyb20gaGFuZ2luZwogICAgICAgIGF3YWl0IG5ldyBQcm9taXNlKHJlc29sdmUgPT4gc2V0VGltZW91dChyZXNvbHZlLCAxMCkpOwoKICAgIH0KfQo=

Note: Make sure to URL-encode the value if you're using GET
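The two encoding steps (base64 for the callable, then URL-encoding for the query string) can be sketched in Python as follows. The helper names are hypothetical; only the standard library is used.

```python
import base64
import urllib.parse


def encode_callable(js_source: str) -> str:
    """Base64-encode a JavaScript callable for use as a wait_for value."""
    return base64.b64encode(js_source.encode("utf-8")).decode("ascii")


def wait_for_query(target_url: str, wait_for) -> str:
    """Build a URL-encoded query string.

    `wait_for` may be a number of milliseconds, a CSS selector,
    or the output of encode_callable().
    """
    params = {"url": target_url, "js": "true", "wait_for": str(wait_for)}
    # quote() percent-encodes '+', '/' and '=' which can appear in base64 output.
    return urllib.parse.urlencode(params, quote_via=urllib.parse.quote)
```

Appending the result of `wait_for_query(...)` to `https://api.ujeebu.com/scrape?` gives a URL in which characters such as `+` and `=` from the base64 value are percent-encoded.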


curl --location -g --request GET 'https://api.ujeebu.com/scrape?url=https://ujeebu.com&response_type=screenshot&json=false&js=true&wait_for=CmFzeW5jICgpID0%2BIHsKICAgIHdoaWxlICh0cnVlKSB7CiAgICAgICAgLy8gc2VsZWN0IHRoZSBsb2FkZXIgYm94CiAgICAgICAgbGV0IGxvYWRlciA9IGRvY3VtZW50LnF1ZXJ5U2VsZWN0b3IoJy5sb2FkZXJCb3gnKTsKICAgICAgICBpZiAoIWxvYWRlcikgewogICAgICAgICAgICByZXR1cm47CiAgICAgICAgfQogICAgICAgIC8vIGdldCB0aGUgc3R5bGUgb2YgbG9hZGVyIGJveAogICAgICAgIGxldCBzdHlsZSA9IHdpbmRvdy5nZXRDb21wdXRlZFN0eWxlKGxvYWRlcik7CgogICAgICAgIC8vIHN0b3AgaWYgYm94IGlzIGhpZGRlbgogICAgICAgIGlmIChzdHlsZS5kaXNwbGF5ID09PSAnbm9uZScpIHsKICAgICAgICAgICAgcmV0dXJuOwogICAgICAgIH0KCiAgICAgICAgLy8gd2FpdCAxMG1zIHRvIHByZXZlbnQgYXBwIGZyb20gaGFuZ2luZwogICAgICAgIGF3YWl0IG5ldyBQcm9taXNlKHJlc29sdmUgPT4gc2V0VGltZW91dChyZXNvbHZlLCAxMCkpOwoKICAgIH0KfQo%3D' \
--header 'ApiKey: <API Key>'


var request = require('request');
var options = {
  'method': 'GET',
  'url': 'https://api.ujeebu.com/scrape?url=https://ujeebu.com&response_type=screenshot&json=false&js=true&wait_for=CmFzeW5jICgpID0%2BIHsKICAgIHdoaWxlICh0cnVlKSB7CiAgICAgICAgLy8gc2VsZWN0IHRoZSBsb2FkZXIgYm94CiAgICAgICAgbGV0IGxvYWRlciA9IGRvY3VtZW50LnF1ZXJ5U2VsZWN0b3IoJy5sb2FkZXJCb3gnKTsKICAgICAgICBpZiAoIWxvYWRlcikgewogICAgICAgICAgICByZXR1cm47CiAgICAgICAgfQogICAgICAgIC8vIGdldCB0aGUgc3R5bGUgb2YgbG9hZGVyIGJveAogICAgICAgIGxldCBzdHlsZSA9IHdpbmRvdy5nZXRDb21wdXRlZFN0eWxlKGxvYWRlcik7CgogICAgICAgIC8vIHN0b3AgaWYgYm94IGlzIGhpZGRlbgogICAgICAgIGlmIChzdHlsZS5kaXNwbGF5ID09PSAnbm9uZScpIHsKICAgICAgICAgICAgcmV0dXJuOwogICAgICAgIH0KCiAgICAgICAgLy8gd2FpdCAxMG1zIHRvIHByZXZlbnQgYXBwIGZyb20gaGFuZ2luZwogICAgICAgIGF3YWl0IG5ldyBQcm9taXNlKHJlc29sdmUgPT4gc2V0VGltZW91dChyZXNvbHZlLCAxMCkpOwoKICAgIH0KfQo%3D',
  'headers': {
    'ApiKey': '<API Key>'
  }
};
request(options, function (error, response) {
  if (error) throw new Error(error);
  console.log(response.body);
});


import requests

url = "https://api.ujeebu.com/scrape?url=https://ujeebu.com&response_type=screenshot&json=false&js=true&wait_for=CmFzeW5jICgpID0%2BIHsKICAgIHdoaWxlICh0cnVlKSB7CiAgICAgICAgLy8gc2VsZWN0IHRoZSBsb2FkZXIgYm94CiAgICAgICAgbGV0IGxvYWRlciA9IGRvY3VtZW50LnF1ZXJ5U2VsZWN0b3IoJy5sb2FkZXJCb3gnKTsKICAgICAgICBpZiAoIWxvYWRlcikgewogICAgICAgICAgICByZXR1cm47CiAgICAgICAgfQogICAgICAgIC8vIGdldCB0aGUgc3R5bGUgb2YgbG9hZGVyIGJveAogICAgICAgIGxldCBzdHlsZSA9IHdpbmRvdy5nZXRDb21wdXRlZFN0eWxlKGxvYWRlcik7CgogICAgICAgIC8vIHN0b3AgaWYgYm94IGlzIGhpZGRlbgogICAgICAgIGlmIChzdHlsZS5kaXNwbGF5ID09PSAnbm9uZScpIHsKICAgICAgICAgICAgcmV0dXJuOwogICAgICAgIH0KCiAgICAgICAgLy8gd2FpdCAxMG1zIHRvIHByZXZlbnQgYXBwIGZyb20gaGFuZ2luZwogICAgICAgIGF3YWl0IG5ldyBQcm9taXNlKHJlc29sdmUgPT4gc2V0VGltZW91dChyZXNvbHZlLCAxMCkpOwoKICAgIH0KfQo%3D"

payload={}
headers = {
  'ApiKey': '<API Key>'
}

response = requests.request("GET", url, headers=headers, data=payload)

print(response.text)


OkHttpClient client = new OkHttpClient().newBuilder()
  .build();
Request request = new Request.Builder()
  .url("https://api.ujeebu.com/scrape?url=https://ujeebu.com&response_type=screenshot&json=false&js=true&wait_for=CmFzeW5jICgpID0%2BIHsKICAgIHdoaWxlICh0cnVlKSB7CiAgICAgICAgLy8gc2VsZWN0IHRoZSBsb2FkZXIgYm94CiAgICAgICAgbGV0IGxvYWRlciA9IGRvY3VtZW50LnF1ZXJ5U2VsZWN0b3IoJy5sb2FkZXJCb3gnKTsKICAgICAgICBpZiAoIWxvYWRlcikgewogICAgICAgICAgICByZXR1cm47CiAgICAgICAgfQogICAgICAgIC8vIGdldCB0aGUgc3R5bGUgb2YgbG9hZGVyIGJveAogICAgICAgIGxldCBzdHlsZSA9IHdpbmRvdy5nZXRDb21wdXRlZFN0eWxlKGxvYWRlcik7CgogICAgICAgIC8vIHN0b3AgaWYgYm94IGlzIGhpZGRlbgogICAgICAgIGlmIChzdHlsZS5kaXNwbGF5ID09PSAnbm9uZScpIHsKICAgICAgICAgICAgcmV0dXJuOwogICAgICAgIH0KCiAgICAgICAgLy8gd2FpdCAxMG1zIHRvIHByZXZlbnQgYXBwIGZyb20gaGFuZ2luZwogICAgICAgIGF3YWl0IG5ldyBQcm9taXNlKHJlc29sdmUgPT4gc2V0VGltZW91dChyZXNvbHZlLCAxMCkpOwoKICAgIH0KfQo%3D")
  .method("GET", null)
  .addHeader("ApiKey", "<API Key>")
  .build();
Response response = client.newCall(request).execute();


<?php

$curl = curl_init();

curl_setopt_array($curl, array(
  CURLOPT_URL => 'https://api.ujeebu.com/scrape?url=https://ujeebu.com&response_type=screenshot&json=false&js=true&wait_for=CmFzeW5jICgpID0%2BIHsKICAgIHdoaWxlICh0cnVlKSB7CiAgICAgICAgLy8gc2VsZWN0IHRoZSBsb2FkZXIgYm94CiAgICAgICAgbGV0IGxvYWRlciA9IGRvY3VtZW50LnF1ZXJ5U2VsZWN0b3IoJy5sb2FkZXJCb3gnKTsKICAgICAgICBpZiAoIWxvYWRlcikgewogICAgICAgICAgICByZXR1cm47CiAgICAgICAgfQogICAgICAgIC8vIGdldCB0aGUgc3R5bGUgb2YgbG9hZGVyIGJveAogICAgICAgIGxldCBzdHlsZSA9IHdpbmRvdy5nZXRDb21wdXRlZFN0eWxlKGxvYWRlcik7CgogICAgICAgIC8vIHN0b3AgaWYgYm94IGlzIGhpZGRlbgogICAgICAgIGlmIChzdHlsZS5kaXNwbGF5ID09PSAnbm9uZScpIHsKICAgICAgICAgICAgcmV0dXJuOwogICAgICAgIH0KCiAgICAgICAgLy8gd2FpdCAxMG1zIHRvIHByZXZlbnQgYXBwIGZyb20gaGFuZ2luZwogICAgICAgIGF3YWl0IG5ldyBQcm9taXNlKHJlc29sdmUgPT4gc2V0VGltZW91dChyZXNvbHZlLCAxMCkpOwoKICAgIH0KfQo%3D',
  CURLOPT_RETURNTRANSFER => true,
  CURLOPT_ENCODING => '',
  CURLOPT_MAXREDIRS => 10,
  CURLOPT_TIMEOUT => 0,
  CURLOPT_FOLLOWLOCATION => true,
  CURLOPT_HTTP_VERSION => CURL_HTTP_VERSION_1_1,
  CURLOPT_CUSTOMREQUEST => 'GET',
  CURLOPT_HTTPHEADER => array(
    'ApiKey: <API Key>'
  ),
));

$response = curl_exec($curl);

curl_close($curl);
echo $response;


package main

import (
  "fmt"
  "net/http"
  "io/ioutil"
)

func main() {

  url := "https://api.ujeebu.com/scrape?url=https://ujeebu.com&response_type=screenshot&json=false&js=true&wait_for=CmFzeW5jICgpID0%2BIHsKICAgIHdoaWxlICh0cnVlKSB7CiAgICAgICAgLy8gc2VsZWN0IHRoZSBsb2FkZXIgYm94CiAgICAgICAgbGV0IGxvYWRlciA9IGRvY3VtZW50LnF1ZXJ5U2VsZWN0b3IoJy5sb2FkZXJCb3gnKTsKICAgICAgICBpZiAoIWxvYWRlcikgewogICAgICAgICAgICByZXR1cm47CiAgICAgICAgfQogICAgICAgIC8vIGdldCB0aGUgc3R5bGUgb2YgbG9hZGVyIGJveAogICAgICAgIGxldCBzdHlsZSA9IHdpbmRvdy5nZXRDb21wdXRlZFN0eWxlKGxvYWRlcik7CgogICAgICAgIC8vIHN0b3AgaWYgYm94IGlzIGhpZGRlbgogICAgICAgIGlmIChzdHlsZS5kaXNwbGF5ID09PSAnbm9uZScpIHsKICAgICAgICAgICAgcmV0dXJuOwogICAgICAgIH0KCiAgICAgICAgLy8gd2FpdCAxMG1zIHRvIHByZXZlbnQgYXBwIGZyb20gaGFuZ2luZwogICAgICAgIGF3YWl0IG5ldyBQcm9taXNlKHJlc29sdmUgPT4gc2V0VGltZW91dChyZXNvbHZlLCAxMCkpOwoKICAgIH0KfQo%3D"
  method := "GET"

  client := &http.Client {
  }
  req, err := http.NewRequest(method, url, nil)

  if err != nil {
    fmt.Println(err)
    return
  }
  req.Header.Add("ApiKey", "<API Key>")

  res, err := client.Do(req)
  if err != nil {
    fmt.Println(err)
    return
  }
  defer res.Body.Close()

  body, err := ioutil.ReadAll(res.Body)
  if err != nil {
    fmt.Println(err)
    return
  }
  fmt.Println(string(body))
}

Running custom JavaScript

custom_js = string|base64-string

It is sometimes necessary to run custom JavaScript before rendering a page, for example to click a button that loads more items or to delete unwanted HTML elements. You can send custom JavaScript to be executed in the page context as a base64 string, as in the following example:

// base64 string of 'document.querySelector(".load-more").click()'
custom_js = 'ZG9jdW1lbnQucXVlcnlTZWxlY3RvcigiLmxvYWQtbW9yZSIpLmNsaWNrKCk=';

Note: Make sure to URL-encode the value if you're using GET
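Before sending an encoded script, it can help to decode it and confirm what will run in the page. A small sketch, with hypothetical helper names, that round-trips the custom_js value shown above:

```python
import base64


def encode_custom_js(js_source: str) -> str:
    """Base64-encode a script for the custom_js parameter."""
    return base64.b64encode(js_source.encode("utf-8")).decode("ascii")


def decode_custom_js(encoded: str) -> str:
    """Decode a custom_js value back to JavaScript source for inspection."""
    return base64.b64decode(encoded).decode("utf-8")


# The script from the example above.
LOAD_MORE_JS = 'document.querySelector(".load-more").click()'
```

Here `encode_custom_js(LOAD_MORE_JS)` reproduces the base64 string given in the example, and `decode_custom_js` reverses it.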

Downloading files

When the content type returned by the target URL is not HTML, text or image, the API will download files up to 2MB and return the binary file.

Downloading page as PDF

Set parameter response_type to 'pdf' and json to 'false' (default value) to generate a PDF file of an HTML page.

Taking a screenshot

Set parameter response_type to 'screenshot' and json to 'false' (default value) to return a screenshot of an HTML page.


curl --location --request GET 'https://api.ujeebu.com/scrape?url=https://ipinfo.io&response_type=screenshot&screenshot_fullpage=true&json=false&js=1' \
--header 'ApiKey: <API Key>'


var request = require('request');
var options = {
    'method': 'GET',
    'url': 'https://api.ujeebu.com/scrape?url=https://ipinfo.io&response_type=screenshot&screenshot_fullpage=true&json=false&js=1',
    'headers': {
        'ApiKey': '<API Key>'
    }
};
request(options, function (error, response) {
    if (error) throw new Error(error);
    console.log(response.body);
});


import requests

url = "https://api.ujeebu.com/scrape?url=https://ipinfo.io&response_type=screenshot&screenshot_fullpage=true&json=false&js=1"

payload={}
headers = {
  'ApiKey': '<API Key>'
}

response = requests.request("GET", url, headers=headers, data=payload)

print(response.text)


OkHttpClient client = new OkHttpClient().newBuilder()
  .build();
Request request = new Request.Builder()
  .url("https://api.ujeebu.com/scrape?url=https://ipinfo.io&response_type=screenshot&screenshot_fullpage=true&json=false&js=1")
  .method("GET", null)
  .addHeader("ApiKey", "<API Key>")
  .build();
Response response = client.newCall(request).execute();


<?php

$curl = curl_init();

curl_setopt_array($curl, array(
  CURLOPT_URL => 'https://api.ujeebu.com/scrape?url=https://ipinfo.io&response_type=screenshot&screenshot_fullpage=true&json=false&js=1',
  CURLOPT_RETURNTRANSFER => true,
  CURLOPT_ENCODING => '',
  CURLOPT_MAXREDIRS => 10,
  CURLOPT_TIMEOUT => 0,
  CURLOPT_FOLLOWLOCATION => true,
  CURLOPT_HTTP_VERSION => CURL_HTTP_VERSION_1_1,
  CURLOPT_CUSTOMREQUEST => 'GET',
  CURLOPT_HTTPHEADER => array(
    'ApiKey: <API Key>'
  ),
));

$response = curl_exec($curl);

curl_close($curl);
echo $response;


package main

import (
  "fmt"
  "net/http"
  "io/ioutil"
)

func main() {

  url := "https://api.ujeebu.com/scrape?url=https://ipinfo.io&response_type=screenshot&screenshot_fullpage=true&json=false&js=1"
  method := "GET"

  client := &http.Client {
  }
  req, err := http.NewRequest(method, url, nil)

  if err != nil {
    fmt.Println(err)
    return
  }
  req.Header.Add("ApiKey", "<API Key>")

  res, err := client.Do(req)
  if err != nil {
    fmt.Println(err)
    return
  }
  defer res.Body.Close()

  body, err := ioutil.ReadAll(res.Body)
  if err != nil {
    fmt.Println(err)
    return
  }
  fmt.Println(string(body))
}
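The examples above print the response body, but with json set to 'false' a screenshot response is binary PNG data, so writing it to a file is usually more useful. A minimal sketch, assuming the body has already been read as bytes; the function name `save_screenshot` is hypothetical.

```python
from pathlib import Path

# Every PNG file starts with this 8-byte signature.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def save_screenshot(body: bytes, path: str) -> int:
    """Write a screenshot response to `path` and return the number of bytes written.

    Raises ValueError when the body is not PNG data, for instance when the
    API returned a JSON error instead of an image.
    """
    if not body.startswith(PNG_SIGNATURE):
        raise ValueError("response body is not PNG data")
    return Path(path).write_bytes(body)
```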

You can take a partial screenshot by setting the parameter screenshot_partial to one of the following:

a valid selector of the element to screenshot
a valid JSON string with the coordinates x, y, width and height of the rectangle to screenshot
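For the coordinates form, the JSON string can be generated rather than written by hand. The sketch below uses the key names x, y, width and height given above; the helper names and the example values in the usage note are illustrative assumptions.

```python
import json
import urllib.parse


def partial_rect(x: int, y: int, width: int, height: int) -> str:
    """Serialize a rectangle as the JSON string expected by screenshot_partial."""
    return json.dumps({"x": x, "y": y, "width": width, "height": height})


def partial_screenshot_query(target_url: str, partial: str) -> str:
    """Build a URL-encoded query; `partial` is a selector or a partial_rect() string."""
    params = {
        "url": target_url,
        "response_type": "screenshot",
        "js": "1",
        "screenshot_partial": partial,
    }
    return urllib.parse.urlencode(params)
```

For example, `partial_screenshot_query("http://whatsmyuseragent.org/", partial_rect(0, 0, 400, 300))` yields a query string in which the JSON braces and quotes are percent-encoded.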


curl --location --request GET 'https://api.ujeebu.com/scrape?response_type=screenshot&js=1&url=http://whatsmyuseragent.org/&screenshot_partial=.intro-text' \
--header 'ApiKey: <API Key>'


var request = require('request');
var options = {
  'method': 'GET',
  'url': 'https://api.ujeebu.com/scrape?response_type=screenshot&js=1&url=http://whatsmyuseragent.org/&screenshot_partial=.intro-text',
  'headers': {
    'ApiKey': '<API Key>'
  }
};
request(options, function (error, response) {
  if (error) throw new Error(error);
  console.log(response.body);
});


import requests

url = "https://api.ujeebu.com/scrape?response_type=screenshot&js=1&url=http://whatsmyuseragent.org/&screenshot_partial=.intro-text"

payload={}
headers = {
  'ApiKey': '<API Key>'
}

response = requests.request("GET", url, headers=headers, data=payload)

print(response.text)


OkHttpClient client = new OkHttpClient().newBuilder()
  .build();
Request request = new Request.Builder()
  .url("https://api.ujeebu.com/scrape?response_type=screenshot&js=1&url=http://whatsmyuseragent.org/&screenshot_partial=.intro-text")
  .method("GET", null)
  .addHeader("ApiKey", "<API Key>")
  .build();
Response response = client.newCall(request).execute();


<?php

$curl = curl_init();

curl_setopt_array($curl, array(
  CURLOPT_URL => 'https://api.ujeebu.com/scrape?response_type=screenshot&js=1&url=http://whatsmyuseragent.org/&screenshot_partial=.intro-text',
  CURLOPT_RETURNTRANSFER => true,
  CURLOPT_ENCODING => '',
  CURLOPT_MAXREDIRS => 10,
  CURLOPT_TIMEOUT => 0,
  CURLOPT_FOLLOWLOCATION => true,
  CURLOPT_HTTP_VERSION => CURL_HTTP_VERSION_1_1,
  CURLOPT_CUSTOMREQUEST => 'GET',
  CURLOPT_HTTPHEADER => array(
    'ApiKey: <API Key>'
  ),
));

$response = curl_exec($curl);

curl_close($curl);
echo $response;


package main

import (
  "fmt"
  "net/http"
  "io/ioutil"
)

func main() {

  url := "https://api.ujeebu.com/scrape?response_type=screenshot&js=1&url=http://whatsmyuseragent.org/&screenshot_partial=.intro-text"
  method := "GET"

  client := &http.Client {
  }
  req, err := http.NewRequest(method, url, nil)

  if err != nil {
    fmt.Println(err)
    return
  }
  req.Header.Add("ApiKey", "<API Key>")

  res, err := client.Do(req)
  if err != nil {
    fmt.Println(err)
    return
  }
  defer res.Body.Close()

  body, err := ioutil.ReadAll(res.Body)
  if err != nil {
    fmt.Println(err)
    return
  }
  fmt.Println(string(body))
}

Rendering on a mobile device

Set parameter device to the name of the mobile device you would like to emulate. Below is a list of devices we currently support:

List of available mobile devices

Blackberry PlayBook
Blackberry PlayBook landscape
BlackBerry Z30
BlackBerry Z30 landscape
Galaxy Note 3
Galaxy Note 3 landscape
Galaxy Note II
Galaxy Note II landscape
Galaxy S III
Galaxy S III landscape
Galaxy S5
Galaxy S5 landscape
iPad
iPad landscape
iPad Mini
iPad Mini landscape
iPad Pro
iPad Pro landscape
iPhone 4
iPhone 4 landscape
iPhone 5
iPhone 5 landscape
iPhone 6
iPhone 6 landscape
iPhone 6 Plus
iPhone 6 Plus landscape
iPhone 7
iPhone 7 landscape
iPhone 7 Plus
iPhone 7 Plus landscape
iPhone 8
iPhone 8 landscape
iPhone 8 Plus
iPhone 8 Plus landscape
iPhone SE
iPhone SE landscape
iPhone X
iPhone X landscape
iPhone XR
iPhone XR landscape
JioPhone 2
JioPhone 2 landscape
Kindle Fire HDX
Kindle Fire HDX landscape
LG Optimus L70
LG Optimus L70 landscape
Microsoft Lumia 550
Microsoft Lumia 950
Microsoft Lumia 950 landscape
Nexus 10
Nexus 10 landscape
Nexus 4
Nexus 4 landscape
Nexus 5
Nexus 5 landscape
Nexus 5X
Nexus 5X landscape
Nexus 6
Nexus 6 landscape
Nexus 6P
Nexus 6P landscape
Nexus 7
Nexus 7 landscape
Nokia Lumia 520
Nokia Lumia 520 landscape
Nokia N9
Nokia N9 landscape
Pixel 2
Pixel 2 landscape
Pixel 2 XL
Pixel 2 XL landscape

To use a custom mobile device outside the list above, set parameter device to 'mobile', then specify your custom device viewport dimensions using window_width and window_height.
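Device names contain spaces, so they need URL-encoding in a GET request. The sketch below builds the query for either a named device or a custom viewport; the function name `device_query` and the dimensions in the usage note are arbitrary illustrations.

```python
import urllib.parse


def device_query(target_url: str, device: str, width=None, height=None) -> str:
    """Build a URL-encoded query that selects a device and optional viewport size."""
    params = {"url": target_url, "device": device}
    if width is not None:
        params["window_width"] = str(width)
    if height is not None:
        params["window_height"] = str(height)
    # quote() encodes spaces as %20 (the default quote_plus would emit '+').
    return urllib.parse.urlencode(params, quote_via=urllib.parse.quote)
```

`device_query("https://ujeebu.com", "iPhone 8")` encodes the device as `iPhone%208`, while `device_query("https://ujeebu.com", "mobile", 390, 844)` requests a custom viewport.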


curl --location --request GET 'https://api.ujeebu.com/scrape?url=https://ujeebu.com&response_type=screenshot&json=false&js=true&device=iPhone%208&wait_for=1000&screenshot_fullpage=true' \
--header 'ApiKey: <API Key>'


var request = require('request');
var options = {
  'method': 'GET',
  'url': 'https://api.ujeebu.com/scrape?url=https://ujeebu.com&response_type=screenshot&json=false&js=true&device=iPhone%208&wait_for=1000&screenshot_fullpage=true',
  'headers': {
    'ApiKey': '<API Key>'
  }
};
request(options, function (error, response) {
  if (error) throw new Error(error);
  console.log(response.body);
});


import requests

url = "https://api.ujeebu.com/scrape?url=https://ujeebu.com&response_type=screenshot&json=false&js=true&device=iPhone%208&wait_for=1000&screenshot_fullpage=true"

payload={}
headers = {
  'ApiKey': '<API Key>'
}

response = requests.request("GET", url, headers=headers, data=payload)

print(response.text)


OkHttpClient client = new OkHttpClient().newBuilder()
  .build();
Request request = new Request.Builder()
  .url("https://api.ujeebu.com/scrape?url=https://ujeebu.com&response_type=screenshot&json=false&js=true&device=iPhone%208&wait_for=1000&screenshot_fullpage=true")
  .method("GET", null)
  .addHeader("ApiKey", "<API Key>")
  .build();
Response response = client.newCall(request).execute();


<?php

$curl = curl_init();

curl_setopt_array($curl, array(
  CURLOPT_URL => 'https://api.ujeebu.com/scrape?url=https://ujeebu.com&response_type=screenshot&json=false&js=true&device=iPhone%208&wait_for=1000&screenshot_fullpage=true',
  CURLOPT_RETURNTRANSFER => true,
  CURLOPT_ENCODING => '',
  CURLOPT_MAXREDIRS => 10,
  CURLOPT_TIMEOUT => 0,
  CURLOPT_FOLLOWLOCATION => true,
  CURLOPT_HTTP_VERSION => CURL_HTTP_VERSION_1_1,
  CURLOPT_CUSTOMREQUEST => 'GET',
  CURLOPT_HTTPHEADER => array(
    'ApiKey: <API Key>'
  ),
));

$response = curl_exec($curl);

curl_close($curl);
echo $response;


package main

import (
  "fmt"
  "net/http"
  "io/ioutil"
)

func main() {

  url := "https://api.ujeebu.com/scrape?url=https://ujeebu.com&response_type=screenshot&json=false&js=true&device=iPhone%208&wait_for=1000&screenshot_fullpage=true"
  method := "GET"

  client := &http.Client {
  }
  req, err := http.NewRequest(method, url, nil)

  if err != nil {
    fmt.Println(err)
    return
  }
  req.Header.Add("ApiKey", "<API Key>")

  res, err := client.Do(req)
  if err != nil {
    fmt.Println(err)
    return
  }
  defer res.Body.Close()

  body, err := ioutil.ReadAll(res.Body)
  if err != nil {
    fmt.Println(err)
    return
  }
  fmt.Println(string(body))
}

Removing unwanted elements

Use parameter strip_tags to pass a comma-separated list of CSS selectors of the elements you would like to delete before the result is returned. In the example below we remove all 'meta', 'form' and 'input' tags, as well as any elements with class 'hidden':

strip_tags=meta,form,.hidden,input
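When the selectors live in a list, the parameter value can be assembled programmatically. A small sketch with a hypothetical helper; rejecting selectors that contain a comma is a precaution chosen here, since a comma inside a selector would be indistinguishable from a separator.

```python
def strip_tags_value(selectors) -> str:
    """Join CSS selectors into the comma-separated form used by strip_tags."""
    cleaned = [selector.strip() for selector in selectors if selector.strip()]
    for selector in cleaned:
        if "," in selector:
            raise ValueError(f"selector must not contain a comma: {selector!r}")
    return ",".join(cleaned)
```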

In the example below we take a screenshot of page 'http://whatsmyuseragent.org/' after removing the div that contains the IP address part:


curl --location --request GET 'https://api.ujeebu.com/scrape?url=http://whatsmyuseragent.org&response_type=screenshot&json=false&js=true&strip_tags=.ip-address&screenshot_fullpage=true' \
--header 'ApiKey: <API Key>'


var request = require('request');
var options = {
  'method': 'GET',
  'url': 'https://api.ujeebu.com/scrape?url=http://whatsmyuseragent.org&response_type=screenshot&json=false&js=true&strip_tags=.ip-address&screenshot_fullpage=true',
  'headers': {
    'ApiKey': '<API Key>'
  }
};
request(options, function (error, response) {
  if (error) throw new Error(error);
  console.log(response.body);
});


import requests

url = "https://api.ujeebu.com/scrape?url=http://whatsmyuseragent.org&response_type=screenshot&json=false&js=true&strip_tags=.ip-address&screenshot_fullpage=true"

payload={}
headers = {
  'ApiKey': '<API Key>'
}

response = requests.request("GET", url, headers=headers, data=payload)

print(response.text)


OkHttpClient client = new OkHttpClient().newBuilder()
  .build();
Request request = new Request.Builder()
  .url("https://api.ujeebu.com/scrape?url=http://whatsmyuseragent.org&response_type=screenshot&json=false&js=true&strip_tags=.ip-address&screenshot_fullpage=true")
  .method("GET", null)
  .addHeader("ApiKey", "<API Key>")
  .build();
Response response = client.newCall(request).execute();


<?php

$curl = curl_init();

curl_setopt_array($curl, array(
  CURLOPT_URL => 'https://api.ujeebu.com/scrape?url=http://whatsmyuseragent.org&response_type=screenshot&json=false&js=true&strip_tags=.ip-address&screenshot_fullpage=true',
  CURLOPT_RETURNTRANSFER => true,
  CURLOPT_ENCODING => '',
  CURLOPT_MAXREDIRS => 10,
  CURLOPT_TIMEOUT => 0,
  CURLOPT_FOLLOWLOCATION => true,
  CURLOPT_HTTP_VERSION => CURL_HTTP_VERSION_1_1,
  CURLOPT_CUSTOMREQUEST => 'GET',
  CURLOPT_HTTPHEADER => array(
    'ApiKey: <API Key>'
  ),
));

$response = curl_exec($curl);

curl_close($curl);
echo $response;


package main

import (
  "fmt"
  "net/http"
  "io/ioutil"
)

func main() {

  url := "https://api.ujeebu.com/scrape?url=http://whatsmyuseragent.org&response_type=screenshot&json=false&js=true&strip_tags=.ip-address&screenshot_fullpage=true"
  method := "GET"

  client := &http.Client {
  }
  req, err := http.NewRequest(method, url, nil)

  if err != nil {
    fmt.Println(err)
    return
  }
  req.Header.Add("ApiKey", "<API Key>")

  res, err := client.Do(req)
  if err != nil {
    fmt.Println(err)
    return
  }
  defer res.Body.Close()

  body, err := ioutil.ReadAll(res.Body)
  if err != nil {
    fmt.Println(err)
    return
  }
  fmt.Println(string(body))
}

Using a custom user agent

By default, the Scrape API sends its own user agent header (that of headless Chrome). To use a different user agent, set the useragent parameter:


curl --location --request GET 'https://api.ujeebu.com/scrape?url=http://whatsmyuseragent.org&response_type=screenshot&json=false&js=true&useragent=My%20custom%20user%20agent' \
--header 'ApiKey: <API Key>'


var request = require('request');
var options = {
  'method': 'GET',
  'url': 'https://api.ujeebu.com/scrape?url=http://whatsmyuseragent.org&response_type=screenshot&json=false&js=true&useragent=My%20custom%20user%20agent',
  'headers': {
    'ApiKey': '<API Key>'
  }
};
request(options, function (error, response) {
  if (error) throw new Error(error);
  console.log(response.body);
});


import requests

url = "https://api.ujeebu.com/scrape?url=http://whatsmyuseragent.org&response_type=screenshot&json=false&js=true&useragent=My%20custom%20user%20agent"

payload={}
headers = {
  'ApiKey': '<API Key>'
}

response = requests.request("GET", url, headers=headers, data=payload)

print(response.text)


OkHttpClient client = new OkHttpClient().newBuilder()
  .build();
Request request = new Request.Builder()
  .url("https://api.ujeebu.com/scrape?url=http://whatsmyuseragent.org&response_type=screenshot&json=false&js=true&useragent=My%20custom%20user%20agent")
  .method("GET", null)
  .addHeader("ApiKey", "<API Key>")
  .build();
Response response = client.newCall(request).execute();


<?php

$curl = curl_init();

curl_setopt_array($curl, array(
  CURLOPT_URL => 'https://api.ujeebu.com/scrape?url=http://whatsmyuseragent.org&response_type=screenshot&json=false&js=true&useragent=My%20custom%20user%20agent',
  CURLOPT_RETURNTRANSFER => true,
  CURLOPT_ENCODING => '',
  CURLOPT_MAXREDIRS => 10,
  CURLOPT_TIMEOUT => 0,
  CURLOPT_FOLLOWLOCATION => true,
  CURLOPT_HTTP_VERSION => CURL_HTTP_VERSION_1_1,
  CURLOPT_CUSTOMREQUEST => 'GET',
  CURLOPT_HTTPHEADER => array(
    'ApiKey: <API Key>'
  ),
));

$response = curl_exec($curl);

curl_close($curl);
echo $response;


package main

import (
  "fmt"
  "net/http"
  "io/ioutil"
)

func main() {

  url := "https://api.ujeebu.com/scrape?url=http://whatsmyuseragent.org&response_type=screenshot&json=false&js=true&useragent=My%20custom%20user%20agent"
  method := "GET"

  client := &http.Client {
  }
  req, err := http.NewRequest(method, url, nil)

  if err != nil {
    fmt.Println(err)
    return
  }
  req.Header.Add("ApiKey", "<API Key>")

  res, err := client.Do(req)
  if err != nil {
    fmt.Println(err)
    return
  }
  defer res.Body.Close()

  body, err := ioutil.ReadAll(res.Body)
  if err != nil {
    fmt.Println(err)
    return
  }
  fmt.Println(string(body))
}

Passing custom header/cookies

To forward custom headers to the target URL, send them with the prefix 'UJB-'. The following example forwards the headers 'Username' and 'Authorisation':

curl -i \
-H 'UJB-Username: ujeebu' \
-H 'UJB-Authorisation: Basic dXNlcm5dhsWU6cGFzc3dvcmQ=' \
-H 'ApiKey: <API Key>' \
-X GET \
'https://api.ujeebu.com/scrape?url=https://medium.com/personal-growth/how-to-be-yourself-2221085391a3'
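The prefixing convention can be applied mechanically to a dictionary of headers intended for the target site. This is a sketch; the function name `with_forwarded_headers` is hypothetical.

```python
FORWARD_PREFIX = "UJB-"


def with_forwarded_headers(api_key: str, target_headers: dict) -> dict:
    """Return request headers: the API key plus each target header prefixed with UJB-."""
    headers = {"ApiKey": api_key}
    for name, value in target_headers.items():
        headers[FORWARD_PREFIX + name] = value
    return headers
```

The resulting dictionary can be passed as the `headers` argument of whichever HTTP client is in use.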

To send cookies to a target URL, use parameter cookies to pass a list of semi-colon separated 'CookieName=CookieValue' values. e.g.: Cookie1=Value1;Cookie2=Value2
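The cookie string can be built from a dictionary, as in the sketch below. The helper name is hypothetical, and rejecting names or values that contain the separator characters is a precaution chosen here rather than a documented rule.

```python
def cookies_value(cookies: dict) -> str:
    """Format cookies as 'Name1=Value1;Name2=Value2' for the cookies parameter."""
    for name, value in cookies.items():
        if ";" in name or "=" in name or ";" in value:
            raise ValueError(f"cookie {name!r} contains a separator character")
    return ";".join(f"{name}={value}" for name, value in cookies.items())
```

As with other parameters, the resulting string should be URL-encoded when it is sent in a GET query.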


curl --location --request GET 'https://api.ujeebu.com/scrape?url=https://ipinfo.io&response_type=raw&json=false&js=1&cookies=sessionId=cffbc7e688864b6811f676e181bc29e6;userToken=d6a1e2453d44f89' \
--header 'UJB-CUSTOM-HEADER: <CUSTOM-HEADER-VALUE>' \
--header 'ApiKey: <API Key>'


var request = require('request');
var options = {
  'method': 'GET',
  'url': 'https://api.ujeebu.com/scrape?url=https://ipinfo.io&response_type=raw&json=false&js=1&cookies=sessionId=cffbc7e688864b6811f676e181bc29e6;userToken=d6a1e2453d44f89',
  'headers': {
    'ApiKey': '<API Key>',
    'UJB-CUSTOM-HEADER': '<CUSTOM-HEADER-VALUE>'
  }
};
request(options, function (error, response) {
  if (error) throw new Error(error);
  console.log(response.body);
});


import requests

url = "https://api.ujeebu.com/scrape?url=https://ipinfo.io&response_type=raw&json=false&js=1&cookies=sessionId=cffbc7e688864b6811f676e181bc29e6;userToken=d6a1e2453d44f89"

payload={}
headers = {
  'ApiKey': '<API Key>',
  'UJB-CUSTOM-HEADER': '<CUSTOM-HEADER-VALUE>'
}

response = requests.request("GET", url, headers=headers, data=payload)

print(response.text)


OkHttpClient client = new OkHttpClient().newBuilder()
  .build();
Request request = new Request.Builder()
  .url("https://api.ujeebu.com/scrape?url=https://ipinfo.io&response_type=raw&json=false&js=1&cookies=sessionId=cffbc7e688864b6811f676e181bc29e6;userToken=d6a1e2453d44f89")
  .method("GET", null)
  .addHeader("UJB-CUSTOM-HEADER", "<CUSTOM-HEADER-VALUE>")
  .addHeader("ApiKey", "<API Key>")
  .build();
Response response = client.newCall(request).execute();


<?php

$curl = curl_init();

curl_setopt_array($curl, array(
  CURLOPT_URL => 'https://api.ujeebu.com/scrape?url=https://ipinfo.io&response_type=raw&json=false&js=1&cookies=sessionId=cffbc7e688864b6811f676e181bc29e6;userToken=d6a1e2453d44f89',
  CURLOPT_RETURNTRANSFER => true,
  CURLOPT_ENCODING => '',
  CURLOPT_MAXREDIRS => 10,
  CURLOPT_TIMEOUT => 0,
  CURLOPT_FOLLOWLOCATION => true,
  CURLOPT_HTTP_VERSION => CURL_HTTP_VERSION_1_1,
  CURLOPT_CUSTOMREQUEST => 'GET',
  CURLOPT_HTTPHEADER => array(
    'ApiKey: <API Key>',
    'UJB-CUSTOM-HEADER: <CUSTOM-HEADER-VALUE>'
  ),
));

$response = curl_exec($curl);

curl_close($curl);
echo $response;


package main

import (
  "fmt"
  "net/http"
  "io/ioutil"
)

func main() {

  url := "https://api.ujeebu.com/scrape?url=https://ipinfo.io&response_type=raw&json=false&js=1&cookies=sessionId=cffbc7e688864b6811f676e181bc29e6;userToken=d6a1e2453d44f89"
  method := "GET"

  client := &http.Client {
  }
  req, err := http.NewRequest(method, url, nil)

  if err != nil {
    fmt.Println(err)
    return
  }
  req.Header.Add("UJB-CUSTOM-HEADER", "<CUSTOM-HEADER-VALUE>")
  req.Header.Add("ApiKey", "<API Key>")

  res, err := client.Do(req)
  if err != nil {
    fmt.Println(err)
    return
  }
  defer res.Body.Close()

  body, err := ioutil.ReadAll(res.Body)
  if err != nil {
    fmt.Println(err)
    return
  }
  fmt.Println(string(body))
}
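
The cookies value used in the examples above can also be assembled from a dictionary rather than typed by hand. The following is a minimal sketch, not an official client: the helper names format_cookies and build_scrape_url are hypothetical, and it only builds the request URL without sending it.

```python
from urllib.parse import urlencode

API_ENDPOINT = "https://api.ujeebu.com/scrape"


def format_cookies(cookies):
    """Join a dict of cookies into 'Name1=Value1;Name2=Value2'."""
    return ";".join(f"{name}={value}" for name, value in cookies.items())


def build_scrape_url(target_url, cookies):
    """Build the GET URL; urlencode percent-encodes ';' and '=' in the cookie string."""
    params = {
        "url": target_url,
        "response_type": "raw",
        "cookies": format_cookies(cookies),
    }
    return f"{API_ENDPOINT}?{urlencode(params)}"


cookies = {"Cookie1": "Value1", "Cookie2": "Value2"}
print(format_cookies(cookies))
print(build_scrape_url("https://ipinfo.io", cookies))
```

Percent-encoding the cookie string keeps its '=' and ';' characters from being confused with the API's own query-string separators.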

Transparent Header

When response_type is html, pdf or screenshot, the endpoint forwards to the target URL all headers prefixed with UJB-. When response_type is raw, the headers are forwarded as they are, with no prefix.

Using Proxies

We realize your scraping activities might be blocked once in a while. To help you achieve the most success, we developed a multi-tiered proxy offering which lets you select the proxy type that best fits your needs.

Your API calls go through our rotating proxy by default. The default proxy uses top IPs that will get the job done most of the time. If the default rotating proxy works well for your needs, there is nothing to change. For tougher URLs, set proxy_type to one of the following options:

rotating: Default.
advanced: US IPs only.
premium: US IPs only. Premium proxies which work well with social media and shopping sites.
residential: Geo-targeted residential IPs which work on "tough" sites that aren't accessible with the other options. Please note that data scraped via non-US residential IPs is currently metered once a request exceeds 1MB. Keep in mind that all assets associated with an HTML page count toward this total, not just the HTML itself. Meanwhile, US residential IPs are not metered. Please refer to the Credits section for more details.
custom: Set your own proxy. See the custom proxy section.

tip

We won't bill for failing requests that aren't 404s.

info

The length of a request also includes assets downloaded with the page when JS rendering is on.

info

To use a premium proxy from a specific country, set the parameter proxy_country to the ISO 3166-1 alpha-2 code of one of the following countries:

Supported countries

Algeria: DZ
Angola: AO
Benin: BJ
Botswana: BW
Burkina Faso: BF
Burundi: BI
Cameroon: CM
Central African Republic: CF
Chad: TD
Democratic Republic of the Congo: CD
Djibouti: DJ
Egypt: EG
Equatorial Guinea: GQ
Eritrea: ER
Ethiopia: ET
Gabon: GA
Gambia: GM
Ghana: GH
Guinea: GN
Guinea Bissau: GW
Ivory Coast: CI
Kenya: KE
Lesotho: LS
Liberia: LR
Libya: LY
Madagascar: MG
Malawi: MW
Mali: ML
Mauritania: MR
Morocco: MA
Mozambique: MZ
Namibia: NA
Niger: NE
Nigeria: NG
Republic of the Congo: CG
Rwanda: RW
Senegal: SN
Sierra Leone: SL
Somalia: SO
Somaliland: ML
South Africa: ZA
South Sudan: SS
Sudan: SD
Swaziland: SZ
Tanzania: TZ
Togo: TG
Tunisia: TN
Uganda: UG
Western Sahara: EH
Zambia: ZM
Zimbabwe: ZW
Afghanistan: AF
Armenia: AM
Azerbaijan: AZ
Bangladesh: BD
Bhutan: BT
Brunei: BN
Cambodia: KH
China: CN
East Timor: TL
Hong Kong: HK
India: IN
Indonesia: ID
Iran: IR
Iraq: IQ
Israel: IL
Japan: JP
Jordan: JO
Kazakhstan: KZ
Kuwait: KW
Kyrgyzstan: KG
Laos: LA
Lebanon: LB
Malaysia: MY
Maldives: MV
Mongolia: MN
Myanmar: MM
Nepal: NP
North Korea: KP
Oman: OM
Pakistan: PK
Palestine: PS
Philippines: PH
Qatar: QA
Saudi Arabia: SA
Singapore: SG
South Korea: KR
Sri Lanka: LK
Syria: SY
Taiwan: TW
Tajikistan: TJ
Thailand: TH
Turkey: TR
Turkmenistan: TM
United Arab Emirates: AE
Uzbekistan: UZ
Vietnam: VN
Yemen: YE
Albania: AL
Andorra: AD
Austria: AT
Belarus: BY
Belgium: BE
Bosnia and Herzegovina: BA
Bulgaria: BG
Croatia: HR
Cyprus: CY
Czech Republic: CZ
Denmark: DK
Estonia: EE
Finland: FI
France: FR
Germany: DE
Gibraltar: GI
Greece: GR
Hungary: HU
Iceland: IS
Ireland: IE
Italy: IT
Kosovo: XK
Latvia: LV
Liechtenstein: LI
Lithuania: LT
Luxembourg: LU
Macedonia: MK
Malta: MT
Moldova: MD
Monaco: MC
Montenegro: ME
Netherlands: NL
Northern Cyprus: CY
Norway: NO
Poland: PL
Portugal: PT
Romania: RO
Russia: RU
San Marino: SM
Serbia: RS
Slovakia: SK
Slovenia: SI
Spain: ES
Sweden: SE
Switzerland: CH
Ukraine: UA
United Kingdom: GB
Bahamas: BS
Belize: BZ
Bermuda: BM
Canada: CA
Costa Rica: CR
Cuba: CU
Dominican Republic: DO
El Salvador: SV
Greenland: GL
Guatemala: GT
Haiti: HT
Honduras: HN
Jamaica: JM
Nicaragua: NI
Panama: PA
Puerto Rico: PR
Trinidad And Tobago: TT
United States: US
Australia: AU
Fiji: FJ
New Caledonia: NC
New Zealand: NZ
Papua New Guinea: PG
Solomon Islands: SB
Vanuatu: VU
Argentina: AR
Bolivia: BO
Brazil: BR
Chile: CL
Colombia: CO
Ecuador: EC
Falkland Islands: FK
French Guiana: GF
Guyana: GY
Mexico: MX
Paraguay: PY
Peru: PE
Suriname: SR
Uruguay: UY
Venezuela: VE
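
To illustrate how proxy_type and proxy_country fit into a request, here is a minimal sketch that builds the query string and rejects values not described above. The helper name build_proxy_query is hypothetical, the two-letter check is only a client-side sanity test (it does not verify that the country is in the supported list), and nothing is sent over the network.

```python
from urllib.parse import urlencode

API_ENDPOINT = "https://api.ujeebu.com/scrape"
# Proxy types listed in this section.
PROXY_TYPES = {"rotating", "advanced", "premium", "residential", "custom"}


def build_proxy_query(target_url, proxy_type="rotating", proxy_country=None):
    """Return a scrape URL carrying the chosen proxy options."""
    if proxy_type not in PROXY_TYPES:
        raise ValueError(f"unknown proxy_type: {proxy_type!r}")
    params = {"url": target_url, "proxy_type": proxy_type}
    if proxy_country is not None:
        code = proxy_country.upper()
        # ISO 3166-1 alpha-2 codes are exactly two letters.
        if len(code) != 2 or not code.isalpha():
            raise ValueError(f"not a two-letter country code: {proxy_country!r}")
        params["proxy_country"] = code
    return f"{API_ENDPOINT}?{urlencode(params)}"


print(build_proxy_query("https://ipinfo.io", "premium", "fr"))
```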

Using Ujeebu Scrape with your own proxy

To use your own proxy, set the proxy_type parameter to custom, then set the custom_proxy parameter to your proxy in the following format: scheme://host:port. Set the proxy credentials using the custom_proxy_username and custom_proxy_password parameters.

info

If you're using the GET HTTP verb and custom_proxy_password contains special characters, URL-encode it before passing it.

curl -i \
-H 'ApiKey: <API Key>' \
-X GET \
'https://api.ujeebu.com/scrape?url=https://ipinfo.io&response_type=raw&proxy_type=custom&custom_proxy=http://proxyhost:8889&custom_proxy_username=user&custom_proxy_password=pass'
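
The URL-encoding advice above can be handled by letting a standard library build the query string. The sketch below uses Python's urllib.parse.urlencode, which percent-encodes every parameter value, including a password containing characters such as '@', '&' or '='. The helper name build_custom_proxy_query and the sample credentials are made up for illustration.

```python
from urllib.parse import urlencode

API_ENDPOINT = "https://api.ujeebu.com/scrape"


def build_custom_proxy_query(target_url, proxy, username, password):
    """Build a scrape URL that routes the request through a custom proxy."""
    params = {
        "url": target_url,
        "response_type": "raw",
        "proxy_type": "custom",
        "custom_proxy": proxy,  # scheme://host:port
        "custom_proxy_username": username,
        "custom_proxy_password": password,  # encoded by urlencode below
    }
    return f"{API_ENDPOINT}?{urlencode(params)}"


print(build_custom_proxy_query(
    "https://ipinfo.io", "http://proxyhost:8889", "user", "p@ss&word=1"
))
```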

Routing all your requests through the same proxy instance

To route multiple requests through the same proxy instance (IP), set session_id to a unique identifier. All requests with the same session_id are directed through the same IP address.

curl -i \
-H 'ApiKey: <API Key>' \
-X GET \
'https://api.ujeebu.com/scrape?url=https://ipinfo.io&response_type=raw&proxy_type=premium&session_id=bd3400dev9fo'
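
As a rough illustration of session reuse, the sketch below generates one identifier and attaches it to several request URLs so that they share a session_id. The identifier format is an assumption: this page does not state any constraint on session_id, so a 12-character hexadecimal string is used here simply because it resembles the example above. The helper names are hypothetical and no request is sent.

```python
import uuid
from urllib.parse import urlencode

API_ENDPOINT = "https://api.ujeebu.com/scrape"


def new_session_id():
    """Return a random 12-character identifier (format is an assumption)."""
    return uuid.uuid4().hex[:12]


def build_session_urls(target_urls, session_id, proxy_type="premium"):
    """Build one scrape URL per target, all sharing the same session_id."""
    return [
        f"{API_ENDPOINT}?" + urlencode({
            "url": target,
            "response_type": "raw",
            "proxy_type": proxy_type,
            "session_id": session_id,
        })
        for target in target_urls
    ]


session = new_session_id()
for url in build_session_urls(["https://ipinfo.io", "https://ipinfo.io/json"], session):
    print(url)
```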

Passing POST/PUT data

You can use the http_method parameter to request a web page via either POST or PUT. If you need to send data in the request body, use the post_data parameter to specify the data to forward.

info

You can specify the type of forwarded POST/PUT data in post_data by using a custom header:

To forward JSON data, send a valid JSON value in post_data and set the header UJB-Content-Type: application/json
To forward form data, send your form-encoded data in post_data and set the header UJB-Content-Type: application/x-www-form-urlencoded
multipart/form-data (sending files) is not supported.
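
The two supported body types can be prepared along these lines. This is a minimal sketch under the rules listed above: it serializes a dictionary either as JSON or as form-encoded data, picks the matching UJB-Content-Type header, and URL-encodes the result into post_data. The helper name build_forwarded_request is hypothetical, and the request is only built, not sent.

```python
import json
from urllib.parse import parse_qs, urlencode, urlsplit

API_ENDPOINT = "https://api.ujeebu.com/scrape"


def build_forwarded_request(target_url, data, as_json=True, http_method="POST"):
    """Return (url, headers) for forwarding `data` to target_url via POST or PUT."""
    if http_method not in ("POST", "PUT"):
        raise ValueError("http_method must be POST or PUT")
    if as_json:
        post_data = json.dumps(data)
        content_type = "application/json"
    else:
        post_data = urlencode(data)
        content_type = "application/x-www-form-urlencoded"
    params = {
        "url": target_url,
        "response_type": "raw",
        "js": "false",
        "http_method": http_method,
        "post_data": post_data,  # percent-encoded by urlencode below
    }
    headers = {"UJB-Content-Type": content_type, "ApiKey": "<API Key>"}
    return f"{API_ENDPOINT}?{urlencode(params)}", headers


url, headers = build_forwarded_request(
    "https://jsonplaceholder.typicode.com/posts", {"title": "hello title"}
)
print(url)
print(headers["UJB-Content-Type"])
# Decoding the query string recovers the original JSON body.
print(parse_qs(urlsplit(url).query)["post_data"][0])
```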

Passing POST data


curl --location -g --request GET 'https://api.ujeebu.com/scrape?url=https://jsonplaceholder.typicode.com/posts&response_type=raw&json=false&js=false&http_method=POST&post_data={"title": "hello title"}' \
--header 'UJB-Content-Type: application/json' \
--header 'ApiKey: <API Key>'


var request = require('request');
var options = {
  'method': 'GET',
  'url': 'https://api.ujeebu.com/scrape?url=https://jsonplaceholder.typicode.com/posts&response_type=raw&json=false&js=false&http_method=POST&post_data={"title": "hello title"}',
  'headers': {
    'UJB-Content-Type': 'application/json',
    'ApiKey': '<API Key>'
  }
};
request(options, function (error, response) {
  if (error) throw new Error(error);
  console.log(response.body);
});


import requests

url = "https://api.ujeebu.com/scrape?url=https://jsonplaceholder.typicode.com/posts&response_type=raw&json=false&js=false&http_method=POST&post_data={\"title\": \"hello title\"}"

payload={}
headers = {
  'UJB-Content-Type': 'application/json',
  'ApiKey': '<API Key>'
}

response = requests.request("GET", url, headers=headers, data=payload)

print(response.text)


OkHttpClient client = new OkHttpClient().newBuilder()
  .build();
Request request = new Request.Builder()
  .url("https://api.ujeebu.com/scrape?url=https://jsonplaceholder.typicode.com/posts&response_type=raw&json=false&js=false&http_method=POST&post_data={\"title\": \"hello title\"}")
  .method("GET", null)
  .addHeader("UJB-Content-Type", "application/json")
  .addHeader("ApiKey", "<API Key>")
  .build();
Response response = client.newCall(request).execute();


<?php

$curl = curl_init();

curl_setopt_array($curl, array(
  CURLOPT_URL => 'https://api.ujeebu.com/scrape?url=https://jsonplaceholder.typicode.com/posts&response_type=raw&json=false&js=false&http_method=POST&post_data=%7B%22title%22:%20%22hello%20title%22%7D',
  CURLOPT_RETURNTRANSFER => true,
  CURLOPT_ENCODING => '',
  CURLOPT_MAXREDIRS => 10,
  CURLOPT_TIMEOUT => 0,
  CURLOPT_FOLLOWLOCATION => true,
  CURLOPT_HTTP_VERSION => CURL_HTTP_VERSION_1_1,
  CURLOPT_CUSTOMREQUEST => 'GET',
  CURLOPT_HTTPHEADER => array(
    'UJB-Content-Type: application/json',
    'ApiKey: <API Key>'
  ),
));

$response = curl_exec($curl);

curl_close($curl);
echo $response;


package main

import (
  "fmt"
  "net/http"
  "io/ioutil"
)

func main() {

  url := "https://api.ujeebu.com/scrape?url=https://jsonplaceholder.typicode.com/posts&response_type=raw&json=false&js=false&http_method=POST&post_data=%7B%22title%22:%20%22hello%20title%22%7D"
  method := "GET"

  client := &http.Client {
  }
  req, err := http.NewRequest(method, url, nil)

  if err != nil {
    fmt.Println(err)
    return
  }
  req.Header.Add("UJB-Content-Type", "application/json")
  req.Header.Add("ApiKey", "<API Key>")

  res, err := client.Do(req)
  if err != nil {
    fmt.Println(err)
    return
  }
  defer res.Body.Close()

  body, err := ioutil.ReadAll(res.Body)
  if err != nil {
    fmt.Println(err)
    return
  }
  fmt.Println(string(body))
}

Response produced by code above:


{
    "title": "hello title",
    "id": 101
}

Passing PUT data


curl --location -g --request GET 'https://api.ujeebu.com/scrape?url=https://jsonplaceholder.typicode.com/posts/1&response_type=raw&json=false&js=false&http_method=PUT&post_data={"title": "put title"}' \
--header 'UJB-Content-Type: application/json' \
--header 'ApiKey: <API Key>'


var request = require('request');
var options = {
  'method': 'GET',
  'url': 'https://api.ujeebu.com/scrape?url=https://jsonplaceholder.typicode.com/posts/1&response_type=raw&json=false&js=false&http_method=PUT&post_data={"title": "put title"}',
  'headers': {
    'UJB-Content-Type': 'application/json',
    'ApiKey': '<API Key>'
  }
};
request(options, function (error, response) {
  if (error) throw new Error(error);
  console.log(response.body);
});


import requests

url = "https://api.ujeebu.com/scrape?url=https://jsonplaceholder.typicode.com/posts/1&response_type=raw&json=false&js=false&http_method=PUT&post_data={\"title\": \"put title\"}"

payload={}
headers = {
  'UJB-Content-Type': 'application/json',
  'ApiKey': '<API Key>'
}

response = requests.request("GET", url, headers=headers, data=payload)

print(response.text)


OkHttpClient client = new OkHttpClient().newBuilder()
  .build();
Request request = new Request.Builder()
  .url("https://api.ujeebu.com/scrape?url=https://jsonplaceholder.typicode.com/posts/1&response_type=raw&json=false&js=false&http_method=PUT&post_data={\"title\": \"put title\"}")
  .method("GET", null)
  .addHeader("UJB-Content-Type", "application/json")
  .addHeader("ApiKey", "<API Key>")
  .build();
Response response = client.newCall(request).execute();


<?php

$curl = curl_init();

curl_setopt_array($curl, array(
  CURLOPT_URL => 'https://api.ujeebu.com/scrape?url=https://jsonplaceholder.typicode.com/posts/1&response_type=raw&json=false&js=false&http_method=PUT&post_data=%7B%22title%22:%20%22put%20title%22%7D',
  CURLOPT_RETURNTRANSFER => true,
  CURLOPT_ENCODING => '',
  CURLOPT_MAXREDIRS => 10,
  CURLOPT_TIMEOUT => 0,
  CURLOPT_FOLLOWLOCATION => true,
  CURLOPT_HTTP_VERSION => CURL_HTTP_VERSION_1_1,
  CURLOPT_CUSTOMREQUEST => 'GET',
  CURLOPT_HTTPHEADER => array(
    'UJB-Content-Type: application/json',
    'ApiKey: <API Key>'
  ),
));

$response = curl_exec($curl);

curl_close($curl);
echo $response;


package main

import (
  "fmt"
  "net/http"
  "io/ioutil"
)

func main() {

  url := "https://api.ujeebu.com/scrape?url=https://jsonplaceholder.typicode.com/posts/1&response_type=raw&json=false&js=false&http_method=PUT&post_data=%7B%22title%22:%20%22put%20title%22%7D"
  method := "GET"

  client := &http.Client {
  }
  req, err := http.NewRequest(method, url, nil)

  if err != nil {
    fmt.Println(err)
    return
  }
  req.Header.Add("UJB-Content-Type", "application/json")
  req.Header.Add("ApiKey", "<API Key>")

  res, err := client.Do(req)
  if err != nil {
    fmt.Println(err)
    return
  }
  defer res.Body.Close()

  body, err := ioutil.ReadAll(res.Body)
  if err != nil {
    fmt.Println(err)
    return
  }
  fmt.Println(string(body))
}

Response produced by code above:


{
    "title": "put title",
    "id": 1
}

Scrolling down

By default Ujeebu Scrape does not scroll after executing JavaScript.

Set parameter scroll_down to 'true' to scroll to the end of the page before returning a response.

Scroll conditions

The scroll script will continuously scroll until one of the following conditions is satisfied:

The scroll_callback parameter's function (if provided) returns 'false':

// scroll until the .load-more button disappears
() => document.querySelector(".load-more") !== null;

The height of the page doesn't change after two consecutive scrolls
The URL of the page changes (this will throw an error, disable scrolling and re-run the scrape request).

Scroll behavior

By default, the API's scroll script scrolls down to the end of the page. This behavior can be customized using the parameter scroll_to_selector, which defines the selector of the element to scroll to.

The parameter scroll_wait defines the time in milliseconds to wait between two scrolls (default = 50).
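
The scroll parameters described above could be combined into a POST body as sketched below. Two points are assumptions rather than documented facts: that scroll_callback can be passed as a plain JavaScript string in the JSON body, and that these parameters are accepted in a JSON body like the other parameters of the POST endpoint. The helper name build_scroll_payload is hypothetical.

```python
import json


def build_scroll_payload(target_url, callback_js=None, to_selector=None, wait_ms=50):
    """Return a JSON request body that enables scrolling for target_url."""
    payload = {
        "url": target_url,
        "js": True,            # scrolling only applies when JS is enabled
        "scroll_down": True,
        "scroll_wait": wait_ms,
    }
    if callback_js is not None:
        # Assumption: the callback is sent as a plain JavaScript string.
        payload["scroll_callback"] = callback_js
    if to_selector is not None:
        payload["scroll_to_selector"] = to_selector
    return json.dumps(payload)


print(build_scroll_payload(
    "https://scrape.li/quotes",
    callback_js='() => document.querySelector(".load-more") !== null',
    wait_ms=100,
))
```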

Blocking Ads

Ad blocking is disabled by default. To enable it, set parameter block_ads to 'true'.

Returning output as JSON

By default, the API returns a response matching the given response_type parameter:

'pdf' will return 'application/pdf'
'screenshot' will return 'image/png'
'html' and 'raw' will return 'text/html'

If parameter json is set to 'true' the API will return a JSON response in all cases.
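
A client may want to check that the Content-Type it receives matches what it asked for. The sketch below encodes the mapping listed above. The 'application/json' value used when json is 'true' is an assumption, since this page only says that a JSON response is returned. The function name expected_content_type is hypothetical.

```python
# Content types per response_type, as listed in this section.
CONTENT_TYPES = {
    "pdf": "application/pdf",
    "screenshot": "image/png",
    "html": "text/html",
    "raw": "text/html",
}


def expected_content_type(response_type="html", json_output=False):
    """Return the Content-Type a caller should expect for the given options."""
    if response_type not in CONTENT_TYPES:
        raise ValueError(f"unknown response_type: {response_type!r}")
    if json_output:
        # Assumption: JSON responses are served as application/json.
        return "application/json"
    return CONTENT_TYPES[response_type]


print(expected_content_type("screenshot"))
print(expected_content_type("pdf", json_output=True))
```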

Scraping Data

extract_rules = json string

To extract specific data from a page, add extraction rules to your API call. The simplest way to use extract rules is through the following format:

{
  "key_name": {
    "selector": "css_selector",
    "type": "rule_type" // 'text', 'link', 'image', 'attr', or 'obj'
  }
}

There are 5 types of rules:

'text' returns the text content of the matched element.
'link' returns the href attribute of the matched element if it is an a tag.
'image' returns the src attribute of the matched element if it is an img tag.

'attr' returns the specified attribute of the matched element, e.g.:

{
  "rule_name": {
    "selector": "meta[name=description]",
    "type": "attr",
    "attribute": "content"
  }
}

'obj' returns an object, or an array of objects if 'multiple' is true, built from the rules listed under 'children'. This is useful for nested element extraction.

{
  "rule_name": {
    "selector": "article.card-item",
    "type": "obj",
    "children": {
      "title": {
        "selector": "h1",
        "type": "text"
      },
      "link": {
        "selector": "a",
        "type": "link"
      }
    }
  }
}
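
Before sending extract_rules, it can help to check them locally against the five rule types described above. The following is a minimal client-side sketch, not the API's own validation: it assumes that every rule needs a 'selector', that 'attr' rules need an 'attribute', and that 'obj' rules need 'children', which is how the examples above are written. The function name validate_rules is hypothetical.

```python
RULE_TYPES = {"text", "link", "image", "attr", "obj"}


def validate_rules(rules, path=""):
    """Return a list of problems found in an extract_rules dict (empty if none)."""
    problems = []
    for name, rule in rules.items():
        where = f"{path}{name}"
        if not rule.get("selector"):
            problems.append(f"{where}: missing 'selector'")
        rule_type = rule.get("type")
        if rule_type not in RULE_TYPES:
            problems.append(f"{where}: unknown type {rule_type!r}")
        elif rule_type == "attr" and not rule.get("attribute"):
            problems.append(f"{where}: 'attr' rules need an 'attribute'")
        elif rule_type == "obj":
            children = rule.get("children")
            if not children:
                problems.append(f"{where}: 'obj' rules need 'children'")
            else:
                # Check nested rules, prefixing their names with the parent path.
                problems.extend(validate_rules(children, path=f"{where}."))
    return problems


rules = {
    "card": {
        "selector": "article.card-item",
        "type": "obj",
        "children": {
            "title": {"selector": "h1", "type": "text"},
            "link": {"selector": "a", "type": "href"},  # deliberately invalid type
        },
    }
}
print(validate_rules(rules))
```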

The following extracts the displayed user agent value from URL 'whatsmyuseragent.org':

{
  "user-agent": {
    "selector": ".user-agent .intro-text",
    "type": "text"
  }
}


curl --location --request POST 'https://api.ujeebu.com/scrape' \
--header 'ApiKey: <API Key>' \
--header 'Content-Type: application/json' \
--data-raw '{
    "url": "http://whatsmyuseragent.org",
    "js": true,
    "json": false,
    "response_type": "html",
    "extract_rules": {
        "user-agent": {
            "selector": ".user-agent .intro-text",
            "type": "text"
        }
    }
}'


var request = require('request');
var options = {
  'method': 'POST',
  'url': 'https://api.ujeebu.com/scrape',
  'headers': {
    'ApiKey': '<API Key>',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    "url": "http://whatsmyuseragent.org",
    "js": true,
    "response_type": "json",
    "extract_rules": {
      "user-agent": {
        "selector": ".user-agent .intro-text",
        "type": "text"
      }
    }
  })

};
request(options, function (error, response) {
  if (error) throw new Error(error);
  console.log(response.body);
});


import requests
import json

url = "https://api.ujeebu.com/scrape"

payload = json.dumps({
  "url": "http://whatsmyuseragent.org",
  "js": True,
  "response_type": "json",
  "extract_rules": {
    "user-agent": {
      "selector": ".user-agent .intro-text",
      "type": "text"
    }
  }
})
headers = {
  'ApiKey': '<API Key>',
  'Content-Type': 'application/json'
}

response = requests.request("POST", url, headers=headers, data=payload)

print(response.text)


OkHttpClient client = new OkHttpClient().newBuilder()
  .build();
MediaType mediaType = MediaType.parse("application/json");
RequestBody body = RequestBody.create(mediaType, "{\n    \"url\": \"http://whatsmyuseragent.org\",\n    \"js\": true,\n    \"response_type\": \"json\",\n    \"extract_rules\": {\n        \"user-agent\": {\n            \"selector\": \".user-agent .intro-text\",\n            \"type\": \"text\"\n        }\n    }\n}");
Request request = new Request.Builder()
  .url("https://api.ujeebu.com/scrape")
  .method("POST", body)
  .addHeader("ApiKey", "<API Key>")
  .addHeader("Content-Type", "application/json")
  .build();
Response response = client.newCall(request).execute();


<?php

$curl = curl_init();

curl_setopt_array($curl, array(
  CURLOPT_URL => 'https://api.ujeebu.com/scrape',
  CURLOPT_RETURNTRANSFER => true,
  CURLOPT_ENCODING => '',
  CURLOPT_MAXREDIRS => 10,
  CURLOPT_TIMEOUT => 0,
  CURLOPT_FOLLOWLOCATION => true,
  CURLOPT_HTTP_VERSION => CURL_HTTP_VERSION_1_1,
  CURLOPT_CUSTOMREQUEST => 'POST',
  CURLOPT_POSTFIELDS =>'{
    "url": "http://whatsmyuseragent.org",
    "js": true,
    "response_type": "json",
    "extract_rules": {
        "user-agent": {
            "selector": ".user-agent .intro-text",
            "type": "text"
        }
    }
}',
  CURLOPT_HTTPHEADER => array(
    'ApiKey: <API Key>',
    'Content-Type: application/json'
  ),
));

$response = curl_exec($curl);

curl_close($curl);
echo $response;


package main

import (
  "fmt"
  "strings"
  "net/http"
  "io/ioutil"
)

func main() {

  url := "https://api.ujeebu.com/scrape"
  method := "POST"

  payload := strings.NewReader(`{
    "url": "http://whatsmyuseragent.org",
    "js": true,
    "json": false,
    "response_type": "html",
    "extract_rules": {
        "user-agent": {
            "selector": ".user-agent .intro-text",
            "type": "text"
        }
    }
}`)

  client := &http.Client {
  }
  req, err := http.NewRequest(method, url, payload)

  if err != nil {
    fmt.Println(err)
    return
  }
  req.Header.Add("ApiKey", "<API Key>")
  req.Header.Add("Content-Type", "application/json")

  res, err := client.Do(req)
  if err != nil {
    fmt.Println(err)
    return
  }
  defer res.Body.Close()

  body, err := ioutil.ReadAll(res.Body)
  if err != nil {
    fmt.Println(err)
    return
  }
  fmt.Println(string(body))
}

Running the code above produces:

{
  "success": true,
  "result": {
    "user-agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/98.0.4758.0 Safari/537.36"
  }
}

Scraping multiple items

You can extract multiple items from a page by using the 'multiple' attribute. Here is an example of how to extract all quotes from URL https://scrape.li/quotes:

{
  "quote": {
    "selector": ".quote-card .description",
    "type": "text",
    "multiple": true
  }
}


curl --location --request POST 'https://api.ujeebu.com/scrape' \
--header 'ApiKey: <API Key>' \
--header 'Content-Type: application/json' \
--data-raw '{
    "url": "https://scrape.li/quotes",
    "js": true,
    "wait_for": 2000,
    "response_type": "json",
    "extract_rules": {
        "quote": {
            "selector": ".quote-card .description",
            "type": "text",
            "multiple": true
        }
    }
}'


var request = require('request');
var options = {
  'method': 'POST',
  'url': 'https://api.ujeebu.com/scrape',
  'headers': {
    'ApiKey': '<API Key>',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    "url": "https://scrape.li/quotes",
    "js": true,
    "wait_for": 2000,
    "response_type": "json",
    "extract_rules": {
      "quote": {
        "selector": ".quote-card .description",
        "type": "text",
        "multiple": true
      }
    }
  })

};
request(options, function (error, response) {
  if (error) throw new Error(error);
  console.log(response.body);
});


import requests
import json

url = "https://api.ujeebu.com/scrape"

payload = json.dumps({
  "url": "https://scrape.li/quotes",
  "js": True,
  "wait_for": 2000,
  "response_type": "json",
  "extract_rules": {
    "quote": {
      "selector": ".quote-card .description",
      "type": "text",
      "multiple": True
    }
  }
})
headers = {
  'ApiKey': '<API Key>',
  'Content-Type': 'application/json'
}

response = requests.request("POST", url, headers=headers, data=payload)

print(response.text)


OkHttpClient client = new OkHttpClient().newBuilder()
  .build();
MediaType mediaType = MediaType.parse("application/json");
RequestBody body = RequestBody.create(mediaType, "{\n    \"url\": \"https://scrape.li/quotes\", \n    \"wait_for\": 2000,  \n    \"js\": true,\n    \"response_type\": \"json\",\n    \"extract_rules\": {\n        \"quote\": {\n            \"selector\": \".quote-card .description\",\n            \"type\": \"text\",\n            \"multiple\": true\n        }\n    }\n}");
Request request = new Request.Builder()
  .url("https://api.ujeebu.com/scrape")
  .method("POST", body)
  .addHeader("ApiKey", "<API Key>")
  .addHeader("Content-Type", "application/json")
  .build();
Response response = client.newCall(request).execute();


<?php

$curl = curl_init();

curl_setopt_array($curl, array(
  CURLOPT_URL => 'https://api.ujeebu.com/scrape',
  CURLOPT_RETURNTRANSFER => true,
  CURLOPT_ENCODING => '',
  CURLOPT_MAXREDIRS => 10,
  CURLOPT_TIMEOUT => 0,
  CURLOPT_FOLLOWLOCATION => true,
  CURLOPT_HTTP_VERSION => CURL_HTTP_VERSION_1_1,
  CURLOPT_CUSTOMREQUEST => 'POST',
  CURLOPT_POSTFIELDS =>'{
    "url": "https://scrape.li/quotes",
    "js": true,
    "wait_for": 2000,
    "response_type": "json",
    "extract_rules": {
        "quote": {
            "selector": ".quote-card .description",
            "type": "text",
            "multiple": true
        }
    }
}',
  CURLOPT_HTTPHEADER => array(
    'ApiKey: <API Key>',
    'Content-Type: application/json'
  ),
));

$response = curl_exec($curl);

curl_close($curl);
echo $response;


package main

import (
  "fmt"
  "strings"
  "net/http"
  "io/ioutil"
)

func main() {

  url := "https://api.ujeebu.com/scrape"
  method := "POST"

  payload := strings.NewReader(`{
    "url": "https://scrape.li/quotes",
    "js": true,
    "wait_for": 2000,
    "response_type": "json",
    "extract_rules": {
        "quote": {
            "selector": ".quote-card .description",
            "type": "text",
            "multiple": true
        }
    }
}`)

  client := &http.Client {
  }
  req, err := http.NewRequest(method, url, payload)

  if err != nil {
    fmt.Println(err)
    return
  }
  req.Header.Add("ApiKey", "<API Key>")
  req.Header.Add("Content-Type", "application/json")

  res, err := client.Do(req)
  if err != nil {
    fmt.Println(err)
    return
  }
  defer res.Body.Close()

  body, err := ioutil.ReadAll(res.Body)
  if err != nil {
    fmt.Println(err)
    return
  }
  fmt.Println(string(body))
}

The code above produces the following:

{
  "success": true,
  "result": {
    "quote": [
      "“The world as we have created it is a process of our thinking. It cannot be changed without changing our thinking.”",
      "“It is our choices, Harry, that show what we truly are, far more than our abilities.”",
      "“There are only two ways to live your life. One is as though nothing is a miracle. The other is as though everything is a miracle.”",
      "“The person, be it gentleman or lady, who has not pleasure in a good novel, must be intolerably stupid.”",
      "“Imperfection is beauty, madness is genius and it's better to be absolutely ridiculous than absolutely boring.”",
      "“Try not to become a man of success. Rather become a man of value.”",
      "“It is better to be hated for what you are than to be loved for what you are not.”",
      "“I have not failed. I've just found 10,000 ways that won't work.”",
      "“A woman is like a tea bag; you never know how strong it is until it's in hot water.”",
      "“A day without sunshine is like, you know, night.”"
    ]
  }
}
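
Once a response like the one above is received, the extracted values sit under the 'result' key. The sketch below parses such a body and checks the 'success' flag first. The shape of an unsuccessful response is not shown on this page, so the error branch is an assumption that only reports the raw payload. The function name unwrap_result is hypothetical, and the sample body is shortened.

```python
import json


def unwrap_result(body_text):
    """Parse a JSON response body and return its 'result' object."""
    data = json.loads(body_text)
    if not data.get("success"):
        # Assumption: anything without a truthy 'success' is treated as a failure.
        raise RuntimeError(f"scrape did not succeed: {data!r}")
    return data["result"]


sample = '{"success": true, "result": {"quote": ["first quote", "second quote"]}}'
for index, quote in enumerate(unwrap_result(sample)["quote"], start=1):
    print(index, quote)
```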

Scraping Nested Items

It is possible to extract nested items using the 'children' attribute of rule type 'obj'.

Below, we extract all quotes and their authors from URL https://scrape.li/quotes:

{
  "quote": {
    "selector": ".quote-card",
    "type": "obj",
    "multiple": true,
    "children": {
      "quote": {
        "selector": ".description",
        "type": "text"
      },
      "author": {
        "selector": ".author",
        "type": "text"
      },
      "tags": {
        "selector": ".tags .tag",
        "type": "text",
        "multiple": true
      }
    }
  }
}


curl --location --request POST 'https://api.ujeebu.com/scrape' \
--header 'ApiKey: <API Key>' \
--header 'Content-Type: application/json' \
--data-raw '{
    "url": "https://scrape.li/quotes",
    "js": true,
    "wait_for": 2000,
    "response_type": "json",
    "extract_rules": {
        "quote": {
            "selector": ".quote-card",
            "type": "obj",
            "multiple": true,
            "children": {
                "quote": {
                    "selector": ".description",
                    "type": "text"
                },
                "author": {
                    "selector": ".description+h5",
                    "type": "obj",
                    "children": {
                        "name": {
                            "selector": ".author",
                            "type": "text"
                        },
                        "profile": {
                            "selector": "a",
                            "type": "link"
                        }
                    }
                },
                "tags": {
                    "selector": ".tags",
                    "type": "obj",
                    "children": {
                        "list": {
                            "selector": ".tag",
                            "type": "text",
                            "multiple": true
                        },
                        "csv": {
                            "selector": "meta",
                            "type": "attr",
                            "attribute": "content"
                        }
                    }
                }
            }
        }
    }
}'


var request = require('request');
var options = {
  'method': 'POST',
  'url': 'https://api.ujeebu.com/scrape',
  'headers': {
    'ApiKey': '<API Key>',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    "url": "https://scrape.li/quotes",
    "js": true,
    "wait_for": 2000,
    "response_type": "json",
    "extract_rules": {
      "quote": {
        "selector": ".quote-card",
        "type": "obj",
        "multiple": true,
        "children": {
          "quote": {
            "selector": ".description",
            "type": "text"
          },
          "author": {
            "selector": ".description+h5",
            "type": "obj",
            "children": {
              "name": {
                "selector": ".author",
                "type": "text"
              },
              "profile": {
                "selector": "a",
                "type": "link"
              }
            }
          },
          "tags": {
            "selector": ".tags",
            "type": "obj",
            "children": {
              "list": {
                "selector": ".tag",
                "type": "text",
                "multiple": true
              },
              "csv": {
                "selector": "meta",
                "type": "attr",
                "attribute": "content"
              }
            }
          }
        }
      }
    }
  })

};
request(options, function (error, response) {
  if (error) throw new Error(error);
  console.log(response.body);
});


import http.client
import json

conn = http.client.HTTPSConnection("154.53.53.37")
payload = json.dumps({
  "url": "https://scrape.li/quotes",
  "js": True,
  "wait_for": 2000,  
  "response_type": "json",
  "extract_rules": {
    "quote": {
      "selector": ".quote-card",
      "type": "obj",
      "multiple": True,
      "children": {
        "quote": {
          "selector": ".description",
          "type": "text"
        },
        "author": {
          "selector": ".description+h5",
          "type": "obj",
          "children": {
            "name": {
              "selector": ".author",
              "type": "text"
            },
            "profile": {
              "selector": "a",
              "type": "link"
            }
          }
        },
        "tags": {
          "selector": ".tags",
          "type": "obj",
          "children": {
            "list": {
              "selector": ".tag",
              "type": "text",
              "multiple": True
            },
            "csv": {
              "selector": "meta",
              "type": "attr",
              "attribute": "content"
            }
          }
        }
      }
    }
  }
})
headers = {
  'ApiKey': '<API Key>',
  'Content-Type': 'application/json'
}
conn.request("POST", "/scrape", payload, headers)
res = conn.getresponse()
data = res.read()
print(data.decode("utf-8"))


OkHttpClient client = new OkHttpClient().newBuilder()
  .build();
MediaType mediaType = MediaType.parse("application/json");
RequestBody body = RequestBody.create(mediaType, "{\n    \"url\": \"https://scrape.li/quotes\",\n    \"js\": true,\n    \"wait_for\": 2000,\n    \"response_type\": \"json\",\n    \"extract_rules\": {\n        \"quote\": {\n            \"selector\": \".quote-card\",\n            \"type\": \"obj\",\n            \"multiple\": true,\n            \"children\": {\n                \"quote\": {\n                    \"selector\": \".description\",\n                    \"type\": \"text\"\n                },\n                \"author\": {\n                    \"selector\": \".description+h5\",\n                    \"type\": \"obj\",\n                    \"children\": {\n                        \"name\": {\n                            \"selector\": \".author\",\n                            \"type\": \"text\"\n                        },\n                        \"profile\": {\n                            \"selector\": \"a\",\n                            \"type\": \"link\"\n                        }\n                    }\n                },\n                \"tags\": {\n                    \"selector\": \".tags\",\n                    \"type\": \"obj\",\n                    \"children\": {\n                        \"list\": {\n                            \"selector\": \".tag\",\n                            \"type\": \"text\",\n                            \"multiple\": true\n                        },\n                        \"csv\": {\n                            \"selector\": \"meta\",\n                            \"type\": \"attr\",\n                            \"attribute\": \"content\"\n                        }\n                    }\n                }\n            }\n        }\n    }\n}");
Request request = new Request.Builder()
  .url("https://api.ujeebu.com/scrape")
  .method("POST", body)
  .addHeader("ApiKey", "<API Key>")
  .addHeader("Content-Type", "application/json")
  .build();
Response response = client.newCall(request).execute();
// Print the response body, as the other examples do
System.out.println(response.body().string());


<?php

$curl = curl_init();

curl_setopt_array($curl, array(
  CURLOPT_URL => 'https://api.ujeebu.com/scrape',
  CURLOPT_RETURNTRANSFER => true,
  CURLOPT_ENCODING => '',
  CURLOPT_MAXREDIRS => 10,
  CURLOPT_TIMEOUT => 0,
  CURLOPT_FOLLOWLOCATION => true,
  CURLOPT_HTTP_VERSION => CURL_HTTP_VERSION_1_1,
  CURLOPT_CUSTOMREQUEST => 'POST',
  CURLOPT_POSTFIELDS =>'{
    "url": "https://scrape.li/quotes",
    "js": true,
    "wait_for": 2000,
    "response_type": "json",
    "extract_rules": {
        "quote": {
            "selector": ".quote-card",
            "type": "obj",
            "multiple": true,
            "children": {
                "quote": {
                    "selector": ".description",
                    "type": "text"
                },
                "author": {
                    "selector": ".description+h5",
                    "type": "obj",
                    "children": {
                        "name": {
                            "selector": ".author",
                            "type": "text"
                        },
                        "profile": {
                            "selector": "a",
                            "type": "link"
                        }
                    }
                },
                "tags": {
                    "selector": ".tags",
                    "type": "obj",
                    "children": {
                        "list": {
                            "selector": ".tag",
                            "type": "text",
                            "multiple": true
                        },
                        "csv": {
                            "selector": "meta",
                            "type": "attr",
                            "attribute": "content"
                        }
                    }
                }
            }
        }
    }
}',
  CURLOPT_HTTPHEADER => array(
    'ApiKey: <API Key>',
    'Content-Type: application/json'
  ),
));

$response = curl_exec($curl);

curl_close($curl);
echo $response;


package main

import (
  "fmt"
  "strings"
  "net/http"
  "io/ioutil"
)

func main() {

  url := "https://api.ujeebu.com/scrape"
  method := "POST"

  payload := strings.NewReader(`{
    "url": "https://scrape.li/quotes",
    "js": true,
    "wait_for": 2000,
    "response_type": "json",
    "extract_rules": {
        "quote": {
            "selector": ".quote-card",
            "type": "obj",
            "multiple": true,
            "children": {
                "quote": {
                    "selector": ".description",
                    "type": "text"
                },
                "author": {
                    "selector": ".description+h5",
                    "type": "obj",
                    "children": {
                        "name": {
                            "selector": ".author",
                            "type": "text"
                        },
                        "profile": {
                            "selector": "a",
                            "type": "link"
                        }
                    }
                },
                "tags": {
                    "selector": ".tags",
                    "type": "obj",
                    "children": {
                        "list": {
                            "selector": ".tag",
                            "type": "text",
                            "multiple": true
                        },
                        "csv": {
                            "selector": "meta",
                            "type": "attr",
                            "attribute": "content"
                        }
                    }
                }
            }
        }
    }
}`)

  client := &http.Client {
  }
  req, err := http.NewRequest(method, url, payload)

  if err != nil {
    fmt.Println(err)
    return
  }
  req.Header.Add("ApiKey", "<API Key>")
  req.Header.Add("Content-Type", "application/json")

  res, err := client.Do(req)
  if err != nil {
    fmt.Println(err)
    return
  }
  defer res.Body.Close()

  body, err := ioutil.ReadAll(res.Body)
  if err != nil {
    fmt.Println(err)
    return
  }
  fmt.Println(string(body))
}

Response produced by the code above:


{
    "success": true,
    "result": {
        "quote": [
            {
                "quote": "“The world as we have created it is a process of our thinking. It cannot be changed without changing our thinking.”",
                "author": [
                    {
                        "name": "Albert Einstein",
                        "profile": "/author/Albert-Einstein"
                    }
                ],
                "tags": [
                    {
                        "list": [
                            "change",
                            "deep-thoughts",
                            "thinking",
                            "world"
                        ],
                        "csv": "change,deep-thoughts,thinking,world"
                    }
                ]
            },
            {
                "quote": "“It is our choices, Harry, that show what we truly are, far more than our abilities.”",
                "author": [
                    {
                        "name": "J.K. Rowling",
                        "profile": "/author/J-K-Rowling"
                    }
                ],
                "tags": [
                    {
                        "list": [
                            "abilities",
                            "choices"
                        ],
                        "csv": "abilities,choices"
                    }
                ]
            },
            {
                "quote": "“There are only two ways to live your life. One is as though nothing is a miracle. The other is as though everything is a miracle.”",
                "author": [
                    {
                        "name": "Albert Einstein",
                        "profile": "/author/Albert-Einstein"
                    }
                ],
                "tags": [
                    {
                        "list": [
                            "inspirational",
                            "life",
                            "live",
                            "miracle",
                            "miracles"
                        ],
                        "csv": "inspirational,life,live,miracle,miracles"
                    }
                ]
            },
            {
                "quote": "“The person, be it gentleman or lady, who has not pleasure in a good novel, must be intolerably stupid.”",
                "author": [
                    {
                        "name": "Jane Austen",
                        "profile": "/author/Jane-Austen"
                    }
                ],
                "tags": [
                    {
                        "list": [
                            "aliteracy",
                            "books",
                            "classic",
                            "humor"
                        ],
                        "csv": "aliteracy,books,classic,humor"
                    }
                ]
            },
            {
                "quote": "“Imperfection is beauty, madness is genius and it's better to be absolutely ridiculous than absolutely boring.”",
                "author": [
                    {
                        "name": "Marilyn Monroe",
                        "profile": "/author/Marilyn-Monroe"
                    }
                ],
                "tags": [
                    {
                        "list": [
                            "be-yourself",
                            "inspirational"
                        ],
                        "csv": "be-yourself,inspirational"
                    }
                ]
            },
            {
                "quote": "“Try not to become a man of success. Rather become a man of value.”",
                "author": [
                    {
                        "name": "Albert Einstein",
                        "profile": "/author/Albert-Einstein"
                    }
                ],
                "tags": [
                    {
                        "list": [
                            "adulthood",
                            "success",
                            "value"
                        ],
                        "csv": "adulthood,success,value"
                    }
                ]
            },
            {
                "quote": "“It is better to be hated for what you are than to be loved for what you are not.”",
                "author": [
                    {
                        "name": "André Gide",
                        "profile": "/author/Andre-Gide"
                    }
                ],
                "tags": [
                    {
                        "list": [
                            "life",
                            "love"
                        ],
                        "csv": "life,love"
                    }
                ]
            },
            {
                "quote": "“I have not failed. I've just found 10,000 ways that won't work.”",
                "author": [
                    {
                        "name": "Thomas A. Edison",
                        "profile": "/author/Thomas-A-Edison"
                    }
                ],
                "tags": [
                    {
                        "list": [
                            "edison",
                            "failure",
                            "inspirational",
                            "paraphrased"
                        ],
                        "csv": "edison,failure,inspirational,paraphrased"
                    }
                ]
            },
            {
                "quote": "“A woman is like a tea bag; you never know how strong it is until it's in hot water.”",
                "author": [
                    {
                        "name": "Eleanor Roosevelt",
                        "profile": "/author/Eleanor-Roosevelt"
                    }
                ],
                "tags": [
                    {
                        "list": [
                            "misattributed-eleanor-roosevelt"
                        ],
                        "csv": "misattributed-eleanor-roosevelt"
                    }
                ]
            },
            {
                "quote": "“A day without sunshine is like, you know, night.”",
                "author": [
                    {
                        "name": "Steve Martin",
                        "profile": "/author/Steve-Martin"
                    }
                ],
                "tags": [
                    {
                        "list": [
                            "humor",
                            "obvious",
                            "simile"
                        ],
                        "csv": "humor,obvious,simile"
                    }
                ]
            }
        ]
    }
}
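The response above nests each quote's author and tags inside single-element lists. The following is a minimal sketch, not part of the API, of how a client might flatten that structure into one record per quote. The helper name `flatten_quotes` is hypothetical, and the code assumes the response has exactly the shape shown in the sample above.

```python
from typing import Any


def flatten_quotes(response: dict[str, Any]) -> list[dict[str, Any]]:
    """Turn the nested extract_rules output into one flat dict per quote.

    Hypothetical helper: assumes the response shape shown in the sample above.
    """
    if not response.get("success"):
        return []
    rows = []
    for item in response.get("result", {}).get("quote", []):
        # In the sample, "author" and "tags" are single-element lists.
        author = (item.get("author") or [{}])[0]
        tags = (item.get("tags") or [{}])[0]
        rows.append({
            # Drop the typographic quotation marks around the quote text.
            "quote": item.get("quote", "").strip("“”"),
            "author": author.get("name"),
            "profile": author.get("profile"),
            "tags": tags.get("list", []),
        })
    return rows


# One entry copied from the sample response above.
sample = {
    "success": True,
    "result": {"quote": [{
        "quote": "“A day without sunshine is like, you know, night.”",
        "author": [{"name": "Steve Martin", "profile": "/author/Steve-Martin"}],
        "tags": [{"list": ["humor", "obvious", "simile"],
                  "csv": "humor,obvious,simile"}],
    }]},
}

for row in flatten_quotes(sample):
    print(row["author"], "|", row["quote"], "|", ", ".join(row["tags"]))
```

In practice the `sample` dictionary would be replaced by the parsed JSON body returned by the scrape request.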

Usage endpoint

Introduction

To track your credit usage programmatically, call the /account endpoint.

Calls to this endpoint do not count toward your calls-per-second limit, but you can make at most 10 /account calls per minute.
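One way to stay under the 10-calls-per-minute limit is to throttle on the client side. The sketch below is a sliding-window limiter; the class name `AccountCallLimiter` and its parameters are hypothetical and not part of the Ujeebu API. It assumes the limit is counted over a rolling 60-second window, which the documentation does not specify.

```python
import time
from collections import deque


class AccountCallLimiter:
    """Client-side sliding-window limiter (hypothetical helper).

    Assumes the /account limit applies to a rolling window of `period` seconds.
    """

    def __init__(self, max_calls=10, period=60.0, clock=time.monotonic):
        self.max_calls = max_calls
        self.period = period
        self.clock = clock      # injectable so the limiter can be tested
        self.calls = deque()    # timestamps of calls inside the window

    def try_acquire(self) -> bool:
        """Return True and record the call if one more call is allowed now."""
        now = self.clock()
        # Forget calls that have fallen out of the window.
        while self.calls and now - self.calls[0] >= self.period:
            self.calls.popleft()
        if len(self.calls) >= self.max_calls:
            return False
        self.calls.append(now)
        return True


limiter = AccountCallLimiter()
if limiter.try_acquire():
    print("OK to call /account")
else:
    print("Limit reached, retry later")
```

Checking `try_acquire()` before each /account request keeps the decision local, so a monitoring loop can skip a poll instead of receiving an error from the server.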

To use the API:

GET https://api.ujeebu.com/account

Usage Endpoint Code Example

The following examples retrieve the current usage of the account associated with the given ApiKey.


curl --location --request GET 'https://api.ujeebu.com/account' \
--header 'ApiKey: <API Key>'


var request = require('request');
var options = {
  'method': 'GET',
  'url': 'https://api.ujeebu.com/account',
  'headers': {
    'ApiKey': '<API Key>'
  }
};
request(options, function (error, response) {
  if (error) throw new Error(error);
  console.log(response.body);
});


import requests

url = "https://api.ujeebu.com/account"

headers = {
  'ApiKey': '<API Key>'
}

# A GET request needs no body, so no payload is sent
response = requests.request("GET", url, headers=headers)

print(response.text)


OkHttpClient client = new OkHttpClient().newBuilder()
  .build();
Request request = new Request.Builder()
  .url("https://api.ujeebu.com/account")
  .method("GET", null)
  .addHeader("ApiKey", "<API Key>")
  .build();
Response response = client.newCall(request).execute();
// Print the response body, as the other examples do
System.out.println(response.body().string());


<?php

$curl = curl_init();

curl_setopt_array($curl, array(
  CURLOPT_URL => 'https://api.ujeebu.com/account',
  CURLOPT_RETURNTRANSFER => true,
  CURLOPT_ENCODING => '',
  CURLOPT_MAXREDIRS => 10,
  CURLOPT_TIMEOUT => 0,
  CURLOPT_FOLLOWLOCATION => true,
  CURLOPT_HTTP_VERSION => CURL_HTTP_VERSION_1_1,
  CURLOPT_CUSTOMREQUEST => 'GET',
  CURLOPT_HTTPHEADER => array(
    'ApiKey: <API Key>'
  ),
));

$response = curl_exec($curl);

curl_close($curl);
echo $response;


package main

import (
  "fmt"
  "net/http"
  "io/ioutil"
)

func main() {

  url := "https://api.ujeebu.com/account"
  method := "GET"

  client := &http.Client {
  }
  req, err := http.NewRequest(method, url, nil)

  if err != nil {
    fmt.Println(err)
    return
  }
  req.Header.Add("ApiKey", "<API Key>")

  res, err := client.Do(req)
  if err != nil {
    fmt.Println(err)
    return
  }
  defer res.Body.Close()

  body, err := ioutil.ReadAll(res.Body)
  if err != nil {
    fmt.Println(err)
    return
  }
  fmt.Println(string(body))
}

The code above will generate the following response:

{
    "balance": 0,
    "concurrent_requests": 50,
    "days_till_next_billing": 10,
    "next_billing_date": "2024-11-01 18:38:00",
    "plan": "MEDIUM",
    "quota": 1000000,
    "total_requests": 7,
    "used": 70,
    "used_percent": 0.01,
    "userid": "90834083"
}
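A program that polls this endpoint will usually want to turn the response into a decision, such as warning when credits run low. The sketch below shows one possible approach; the function `summarize_usage` and its `warn_at_percent` threshold are hypothetical. It assumes that `quota` and `used` are expressed in the same credit units, which the sample suggests but the text does not state.

```python
def summarize_usage(account: dict, warn_at_percent: float = 80.0) -> dict:
    """Derive remaining credits and a low-credit flag (hypothetical helper).

    Assumes "quota" and "used" are counted in the same credit units.
    """
    quota = account["quota"]
    used = account["used"]
    remaining = max(quota - used, 0)
    # Guard against a zero quota to avoid division by zero.
    percent = 100.0 * used / quota if quota else 0.0
    return {
        "remaining": remaining,
        "used_percent": round(percent, 3),
        "warn": percent >= warn_at_percent,
    }


# Fields copied from the sample response above.
sample_account = {
    "plan": "MEDIUM",
    "quota": 1000000,
    "used": 70,
    "days_till_next_billing": 10,
}

summary = summarize_usage(sample_account)
print(summary["remaining"], "credits remaining;",
      "warning" if summary["warn"] else "no warning")
```

Computing the percentage locally keeps more precision than the two-decimal `used_percent` value in the sample response.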
