Ujeebu Scrape
Introduction​
The Ujeebu Scrape API returns a web page or parts of a web page after rendering it using the Google Chrome browser. It will run JavaScript, and manage headless browser instances for you. It also automatically routes your requests through rotating proxies to lower your chances of getting blocked.
To use the API, subscribe to a plan here and connect to:
GET https://api.ujeebu.com/scrape
POST https://api.ujeebu.com/scrape
Parameters​
Parameter | Type | Description | Default Value |
---|---|---|---|
url REQUIRED | string | URL to render. | |
response_type | string | indicates what to return. Possible values are: 'html','raw', 'pdf' or 'screenshot'. | 'html' |
json | boolean | when set to true, returns a JSON response instead of raw content as specified by response_type . | false |
useragent | string | override default headless browser user agent. | null |
cookies | string | indicates custom cookies to send with request. | null |
timeout | number | maximum number of seconds before request timeout. | 60 |
js | boolean | indicates whether to execute JavaScript or not. | true |
js_timeout | number | when js is enabled, indicates how many seconds the API should wait for the JS engine to render the supplied URL. | timeout /2 |
custom_js | string , base64-string | JavaScript code to execute in page context when js is enabled | null |
wait_for | number , string , base64-string | indicates number of milliseconds to wait before returning response, a selector to wait for, or custom JavaScript to handle the wait. Needs 'js' to be on | 0 |
wait_for_timeout | number | indicates timeout (in milliseconds) for the wait_for param. | null |
screenshot_fullpage | boolean | when response_type is screenshot , indicates whether to take a screenshot of the full page or just the visible viewport. | false |
screenshot_partial | string | when response_type is screenshot , a valid selector of element to screenshot or json-string with coordinates (x, y, width, height ) of the rect to screenshot. | null |
scroll_down | boolean | indicates whether to scroll down the page or not, this applies only when js is enabled. | false |
scroll_wait | number | when scroll_down is enabled, indicates the duration of the wait (in milliseconds) between two scrolls. | 100 |
progressive_scroll | boolean | indicates type of scroll. If set to true : progressively scrolls down until page height no longer increases or URL changes. If set to false (default), goes to scroll_to_selector or end of page. | false |
proxy_type | string | indicates type of proxy to use. Possible values: 'rotating', 'premium'. | rotating |
proxy_country | string | country ISO 3166-1 alpha-2 code to proxy from. Valid only when premium proxy type is chosen. | US |
auto_premium_proxy | string | enable premium proxy by default when rotating proxy is not working. | false |
scroll_callback | string , base64-string | defines a JavaScript function with boolean output that determines whether to stop scrolling or not. | null |
scroll_to_selector | string | when scroll_down is enabled, indicates the element to scroll to in each scroll. If 'null' the scroll is performed until the end of the page. | null |
device | string | indicates type of device to use to render page. Possible values: 'desktop', 'mobile' | desktop |
window_width | number | indicates browser viewport width. | null |
window_height | number | indicates browser viewport height. | null |
block_ads | boolean | indicates whether to block ads or not. | false |
extract_rules | json-string | defines rules used to Extract data from supplied web page. | null |
strip_tags | csv-string | indicates comma-separated list of tags to remove from page after rendering. | null |
http_method | string | indicates the HTTP method (GET , POST , PUT ) to use to request the target web page. | GET |
post_data | string | data to forward to the target web page when the HTTP method is POST or PUT. | null |
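As a rough illustration of how the parameters above combine into a single request, the sketch below builds a GET URL with Python's standard library. The helper name `build_scrape_url` is hypothetical, and sending booleans as lowercase 'true'/'false' is an assumption based on the examples elsewhere on this page, not a documented requirement.

```python
from urllib.parse import urlencode, quote

API_ENDPOINT = "https://api.ujeebu.com/scrape"  # endpoint from the introduction


def build_scrape_url(target_url, **params):
    """Return a GET URL for the Scrape API (hypothetical helper).

    Parameters set to None are skipped so that the API defaults apply.
    """
    query = {"url": target_url}
    for name, value in params.items():
        if value is None:
            continue
        # Assumption: booleans are sent as lowercase 'true' / 'false'.
        if isinstance(value, bool):
            value = "true" if value else "false"
        query[name] = value
    # quote_via=quote percent-encodes spaces as %20 instead of '+'.
    return API_ENDPOINT + "?" + urlencode(query, quote_via=quote)


print(build_scrape_url("https://ujeebu.com", response_type="html", js=True, timeout=30))
```

The resulting URL would then be requested with the `ApiKey` header used in the cURL examples on this page.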
Response​
The response returned depends on the response_type and json parameters. It can be a byte array when response_type is 'pdf' or 'screenshot', text when response_type is 'raw' or 'html', or JSON when json=1.
response_type possible values are as follows:
- html: returns the HTML code of the page. If js=1, it will first execute JavaScript.
- raw: returns the source HTML (or the file content if the URL is not an HTML page) as received from the URL without running JavaScript. js=1 is ignored.
- pdf: converts the page to PDF and returns the PDF binary data. If the json parameter is set to 'true', a JSON response is returned with the base64-encoded value of the PDF file, e.g.:
{
  "success": true,
  "screenshot": null,
  "html_source": null,
  "pdf": "JVBERi0xLjQKJeLjz9MKNCAwIG9iaiAKPDwKL1N1YnR5cGUgL0xpbms...",
  "html": null
}
- screenshot: produces a screenshot of the URL in PNG format and returns the binary data. If screenshot_fullpage is set to 'true', it takes a screenshot of the full page. If set to 'false', it takes a screenshot of the visible viewport only. If the json parameter is set to 'true', a JSON response is returned with the base64-encoded value of the image file, e.g.:
{
  "success": true,
  "screenshot": "iVBORw0KGgoAAAANSUhEUgAAA2oAACyOCA...",
  "html_source": null,
  "pdf": null,
  "html": null
}
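When json is enabled, the binary payload has to be base64-decoded on the client side. The following is a minimal sketch of that step; the function name `extract_binary` is hypothetical, and the sample string is a shortened stand-in shaped like the JSON responses above (its value holds only the first bytes of a PDF, not a full document).

```python
import base64
import json


def extract_binary(json_text, field):
    """Decode the base64 value stored under `field` ('pdf' or 'screenshot')."""
    payload = json.loads(json_text)
    if not payload.get("success"):
        raise ValueError("response does not report success")
    encoded = payload.get(field)
    if encoded is None:
        raise KeyError(f"field {field!r} is empty in this response")
    return base64.b64decode(encoded)


# Shortened sample: only the fields shown in the responses above.
sample = (
    '{"success": true, "screenshot": null, "html_source": null, '
    '"pdf": "JVBERi0xLjQK", "html": null}'
)
data = extract_binary(sample, "pdf")
print(data[:5])  # a PDF file starts with the bytes %PDF-
```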
Error Response​
If an error occurs, you will receive a JSON response like the following:
{
"url": "string",
"message": "string",
"error_code": "string",
"errors": ["string"]
}
Name | Type | Description |
---|---|---|
url | string | Given URL |
message | string | Error message |
error_code | string | Error code |
errors | [string] | List of all errors |
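One way to surface these fields in client code is to map the error body onto an exception. This is only a sketch: the class `ScrapeError` and the function `parse_error` are hypothetical names, and the sample values are placeholders, not real error codes returned by the API.

```python
import json


class ScrapeError(Exception):
    """Hypothetical exception carrying the error fields listed above."""

    def __init__(self, url, message, error_code, errors):
        super().__init__(message)
        self.url = url
        self.message = message
        self.error_code = error_code
        self.errors = errors


def parse_error(body):
    """Build a ScrapeError from the JSON text of an error response."""
    data = json.loads(body)
    return ScrapeError(
        url=data.get("url", ""),
        message=data.get("message", ""),
        error_code=data.get("error_code", ""),
        errors=list(data.get("errors") or []),
    )


# Placeholder values for illustration only.
sample_body = json.dumps({
    "url": "https://example.com",
    "message": "example message",
    "error_code": "EXAMPLE_CODE",
    "errors": ["example message"],
})
err = parse_error(sample_body)
print(err.error_code, err.errors)
```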
Response Codes​
Code | Billed | Meaning | Suggestion |
---|---|---|---|
200 | Yes | Successful request | - |
400 | NO | Some required parameter is missing (URL) | Set |
401 | NO | Missing API-KEY | Provide API-KEY |
404 | YES | Provided URL not found | Provide a valid URL |
408 | YES | Request timeout | Increase timeout parameter, use premium proxy or force JS |
429 | NO | Too many requests | Upgrade your plan |
500 | NO | Internal error | Retry the request or contact us |
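The table above can be turned into a small lookup for client-side handling. In this sketch the billing flags are transcribed from the table, while the choice of which codes to retry is an assumption made for illustration, not a documented policy.

```python
# Billing flags transcribed from the response codes table above.
BILLED = {200: True, 400: False, 401: False, 404: True, 408: True, 429: False, 500: False}

# Assumption: only timeouts, rate limits and internal errors are worth retrying.
RETRYABLE = {408, 429, 500}


def describe(status):
    """Return billing and retry hints for an HTTP status code.

    'billed' is None for codes that the table does not list.
    """
    return {"billed": BILLED.get(status), "retry": status in RETRYABLE}


for code in (200, 404, 408):
    print(code, describe(code))
```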
Waiting for JavaScript to execute​
By default, the Scrape API stops when the 'DOMContentLoaded' event is fired, but you can use the wait_for parameter to implement additional waiting logic.
wait_for can take three types of values:
- an integer: to specify the number of milliseconds the JS engine must wait before returning a response
- a string CSS selector: the JS engine will wait for the element to appear in the page
- a base64-string JS callable: the JS engine will try to evaluate the callable, which implements its own waiting logic.
The following code is an example of a callable which will wait until the loader .loaderBox is hidden:
async () => {
    while (true) {
        // select the loader box
        let loader = document.querySelector('.loaderBox');
        if (!loader) {
            return;
        }
        // get the style of loader box
        let style = window.getComputedStyle(loader);
        // stop if box is hidden
        if (style.display === 'none') {
            return;
        }
        // wait 10ms to prevent app from hanging
        await new Promise(resolve => setTimeout(resolve, 10));
    }
}
// base64 string of the js callable above
wait_for=CmFzeW5jICgpID0+IHsKICAgIHdoaWxlICh0cnVlKSB7CiAgICAgICAgLy8gc2VsZWN0IHRoZSBsb2FkZXIgYm94CiAgICAgICAgbGV0IGxvYWRlciA9IGRvY3VtZW50LnF1ZXJ5U2VsZWN0b3IoJy5sb2FkZXJCb3gnKTsKICAgICAgICBpZiAoIWxvYWRlcikgewogICAgICAgICAgICByZXR1cm47CiAgICAgICAgfQogICAgICAgIC8vIGdldCB0aGUgc3R5bGUgb2YgbG9hZGVyIGJveAogICAgICAgIGxldCBzdHlsZSA9IHdpbmRvdy5nZXRDb21wdXRlZFN0eWxlKGxvYWRlcik7CgogICAgICAgIC8vIHN0b3AgaWYgYm94IGlzIGhpZGRlbgogICAgICAgIGlmIChzdHlsZS5kaXNwbGF5ID09PSAnbm9uZScpIHsKICAgICAgICAgICAgcmV0dXJuOwogICAgICAgIH0KCiAgICAgICAgLy8gd2FpdCAxMG1zIHRvIHByZXZlbnQgYXBwIGZyb20gaGFuZ2luZwogICAgICAgIGF3YWl0IG5ldyBQcm9taXNlKHJlc29sdmUgPT4gc2V0VGltZW91dChyZXNvbHZlLCAxMCkpOwoKICAgIH0KfQo=
Note: Make sure to URL-encode the value if you're using GET.
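The three kinds of wait_for values can be prepared as in the sketch below. The helper names are hypothetical. The point it illustrates is the note above: base64 output may contain '+' and '=', which have to be percent-encoded before being placed in a GET query string.

```python
import base64
from urllib.parse import quote


def wait_for_callable(js_source):
    """Base64-encode a JS callable for the wait_for parameter."""
    return base64.b64encode(js_source.encode("utf-8")).decode("ascii")


def as_query_value(value):
    """Percent-encode a wait_for value for use in a GET query string."""
    return quote(str(value), safe="")


print(as_query_value(1000))           # integer: milliseconds to wait
print(as_query_value(".loaderBox"))   # string: CSS selector
# Only the first line of a callable, to keep the output short:
print(as_query_value(wait_for_callable("\nasync () => {\n")))
```

The last line prints a value whose '+' has become '%2B', which matches the beginning of the wait_for value in the GET examples on this page.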
- cURL
curl --location -g --request GET 'https://api.ujeebu.com/scrape?url=https://ujeebu.com&response_type=screenshot&json=false&js=true&wait_for=CmFzeW5jICgpID0%2BIHsKICAgIHdoaWxlICh0cnVlKSB7CiAgICAgICAgLy8gc2VsZWN0IHRoZSBsb2FkZXIgYm94CiAgICAgICAgbGV0IGxvYWRlciA9IGRvY3VtZW50LnF1ZXJ5U2VsZWN0b3IoJy5sb2FkZXJCb3gnKTsKICAgICAgICBpZiAoIWxvYWRlcikgewogICAgICAgICAgICByZXR1cm47CiAgICAgICAgfQogICAgICAgIC8vIGdldCB0aGUgc3R5bGUgb2YgbG9hZGVyIGJveAogICAgICAgIGxldCBzdHlsZSA9IHdpbmRvdy5nZXRDb21wdXRlZFN0eWxlKGxvYWRlcik7CgogICAgICAgIC8vIHN0b3AgaWYgYm94IGlzIGhpZGRlbgogICAgICAgIGlmIChzdHlsZS5kaXNwbGF5ID09PSAnbm9uZScpIHsKICAgICAgICAgICAgcmV0dXJuOwogICAgICAgIH0KCiAgICAgICAgLy8gd2FpdCAxMG1zIHRvIHByZXZlbnQgYXBwIGZyb20gaGFuZ2luZwogICAgICAgIGF3YWl0IG5ldyBQcm9taXNlKHJlc29sdmUgPT4gc2V0VGltZW91dChyZXNvbHZlLCAxMCkpOwoKICAgIH0KfQo%3D' \
  --header 'ApiKey: <API Key>'

- NodeJs
var request = require('request');
var options = {
  'method': 'GET',
  'url': 'https://api.ujeebu.com/scrape?url=https://ujeebu.com&response_type=screenshot&json=false&js=true&wait_for=CmFzeW5jICgpID0%2BIHsKICAgIHdoaWxlICh0cnVlKSB7CiAgICAgICAgLy8gc2VsZWN0IHRoZSBsb2FkZXIgYm94CiAgICAgICAgbGV0IGxvYWRlciA9IGRvY3VtZW50LnF1ZXJ5U2VsZWN0b3IoJy5sb2FkZXJCb3gnKTsKICAgICAgICBpZiAoIWxvYWRlcikgewogICAgICAgICAgICByZXR1cm47CiAgICAgICAgfQogICAgICAgIC8vIGdldCB0aGUgc3R5bGUgb2YgbG9hZGVyIGJveAogICAgICAgIGxldCBzdHlsZSA9IHdpbmRvdy5nZXRDb21wdXRlZFN0eWxlKGxvYWRlcik7CgogICAgICAgIC8vIHN0b3AgaWYgYm94IGlzIGhpZGRlbgogICAgICAgIGlmIChzdHlsZS5kaXNwbGF5ID09PSAnbm9uZScpIHsKICAgICAgICAgICAgcmV0dXJuOwogICAgICAgIH0KCiAgICAgICAgLy8gd2FpdCAxMG1zIHRvIHByZXZlbnQgYXBwIGZyb20gaGFuZ2luZwogICAgICAgIGF3YWl0IG5ldyBQcm9taXNlKHJlc29sdmUgPT4gc2V0VGltZW91dChyZXNvbHZlLCAxMCkpOwoKICAgIH0KfQo%3D',
  'headers': {
    'ApiKey': '<API Key>'
  }
};
request(options, function (error, response) {
  if (error) throw new Error(error);
  console.log(response.body);
});

- Python
import requests

url = "https://api.ujeebu.com/scrape?url=https://ujeebu.com&response_type=screenshot&json=false&js=true&wait_for=CmFzeW5jICgpID0%2BIHsKICAgIHdoaWxlICh0cnVlKSB7CiAgICAgICAgLy8gc2VsZWN0IHRoZSBsb2FkZXIgYm94CiAgICAgICAgbGV0IGxvYWRlciA9IGRvY3VtZW50LnF1ZXJ5U2VsZWN0b3IoJy5sb2FkZXJCb3gnKTsKICAgICAgICBpZiAoIWxvYWRlcikgewogICAgICAgICAgICByZXR1cm47CiAgICAgICAgfQogICAgICAgIC8vIGdldCB0aGUgc3R5bGUgb2YgbG9hZGVyIGJveAogICAgICAgIGxldCBzdHlsZSA9IHdpbmRvdy5nZXRDb21wdXRlZFN0eWxlKGxvYWRlcik7CgogICAgICAgIC8vIHN0b3AgaWYgYm94IGlzIGhpZGRlbgogICAgICAgIGlmIChzdHlsZS5kaXNwbGF5ID09PSAnbm9uZScpIHsKICAgICAgICAgICAgcmV0dXJuOwogICAgICAgIH0KCiAgICAgICAgLy8gd2FpdCAxMG1zIHRvIHByZXZlbnQgYXBwIGZyb20gaGFuZ2luZwogICAgICAgIGF3YWl0IG5ldyBQcm9taXNlKHJlc29sdmUgPT4gc2V0VGltZW91dChyZXNvbHZlLCAxMCkpOwoKICAgIH0KfQo%3D"
payload = {}
headers = {
  'ApiKey': '<API Key>'
}
response = requests.request("GET", url, headers=headers, data=payload)
print(response.text)

- Java
OkHttpClient client = new OkHttpClient().newBuilder()
  .build();
Request request = new Request.Builder()
  .url("https://api.ujeebu.com/scrape?url=https://ujeebu.com&response_type=screenshot&json=false&js=true&wait_for=CmFzeW5jICgpID0%2BIHsKICAgIHdoaWxlICh0cnVlKSB7CiAgICAgICAgLy8gc2VsZWN0IHRoZSBsb2FkZXIgYm94CiAgICAgICAgbGV0IGxvYWRlciA9IGRvY3VtZW50LnF1ZXJ5U2VsZWN0b3IoJy5sb2FkZXJCb3gnKTsKICAgICAgICBpZiAoIWxvYWRlcikgewogICAgICAgICAgICByZXR1cm47CiAgICAgICAgfQogICAgICAgIC8vIGdldCB0aGUgc3R5bGUgb2YgbG9hZGVyIGJveAogICAgICAgIGxldCBzdHlsZSA9IHdpbmRvdy5nZXRDb21wdXRlZFN0eWxlKGxvYWRlcik7CgogICAgICAgIC8vIHN0b3AgaWYgYm94IGlzIGhpZGRlbgogICAgICAgIGlmIChzdHlsZS5kaXNwbGF5ID09PSAnbm9uZScpIHsKICAgICAgICAgICAgcmV0dXJuOwogICAgICAgIH0KCiAgICAgICAgLy8gd2FpdCAxMG1zIHRvIHByZXZlbnQgYXBwIGZyb20gaGFuZ2luZwogICAgICAgIGF3YWl0IG5ldyBQcm9taXNlKHJlc29sdmUgPT4gc2V0VGltZW91dChyZXNvbHZlLCAxMCkpOwoKICAgIH0KfQo%3D")
  .method("GET", null)
  .addHeader("ApiKey", "<API Key>")
  .build();
Response response = client.newCall(request).execute();

- PHP
<?php
$curl = curl_init();
curl_setopt_array($curl, array(
  CURLOPT_URL => 'https://api.ujeebu.com/scrape?url=https://ujeebu.com&response_type=screenshot&json=false&js=true&wait_for=CmFzeW5jICgpID0%2BIHsKICAgIHdoaWxlICh0cnVlKSB7CiAgICAgICAgLy8gc2VsZWN0IHRoZSBsb2FkZXIgYm94CiAgICAgICAgbGV0IGxvYWRlciA9IGRvY3VtZW50LnF1ZXJ5U2VsZWN0b3IoJy5sb2FkZXJCb3gnKTsKICAgICAgICBpZiAoIWxvYWRlcikgewogICAgICAgICAgICByZXR1cm47CiAgICAgICAgfQogICAgICAgIC8vIGdldCB0aGUgc3R5bGUgb2YgbG9hZGVyIGJveAogICAgICAgIGxldCBzdHlsZSA9IHdpbmRvdy5nZXRDb21wdXRlZFN0eWxlKGxvYWRlcik7CgogICAgICAgIC8vIHN0b3AgaWYgYm94IGlzIGhpZGRlbgogICAgICAgIGlmIChzdHlsZS5kaXNwbGF5ID09PSAnbm9uZScpIHsKICAgICAgICAgICAgcmV0dXJuOwogICAgICAgIH0KCiAgICAgICAgLy8gd2FpdCAxMG1zIHRvIHByZXZlbnQgYXBwIGZyb20gaGFuZ2luZwogICAgICAgIGF3YWl0IG5ldyBQcm9taXNlKHJlc29sdmUgPT4gc2V0VGltZW91dChyZXNvbHZlLCAxMCkpOwoKICAgIH0KfQo%3D',
  CURLOPT_RETURNTRANSFER => true,
  CURLOPT_ENCODING => '',
  CURLOPT_MAXREDIRS => 10,
  CURLOPT_TIMEOUT => 0,
  CURLOPT_FOLLOWLOCATION => true,
  CURLOPT_HTTP_VERSION => CURL_HTTP_VERSION_1_1,
  CURLOPT_CUSTOMREQUEST => 'GET',
  CURLOPT_HTTPHEADER => array(
    'ApiKey: <API Key>'
  ),
));
$response = curl_exec($curl);
curl_close($curl);
echo $response;

- Go
package main

import (
  "fmt"
  "net/http"
  "io/ioutil"
)

func main() {
  url := "https://api.ujeebu.com/scrape?url=https://ujeebu.com&response_type=screenshot&json=false&js=true&wait_for=CmFzeW5jICgpID0%2BIHsKICAgIHdoaWxlICh0cnVlKSB7CiAgICAgICAgLy8gc2VsZWN0IHRoZSBsb2FkZXIgYm94CiAgICAgICAgbGV0IGxvYWRlciA9IGRvY3VtZW50LnF1ZXJ5U2VsZWN0b3IoJy5sb2FkZXJCb3gnKTsKICAgICAgICBpZiAoIWxvYWRlcikgewogICAgICAgICAgICByZXR1cm47CiAgICAgICAgfQogICAgICAgIC8vIGdldCB0aGUgc3R5bGUgb2YgbG9hZGVyIGJveAogICAgICAgIGxldCBzdHlsZSA9IHdpbmRvdy5nZXRDb21wdXRlZFN0eWxlKGxvYWRlcik7CgogICAgICAgIC8vIHN0b3AgaWYgYm94IGlzIGhpZGRlbgogICAgICAgIGlmIChzdHlsZS5kaXNwbGF5ID09PSAnbm9uZScpIHsKICAgICAgICAgICAgcmV0dXJuOwogICAgICAgIH0KCiAgICAgICAgLy8gd2FpdCAxMG1zIHRvIHByZXZlbnQgYXBwIGZyb20gaGFuZ2luZwogICAgICAgIGF3YWl0IG5ldyBQcm9taXNlKHJlc29sdmUgPT4gc2V0VGltZW91dChyZXNvbHZlLCAxMCkpOwoKICAgIH0KfQo%3D"
  method := "GET"
  client := &http.Client{}
  req, err := http.NewRequest(method, url, nil)
  if err != nil {
    fmt.Println(err)
    return
  }
  req.Header.Add("ApiKey", "<API Key>")
  res, err := client.Do(req)
  if err != nil {
    fmt.Println(err)
    return
  }
  defer res.Body.Close()
  body, err := ioutil.ReadAll(res.Body)
  if err != nil {
    fmt.Println(err)
    return
  }
  fmt.Println(string(body))
}
Running custom JavaScript​
custom_js = string|base64-string
It is sometimes necessary to run custom JavaScript before rendering a page, for example to click a button that loads more items or to delete unwanted HTML elements. You can send custom JavaScript to be executed in the page context as a base64-string, as in the following example:
// base64 string of 'document.querySelector(".load-more").click()'
custom_js = 'ZG9jdW1lbnQucXVlcnlTZWxlY3RvcigiLmxvYWQtbW9yZSIpLmNsaWNrKCk=';
Note: Make sure to URL-encode the value if you're using GET.
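The base64 value shown above can be reproduced, and an existing value inspected, with a couple of lines of Python. The function names below are hypothetical; the snippet being encoded is the one from the example.

```python
import base64


def encode_custom_js(js_source):
    """Base64-encode JavaScript source for the custom_js parameter."""
    return base64.b64encode(js_source.encode("utf-8")).decode("ascii")


def decode_custom_js(encoded):
    """Recover the JavaScript source from a base64 custom_js value."""
    return base64.b64decode(encoded).decode("utf-8")


custom_js = encode_custom_js('document.querySelector(".load-more").click()')
print(custom_js)
print(decode_custom_js(custom_js))
```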
Downloading files​
When the content type returned by the target URL is not HTML, text or image, the API will download files up to 2MB and return the binary file.
Downloading page as PDF​
Set parameter response_type to 'pdf' and json to 'false' (default value) to generate a PDF file of an HTML page.
Taking a screenshot​
Set parameter response_type to 'screenshot' and json to 'false' (default value) to return the screenshot of an HTML page.
- cURL
curl --location --request GET 'https://api.ujeebu.com/scrape?url=https://ipinfo.io&response_type=screenshot&screenshot_fullpage=true&json=false&js=1' \
  --header 'ApiKey: <API Key>'

- NodeJs
var request = require('request');
var options = {
  'method': 'GET',
  'url': 'https://api.ujeebu.com/scrape?url=https://ipinfo.io&response_type=screenshot&screenshot_fullpage=true&json=false&js=1',
  'headers': {
    'ApiKey': '<API Key>'
  }
};
request(options, function (error, response) {
  if (error) throw new Error(error);
  console.log(response.body);
});

- Python
import requests

url = "https://api.ujeebu.com/scrape?url=https://ipinfo.io&response_type=screenshot&screenshot_fullpage=true&json=false&js=1"
payload = {}
headers = {
  'ApiKey': '<API Key>'
}
response = requests.request("GET", url, headers=headers, data=payload)
print(response.text)

- Java
OkHttpClient client = new OkHttpClient().newBuilder()
  .build();
Request request = new Request.Builder()
  .url("https://api.ujeebu.com/scrape?url=https://ipinfo.io&response_type=screenshot&screenshot_fullpage=true&json=false&js=1")
  .method("GET", null)
  .addHeader("ApiKey", "<API Key>")
  .build();
Response response = client.newCall(request).execute();

- PHP
<?php
$curl = curl_init();
curl_setopt_array($curl, array(
  CURLOPT_URL => 'https://api.ujeebu.com/scrape?url=https://ipinfo.io&response_type=screenshot&screenshot_fullpage=true&json=false&js=1',
  CURLOPT_RETURNTRANSFER => true,
  CURLOPT_ENCODING => '',
  CURLOPT_MAXREDIRS => 10,
  CURLOPT_TIMEOUT => 0,
  CURLOPT_FOLLOWLOCATION => true,
  CURLOPT_HTTP_VERSION => CURL_HTTP_VERSION_1_1,
  CURLOPT_CUSTOMREQUEST => 'GET',
  CURLOPT_HTTPHEADER => array(
    'ApiKey: <API Key>'
  ),
));
$response = curl_exec($curl);
curl_close($curl);
echo $response;

- Go
package main

import (
  "fmt"
  "net/http"
  "io/ioutil"
)

func main() {
  url := "https://api.ujeebu.com/scrape?url=https://ipinfo.io&response_type=screenshot&screenshot_fullpage=true&json=false&js=1"
  method := "GET"
  client := &http.Client{}
  req, err := http.NewRequest(method, url, nil)
  if err != nil {
    fmt.Println(err)
    return
  }
  req.Header.Add("ApiKey", "<API Key>")
  res, err := client.Do(req)
  if err != nil {
    fmt.Println(err)
    return
  }
  defer res.Body.Close()
  body, err := ioutil.ReadAll(res.Body)
  if err != nil {
    fmt.Println(err)
    return
  }
  fmt.Println(string(body))
}
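Since the screenshot comes back as PNG binary data, a client will usually want to write the raw bytes to a file rather than print them as text. The sketch below checks the standard 8-byte PNG signature before saving. The function names are hypothetical, and the assumption that a non-PNG body indicates an error response is ours, not something the API documents.

```python
import base64

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # first 8 bytes of every PNG file


def is_png(data):
    """Return True if the bytes start with the PNG file signature."""
    return data[:8] == PNG_SIGNATURE


def save_screenshot(data, path):
    """Write screenshot bytes to `path`, refusing anything that is not a PNG."""
    if not is_png(data):
        # Assumption: a non-PNG body is most likely a JSON error response.
        raise ValueError("body is not a PNG image")
    with open(path, "wb") as fh:
        fh.write(data)


# The base64 screenshot sample earlier on this page starts with these characters.
print(is_png(base64.b64decode("iVBORw0KGgo=")))
```

With the `requests` library, the bytes to pass in would be `response.content` rather than `response.text`.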
You can take a partial screenshot by setting the parameter screenshot_partial to either:
- a valid selector of the element to screenshot
- a valid JSON string with the coordinates x, y, width and height of the rectangle to screenshot
- cURL
curl --location --request GET 'https://api.ujeebu.com/scrape?response_type=screenshot&js=1&url=http://whatsmyuseragent.org/&screenshot_partial=.intro-text' \
  --header 'ApiKey: <API Key>'

- NodeJs
var request = require('request');
var options = {
  'method': 'GET',
  'url': 'https://api.ujeebu.com/scrape?response_type=screenshot&js=1&url=http://whatsmyuseragent.org/&screenshot_partial=.intro-text',
  'headers': {
    'ApiKey': '<API Key>'
  }
};
request(options, function (error, response) {
  if (error) throw new Error(error);
  console.log(response.body);
});

- Python
import requests

url = "https://api.ujeebu.com/scrape?response_type=screenshot&js=1&url=http://whatsmyuseragent.org/&screenshot_partial=.intro-text"
payload = {}
headers = {
  'ApiKey': '<API Key>'
}
response = requests.request("GET", url, headers=headers, data=payload)
print(response.text)

- Java
OkHttpClient client = new OkHttpClient().newBuilder()
  .build();
Request request = new Request.Builder()
  .url("https://api.ujeebu.com/scrape?response_type=screenshot&js=1&url=http://whatsmyuseragent.org/&screenshot_partial=.intro-text")
  .method("GET", null)
  .addHeader("ApiKey", "<API Key>")
  .build();
Response response = client.newCall(request).execute();

- PHP
<?php
$curl = curl_init();
curl_setopt_array($curl, array(
  CURLOPT_URL => 'https://api.ujeebu.com/scrape?response_type=screenshot&js=1&url=http://whatsmyuseragent.org/&screenshot_partial=.intro-text',
  CURLOPT_RETURNTRANSFER => true,
  CURLOPT_ENCODING => '',
  CURLOPT_MAXREDIRS => 10,
  CURLOPT_TIMEOUT => 0,
  CURLOPT_FOLLOWLOCATION => true,
  CURLOPT_HTTP_VERSION => CURL_HTTP_VERSION_1_1,
  CURLOPT_CUSTOMREQUEST => 'GET',
  CURLOPT_HTTPHEADER => array(
    'ApiKey: <API Key>'
  ),
));
$response = curl_exec($curl);
curl_close($curl);
echo $response;

- Go
package main

import (
  "fmt"
  "net/http"
  "io/ioutil"
)

func main() {
  url := "https://api.ujeebu.com/scrape?response_type=screenshot&js=1&url=http://whatsmyuseragent.org/&screenshot_partial=.intro-text"
  method := "GET"
  client := &http.Client{}
  req, err := http.NewRequest(method, url, nil)
  if err != nil {
    fmt.Println(err)
    return
  }
  req.Header.Add("ApiKey", "<API Key>")
  res, err := client.Do(req)
  if err != nil {
    fmt.Println(err)
    return
  }
  defer res.Body.Close()
  body, err := ioutil.ReadAll(res.Body)
  if err != nil {
    fmt.Println(err)
    return
  }
  fmt.Println(string(body))
}
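The examples above use the selector form of screenshot_partial. For the coordinate form, a JSON string can be built as in this sketch. The key names x, y, width and height come from the parameter description; the helper name `partial_rect` and the exact JSON layout the API accepts are assumptions.

```python
import json
from urllib.parse import quote


def partial_rect(x, y, width, height):
    """Serialize a rectangle as a compact JSON string for screenshot_partial."""
    rect = {"x": x, "y": y, "width": width, "height": height}
    return json.dumps(rect, separators=(",", ":"))


value = partial_rect(0, 0, 400, 300)
print(value)
# Braces, quotes and colons must be percent-encoded in a GET query string.
print(quote(value, safe=""))
```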
Rendering on a mobile device​
Set parameter device to the name of the mobile device you would like to emulate. Below is a list of devices we currently support:
List of available mobile devices
- Blackberry PlayBook
- Blackberry PlayBook landscape
- BlackBerry Z30
- BlackBerry Z30 landscape
- Galaxy Note 3
- Galaxy Note 3 landscape
- Galaxy Note II
- Galaxy Note II landscape
- Galaxy S III
- Galaxy S III landscape
- Galaxy S5
- Galaxy S5 landscape
- iPad
- iPad landscape
- iPad Mini
- iPad Mini landscape
- iPad Pro
- iPad Pro landscape
- iPhone 4
- iPhone 4 landscape
- iPhone 5
- iPhone 5 landscape
- iPhone 6
- iPhone 6 landscape
- iPhone 6 Plus
- iPhone 6 Plus landscape
- iPhone 7
- iPhone 7 landscape
- iPhone 7 Plus
- iPhone 7 Plus landscape
- iPhone 8
- iPhone 8 landscape
- iPhone 8 Plus
- iPhone 8 Plus landscape
- iPhone SE
- iPhone SE landscape
- iPhone X
- iPhone X landscape
- iPhone XR
- iPhone XR landscape
- JioPhone 2
- JioPhone 2 landscape
- Kindle Fire HDX
- Kindle Fire HDX landscape
- LG Optimus L70
- LG Optimus L70 landscape
- Microsoft Lumia 550
- Microsoft Lumia 950
- Microsoft Lumia 950 landscape
- Nexus 10
- Nexus 10 landscape
- Nexus 4
- Nexus 4 landscape
- Nexus 5
- Nexus 5 landscape
- Nexus 5X
- Nexus 5X landscape
- Nexus 6
- Nexus 6 landscape
- Nexus 6P
- Nexus 6P landscape
- Nexus 7
- Nexus 7 landscape
- Nokia Lumia 520
- Nokia Lumia 520 landscape
- Nokia N9
- Nokia N9 landscape
- Pixel 2
- Pixel 2 landscape
- Pixel 2 XL
- Pixel 2 XL landscape
To use a custom mobile device outside the list above, set parameter device to 'mobile', then specify your custom device viewport dimensions using window_width and window_height.
- cURL
curl --location --request GET 'https://api.ujeebu.com/scrape?url=https://ujeebu.com&response_type=screenshot&json=false&js=true&device=iPhone%208&wait_for=1000&screenshot_fullpage=true' \
  --header 'ApiKey: <API Key>'

- NodeJs
var request = require('request');
var options = {
  'method': 'GET',
  'url': 'https://api.ujeebu.com/scrape?url=https://ujeebu.com&response_type=screenshot&json=false&js=true&device=iPhone%208&wait_for=1000&screenshot_fullpage=true',
  'headers': {
    'ApiKey': '<API Key>'
  }
};
request(options, function (error, response) {
  if (error) throw new Error(error);
  console.log(response.body);
});

- Python
import requests

url = "https://api.ujeebu.com/scrape?url=https://ujeebu.com&response_type=screenshot&json=false&js=true&device=iPhone%208&wait_for=1000&screenshot_fullpage=true"
payload = {}
headers = {
  'ApiKey': '<API Key>'
}
response = requests.request("GET", url, headers=headers, data=payload)
print(response.text)

- Java
OkHttpClient client = new OkHttpClient().newBuilder()
  .build();
Request request = new Request.Builder()
  .url("https://api.ujeebu.com/scrape?url=https://ujeebu.com&response_type=screenshot&json=false&js=true&device=iPhone%208&wait_for=1000&screenshot_fullpage=true")
  .method("GET", null)
  .addHeader("ApiKey", "<API Key>")
  .build();
Response response = client.newCall(request).execute();

- PHP
<?php
$curl = curl_init();
curl_setopt_array($curl, array(
  CURLOPT_URL => 'https://api.ujeebu.com/scrape?url=https://ujeebu.com&response_type=screenshot&json=false&js=true&device=iPhone%208&wait_for=1000&screenshot_fullpage=true',
  CURLOPT_RETURNTRANSFER => true,
  CURLOPT_ENCODING => '',
  CURLOPT_MAXREDIRS => 10,
  CURLOPT_TIMEOUT => 0,
  CURLOPT_FOLLOWLOCATION => true,
  CURLOPT_HTTP_VERSION => CURL_HTTP_VERSION_1_1,
  CURLOPT_CUSTOMREQUEST => 'GET',
  CURLOPT_HTTPHEADER => array(
    'ApiKey: <API Key>'
  ),
));
$response = curl_exec($curl);
curl_close($curl);
echo $response;

- Go
package main

import (
  "fmt"
  "net/http"
  "io/ioutil"
)

func main() {
  url := "https://api.ujeebu.com/scrape?url=https://ujeebu.com&response_type=screenshot&json=false&js=true&device=iPhone%208&wait_for=1000&screenshot_fullpage=true"
  method := "GET"
  client := &http.Client{}
  req, err := http.NewRequest(method, url, nil)
  if err != nil {
    fmt.Println(err)
    return
  }
  req.Header.Add("ApiKey", "<API Key>")
  res, err := client.Do(req)
  if err != nil {
    fmt.Println(err)
    return
  }
  defer res.Body.Close()
  body, err := ioutil.ReadAll(res.Body)
  if err != nil {
    fmt.Println(err)
    return
  }
  fmt.Println(string(body))
}
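Device names contain spaces, so they need percent-encoding in a GET URL. The sketch below builds the device-related part of the query string for both a named device and a custom viewport. The helper name `device_params` and the example dimensions are hypothetical.

```python
from urllib.parse import urlencode, quote


def device_params(device, width=None, height=None):
    """Return the device-related query string fragment.

    width and height map to window_width and window_height and are only
    included when given.
    """
    params = {"device": device}
    if width is not None:
        params["window_width"] = width
    if height is not None:
        params["window_height"] = height
    # quote_via=quote encodes the space in 'iPhone 8' as %20.
    return urlencode(params, quote_via=quote)


print(device_params("iPhone 8"))
print(device_params("mobile", width=360, height=740))
```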
Removing unwanted elements​
Use parameter strip_tags to pass a comma-separated list of CSS selectors of the elements you would like to delete before returning the result. In the example below we remove all 'meta', 'form' and 'input' tags, as well as any elements with class 'hidden':
strip_tags=meta,form,.hidden,input
In the example below we take a screenshot of page 'http://whatsmyuseragent.org/' after removing the div that contains the IP address part:
- cURL
curl --location --request GET 'https://api.ujeebu.com/scrape?url=http://whatsmyuseragent.org&response_type=screenshot&json=false&js=true&strip_tags=.ip-address&screenshot_fullpage=true' \
  --header 'ApiKey: <API Key>'

- NodeJs
var request = require('request');
var options = {
  'method': 'GET',
  'url': 'https://api.ujeebu.com/scrape?url=http://whatsmyuseragent.org&response_type=screenshot&json=false&js=true&strip_tags=.ip-address&screenshot_fullpage=true',
  'headers': {
    'ApiKey': '<API Key>'
  }
};
request(options, function (error, response) {
  if (error) throw new Error(error);
  console.log(response.body);
});

- Python
import requests

url = "https://api.ujeebu.com/scrape?url=http://whatsmyuseragent.org&response_type=screenshot&json=false&js=true&strip_tags=.ip-address&screenshot_fullpage=true"
payload = {}
headers = {
  'ApiKey': '<API Key>'
}
response = requests.request("GET", url, headers=headers, data=payload)
print(response.text)

- Java
OkHttpClient client = new OkHttpClient().newBuilder()
  .build();
Request request = new Request.Builder()
  .url("https://api.ujeebu.com/scrape?url=http://whatsmyuseragent.org&response_type=screenshot&json=false&js=true&strip_tags=.ip-address&screenshot_fullpage=true")
  .method("GET", null)
  .addHeader("ApiKey", "<API Key>")
  .build();
Response response = client.newCall(request).execute();

- PHP
<?php
$curl = curl_init();
curl_setopt_array($curl, array(
  CURLOPT_URL => 'https://api.ujeebu.com/scrape?url=http://whatsmyuseragent.org&response_type=screenshot&json=false&js=true&strip_tags=.ip-address&screenshot_fullpage=true',
  CURLOPT_RETURNTRANSFER => true,
  CURLOPT_ENCODING => '',
  CURLOPT_MAXREDIRS => 10,
  CURLOPT_TIMEOUT => 0,
  CURLOPT_FOLLOWLOCATION => true,
  CURLOPT_HTTP_VERSION => CURL_HTTP_VERSION_1_1,
  CURLOPT_CUSTOMREQUEST => 'GET',
  CURLOPT_HTTPHEADER => array(
    'ApiKey: <API Key>'
  ),
));
$response = curl_exec($curl);
curl_close($curl);
echo $response;

- Go
package main

import (
  "fmt"
  "net/http"
  "io/ioutil"
)

func main() {
  url := "https://api.ujeebu.com/scrape?url=http://whatsmyuseragent.org&response_type=screenshot&json=false&js=true&strip_tags=.ip-address&screenshot_fullpage=true"
  method := "GET"
  client := &http.Client{}
  req, err := http.NewRequest(method, url, nil)
  if err != nil {
    fmt.Println(err)
    return
  }
  req.Header.Add("ApiKey", "<API Key>")
  res, err := client.Do(req)
  if err != nil {
    fmt.Println(err)
    return
  }
  defer res.Body.Close()
  body, err := ioutil.ReadAll(res.Body)
  if err != nil {
    fmt.Println(err)
    return
  }
  fmt.Println(string(body))
}
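When the selectors to strip are held in a list, the strip_tags value is a plain comma join. The sketch below, with the hypothetical helper `strip_tags_value`, also rejects selectors that contain a comma themselves, on the assumption that the API splits the value on commas.

```python
def strip_tags_value(selectors):
    """Join CSS selectors into the comma-separated strip_tags value."""
    cleaned = [s.strip() for s in selectors if s.strip()]
    for selector in cleaned:
        # Assumption: the API splits on ',', so a selector list such as
        # 'div, span' has to be passed as two separate entries.
        if "," in selector:
            raise ValueError(f"selector contains a comma: {selector!r}")
    return ",".join(cleaned)


print(strip_tags_value(["meta", "form", ".hidden", "input"]))
```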
Using a custom user agent​
By default, the Scrape API sends its own user agent header (Chrome's headless browser header). To use a different user agent, set the parameter useragent:
- cURL
curl --location --request GET 'https://api.ujeebu.com/scrape?url=http://whatsmyuseragent.org&response_type=screenshot&json=false&js=true&useragent=My%20custom%20user%20agent' \
  --header 'ApiKey: <API Key>'

- NodeJs
var request = require('request');
var options = {
  'method': 'GET',
  'url': 'https://api.ujeebu.com/scrape?url=http://whatsmyuseragent.org&response_type=screenshot&json=false&js=true&useragent=My%20custom%20user%20agent',
  'headers': {
    'ApiKey': '<API Key>'
  }
};
request(options, function (error, response) {
  if (error) throw new Error(error);
  console.log(response.body);
});

- Python
import requests

url = "https://api.ujeebu.com/scrape?url=http://whatsmyuseragent.org&response_type=screenshot&json=false&js=true&useragent=My%20custom%20user%20agent"
payload = {}
headers = {
  'ApiKey': '<API Key>'
}
response = requests.request("GET", url, headers=headers, data=payload)
print(response.text)

- Java
OkHttpClient client = new OkHttpClient().newBuilder()
  .build();
Request request = new Request.Builder()
  .url("https://api.ujeebu.com/scrape?url=http://whatsmyuseragent.org&response_type=screenshot&json=false&js=true&useragent=My%20custom%20user%20agent")
  .method("GET", null)
  .addHeader("ApiKey", "<API Key>")
  .build();
Response response = client.newCall(request).execute();

- PHP
<?php
$curl = curl_init();
curl_setopt_array($curl, array(
  CURLOPT_URL => 'https://api.ujeebu.com/scrape?url=http://whatsmyuseragent.org&response_type=screenshot&json=false&js=true&useragent=My%20custom%20user%20agent',
  CURLOPT_RETURNTRANSFER => true,
  CURLOPT_ENCODING => '',
  CURLOPT_MAXREDIRS => 10,
  CURLOPT_TIMEOUT => 0,
  CURLOPT_FOLLOWLOCATION => true,
  CURLOPT_HTTP_VERSION => CURL_HTTP_VERSION_1_1,
  CURLOPT_CUSTOMREQUEST => 'GET',
  CURLOPT_HTTPHEADER => array(
    'ApiKey: <API Key>'
  ),
));
$response = curl_exec($curl);
curl_close($curl);
echo $response;

- Go
package main

import (
  "fmt"
  "net/http"
  "io/ioutil"
)

func main() {
  url := "https://api.ujeebu.com/scrape?url=http://whatsmyuseragent.org&response_type=screenshot&json=false&js=true&useragent=My%20custom%20user%20agent"
  method := "GET"
  client := &http.Client{}
  req, err := http.NewRequest(method, url, nil)
  if err != nil {
    fmt.Println(err)
    return
  }
  req.Header.Add("ApiKey", "<API Key>")
  res, err := client.Do(req)
  if err != nil {
    fmt.Println(err)
    return
  }
  defer res.Body.Close()
  body, err := ioutil.ReadAll(res.Body)
  if err != nil {
    fmt.Println(err)
    return
  }
  fmt.Println(string(body))
}
Passing custom header/cookies​
The following will forward headers 'Username' and 'Authorisation' using prefix 'UJB-':
curl -i \
-H 'UJB-Username: ujeebu' \
-H 'UJB-Authorisation: Basic dXNlcm5dhsWU6cGFzc3dvcmQ=' \
-H 'ApiKey: <API Key>' \
-X GET \
https://api.ujeebu.com/scrape?url=https://medium.com/personal-growth/how-to-be-yourself-2221085391a3
To send cookies to a target URL, use parameter cookies to pass a list of semicolon-separated 'CookieName=CookieValue' values, e.g.: Cookie1=Value1;Cookie2=Value2
- cURL
curl --location --request GET 'https://api.ujeebu.com/scrape?url=https://ipinfo.io&response_type=source&json=false&js=1&cookies=sessionId=cffbc7e688864b6811f676e181bc29e6;userToken=d6a1e2453d44f89' \
  --header 'UJB-CUSTOM-HEADER: <CUSTOM-HEADER-VALUE>' \
  --header 'ApiKey: <API Key>'

- NodeJs
var request = require('request');
var options = {
  'method': 'GET',
  'url': 'https://api.ujeebu.com/scrape?url=https://ipinfo.io&response_type=source&json=false&js=1&cookies=sessionId=cffbc7e688864b6811f676e181bc29e6;userToken=d6a1e2453d44f89',
  'headers': {
    'ApiKey': '<API Key>',
    'UJB-CUSTOM-HEADER': '<CUSTOM-HEADER-VALUE>'
  }
};
request(options, function (error, response) {
  if (error) throw new Error(error);
  console.log(response.body);
});

- Python
import requests

url = "https://api.ujeebu.com/scrape?url=https://ipinfo.io&response_type=source&json=false&js=1&cookies=sessionId=cffbc7e688864b6811f676e181bc29e6;userToken=d6a1e2453d44f89"
payload = {}
headers = {
  'ApiKey': '<API Key>',
  'UJB-CUSTOM-HEADER': '<CUSTOM-HEADER-VALUE>'
}
response = requests.request("GET", url, headers=headers, data=payload)
print(response.text)

- Java
OkHttpClient client = new OkHttpClient().newBuilder()
  .build();
Request request = new Request.Builder()
  .url("https://api.ujeebu.com/scrape?url=https://ipinfo.io&response_type=source&json=false&js=1&cookies=sessionId=cffbc7e688864b6811f676e181bc29e6;userToken=d6a1e2453d44f89")
  .method("GET", null)
  .addHeader("UJB-CUSTOM-HEADER", "<CUSTOM-HEADER-VALUE>")
  .addHeader("ApiKey", "<API Key>")
  .build();
Response response = client.newCall(request).execute();

- PHP
<?php
$curl = curl_init();
curl_setopt_array($curl, array(
  CURLOPT_URL => 'https://api.ujeebu.com/scrape?url=https://ipinfo.io&response_type=source&json=false&js=1&cookies=sessionId=cffbc7e688864b6811f676e181bc29e6;userToken=d6a1e2453d44f89',
  CURLOPT_RETURNTRANSFER => true,
  CURLOPT_ENCODING => '',
  CURLOPT_MAXREDIRS => 10,
  CURLOPT_TIMEOUT => 0,
  CURLOPT_FOLLOWLOCATION => true,
  CURLOPT_HTTP_VERSION => CURL_HTTP_VERSION_1_1,
  CURLOPT_CUSTOMREQUEST => 'GET',
  CURLOPT_HTTPHEADER => array(
    'ApiKey: <API Key>',
    'UJB-CUSTOM-HEADER: <CUSTOM-HEADER-VALUE>'
  ),
));
$response = curl_exec($curl);
curl_close($curl);
echo $response;

- Go
package main

import (
  "fmt"
  "net/http"
  "io/ioutil"
)

func main() {
  url := "https://api.ujeebu.com/scrape?url=https://ipinfo.io&response_type=source&json=false&js=1&cookies=sessionId=cffbc7e688864b6811f676e181bc29e6;userToken=d6a1e2453d44f89"
  method := "GET"
  client := &http.Client{}
  req, err := http.NewRequest(method, url, nil)
  if err != nil {
    fmt.Println(err)
    return
  }
  req.Header.Add("UJB-CUSTOM-HEADER", "<CUSTOM-HEADER-VALUE>")
  req.Header.Add("ApiKey", "<API Key>")
  res, err := client.Do(req)
  if err != nil {
    fmt.Println(err)
    return
  }
  defer res.Body.Close()
  body, err := ioutil.ReadAll(res.Body)
  if err != nil {
    fmt.Println(err)
    return
  }
  fmt.Println(string(body))
}
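The value of the cookies parameter can be assembled from a dictionary as in the sketch below. The helper name `cookies_value` is hypothetical, and rejecting ';' and '=' inside names is a defensive assumption, since those characters act as separators in the documented format.

```python
def cookies_value(cookies):
    """Join cookies into the 'Name=Value;Name=Value' form described above."""
    parts = []
    for name, value in cookies.items():
        # Assumption: separators inside a name or value would be ambiguous.
        if ";" in name or "=" in name or ";" in str(value):
            raise ValueError(f"cookie {name!r} contains a separator character")
        parts.append(f"{name}={value}")
    return ";".join(parts)


print(cookies_value({"Cookie1": "Value1", "Cookie2": "Value2"}))
```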
Transparent Header​
When response_type is 'html', 'pdf' or 'screenshot', the endpoint will forward all headers prefixed with UJB- to the URL.
When response_type is 'raw', the headers will be forwarded as they are, with no prefix.
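The prefix rule above can be sketched as a small helper that prepares the headers for a request. The function name build_forward_headers is hypothetical and not part of the API; only the UJB- prefix, the ApiKey header and the response_type values come from this page.

```python
def build_forward_headers(api_key, target_headers, response_type="html"):
    """Return the headers to send to the Scrape API so that
    `target_headers` are forwarded to the scraped URL.

    Hypothetical helper: only the 'UJB-' prefix rule is taken from the docs.
    """
    headers = {"ApiKey": api_key}
    for name, value in target_headers.items():
        if response_type == "raw":
            # 'raw': headers are forwarded as they are, no prefix
            headers[name] = value
        else:
            # 'html', 'pdf', 'screenshot': mark the header with the UJB- prefix
            headers["UJB-" + name] = value
    return headers


print(build_forward_headers("<API Key>", {"Accept-Language": "fr-FR"}))
```

The returned dictionary can be passed as the headers argument of whichever HTTP client is in use.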
Using premium (residential) proxies​
For "tough" URLs that block data center IPs, or if you would like to access a URL from a location other than where our proxies are located, you might want to try using a premium/residential proxy.
To use our premium proxy feature, set the parameter proxy_type to 'premium'.
info
To use a premium proxy from a specific country, set the parameter proxy_country to the ISO 3166-1 alpha-2 country code of one of the following:
Supported countries
- Algeria:
DZ
- Angola:
AO
- Benin:
BJ
- Botswana:
BW
- Burkina Faso:
BF
- Burundi:
BI
- Cameroon:
CM
- Central African Republic:
CF
- Chad:
TD
- Democratic Republic of the Congo:
CD
- Djibouti:
DJ
- Egypt:
EG
- Equatorial Guinea:
GQ
- Eritrea:
ER
- Ethiopia:
ET
- Gabon:
GA
- Gambia:
GM
- Ghana:
GH
- Guinea:
GN
- Guinea Bissau:
GW
- Ivory Coast:
CI
- Kenya:
KE
- Lesotho:
LS
- Liberia:
LR
- Libya:
LY
- Madagascar:
MG
- Malawi:
MW
- Mali:
ML
- Mauritania:
MR
- Morocco:
MA
- Mozambique:
MZ
- Namibia:
NA
- Niger:
NE
- Nigeria:
NG
- Republic of the Congo:
CG
- Rwanda:
RW
- Senegal:
SN
- Sierra Leone:
SL
- Somalia:
SO
- Somaliland:
ML
- South Africa:
ZA
- South Sudan:
SS
- Sudan:
SD
- Swaziland:
SZ
- Tanzania:
TZ
- Togo:
TG
- Tunisia:
TN
- Uganda:
UG
- Western Sahara:
EH
- Zambia:
ZM
- Zimbabwe:
ZW
- Afghanistan:
AF
- Armenia:
AM
- Azerbaijan:
AZ
- Bangladesh:
BD
- Bhutan:
BT
- Brunei:
BN
- Cambodia:
KH
- China:
CN
- East Timor:
TL
- Hong Kong:
HK
- India:
IN
- Indonesia:
ID
- Iran:
IR
- Iraq:
IQ
- Israel:
IL
- Japan:
JP
- Jordan:
JO
- Kazakhstan:
KZ
- Kuwait:
KW
- Kyrgyzstan:
KG
- Laos:
LA
- Lebanon:
LB
- Malaysia:
MY
- Maldives:
MV
- Mongolia:
MN
- Myanmar:
MM
- Nepal:
NP
- North Korea:
KP
- Oman:
OM
- Pakistan:
PK
- Palestine:
PS
- Philippines:
PH
- Qatar:
QA
- Saudi Arabia:
SA
- Singapore:
SG
- South Korea:
KR
- Sri Lanka:
LK
- Syria:
SY
- Taiwan:
TW
- Tajikistan:
TJ
- Thailand:
TH
- Turkey:
TR
- Turkmenistan:
TM
- United Arab Emirates:
AE
- Uzbekistan:
UZ
- Vietnam:
VN
- Yemen:
YE
- Albania:
AL
- Andorra:
AD
- Austria:
AT
- Belarus:
BY
- Belgium:
BE
- Bosnia and Herzegovina:
BA
- Bulgaria:
BG
- Croatia:
HR
- Cyprus:
CY
- Czech Republic:
CZ
- Denmark:
DK
- Estonia:
EE
- Finland:
FI
- France:
FR
- Germany:
DE
- Gibraltar:
GI
- Greece:
GR
- Hungary:
HU
- Iceland:
IS
- Ireland:
IE
- Italy:
IT
- Kosovo:
XK
- Latvia:
LV
- Liechtenstein:
LI
- Lithuania:
LT
- Luxembourg:
LU
- Macedonia:
MK
- Malta:
MT
- Moldova:
MD
- Monaco:
MC
- Montenegro:
ME
- Netherlands:
NL
- Northern Cyprus:
CY
- Norway:
NO
- Poland:
PL
- Portugal:
PT
- Romania:
RO
- Russia:
RU
- San Marino:
SM
- Serbia:
RS
- Slovakia:
SK
- Slovenia:
SI
- Spain:
ES
- Sweden:
SE
- Switzerland:
CH
- Ukraine:
UA
- United Kingdom:
GB
- Bahamas:
BS
- Belize:
BZ
- Bermuda:
BM
- Canada:
CA
- Costa Rica:
CR
- Cuba:
CU
- Dominican Republic:
DO
- El Salvador:
SV
- Greenland:
GL
- Guatemala:
GT
- Haiti:
HT
- Honduras:
HN
- Jamaica:
JM
- Nicaragua:
NI
- Panama:
PA
- Puerto Rico:
PR
- Trinidad And Tobago:
TT
- United States:
US
- Australia:
AU
- Fiji:
FJ
- New Caledonia:
NC
- New Zealand:
NZ
- Papua New Guinea:
PG
- Solomon Islands:
SB
- Vanuatu:
VU
- Argentina:
AR
- Bolivia:
BO
- Brazil:
BR
- Chile:
CL
- Colombia:
CO
- Ecuador:
EC
- Falkland Islands:
FK
- French Guiana:
GF
- Guyana:
GY
- Mexico:
MX
- Paraguay:
PY
- Peru:
PE
- Suriname:
SR
- Uruguay:
UY
- Venezuela:
VE
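As a minimal sketch of the two proxy parameters described above, the helper below builds a GET URL for a premium proxy request. The function name premium_scrape_url is hypothetical, and normalizing the country code to upper case is an assumption based on how the codes are listed here.

```python
from urllib.parse import urlencode

API_ENDPOINT = "https://api.ujeebu.com/scrape"


def premium_scrape_url(target_url, country=None):
    """Build a GET URL that asks for a premium (residential) proxy.

    Hypothetical helper; proxy_type and proxy_country are the documented
    parameter names, everything else is illustrative.
    """
    params = {"url": target_url, "proxy_type": "premium"}
    if country is not None:
        code = country.strip().upper()
        # ISO 3166-1 alpha-2 codes are exactly two ASCII letters
        if len(code) != 2 or not (code.isascii() and code.isalpha()):
            raise ValueError("expected a two-letter country code, got %r" % country)
        params["proxy_country"] = code
    # urlencode percent-encodes the target URL so it survives as one parameter
    return API_ENDPOINT + "?" + urlencode(params)


print(premium_scrape_url("https://ipinfo.io", country="us"))
```

The check only validates the shape of the code; it does not verify that the country is in the supported list.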
Passing POST/PUT data​
You can use parameter http_method to request a web page via either POST or PUT. If you need to send data in the request body, use parameter post_data to specify the data to forward.
info
You can specify the type of the forwarded POST/PUT data in post_data by using a custom header:
- To forward JSON data, send a valid JSON value in post_data and set header UJB-Content-Type: application/json
- To forward form data, send your form-encoded data in post_data and set header UJB-Content-Type: application/x-www-form-urlencoded
multipart/form-data (sending files) is not supported.
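The form data case could be prepared along these lines. This is a sketch under assumptions: form_post_request is a hypothetical helper, and the target URL is only a placeholder reused from the examples on this page.

```python
from urllib.parse import urlencode


def form_post_request(target_url, form_fields, api_key="<API Key>"):
    """Return (params, headers) for forwarding form-encoded POST data.

    Hypothetical helper: http_method, post_data and UJB-Content-Type are
    the documented names; the rest is illustrative.
    """
    params = {
        "url": target_url,
        "http_method": "POST",
        # post_data carries the already form-encoded body
        "post_data": urlencode(form_fields),
    }
    headers = {
        "ApiKey": api_key,
        "UJB-Content-Type": "application/x-www-form-urlencoded",
    }
    return params, headers


params, headers = form_post_request(
    "https://jsonplaceholder.typicode.com/posts", {"title": "hello title"}
)
print(params["post_data"])  # title=hello+title
```

With the third-party requests library, these values can be passed as requests.get("https://api.ujeebu.com/scrape", params=params, headers=headers), which percent-encodes post_data for the query string.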
Passing POST data​
- cURL
- NodeJs
- Python
- Java
- PHP
- Go
curl --location -g --request GET 'https://api.ujeebu.com/scrape?url=https://jsonplaceholder.typicode.com/posts&response_type=source&json=false&js=false&http_method=POST&post_data={"title": "hello title"}' \
--header 'UJB-Content-Type: application/json' \
--header 'ApiKey: <API Key>'
var request = require('request');var options = {'method': 'GET','url': 'https://api.ujeebu.com/scrape?url=https://jsonplaceholder.typicode.com/posts&response_type=source&json=false&js=false&http_method=POST&post_data={"title": "hello title"}','headers': {'UJB-Content-Type': 'application/json','ApiKey': '<API Key>'}};request(options, function (error, response) {if (error) throw new Error(error);console.log(response.body);});
import requests

url = "https://api.ujeebu.com/scrape?url=https://jsonplaceholder.typicode.com/posts&response_type=source&json=false&js=false&http_method=POST&post_data={\"title\": \"hello title\"}"

payload = {}
headers = {
    'UJB-Content-Type': 'application/json',
    'ApiKey': '<API Key>'
}

response = requests.request("GET", url, headers=headers, data=payload)

print(response.text)
OkHttpClient client = new OkHttpClient().newBuilder().build();Request request = new Request.Builder().url("https://api.ujeebu.com/scrape?url=https://jsonplaceholder.typicode.com/posts&response_type=source&json=false&js=false&http_method=POST&post_data={\"title\": \"hello title\"}").method("GET", null).addHeader("UJB-Content-Type", "application/json").addHeader("ApiKey", "<API Key>").build();Response response = client.newCall(request).execute();
<?php$curl = curl_init();curl_setopt_array($curl, array(CURLOPT_URL => 'https://api.ujeebu.com/scrape?url=https://jsonplaceholder.typicode.com/posts&response_type=source&json=false&js=false&http_method=POST&post_data=%7B%22title%22:%20%22hello%20title%22%7D',CURLOPT_RETURNTRANSFER => true,CURLOPT_ENCODING => '',CURLOPT_MAXREDIRS => 10,CURLOPT_TIMEOUT => 0,CURLOPT_FOLLOWLOCATION => true,CURLOPT_HTTP_VERSION => CURL_HTTP_VERSION_1_1,CURLOPT_CUSTOMREQUEST => 'GET',CURLOPT_HTTPHEADER => array('UJB-Content-Type: application/json','ApiKey: <API Key>'),));$response = curl_exec($curl);curl_close($curl);echo $response;
package mainimport ("fmt""net/http""io/ioutil")func main() {url := "https://api.ujeebu.com/scrape?url=https://jsonplaceholder.typicode.com/posts&response_type=source&json=false&js=false&http_method=POST&post_data=%7B%22title%22:%20%22hello%20title%22%7D"method := "GET"client := &http.Client {}req, err := http.NewRequest(method, url, nil)if err != nil {fmt.Println(err)return}req.Header.Add("UJB-Content-Type", "application/json")req.Header.Add("ApiKey", "<API Key>")res, err := client.Do(req)if err != nil {fmt.Println(err)return}defer res.Body.Close()body, err := ioutil.ReadAll(res.Body)if err != nil {fmt.Println(err)return}fmt.Println(string(body))}
Response produced by code above:
{"title": "hello title","id": 101}
Passing PUT data​
- cURL
- NodeJs
- Python
- Java
- PHP
- Go
curl --location -g --request GET 'https://api.ujeebu.com/scrape?url=https://jsonplaceholder.typicode.com/posts/1&response_type=source&json=false&js=false&http_method=PUT&post_data={"title": "put title"}' \
--header 'UJB-Content-Type: application/json' \
--header 'ApiKey: <API Key>'
var request = require('request');var options = {'method': 'GET','url': 'https://api.ujeebu.com/scrape?url=https://jsonplaceholder.typicode.com/posts/1&response_type=source&json=false&js=false&http_method=PUT&post_data={"title": "put title"}','headers': {'UJB-Content-Type': 'application/json','ApiKey': '<API Key>'}};request(options, function (error, response) {if (error) throw new Error(error);console.log(response.body);});
import requests

url = "https://api.ujeebu.com/scrape?url=https://jsonplaceholder.typicode.com/posts/1&response_type=source&json=false&js=false&http_method=PUT&post_data={\"title\": \"put title\"}"

payload = {}
headers = {
    'UJB-Content-Type': 'application/json',
    'ApiKey': '<API Key>'
}

response = requests.request("GET", url, headers=headers, data=payload)

print(response.text)
OkHttpClient client = new OkHttpClient().newBuilder().build();Request request = new Request.Builder().url("https://api.ujeebu.com/scrape?url=https://jsonplaceholder.typicode.com/posts/1&response_type=source&json=false&js=false&http_method=PUT&post_data={\"title\": \"put title\"}").method("GET", null).addHeader("UJB-Content-Type", "application/json").addHeader("ApiKey", "<API Key>").build();Response response = client.newCall(request).execute();
<?php$curl = curl_init();curl_setopt_array($curl, array(CURLOPT_URL => 'https://api.ujeebu.com/scrape?url=https://jsonplaceholder.typicode.com/posts/1&response_type=source&json=false&js=false&http_method=PUT&post_data=%7B%22title%22:%20%22put%20title%22%7D',CURLOPT_RETURNTRANSFER => true,CURLOPT_ENCODING => '',CURLOPT_MAXREDIRS => 10,CURLOPT_TIMEOUT => 0,CURLOPT_FOLLOWLOCATION => true,CURLOPT_HTTP_VERSION => CURL_HTTP_VERSION_1_1,CURLOPT_CUSTOMREQUEST => 'GET',CURLOPT_HTTPHEADER => array('UJB-Content-Type: application/json','ApiKey: <API Key>'),));$response = curl_exec($curl);curl_close($curl);echo $response;
package mainimport ("fmt""net/http""io/ioutil")func main() {url := "https://api.ujeebu.com/scrape?url=https://jsonplaceholder.typicode.com/posts/1&response_type=source&json=false&js=false&http_method=PUT&post_data=%7B%22title%22:%20%22put%20title%22%7D"method := "GET"client := &http.Client {}req, err := http.NewRequest(method, url, nil)if err != nil {fmt.Println(err)return}req.Header.Add("UJB-Content-Type", "application/json")req.Header.Add("ApiKey", "<API Key>")res, err := client.Do(req)if err != nil {fmt.Println(err)return}defer res.Body.Close()body, err := ioutil.ReadAll(res.Body)if err != nil {fmt.Println(err)return}fmt.Println(string(body))}
Response produced by code above:
{"title": "put title","id": 1}
Scrolling down​
By default, Ujeebu Scrape does not scroll after executing JavaScript. Set parameter scroll_down to 'true' to scroll to the end of the page before returning a response.
Scroll conditions​
The scroll script will scroll continuously until one of the following conditions is satisfied:
- The function passed in parameter scroll_callback (if provided) returns 'false'. For example:
// scroll until the .load-more button disappears
() => document.querySelector(".load-more") !== null;
- The height of the page doesn't change after two consecutive scrolls.
- The URL of the page changes (this will throw an error, disable scrolling and re-run the scrape request).
Scroll behavior​
By default, the API's scroll script scrolls down to the end of the page. This behavior can be customized using parameter scroll_to_selector, which defines the selector of the element to scroll to.
Parameter scroll_wait defines the time in milliseconds to wait between two scrolls (default = 50).
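A request body using the scroll parameters might be assembled as below. This is only a sketch: scroll_payload is a hypothetical helper, and sending these parameters in a JSON POST body alongside "js": true mirrors the extraction examples on this page rather than a documented requirement.

```python
import json


def scroll_payload(target_url, wait_ms=50, to_selector=None):
    """Build a JSON body that asks the API to scroll before responding.

    Hypothetical helper; scroll_down, scroll_wait and scroll_to_selector
    are the documented parameter names.
    """
    body = {
        "url": target_url,
        "js": True,              # assumption: scrolling happens after JavaScript runs
        "scroll_down": True,
        "scroll_wait": wait_ms,  # milliseconds between two scrolls
    }
    if to_selector:
        # scroll to this element instead of the end of the page
        body["scroll_to_selector"] = to_selector
    return json.dumps(body)


print(scroll_payload("http://quotes.toscrape.com", wait_ms=200))
```

The scroll_callback parameter is left out because the expected encoding of the function is not stated in this section.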
Blocking Ads​
Ad blocking is disabled by default. To enable it, set parameter block_ads to 'true'.
Returning output as JSON​
By default, the API returns a response whose content type matches the given response_type parameter:
- 'pdf' will return 'application/pdf'
- 'screenshot' will return 'image/png'
- 'html' and 'raw' will return 'text/html'
If parameter json is set to 'true', the API will return a JSON response in all cases.
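The mapping above can be written down as a lookup that a client might use to decide how to handle the response body. The function name is hypothetical, and 'application/json' as the content type of the JSON response is an assumption; the page only says that a JSON response is returned.

```python
# Content types listed in the documentation for each response_type
CONTENT_TYPES = {
    "pdf": "application/pdf",
    "screenshot": "image/png",
    "html": "text/html",
    "raw": "text/html",
}


def expected_content_type(response_type="html", json_output=False):
    """Return the content type a client should expect from the API."""
    if json_output:
        # assumption: a JSON response is served as application/json
        return "application/json"
    return CONTENT_TYPES[response_type]


print(expected_content_type("screenshot"))  # image/png
```

A client could, for instance, write the body to a file in binary mode whenever the expected type is not text/html or JSON.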
Scraping Data​
extract_rules = json string
To extract specific data from a page, add extraction rules to your API call. The simplest way to use extract rules is through the following format:
{
"key_name": {
"selector": "css_selector",
"type": "rule_type" // 'text', 'link', 'image', 'attr', or 'obj'
}
}
There are 5 types of rules:
- 'text' returns the text content of the matched element.
- 'link' returns the href attribute of the matched element, if it is an a tag.
- 'image' returns the src attribute of the matched element, if it is an img tag.
- 'attr' returns the specified attribute of the matched element, e.g.:
{
  "rule_name": {
    "selector": "meta[name=description]",
    "type": "attr",
    "attribute": "content"
  }
}
- 'obj' returns an object, or an array of objects if multiple=true, representing the matched rules under 'children'. This is useful for nested element extraction:
{
  "rule_name": {
    "selector": "article.card-item",
    "type": "obj",
    "children": {
      "title": {
        "selector": "h1",
        "type": "text"
      },
      "link": {
        "selector": "a",
        "type": "link"
      }
    }
  }
}
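Rules of these five types can also be generated in code instead of being written by hand. The sketch below uses a hypothetical make_rule helper (not part of the API) that checks the two requirements implied by the examples: an 'attr' rule names an attribute and an 'obj' rule carries children.

```python
import json

RULE_TYPES = ("text", "link", "image", "attr", "obj")


def make_rule(selector, rule_type="text", multiple=False, attribute=None, children=None):
    """Build one extraction rule as a plain dict (hypothetical helper)."""
    if rule_type not in RULE_TYPES:
        raise ValueError("unknown rule type: %r" % rule_type)
    rule = {"selector": selector, "type": rule_type}
    if multiple:
        rule["multiple"] = True
    if rule_type == "attr":
        if not attribute:
            raise ValueError("'attr' rules need an attribute name")
        rule["attribute"] = attribute
    if rule_type == "obj":
        if not children:
            raise ValueError("'obj' rules need children")
        rule["children"] = children
    return rule


# Same structure as the 'obj' example in this section
rules = {
    "rule_name": make_rule(
        "article.card-item",
        "obj",
        children={
            "title": make_rule("h1"),
            "link": make_rule("a", "link"),
        },
    )
}
print(json.dumps(rules, indent=2))
```

The resulting dictionary can be placed under the extract_rules key of a JSON request body.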
The following extracts the displayed user agent value from URL 'whatsmyuseragent.org':
{
"user-agent": {
"selector": ".user-agent .intro-text",
"type": "text"
}
}
- cURL
- NodeJs
- Python
- Java
- PHP
- Go
curl --location --request POST 'https://api.ujeebu.com/scrape' \
--header 'ApiKey: <API Key>' \
--header 'Content-Type: application/json' \
--data-raw '{
  "url": "http://whatsmyuseragent.org",
  "js": true,
  "json": false,
  "response_type": "html",
  "extract_rules": {
    "user-agent": {
      "selector": ".user-agent .intro-text",
      "type": "text"
    }
  }
}'
var request = require('request');var options = {'method': 'POST','url': 'https://api.ujeebu.com/scrape','headers': {'ApiKey': '<API Key>','Content-Type': 'application/json'},body: JSON.stringify({"url": "http://whatsmyuseragent.org","js": true,"response_type": "json","extract_rules": {"user-agent": {"selector": ".user-agent .intro-text","type": "text"}}})};request(options, function (error, response) {if (error) throw new Error(error);console.log(response.body);});
import requests
import json

url = "https://api.ujeebu.com/scrape"

payload = json.dumps({
    "url": "http://whatsmyuseragent.org",
    "js": True,
    "response_type": "json",
    "extract_rules": {
        "user-agent": {
            "selector": ".user-agent .intro-text",
            "type": "text"
        }
    }
})
headers = {
    'ApiKey': '<API Key>',
    'Content-Type': 'application/json'
}

response = requests.request("POST", url, headers=headers, data=payload)

print(response.text)
OkHttpClient client = new OkHttpClient().newBuilder().build();MediaType mediaType = MediaType.parse("application/json");RequestBody body = RequestBody.create(mediaType, "{\n \"url\": \"http://whatsmyuseragent.org\",\n \"js\": true,\n \"response_type\": \"json\",\n \"extract_rules\": {\n \"user-agent\": {\n \"selector\": \".user-agent .intro-text\",\n \"type\": \"text\"\n }\n }\n}");Request request = new Request.Builder().url("https://api.ujeebu.com/scrape").method("POST", body).addHeader("ApiKey", "<API Key>").addHeader("Content-Type", "application/json").build();Response response = client.newCall(request).execute();
<?php$curl = curl_init();curl_setopt_array($curl, array(CURLOPT_URL => 'https://api.ujeebu.com/scrape',CURLOPT_RETURNTRANSFER => true,CURLOPT_ENCODING => '',CURLOPT_MAXREDIRS => 10,CURLOPT_TIMEOUT => 0,CURLOPT_FOLLOWLOCATION => true,CURLOPT_HTTP_VERSION => CURL_HTTP_VERSION_1_1,CURLOPT_CUSTOMREQUEST => 'POST',CURLOPT_POSTFIELDS =>'{"url": "http://whatsmyuseragent.org","js": true,"response_type": "json","extract_rules": {"user-agent": {"selector": ".user-agent .intro-text","type": "text"}}}',CURLOPT_HTTPHEADER => array('ApiKey: <API Key>','Content-Type: application/json'),));$response = curl_exec($curl);curl_close($curl);echo $response;
package mainimport ("fmt""strings""net/http""io/ioutil")func main() {url := "https://api.ujeebu.com/scrape"method := "POST"payload := strings.NewReader(`{"url": "http://whatsmyuseragent.org","js": true,"json": false,"response_type": "html","extract_rules": {"user-agent": {"selector": ".user-agent .intro-text","type": "text"}}}`)client := &http.Client {}req, err := http.NewRequest(method, url, payload)if err != nil {fmt.Println(err)return}req.Header.Add("ApiKey", "<API Key>")req.Header.Add("Content-Type", "application/json")res, err := client.Do(req)if err != nil {fmt.Println(err)return}defer res.Body.Close()body, err := ioutil.ReadAll(res.Body)if err != nil {fmt.Println(err)return}fmt.Println(string(body))}
Running the code above produces:
{
"success": true,
"result": {
"user-agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/98.0.4758.0 Safari/537.36"
}
}
Scraping multiple items​
You can extract multiple items from a page by using the 'multiple' attribute. Here is an example of how to extract all quotes from URL 'quotes.toscrape.com':
{
"quote": {
"selector": ".quote .text",
"type": "text",
"multiple": true
}
}
- cURL
- NodeJs
- Python
- Java
- PHP
- Go
curl --location --request POST 'https://api.ujeebu.com/scrape' \
--header 'ApiKey: <API Key>' \
--header 'Content-Type: application/json' \
--data-raw '{
  "url": "http://quotes.toscrape.com",
  "js": true,
  "response_type": "json",
  "extract_rules": {
    "quote": {
      "selector": ".quote .text",
      "type": "text",
      "multiple": true
    }
  }
}'
var request = require('request');var options = {'method': 'POST','url': 'https://api.ujeebu.com/scrape','headers': {'ApiKey': '<API Key>','Content-Type': 'application/json'},body: JSON.stringify({"url": "http://quotes.toscrape.com","js": true,"response_type": "json","extract_rules": {"quote": {"selector": ".quote .text","type": "text","multiple": true}}})};request(options, function (error, response) {if (error) throw new Error(error);console.log(response.body);});
import requests
import json

url = "https://api.ujeebu.com/scrape"

payload = json.dumps({
    "url": "http://quotes.toscrape.com",
    "js": True,
    "response_type": "json",
    "extract_rules": {
        "quote": {
            "selector": ".quote .text",
            "type": "text",
            "multiple": True
        }
    }
})
headers = {
    'ApiKey': '<API Key>',
    'Content-Type': 'application/json'
}

response = requests.request("POST", url, headers=headers, data=payload)

print(response.text)
OkHttpClient client = new OkHttpClient().newBuilder().build();MediaType mediaType = MediaType.parse("application/json");RequestBody body = RequestBody.create(mediaType, "{\n \"url\": \"http://quotes.toscrape.com\",\n \"js\": true,\n \"response_type\": \"json\",\n \"extract_rules\": {\n \"quote\": {\n \"selector\": \".quote .text\",\n \"type\": \"text\",\n \"multiple\": true\n }\n }\n}");Request request = new Request.Builder().url("https://api.ujeebu.com/scrape").method("POST", body).addHeader("ApiKey", "<API Key>").addHeader("Content-Type", "application/json").build();Response response = client.newCall(request).execute();
<?php$curl = curl_init();curl_setopt_array($curl, array(CURLOPT_URL => 'https://api.ujeebu.com/scrape',CURLOPT_RETURNTRANSFER => true,CURLOPT_ENCODING => '',CURLOPT_MAXREDIRS => 10,CURLOPT_TIMEOUT => 0,CURLOPT_FOLLOWLOCATION => true,CURLOPT_HTTP_VERSION => CURL_HTTP_VERSION_1_1,CURLOPT_CUSTOMREQUEST => 'POST',CURLOPT_POSTFIELDS =>'{"url": "http://quotes.toscrape.com","js": true,"response_type": "json","extract_rules": {"quote": {"selector": ".quote .text","type": "text","multiple": true}}}',CURLOPT_HTTPHEADER => array('ApiKey: <API Key>','Content-Type: application/json'),));$response = curl_exec($curl);curl_close($curl);echo $response;
package mainimport ("fmt""strings""net/http""io/ioutil")func main() {url := "https://api.ujeebu.com/scrape"method := "POST"payload := strings.NewReader(`{"url": "http://quotes.toscrape.com","js": true,"response_type": "json","extract_rules": {"quote": {"selector": ".quote .text","type": "text","multiple": true}}}`)client := &http.Client {}req, err := http.NewRequest(method, url, payload)if err != nil {fmt.Println(err)return}req.Header.Add("ApiKey", "<API Key>")req.Header.Add("Content-Type", "application/json")res, err := client.Do(req)if err != nil {fmt.Println(err)return}defer res.Body.Close()body, err := ioutil.ReadAll(res.Body)if err != nil {fmt.Println(err)return}fmt.Println(string(body))}
The code above produces the following:
{
"success": true,
"result": {
"quote": [
"“The world as we have created it is a process of our thinking. It cannot be changed without changing our thinking.”",
"“It is our choices, Harry, that show what we truly are, far more than our abilities.”",
"“There are only two ways to live your life. One is as though nothing is a miracle. The other is as though everything is a miracle.”",
"“The person, be it gentleman or lady, who has not pleasure in a good novel, must be intolerably stupid.”",
"“Imperfection is beauty, madness is genius and it's better to be absolutely ridiculous than absolutely boring.”",
"“Try not to become a man of success. Rather become a man of value.”",
"“It is better to be hated for what you are than to be loved for what you are not.”",
"“I have not failed. I've just found 10,000 ways that won't work.”",
"“A woman is like a tea bag; you never know how strong it is until it's in hot water.”",
"“A day without sunshine is like, you know, night.”"
]
}
}
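Responses of this shape can be unpacked with a few lines of code. The sketch below assumes that a missing or false "success" field signals a failed request; the error format is not described in this section, and extract_result is a hypothetical helper.

```python
import json


def extract_result(response_text):
    """Return the 'result' object of an extraction response.

    Assumption: a missing or false 'success' field means the request failed.
    """
    data = json.loads(response_text)
    if not data.get("success"):
        raise RuntimeError("scrape request did not succeed")
    return data["result"]


# Shortened copy of the response shown above
sample = '{"success": true, "result": {"quote": ["first quote", "second quote"]}}'
for quote in extract_result(sample)["quote"]:
    print(quote)
```

Because the rule used "multiple": true, the value under "quote" is a list and can be iterated directly.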
Scraping Nested Items​
It is possible to extract nested items using the 'children' attribute of a rule of type 'obj'.
Below, we extract all quotes and their authors from URL 'quotes.toscrape.com':
{
"quote": {
"selector": ".quote",
"type": "obj",
"multiple": true,
"children": {
"quote": {
"selector": ".text",
"type": "text"
},
"author": {
"selector": ".author",
"type": "text"
},
"tags": {
"selector": ".tags .tag",
"type": "text",
"multiple": true
}
}
}
}
- cURL
- NodeJs
- Python
- Java
- PHP
- Go
curl --location --request POST 'https://api.ujeebu.com/scrape' \
--header 'ApiKey: <API Key>' \
--header 'Content-Type: application/json' \
--data-raw '{
  "url": "http://quotes.toscrape.com",
  "js": true,
  "response_type": "json",
  "extract_rules": {
    "quote": {
      "selector": ".quote",
      "type": "obj",
      "multiple": true,
      "children": {
        "quote": {
          "selector": ".text",
          "type": "text"
        },
        "author": {
          "selector": ".text+span",
          "type": "obj",
          "children": {
            "name": {
              "selector": ".author",
              "type": "text"
            },
            "profile": {
              "selector": "a",
              "type": "link"
            }
          }
        },
        "tags": {
          "selector": ".tags",
          "type": "obj",
          "children": {
            "list": {
              "selector": ".tag",
              "type": "text",
              "multiple": true
            },
            "csv": {
              "selector": "meta",
              "type": "attr",
              "attribute": "content"
            }
          }
        }
      }
    }
  }
}'
var request = require('request');var options = {'method': 'POST','url': 'https://api.ujeebu.com/scrape','headers': {'ApiKey': '<API Key>','Content-Type': 'application/json'},body: JSON.stringify({"url": "http://quotes.toscrape.com","js": true,"response_type": "json","extract_rules": {"quote": {"selector": ".quote","type": "obj","multiple": true,"children": {"quote": {"selector": ".text","type": "text"},"author": {"selector": ".text+span","type": "obj","children": {"name": {"selector": ".author","type": "text"},"profile": {"selector": "a","type": "link"}}},"tags": {"selector": ".tags","type": "obj","children": {"list": {"selector": ".tag","type": "text","multiple": true},"csv": {"selector": "meta","type": "attr","attribute": "content"}}}}}}})};request(options, function (error, response) {if (error) throw new Error(error);console.log(response.body);});
import http.client
import json

conn = http.client.HTTPSConnection("api.ujeebu.com")

payload = json.dumps({
    "url": "http://quotes.toscrape.com",
    "js": True,
    "response_type": "json",
    "extract_rules": {
        "quote": {
            "selector": ".quote",
            "type": "obj",
            "multiple": True,
            "children": {
                "quote": {"selector": ".text", "type": "text"},
                "author": {
                    "selector": ".text+span",
                    "type": "obj",
                    "children": {
                        "name": {"selector": ".author", "type": "text"},
                        "profile": {"selector": "a", "type": "link"}
                    }
                },
                "tags": {
                    "selector": ".tags",
                    "type": "obj",
                    "children": {
                        "list": {"selector": ".tag", "type": "text", "multiple": True},
                        "csv": {"selector": "meta", "type": "attr", "attribute": "content"}
                    }
                }
            }
        }
    }
})
headers = {
    'ApiKey': '<API Key>',
    'Content-Type': 'application/json'
}

# Send the rules to the scrape endpoint and print the JSON response
conn.request("POST", "/scrape", payload, headers)
res = conn.getresponse()
data = res.read()
print(data.decode("utf-8"))
OkHttpClient client = new OkHttpClient().newBuilder().build();MediaType mediaType = MediaType.parse("application/json");RequestBody body = RequestBody.create(mediaType, "{\n \"url\": \"http://quotes.toscrape.com\",\n \"js\": true,\n \"response_type\": \"json\",\n \"extract_rules\": {\n \"quote\": {\n \"selector\": \".quote\",\n \"type\": \"obj\",\n \"multiple\": true,\n \"children\": {\n \"quote\": {\n \"selector\": \".text\",\n \"type\": \"text\"\n },\n \"author\": {\n \"selector\": \".text+span\",\n \"type\": \"obj\",\n \"children\": {\n \"name\": {\n \"selector\": \".author\",\n \"type\": \"text\"\n },\n \"profile\": {\n \"selector\": \"a\",\n \"type\": \"link\"\n }\n }\n },\n \"tags\": {\n \"selector\": \".tags\",\n \"type\": \"obj\",\n \"children\": {\n \"list\": {\n \"selector\": \".tag\",\n \"type\": \"text\",\n \"multiple\": true\n },\n \"csv\": {\n \"selector\": \"meta\",\n \"type\": \"attr\",\n \"attribute\": \"content\"\n }\n }\n }\n }\n }\n }\n}");Request request = new Request.Builder().url("https://api.ujeebu.com/scrape").method("POST", body).addHeader("ApiKey", "<API Key>").addHeader("Content-Type", "application/json").build();Response response = client.newCall(request).execute();
<?php$curl = curl_init();curl_setopt_array($curl, array(CURLOPT_URL => 'https://api.ujeebu.com/scrape',CURLOPT_RETURNTRANSFER => true,CURLOPT_ENCODING => '',CURLOPT_MAXREDIRS => 10,CURLOPT_TIMEOUT => 0,CURLOPT_FOLLOWLOCATION => true,CURLOPT_HTTP_VERSION => CURL_HTTP_VERSION_1_1,CURLOPT_CUSTOMREQUEST => 'POST',CURLOPT_POSTFIELDS =>'{"url": "http://quotes.toscrape.com","js": true,"response_type": "json","extract_rules": {"quote": {"selector": ".quote","type": "obj","multiple": true,"children": {"quote": {"selector": ".text","type": "text"},"author": {"selector": ".text+span","type": "obj","children": {"name": {"selector": ".author","type": "text"},"profile": {"selector": "a","type": "link"}}},"tags": {"selector": ".tags","type": "obj","children": {"list": {"selector": ".tag","type": "text","multiple": true},"csv": {"selector": "meta","type": "attr","attribute": "content"}}}}}}}',CURLOPT_HTTPHEADER => array('ApiKey: <API Key>','Content-Type: application/json'),));$response = curl_exec($curl);curl_close($curl);echo $response;
package mainimport ("fmt""strings""net/http""io/ioutil")func main() {url := "https://api.ujeebu.com/scrape"method := "POST"payload := strings.NewReader(`{"url": "http://quotes.toscrape.com","js": true,"response_type": "json","extract_rules": {"quote": {"selector": ".quote","type": "obj","multiple": true,"children": {"quote": {"selector": ".text","type": "text"},"author": {"selector": ".text+span","type": "obj","children": {"name": {"selector": ".author","type": "text"},"profile": {"selector": "a","type": "link"}}},"tags": {"selector": ".tags","type": "obj","children": {"list": {"selector": ".tag","type": "text","multiple": true},"csv": {"selector": "meta","type": "attr","attribute": "content"}}}}}}}`)client := &http.Client {}req, err := http.NewRequest(method, url, payload)if err != nil {fmt.Println(err)return}req.Header.Add("ApiKey", "<API Key>")req.Header.Add("Content-Type", "application/json")res, err := client.Do(req)if err != nil {fmt.Println(err)return}defer res.Body.Close()body, err := ioutil.ReadAll(res.Body)if err != nil {fmt.Println(err)return}fmt.Println(string(body))}
Response produced by code above:
{"success": true,"result": {"quote": [
{"quote": "“The world as we have created it is a process of our thinking. It cannot be changed without changing our thinking.”","author": [{"name": "Albert Einstein","profile": "/author/Albert-Einstein"}],"tags": [{"list": ["change","deep-thoughts","thinking","world"],"csv": "change,deep-thoughts,thinking,world"}]},
{"quote": "“It is our choices, Harry, that show what we truly are, far more than our abilities.”","author": [{"name": "J.K. Rowling","profile": "/author/J-K-Rowling"}],"tags": [{"list": ["abilities","choices"],"csv": "abilities,choices"}]},
{"quote": "“There are only two ways to live your life. One is as though nothing is a miracle. The other is as though everything is a miracle.”","author": [{"name": "Albert Einstein","profile": "/author/Albert-Einstein"}],"tags": [{"list": ["inspirational","life","live","miracle","miracles"],"csv": "inspirational,life,live,miracle,miracles"}]},
{"quote": "“The person, be it gentleman or lady, who has not pleasure in a good novel, must be intolerably stupid.”","author": [{"name": "Jane Austen","profile": "/author/Jane-Austen"}],"tags": [{"list": ["aliteracy","books","classic","humor"],"csv": "aliteracy,books,classic,humor"}]},
{"quote": "“Imperfection is beauty, madness is genius and it's better to be absolutely ridiculous than absolutely boring.”","author": [{"name": "Marilyn Monroe","profile": "/author/Marilyn-Monroe"}],"tags": [{"list": ["be-yourself","inspirational"],"csv": "be-yourself,inspirational"}]},
{"quote": "“Try not to become a man of success. Rather become a man of value.”","author": [{"name": "Albert Einstein","profile": "/author/Albert-Einstein"}],"tags": [{"list": ["adulthood","success","value"],"csv": "adulthood,success,value"}]},
{"quote": "“It is better to be hated for what you are than to be loved for what you are not.”","author": [{"name": "André Gide","profile": "/author/Andre-Gide"}],"tags": [{"list": ["life","love"],"csv": "life,love"}]},
{"quote": "“I have not failed. I've just found 10,000 ways that won't work.”","author": [{"name": "Thomas A. Edison","profile": "/author/Thomas-A-Edison"}],"tags": [{"list": ["edison","failure","inspirational","paraphrased"],"csv": "edison,failure,inspirational,paraphrased"}]},
{"quote": "“A woman is like a tea bag; you never know how strong it is until it's in hot water.”","author": [{"name": "Eleanor Roosevelt","profile": "/author/Eleanor-Roosevelt"}],"tags": [{"list": ["misattributed-eleanor-roosevelt"],"csv": "misattributed-eleanor-roosevelt"}]},
{"quote": "“A day without sunshine is like, you know, night.”","author": [{"name": "Steve Martin","profile": "/author/Steve-Martin"}],"tags": [{"list": ["humor","obvious","simile"],"csv": "humor,obvious,simile"}]}
]}}
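In the response above, each nested 'obj' rule comes back as a one-element list, so a little flattening makes the data easier to work with. The sketch below relies only on the response shape shown here; flatten_quotes is a hypothetical helper.

```python
def flatten_quotes(result):
    """Turn the nested 'result' object into a flat list of rows."""
    rows = []
    for item in result["quote"]:
        author = item["author"][0]  # nested 'obj' rules arrive as one-element lists
        tags = item["tags"][0]
        rows.append({
            "quote": item["quote"],
            "author": author["name"],
            "profile": author["profile"],
            "tags": tags["list"],
        })
    return rows


# One entry copied from the response above
sample_result = {
    "quote": [{
        "quote": "“A day without sunshine is like, you know, night.”",
        "author": [{"name": "Steve Martin", "profile": "/author/Steve-Martin"}],
        "tags": [{"list": ["humor", "obvious", "simile"], "csv": "humor,obvious,simile"}],
    }]
}
for row in flatten_quotes(sample_result):
    print(row["author"], row["tags"])
```

The "profile" values are relative links and would need to be joined with the site's base URL before being requested.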
Credits​
Credit cost per request:
Parameter | Credits |
---|---|
No Javascript, No premium proxy | 1 credit |
JavaScript enabled | 5 credits |
Screenshot (with or without JavaScript) | 5 credits |
PDF (with or without JavaScript) | 10 credits |
Premium Proxy | 10 credits |
Premium Proxy with JavaScript/PDF/Screenshot | 25 credits |
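The table can be turned into a rough cost estimator. The sketch below is one reading of how the rows combine (for example, that a premium proxy request with JavaScript disabled and an HTML response costs 10 credits); the function name and the precedence of the rules are assumptions, not a statement of how billing is implemented.

```python
def credits_per_request(js=True, response_type="html", premium_proxy=False):
    """Estimate the credit cost of one request from the table above."""
    rendered = js or response_type in ("pdf", "screenshot")
    if premium_proxy:
        # premium proxy alone: 10; with JavaScript, PDF or screenshot: 25
        return 25 if rendered else 10
    if response_type == "pdf":
        return 10  # with or without JavaScript
    if js or response_type == "screenshot":
        return 5
    return 1  # no JavaScript, no premium proxy


print(credits_per_request(js=False))                      # 1
print(credits_per_request(js=True, premium_proxy=True))   # 25
```

Since js defaults to true in the API, a request that does not set js=false would fall under the 5-credit row in this reading.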