Ujeebu | Sam

How to Take Full-Page Screenshots with a Screenshot API

Ever tried to capture an entire webpage in one go, only to end up taking multiple screenshots and stitching them together? Taking a full-page screenshot manually is about as fun as printing

Web Scraping

Web Scraping in 2025: Modern Approaches, Legal Landscape, and Future Trends

Web scraping remains a cornerstone of data-driven projects in 2025. As organizations seek competitive insights and real-time information, web scraping has only grown in importance. In fact, the broader alternative data market was valued at around $4.9 billion in 2023...

Enhancing Lead Generation with Web Data Scraping and Content Extraction

In this comprehensive guide, we’ll explore how web scraping and content extraction can optimize key aspects of lead generation – from prospect identification and lead scoring to personalized outreach – all while ensuring best practices and compliance.

Web Scraping

Web Scraping Customer Reviews for Boosting Business Growth

In today's digital age, where 89% of consumers read online reviews before purchasing (BrightLocal), customer feedback has become a critical driver of business success. Web scraping has emerged as a powerful tool for companies to gather customer reviews and feedback at scale.

Web Scraping

Mastering HTML Text Extraction in Python: 7 Proven Techniques

With the vast amount of information available on the internet, extracting relevant text content from an HTML page can be a challenging task. HTML, or Hypertext Markup Language, is the standard markup language used to create web pages.

AI

The Rise of AI-Generated Content and Its Impact on Genuine Online Production

In this article, we examine recent studies, statistics, and research on AI-generated content, highlighting how training data and web scraping play a major role in shaping the future of online content.

Web Scraping

Safeguarding Your Website from Abusive Web Scraping

Abusive scraping can cause significant problems for website owners, including server overload, unauthorized data extraction, and the potential exposure of sensitive information. Implementing effective anti-scraping mechanisms is crucial to protect your website from these threats.

Overcoming Web Scraping Blocks: How IP Classification and CGNAT Affect Your Scraping Strategy

Web servers use various techniques to mitigate scraping attempts, including IP classification and identifying data center or suspicious traffic. Understanding how IP addresses are classified and how technologies like CGNAT (Carrier-Grade NAT) work is critical for overcoming these challenges.

Content Extraction

How to Scrape TikTok: A Comprehensive Guide

The TikTok API has several restrictions that limit what data you can access and how frequently you can query it. For this reason, web scraping becomes a viable solution, as long as it is done in compliance with TikTok’s Terms of Service.

Web Scraping

Web Scraping: An Essential Tool for Business Intelligence

One of the most powerful resources available to businesses nowadays is web scraping, an automated technique for extracting substantial amounts of publicly accessible data from online sources.

Puppeteer

Puppeteer based Simple Data Scraper: Advanced Options

In this article, we show how Puppeteer's advanced capabilities can better equip our scraper to handle real-world use cases. Namely, we explore options such as controlling page load behavior, HTTP authentication, adding extra headers, and changing the user agent.

Content Extraction

A Simple Rule-based Scraper using Puppeteer's native methods

In the previous article in our Puppeteer series, we implemented a rule-based scraper on headless Chrome using Puppeteer. We injected our scraping functions into the browser's context (window), then used those to

Content Extraction

Simple Puppeteer-based Scraper: Rule based extraction

In this article, we show how to scrape any website with a given set of rules using the Puppeteer library.

Content Extraction

A Simple Scraper using Puppeteer

Web scraping is the process of extracting data from websites. One popular library for web scraping is Puppeteer. Puppeteer is a Node.js library that provides a high-level API to control headless Chrome or Chromium over the DevTools Protocol.

Content Extraction

Is Web Scraping Legal?

The issues of legality and ethics surrounding web scraping are a massive grey area. While some may be in favor of web scraping, others might not share the same enthusiasm. This is what makes the subject so controversial.

What is Web scraping?

The amount of data Google handles is extraordinary; it processes 200 petabytes daily. This points to the sheer volume of often invaluable data on websites, including business contacts, stock prices, product descriptions, sports

Web Scraping

Rendering Javascript Heavy Web Pages using Puppeteer

With the increasing adoption of client-side frameworks, rendering a web page often requires executing its JavaScript. This is essential for data scraping.

Content Extraction

Extracting clean data from blog and news articles

Several open-source tools allow the extraction of clean text from article HTML. We list the most popular ones below and run a benchmark to see how they stack up against the Ujeebu API.