Use Cases
September 2, 2021
Automated Competition Scraping with Apify and Keboola
Learn how you can easily scrape data with Apify and Keboola

Whether you saw or missed our webinar, we thought it would be useful to provide a step-by-step guide on how to set up quick competition monitoring (or any other web scraping and data processing automation) with Apify and Keboola. Thank you, Apify and Revolt.bi, for the collaboration!

So what can you do with automated competition data processing? In this article, we’ll use daily monitoring of the Amazon Best Sellers list as an example, but you can apply the same process to similar use cases.

Follow these instructions, create a free account and start automated data processing in minutes. 

Set up Apify

1. Create a free Apify account

Go to apify.com and create a free account using the Sign up button. The free plan includes a $5 monthly credit and a proxy trial for the first month. Once you verify your account using the email verification link, go to the Apify Store and look for the Amazon Best Sellers Scraper.

2. Select and configure an actor

Let’s start by configuring an actor. Actors are serverless cloud programs running on the Apify platform that can perform arbitrary computing jobs, such as sending an email or crawling a website with millions of pages.

Click on the Try me button.

After clicking the Try me button, you’ll be redirected to your new Apify account, where a new task for this public actor is created automatically.

All you need to set up now is a category URL on Amazon and the crawling depth. The depth setting is handy when you want to extract not just the 100 books in the top category, but also those in all of its sub-categories.

Let’s look at the Amazon Best Sellers page. Amazon shows the 100 best-selling products in each category, and we can extract any category we want. Let’s choose the Books category.


Now copy the page URL. In this case, it’s:

https://www.amazon.com/best-sellers-books-Amazon/zgbs/books/ref=zg_bs_nav_0

Go back to the task configuration, click on Add URL in the Category URLs input and paste the URL above.
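
Before pasting a URL, it can help to check that it really points to a Best Sellers category page. The following is a minimal sketch, not the scraper's own validation: it assumes that Best Sellers pages carry a `zgbs` path segment (inferred only from the URL above), and the input keys `categoryUrls` and `depth` are hypothetical placeholders rather than the actor's documented field names.

```python
from urllib.parse import urlparse


def is_amazon_best_sellers_url(url: str) -> bool:
    """Return True if the URL looks like an Amazon Best Sellers category page."""
    parsed = urlparse(url)
    host = parsed.netloc.lower()
    on_amazon = host == "amazon.com" or host.endswith(".amazon.com")
    # Assumption: Best Sellers pages contain a "zgbs" path segment,
    # as in the Books URL used in this guide.
    return parsed.scheme == "https" and on_amazon and "zgbs" in parsed.path.split("/")


def build_task_input(category_url: str, depth: int = 1) -> dict:
    """Assemble a task input; the key names are hypothetical placeholders."""
    if not is_amazon_best_sellers_url(category_url):
        raise ValueError(f"Not a Best Sellers URL: {category_url}")
    return {"categoryUrls": [category_url], "depth": depth}


BOOKS_URL = "https://www.amazon.com/best-sellers-books-Amazon/zgbs/books/ref=zg_bs_nav_0"
task_input = build_task_input(BOOKS_URL, depth=1)
print(task_input)
```

The check is deliberately strict about the host, so a look-alike domain that merely contains "amazon.com" is rejected.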

3. Run and test

Hitting the Save & Run button will start your scraper on Apify servers (everything is running on the Apify cloud platform).

The task should take just a couple of seconds. Once it has finished, click on the results box.

In the Export section, select the HTML table format and click on the View in another tab button.

In a new tab, you’ll see extracted data in a structured table format.

The next step is integration with Keboola, so you can schedule scraper runs from there and automatically fetch data from Apify to Keboola. For easier reference, rename your task from my-task to something like Amazon-best-sellers-books. You can do that in the Settings tab of your task.

Set up a Keboola account

1. Create a free project

Go to keboola.com and create a free pay-as-you-go project.

2. Set up and test the Apify Extractor

Go to Components -> Extractors and search for the Apify extractor.

Click on New configuration.

Name your configuration, agree to the terms, and click on Create configuration.

Click on Configure extractor. Select Run task and click Next. We’re not using our own actor, but a task for a public actor.

On the next screen, you’re asked for an Apify user ID and an API token. To get these, go back to the Apify console, navigate to Settings -> Integrations, and copy your User ID and API token.
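
If you also plan to use the same credentials in your own scripts, it is safer to keep them out of source code. The sketch below reads them from environment variables and masks the token for logging. The variable names `APIFY_USER_ID` and `APIFY_TOKEN` and the sample values are assumptions made for illustration; this is not how the Keboola extractor stores them.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class ApifyCredentials:
    user_id: str
    token: str

    def masked_token(self) -> str:
        """Show only the last four characters, for safe logging."""
        return "*" * max(len(self.token) - 4, 0) + self.token[-4:]


def load_credentials(env=os.environ) -> ApifyCredentials:
    """Read credentials from a mapping of environment variables.

    The variable names are assumptions chosen for this example.
    """
    try:
        return ApifyCredentials(env["APIFY_USER_ID"], env["APIFY_TOKEN"])
    except KeyError as missing:
        raise RuntimeError(f"Missing environment variable: {missing}") from None


# Made-up values; real ones come from Settings -> Integrations in the Apify console.
creds = load_credentials({"APIFY_USER_ID": "demoUser123", "APIFY_TOKEN": "abcd1234efgh5678"})
print(creds.user_id, creds.masked_token())
```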

Go back to the Keboola project, paste the Apify user ID and API token into the extractor wizard, and click on Next. On the last page, the only thing you have to set is the task. Select the Amazon-best-sellers-books task you created in the previous steps and click on Save.

Now you can start the scraper (the Apify task) from the Keboola project using the Run component action and confirming with the Run button. If you go back to the Apify console (Tasks -> Amazon-best-sellers-books -> Runs), you can see a new run of your task with API as its Origin (started from Keboola via API).

Once the job is done (you’ll see the finished job in the last runs section), you can click on the link in the Results table.

You can see basic info about the extracted dataset there and click on Explore in storage to see more info and sample data. Once you're in Storage, you can click on Data sample to see the data, which is exactly the same data you saw in Apify.
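
With the data in storage, daily monitoring comes down to comparing one day's snapshot with the next. Here is a small sketch of that comparison. The column names `name` and `position` and the sample rows are hypothetical; use whatever columns your extracted table actually contains. Items that dropped off the list are not reported in this version.

```python
def rank_changes(yesterday, today, key="name", rank="position"):
    """Compare two best-seller snapshots and report movement per item.

    Returns (item, delta, current_rank) tuples, where delta is "new" for
    items that were absent yesterday.
    """
    previous = {row[key]: row[rank] for row in yesterday}
    changes = []
    for row in today:
        old = previous.get(row[key])
        if old is None:
            changes.append((row[key], "new", row[rank]))
        elif old != row[rank]:
            # A positive delta means the item climbed the list.
            changes.append((row[key], old - row[rank], row[rank]))
    return changes


# Hypothetical snapshots from two consecutive days.
yesterday = [{"name": "Book A", "position": 1}, {"name": "Book B", "position": 2}]
today = [
    {"name": "Book B", "position": 1},
    {"name": "Book A", "position": 2},
    {"name": "Book C", "position": 3},
]
for name, delta, position in rank_changes(yesterday, today):
    print(name, delta, position)
```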

The integration between Keboola and Apify is now complete. Whenever you start the extractor in Keboola, the task in Apify is started automatically via the API. The extractor waits until the Apify task run finishes, then fetches the data from Apify and stores it in Keboola.
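
To make the start, wait, and fetch sequence concrete, here is a rough sketch of the same flow against the Apify API v2. It is an illustration under stated assumptions, not the Keboola extractor's actual code: the helper names, the polling interval, and the use of a Bearer token header are choices made for this example, and `task_id` is a placeholder for your task's identifier.

```python
import json
import time
import urllib.request

API_BASE = "https://api.apify.com/v2"
# Run statuses after which polling stops.
FINAL_STATUSES = {"SUCCEEDED", "FAILED", "ABORTED", "TIMED-OUT"}


def http_json(method: str, url: str, token: str):
    """Send a request with the API token and decode the JSON response."""
    request = urllib.request.Request(url, method=method)
    request.add_header("Authorization", f"Bearer {token}")
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def run_task_and_fetch(task_id, token, call=http_json, sleep=time.sleep, poll_seconds=5):
    """Start a task run, poll until it finishes, and return the dataset items."""
    # 1. Start the task run.
    run = call("POST", f"{API_BASE}/actor-tasks/{task_id}/runs", token)["data"]
    # 2. Wait until the run reaches a final status.
    while run["status"] not in FINAL_STATUSES:
        sleep(poll_seconds)
        run = call("GET", f"{API_BASE}/actor-runs/{run['id']}", token)["data"]
    if run["status"] != "SUCCEEDED":
        raise RuntimeError(f"Run {run['id']} ended with status {run['status']}")
    # 3. Fetch the items from the run's default dataset.
    items_url = f"{API_BASE}/datasets/{run['defaultDatasetId']}/items?format=json"
    return call("GET", items_url, token)
```

The `call` and `sleep` parameters are injectable so the flow can be tried with canned responses before it is pointed at a real account. With real credentials, calling `run_task_and_fetch(task_id, token)` would return the scraped rows as a list of dictionaries.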

Start automated competition scraping today


Congrats, you just learned how to use the Keboola and Apify platforms to automate the competitor scraping process. Try it yourself by creating a forever-free account. No credit card is required.



Run a 100% data-driven business without any extra hassle.
Pay as you go, starting with our free tier.
