Using CasperJS to scrape website data

CasperJS can be used for navigation scripting, scraping, and testing. In this tutorial we will see how to scrape data using CasperJS. To run CasperJS you need a headless browser such as PhantomJS or SlimerJS. Recent versions of CasperJS require PhantomJS 1.9 or later.

Installing PhantomJS:

sudo apt-get install libfontconfig1
cd /opt
sudo wget https://phantomjs.googlecode.com/files/phantomjs-1.9.1-linux-x86_64.tar.bz2
sudo tar xjf phantomjs-1.9.1-linux-x86_64.tar.bz2
sudo rm -f phantomjs-1.9.1-linux-x86_64.tar.bz2
sudo ln -s phantomjs-1.9.1-linux-x86_64 phantomjs
sudo ln -s /opt/phantomjs/bin/phantomjs /usr/bin/phantomjs

Installing CasperJS:

cd /opt/
sudo git clone https://github.com/n1k0/casperjs.git
cd casperjs
sudo ln -sf `pwd`/bin/casperjs /usr/local/bin/casperjs

Simple JS script to log in and print the page title:

phantom.casperTest = true;

var casper = require('casper').create({
    pageSettings: {
        loadImages:  false,         // The WebPage instance used by Casper will
        loadPlugins: false,         // use these settings
        userAgent: 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/45.0.2454.101 Safari/537.36'
    }
});

// Replace the placeholder with the URL of the login page
var url = '<url-of-login-page>';

casper.start(url, function() {
    // Use a selector of the form form.<class-name> or form#<form-id>
    this.fill('form.<form-class>', {
        email: '<enter-email-id-here>',
        password: '<enter-password-here>'
    }, true);   // true submits the form after filling it
});

// Runs after the form has been submitted and the next page has loaded
casper.then(function() {
    this.echo(this.getTitle());
});

casper.run();

This script logs in to the website and prints the page title, so it performs both navigation and scraping.
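Printing the title is the smallest possible scrape. To pull structured data out of a page, the usual approach is to run a function inside the page with casper.evaluate() and return plain, serializable values. The following is a minimal sketch, assuming the target page contains ordinary <a> links. The function name extractLinks and the URL placeholder are hypothetical and not part of the script above. The optional doc argument and the phantom check are there only so the extraction function can be tried outside CasperJS.

```javascript
// Collects the text and href of every link on the page.
// casper.evaluate() calls this inside the page with no arguments, so `doc`
// falls back to the page's own `document`. Passing `doc` explicitly is only
// a convenience for trying the function outside a browser.
function extractLinks(doc) {
    doc = doc || document;
    var anchors = doc.querySelectorAll('a');
    return Array.prototype.map.call(anchors, function (a) {
        return {
            text: a.textContent.trim(),
            href: a.getAttribute('href')
        };
    });
}

// `phantom` is only defined under PhantomJS/SlimerJS, so the CasperJS part
// below is skipped when the file is loaded anywhere else.
if (typeof phantom !== 'undefined') {
    var casper = require('casper').create();
    var url = '<url-of-page-to-scrape>';   // placeholder, replace before running

    casper.start(url);

    casper.then(function () {
        // evaluate() serializes the return value back from the page context
        var links = this.evaluate(extractLinks);
        this.echo(JSON.stringify(links, null, 2));
    });

    casper.run();
}
```

Save it to a file (for example scrape.js, a name chosen here for illustration) and run it with casperjs scrape.js. To scrape a page behind the login, the same casper.then() step could be added to the login script after the form is submitted.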

Posted On 22 January 2016 By MicroPyramid

