Sending script requests through multiple IPs

One of the more useful things you may want to do with Perl scripts if you’re into crawling websites is to pipe your script’s requests through multiple IP addresses.

This is actually pretty simple when you know how, but doesn’t seem to be documented that well across the web. So the following steps should work pretty well if you’re running an Apache server:

1. Configure Apache

Make sure you’re running the mod_proxy module. Then add the following code to your Apache conf file:

ProxyRequests On
<Proxy *>
Order deny,allow
Deny from all
Allow from internal.example.com
</Proxy>

2. Install Squid

Squid is a dual-purpose caching and proxying program. It can be installed on RHEL 5 by following these instructions.

3. Configure Squid

Open /etc/squid/squid.conf in a text editor and make sure the following lines are included:

acl ip1 myip 192.168.100.1
acl ip2 myip 192.168.100.2
tcp_outgoing_address 192.168.100.1 ip1
tcp_outgoing_address 192.168.100.2 ip2

For each additional IP you want to pipe requests through, add another acl line and a matching tcp_outgoing_address line.
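With more than a handful of addresses, writing these pairs by hand gets tedious. The following is a minimal sketch of generating them from a list; the helper name squid_lines is hypothetical, and the ip1, ip2, ... naming simply follows the example above.

```perl
use strict;
use warnings;

# Hypothetical helper: build the acl / tcp_outgoing_address pair for each IP,
# using the same ip1, ip2, ... naming as the squid.conf example.
sub squid_lines {
    my @ips = @_;
    my (@acls, @outgoing);
    my $n = 1;
    for my $ip (@ips) {
        my $name = "ip$n";
        push @acls,     "acl $name myip $ip";
        push @outgoing, "tcp_outgoing_address $ip $name";
        $n++;
    }
    # All acl lines first, then the tcp_outgoing_address lines
    return (@acls, @outgoing);
}

print "$_\n" for squid_lines('192.168.100.1', '192.168.100.2');
```

The printed lines can be pasted into /etc/squid/squid.conf. This assumes every address in the list is already bound to an interface on the server.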

4. Create proxy scripts

One way of piping script requests through random proxies in Perl (for example) would be to create a module such as the one below:

package random_IP;
use strict;
use warnings;

require Exporter;
our @ISA = qw(Exporter);
our @EXPORT_OK = qw();

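sub ip_random {
    # Pick 0 or 1 at random, one value per configured IP
    my $rand = int(rand(2));

    my $ip = 'http://192.168.100.';

    # Append the last octet of the chosen address
    if ($rand == 0) {
        $ip .= '1';
    } elsif ($rand == 1) {
        $ip .= '2';
    }

    # Port that Squid is listening on
    $ip .= ':3128';

    return $ip;
}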

1;
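
The if/elsif chain needs a new branch for every address. As a hedged alternative, the sketch below keeps the addresses in an array and picks one at random, so adding an IP means adding one list entry. The package name Random_Proxy and the sub name proxy_random are hypothetical and not part of the module above; the addresses and port are the ones used in this post.

```perl
package Random_Proxy;   # hypothetical name, not the module above
use strict;
use warnings;

# Addresses Squid is assumed to accept requests on (same as the squid.conf example)
my @PROXY_IPS  = ('192.168.100.1', '192.168.100.2');
my $SQUID_PORT = 3128;

# Return a proxy URL for a randomly chosen address
sub proxy_random {
    my $ip = $PROXY_IPS[ rand @PROXY_IPS ];   # rand @array yields a valid index
    return "http://$ip:$SQUID_PORT";
}

package main;

print Random_Proxy::proxy_random(), "\n";
```

Here the package sits in the same file as the calling code so that the example runs on its own; in practice it would live in its own .pm file like the module above.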

It’s important to send the request to port 3128, as that’s the port Squid listens on by default. With the module in place, any Perl script can send requests through a random IP address by loading the module and calling the function directly:

use WWW::Mechanize;
use random_IP;

my $ip = random_IP::ip_random();
my $mech = WWW::Mechanize->new();
$mech->proxy('http', $ip);
$mech->get('http://www.example.com/');
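
A random pick can land on the same address several times in a row. If an even spread matters more than randomness, one option is to cycle through the proxies in order instead. The sketch below is only an illustration of that idea: make_proxy_cycle is a hypothetical helper, and it prints the proxy it would use rather than making real requests.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a closure that cycles through the proxy URLs
sub make_proxy_cycle {
    my @proxies = @_;
    my $i = 0;
    return sub {
        my $proxy = $proxies[ $i % @proxies ];   # wrap around at the end of the list
        $i++;
        return $proxy;
    };
}

my $next_proxy = make_proxy_cycle(
    'http://192.168.100.1:3128',
    'http://192.168.100.2:3128',
);

# In a crawler, each value would be passed to $mech->proxy('http', ...)
# before the corresponding $mech->get(...).
print $next_proxy->(), "\n" for 1 .. 4;
```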