Smart traffic shaping with tcng

An year ago i was set up my own home router based on PC and faced with the necessity of prioritizing network traffic. First i drown small diagram that displayed traffic that i had and priorities that i was will to give to that traffic.

I placed diagram here, but i must warn you, i made this diagram for myself, so it may look ugly for other. I had one channel to internet and few VPN tunnels for personal use. On the diagram i specified sources of possible traffic and assigned different priorities for diffirent types of traffic (from “realtime” — red, till ‘does not care’ – ‘grey’). This diagram helped me to distinguish different types of traffic and to planned how i can construct rules for them.

After that i started to learn how to shape my traffic. First i found that i can do it only by ‘tc’ from iproute package, i found this method is too complex if i will wanted to remember what tc rules do in future. Also at this stage i select HTB discipline for shaping in future.

Next i tried HTB.init script and found that metod too ugly.

I started to search more ‘high-level’ method and found tcng, it is just a brilliant. It is special language interpreter, that allow you to write how you want to shape your traffic in ‘script’ manner. First, i was worried, because last release  of tcng was in 2004 year, but later i found, that tcng work perfect.

Before i started to learn tcng i do not know, that you can use tc in very different manner. For example you can use only tc for prioritizing traffic by port and protocol, or by bits in field of the packet. So, let’s look at simple example:

dev ppp0 {
  egress {
    class ( <$interactive> )  if tcp_dport == 22;
    class ( <$medium> )  if tcp_dport == 80 || tcp_dport == 443;
    class ( <$bulk> )  if tcp_dport == 21;
    class ( <$default> ) ;
 
    htb () {
      $interactive  = class ( rate 64kbps, ceil 128kbps ) {
                        sfq ( perturb 10s ); }
      $medium       = class ( rate 128kbps, ceil 128kbps ) {
                        sfq ( perturb 10s ); }
      $bulk         = class ( rate 256kbps, ceil 512kbps ) {
                        sfq ( perturb 10s ); }
      $default       = class ( rate 256kbps, ceil 512kbps ) {
                        sfq ( perturb 10s ); }
    }
  }
}

I think, even if you do not know anything about tc and about traffic shaping at all, when you will look on this example, you can easily suggest what’s happening there. I writed before, that i found clear tc syntaxis too complex, tcng convert rules writed on his own language to tc syntaxis, let’s look how it will look after tcng ‘compile’ this rules:

tc qdisc add dev ppp0 handle 1:0 root dsmark indices 8 default_index 0
tc qdisc add dev ppp0 handle 2:0 parent 1:0 htb
tc class add dev ppp0 parent 2:0 classid 2:1 htb rate 8000bps ceil 16000bps
tc qdisc add dev ppp0 handle 3:0 parent 2:1 sfq perturb 10
tc class add dev ppp0 parent 2:0 classid 2:2 htb rate 16000bps ceil 16000bps
tc qdisc add dev ppp0 handle 4:0 parent 2:2 sfq perturb 10
tc class add dev ppp0 parent 2:0 classid 2:3 htb rate 32000bps ceil 64000bps
tc qdisc add dev ppp0 handle 5:0 parent 2:3 sfq perturb 10
tc class add dev ppp0 parent 2:0 classid 2:4 htb rate 32000bps ceil 64000bps
tc qdisc add dev ppp0 handle 6:0 parent 2:4 sfq perturb 10
tc filter add dev ppp0 parent 2:0 protocol all prio 1 tcindex mask 0x7 shift 0
tc filter add dev ppp0 parent 2:0 protocol all prio 1 handle 4 tcindex classid 2:4
tc filter add dev ppp0 parent 2:0 protocol all prio 1 handle 3 tcindex classid 2:3
tc filter add dev ppp0 parent 2:0 protocol all prio 1 handle 2 tcindex classid 2:2
tc filter add dev ppp0 parent 2:0 protocol all prio 1 handle 1 tcindex classid 2:1
tc filter add dev ppp0 parent 1:0 protocol all prio 1 handle 1:0:0 u32 divisor 1
tc filter add dev ppp0 parent 1:0 protocol all prio 1 u32 match u8 0x6 0xff at 9 offset at 0 mask 0f00 shift 6 eat link 1:0:0
tc filter add dev ppp0 parent 1:0 protocol all prio 1 handle 1:0:1 u32 ht 1:0:0 match u16 0x16 0xffff at 2 classid 1:1
tc filter add dev ppp0 parent 1:0 protocol all prio 1 handle 2:0:0 u32 divisor 1
tc filter add dev ppp0 parent 1:0 protocol all prio 1 u32 match u8 0x6 0xff at 9 offset at 0 mask 0f00 shift 6 eat link 2:0:0
tc filter add dev ppp0 parent 1:0 protocol all prio 1 handle 2:0:1 u32 ht 2:0:0 match u16 0x50 0xffff at 2 classid 1:2
tc filter add dev ppp0 parent 1:0 protocol all prio 1 handle 3:0:0 u32 divisor 1
tc filter add dev ppp0 parent 1:0 protocol all prio 1 u32 match u8 0x6 0xff at 9 offset at 0 mask 0f00 shift 6 eat link 3:0:0
tc filter add dev ppp0 parent 1:0 protocol all prio 1 handle 3:0:1 u32 ht 3:0:0 match u16 0x1bb 0xffff at 2 classid 1:2
tc filter add dev ppp0 parent 1:0 protocol all prio 1 handle 4:0:0 u32 divisor 1
tc filter add dev ppp0 parent 1:0 protocol all prio 1 u32 match u8 0x6 0xff at 9 offset at 0 mask 0f00 shift 6 eat link 4:0:0
tc filter add dev ppp0 parent 1:0 protocol all prio 1 handle 4:0:1 u32 ht 4:0:0 match u16 0x15 0xffff at 2 classid 1:3

Huh now it is not so simple as before.

OK, let’s talk about few things that you need to understand if you do not want to read theory about traffic shaping and about tc.
First, all that i will told, valid only to network  and transport level of tcp/ip stack, it can be valid for other level/situation, but you can found exceptions.
Second, you can not control directly the speed at which the remote host will send you the data, you can control speed which you will send the data or speed which you will recieve the data. I.e. if you have 10 Mbit channel, and remote host sending something to you at speed 10 Mbit/s, your channel will be flooded by remote host.
If we talk about tcp sockets, then you can indirectly control speed, you can recieve packets at 5Mbit/s and tcp protocol must throttle speed which remote host sending the data. If we talk about UDP, you can not do anything on network/transport level, exept shutting down of your interface. =)
So, why i told about that, in most cases when you shaping traffic in linux, you shaping outbound traffic. It is effective, exept situation when inbound bandwidth flooded which udp or which special forged packets ( if your under tcp syn flood attack, for example). When i wrote rules, i wrote it for outound traffic. You can try to control inbound traffic by IFB devices, but it is another story.
All rules and disciplines directly or indirectly bound to on of the network interface
Third important thing, in linux exists few diffirent disciplines for traffic shaping (you need to read tc manual, if you want to know more), i prefer HTB for my purposes. When you use HTB, you must classify traffic, all traffic that will not be classified, will go to the default class.

Huh, i read again what i wrote and now thinking that it look not so simple if you does not read theory about traffic shaping in linux, anyway if you does not understand what about i talking, you can read manual for tc and LARTC or you free to experiment and look what will happen. =)

After i learned tcng, i wrote complete rules for shaping traffic on ppp0, but when i tried to compile rules, i noticed that tcng use amount of cpu and did not response, i thinked that it is a bug, but after few minutes i got near 30 000 lines of tc commands. It happened, because i began using tcng  without understanding how they compile their rules. I give a simple example what happened:

#include "ports.tc"
 
dev ppp0 {
    egress {
        /* all of our classification happens here */
        class ( <$test> )  if tcp_sport > 1024 && tcp_dport < 1024 ||                            udp_sport > 1024 && udp_dport > 1024;
        class ( <$default> );
        /* all of our queues are added here */
        htb () {
            $test   = class ( rate 64kbps, ceil 128kbps ) {
                sfq ( perturb 10s ); }
            $default       = class ( rate 256kbps, ceil 512kbps ) {
                sfq ( perturb 10s ); }
        }
    }
}

This is just example, what will happened if you will try to compile it? Let’s see:

$ tcng /tmp/tcngtest|wc -l
732

You will can seen by yourself how looks result. In short when tcng compile rules, they can not use logical operations like destination port > 1024, instead of that they use bit maching by offset in package, as result you will got hundred rules that will containt bits, bit masks, offsets etc. I do not have my old rules for demonstration how exactly i had near 30k lines, but i will show how i solve this problem by iptables and tcng.

Because i can not use complex rules of packet classification in tcng, i used iptables to mark packets by same rules and simple rules in tcng for shaping by marks. You can learn about mark in iptables manual, as i remember, iptables can mark packets in network stack of OS, this mark visible locally, but will invisible for other hosts.
I slightly modified classification that you can see in diagram, i added additional classes and now traffic divided into realtime, high priority, middle pirority low latency, mid priority, low priority and default.
Also i want to explain rule for low priority traffic, first task is mark as low priority p2p networks, there is two way, use l7 inspection or use dumb rules based on network/transport layer. L7 inspection give low false positive error rates, but slow. Dumb rules give high false positive error rates, but fast. For l7 inspection you can use opendpi, but i prefer second solution. I classify all traffic from unprivileged ports to all  unprivileged as low priority traffic. Anyway, if you have something important that send data from unprivileged to unprivileged port, you can specify additional rules for this traffic (if your know ports).

Ok, let’s look on tcng rules, i created rules only for my internet connection, i connected over pppoe, so i placed next script in /etc/ppp/ip-up.d/qos:

#!/bin/sh
if [ ! -z $PPP_IFACE ]
then
  if [ -f /etc/tcng/${CALL_FILE}.tc ]
  then
    /bin/sed "s/PPP_IFACE/${PPP_IFACE}/" /etc/tcng/${CALL_FILE}.tc|/usr/bin/tcng -r|/bin/bash
  fi
fi

In /etc/tcng/dsl-provider.tc i put my rules, every time when i ‘call’ dsl-provider this script executing. Name of PPP interface can vary, so instead of interface name i put ‘PPP_IFACE’ into rules. When ppp connection established, script change marker to actual interface name, compile rules and execute it.

There is my rules:

#include "fields.tc"
#include "ports.tc"
 
dev PPP_IFACE {
  egress {
    class( <$realtime> ) if meta_nfmark == 0x10;
    class( <$high> ) if meta_nfmark == 0x15;
    class( <$mid_lowlatency> ) if meta_nfmark == 0x20;
    class( <$mid_noname> ) if meta_nfmark == 0x25;
    class( <$lowp> ) if meta_nfmark == 0x30;
    class( <$default> ) if 1; //default
    htb(quantum 1500B) {
      class( rate 100Mbps, ceil 100Mbps ) {
        class( prio 1, rate 2Mbps, ceil 100Mbps ) {
          $realtime = class( prio 1, rate 1Mbps, ceil 100Mbps ) { sfq; }
          $high = class( prio 2, rate 1Mbps, ceil 100Mbps ) { sfq; }
        };
        class( prio 2, rate 2Mbps, ceil 100Mbps, quantum 1250B ) {
          $mid_lowlatency = class( prio 1, rate 1Mbps, ceil 100Mbps ) { sfq; }
          $mid_noname = class( prio 2, rate 1Mbps, ceil 100Mbps ) { sfq; }
        };
        $lowp = class( prio 8, rate 1Mbps, ceil 100Mbps ) { sfq; };
        $default = class( prio 3, rate 1Mbps, ceil 100Mbps ) { sfq; };
      }
    }
  }
}

Pay attention, default class have higher priority than low priority. There i have different class names, but i think it is not a big problem for understanding idea of rules. Next we will talk about packets marking. if you want to mark packet in iptables, you need to use `-J MARK –set-mark=MARK_VALUE`, if you need to match marked packets in tcng, you must use `meta_nfmark` directive.  As i remember, you need to place mark rules into `mangle` table `PREROUTING` chain, otherwise this will not work with `tc`. I create `QOS` chain into `mangle` table and add `-j QOS` rule into `PREROUTING` chain, there is part of iptables-save dump:

-A POSTROUTING -o ppp0 -j QOS 
-A QOS -p icmp -m mark --mark 0x0 -m comment --comment "MARK all icmp traffic as realtime" -j MARK --set-mark 0x10/0xffffffff 
-A QOS -p tcp -m mark --mark 0x0 -m tcp --dport 53 -m comment --comment "MARK tcp DNS queries as realtime" -j MARK --set-mark 0x10/0xffffffff 
-A QOS -p udp -m mark --mark 0x0 -m udp --dport 53 -m comment --comment "MARK udp DNS queries as realtime" -j MARK --set-mark 0x10/0xffffffff 
-A QOS -p udp -m mark --mark 0x0 -m udp --dport 123 -m comment --comment "MARK udp NTP queries as realtime" -j MARK --set-mark 0x10/0xffffffff 
-A QOS -p tcp -m mark --mark 0x0 -m tcp --dport 123 -m comment --comment "MARK tcp NTP queries as realtime" -j MARK --set-mark 0x10/0xffffffff 
-A QOS -m mark --mark 0x0 -m length --length 0:128 -m comment --comment "MARK all small packets as realtime" -j MARK --set-mark 0x10/0xffffffff 
-A QOS -p tcp -m mark --mark 0x0 -m multiport --dports 22,5190 -m length --length 0:1500 -m comment --comment "MARK all small SSH and ICQ packets as high priority" -j MARK --set-mark 0x15/0xffffffff 
-A QOS -p tcp -m mark --mark 0x0 -m multiport --dports 80,443 -m comment --comment "MARK all HTTP traffic as medium interactive priority" -j MARK --set-mark 0x20/0xffffffff 
-A QOS -p tcp -m mark --mark 0x0 -m multiport --dports 143,220,993 -m comment --comment "MARK  IMAP traffic as medium interactive priority" -j MARK --set-mark 0x20/0xffffffff 
-A QOS -p tcp -m mark --mark 0x0 -m multiport --dports 110,995 -m comment --comment "MARK POP3 traffic as medium priority" -j MARK --set-mark 0x25/0xffffffff 
-A QOS -p tcp -m mark --mark 0x0 -m multiport --dports 25,465 -m comment --comment "MARK SMTP traffic as medium priority" -j MARK --set-mark 0x25/0xffffffff 
-A QOS -p tcp -m mark --mark 0x0 -m tcp --sport 10000 -m comment --comment "MARK outgoing OpenVPN as medium priority" -j MARK --set-mark 0x25/0xffffffff 
-A QOS -p tcp -m mark --mark 0x0 -m tcp --sport 1723 -m comment --comment "MARK outgoing pptp as medium priority" -j MARK --set-mark 0x25/0xffffffff 
-A QOS -p tcp -m mark --mark 0x0 -m multiport --sports 80,6080 -m comment --comment "MARK outgoing HTTP as medium priority" -j MARK --set-mark 0x25/0xffffffff 
-A QOS -p tcp -m mark --mark 0x0 -m multiport --dports 1024:65535 -m multiport --sports 1024:65535 -m comment --comment "MARK low priority TCP traffic" -j MARK --set-mark 0x50/0xffffffff 
-A QOS -p udp -m mark --mark 0x0 -m multiport --dports 1024:65535 -m multiport --sports 1024:65535 -m comment --comment "MARK low priority UDP traffic" -j MARK --set-mark 0x50/0xffffffff

I wrote rules with comments, so it is easy to understand. Also, now i noticed, that i must add this rules from script that executed when pppoe connection established, because i does not know name of ppp interface before connection will be established.

In conclusion i want to say, i tried just to show that tcng is very useful tool, it will help you to write clear rules for traffic shaping. Anyway, it can not substitute knowleges about how work traffic shaping in linux. I highly reccomend to read LARTC and manual for `tc` before you going to use `tcng`.

Failsafe zoneminder with gluster and geo-replication.

Near year ago i configured zoneminder for monitoring my approach. But i got one problem, host that running zoneminder placed in same approach, so, if it will be stolen, it will be stolen with recordings. I spend a lot of time while choosing solution to organize replication on remote server. Most solution that i found work in “packet” mode, they start replication once in NN sec or after accumulating a certain number of events and usually use rsync. Rsync allow to reduce traffic usage, but increase time between event when new frame will be wrote on disk and event when frame will be wrote on remote side. In this situation every second counts, so I thought that solutions like “unison”, “inosync”, “csync2”, “lsyncd” not applicable.

I tried to use drbd, but get few problems. First – i do not have separate partition on host and can not shrink existing partitions to get new. I tried it with loopback devices, but drbd have deadlock bug when used with loopback. This solution work near two days after deadlock occurs, you can not read content of mounted fs or unmount it and only one solution to fix it that i found – reboot. Second problem – “split brain” situation, when master host restart. I did not spent time to found solution because all ready had first problem. May be it can be fixed with split-brain handlers script or with “heartbeat”. Drbd has great write speed in asynchronous mode and has small overhead, so, may be in another situation i will choose drbd.

Next what i tried was “glusterfs”, first i tried glusterfs 3.0. It was easily to configure it, and glusterfs was worked, but very slow. For good performance glusterfs need short latency between hosts. It useful for local networks but completely useless for hosts connected over slow links. Also glusterfs 3.0 did not have asynchronous write like in drbd. I temporarily thrown searching for solutions, but after a while in release notes for glusterfs 3.2 i found that gluster got “geo-replication“. First i think that this it what i need and before has understood (see conclusion) how it work i started to configure it.

If you will have troubles with configuration, installation  here you can find manual.
For Debian, first that you need is to add backports repository on host with zonemnider and on hosts where you planned to replicate data:

$ echo 'deb http://backports.debian.org/debian-backports squeeze-backports \
main contrib non-free' >> /etc/apt/sources.list
$ apt-get update
$ apt-get install glusterfs-server

After that you must open ports for gluster on all hosts where you want to use it:

$ iptables -A INPUT -m tcp -p tcp --dport 24007:24047 -j ACCEPT 
$ iptables -A INPUT -m tcp -p tcp --dport 111 -j ACCEPT 
$ iptables -A INPUT -m udp -p udp --dport 111 -j ACCEPT 
$ iptables -A INPUT -m tcp -p tcp --dport 38465:38467 -j ACCEPT

If you do not planned to use glusterfs over NFS (as i) you can skip last rule.
After ports will opened, gluster installed and service running you need to add gluster peers. For example you run zoneminder on host “zhost” and want to replicate it on host “rhost” (do not forget add hosts in /etc/hosts on both sides).  Run “gluster” and add peer (here and below gluster commands must to be executed on master host, i.e. on zhost):

$ gluster
gluster> peer probe rhost

Let’s check it:

gluster> peer status
Number of Peers: 1
 
Hostname: rhost
Uuid: 5e95020c-9550-4c8c-bc73-c9a120a9e96e
State: Peer in Cluster (Connected)

Next  necessary to create volumes on both hosts, do not forget to create directories where gluster will save their files. For examples you planned to use “/var/spool/glusterfs”:

gluster> volume create zoneminder transport tcp zhost:/var/spool/glusterfs
Creation of volume zoneminder has been successful. Please start the volume to access data.
gluster> volume create zoneminder_rep transport tcp rhost:/var/spool/glusterfs
Creation of volume zoneminder_rep has been successful. Please start the volume to access data.
gluster> volume start zoneminder
Starting volume zoneminder has been successfu
gluster> volume start zoneminder_rep
Starting volume zoneminder_rep has been successfu

After that you must configure geo-replication:

gluster&gt; volume geo-replication zoneminder gluster://rhost:zoneminder_rep start
Starting geo-replication session between zoneminder &amp; gluster://rhost:zoneminder_rep has been successful

And check that it is work:

gluster> volume geo-replication status
MASTER               SLAVE                                              STATUS    
--------------------------------------------------------------------------------
zoneminder           gluster://rhost:zoneminder_rep           starting...
gluster> volume geo-replication status
MASTER               SLAVE                                              STATUS    
--------------------------------------------------------------------------------
zoneminder           gluster://rhost:zoneminder_rep           OK

Also i made next configurations:

gluster> volume geo-replication zoneminder gluster://192.168.107.120:zoneminder_rep  config sync-jobs 4
geo-replication config updated successfully
gluster> volume geo-replication zoneminder gluster://192.168.107.120:zoneminder_rep  config timeout 120
geo-replication config updated successfully
gluster> volume set zoneminder nfs.disable on
Set volume successful
gluster> volume set zoneminder_rep nfs.disable on
Set volume successful
gluster> volume set zoneminder nfs.export-volumes off
Set volume successful
gluster> volume set zoneminder_rep nfs.export-volumes off
Set volume successful
gluster> volume set zoneminder performance.stat-prefetch off
Set volume successful
gluster> volume set zoneminder_rep performance.stat-prefetch off
Set volume successful

I disabled nfs because i did not planned to use gluster volumes over nfs, also i disabled prefetch because they produce next error on geo-ip modules:

E [stat-prefetch.c:695:sp_remove_caches_from_all_fds_opened] (-->/usr/lib/glusterfs/3.2.7/xlator/mount/fuse.so(fuse_setattr_resume+0x1b0) [0x7f2006ba8ed0] (-->/usr/lib/glusterfs/3.2.7/xlator/debug/io-stats.so
(io_stats_setattr+0x14f) [0x7f200467db8f] (-->/usr/lib/glusterfs/3.2.7/xlator/performance/stat-prefetch.so(sp_setattr+0x7c) [0x7f200489c99c]))) 0-zoneminder_rep-stat-prefetch: invalid argument: inode

In conclusion i must to say, that it is not a solution that i looked for, because as it turned out  gluster use rsync for geo-replication. Pros of this solution: it is work and easy to setup. Cons it is use rsync. Also i expected that debian squeeze can not mount gluster volumes in boot sequence, i tried to modify /etc/init.d/mountnfs.sh but without result, so i just add mount command in zoneminder start script.

PS
echo ‘cyclope:zoneminder      /var/cache/zoneminder   glusterfs       defaults        0       0’ >> /etc/fstab
And add ‘mount /var/cache/zoneminder’ in start section of /etc/init.d/zoneminder

PPS
Do not forget to stope zoneminder and copy content of /var/cache/zoneminder on new partition before using it.

How to block large ip subset on the example of TOR

I wrote before how i blocked TOR exit nodes by iptables, disadvantages of method that i used before – big amount of rules (one per each ip). This solution easy and obvious, but had speed penalty. Today i want write about more effective solution that use ipset, let’s see what is ipset:

IP sets are a framework inside the Linux 2.4.x and 2.6.x kernel, which can be administered by the ipset utility. Depending on the type, currently an IP set may store IP addresses, (TCP/UDP) port numbers or IP addresses with MAC addresses in a way, which ensures lightning speed when matching an entry against a set. If you want to store multiple IP addresses or port numbers and match against the collection by iptables at one swoop; dynamically update iptables rules against IP addresses or ports without performance penalty; express complex IP address and ports based rulesets with one single iptables rule and benefit from the speed of IP sets then ipset may be the proper tool for you.

(c)

Because of exit nodes can exist on hosts that using dynamic IP i want to delete addresses from list after timeout (otherwise list will always growing and contain unused addresses or addresses without TOR exit nodes, larger list requires more memory and CPU time for processing). If addresses persist between list updates, timeout will be reseted.

To do this i used iptree set (you can learn more about set types in manual for ipset), because that type provide timeout for each address.

First i installed ipset:

# apt-get install xtables-addons-common

After that i modify scripts that i used before. In perl script i made a pair of minor bug fixes and in shell script i add new facility for loading rules into ipset.
Perl script:

#!/usr/bin/perl -w
use strict;
use LWP::Simple;
 
my $list = get("http://exitlist.torproject.org/exit-addresses");
my $i;
my @ips;
 
if( ! defined( $list ) ) {
    exit( 1 );
}
if( $#ARGV == -1 ) {
exit(0);
}
 
foreach $i (split( /\n/, $list )) {
    push( @ips, $1 ) if( $i =~ m/((?:\d{1,3}\.){3}\d{1,3})/);
}
 
if( $ARGV[0] eq "-ip" ) {
    print( join( "\n", @ips ) . "\n");
}

That script return exit code without arguments that signaled can this script fetch addresses or not, with parameter “-ip” they return list of addresses.
Next shell script:

#!/bin/bash
SOLVE="/root/bin/solve.pl"
IPSET="/usr/sbin/ipset"
 
case "$1" in
ipset)
        if ! /usr/bin/perl $SOLVE
        then
                echo "Can not fetch Tor exit nodes" 1>&2
                exit 1
        fi
        if ! $IPSET -L tor 2>&1 > /dev/null
        then
                $IPSET -N tor iptree --timeout 259200
        fi
        for i in `/usr/bin/perl $SOLVE -ip`
        do
                $IPSET -A tor $i 2> /dev/null
        done
 
;;
 
iptables)
        if /usr/bin/perl $SOLVE
        then
                /sbin/iptables -F TorExitnodes
                /sbin/iptables -I TorExitnodes -j RETURN
 
                for i in `/usr/bin/perl $SOLVE -ip`
                do
                 /sbin/iptables -I TorExitnodes -s $i -j TorBlockAndLog
                done
        else
                        echo "Can not fetch Tor exit nodes" 1>&2
        exit 1;
        fi
 
;;
 
*)
        echo "Usage ./$0 "
        exit 1
;;
esac
 
exit 0

This script can add rules into iptables (old variant) or into ipset. I added this script in cron and run every few hours. When ipset found that address all ready in list, they update timeout, if address will not observed in 72 hours, they will be automatically deleted.

Finishing touch – new rule in iptables:

# iptables -A INPUT -i eth+ -m set --match-set tor src -m comment --comment "Block TOR exit nodes thru IPSET" -j DROP

That’s all. Do not forget to place this rule before rules where you permit access to your server.

How to swindle hoster or the tale of how to defeat greed.

I remembered one interesting story.
One time we bought dedicated server, when we made out the order i specified partition layout in additional comment section (raid 1 + lvm), when i got access to the server i saw that OS have one partition on LVM spliced over both drives.

We wrote email, and asked to reinstall OS with specified layout. The hoster answered that OS installed automatically, if we need another layout, then we must pay additional 100$. After that i started to google about how i can reinstall OS remotely, i found few complicated solution, but did not used it.

Also i remembered that we have IP KVM, i logged in (it was HP KVM), found that i can connect ISO to virtual cd-drive, tried it, but without result.

While i tried to use virtual cd, i rebooted host few times, while it was rebooting i saw that they used boot on lan and have installed gPXE. Unfortunately i did not found debian installation in their net boot environment, but i have deal with gPXE before and remembered that i can boot anything what i want over network thru gPXE console. I found my old gPXE script, modify it to boot debian installer, and taadaaam!

% chainload http://178.209.51.37/2013

I installed OS as i want and save money.
PS
It was interesting, when i started to install OS i observed that they reinstall OS with new partition layout. One part of first HDD was used for OS, another part and second HDD was used for LVM, idiots..

How to transfer data between hosts securely.

like a bossFrom time to time i faced with task how to transfer important data between servers securely (ie over ssl or something similar). I do not use passwords for remote access and do not have private keys on remote systems, so i can not use ssh for this purposes.

First i wanted to write about solution that i used few days ago (based on socat), but this solution is to complicated (later you will see why). I remembered that openssl can encrypt files with password and send result to STDOUT. While i was reading manual for openssl, i found that openssl can be used like netcat (s_server and s_client commands), unfortunately i did not found way how to use openssl for data transfer, because in session that you can establish openssl interprets some chars as commands (R for renegotiation for example), so if you want to use openssl client/server for data transfer, you need something like base64 encoding, but without control characters.

Solution with netcat and openssl:
First i created file for test (you can use data from STDIN, tar output for example, or transfer existing file):

client$ dd if=/dev/urandom of=/tmp/rand bs=1M count=10
10+0 records in
10+0 records out
10485760 bytes (10 MB) copied, 1.36697 seconds, 7.7 MB/s
client$ md5sum /tmp/rand
10fe36edbbd48cde844ad1a2a29a8e0f  /tmp/rand

Next, prepare server side:

server$ read pass
PasSwOrD
server$ nc -l -p 6667|openssl aes-256-cbc -d -k $pass &gt; /tmp/rand

I used “read” to prevent save key into history file.
There “PasSwOrD” is your key, i use ssh to organize data transfer, so i did not worried that the traffic with key can be captured.
Next initiate transfer from client side:

client$ read pass
PasSwOrD
client$ cat /tmp/rand |openssl  aes-256-cbc -salt -k $pass|nc -w1 server.remote 6667

Check sum:

server$ md5sum /tmp/rand
10fe36edbbd48cde844ad1a2a29a8e0f  /tmp/rand

Yeah! Your see? I transfered mah file.
UPDATE: [2019-05-26] Nowaday openssl has broken backward compatibility, so when you try to decrypt file you cold get error message like `digital envelope routines:EVP_DecryptFinal_ex:bad decrypt:crypto/evp/evp_enc.c:535`, if so you need to add -md md5 or -md sha256 on both sides to openssl’s options.

Ok, next we will do it with socat.
First you need to generate client side and server side key and certificates, let’s do it on server:

server$ openssl genrsa -out server.key 2048
Generating RSA private key, 2048 bit long modulus
........+++
......................................+++
e is 65537 (0x10001)

Create certificate:

server$  openssl req -new -key server.key -x509 -days 108 -batch -out server.crt

Create pem file:

server$ cat server.* &gt; ./server.pem

After that you will need to execute same commands on client side, but you will need to change filenames from “server” to “client”.
Next step is to exchange certificates between client and server (do it on both sides), they could be copy pasted:

server$ cat &gt; ./client.crt &lt;&lt; EOF
-----BEGIN CERTIFICATE-----
.... a lot of garbage ....
-----END CERTIFICATE-----
EOF

Now we ready to transfer file, prepare server:

server$ socat openssl-listen:4433,reuseaddr,cert=./server.pem,cafile=./client.crt STDIO &gt; /tmp/rand

Client:

client$ socat STDIO openssl-connect:server.remote:4433,cert=./client.pem,cafile=./server.crt &lt; /tmp/rand
client$ md5sum /tmp/rand
10fe36edbbd48cde844ad1a2a29a8e0f  /tmp/rand

Gotcha!

As you can see, socat with TLS not a easy solution if you need just a transfer file, so i will recommend to use first solution. Also, in debian, you can use snakeoil key and cert, but it is your homework.

UPDATE:
I found how to use openssl for data transfer, only one problem, they did not close socket after EOF, so you need to stop it by hands:
Prepare server (this time i used snakeoil cert):

server% sudo openssl s_server -quiet -accept 4343 -cert /etc/ssl/certs/ssl-cert-snakeoil.pem -key /etc/ssl/private/ssl-cert-snakeoil.key &lt; /tmp/test

Run client:

client% md5sum /tmp/rnd
86246865b3932804979fdac48a99cebf  /tmp/rnd
client% openssl s_client -connect localhost:4343 -quiet &gt; /tmp/rnd

After data transfered, hit ^C on server side and check:

^C
server% md5sum /tmp/test
86246865b3932804979fdac48a99cebf  /tmp/test

Bug in munin

bug in munin traffic graphFew weeks i observed strange graphs for network usage produced by munin, i did not attach any importance to this. But few days ago when i seen again 600Mbit badwidth usage on host that had 10Mbit connection i remembered that before made some changes in munin.conf. I looked at config and found that changed directive ‘graph_period’ to ‘minute’ before(do not remember why). When i change it back to ‘second’, i got normal graphs.

A riddle from abyss

A week ago, when i worked at home computer i encountered with strange issue, xorg was stuck, i tried ctrl-alt-backspace and magic keys without result, so i rebooted host.
While computer is booted i saw few errors from different daemons that they can not found some files and finally when i tried to login i got fresh login screen after entered password, again and again. Superuser was disabled, so i booted into singlemod but only one errer that i saw was ‘Unable to cd to ‘/home/ivan’ for user ‘ivan'”. Considering the suddenness of what happened, i thought that my HDD broken, i booted from livecd and checked s.m.a.r.t., hdd surface for bad blocks and did not found anything. After that my paranoia  took over and i started to search rootkits or something that can broke my OS and nothing found again.

I configured network from single mode, start xsession with browser and started searching, suddenly i found article when someone said that run install script and that script changed permissions on ‘/’, i looked at my ‘/’ and observed 0644. For me is still a mystery how it happened.

Apt pinning note

Some days ago i got additional VDS and wanted to install php5-fpm from dotdeb, but after i added dotdeb repository, apt started to try upgrade mysql and nginx from dotdeb. Solution to stop that:

$ cat > /etc/apt/preferences.d/dotdeb << EOF
Package: *
Pin: origin "packages.dotdeb.org"
Pin-Priority: 50
EOF

Next time if i will want to install something from dotdeb, i can use -t switch for apt-get or aptitude.

Tor blacklist

Few month ago i was interested in how to block incoming traffic from Tor network. Tor network have finitely numbers of exit nodes, so the solution is to block traffic from this nodes. I see two solutions how to block exit nodes.  First use TorDNSEL service, in brief you can check that connection comes from Tor exit node or not by querying special domain name. It is useful if you want to do this check in your php script, for example. But i do not know how to protect  host completely by this method. Another way is to use exitlist. By this way you can get list of addresses. Both ways have advantages and disadvantages. I prefer last solution because, it allow to protect my hosts completely.  Cons of this method – near 700-800 rules into firewall, that can slow down your host.

Anyway, first that i done – simply script that fetch list of IP and print them to standard output. When i started to writing script i wanted to expand features in future. I want to generate ready firewall rules for example, but now this script can only print list of IP’s. I do not want to integrate firewall rules application in this script, because i think this job must be done by another script.
So, now this script only fetch and print IP addresses or exit with error code if can not do it:

#!/usr/bin/perl -w
use strict;
use LWP::Simple;
 
my $list = get("http://exitlist.torproject.org/exit-addresses");
my $i;
my @ips;
 
if( ! defined( $list ) ) {
die( "Can not fetch addresses.\n" );
}
 
foreach $i (split( /\n/, $list )) {
push( @ips, $1 ) if( $i =~ m/((?:\d{1,3}\.){3}\d{1,3})/);
}
 
if( defined( $ARGV[0] ) &&  $ARGV[0]  eq "-ip" ) {
print( join( "\n", @ips ));
} else {
die( "Usage $0 -ip\n" );
}

Next that you need – apply rules on firewall. As i say before, when you apply one rule per IP you will get near 700-800 rules, it is reason why you need optimize sequence of your rules. It can be done in next manner:
1. Rule to accept packets flowing thru loopback.
2. Rule to accept packets of RELATED and ESTABLISHED connections.
3. Rule to drop packets from blacklisted sources.
4. Rule to accept packets from whitelisted sources.
5. Rule for fail2ban (optional).
6. Rule for Tor exit nodes.
7. Other rules (for example, accept connection on ssh, http, https ports).

This sequence allow  to make decision for most packets before they pass Tor filtration rules.
I organize Tor filtration rules in separate table called TorExitnodes, it allow to update only this table and do not touch other firewall rules. Also i create table TorBlockAndLog that contain LOG and DROP target, but it is optional, you can simply drop packets.

For updating firewall rules for exit nodes i made next bash scrip and place it in cront:

#!/bin/bash
SOLVE="/root/bin/solve.pl"
 
if /usr/bin/perl $SOLVE
then
/sbin/iptables -F TorExitnodes
/sbin/iptables -I TorExitnodes -j RETURN
 
for i in `/usr/bin/perl $SOLVE -ip`
do
/sbin/iptables -I TorExitnodes -s $i -j TorBlockAndLog
done
else
echo "Can not fetch Tor exit nodes" 1>&2
exit 1;
fi

Where is variable “SOLVE” store path to previous perl script, as i say before, if you do not need special action for incoming connections from Tor, you can change “TorBlockAndLog” on “DROP”.

Thats all.

PS

It is good idea to place whitelist before TorExitnodes, who can know, may be one fine day you found -s 0.0.0.0/0 -j DROP into this table. =)