leonardo@opalsoft.net
Version 0.1 - April 30, 2002
Revision History | ||
---|---|---|
Revision 0.1 | 2002-04-30 | Revised by: lb |
Initial release. |
This document is free; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version. This document is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You can get a copy of the GNU GPL here.
I took this "disclaimer" from Linux-Advance Networking Overview by Saravanan Radhakrishnan (08-1999) because it applies in my own case:
All the text in this document is completely based on my understanding of implementations of various features. I have read some documents and have seen the code myself, and I described them based on my understanding. If the reader notices any concept description which appears to be contrary to their understanding of the concept, the issue can be taken up for discussion and corrections will be made to the document as necessary. I would appreciate any suggestions and comments made in order to improve the quality of this document.
My wife Cielo and my sons Jose, Dario and Gustavo by their patient and support.
Linux Documentation Project for publishing and uploading my document.
The site http://www.docum.org drived by Stef Coene that give me some ideas for writing this document.
Paul "Rusty" Rusell who write the kernel firewall code, the excelent package iptables and the associated library libiptc.
Harald Welte who write the utility iptables-save.
Alexey Kuznetsov who write the kernel queue discipline code and the excelent package iproute2.
Tabatha Persad from the Linux Documentation Project who revised my english syntax and writing, gave me several ideas to improve the content and encouraged me to learn and use DocBook to write the final version of this document.
You can find the latest version of this document at Querying libiptc HOWTO.html.
If you have suggestions to help make this document better, please submit your ideas to me at the following address: leonardo@opalsoft.net.
While I wrote this HOWTO, I developed a simple bandwith meter using user-defined chains to get the data to be measured. This idea was conceived looking at monitor.pl, a simple perl program for bandwith measurement, written by Stef Coene at http://www.docum.org. I recommend this site to people interested in bandwidth control and measurement.
I will try to be very pragmatic and clear in order to make this HOWTO useful.
You have to have some previous knowledge to follow this document:
You have to have kernel sources installed in your system, in /usr/src/linux as usual.
I am using a 2.4.16 kernel in a SuSE 7.1 Linux environment. You need 2.4.x kernel code to follow this HOWTO, preferably kernel 2.4.16. For SuSE you can get the kernel sources at ftp://ftp.gwdg.de/pub/linux/suse/ftp.suse.com/suse/i386/update.
You have to know how to compile the kernel if you have to update your kernel version. After activating the netfilter options using make menuconfig, you must compile and install the kernel as usual.
Reboot your new kernel using init 6. Ensure that you backup a copy of your previous kernel in lilo in case you encounter a problem and need to retrace your steps.
Be sure that your new 2.4.x kernel is running fine. To install iptables-1.2.6 you will need to patch the kernel again (and re-compile and install it), and it is better if you follow the previous two steps to ensure that your kernel is running right before applying new iptables patches.
To install libiptc follow these steps:
Download iptables-1.2.6.tar.bz2 from http://netfilter.samba.org/.
Copy the iptables tar file into /usr/local/src:
bash# cp iptables-1.2.6.tar.bz2 /usr/local/src |
Unpack:
bash# tar xjvf iptables-1.2.6.tar.bz2 |
Go into the iptables directory:
bash# cd iptables-1.2.6 |
Check to see if your kernel needs some aditional patches with:
bash# make pending-patches KERNEL_DIR=/usr/src/linux |
If your kernel source is located somewhere other than in /usr/src/linux, replace the kernel source directory in the command line above with your source directory.
Be careful with this option. This command invokes patch-o-matic, a new patch verification utility by Rusty Russell. The utility will show you a list of new patches (some proposed, some submitted, some accepted) available for your kernel source. As Rusty himself says, "Some of these new patches have bugs", and you do not have to apply all of them.
Read the information showed for each patch carefully and answer with y (apply the patch) or N (skip this patch). In some cases answering y will try to apply the patch, but if the patch finds some differences between your sources, it will be skipped and the next new one presented.
I did not apply any of the proposed patches and kept my kernel in its original state before continuing to the next step.
Make the iptables package with:
bash# make KERNEL_DIR=/usr/src/linux |
Again, if your kernel source is not at /usr/src/linux, replace the kernel source directory in the command above.
If all goes right the compiler will finish without errors.
Before the next step, check to see if you have installed iptables package by typing:
bash# rpm -q iptables |
If the iptables rpm is installed, you will see the name and version of the package, similar to:
iptables-1.1.2-13
In this case un-install with:
bash# rpm -e iptables |
Install the new created package:
bash# make install KERNEL_DIR=/usr/src/linux |
Again, check your kernel source directory.
This command will install the binaries (iptables, iptables-save, iptables-restore) in /usr/local/sbin, the manuals in /usr/local/man/man8 and the modules in /usr/local/lib/iptables.
Finally install the headers, development libraries and associated development man pages, with:
bash# make install-devel |
This command will install the libiptc library in /usr/local/lib.
I think something must be wrong with this command. It does not install all headers files properly, so you must install them yourself using:
bash# cd /usr/local/src/iptables-1.2.6 bash# cp include/iptables.h /usr/local/include bash# cp include/iptables_common.h /usr/local/include bash# mkdir /usr/local/include/libiptc bash# cp include/libiptc/libiptc.h /usr/local/include/libiptc bash# cp include/libiptc/ipt_kernel_headers.h /usr/local/include/libiptc bash# cp iptables.o /usr/local/lib |
iptables.o is needed above to compile programs to get rule information from netfilter.
Now you are ready to create programs that can communicate directly with libiptc.
Your program(s) would be something like this:
/* My program */ #include <getopt.h> #include <sys/errno.h> #include <stdio.h> #include <fcntl.h> #include <stdlib.h> #include <string.h> #include <dlfcn.h> #include <time.h> #include "libiptc/libiptc.h" #include "iptables.h" int main(void) { /* Use always this part for your programs .... From here ... **** */ iptc_handle_t h; const char *chain = NULL; const char *tablename = NULL; program_name = "my_program"; program_version = NETFILTER_VERSION; /* .... To here .... ******************************************** */ /* From here you write your own code */ .... your code ... .... .... } /* main */ |
I have also included a brief description (when available) taken from Linux netfilter Hacking HOWTO within each function explanation.
Usage: Takes a snapshot of the rules.
Prototype: iptc_handle_t iptc_init(const char *tablename)
Description: This function must be called as initiator before any other function can be called.
Have a look at this section of code in file iptables-save.c for how to invoke this function:
h = iptc_init(tablename); if (!h) exit_error(OTHER_PROBLEM, "Can't initialize: %s\n",iptc_strerror(errno)); |
Usage: Translates error numbers into more human-readable form.
Prototype: const char *iptc_strerror(int err)
Usage: Iterator functions to run through the chains.
Prototype: const char *iptc_first_chain(iptc_handle_t *handle)
Description: This function returns the first chain name in the table.
Usage: Iterator functions to run through the chains.
Prototype: const char *iptc_next_chain(iptc_handle_t *handle)
Description: This function returns the next chain name in the table; NULL means no more chains.
Returns: Char pointer to the name of the chain.
We can create Program #1 to exercise our understanding of these previous four functions:
/* * How to use libiptc- program #1 * /usr/local/src/p1.c */ #include <getopt.h> #include <sys/errno.h> #include <stdio.h> #include <fcntl.h> #include <stdlib.h> #include <string.h> #include <dlfcn.h> #include <time.h> #include "libiptc/libiptc.h" #include "iptables.h" int main(void) { iptc_handle_t h; const char *chain = NULL; const char *tablename = "filter"; program_name = "p1"; program_version = NETFILTER_VERSION; h = iptc_init(tablename); if ( !h ) { printf("Error initializing: %s\n", iptc_strerror(errno)); exit(errno); } for (chain = iptc_first_chain(&h); chain; chain = iptc_next_chain(&h)) { printf("%s\n", chain); } exit(0); } /* main */ |
Write this program and save it as p1.c in /usr/local/src.
Now write this "bash" script to simplify the compiling process:
#!/bin/bash gcc -Wall -Wunused -DNETFILTER_VERSION=\"1.2.6\" -rdynamic -o $1 $1.c \ /usr/local/lib/iptables.o /usr/local/lib/libiptc.a -ldl |
Save it as ipt-cc and do not forget to chmod 0700 ipt-cc.
bash# ./ipt-cc p1 |
bash# ./p1 |
INPUT FORWARD OUTPUT |
These are the three built-in iptables chains.
Now create some new chains using iptables and run your program again:
bash# iptables -N chain_1 bash# iptables -N chain_2 bash# ./p1 |
INPUT FORWARD OUTPUT chain_1 chain_2 |
Error initializing: Table does not exist (do you need to insmod?) |
iptables informs you that myfilter does not exist as a table.
Usage: Check if a chain exists.
Prototype: int iptc_is_chain(const char *chain, const iptc_handle_t handle)
Usage: Is this a built-in chain?
Prototype: int iptc_builtin(const char *chain, const iptc_handle_t handle)
Description: This function is used to check if a given chain name is a built-in chain or not.
Usage: Get first rule in the given chain.
Prototype: const struct ipt_entry *iptc_first_rule(const char *chain, iptc_handle_t *handle)
Usage: Get a pointer to the target name of this entry.
Prototype: const char *iptc_get_target(const struct ipt_entry *e, iptc_handle_t *handle)
Returns: Returns a char pointer to the target name. See Description above for more information.
/* Internet address. */ struct in_addr { __u32 s_addr; }; /* Yes, Virginia, you have to zero the padding. */ struct ipt_ip { /* Source and destination IP addr */ struct in_addr src, dst; /* Mask for src and dest IP addr */ struct in_addr smsk, dmsk; char iniface[IFNAMSIZ], outiface[IFNAMSIZ]; unsigned char iniface_mask[IFNAMSIZ], outiface_mask[IFNAMSIZ]; /* Protocol, 0 = ANY */ u_int16_t proto; /* Flags word */ u_int8_t flags; /* Inverse flags */ u_int8_t invflags; }; struct ipt_counters { u_int64_t pcnt, bcnt; /* Packet and byte counters */ }; /* This structure defines each of the firewall rules. Consists of 3 parts which are 1) general IP header stuff 2) match specific stuff 3) the target to perform if the rule matches */ struct ipt_entry { struct ipt_ip ip; /* Mark with fields that we care about. */ unsigned int nfcache; /* Size of ipt_entry + matches */ u_int16_t target_offset; /* Size of ipt_entry + matches + target */ u_int16_t next_offset; /* Back pointer */ unsigned int comefrom; /* Packet and byte counters. */ struct ipt_counters counters; /* The matches (if any), then the target. */ unsigned char elems[0]; }; |
An ipt_entry structure contains:
Here is another sample program Program #2 written with a lot of help from Russell-Welte:
/* * How to use libiptc- program #2 * /usr/local/src/p1.c */ #include <getopt.h> #include <sys/errno.h> #include <stdio.h> #include <fcntl.h> #include <stdlib.h> #include <string.h> #include <dlfcn.h> #include <time.h> #include "libiptc/libiptc.h" #include "iptables.h" /* Here begins some of the code taken from iptables-save.c **************** */ #define IP_PARTS_NATIVE(n) \ (unsigned int)((n)>>24)&0xFF, \ (unsigned int)((n)>>16)&0xFF, \ (unsigned int)((n)>>8)&0xFF, \ (unsigned int)((n)&0xFF) #define IP_PARTS(n) IP_PARTS_NATIVE(ntohl(n)) /* This assumes that mask is contiguous, and byte-bounded. */ static void print_iface(char letter, const char *iface, const unsigned char *mask, int invert) { unsigned int i; if (mask[0] == 0) return; printf("-%c %s", letter, invert ? "! " : ""); for (i = 0; i < IFNAMSIZ; i++) { if (mask[i] != 0) { if (iface[i] != '\0') printf("%c", iface[i]); } else { /* we can access iface[i-1] here, because * a few lines above we make sure that mask[0] != 0 */ if (iface[i-1] != '\0') printf("+"); break; } } printf(" "); } /* These are hardcoded backups in iptables.c, so they are safe */ struct pprot { char *name; u_int8_t num; }; /* FIXME: why don't we use /etc/protocols ? */ static const struct pprot chain_protos[] = { { "tcp", IPPROTO_TCP }, { "udp", IPPROTO_UDP }, { "icmp", IPPROTO_ICMP }, { "esp", IPPROTO_ESP }, { "ah", IPPROTO_AH }, }; static void print_proto(u_int16_t proto, int invert) { if (proto) { unsigned int i; const char *invertstr = invert ? "! " : ""; for (i = 0; i < sizeof(chain_protos)/sizeof(struct pprot); i++) if (chain_protos[i].num == proto) { printf("-p %s%s ", invertstr, chain_protos[i].name); return; } printf("-p %s%u ", invertstr, proto); } } static int print_match(const struct ipt_entry_match *e, const struct ipt_ip *ip) { struct iptables_match *match = find_match(e->u.user.name, TRY_LOAD); if (match) { printf("-m %s ", e->u.user.name); /* some matches don't provide a save function */ if (match->save) match->save(ip, e); } else { if (e->u.match_size) { fprintf(stderr, "Can't find library for match `%s'\n", e->u.user.name); exit(1); } } return 0; } /* print a given ip including mask if neccessary */ static void print_ip(char *prefix, u_int32_t ip, u_int32_t mask, int invert) { if (!mask && !ip) return; printf("%s %s%u.%u.%u.%u", prefix, invert ? "! " : "", IP_PARTS(ip)); if (mask != 0xffffffff) printf("/%u.%u.%u.%u ", IP_PARTS(mask)); else printf(" "); } /* We want this to be readable, so only print out neccessary fields. * Because that's the kind of world I want to live in. */ static void print_rule(const struct ipt_entry *e, iptc_handle_t *h, const char *chain, int counters) { struct ipt_entry_target *t; const char *target_name; /* print counters */ if (counters) printf("[%llu:%llu] ", e->counters.pcnt, e->counters.bcnt); /* print chain name */ printf("-A %s ", chain); /* Print IP part. */ print_ip("-s", e->ip.src.s_addr,e->ip.smsk.s_addr, e->ip.invflags & IPT_INV_SRCIP); print_ip("-d", e->ip.dst.s_addr, e->ip.dmsk.s_addr, e->ip.invflags & IPT_INV_DSTIP); print_iface('i', e->ip.iniface, e->ip.iniface_mask, e->ip.invflags & IPT_INV_VIA_IN); print_iface('o', e->ip.outiface, e->ip.outiface_mask, e->ip.invflags & IPT_INV_VIA_OUT); print_proto(e->ip.proto, e->ip.invflags & IPT_INV_PROTO); if (e->ip.flags & IPT_F_FRAG) printf("%s-f ", e->ip.invflags & IPT_INV_FRAG ? "! " : ""); /* Print matchinfo part */ if (e->target_offset) { IPT_MATCH_ITERATE(e, print_match, &e->ip); } /* Print target name */ target_name = iptc_get_target(e, h); if (target_name && (*target_name != '\0')) printf("-j %s ", target_name); /* Print targinfo part */ t = ipt_get_target((struct ipt_entry *)e); if (t->u.user.name[0]) { struct iptables_target *target = find_target(t->u.user.name, TRY_LOAD); if (!target) { fprintf(stderr, "Can't find library for target `%s'\n", t->u.user.name); exit(1); } if (target->save) target->save(&e->ip, t); else { /* If the target size is greater than ipt_entry_target * there is something to be saved, we just don't know * how to print it */ if (t->u.target_size != sizeof(struct ipt_entry_target)) { fprintf(stderr, "Target `%s' is missing " "save function\n", t->u.user.name); exit(1); } } } printf("\n"); } /* Here ends some of the code taken from iptables-save.c ****************** */ int main(void) { iptc_handle_t h; const struct ipt_entry *e; const char *chain = NULL; const char *tablename = "filter"; const int counters = 1; program_name = "p2"; program_version = NETFILTER_VERSION; /* initialize */ h = iptc_init(tablename); if ( !h ) { printf("Error initializing: %s\n", iptc_strerror(errno)); exit(errno); } /* print chains and their rules */ for (chain = iptc_first_chain(&h); chain; chain = iptc_next_chain(&h)) { printf("%s\n", chain); for (e = iptc_first_rule(chain, &h); e; e = iptc_next_rule(e, &h)) { print_rule(e, &h, chain, counters); } } exit(0); } /* main */ |
In main we iterate through each chain and for each one we iterate through each rule printing it.
The arguments of print_rule are:
e = pointer to an ipt_entry structure containing information about the rule.
h = pointer to an iptc_handle_t structure returned by iptc_init.
OK, compile and run program p2:
bash# ./ipt-cc p2 bash# ./p2 |
INPUT FORWARD OUTPUT chain_1 chain_2 |
Now modify the environment using iptables to add some rules:
bash# iptables -A INPUT -p tcp -i eth0 -s ! 192.168.1.1 --dport 20 -j ACCEPT bash# iptables -A chain_1 -p udp -o eth1 -s 192.168.2.0/24 --sport 33 -j DROP |
Now if you run again p2 you will get:
INPUT [0:0] -A INPUT -s ! 192.168.1.1 -i eth0 -p tcp -m tcp --dport 20 -j ACCEPT FORWARD OUTPUT chain_1 [0:0] -A chain_1 -s 192.168.2.0/255.255.255.0 -o eth1 -p udp -m udp --sport 33 -j DROP chain_2 |
Usage: Get the policy of a given built-in chain.
Returns: Returns a char pointer to the policy name.
Using pieces of programs 1 and 2 we can write program #3:
/* * How to use libiptc- program #3 * /usr/local/src/p3.c */ #include <getopt.h> #include <sys/errno.h> #include <stdio.h> #include <fcntl.h> #include <stdlib.h> #include <string.h> #include <dlfcn.h> #include <time.h> #include "libiptc/libiptc.h" #include "iptables.h" int main(void) { iptc_handle_t h; const char *chain = NULL; const char *policy = NULL; const char *tablename = "filter"; struct ipt_counters counters; program_name = "p3"; program_version = NETFILTER_VERSION; /* initialize */ h = iptc_init(tablename); if ( !h ) { printf("Error initializing: %s\n", iptc_strerror(errno)); exit(errno); } /* print built-in chains, their policies and counters */ printf("BUILT-IN POLICY PKTS-BYTES\n"); printf("-----------------------------\n"); for (chain = iptc_first_chain(&h); chain; chain = iptc_next_chain(&h)) { if ( !iptc_builtin(chain, h) ) continue; if ( (policy = iptc_get_policy(chain, &counters, &h)) ) printf("%-10s %-10s [%llu:%llu]\n", chain, policy, counters.pcnt, counters.bcnt); } exit(0); } /* main */ |
OK, compile and run program p3:
bash# ./ipt-cc p3 bash# ./p3 |
You will get something like this:
BUILT-IN POLICY PKTS-BYTES ---------------------------- INPUT ACCEPT [0:0] FORWARD ACCEPT [0:0] OUTPUT ACCEPT [0:0] |
These functions are not covered by this HOWTO and I will limit myself to presenting improved information taken from libiptc.h and the Linux netfilter Hacking HOWTO by Rusty Russell.
Prototype: int iptc_flush_entries(const ipt_chainlabel chain, iptc_handle_t *handle)
Description: This function flushes the rule entries in the given chain (ie. empties chain).
Usage: Zeroes the chain counters.
Prototype: int iptc_zero_entries(const ipt_chainlabel chain, iptc_handle_t *handle)
Description: This function zeroes the counters in the given chain.
Prototype: int iptc_create_chain(const ipt_chainlabel chain, iptc_handle_t *handle)
Prototype: int iptc_delete_chain(const ipt_chainlabel chain, iptc_handle_t *handle)
Description: This function delete the chain identified by the char pointer chain in the table.
Here is the bandwidth meter bw.c. It's well commented to be easy followed by everyone:
/* * How to use libiptc- program #4 * /usr/local/src/bw.c * By Leonardo Balliache - 04.09.2002 * e-mail: leonardo@opalsoft.net * --WELL COMMENTED-- to be easy followed by everyone. */ /* include files required */ #include <getopt.h> #include <sys/errno.h> #include <sys/time.h> #include <stdio.h> #include <fcntl.h> #include <stdlib.h> #include <string.h> #include <dlfcn.h> #include <time.h> #include <unistd.h> #include "libiptc/libiptc.h" #include "iptables.h" /* colors to differentiate chains measures */ #define RED "\033[41m\033[37m" #define GREEN "\033[42m\033[30m" #define ORANGE "\033[43m\033[30m" #define BLUE "\033[44m\033[37m" #define MAGENTA "\033[45m\033[37m" #define CYAN "\033[46m\033[30m" #define WHITE "\033[47m\033[30m" #define BLACK "\033[40m\033[37m" #define RESET "\033[00m" /* maximum number of chains to be processed */ #define MAXUSERCHAINS 7 /* time between measures in seconds; adjust as you like */ #define SLEEPTIME 1 /* structure to count bytes per chain */ struct bwcnt { int start; /* the chain was initialized */ u_int64_t icnt; /* bytes through; previous measure */ u_int64_t ocnt; /* bytes through; current measure */ double bw; /* bandwitdh (flow) on this chain */ }; /* function to calculate differences of time in seconds. * micro-seconds precision. */ double delta(struct timeval a, struct timeval b) { if (a.tv_usec & b.tv_usec) { a.tv_sec--; a.tv_usec += 1000000; } return a.tv_sec-b.tv_sec + (a.tv_usec-b.tv_usec)/1000000.0; } /* main function */ int main(int argc, char *argv[]) { int i, j, ok; double totbw; iptc_handle_t h; int c, act_bw = 0; const char *chain = NULL; const char *tablename = "filter"; struct timeval ti, to; struct bwcnt bw[MAXUSERCHAINS]; struct ipt_counters *counters; char *col[9] = { RED,GREEN,ORANGE,BLUE,MAGENTA,CYAN,WHITE,BLACK,RESET }; program_name = "bw"; program_version = NETFILTER_VERSION; /* check options * we have 2 options: * -c = display current flow (each SLEEPTIME). * -a = display average flow (from start); default option. */ while ((c = getopt (argc, argv, "ac")) != -1) switch (c) { case 'a': act_bw = 0; break; case 'c': act_bw = 1; break; case '?': if (isprint(optopt)) fprintf (stderr, "Unknown option `-%c'.\n", optopt); else fprintf (stderr,"Unknown option character `\\x%x'.\n",optopt); exit(1); default: abort(); } /* initialize array of chains */ memset(&bw, 0, MAXUSERCHAINS * sizeof(struct bwcnt)); /* get time to start meter on variable ti */ gettimeofday(&ti, NULL); /* fire meter loop */ if ( act_bw ) printf("Displaying current flow values ...\n"); else printf("Displaying average flow values ...\n"); /* forever loop; abort the program with ^C */ while ( 1 ) { /* you have to initialize for each measure to be done */ if ( !(h = iptc_init(tablename)) ) { printf("Error initializing: %s\n", iptc_strerror(errno)); exit(errno); } ok = 0; /* we start a new loop */ gettimeofday(&to, NULL); /* have a time shoot */ /* iterate through each chain of the table */ for (chain = iptc_first_chain(&h), i = 0; chain; chain = iptc_next_chain(&h)) { if ( iptc_builtin(chain, h) ) continue; /* if it is a built-in chain, ignore it */ /* ok, read the counters of this chain */ if ( !(counters = iptc_read_counter(chain, 0, &h)) ) { printf("Error reading %s: %s\n", chain, iptc_strerror(errno)); exit(errno); } /* check that we do not have more chains than we can process */ if ( i >= MAXUSERCHAINS ) { printf("Maximum of %d user chains exceeded!!\n", MAXUSERCHAINS); exit(1); } /* this chain counter has not been initialized; initialize it */ if ( bw[i].start == 0 ) { bw[i].icnt = counters->bcnt; bw[i].start = 1; } /* this chain has a previous measure; take the current one */ else { bw[i].ocnt = counters->bcnt; if ( bw[i].ocnt == bw[i].icnt ) /* no new bytes flowing? */ bw[i].bw = 0; /* flow is zero */ else /* flow in this chain is: * current bytes count (bw[i].octn) *minus* * previous bytes count (bw[i].icnt) *divided by* * 128.0 to convert bytes to kbits *and divided by* * difference in times in seconds *to get* * flow in kbits/sec that is what we want. */ bw[i].bw = (bw[i].ocnt - bw[i].icnt) / (128.0 * delta(to, ti)); /* do you want current flow of this chain? initialize previous * bytes count to current bytes count; we get the flow in last * SLEEPTIME elapsed time. */ if ( act_bw ) bw[i].icnt = bw[i].ocnt; ok = 1; /* ok, we have some measure to show */ } ++i; /* next chain, please */ } /* we iterate and i == 0; we do not have user chains at all */ if ( i == 0 ) { printf("No user chains to meter!!\n"); exit(1); } /* do you want to measure current flow? initialize previous time * to actual time; we get the time elapsed in last SLEEPTIME. */ if ( act_bw ) ti = to; /* do we have something to show? ok, display it */ if ( ok ) { totbw = 0; for ( j = 0; j < i; ++j ) { totbw = totbw + bw[j].bw; /* calculate total flow */ } printf("%s%6.1fk:%s ", col[7], totbw, col[8]); /* display total */ for ( j = 0; j < i; ++j ) { /* display flow of each chain in color */ printf("%s%6.1fk%s ", col[j], bw[j].bw, col[8]); } printf("\n"); } sleep(SLEEPTIME); /* rest a little; you go too fast */ } /* give us enough time in order to let some bytes flow */ exit(0); /* bye, we have our measures!! */ } /* main */ |
Write your program and compile as before:
bash# ./ipt-cc bw |
Before using the meter we need to set our environment.
First, we have to have at least 2 PCs connected in a network. This is our diagram configuration:
+--------+ eth0 eth0 +--------+ | PC #1 +-----------------+ PC #2 | +--------+ +--------+ eth0=192.168.1.1 eth0=192.168.1.2 |
Second, we need to install a very nice and useful package called netcat written by Hobbit. This excellent package will help us to inject and receive a flow of bytes between 2 NICs. If you don't have the package in your system, download it from http://rr.sans.org/audit/netcat.php.
The version that I use is 1.10-277. To install it follow these instructions:
bash# cp netcat-1.10.tar.gz /usr/local/src bash# tar xzvf netcat-1.10.tar.gz bash# cd netcat-1.10 |
My version requires to make a patch first; check yours if you have a file with a .dif extension and apply it too:
bash# patch -p0 -i netcat-1.10.dif |
Next compile the package using make:
bash# make linux |
Copy the binary nc to your user bin directory:
bash# cp nc /usr/bin |
And also to the second PC in your network:
bash# scp nc 192.168.1.2:/usr/bin |
We are going to use netcat to "listen" to a flow of bytes from PC #2 and to "talk" from PC #1. Using tty1 to tty4 consoles on PC #2 let's start netcat to listen from this PC. Go to PC #2 and in tty1 type:
bash# nc -n -v -l -s 192.168.1.2 -p 1001 >/dev/null |
netcat must respond with:
listening on [192.168.1.2] 1001 ... |
This command started netcat to listen from address 192.168.1.2 using port number 1001. Arguments are: -n = use numeric address identification; -v = verbose; -l = listen. All the flow that netcat receives in 192.168.1.2:1001 will be redirected to the "black hole" in /dev/null.
Repeat the command in tty2, tty3 and tty4; change to tty2 using ALT-F2 and after logging in write:
bash# nc -n -v -l -s 192.168.1.2 -p 1002 >/dev/null |
Now we are "listening" to the same address but port number 1002.
Go on now with tty3:
bash# nc -n -v -l -s 192.168.1.2 -p 1003 >/dev/null |
And tty4:
bash# nc -n -v -l -s 192.168.1.2 -p 1004 >/dev/null |
Now we are listening in PC #2, address 192.168.1.2 in ports 1001, 1002, 1003 and 1004.
Come back to PC #1 and let's set the environment to allow iptables to help us to complete our tests:
On PC #1, type the into tty1 as follows:
bash# iptables -F bash# iptables -X bash# iptables -N chn_1 bash# iptables -N chn_2 bash# iptables -N chn_3 bash# iptables -N chn_4 bash# iptables -A chn_1 -j ACCEPT bash# iptables -A chn_2 -j ACCEPT bash# iptables -A chn_3 -j ACCEPT bash# iptables -A chn_4 -j ACCEPT bash# iptables -A OUTPUT -o eth0 -p tcp --dport 1001 -j chn_1 bash# iptables -A OUTPUT -o eth0 -p tcp --dport 1002 -j chn_2 bash# iptables -A OUTPUT -o eth0 -p tcp --dport 1003 -j chn_3 bash# iptables -A OUTPUT -o eth0 -p tcp --dport 1004 -j chn_4 |
These commands will:
Flush all chains in table filter.
Delete all user chains in table filter.
Create user chains chn_1, chn_2, chn_3 and chn_4.
Establish a target ACCEPT in each user chain.
Create 4 rules in chain OUTPUT that matches port numbers 1001 to 1004 and target it to user chains chn_1 to chn_4.
Now start the bw meter using current values:
bash# ./bw -c |
It must respond with:
Displaying current flow values ... 0.0k: 0.0k 0.0k 0.0k 0.0k 0.0k: 0.0k 0.0k 0.0k 0.0k 0.0k: 0.0k 0.0k 0.0k 0.0k 0.0k: 0.0k 0.0k 0.0k 0.0k |
It informs that measures are current flows. Every line is a measure taken each SLEEPTIME lapse (1 second in our program). First column (in black) are total flow, next columns (in red, green, orange and blue) are flows in chains chn_1, chn_2, chn_3 and chn_4 respectively. Of course we do not have any flow now. However let bw run and continue reading.
Let's start now one of our byte flows; go to tty2 in PC #1 with ALT-F2 and after logging in, type:
bash# yes 000000000000000000 | nc -n -v -s 192.168.1.1 -p 2001 192.168.1.2 1001 |
netcat responds with:
(UNKNOWN) [192.168.1.2] 1000 (?) open |
Now we have a flow of bytes from PC #1 to PC #2. yes generates a constant flow of zeroes; this flow is piped to netcat through address 192.168.1.1, port 2001 and sends it to PC #2, address 192.168.1.2, port 1001 (where PC #2 is listening).
Check now the display of bw in tty1:
7653.2k: 7653.2k 0.0k 0.0k 0.0k 7829.5k: 7829.5k 0.0k 0.0k 0.0k 7786.7k: 7786.7k 0.0k 0.0k 0.0k 7892.1k: 7982.1k 0.0k 0.0k 0.0k |
Your mileage can vary depending of the physical characteristics of your system. In mine I have a flow of aproximately 7700 kbits/sec in the first chain chn_1 which corresponds to port number 1001 in PC #2.
Let's start now the second bytes flow; go to tty3 in PC #1 with ALT-F3 and after logging in, type:
bash# yes 000000000000000000 | nc -n -v -s 192.168.1.1 -p 2002 192.168.1.2 1002 |
netcat responds with:
(UNKNOWN) [192.168.1.2] 1002 (?) open |
Now we have 2 flows of bytes from PC #1 to PC #2; one from 192.168.1.1:2001 to 192.168.1.2:1001 and another from 192.168.1.1:2002 to 192.168.1.2:1002.
Now check the display of bw in tty1:
7819.6k: 4144.2k 3675.4k 0.0k 0.0k 8090.5k: 3923.9k 4166.6k 0.0k 0.0k 7794.7k: 3920.8k 3873.9k 0.0k 0.0k 7988.3k: 3754.6k 4233.7k 0.0k 0.0k |
Now we have 2 flows; each of them is more or less 50% of the total flow going out of the computer. The Linux kernel tries to balance the bandwidth available between the 2 channels of output.
To continue, start the 2 aditional flows through channels 192.168.1.1:2003-192.168.1.2:1003 and 192.168.1.1:2004-192.168.1.2:1004.
In tty4 type:
bash# yes 000000000000000000 | nc -n -v -s 192.168.1.1 -p 2003 192.168.1.2 1003 |
In tty5 type:
bash# yes 000000000000000000 | nc -n -v -s 192.168.1.1 -p 2004 192.168.1.2 1004 |
The display of bw in tty1 will be something like:
8120.6k: 1705.3k 2354.9k 1898.6k 2161.8k 7765.3k: 1634.2k 2560.2k 2011.4k 1559.5k 7911.9k: 1699.8k 2090.3k 1768.0k 2353.8k 8309.4k: 1734.5k 1999.7k 1999.9k 2575.3k |
Total bandwidth is distributed between the 4 channels of flow.
[*] QoS and/or fair queueing <M> CBQ packet scheduler <M> CSZ packet scheduler [*] ATM pseudo-scheduler <M> The simplest PRIO pseudoscheduler <M> RED queue <M> SFQ queue <M> TEQL queue <M> TBF queue <M> GRED queue <M> Diffserv field marker <M> Ingress Qdisc [*] QoS support [*] Rate estimator [*] Packet classifier API <M> TC index classifier <M> Routing table based classifier <M> Firewall based classifier <M> U32 classifier <M> Special RSVP classifier <M> Special RSVP classifier for IPv6 [*] Traffic policing (needed for in/egress) |
Having compiled and re-installed your kernel you have to install the iproute2 package that will be used to run the commands needed to implement the queues. Download this package from ftp://ftp.inr.ac.ru/ip-routing.
I'm working with version 2.2.4-now-ss001007. To install it follow these instructions:
bash# cp iproute2-2.2.4-now-ss001007.tar.gz /usr/local/src bash# tar xzvf iproute2-2.2.4-now-ss001007.tar.gz bash# cd iproute2 bash# make |
After make compiles the iproute2 package successfully the ip utility will be in iproute2/ip directory and the tc utility in iproute2/tc directory. Copy both of them to /usr/bin directory:
bash# cp ip/ip /usr/bin bash# cp tc/tc /usr/bin |
Now, using the tc utility, we are going to create a CBQ queue in the interface eth0 of the PC #1 computer. This queue will have 4 classes as children and each of these classes will be used to control the 4 flows from 192.168.1.1 to 192.168.1.2 through ports 1001 to 1004.
Write and run the following commands:
bash# tc qdisc add dev eth0 root handle 1:0 cbq bandwidth 10Mbit \ avpkt 1000 cell 8 |
This command creates the main (root) cbq queue 1:0 in the eth0 interface; the bandwidth of this queue is 10Mbit/sec corresponding to our Ethernet interface.
Now write and run:
bash# tc class add dev eth0 parent 1:0 classid 1:1 cbq bandwidth 10Mbit \ rate 1000kbit prio 8 allot 1514 cell 8 maxburst 20 avpkt 1000 bounded |
This command create the main cbq class 1:1. The rate of this class will be 1000kbit/sec.
Now we are going to create 4 classes ownned by this class; the classes will have rates of 100kbit, 200kbit, 300kbit and 400kbit respectively. Write and run these commands:
bash# tc class add dev eth0 parent 1:1 classid 1:3 cbq bandwidth 10Mbit \ rate 100kbit prio 5 allot 1514 cell 8 maxburst 20 avpkt 1000 bash# tc class add dev eth0 parent 1:1 classid 1:4 cbq bandwidth 10Mbit \ rate 200kbit prio 5 allot 1514 cell 8 maxburst 20 avpkt 1000 bash# tc class add dev eth0 parent 1:1 classid 1:5 cbq bandwidth 10Mbit \ rate 300kbit prio 5 allot 1514 cell 8 maxburst 20 avpkt 1000 bash# tc class add dev eth0 parent 1:1 classid 1:6 cbq bandwidth 10Mbit \ rate 400kbit prio 5 allot 1514 cell 8 maxburst 20 avpkt 1000 |
Each of these classes will have a sfq queue discipline attached to them to dispatch their packets. Write and run these commands:
bash# tc qdisc add dev eth0 parent 1:3 handle 30: sfq perturb 15 bash# tc qdisc add dev eth0 parent 1:4 handle 40: sfq perturb 15 bash# tc qdisc add dev eth0 parent 1:5 handle 50: sfq perturb 15 bash# tc qdisc add dev eth0 parent 1:6 handle 60: sfq perturb 15 |
These commands create 4 sfq queue disciplines, one for each class. sfq queue discipline is some kind of fair controlling queue. It tries to give to each connection in an interface same oportunity to their packets to be dispatched to at all.
Finally we are going to create filters to assign flows to ports 1001, 1002, 1003 and 1004 to classes 1:3, 1:4, 1:5 and 1:6 respectively. Write and run as follows:
bash# tc filter add dev eth0 parent 1:0 protocol ip prio 1 u32 match ip \ dport 1001 0xffff flowid 1:3 bash# tc filter add dev eth0 parent 1:0 protocol ip prio 1 u32 match ip \ dport 1002 0xffff flowid 1:4 bash# tc filter add dev eth0 parent 1:0 protocol ip prio 1 u32 match ip \ dport 1003 0xffff flowid 1:5 bash# tc filter add dev eth0 parent 1:0 protocol ip prio 1 u32 match ip \ dport 1004 0xffff flowid 1:6 |
After running all these commands, now check your bw meter (you must be running netcat listening at ports 1001 to 1004 in PC #2 and talking in PC #1 as was explained in previous chapter and bw running in current -c mode). You will have something like this:
Current flow values ... 1099.9k: 108.8k 196.5k 337.9k 456.8k 1104.2k: 115.3k 184.9k 339.9k 464.1k 1102.1k: 117.3k 174.7k 339.7k 470.5k 1114.4k: 113.6k 191.7k 340.7k 468.4k 1118.4k: 113.7k 194.3k 340.5k 469.9k |
bw show us how flows are controlling using queue disciplines of the Linux kernel. As you see, CBQ queue discipline is not a very precise queue but you more or less have a flow of approximately 1000=100+200+300+400 on interface eth0.
To step back, write and run as follows:
bash# tc qdisc del dev eth0 root handle 1:0 cbq |
on PC #1, to delete the main (root) queue discipline and owned classes and filters.
bash# killall nc |
on PC #2 and PC #1, to stop netcat.
bash# iptables -F bash# iptables -X |
on PC #1, to clear iptables rules and chains.
bash# Ctrl-C |
on PC #1, tty1 to stop bw bandwidth meter.
Married to Cielo, with 3 sons (Jose, Dario, Gustavo), he can be reached at leonardo@opalsoft.net. He is working now (please be patient) to open a QoS Linux information site at http://opalsoft.net/qos/ to interchange knowledge with people interested and to make his works in the Linux "best of all" operating system available to the public.
April 30, 2002
Caracas, Venezuela