Linux FreeS/WAN background

This section discusses a number of issues which have three things in common:

They are not specifically FreeS/WAN problems
You may have to understand them to get FreeS/WAN working right
They are not simple questions

Grouping them here lets us provide the explanations some users will need without unduly complicating the main text.

Some DNS background

Opportunistic encryption requires that the gateway systems be able to fetch public keys, and other IPsec-related information, from each other's DNS (Domain Name System) records.

DNS is a distributed database that maps names to IP addresses and vice versa.

Much good reference material is available for DNS.

We give only a brief overview here, intended to help you use DNS for FreeS/WAN purposes.

Forward and reverse maps

Although the implementation is distributed, it is often useful to speak of DNS as if it were just two enormous tables:

the forward map: look up a name, get an IP address
the reverse map: look up an IP address, get a name

Both maps can optionally contain additional data. For opportunistic encryption, we insert the data needed for IPsec authentication.

A system named gateway.example.com with IP address 10.20.30.40 should have at least two DNS records, one in each map:

gateway.example.com. IN A 10.20.30.40: used to look up the name and get an IP address
40.30.20.10.in-addr.arpa. IN PTR gateway.example.com.: used for reverse lookups, looking up an address to get the associated name. Notice that the digits here are in reverse order; the actual address is 10.20.30.40 but we use 40.30.20.10 here.
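The digit reversal is mechanical, so it can be scripted. Here is a minimal Python sketch using the standard ipaddress module; the helper name reverse_name is ours, for illustration only:

```python
import ipaddress

def reverse_name(address: str) -> str:
    """Return the fully qualified reverse-map name for an IP address."""
    # reverse_pointer gives the reversed name without a trailing period,
    # so we append the final period that marks the DNS root.
    return ipaddress.ip_address(address).reverse_pointer + "."

print(reverse_name("10.20.30.40"))  # 40.30.20.10.in-addr.arpa.
```

The result is the name that appears on the left-hand side of the PTR record above.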

For both maps there is a hierarchy of DNS servers and a system of delegating authority so that, for example:

the DNS administrator for example.com can create entries of the form name.example.com
the example.com admin cannot create an entry for counterexample.com; only someone with authority for .com can do that
an admin might have authority for 20.10.in-addr.arpa.
in either map, authority can be delegated
- the example.com admin could give you authority for westcoast.example.com
- the 20.10.in-addr.arpa admin could give you authority for 30.20.10.in-addr.arpa
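Authority follows whole labels (the parts between periods), not raw string suffixes. That is why counterexample.com is outside example.com even though one string ends with the other. A rough Python sketch of the test, with hypothetical helper names, might look like this:

```python
def labels(name: str) -> list:
    """Split a domain name into lower-case labels, ignoring the final period."""
    return name.rstrip(".").lower().split(".")

def is_within(name: str, zone: str) -> bool:
    """True if name is the zone itself or lies below it in the hierarchy."""
    n, z = labels(name), labels(zone)
    # Compare whole labels from the right-hand end of the name.
    return len(n) >= len(z) and n[-len(z):] == z

print(is_within("westcoast.example.com.", "example.com."))         # True
print(is_within("counterexample.com.", "example.com."))            # False
print(is_within("30.20.10.in-addr.arpa.", "20.10.in-addr.arpa."))  # True
```

This only illustrates the naming rule. It does not handle the root zone itself, and real delegation also depends on the NS records the parent zone publishes.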

Returning to the example records:

	gateway.example.com. IN A 10.20.30.40
	40.30.20.10.in-addr.arpa. IN PTR gateway.example.com.

some syntactic details are:

the IN indicates that these records are for INternet addresses
The final periods in '.com.' and '.arpa.' are required. They indicate the root of the domain name system.
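To make the field layout concrete, here is a small Python sketch that splits records written in exactly the four-field form shown above. It is not a zone-file parser: real zone files allow TTLs, omitted fields and other syntax that this ignores, and the names Record and parse_record are ours.

```python
from typing import NamedTuple

class Record(NamedTuple):
    name: str    # owner name, e.g. gateway.example.com.
    rclass: str  # record class, IN in the examples
    rtype: str   # record type, e.g. A or PTR
    data: str    # everything after the type

def parse_record(line: str) -> Record:
    """Split a 'name class type data' line into its four fields."""
    # Raises ValueError if the line has fewer than four fields.
    name, rclass, rtype, data = line.split(None, 3)
    return Record(name, rclass, rtype, data.strip())

def is_fully_qualified(name: str) -> bool:
    """A fully qualified name ends with the period that marks the root."""
    return name.endswith(".")

for line in ("gateway.example.com. IN A 10.20.30.40",
             "40.30.20.10.in-addr.arpa. IN PTR gateway.example.com."):
    rec = parse_record(line)
    print(rec.rtype, rec.name, is_fully_qualified(rec.name))
```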

The capitalised strings after IN indicate the type of record. Possible types include:

A: Address, for forward lookups
PTR: PoinTeR, for reverse lookups
CNAME: Canonical NAME, records to support aliasing, multiple names for one address
MX: Mail eXchange, used in mail routing
SIG: SIGnature, used in secure DNS
KEY: a public key, used in secure DNS
TXT: TeXT, a multi-purpose record type

To set up for opportunistic encryption, you add some KEY and TXT records to your DNS data. Details are in our quickstart document.

Problems with packet fragmentation

Mailing list reports suggest that one problem is moderately common: small packets pass through the IPsec tunnels just fine, but larger packets fail.

These problems are caused by various devices along the way mis-handling either packet fragments or path MTU discovery.

IPsec makes packets larger by adding an ESP or AH header. This can tickle assorted bugs in fragment handling in routers and firewalls, or in path MTU discovery mechanisms, and cause a variety of symptoms which are both annoying and, often, quite hard to diagnose.

An explanation from project technical lead Henry Spencer:

The problem is IP fragmentation; more precisely, the problem is that the
second, third, etc. fragments of an IP packet are often difficult for
filtering mechanisms to classify.

Routers cannot rely on reassembling the packet, or remembering what was in
earlier fragments, because the fragments may be out of order or may even
follow different routes.  So any general, worst-case filtering decision
pretty much has to be made on each fragment independently.  (If the router
knows that it is the only route to the destination, so all fragments
*must* pass through it, reassembly would be possible... but most routers
don't want to bother with the complications of that.)

All fragments carry roughly the original IP header, but any higher-level
header is (for IP purposes) just the first part of the packet data... so
only the first fragment carries that.  So, for example, on examining the
second fragment of a TCP packet, you could tell that it's TCP, but not
what port number it is destined for -- that information is in the TCP
header, which appears in the first fragment only. 

The result of this classification difficulty is that stupid routers and
over-paranoid firewalls may just throw fragments away.  To get through
them, you must reduce your MTU enough that fragmentation will not occur.
(In some cases, they might be willing to attempt reassembly, but have very
limited resources to devote to it, meaning that packets must be small and
fragments few in number, leading to the same conclusion:  smaller MTU.)
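Henry's point about classification can be illustrated with a small Python simulation. It is a simplified sketch, not real packet handling: it assumes a 20-byte IPv4 header without options, uses made-up port numbers, and the helper names fragment and dest_port are ours.

```python
import struct

IP_HEADER_LEN = 20  # assumption: IPv4 header without options

def fragment(payload: bytes, mtu: int) -> list:
    """Split an IP payload into (offset, more_fragments, chunk) tuples."""
    # Every fragment except the last must carry a multiple of 8 data bytes.
    chunk_len = (mtu - IP_HEADER_LEN) // 8 * 8
    frags = []
    for offset in range(0, len(payload), chunk_len):
        chunk = payload[offset:offset + chunk_len]
        more = offset + chunk_len < len(payload)
        frags.append((offset, more, chunk))
    return frags

def dest_port(offset: int, chunk: bytes):
    """Return the TCP destination port if this fragment carries it, else None."""
    if offset != 0 or len(chunk) < 4:
        return None  # the TCP header is only in the first fragment
    _src, dst = struct.unpack("!HH", chunk[:4])
    return dst

# A made-up TCP segment: 20-byte header (ports 40000 -> 80) plus 3000 data bytes.
tcp_header = struct.pack("!HH", 40000, 80) + bytes(16)
segment = tcp_header + bytes(3000)

for offset, more, chunk in fragment(segment, mtu=1500):
    print(offset, len(chunk), more, dest_port(offset, chunk))
```

Only the fragment at offset 0 yields a port number. A filter that looks at each fragment independently has nothing but the IP header to go on for the others.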

In addition to the problem Henry describes, you may also have trouble with path MTU discovery.

By default, FreeS/WAN uses a large MTU for the ipsec device. This avoids some problems, but may complicate others. Here's an explanation from Claudia:

Here are a couple of pieces of background information. Apologies if you
have seen these already. An excerpt from one of my old posts:

    An MTU of 16260 on ipsec0 is usual. The IPSec device defaults to this 
    high MTU so that it does not fragment incoming packets before encryption 
    and encapsulation. If after IPSec processing packets are larger than 1500,
    [ie. the mtu of eth0] then eth0 will fragment them. 

    Adding IPSec headers adds a certain number of bytes to each packet. 
    The MTU of the IPSec interface refers to the maximum size of the packet
    before the IPSec headers are added. In some cases, people find it helpful 
    to set ipsec0's MTU to 1500-(IPSec header size), which IIRC is about 1430.

    That way, the resulting encapsulated packets don't exceed 1500. On most 
    networks, packets less than 1500 will not need to be fragmented.

and... (from Henry Spencer)

    The way it *ought* to work is that the MTU advertised by the ipsecN
    interface should be that of the underlying hardware interface, less a
    pinch for the extra headers needed. 

    Unfortunately, in certain situations this breaks many applications.
    There is a widespread implicit assumption that the smallest MTUs are 
    at the ends of paths, not in the middle, and another that MTUs are 
    never less than 1500.  A lot of code is unprepared to handle paths 
    where there is an "interior minimum" in the MTU, especially when it's 
    less than 1500. So we advertise a big MTU and just let the resulting 
    big packets fragment.

This usually works, but we do get bitten in cases where some intermediate
point can't handle all that fragmentation.  We can't win on this one.

The MTU can be changed with an overridemtu= statement in the config setup section of ipsec.conf(5).
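As a worked example of the arithmetic in the quoted post, the Python sketch below subtracts an assumed IPsec overhead from the Ethernet MTU and prints a matching config setup fragment. The 70-byte overhead is only an assumption chosen to reproduce the "about 1430" figure quoted above. The real overhead depends on the transforms in use, so treat the result as a starting point for experiment, not a recommended value.

```python
LINK_MTU = 1500      # MTU of eth0 in the quoted example
IPSEC_OVERHEAD = 70  # assumption: picked to match the "about 1430" figure

def ipsec_mtu(link_mtu: int = LINK_MTU, overhead: int = IPSEC_OVERHEAD) -> int:
    """Largest pre-IPsec packet that still fits in link_mtu after encapsulation."""
    return link_mtu - overhead

def config_fragment(mtu: int) -> str:
    """Format the value as an ipsec.conf 'config setup' fragment."""
    return "config setup\n\toverridemtu=%d" % mtu

print(config_fragment(ipsec_mtu()))
```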

Here is an example of the difficulty of diagnosing an MTU-related problem, from the mailing list:

Date: Mon, 3 Apr 2000
From: "Michael H. Warfield" <mhw@wittsend.com>

Paul Koning wrote:

>  Chris>  It appears that the Osicom router discards IP
>  Chris> fragments...

> Amazing.  A device that discards fragments isn't a router, it's at
> best a boat anchor.

        It may not be exactly what it appears.  I ran into a similar problem
with an ISDN link a while ago giving similar symptoms.  Turned out that
the device was negotiating an MTU that it really couldn't handle and the
device in front of it (a Linux box with always defragment enabled) was
defragmenting the huge IPSec datagrams and then refragmenting them into
hunks that the ISDN PPP thought it could handle but couldn't.  Problem was
solved by manually capping the MTU on the ISDN link to a smaller value.

        I gave the FreeSwan guys a hard time until tracking it down since
FreeSwan was the only thing that appeared to be able to tickle the bug.
Nothing else seemed to be broken.  What it really was that MTU discovery
was avoiding the problem on normal links and it was only the IPsec tunnels
that were creating huge datagrams that went through the defragment/refragment
process.

        Point here is that it "appeared" as though the ISDN link was
failing to handle fragments when it was really a configuration error and
a software bug resulting in a bad MTU that was really the culprit.  So
it may not be that the router is not handling fragments.  It may be that
it's missconfigured or has some other bug that only FreeSwan is tripping
over.

Network address translation (NAT)

Network Address Translation is a service provided by some gateway machines. Calling it NAPT (adding the word Port) would be more precise, but we will follow the widespread usage.

A gateway doing NAT rewrites the headers of packets it is forwarding, changing one or more of:

source address
source port
destination address
destination port

On Linux 2.4, NAT services are provided by the netfilter(8) firewall code. There are several Netfilter HowTos including one on NAT.

For older versions of Linux, this was referred to as "IP masquerade" and different tools were used. See this resource page.

NAT to non-routable addresses

The most common application of NAT uses private non-routable addresses.

Often a home or small office network will have:

one connection to the Internet
one assigned publicly visible IP address
several machines that all need access to the net

Of course this poses a problem since several machines cannot use one address. The best solution might be to obtain more addresses, but often this is impractical or uneconomical.

A common solution is to have:

non-routable addresses on the local network
the gateway machine doing NAT
all packets going outside the LAN rewritten to have the gateway as their source address

The client machines are set up with reserved non-routable IP addresses defined in RFC 1918. The masquerading gateway, the machine with the actual link to the Internet, rewrites packet headers so that all packets going onto the Internet appear to come from one IP address, that of its Internet interface. It then gets all the replies, does some table lookups and more header rewriting, and delivers the replies to the appropriate client machines.
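RFC 1918 reserves three address blocks for private use: 10.0.0.0/8, 172.16.0.0/12 and 192.168.0.0/16. A quick way to check whether an address falls in one of them, sketched in Python with the standard ipaddress module (the helper name is_rfc1918 is ours):

```python
import ipaddress

# The three private address blocks reserved by RFC 1918.
RFC1918_NETWORKS = [
    ipaddress.ip_network(net)
    for net in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]

def is_rfc1918(address: str) -> bool:
    """True if the address lies in one of the RFC 1918 private blocks."""
    ip = ipaddress.ip_address(address)
    return any(ip in net for net in RFC1918_NETWORKS)

print(is_rfc1918("192.168.1.5"))  # True
print(is_rfc1918("172.32.0.1"))   # False, just outside 172.16.0.0/12
```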

As far as anyone else on the Internet is concerned, the systems behind the gateway are completely hidden. Only one machine with one IP address is visible.
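The table lookups and header rewriting described above can be sketched as a toy Python model. This is an illustration only, not how netfilter is implemented: real NAT also tracks the protocol and the remote endpoint, expires idle entries, and handles port collisions. The class name, the starting port 61000 and the documentation-range addresses are all assumptions made for the example.

```python
class Masquerader:
    """Toy model of a masquerading gateway's translation table."""

    def __init__(self, public_ip: str, first_port: int = 61000):
        self.public_ip = public_ip
        self.next_port = first_port
        self.out = {}   # (private_ip, private_port) -> public_port
        self.back = {}  # public_port -> (private_ip, private_port)

    def outbound(self, src_ip, src_port, dst_ip, dst_port):
        """Rewrite the source of a packet leaving the LAN."""
        key = (src_ip, src_port)
        if key not in self.out:
            self.out[key] = self.next_port
            self.back[self.next_port] = key
            self.next_port += 1
        return (self.public_ip, self.out[key], dst_ip, dst_port)

    def inbound(self, src_ip, src_port, dst_ip, dst_port):
        """Rewrite the destination of a reply, or return None if unknown."""
        if dst_ip != self.public_ip or dst_port not in self.back:
            return None
        private_ip, private_port = self.back[dst_port]
        return (src_ip, src_port, private_ip, private_port)

gw = Masquerader("203.0.113.1")
sent = gw.outbound("192.168.1.5", 40000, "198.51.100.7", 80)
print(sent)   # source is now the gateway's address and a gateway port
reply = gw.inbound("198.51.100.7", 80, sent[0], sent[1])
print(reply)  # destination is the original client again
```

The remote system only ever sees the gateway's address, which is why the machines behind it appear hidden.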

For IPsec on such a gateway, you can entirely ignore the NAT in:

ipsec.conf(5)
firewall rules affecting your Internet-side interface

Those can be set up exactly as they would be if your gateway had no other systems behind it.

You do, however, have to take account of the NAT in firewall rules which affect packet forwarding.

NAT to routable addresses

NAT to routable addresses is also possible, but is less common and may make for rather tricky routing problems. We will not discuss it here. See the Netfilter HowTos.
