
IPv6 at the Department Level

James F. Carter <jimc@math.ucla.edu>, 2008-10-28

At the UCLA Mathematics Department we are running out of IPv4 addresses. Campus Network Services has a limited stock of additional addresses they could give us, and this issue is repeated at all scales. What are we going to do to give addresses to machines that seem to reproduce like rabbits?

[This is a work in progress. Not all steps have been actually accomplished to bring IPv6 to the Mathematics Department.]

Requirements

For the eventual solution we have these requirements:

Candidate Solutions

There are two classes of solution: NAT and IPv6.

NAT: Network Address Translation

RFC 1918 defines three IPv4 address blocks for use on internal LANs: 10.0.0.0/8, 172.16.0.0/12 and 192.168.0.0/16. Any of these blocks would provide vastly more addresses than we would ever need. To connect to an outside server, a client with such an address needs a route through a NAT box with a real IPv4 address, which alters outgoing packets to appear to come from itself, and inversely alters the replies.
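
As a rough illustration of the sizes involved, Python's standard ipaddress module can count the addresses in each RFC 1918 block and report which block, if any, holds a given address. The helper name rfc1918_block and the sample addresses in the usage note are assumptions chosen for this sketch.

```python
import ipaddress

# The three private blocks named in RFC 1918.
RFC1918_BLOCKS = [ipaddress.ip_network(b) for b in
                  ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")]

def rfc1918_block(addr):
    """Return the RFC 1918 block containing addr, or None if it is public."""
    ip = ipaddress.ip_address(addr)
    for block in RFC1918_BLOCKS:
        if ip in block:
            return block
    return None

if __name__ == "__main__":
    for block in RFC1918_BLOCKS:
        print(block, "holds", block.num_addresses, "addresses")
```

For example, rfc1918_block("192.168.1.20") returns the 192.168.0.0/16 network, while a public address such as 128.97.4.254 returns None and so would not need NAT.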

This technology is widely available and is well understood. Linux and Windows include NAT routing capability as a standard feature, and router boxes for home DSL lines normally do also. Cisco provides NAT devices at extra cost. No modification whatsoever is needed on the clients; at most, fixed addresses (if any) would need to be changed.

But NAT does have limitations.

IPv6: Internet Protocol Version 6

Since the world will soon use up all its IPv4 addresses, a new protocol has been developed with vastly more addresses: 2^128 of them, an outrageously large number. For example, the tunnel broker that I investigated will give you -- for free -- an allotment of 2^80 addresses, which is enough to give an individual IPv6 address to every cell in the body of every person in the state of Ohio.

Linux, Microsoft Windows and Apple Macintosh OS-X support IPv6 in the kernel and utilities, and have done so for years, since work began on IPv6. On Linux at least, firewall modules are well developed. However, helper modules for reverse connections are not ready yet; in the IPv4 firewall these recognize incoming connections related to one originating on an internal client, e.g. for SIP, and let them through, where an original connection from outside would be subject to stricter firewall rules.

On Linux, many subsystems support IPv6. See Appendix C for a list of software supporting IPv6; this includes the major service daemons, and corresponding client software such as the popular web browsers and mail readers. Existing software that makes or listens for network connections on IPv4 does not automatically support IPv6, but there is a library routine, getaddrinfo, which can automatically switch between IPv4 and IPv6 as needed, and which is almost a drop-in replacement for gethostbyname.
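
To show what getaddrinfo offers a program, here is a minimal sketch using Python's socket.getaddrinfo, which wraps the C library routine. The helper name address_families is an assumption made for this example; it only reports which address families are offered and does not open any connection.

```python
import socket

def address_families(host, port, flags=0):
    """Return the distinct address families getaddrinfo offers for host, in order."""
    seen = []
    for family, _type, _proto, _canon, _sockaddr in socket.getaddrinfo(
            host, port, type=socket.SOCK_STREAM, flags=flags):
        name = getattr(family, "name", str(family))
        if name not in seen:
            seen.append(name)
    return seen

if __name__ == "__main__":
    # Numeric hosts need no DNS lookup; AI_NUMERICHOST forbids one.
    print(address_families("127.0.0.1", 80, socket.AI_NUMERICHOST))
    print(address_families("::1", 80, socket.AI_NUMERICHOST))
```

A client would normally loop over the returned entries and try to connect to each in turn, so the same code path works whether the peer has an A record, an AAAA record, or both.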

There is a functioning intercontinental backbone, with infrastructure such as IPv6-enabled root nameservers. If the partners in an IP connection can get their packets to the backbone, they can make the connection work.

The IPv6 stack includes a number of technical improvements such as kernel-level tunneling and encryption, and automatic host configuration so that DHCP is no longer necessary.

But the features of IPv6 are not all good.

Selection of a Solution

There are two issues here. First, IPv6 is the future of the Internet. At any time one of our faculty members may have a need to communicate with colleagues on an IPv6-only net, particularly in China, and we will have to scramble to provide the service. It's best to get IPv6 set up and working at the earliest feasible time, in parallel with our IPv4 service.

On the other hand, the future is not here yet, and if we have IPv6-only hosts they will spend most of their time interacting with IPv4-only peers. To make that happen we need a solution functionally equivalent to NAT (possibilities are detailed below). It makes a lot more sense to do the NAT on IPv4. In other words, we should use RFC 1918 addresses to expand our IPv4 address space.

In addition, we still need to deal with equipment which is IPv4-only because it is obsolete or because the designers lacked the imagination we have about the future of IPv6.

So the conclusion is, we need to do both NAT-4 and IPv6.

Reasons to Implement IPv6

What arguments can be given to upper management to induce them to support IPv6?

Adding IPv6 to a Business

Here's a checklist of what you need, to add IPv6 support to your business. I am assuming a dual stack model in which the same servers and network equipment handle both IPv4 and IPv6 traffic. There is no need whatsoever to replace working servers when adding IPv6.

Technical Requirements for UCLA-Mathnet IPv6 Deployment

Action Plan

Overview

We need to move our I.T. infrastructure into the 21st century and hence will deploy IPv6. However, we have an urgent need to expand our address space, and not a lot of confidence in IPv6 to IPv4 NAT-like solutions, and so we will provide IPv4 addresses from RFC 1918 (192.168.x.x and friends) to hosts that don't need a public presence, with a default route through an IPv4 NAT box.

New Router: Harlech

The keystone and single point of failure of this plan is a new machine called Harlech (which will take over the functions of the existing one, in particular, the 128.97.4.254 address for DNS). It has four physical gigabit NICs. It will have one interface on the PSnet, i.e. the "wild side" (which is in fact in RFC 1918 address space), and one feeding into the Cisco using VLANs (802.1q) to establish a direct presence on all five Math-PIC subnets.

Future Redundancy and Migration

Once Harlech is operational we can consider a redundant copy with a slightly higher route metric, which will provide hot failover. When and as feasible, e.g. with upgrades to IOS and the router board, some of these functions can be transferred to the various Cisco boxes.

DHCP: Fixed and RFC 1918 Addresses

Harlech will have a DHCP server (replacing the present one on Windows). This server will pass out two classes of IPv4 addresses. Fixed IPs will be matched up with MAC addresses, and we can consider no longer doing explicit configuration of the IPv4 address on workstations, as Ed has suggested for PIC. Some of these addresses can be RFC 1918 type, e.g. for printers and network switch administrative ports. Further, an inexhaustible pool of RFC 1918 addresses will be available on each subnet for rogue machines.

Routing RFC 1918 Packets

Both the public and RFC 1918 addresses will run on the same subnets. All the devices holding RFC 1918 addresses will have a default route through Harlech, either as a DHCP setting or by explicit configuration. If feasible without a lot of work, each host will have a device route on its main NIC for both the subnet's public and RFC 1918 address range. Hosts for which this is not feasible (e.g. printers) will route packets (that could have gone direct) through Harlech in a mirror mode.

NAT for Off-Site RFC 1918 Packets

Packets sent from RFC 1918 hosts offsite through Harlech will get NAT treatment. This precludes incoming connections (remote login) from the wild side to an RFC 1918 host. To the extent feasible, helper modules included in the Linux kernel will be deployed so protocols will work if they use back connections, e.g. VoIP protocols.

IPv6 Routing

Harlech will route IPv6 packets among the various Math-PIC subnets and the wild side in the normal way for IPv6. Mathnet will obtain an IPv6 prefix for each of our five VLANs. Harlech will advertise itself as the only router through which IPv6 packets can be sent, and will be the default IPv6 route on the Math-PIC subnets.

Host IPv6 Addresses

Per RFC 2464 the Math-PIC hosts will use IPv6 addresses derived from their MAC addresses, which will be made available via DNS if the host is registered. As soon as Harlech sends its first router advertisement they will all do this automatically and become fully operational on IPv6.

Virtual Private Networks

IPSec traffic, both IPv6 and IPv4, would be directed to Harlech's wild side interface. IPv4 traffic would get the NAT treatment on the Mathnet side, so as to attract returning packets back to Harlech rather than to the more preferred default route on the Cisco. Secure IPv6 could go through unmunged, because Harlech holds the default route. OpenVPN will also be handled through Harlech's wild side.

How to Set Up IPv6

Here are my experiences setting up IPv6 on my home server, using it as a testbed for the proposed departmental deployment. See RFC 2373 for the textual format of an IPv6 address.

Tunnel to the Sixbone

The first step was to listen for router advertisements coming from the wild side, presumably from my home ISP (Verizon DSL). The command line was

tcpdump -l -i eth1 ip6
This would get all IPv6 traffic, not just router advertisements. The default interval in Linux's /etc/radvd.conf is every 3 to 10 seconds (varying randomly). Ten minutes of listening revealed . . . Silence. My ISP does not support IPv6. It was time to turn to plan B: connect through a tunnel broker.

However, at work during a 5 minute listening period, three different rogue machines (not under Mathnet's control) were seen to transmit neighbor solicitations and replies, DHCP6 solicitations, and MDNS requests. Possibly these are personal Macintoshes or PDAs. Of course no server answered the DHCP6 or MDNS packets. These were sent to appropriate multicast addresses and at the link level were sent to 33:33:ww:xx:yy:zz where the variable part is the low 32 bits of the multicast address, per RFC 2464. The Cisco switch recognized this as a broadcast-like address.
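
The mapping from a multicast group to its 33:33:ww:xx:yy:zz link-level address can be sketched in a few lines of Python. The function name multicast_mac is an assumption for this illustration; the sample groups are the well-known all-nodes, mDNS and DHCPv6 addresses.

```python
import ipaddress

def multicast_mac(group):
    """Map an IPv6 multicast address to its Ethernet destination, per RFC 2464."""
    addr = ipaddress.IPv6Address(group)
    if not addr.is_multicast:
        raise ValueError(f"{group} is not a multicast address")
    low32 = addr.packed[-4:]  # low-order 32 bits of the group address
    return "33:33:" + ":".join(f"{b:02x}" for b in low32)

if __name__ == "__main__":
    for group in ("ff02::1", "ff02::fb", "ff02::1:2"):
        print(group, "->", multicast_mac(group))
```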

Being located in California (USA), I picked Hurricane Electric, http://tunnelbroker.net/, as my tunnel broker. You need to register; you pick a loginID and they mail back a password of, in my case, 8 decimal digits. Update: starting on 2009-02-06 they use 10 truly random alphanumerics (about 48 bits entropy).

Once registered, you can connect to your tunnel status page and create up to four tunnels. For simplicity choose a host with a fixed IPv4 address (dynamic addresses can also be handled; see below). The tunnel server will need to exchange packets with it using IPv6-in-IPv4 encapsulation (IP protocol number 41, which is what the sit tunnel mode uses), and also, when the tunnel is created the administrative server will want to ping it, so you will need to open a hole in your firewall for these items. The IPv4 addresses of the two servers will be shown when you create the tunnel. (You create the tunnel first, then configure your end so it works.)

You will be assigned a block of 2^64 IPv6 addresses; you can request 2^80 addresses after the tunnel is created. The IPv6 address block will look like 2001:2345:6789:abcd::/64. Your end and their end of the tunnel will be in an adjacent block. Their policy is to give their end of the tunnel the address ending in 1, and yours will end in 2.
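
The block sizes follow directly from the prefix lengths: a /64 leaves 64 host bits (2^64 addresses) and a /48 leaves 80 (2^80 addresses, or 2^16 separate /64 subnets). A small sketch with Python's ipaddress module makes the arithmetic concrete. The /64 is the placeholder prefix from the text, and the /48 uses the 2001:db8::/32 documentation range, which is an assumption made for the example.

```python
import ipaddress

# Placeholder /64 from the text: 64 host bits.
block = ipaddress.ip_network("2001:2345:6789:abcd::/64")
block_size = block.num_addresses            # 2**64

# Hypothetical /48 allotment (documentation prefix): 80 host bits.
site = ipaddress.ip_network("2001:db8:1234::/48")
site_size = site.num_addresses              # 2**80
subnet_count = 2 ** (64 - site.prefixlen)   # number of /64 subnets inside the /48

# List only the first three /64 subnets; there are far too many to enumerate.
first_subnets = [str(n) for _, n in zip(range(3), site.subnets(new_prefix=64))]

if __name__ == "__main__":
    print(block_size, site_size, subnet_count)
    print(first_subnets)
```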

Their nameserver can delegate reverse DNS to your own nameserver(s); this is for PTR records mapping addresses in your block to alphabetic names. It helps a lot if your nameserver has, or will soon have, an IPv6 address and a corresponding AAAA record. You will need to work with your domain registrar to insert AAAA records that map names of your hosts to IPv6 addresses.

Now you need to configure your tunnel endpoint machine. The tunnel status page has a listbox for showing sample configuration procedures on various systems. I'm using the iproute2 tools (ifconfig could also be used), and my procedure was:

modprobe ipv6				# For me, loaded by default
modprobe sit				# For me, NOT loaded by default
ip tunnel add he-ipv6 mode sit remote 72.52.104.74 local 128.97.4.125 ttl 255
ip link set he-ipv6 up
ip addr add 2001:2345:6789:abcd::2/64 dev he-ipv6
ip route add 2000::/3 dev he-ipv6	# Add default route
ip -f inet6 addr show			# Check the IPv6 addresses

At this point, executing on the endpoint machine, you should be able to do:

ping6  2001:2345:6789:abcd::1		# Their end of the tunnel

And you can use a web browser to contact http://ipv6.google.com (offers normal search services) or http://www.kame.net (the logo image is animated if you get and use the AAAA record, or is static on IPv4).

Items to watch out for:

Internal Network Infrastructure

Your next challenge is to set up normal network infrastructure on IPv6 for your internal subnet. This involves address assignment, DNS (domain name service, that is, translating names to addresses), and routing. I'm assuming that the tunnel endpoint is also the server where infrastructure daemons run, although in reality most of the daemons could be on a different machine.

Address Assignment

Unlike with IPv4, it is common for one machine to have several IPv6 addresses at the same time. Most of the addresses are [supposed to be] derived from the MAC address per RFC 2464. In addition, each machine listens to several multicast groups, one of which acts like the IPv4 broadcast address.

There are four variants of address assignment.

Domain Name Service

Your forward DNS map needs to include an AAAA record for each of your machines, which looks like this:

jacinth         IN      AAAA    2001:470:1f05:844::2

Here is a shortened version of my forward DNS map. Frequently a small business uses their domain registrar's nameserver rather than providing their own on delegation from the registrar, in which case they will need to use the registrar's web form to post the address assignment, assuming that the registrar supports IPv6 at all.

PTR records are similar to those for IPv4, though the addressing tree is rather more complex. To convert an IPv6 address to a domain name for the PTR record, start with the hexadecimal representation, put in all omitted zeroes, remove the colons, and reverse the order of the hex digits. Separate them with dots, and append ip6.arpa. (ip6, not ipv6). Here is an example:

>> dig www.kame.net. AAAA
www.kame.net. 86400 IN AAAA 2001:200:0:8002:203:47ff:fea5:3085

>> dig -x 2001:200:0:8002:203:47ff:fea5:3085
5.8.0.3.5.a.e.f.f.f.7.4.3.0.2.0.2.0.0.8.0.0.0.0.0.0.2.0.1.0.0.2.ip6.arpa. \
    86400 IN PTR orange.kame.net.
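
The conversion steps can be sketched in Python as follows. The function name ip6_arpa_name is an assumption for this example; the standard library's ipaddress module also provides a reverse_pointer attribute that yields the same name without the trailing dot.

```python
import ipaddress

def ip6_arpa_name(addr):
    """Build the PTR owner name for an IPv6 address."""
    full = ipaddress.IPv6Address(addr).exploded   # put in all omitted zeroes
    digits = full.replace(":", "")                # remove the colons
    # Reverse the 32 hex digits, separate with dots, append the suffix.
    return ".".join(reversed(digits)) + ".ip6.arpa."

if __name__ == "__main__":
    print(ip6_arpa_name("2001:200:0:8002:203:47ff:fea5:3085"))
```

Run on the www.kame.net address, this prints the same owner name that dig -x shows in the example.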

Here is a shortened version of my reverse DNS map. Hurricane Electric, my tunnel broker, will delegate the reverse map for my address block to my nameserver, though other ISPs may allow or require clients to copy the whole map onto the ISP's server.

If you are going with addresses per RFC 2464, you need to know every host's MAC address. To keep the /etc/ethers file at work up to date, I run this MAC checking script as part of daily housekeeping. We use Sun-style NIS, and we have a local program hostgroup which is used here to spit out the 1-component hostnames of all the servers. Other sites would have to alter the script to fit their practices.

Given /etc/ethers, you can generate DNS maps, but the process is tedious and error-prone. Here is a script to convert ethers to DNS maps.
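
For readers without access to that script, here is a minimal sketch of the forward-map half of such a conversion. It is not the author's script. It assumes each /etc/ethers line has the form "MAC hostname", that the subnet prefix is a /64, and that addresses follow RFC 2464. The function names and the sample MAC in the usage note are hypothetical.

```python
import ipaddress

def eui64_interface_id(mac):
    """Return the 8-byte interface identifier derived from a 48-bit MAC."""
    octets = bytearray(int(part, 16) for part in mac.split(":"))
    if len(octets) != 6:
        raise ValueError(f"not a 48-bit MAC address: {mac}")
    octets[0] ^= 0x02                     # complement the universal/local bit
    return bytes(octets[:3]) + b"\xff\xfe" + bytes(octets[3:])

def aaaa_records(ethers_lines, prefix):
    """Turn 'MAC hostname' lines into AAAA record lines for a /64 prefix."""
    net = ipaddress.IPv6Network(prefix)
    if net.prefixlen != 64:
        raise ValueError("this sketch handles only /64 prefixes")
    records = []
    for line in ethers_lines:
        line = line.split("#", 1)[0].strip()      # drop comments and blanks
        if not line:
            continue
        mac, host = line.split()[:2]
        iid = int.from_bytes(eui64_interface_id(mac), "big")
        addr = ipaddress.IPv6Address(int(net.network_address) + iid)
        records.append(f"{host:<15} IN      AAAA    {addr}")
    return records
```

For instance, aaaa_records(["00:03:47:a5:30:85 orange"], "2001:200:0:8002::/64") yields one record whose address is 2001:200:0:8002:203:47ff:fea5:3085; the MAC here was chosen to reproduce the address shown in the dig example.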

Multicast DNS (mdns) is related to DNS in that the goals, content and packet formats are identical, but the basic philosophy is different. With unicast DNS you have a central server which is authoritative for the names and addresses of all hosts on your net -- or frequently, you don't use DNS at all and fall back to a fixed /etc/hosts file. With multicast DNS, each host knows its name and IP address by other means such as DHCP, and it runs its own mdns responder (server) that can send out the corresponding DNS records. Thus all the hosts' mdns responders are federated together to make a complete DNS server. But the multicast addresses used (IPv4 and IPv6) are link-local, so mdns only works on a single network segment unless there are proxies on the routers. My networks are not suitable for mdns and I will not be setting it up.

Routing

If the router endpoint machine has been configured per instructions, and if the Router Advertisement Daemon is running (q.v. for a sample configuration file), then client machines will autonomously configure themselves to be functional on IPv6. The router itself does not do so (it doesn't have enough information ab initio), but the configuration instructions indicate the correct command to set the interface address.

For production use you will need to automate setting up IPv6 on the router. Here is my network6 startup script for SuSE and similar LSB-type distros such as Red Hat/Fedora; it can serve as a base for hacking on Debian-type distros.

IPv6 to IPv4 Translation

Service providers, such as e-commerce vendors, financial services, web content vendors, and VoIP services, are firmly rooted in the past, that is, in IPv4. As of 2009-08-xx, Google is the only known exception, having given both IPv4 and IPv6 addresses to their primary search site. Thus, until we see real progress among the service providers, our IPv6-only machines will have few network resources to which they can connect with native IPv6. We need a service to translate IPv6 to IPv4.

There are quite a number of issues in translation between IPv4 and IPv6, most of which are irrelevant to us.

Scenario | Remote IPv4 only | Remote IPv6 only
Tunneling our subnets | We will continue to rely on CNS for IPv4 connectivity. | Campus Network Services gives us IPv6 connectivity.
Outsiders connect to our servers | Our servers have IPv4 addresses. No support for remote IPv4 to local IPv6. | Our servers also will have IPv6 addresses; remote client connects natively.
We connect to remote servers | Local IPv6 to remote IPv4 is the case we have to deal with. | Our workstations have dual stack and can connect natively. For personal IPv4-only machines, we rely on the remote server to arrange connectivity and/or on a campus-wide solution (if it materializes).

SIIT (RFC 2765) is a mechanism for converting packet contents in both directions between the IPv6 and IPv4 protocols. Note that Linux's sit pseudo network device, despite the similar name, performs IPv6-in-IPv4 tunneling rather than SIIT translation. The generated IPv4 packet has to come from some address, to which the remote host can send replies, and the RFC explicitly leaves out of scope how this address is going to be acquired, and how the IPv4 packets are going to be transported on an IPv6-only network. SIIT is a building block for a complete protocol translation solution, e.g. NAT on a router.

Our issue is that we cannot get public IPv4 addresses for our expanding population of hosts. If a solution requires the IPv6 host to also have an IPv4 public address, we cannot provide that address. This means that we need a solution analogous to NAT, where a router holds one public IPv4 address that is shared among all the IPv6 hosts. But if we have to do IPv6 NAT, a much more reasonable solution for us is IPv4 NAT, passing out addresses from RFC 1918 (private) address space.

Comparison of Proposals to Replace NAT-PT (Internet Draft). By Wing, Ward, Durand; 2008-09-29. This document discusses a variety of proposals that bridge the IPv4 and IPv6 address spaces, in the context of replacing the NAT-PT proposal which was determined to be inadequate.

The variants relevant to our case are these:

IETF working on making IPv6 and IPv4 talk to each other by Iljitsch van Beijnum in Ars Technica, 2008-10-06.

SIIT (Stateless IP and ICMP Translation) can do the protocol translation, but requires that the IPv6 client have a dedicated IPv4 address, which is a problem when the reason for changing to IPv6 is that the organization cannot get more IPv4 addresses.

NAT-PT means, effectively, to use SIIT from RFC 1918 address space and then to use NAT on the resulting packets. However, if the protocol (like VoIP) includes IP addresses in the payload, it will fail. Also it requires that DNS records be munged to provide a representation of the translated client or server, which is a security problem. Thus NAT-PT did not catch on.

NAT64 is the name of the new scheme. It is like NAT-PT but without the DNS effects.

Appendix A: Summary of RFCs

The links in the list below point to the summaries in this document; summary headlines point to the RFCs themselves.

In this summary I often refer to the 48 bit MAC address as used by IEEE 802 family link-level protocols, specifically Ethernet. IPv6 works over many kinds of links, such as ATM or Token Ring, which have shorter MAC addresses or none at all.

RFC 2460: Internet Protocol, Version 6 (IPv6) Specification

Glossary:

Node
Any device that implements IPv6.
Router
A node that forwards IPv6 packets addressed to other than itself.
Host
A node that is not a router.
Link
A communication medium over which nodes communicate at the OSI layer below IPv6, such as an Ethernet segment or a tunnel.
Interface
A node's attachment to a link.
Neighbors
Nodes attached to the same link.
MTU
Maximum Transmission Unit, upper bound (in octets, i.e. 8-bit bytes) on the size of a packet on a link or sequence of links.

The sending host must send jumbo packets in fragments that fit in the path MTU. Routers do not fragment packets; they drop the packet and send back ICMP6 packet too big. The biggest representable jumbo packet is 2^16 octets long plus the length of the IPv6 header.

RFC 2373: IP Version 6 Addressing Architecture

Textual Format of Addresses: The 128 bit address is represented by 8 hex numbers of 16 bits each separated by colons; leading 0's optional; one segment of all 0's may be replaced by the null string including at the ends, e.g. :: for all 0's. Alternatively the last 32 bits may be written as a dotted quad like an IPv4 address. The RFC does not say this, but in contexts where an alphabetic domain name is expected, an IPv6 address in [square brackets] will usually be recognized.
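
These textual rules can be tried out with Python's standard ipaddress module, which parses any of the equivalent spellings and can print the shortest or the fully expanded form. The sketch below is only an illustration; the sample addresses come from elsewhere in this document or from the 2001:db8::/32 and 192.0.2.0/24 documentation ranges, and the URL is hypothetical.

```python
import ipaddress
from urllib.parse import urlsplit

addr = ipaddress.IPv6Address("2001:0200:0000:8002:0203:47ff:fea5:3085")
short = addr.compressed     # leading zeroes dropped
full = addr.exploded        # all eight 4-digit groups written out

# A run of all-zero groups collapses to the null string between colons.
all_zero = ipaddress.IPv6Address("0:0:0:0:0:0:0:0").compressed

# The last 32 bits may be written as a dotted quad.
mapped = ipaddress.IPv6Address("::ffff:192.0.2.1")
same = ipaddress.IPv6Address("::ffff:c000:201")

# Square brackets let an address stand where a host name is expected.
url = urlsplit("http://[2001:db8::1]:8080/index.html")

if __name__ == "__main__":
    print(short, full, all_zero, mapped == same, url.hostname, url.port)
```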

CIDR: A network address range is represented by a prefix of a specific number of leading bits. A prefix is represented textually as an IPv6 address, slash, and the number of bits in decimal. The excluded bits need not be 0 and will be cleared when needed. For example: fec0::/10 means the first 10 bits of that address (of which all but 1 are 1 bits).
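
The prefix notation can be checked with a short Python sketch. It shows the ten leading bits of fec0::/10, the clearing of excluded bits, and a membership test. The sample addresses are arbitrary choices for the illustration.

```python
import ipaddress

net = ipaddress.ip_network("fec0::/10")

# The ten leading bits of the prefix, as a binary string.
first10 = format(int(net.network_address) >> (128 - net.prefixlen), "010b")

# Excluded bits need not be 0 on input; strict=False clears them.
cleared = ipaddress.ip_network("fec0::1234/10", strict=False)

# Membership depends only on the leading 10 bits.
inside = ipaddress.ip_address("feff::1") in net
outside = ipaddress.ip_address("fe80::1") in net

if __name__ == "__main__":
    print(first10, cleared, inside, outside)
```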

What an Address Represents: It refers to an interface, e.g. a specific Ethernet or wireless transceiver. One interface usually has multiple addresses. The same address(es) may be assigned to multiple interfaces if they are functionally equivalent at the internet layer, that is, on one host and a packet sent to any interface will be similarly acted upon or responded to.

Assigned Address Ranges. In most addresses the lower 64 bits identify the host and are derived from the MAC address. In the table some minor items are omitted.

Address Range | Purpose
2000::/3 | Aggregatable Global Unicast Addresses -- assigned hierarchically (see RFC 2374) for scalable routing tables.
fe80::/10 | Link-Local Unicast Addresses -- to be used on one link (subnet) for autoconfiguration or neighbor discovery.
fec0::/10 | Site-Local Unicast Addresses -- Use as a 48 bit prefix, then site subnet ID (16 bits), then interface ID. For use within a site but may not be sent globally.
ff00::/8 | Multicast Addresses
::/96 + ipv4 | IPv4 compatible address (6-in-4 tunnels per RFC 1933)
::ffff:0:0/96 + ipv4 | IPv4 mapped address (IPv4-only hosts per RFC 1933)

Recognized Addresses: A router recognizes its subnet prefix with the rest of the bits all 0, and also the all routers multicast address of ff0S::2 (S = 2 for link scope, i.e. all router(s) on the subnet). All nodes recognize the all nodes multicast address of ff0S::1, which has the same use as IPv4's broadcast address. They also recognize their link local and global unicast addresses, and the loopback address of ::1 (within the node only). Every node recognizes the Solicited-Node Multicast Address for each of its unicast addresses, which is ff02:0:0:0:0:1:ff00::/104 followed by the last 24 bits of the unicast address. This is used for Neighbor Solicitation (equivalent to ARP) and some other multicast protocols.
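
The Solicited-Node Multicast Address is simple to compute: keep the last 24 bits of the unicast address and place them under the ff02::1:ff00:0/104 prefix. A minimal Python sketch follows; the function name solicited_node is an assumption for this example.

```python
import ipaddress

SOLICITED_NODE_BASE = int(ipaddress.IPv6Address("ff02::1:ff00:0"))

def solicited_node(unicast):
    """Return the Solicited-Node Multicast Address for a unicast address."""
    low24 = int(ipaddress.IPv6Address(unicast)) & 0xFFFFFF  # last 24 bits
    return ipaddress.IPv6Address(SOLICITED_NODE_BASE | low24)

if __name__ == "__main__":
    print(solicited_node("2001:200:0:8002:203:47ff:fea5:3085"))
```

Because the last 24 bits normally come from the interface identifier, a host's link-local and global addresses usually share one solicited-node group.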

RFC 2464: Transmission of IPv6 Packets over Ethernet Networks

Almost all unicast addresses are [supposed to be] obtained by prepending a 64 bit prefix to a 64 bit Interface Identifier in modified EUI-64 format. This is obtained deterministically as follows: The factory-assigned MAC address is modified (must not use a MAC address altered by software). 0xfffe is squeezed in after the 3rd byte, and the first byte of the MAC address has the 0x02 bit (the universal/local bit) complemented, i.e. xor with 0x02. This bit is 1 in the Interface Identifier (0 in the MAC) if the MAC address is universally administered, i.e. guaranteed to be globally unique; the opposite polarity is used e.g. for an ad-hoc value for a tunnel or virtual machine (and the following 0x01 bit would normally be 0 in all cases). In a large sample of MAC addresses, under 1% have the 0x02 bit set in the MAC itself, i.e. nearly all claim to be globally unique.
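
The derivation can be written out as a short Python sketch. The function name eui64_from_mac and the sample MAC are assumptions for illustration; the MAC was picked so that the result matches the low 64 bits of the www.kame.net address quoted earlier.

```python
def eui64_from_mac(mac):
    """Derive the 64 bit Interface Identifier from a 48 bit MAC address."""
    octets = [int(part, 16) for part in mac.split(":")]
    if len(octets) != 6:
        raise ValueError(f"not a 48-bit MAC address: {mac}")
    octets[0] ^= 0x02                                # xor first byte with 0x02
    eui = octets[:3] + [0xFF, 0xFE] + octets[3:]     # squeeze in 0xfffe
    # Format as four 16 bit groups, the way they appear in an IPv6 address.
    return ":".join(f"{eui[i]:02x}{eui[i + 1]:02x}" for i in range(0, 8, 2))

if __name__ == "__main__":
    print(eui64_from_mac("00:03:47:a5:30:85"))
```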

The default link MTU is 1500 octets. This may be set higher manually or by DHCP, or lower (never higher) by Router Advertisement. The minimum allowed MTU is 1280 octets (may have been revised later to 540 octets); links incapable of this MTU must provide link-level fragmentation that IPv6 does not see.

RFC 2461: Neighbor Discovery for IP Version 6 (IPv6)

Neighbor discovery refers to the process by which a node discovers:

Router Solicitation

When an interface comes up, the host may ask routers to advertise themselves immediately.

Router Advertisement

Routers send this information in response to Router Solicitations, and they broadcast it periodically, sending it to the link scope all-nodes multicast group.

Neighbor Solicitation

A node requests the MAC address of a neighbor (or detects a duplicate address or a dead neighbor). This has the same function as ARP in IPv4 but is part of IPv6, not a separate protocol. The Solicited Node Multicast Address is used, and since this is derived from the (known) unicast address which is [supposed to be] normally derived from the (unknown) MAC address, it will almost always be unique on the link.

Neighbor Advertisement

The neighbor responds to Neighbor Solicitation.

Redirect

A router, having received a packet that it will have to forward to a different router on the same link, sends this message to tell the sender to send further packets directly.

RFC 2462: IPv6 Stateless Address Autoconfiguration

IPv6 is designed to minimize manual configuration of addresses on hosts. A host can determine its own address(es) and participate in the complete IPv6 protocol using only local information (MAC address) and router advertisements. In sites with only one link segment and no connection to the global internet, hosts can connect to each other with no servers or routers at all.

Each host creates the following addresses (subject to duplicate address checks). Most are obtained by joining a 64-bit prefix to a 64-bit Interface Identifier (see RFC 2464 for how this is generated from the MAC address).

Link Local

Prefix is fe80::/64.

Stateless

The host sends a Router Solicitation and receives a Router Advertisement, which may list prefix(es) suitable for stateless addressing. Each of these (usually just one) is prefixed to the Interface Identifier.

Stateful (DHCP)

The DHCP server can send a fixed address based on a host identifier, or can allocate an aleatory address from a pool. The Router Advertisement has a flag that can suppress DHCP configuration (on links that have no DHCP server), but in the absence of Router Advertisements the host should attempt DHCP.
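
For the link-local and stateless cases, forming the address amounts to joining a 64-bit prefix to the Interface Identifier. Here is a minimal Python sketch, assuming the identifier is already known (written as four hex groups) and that the advertised prefix is a /64. The function name join_prefix and the sample values are assumptions for the example.

```python
import ipaddress

def join_prefix(prefix, interface_id):
    """Join a /64 prefix to a 64-bit Interface Identifier such as '0203:47ff:fea5:3085'."""
    net = ipaddress.IPv6Network(prefix)
    if net.prefixlen != 64:
        raise ValueError("this sketch handles only /64 prefixes")
    iid = int(ipaddress.IPv6Address("::" + interface_id))  # low 64 bits
    return ipaddress.IPv6Address(int(net.network_address) | iid)

if __name__ == "__main__":
    iid = "0203:47ff:fea5:3085"
    print(join_prefix("fe80::/64", iid))              # link-local address
    print(join_prefix("2001:200:0:8002::/64", iid))   # stateless global address
```

A real host would also run duplicate address detection on each result before using it, which this sketch does not attempt.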

Appendix B: Linux Helper Modules

NAT on Linux requires helper modules for non-simple protocols. It appears that the following protocols (IPv4) have helpers in the mainline kernel, as of 2.6.22 in OpenSuSE 10.3. Other protocols may be supported by modules that are available separately.

Appendix C: Programs Supporting IPv6

Here is the URL of the official IPv6 support list.

In this post, a commenter gives pointers about getaddrinfo which can automatically select between IPv4 and IPv6 addresses according to which are available and which ones the particular client host can deal with. Existing IPv4 applications are not able to support IPv6, but getaddrinfo is nearly a drop-in replacement for gethostbyname.

Here is a list of supported programs that are used at Mathnet. Programs irrelevant at Mathnet are omitted. Recently upgraded programs may not have made it onto the list.

Application Servers

Infrastructure Servers

Web and Mail Clients

Media Players

(Can play from an IPv6 URL)

Other Network Software

Miscellaneous

Revision History: