src - OpenBSD base system

author	Jun-ichiro itojun Hagino <itojun@cvs.openbsd.org>	2000-02-09 12:29:29 +0000
committer	Jun-ichiro itojun Hagino <itojun@cvs.openbsd.org>	2000-02-09 12:29:29 +0000
commit	6fb0e93d02a7fddce1c4789b895a805732412008 (patch)
tree	f879deb849837380cd5383f1a0a94abbea959f14
parent	f1dd7c2f67784cc38698446bb1dfbf5018879596 (diff)

bring in the latest document to sync with reality.

Diffstat

-rw-r--r--  sys/netinet6/IMPLEMENTATION  715

1 files changed, 304 insertions, 411 deletions

diff --git a/sys/netinet6/IMPLEMENTATION b/sys/netinet6/IMPLEMENTATION
index c3b252ff87b..208adf50cf3 100644
--- a/sys/netinet6/IMPLEMENTATION
+++ b/sys/netinet6/IMPLEMENTATION

@@ -1,4 +1,4 @@

-$OpenBSD: IMPLEMENTATION,v 1.2 1999/12/20 08:26:32 itojun Exp $

+$OpenBSD: IMPLEMENTATION,v 1.3 2000/02/09 12:29:28 itojun Exp $

# NOTE: this is from original KAME distribution.

# Some portion of this document is not applicable to the code merged into

@@ -8,7 +8,7 @@ $OpenBSD: IMPLEMENTATION,v 1.2 1999/12/20 08:26:32 itojun Exp $

KAME Project

http://www.kame.net/

- KAME Date: 1999/12/20 08:23:13

+ KAME Date: 2000/02/08 15:29:50

1. IPv6

@@ -20,7 +20,7 @@ below (NOTE: this is not a complete list - this is too hard to maintain...).

For details please refer to specific chapter in the document, RFCs, manpages

come with KAME, or comments in the source code.

-Conformance tests have been performed on the KAME STABLE kit

+Conformance tests have been performed on past and latest KAME STABLE kit,

at TAHI project. Results can be viewed at http://www.tahi.org/report/KAME/.

We also attended Univ. of New Hampshire IOL tests (http://www.iol.unh.edu/)

in the past, with our past snapshots.

@@ -88,6 +88,9 @@ RFC2675: IPv6 Jumbograms

* See 1.7 in this document for details.

RFC2710: Multicast Listener Discovery for IPv6

RFC2711: IPv6 router alert option

+RFC2732: Format for Literal IPv6 Addresses in URL's

+ * The spec is implemented in programs that handle URLs

+ (like freebsd ftpio(3) and fetch(1), or netbsd ftp(1))

draft-ietf-ipngwg-router-renum-08: Router renumbering for IPv6

draft-ietf-ipngwg-icmp-namelookups-02: IPv6 Name Lookups Through ICMP

draft-ietf-ipngwg-icmp-name-lookups-03: IPv6 Name Lookups Through ICMP

@@ -103,6 +106,10 @@ draft-yamamoto-wideipv6-comm-model-00

* See 1.6 in this document for details.

draft-ietf-ipngwg-scopedaddr-format-00.txt:

An Extension of Format for IPv6 Scoped Addresses

+draft-ietf-ngtrans-tcpudp-relay-00.txt:

+ An IPv6-to-IPv4 transport relay translator

+ * FAITH tcp relay translator (faithd) implements this. See 3.1 for more

+ details.

1.2 Neighbor Discovery

@@ -112,13 +119,22 @@ are supported. In the near future we will be adding Proxy Neighbor

Advertisement support in the kernel and Unsolicited Neighbor Advertisement

transmission command as admin tool.

+Duplicate Address Detection (DAD) will be performed when an IPv6 address

+is assigned to a network interface, or the network interface is enabled

+(ifconfig up). It is documented in RFC2462 5.4.

If DAD fails, the address will be marked "duplicated" and message will be

generated to syslog (and usually to console). The "duplicated" mark

can be checked with ifconfig. It is administrators' responsibility to check

-for and recover from DAD failures.

-The behavior should be improved in the near future.

-Some of the network driver loops multicast packets back to itself,

+for and recover from DAD failures. We may try to improve failure recovery

+in future KAME code.

+DAD procedure may not be effective on certain network interfaces/drivers.

+If a network driver needs long initialization time (with wireless network

+interfaces this situation is common), and the driver mistakenly raises

+IFF_RUNNING before the driver becomes ready, DAD code will try to transmit

+DAD probes to not-really-ready network driver and the packet will not go out

+from the interface. In such cases, network drivers should be corrected.

+Some of network drivers loop multicast packets back to themselves,

even if instructed not to do so (especially in promiscuous mode).

In such cases DAD may fail, because DAD engine sees inbound NS packet

(actually from the node itself) and considers it as a sign of duplicate.

@@ -137,10 +153,11 @@ list. For more details, see the comments in the source code and email

thread started from (IPng 7155), dated Feb 6 1999.

IPv6 on-link determination rule (RFC2461) is quite different from assumptions

-in BSD network code. At this moment, KAME does not implement on-link

-determination rule when default router list is empty (RFC2461, section 5.2,

-last sentence in 2nd paragraph - note that the spec misuse the word "host"

-and "node" in several places in the section).

+in BSD IPv4 network code. To implement behavior in RFC2461 section 5.2

+(when default router list is empty), the kernel needs to know the default

+outgoing interface. To configure the default outgoing interface, use

+commands like "ndp -I de0" as root. Note that the spec misuses the words

+"host" and "node" in several places in the section.

To avoid possible DoS attacks and infinite loops, KAME stack will accept

only 10 options on ND packet. Therefore, if you have 20 prefix options

@@ -151,30 +168,45 @@ provide sysctl knob for the variable.

1.3 Scope Index

-IPv6 uses scoped addresses. Therefore, it is very important to

+IPv6 uses scoped addresses. It is therefore very important to

specify scope index (interface index for link-local address, or

site index for site-local address) with an IPv6 address. Without

-scope index, scoped IPv6 address is ambiguous to the kernel, and

-kernel will not be able to determine the outbound interface for a

-packet.

-Ordinary userland applications should use advanced API (RFC2292) to

-specify scope index, or interface index. For similar purpose,

-sin6_scope_id member in sockaddr_in6 structure is defined in RFC2553.

-However, the semantics for sin6_scope_id is rather vague. If you

-care about portability of your application, we suggest you to use

-advanced API rather than sin6_scope_id.

-In the kernel, an interface index for link-local scoped address is

-embedded into 2nd 16bit-word (3rd and 4th byte) in IPv6 address.

+scope index, a scoped IPv6 address is ambiguous to the kernel, and

+the kernel will not be able to determine the outbound interface for a

+packet. KAME code tries to address the issue in several ways.

+Site-local address is very vaguely defined in the specs, and both specification

+and KAME code need tons of improvements to enable its actual use.

+For example, it is still very unclear how we define a site, or how we resolve

+hostnames in a site. There is work underway to define behavior of routers

+at site border, however, we have almost no code for site boundary node support

+(for either forwarding or routing) and we bet almost no one has.

+We recommend, at this moment, that you use global addresses for experiments -

+there are way too many pitfalls if you use site-local addresses.

+1.3.1 Kernel internal

+In the kernel, the interface index for a link-local scope address is

+embedded into the 2nd 16bit-word (the 3rd and 4th bytes) in the IPv6

+address.

For example, you may see something like:

fe80:1::200:f8ff:fe01:6317

in the routing table and interface address structure (struct

-in6_ifaddr). The address above is a link-local unicast address

+in6_ifaddr). The address above is a link-local unicast address

which belongs to a network interface whose interface identifier is 1.

The embedded index enables us to identify IPv6 link local

addresses over multiple interfaces effectively and with only a

little code change.
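
As an illustration of the embedding described above, here is a small userland sketch. It is not the KAME kernel code and not part of this commit. The helper names `embed_link_scope` and `embedded_form` are hypothetical, and the sketch assumes the interface index fits in 16 bits and is stored in network byte order in the 3rd and 4th bytes.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

/*
 * Hypothetical helper (not a KAME function): write a 16-bit interface
 * index into the 3rd and 4th bytes of a link-local unicast address,
 * mimicking the kernel-internal "embedded" form.
 * Returns 0 on success, -1 if the address is not link-local unicast.
 */
static int
embed_link_scope(struct in6_addr *addr, uint16_t ifindex)
{
	if (!IN6_IS_ADDR_LINKLOCAL(addr))
		return -1;
	addr->s6_addr[2] = (uint8_t)(ifindex >> 8);
	addr->s6_addr[3] = (uint8_t)(ifindex & 0xff);
	return 0;
}

/* Parse text, embed the index, and format the result (for display only). */
static const char *
embedded_form(const char *text, uint16_t ifindex, char *buf, socklen_t len)
{
	struct in6_addr addr;

	if (inet_pton(AF_INET6, text, &addr) != 1)
		return NULL;
	if (embed_link_scope(&addr, ifindex) != 0)
		return NULL;
	return inet_ntop(AF_INET6, &addr, buf, len);
}
```

With interface index 1, the standard form fe80::200:f8ff:fe01:6317 becomes the embedded form quoted in the text. The embedded form is kernel-internal; ordinary applications should not build or pass it.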

+1.3.2 Interaction with API

+Ordinary userland applications should use the advanced API (RFC2292)

+to specify scope index, or interface index. For a similar purpose,

+the sin6_scope_id member in the sockaddr_in6 structure is defined in

+RFC2553. However, the semantics for sin6_scope_id is rather vague.

+If you care about portability of your application, we suggest that you

+use the advanced API rather than sin6_scope_id.

Routing daemons and configuration programs, like route6d and

ifconfig, will need to manipulate the "embedded" scope index.

These programs use routing sockets and ioctls (like SIOCGIFADDR_IN6)

@@ -183,6 +215,26 @@ filled in. The APIs are for manipulating kernel internal structure.

Programs that use these APIs have to be prepared about differences

in kernels anyway.

+getaddrinfo(3) and getnameinfo(3) are modified to support extended numeric

+IPv6 syntax, as documented in draft-ietf-ipngwg-scopedaddr-format-00.txt.

+You can specify outgoing link, by using name of the outgoing interface

+like "ne0%fe80::1". This way you will be able to specify link-local scoped

+address without much trouble.

+To use this extension in your program, you'll need to use getaddrinfo(3),

+and getnameinfo(3) with NI_WITHSCOPEID.

+The implementation currently assumes 1-to-1 relationship between a link and an

+interface, which is stronger than what IPv6 specs say.

+Other APIs like inet_pton(3) or getipnodebyname(3) are inherently unfriendly

+with scoped addresses, since they are unable to annotate addresses with

+scope identifier.
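
A minimal sketch of how getaddrinfo(3) can carry a scope identifier into sockaddr_in6 (an illustration, not part of this commit). The helper name `parse_scoped` is hypothetical. The sketch assumes a present-day getaddrinfo(3) that accepts the later standardized `address%zone` order, which differs from the draft-00 order quoted in this document, and that also accepts a numeric zone.

```c
#include <netdb.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/*
 * Hypothetical helper: parse a numeric, possibly scoped, IPv6 address
 * into *out (including sin6_scope_id) without any DNS lookup.
 * Returns 0 on success, -1 on error.
 */
static int
parse_scoped(const char *text, struct sockaddr_in6 *out)
{
	struct addrinfo hints, *res;

	memset(&hints, 0, sizeof(hints));
	hints.ai_family = AF_INET6;
	hints.ai_socktype = SOCK_DGRAM;
	hints.ai_flags = AI_NUMERICHOST;	/* numeric input only */
	if (getaddrinfo(text, NULL, &hints, &res) != 0)
		return -1;
	memcpy(out, res->ai_addr, sizeof(*out));
	freeaddrinfo(res);
	return 0;
}
```

inet_pton(3) has no place to return the zone, which is why the text calls such APIs unfriendly to scoped addresses.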

+1.3.3 Interaction with users (command line)

+Some of the userland tools support extended numeric IPv6 syntax, as

+documented in draft-ietf-ipngwg-scopedaddr-format-00.txt. In this case,

+you can specify outgoing link, by using name of the outgoing interface like

+"ne0%fe80::1".

When you specify scoped address to the command line, NEVER write the

embedded form (such as ff02:1::1 or fe80:2::fedc). This is not supposed

to work. Always use standard form, like ff02::1 or fe80::fedc, with

@@ -192,15 +244,14 @@ outgoing interface, that command is not ready to accept scoped address.

This may seem to be opposite from IPv6's premise to support "dentist office"

situation. We believe that specifications need some improvements for this.

-Some of the userland tools support extended numeric IPv6 syntax, as

-documented in draft-ietf-ipngwg-scopedaddr-format-00.txt. You can specify

-outgoing link, by using name of the outgoing interface like "fe80::1@ne0".

-This way you will be able to specify link-local scoped address without much

-trouble.

-To use this extension in your program, you'll need to use getaddrinfo(3),

-and getnameinfo(3) with NI_WITHSCOPEID.

-The implementation currently assumes 1-to-1 relationship between a link and an

-interface, which is stronger than what specs say.

+The only exception to the above rule would be when you configure routing table

+manually by route(8). Gateway portion of IPv6 routing entry must be a

+link-local address (otherwise ICMPv6 redirect will not work), and in this

+case you'll need to configure it by putting interface index into the address:

+ # route add -inet6 default fe80:2::9876:5432:1234:5678

+ (when interface index for outgoing interface = 2)

+To avoid configuration mistakes, we suggest that you run dynamic routing instead

+(like route6d(8)).

1.4 Plug and Play

@@ -214,7 +265,7 @@ userland.

1.4.1 Assignment of link-local, and special addresses

-IPv6 link-local address is generated from IEEE802 adddress (ethernet MAC

+IPv6 link-local address is generated from IEEE802 address (ethernet MAC

address). Each of interface is assigned an IPv6 link-local address

automatically, when the interface becomes up (IFF_UP). Also, direct route

for the link-local address is added to routing table.

@@ -223,8 +274,8 @@ Here is an output of netstat command:

Internet6:

Destination Gateway Flags Netif Expire

-fe80:1::/64 link#1 UC ed0

-fe80:2::/64 link#2 UC ep0

+ed0%fe80::/64 link#1 UC ed0

+ep0%fe80::/64 link#2 UC ep0

Interfaces that has no IEEE802 address (pseudo interfaces like tunnel

interfaces, or ppp interfaces) will borrow IEEE802 address from other

@@ -253,6 +304,14 @@ routers and hosts. Routers forward packets addressed to others, hosts does

not forward the packets. net.inet6.ip6.forwarding defines whether this

node is router or host (router if it is 1, host if it is 0).

+It is NOT recommended to change net.inet6.ip6.forwarding while the node

+is in operation. IPv6 specification defines behavior for "host" and "router"

+quite differently, and switching from one to another can cause serious

+troubles. It is recommended to configure the variable at bootstrap time only.

+The first step in stateless address configuration is Duplicate Address

+Detection (DAD). See 1.2 for more detail on DAD.

When a host hears Router Advertisement from the router, a host may

autoconfigure itself by stateless address autoconfiguration.

This behavior can be controlled by net.inet6.ip6.accept_rtadv

@@ -330,7 +389,7 @@ gif can be configured to be ECN-friendly. See 4.5 for ECN-friendliness

of tunnels, and gif(4) manpage for how to configure.

If you would like to configure an IPv4-in-IPv6 tunnel with gif interface,

-read gif(4) carefully. You will need to remove IPv6 link-local address

+read gif(4) carefully. You may need to remove IPv6 link-local address

automatically assigned to the gif interface.

1.6 Source Address Selection

@@ -375,10 +434,10 @@ the spec rather than the above longest-match rule.

For new connections (when rule 1 does not apply), deprecated addresses

(addresses with preferred lifetime = 0) will not be chosen as source address

-if other choises are available. If no other choices are available,

+if other choices are available. If no other choices are available,

deprecated address will be used as a last resort. If there are multiple

choice of deprecated addresses, the above scope rule will be used to choose

-from those deprecated addreses. If you would like to prohibit the use

+from those deprecated addresses. If you would like to prohibit the use

of deprecated address for some reason, configure net.inet6.ip6.use_deprecated

to 0. The issue related to deprecated address is described in RFC2462 5.5.4

(NOTE: there is some debate underway in IETF ipngwg on how to use

@@ -548,159 +607,188 @@ address. These extensions have thus not been implemented in KAME.

RFC2553 describes IPv4 mapped address (3.7) and special behavior

of IPv6 wildcard bind socket (3.8). The spec allows you to:

+- Accept IPv4 connections by AF_INET6 wildcard bind socket.

- Transmit IPv4 packet over AF_INET6 socket by using special form of

the address like ::ffff:10.1.1.1.

-- Accept IPv4 connections by AF_INET6 wildcard bind socket.

but the spec itself is very complicated and does not specify how the

socket layer should behave.

-We KAME team have 4 OS platforms right now, and behavior is slightly

-different between them. To summarize:

-- All KAME implementations treat tcp/udp port number space separately

- between IPv4 and IPv6.

-- KAME/BSDI3, KAME/OpenBSD and KAME/FreeBSD228 does not support IPv4 mapped

- address, nor special wildcard bind on AF_INET6.

-- KAME/FreeBSD3x supports IPv4 mapped address, and special wildcard bind on

- AF_INET6. It is enabled by default. You can disable those two by runtime

- and kernel compile configuration.

- (you can't enable only one of them: they come together)

-- KAME/NetBSD supports both. This is always enabled.

-- KAME/BSDI4 supports both. This is always enabled.

-The following sections will give you the details, and how you can

+Here we call the former one "listening side" and the latter one "initiating

+side", for reference purposes.

+Almost all KAME implementations treat tcp/udp port number space separately

+between IPv4 and IPv6. You can perform wildcard bind on both of the address

+families, on the same port.

+There are some OS-platform differences in KAME code, as we use tcp/udp

+code from different origin. The following table summarizes the behavior.

+                    listening side           initiating side
+                    (AF_INET6 wildcard       (connection to ::ffff:10.1.1.1)
+                    socket gets IPv4 conn.)
+                    ---                      ---
+KAME/BSDI3          not supported            not supported
+KAME/FreeBSD228     not supported            not supported
+KAME/FreeBSD3x      configurable             supported
+                    default: enabled
+KAME/NetBSD         configurable             supported
+                    default: disabled
+KAME/BSDI4          enabled                  supported (*)
+KAME/OpenBSD        not supported            not supported

+(*) on KAME/BSDI4, port number space is not always separated.

+The following sections will give you more details, and how you can

configure the behavior.

-Advise to application implementers: to implement a portable IPv6 application

-(which works on multiple IPv6 kernels), we believe that the following

-is the key to the success:

-- NEVER hardcode AF_INET nor AF_INET6.

-- Use getaddrinfo() and getnameinfo() throughout the system.

- Never use gethostby*(), getaddrby*(), inet_*() or getipnodeby*().

-- If you would like to listen to connections, use getaddrinfo() (maybe

- with AI_PASSIVE), and make sockets for all the "struct addrinfo" returned.

-- If you would like to connect to destination, use getaddrinfo() and try

- all the destination returned, like telnet does.

-- Some of the IPv6 stack is shipped with buggy getaddrinfo(). Ship a minimal

- working version with your application and use that as last resort.

-- Try to avoid use of IPv4 mapped address (waiting for IPv4 connection on

- AF_INET6 socket). Listen to both AF_INET socket and AF_INET6 socket,

- if you support both address families.

+Comments on listening side:

It looks that RFC2553 talks too little on wildcard bind issue,

especially on the port space issue, failure mode and relationship

between AF_INET/INET6 wildcard bind. There can be several separate

-interpretation for this RFC (see 1.12.2 - we have two different

-implementation for this, and RFC2553 seems to fit to both of them).

+interpretations of this RFC which conform to it but behave differently.

So, to implement portable application you should assume nothing

about the behavior in the kernel. Using getaddrinfo() is the safest way.

Port number space and wildcard bind issues were discussed in detail

on ipv6imp mailing list, in mid March 1999 and it looks that there's

no concrete consensus (means, up to implementers). You may want to

check the mailing list archives.

We supply a tool called "bindtest" that explores the behavior of

kernel bind(2). The tool will not be compiled by default.

-1.12.1 KAME/BSDI3 and KAME/FreeBSD228

+If a server application would like to accept IPv4 and IPv6 connections,

+it should use AF_INET and AF_INET6 socket (you'll need two sockets).

+Use getaddrinfo() with AI_PASSIVE into ai_flags, and socket(2) and bind(2)

+to all the addresses returned.

+By opening multiple sockets, you can accept connections onto the socket with

+proper address family. IPv4 connections will be accepted by AF_INET socket,

+and IPv6 connections will be accepted by AF_INET6 socket (NOTE: KAME/BSDI4

+kernel sometimes violates this - we will fix it).
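
The two-socket approach described above could look roughly like the following sketch (an illustration, not part of this commit). The helper name `open_listeners` is hypothetical, the backlog of 5 is arbitrary, and the sketch is limited to TCP.

```c
#include <netdb.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: open one listening TCP socket per address
 * returned by getaddrinfo() with AI_PASSIVE.  Stores up to maxfds
 * descriptors in fds[] and returns how many were opened, or -1 if
 * getaddrinfo() itself failed.
 */
static int
open_listeners(const char *service, int *fds, int maxfds)
{
	struct addrinfo hints, *res, *ai;
	int n = 0;

	memset(&hints, 0, sizeof(hints));
	hints.ai_family = AF_UNSPEC;	/* do not hardcode AF_INET/AF_INET6 */
	hints.ai_socktype = SOCK_STREAM;
	hints.ai_flags = AI_PASSIVE;
	if (getaddrinfo(NULL, service, &hints, &res) != 0)
		return -1;

	for (ai = res; ai != NULL && n < maxfds; ai = ai->ai_next) {
		int fd = socket(ai->ai_family, ai->ai_socktype, ai->ai_protocol);

		if (fd < 0)
			continue;	/* address family not available */
		if (bind(fd, ai->ai_addr, ai->ai_addrlen) != 0 ||
		    listen(fd, 5) != 0) {
			close(fd);	/* e.g. port already held via the other AF */
			continue;
		}
		fds[n++] = fd;
	}
	freeaddrinfo(res);
	return n;
}
```

The caller then waits on all returned descriptors with select(2) or poll(2). A bind(2) failure on one address family is skipped rather than treated as fatal, since kernels differ on whether the port space is shared.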

+If you try to support IPv6 traffic only and would like to reject IPv4

+traffic, always check the peer address when a connection is made toward

+AF_INET6 listening socket. If the address is IPv4 mapped address, you may

+want to reject the connection. You can check the condition by using

+IN6_IS_ADDR_V4MAPPED() macro. This is one of the reasons the author of

+the section (itojun) dislikes special behavior of AF_INET6 wildcard bind.
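
The peer address check mentioned above might be sketched as follows (an illustration, not part of this commit). The helper name `peer_is_v4mapped` is hypothetical; IN6_IS_ADDR_V4MAPPED() is the macro named in the text.

```c
#include <netinet/in.h>
#include <sys/socket.h>

/*
 * Hypothetical helper: return 1 if the peer address of a connection
 * accepted on an AF_INET6 socket is an IPv4 mapped address
 * (::ffff:a.b.c.d), 0 otherwise.
 */
static int
peer_is_v4mapped(const struct sockaddr *sa)
{
	const struct sockaddr_in6 *sin6;

	if (sa->sa_family != AF_INET6)
		return 0;
	sin6 = (const struct sockaddr_in6 *)sa;
	return IN6_IS_ADDR_V4MAPPED(&sin6->sin6_addr) ? 1 : 0;
}
```

An IPv6-only server would call this on the address filled in by accept(2) and close the connection when it returns 1.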

+Comments on initiating side:

-The platform do not support IPv4 mapped address.

-The IPv4 mapped address support needs tweaked implementation in

-DNS support libraries, as documented in RFC2553 6.1. However, since

-the platforms do not support this, you do not need to worry about

-RFC2553 6.1 and story goes much simpler.

-(KAME library actually implements the tweaks, but it is safe to ignore that)

+Advice to application implementers: to implement a portable IPv6 application

+(which works on multiple IPv6 kernels), we believe that the following

+is the key to the success:

+- NEVER hardcode AF_INET nor AF_INET6.

+- Use getaddrinfo() and getnameinfo() throughout the system.

+ Never use gethostby*(), getaddrby*(), inet_*() or getipnodeby*().

+- If you would like to connect to destination, use getaddrinfo() and try

+ all the destination returned, like telnet does.

+- Some IPv6 stacks are shipped with a buggy getaddrinfo(). Ship a minimal

+ working version with your application and use that as last resort.
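
The "try all the destinations returned" advice above can be sketched like this (an illustration, not part of this commit). The helper name `connect_any` is hypothetical, and the sketch is limited to TCP with no timeout handling.

```c
#include <netdb.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: resolve host/service with getaddrinfo() and try
 * each returned address in order until one connect(2) succeeds.
 * Returns a connected descriptor, or -1 if every attempt failed.
 */
static int
connect_any(const char *host, const char *service)
{
	struct addrinfo hints, *res, *ai;
	int fd = -1;

	memset(&hints, 0, sizeof(hints));
	hints.ai_family = AF_UNSPEC;	/* accept IPv4 and IPv6 results */
	hints.ai_socktype = SOCK_STREAM;
	if (getaddrinfo(host, service, &hints, &res) != 0)
		return -1;

	for (ai = res; ai != NULL; ai = ai->ai_next) {
		fd = socket(ai->ai_family, ai->ai_socktype, ai->ai_protocol);
		if (fd < 0)
			continue;	/* address family not available */
		if (connect(fd, ai->ai_addr, ai->ai_addrlen) == 0)
			break;		/* connected */
		close(fd);
		fd = -1;
	}
	freeaddrinfo(res);
	return fd;
}
```

Because the socket is created from each addrinfo entry, the code never names AF_INET or AF_INET6 and needs no IPv4 mapped address.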

-Port number space is totally separate between AF_INET and

-AF_INET6 sockets. You can always perform wildcard bind on both of

-the adderss families, on the same port.

+If you would like to use AF_INET6 socket for both IPv4 and IPv6 outgoing

+connection, you will need tweaked implementation in DNS support libraries,

+as documented in RFC2553 6.1. KAME libinet6 includes the tweak in

+getipnodebyname(). Note that getipnodebyname() itself is not recommended as

+it does not handle scoped IPv6 addresses at all. For IPv6 name resolution

+getaddrinfo() is the preferred API. getaddrinfo() does not implement the

+tweak.

-If a server application would like to accept IPv4 and IPv6 connections,

-it should use AF_INET and AF_INET6 socket (you'll need two sockets).

-Applicsations should use proper socket for connections. IPv4 connections

-must be made on AF_INET socket, and IPv6 connections must be made

-on AF_INET6 socket. getaddrinfo() library helps you in writing

-AF-independent application, and managing sockets with different AFs.

-(some of the implementers think that we should totally get rid of gethostby*

-family of the functions and migrate to get{addr,name}info, since

-it is very clean and helps you support new AFs in the future)

+When writing applications that make outgoing connections, story goes much

+simpler if you treat AF_INET and AF_INET6 as totally separate address family.

+{set,get}sockopt issue goes simpler, DNS issue will be made simpler. We do

+not recommend you to rely upon IPv4 mapped address.

+1.12.1 KAME/BSDI3 and KAME/FreeBSD228

+The platforms do not support IPv4 mapped address at all (both listening side

+and initiating side). AF_INET6 and AF_INET sockets are totally separated.

+Port number space is totally separate between AF_INET and AF_INET6 sockets.

1.12.2 KAME/FreeBSD3x

-The platform can be configured to support IPv4 mapped address/special AF_INET6

-wildcard bind (enabled by default). If you disable it, it behaves as described

-in 1.12.1.

+KAME/FreeBSD3x uses shared tcp4/6 code (from sys/netinet/tcp*) and shared

+udp4/6 code (from sys/netinet/udp*). It uses unified inpcb/in6pcb structure.

-The IPv4 mapped address support needs tweaked implementation in

-DNS support libraries. This is documented in RFC2553 6.1.

-KAME libraries (namely libinet6.a) actually support that.

+1.12.2.1 KAME/FreeBSD3x, listening side

-RFC2553 does not talk about how port number space should be designed

-(i.e. should they be separate between AF_INET and AF_INET6, or

-should they be common)

-In KAME with the behavior enabled, port number space is separate

-between AF_INET and AF_INET6 sockets in most cases. The only

-exception is wildcard bind socket, where the special behavior appears.

+The platform can be configured to support IPv4 mapped address/special

+AF_INET6 wildcard bind (enabled by default). Kernel configuration is

+summarized as follows:

+- By default, MAPPED_ADDR_ENABLED option is defined in the kernel

+ configuration file. In this case, AF_INET6 socket will grab IPv4

+ connections in certain condition. You can disable it with sysctl, or

+ setsockopt.

+- If you remove MAPPED_ADDR_ENABLED option, the code to perform special

+ behavior will not be compiled. It behaves as described in 1.12.1.

-If a server application would like to accept IPv4 and IPv6 connections,

-it can use IPv6 socket with wildcard bind, or use two sockets (for

-AF_INET6 and AF_INET). You can handle IPv4 and IPv6 connections

-by using AF_INET6 socket. Porting of an application can be simpler in

-this case, like:

-- change AF_INET into AF_INET6

-- use gethostbyname2(hostname, AF_INET6) or getipnodebyname(), instead of

- gethostbyname(hostname)

-- use struct sockaddr_in6 instead of sockaddr_in

-To provide services to both IPv4 and IPv6 clients, you will run a single

-server which binds to single AF_INET6 wildcard socket. This server will

-accept both IPv4 and IPv6 connections to the tcp/udp port.

-If you run two daemons, which binds to AF_INET6 wildcard socket and

-AF_INET socket (say, sendmail4 and sendmail6), story start to look

-a bit complicated. The next sections have the detail.

-Wildcard bind on AF_INET6 behaves like "wildcard bind between two

-address families". It will grab IPv4 connection if and only if

-there is no socket that binds to more specific destination.

-Here, wildcard bind on AF_INET is regarded as "more specific bind"

-than wildcard bind on AF_INET6.

-In other words, wildcard bind on AF_INET6 is the only thing that

-has special behavior. It will not affect wildcard bind on AF_INET.

-If the following events happen, IPv4 connection will be routed

-to application B, and IPv6 connection will be routed to application A.

-- application A perform wildcard bind on AF_INET6, port X

-- application B perform wildcard bind on AF_INET, port X

-- IPv4 connection arrives

-- IPv6 connection arrives

-If the following events happen, the behavior is the same. IPv4

-connection will be routed to application B, and IPv6 connection

-will be routed to application A.

-- application B perform wildcard bind on AF_INET, port X

-- application A perform wildcard bind on AF_INET6, port X

-- IPv4 connection arrives

-- IPv6 connection arrives

-If the following events happen, KAME/FreeBSD3x will behave like this:

-- sendmail4 is running. It is doing wildcard bind on AF_INET.

-- Invoke sendmail6. It will do a wildcard bind on AF_INET6.

-- Stop sendmail4. Here, on KAME/FreeBSD3x, IPv4 and IPv6 conections will

- be routed to sendmail6. This is different from KAME/FreeBSD228.

-1.12.4 KAME/NetBSD

+Wildcard AF_INET6 socket grabs IPv4 connection if and only if the following

+conditions are satisfied:

+- there's no AF_INET socket that matches the IPv4 connection

+- the AF_INET6 socket is configured to accept IPv4 traffic, i.e.

+ getsockopt(IPV6_BINDV6ONLY) returns 0.

+There's no problem with open/close ordering.

+(XXX need checking)

+1.12.2.2 KAME/FreeBSD3x, initiating side

+KAME/FreeBSD3x supports outgoing connection to IPv4 mapped address

+(::ffff:10.1.1.1), if the node is configured to accept IPv4 connections

+by AF_INET6 socket.

+(XXX need checking)

+1.12.3 KAME/NetBSD

KAME/NetBSD uses shared tcp4/6 code (from sys/netinet/tcp*) and shared

udp4/6 code (from sys/netinet/udp*). The implementation is made differently

from KAME/FreeBSD3x. KAME/NetBSD uses separate inpcb/in6pcb structures,

while KAME/FreeBSD3x uses merged inpcb structure.

-Supports for IPv4 mapped address/special AF_INET6 wildcard bind are

-enabled by default. At this moment there is no way to disable it.

-1.12.5 KAME/BSDI4

+1.12.3.1 KAME/NetBSD, listening side

+The platform can be configured to support IPv4 mapped address/special AF_INET6

+wildcard bind (disabled by default). Kernel behavior can be summarized as

+follows:

+- default: special support code will be compiled in, but is disabled by

+ default. It can be controlled by sysctl (net.inet6.ip6.bindv6only),

+ or setsockopt(IPV6_BINDV6ONLY).

+- add "INET6_BINDV6ONLY": No special support code for AF_INET6 wildcard socket

+ will be compiled in. AF_INET6 sockets and AF_INET sockets are totally

+ separate. The behavior is similar to what is described in 1.12.1.

+sysctl setting will affect per-socket configuration at in6pcb creation time

+only. In other words, per-socket configuration will be copied from sysctl

+configuration at in6pcb creation time. To change per-socket behavior, you

+must perform setsockopt or reopen the socket. Change in sysctl configuration

+will not change the behavior of sockets that are already opened.

+Wildcard AF_INET6 socket grabs IPv4 connection if and only if the following

+conditions are satisfied:

+- there's no AF_INET socket that matches the IPv4 connection

+- the AF_INET6 socket is configured to accept IPv4 traffic, i.e.

+ getsockopt(IPV6_BINDV6ONLY) returns 0.

+There's no problem with open/close ordering.

+1.12.3.2 KAME/NetBSD, initiating side

+When you initiate a connection, you can always connect to IPv4 destination

+over AF_INET6 socket, using IPv4 mapped address destination (::ffff:10.1.1.1).

+This is enabled independently from the configuration for listening side, and

+always enabled.

+1.12.4 KAME/BSDI4

KAME/BSDI4 uses NRL-based TCP/UDP stack and inpcb source code,

which was derived from NRL IPv6/IPsec stack. I guess it supports IPv4 mapped

address and special AF_INET6 wildcard bind. The implementation is, again,

different from other KAME/*BSDs.

-Note that NRL inpcb layer has different behavior than KAME implementation,

-namely:

+1.12.4.1 KAME/BSDI4, listening side

+NRL inpcb layer supports special behavior of AF_INET6 wildcard socket.

+It grabs IPv4 connection under certain condition. NRL inpcb layer has

+different behavior than KAME implementation, namely:

- If you bind(2) a socket to IPv6 wildcard address (::) then bind(2)

another socket to IPv4 wildcard address (0.0.0.0), the latter will fail

with EADDRINUSE.

@@ -708,33 +796,35 @@ namely:

both will success. However, all IPv4 traffic (and IPv6 traffic) will be

captured by IPv6 wildcard socket.

-1.12.6 KAME/OpenBSD

+1.12.4.2 KAME/BSDI4, initiating side

+KAME/BSDI4 supports connection initiation to IPv4 mapped address

+(like ::ffff:10.1.1.1).

+1.12.5 KAME/OpenBSD

KAME/OpenBSD uses NRL-based TCP/UDP stack and inpcb source code,

-which was derived from NRL IPv6/IPsec stack. However, KAME/OpenBSD

-disables special behavior on AF_INET6 wildcard bind for security reasons

-(if IPv4 traffic toward AF_INET6 wildcard bind is allowed, access control

-will become much harder). KAME/BSDI4 uses NRL-based TCP/UDP stack as well,

-however, the behavior is different.

+which was derived from NRL IPv6/IPsec stack.

+1.12.5.1 KAME/OpenBSD, listening side

+KAME/OpenBSD disables special behavior on AF_INET6 wildcard bind for

+security reasons (if IPv4 traffic toward AF_INET6 wildcard bind is allowed,

+access control will become much harder). KAME/BSDI4 uses NRL-based TCP/UDP

+stack as well, however, the behavior is different due to OpenBSD's security

+policy.

As a result the behavior of KAME/OpenBSD is similar to KAME/BSDI3 and

KAME/FreeBSD228 (see 1.12.1 for more detail).

-1.12.7 configuration and implementation

+1.12.5.2 KAME/OpenBSD, initiating side

-On KAME/FreeBSD3x, the behavior is configurable by following procedure.

-To enable it:

-- Add the "MAPPED_ADDR_ENABLED" kernel config option into your

- kernel config file (see "sys/i386/conf/GENERIC.v6" sample file)

- and build your kernel, and

-- set sysctl variable appropriately, like:

- # sysctl -w net.inet6.ip6.mapped_addr=1

-Note that, to enable the behavior you'll need to do the both of the above.

-If you do not do the both, the behavior is disabled.

+KAME/OpenBSD does not support connection initiation to IPv4 mapped address

+(like ::ffff:10.1.1.1).

1.13 sockaddr_storage

-When RFC2553 was about to be finalized, there was discusson on how struct

+When RFC2553 was about to be finalized, there was discussion on how struct

sockaddr_storage members are named. One proposal is to prepend "__" to the

members (like "__ss_len") as they should not be touched. The other proposal

was that don't prepend it (like "ss_len") as we need to touch those members

@@ -758,7 +848,7 @@ definition.

KAME kit prior to December 1999 used RFC2553 definition. KAME kit after

December 1999 (including December) will conform to XNET definition,

-based on RFC2553bis discusson.

+based on RFC2553bis discussion.

If you look at multiple IPv6 implementations, you will be able to see

both definitions. As a userland programmer, the most portable way of

@@ -771,6 +861,22 @@ dealing with it is to:

struct sockaddr_storage ss;

family = ((struct sockaddr *)&ss)->sa_family
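The cast above can be wrapped in a small helper so that callers never touch
the ss_family/__ss_family member directly. The following is only a sketch;
the function name storage_family() is hypothetical and does not come from
the KAME kit.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <string.h>

/*
 * Hypothetical helper (not part of KAME): read the address family
 * through struct sockaddr, so the code compiles regardless of whether
 * the platform names the member ss_family or __ss_family.
 */
int
storage_family(const struct sockaddr_storage *ss)
{
	return ((const struct sockaddr *)ss)->sa_family;
}
```

A caller would typically fill the storage via accept(2) or getpeername(2)
and then switch on the returned value (AF_INET, AF_INET6).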

+1.14 Invalid addresses on the wire

+IPv6 specifications reserve IPv6 address ranges that are used internally

+in IPv6 nodes (not on the wire). They are:

+- IPv4 mapped address (like ::ffff:10.1.1.1)

+- IPv4 compatible address (like ::10.1.1.1)

+They are defined and used to ease IPv4-to-IPv6 transition. However,

+if they mistakenly appear on the wire, they can confuse IPv6 implementations.

+It is also possible to use the above addresses as tools to attack IPv6 hosts,

+to bypass certain security checks (like using source address of

+::ffff:127.0.0.1 to bypass "reject packet from remote" filter).

+KAME code is carefully written to avoid such incidents. More specifically,

+KAME kernel will reject packets if the above addresses are used in IPv6

+source/destination address, or IPv6 routing header.
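The same kind of check can be sketched in userland with the RFC2553 address
test macros. This is an illustration only, not the KAME kernel code; the
function names addr_invalid_on_wire() and text_addr_invalid_on_wire() are
hypothetical.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/*
 * Hypothetical helper (not a KAME function): returns 1 if the address
 * is IPv4 mapped (::ffff:a.b.c.d) or IPv4 compatible (::a.b.c.d),
 * i.e. an address that should not appear on the wire; 0 otherwise.
 */
int
addr_invalid_on_wire(const struct in6_addr *a)
{
	return (IN6_IS_ADDR_V4MAPPED(a) || IN6_IS_ADDR_V4COMPAT(a)) ? 1 : 0;
}

/*
 * Same check for a textual address.
 * Returns -1 if the text cannot be parsed as an IPv6 address.
 */
int
text_addr_invalid_on_wire(const char *text)
{
	struct in6_addr a;

	if (inet_pton(AF_INET6, text, &a) != 1)
		return -1;
	return addr_invalid_on_wire(&a);
}
```

A packet filter written along these lines would apply the check to the
source address, the destination address, and each address in a routing
header, and drop the packet if any of them matches.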

2. Network Drivers

KAME requires three items to be added into the standard drivers:

@@ -867,10 +973,13 @@ The following table lists the network drivers we have tried so far.

support?

--- --- --- ---

(Ethernet)

- ne pci/i386 ok ok yes

+ awi pcmcia/i386 ok ok -

+ bah zbus/amiga NG(*)

+ cnw pcmcia/i386 ok ok yes

ep pcmcia/i386 ok ok -

le sbus/sparc ok ok yes

- bah zbus/amiga NG(*)

+ ne pci/i386 ok ok yes

+ wi pcmcia/i386 ok ok yes

(ATM)

en pci/i386 ok ok -

@@ -908,9 +1017,27 @@ Here is a list of OpenBSD 2.x drivers and its conditions:

support?

--- --- --- ---

(Ethernet)

+ le sbus/sparc ok ok yes

+ fxp pci/i386 ?(*)

ne pci/i386 ok ok yes

ne pcmcia/i386 ok ok yes

- le sbus/sparc ok ok yes

+(*) There seems to be a problem in the driver with multicast filter

+configuration. This happens with certain revisions of the chipset on the card.

+It should be fixed by now, but we are still not sure.

+2.6 BSD/OS 4.x

+The following lists BSD/OS 4.x device drivers and their conditions:

+ driver mbuf(1) multicast(2) official

+ support?

+ --- --- --- ---

+ (Ethernet)

+ de ok ok yes

+You may want to use "@insert" directive in /etc/pccard.conf to invoke

+"rtsol" command right after dynamic insertion of PCMCIA ethernet cards.

3. Translator

@@ -961,238 +1088,4 @@ For more details, consult kame/kame/faithd/README.

(to be written)

-4. IPsec

-# NOTE: This section does not apply to OpenBSD-current.

-IPsec is mainly organized by three components.

-(1) Policy Management

-(2) Key Management

-(3) AH and ESP handling

-Note that KAME/OpenBSD does NOT include support for KAME IPsec code,

-as OpenBSD team has their home-brew IPsec stack and they have no plan

-to replace it. IPv6 support for IPsec is, therefore, lacking on KAME/OpenBSD.

-KAME/BSDI4 lacks IPsec at this moment (both NRL and KAME). In the near

-future we will be adding KAME IPSec code support into KAME/BSDI4.

-4.1 Policy Management

-The kernel implements experimental policy management code. There are two way

-to to manage security policy. One is to configure per-socket policy using

-setsockopt(3). In this cases, policy configuration is described in

-ipsec_set_policy(3). The other is to configure kernel packet filter-based

-policy using PF_KEY interface, via setkey(8).

-The policy entry is not re-ordered with its

-indexes, so the order of entry when you add is very significant.

-4.2 Key Management

-The key management code implemented in this kit (sys/netkey) is a

-home-brew PFKEY v2 implementation. This conforms to RFC2367.

-The home-brew IKE daemon, "racoon" is included in the kit

-(kame/kame/racoon).

-Basically you'll need to run racoon as daemon, then setup a policy

-to require keys (like ping -P 'out ipsec esp/transport//use').

-The kernel will contact racoon daemon as necessary to exchange keys.

-4.3 AH and ESP handling

-IPsec module is implemented as "hooks" to the standard IPv4/IPv6

-processing. When sending a packet, ip{,6}_output() checks if ESP/AH

-processing is required by checking if a matching SPD (Security

-Policy Database) is found. If ESP/AH is needed,

-{esp,ah}{4,6}_output() will be called and mbuf will be updated

-accordingly. When a packet is received, {esp,ah}4_input() will be

-called based on protocol number, i.e. (*inetsw[proto])().

-{esp,ah}4_input() will decrypt/check authenticity of the packet,

-and strips off daisy-chained header and padding for ESP/AH. It is

-safe to strip off the ESP/AH header on packet reception, since we

-will never use the received packet in "as is" form.

-By using ESP/AH, TCP4/6 effective data segment size will be affected by

-extra daisy-chained headers inserted by ESP/AH. Our code takes care of

-the case.

-Basic crypto functions can be found in directory "sys/crypto". ESP/AH

-transform are listed in {esp,ah}_core.c with wrapper functions. If you

-wish to add some algorithm, add wrapper function in {esp,ah}_core.c, and

-add your crypto algorithm code into sys/crypto.

-Tunnel mode is partially supported in this release, with the following

-restrictions:

-- IPsec tunnel is not combined with GIF generic tunneling interface.

- It needs a great care because we may create an infinite loop between

- ip_output() and tunnelifp->if_output(). Opinion varies if it is better

- to unify them, or not.

-- MTU and Don't Fragment bit (IPv4) considerations need more checking, but

- basically works fine.

-- Authentication model for AH tunnel must be revisited. We'll need to

- improve the policy management engine, eventually.

-4.4 Conformance to RFCs and IDs

-The IPsec code in the kernel conforms (or, tries to conform) to the

-following standards:

- "old IPsec" specification documented in rfc182[5-9].txt

- "new IPsec" specification documented in rfc240[1-6].txt, rfc241[01].txt,

- rfc2451.txt and draft-mcdonald-simple-ipsec-api-01.txt (draft expired,

- but you can take from ftp://ftp.kame.net/pub/internet-drafts/).

- (NOTE: IKE specifications, rfc241[7-9].txt are implemented in userland,

- as "racoon" IKE daemon)

-Currently supported algorithms are:

- old IPsec AH

- null crypto checksum (no document, just for debugging)

- keyed MD5 with 128bit crypto checksum (rfc1828.txt)

- keyed SHA1 with 128bit crypto checksum (no document)

- HMAC MD5 with 128bit crypto checksum (rfc2085.txt)

- HMAC SHA1 with 128bit crypto checksum (no document)

- old IPsec ESP

- null encryption (no document, similar to rfc2410.txt)

- DES-CBC mode (rfc1829.txt)

- new IPsec AH

- null crypto checksum (no document, just for debugging)

- keyed MD5 with 96bit crypto checksum (no document)

- keyed SHA1 with 96bit crypto checksum (no document)

- HMAC MD5 with 96bit crypto checksum (rfc2403.txt

- HMAC SHA1 with 96bit crypto checksum (rfc2404.txt)

- new IPsec ESP

- null encryption (rfc2410.txt)

- DES-CBC with derived IV

- (draft-ietf-ipsec-ciph-des-derived-01.txt, draft expired)

- DES-CBC with explicit IV (rfc2405.txt)

- 3DES-CBC with explicit IV (rfc2451.txt)

- BLOWFISH CBC (rfc2451.txt)

- CAST128 CBC (rfc2451.txt)

- RC5 CBC (rfc2451.txt)

- each of the above can be combined with:

- ESP authentication with HMAC-MD5(96bit)

- ESP authentication with HMAC-SHA1(96bit)

-The following algorithms are NOT supported:

- old IPsec AH

- HMAC MD5 with 128bit crypto checksum + 64bit replay prevention

- (rfc2085.txt)

- keyed SHA1 with 160bit crypto checksum + 32bit padding (rfc1852.txt)

-IPsec (in kernel) and IKE (in userland as "racoon") has been tested

-at several interoperability test events, and it is known to interoperate

-with many other implementations well. Also, KAME IPsec has quite wide

-coverage for IPsec crypto algorithms documented in RFC (we cover

-algorithms without intellectual property issues only).

-4.5 ECN consideration on IPsec tunnels

-KAME IPsec implements ECN-friendly IPsec tunnel, described in

-draft-ipsec-ecn-00.txt.

-Normal IPsec tunnel is described in RFC2401. On encapsulation,

-IPv4 TOS field (or, IPv6 traffic class field) will be copied from inner

-IP header to outer IP header. On decapsulation outer IP header

-will be simply dropped. The decapsulation rule is not compatible

-with ECN, since ECN bit on the outer IP TOS/traffic class field will be

-lost.

-To make IPsec tunnel ECN-friendly, we should modify encapsulation

-and decapsulation procedure. This is described in

-http://www.aciri.org/floyd/papers/draft-ipsec-ecn-00.txt, chapter 3.

-KAME IPsec tunnel implementation can give you three behaviors, by setting

-net.inet.ipsec.ecn (or net.inet6.ipsec6.ecn) to some value:

-- RFC2401: no consideration for ECN (sysctl value -1)

-- ECN forbidden (sysctl value 0)

-- ECN allowed (sysctl value 1)

-Note that the behavior is configurable in per-node manner, not per-SA manner

-(draft-ipsec-ecn-00 wants per-SA configuration, but it looks too much for me).

-The behavior is summarized as follows (see source code for more detail):

- encapsulate decapsulate

- --- ---

-RFC2401 copy all TOS bits drop TOS bits on outer

- from inner to outer. (use inner TOS bits as is)

-ECN forbidden copy TOS bits except for ECN drop TOS bits on outer

- (masked with 0xfc) from inner (use inner TOS bits as is)

- to outer. set ECN bits to 0.

-ECN allowed copy TOS bits except for ECN use inner TOS bits with some

- CE (masked with 0xfe) from change. if outer ECN CE bit

- inner to outer. is 1, enable ECN CE bit on

- set ECN CE bit to 0. the inner.

-General strategy for configuration is as follows:

-- if both IPsec tunnel endpoint are capable of ECN-friendly behavior,

- you'd better configure both end to "ECN allowed" (sysctl value 1).

-- if the other end is very strict about TOS bit, use "RFC2401"

- (sysctl value -1).

-- in other cases, use "ECN forbidden" (sysctl value 0).

-The default behavior is "ECN forbidden" (sysctl value 0).

-For more information, please refer to:

- http://www.aciri.org/floyd/papers/draft-ipsec-ecn-00.txt

- RFC2481 (Explicit Congestion Notification)

- KAME sys/netinet6/{ah,esp}_input.c

-(Thanks goes to Kenjiro Cho <kjc@csl.sony.co.jp> for detailed analysis)

-4.6 Interoperability

-Here are (some of) platforms we have tested IPsec/IKE interoperability

-in the past. Note that both ends (KAME and others) may have modified their

-implementation, so use the following list just for reference purposes.

- Altiga, Ashley-laurent (vpcom.com), Data Fellows (F-Secure), Ericsson

- ACC, FreeS/WAN, HITACHI, IBM AIX, IIJ, Intel, Microsoft WinNT, NIST

- (linux IPsec + plutoplus), Netscreen, OpenBSD, RedCreek, Routerware,

- SSH, Secure Computing, Soliton, Toshiba, VPNet, Yamaha RT100i

-5. IPComp

-# NOTE: This section does not apply to OpenBSD-current.

-IPComp stands for IP payload compression protocol. This is aimed for

-payload compression, not the header compression like PPP VJ compression.

-This may be useful when you are using slow serial link (say, cell phone)

-with powerful CPU (well, recent notebook PCs are really powerful...).

-The protocol design of IPComp is very similar to IPsec.

-KAME implements the following specifications:

-- RFC2393: IP Payload Compression Protocol (IPComp)

-- RFC2394: IP Payload Compression Using DEFLATE

-Here are some points to be noted:

-- IPComp is treated as part of IPsec protocol suite, and SPI and

- CPI space is unified. Spec says that there's no relationship

- between two so they are assumed to be separate.

-- IPComp association (IPCA) is kept in SAD.

-- It is possible to use well-known CPI (CPI=2 for DEFLATE for example),

- for outbound/inbound packet, but for indexing purposes one element from

- SPI/CPI space will be occupied anyway.

-- pfkey is modified to support IPComp. However, there's no official

- SA type number assignment yet. Portability with other IPComp

- stack is questionable (anyway, who else implement IPComp on UN*X?).

-- Spec says that IPComp output processing must be performed before IPsec

- output processing, to achieve better compression ratio and "stir" data

- stream before encryption. However, with manual SPD setting, you are able to

- violate the ordering requirement (KAME code is too generic, maybe).

-- Though MTU can be significantly decreased by using IPComp, no special

- consideration is made about path MTU (spec talks nothing about MTU

- consideration). IPComp is designed for serial links, not ethernet-like

- medium, it seems.

-- You can change compression ratio on outbound packet, by changing

- deflate_policy in sys/netinet6/ipcomp_core.c. You can also change history

- buffer size by changing deflate_window in the same source code.

- (should it be sysctl accessible? or per-SAD configurable?)

-- Tunnel mode IPComp is not working right. KAME box can generate tunnelled

- IPComp packet, however, cannot accept tunneled IPComp packet.

-6. ALTQ

-KAME kit includes ALTQ 2.0 code, which supports FreeBSD2, FreeBSD3 and

-NetBSD. For other BSDs, ALTQ does not work.

-ALTQ in KAME supports (or tries to support) IPv6. ALTQ-related userland

-tools must be built manually, using ports/altq or pkgsrc/net/altq.