Date:      Mon, 20 Aug 2007 19:04:20 GMT
From:      Matus Harvan <mharvan@FreeBSD.org>
To:        Perforce Change Reviews <perforce@FreeBSD.org>
Subject:   PERFORCE change 125446 for review
Message-ID:  <200708201904.l7KJ4KTp080579@repoman.freebsd.org>

http://perforce.freebsd.org/chv.cgi?CH=125446

Change 125446 by mharvan@mharvan_bike-planet on 2007/08/20 19:04:05

	Updated design.txt

Affected files ...

.. //depot/projects/soc2007/mharvan-mtund/mtund.doc/design.txt#4 edit

Differences ...

==== //depot/projects/soc2007/mharvan-mtund/mtund.doc/design.txt#4 (text+ko) ====

@@ -1,175 +1,266 @@
-This document describes the intended design and main implementation
-details of mtund. This plan is work in progress and changes are
-expected to reflect the outcomes of (mostly email) discussions with my
-mentors.
+	Magic Tunnel Daemon Design and Implementation Details
+
+IP can easily be tunneled over a plethora of network protocols at
+various layers, such as IP, ICMP, UDP, TCP, DNS, HTTP, SSH and many
+others. While a direct connection may not always be possible due to a
+firewall, the IP packets can be encapsulated as payload in another
+protocol which does get through. However, each such encapsulation
+requires setting up a different program, and the user has to probe
+different encapsulations manually to find out which of them works in
+a given environment.
+
+The Magic Tunnel Daemon (mtund) uses plugins for different
+encapsulations. It automagically selects a working encapsulation in
+each environment and can fail over to another one if the environment
+changes. This document describes the design and main implementation
+details of mtund. After an overview of the daemon, various details of
+the implementation are discussed. Afterwards, the implemented plugins
+are described. The document concludes with things remaining to be
+done in the future.
+
+
+MTUND - GENERAL OVERVIEW
+The daemon and plugins are written in plain C. The daemon can operate
+in two modes, as a client or as a server. A server accepts connections
+from multiple clients. It uses a tun(4) interface, one per client, as
+the tunnel endpoint. Using private IP addresses and NAT on the tun(4)
+interfaces seems to be a viable solution. Traffic from the client can
+be encapsulated in various network protocols. Each such encapsulation
+is implemented by a plugin. Plugins are loaded at run-time with
+dlopen(3). Libevent is used for multiplexing between the tun
+interfaces and the sockets used by the plugins. Plugins register
+their events directly by calling event_add(3); event_dispatch(3) is
+called from main() in the daemon, so libevent can invoke the handler
+functions of the plugins.
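+
+As an illustration, a plugin watching a socket might register its
+event with the libevent 1.x API roughly as follows; the two helper
+names are hypothetical, the event_* calls are libevent:
+
+        #include <event.h>
+
+        static struct event plugin_ev;
+
+        /* called by libevent when the socket becomes readable */
+        static void
+        plugin_read_cb(int fd, short which, void *arg)
+        {
+                /* receive, decapsulate, pass data to the daemon */
+        }
+
+        static void
+        plugin_watch_socket(int fd)
+        {
+                /* EV_PERSIST keeps the event active after each
+                 * callback */
+                event_set(&plugin_ev, fd, EV_READ | EV_PERSIST,
+                    plugin_read_cb, NULL);
+                event_add(&plugin_ev, NULL);    /* NULL: no timeout */
+        }
+
+        /* main() in the daemon then runs event_dispatch() */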
+
+Each plugin has to implement a set of functions, defined in
+plugin.h. A plugin is represented in the daemon by struct plugin. This
+struct contains pointers to the plugin functions, allowing the daemon
+to interact with the plugin. The functions available to the daemon are
+described in more detail in plugin.h. In client mode, the plugin tries
+to connect to the server when plugin_initialize() is called. The
+daemon can then use plugin_send() to send data through the tunnel set
+up by the plugin. For the other direction of data flow, the plugin
+registers events watching its sockets for incoming traffic. Upon
+receiving data over the encapsulation, the plugin decapsulates it and
+passes it to the daemon by calling the process_data_from_plugin()
+function. The plugin also provides a function to deinitialize itself,
+plugin_deinitialize(). In server mode, the plugin_initialize()
+function sets up a socket listening for incoming connections.
+Otherwise, the functionality is similar to the client mode.
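+
+For illustration only, the function-pointer table might look roughly
+like this; the authoritative definition, with the exact names and
+signatures, is in plugin.h:
+
+        /* simplified sketch of the plugin interface */
+        struct plugin {
+                const char *name;
+                int  (*initialize)(int is_server); /* connect/listen */
+                void (*deinitialize)(void);
+                int  (*send)(char *data, int len, int client_id);
+                int  (*is_ready_to_send)(void);    /* direct/polling */
+                void (*conn_map)(int client_id);   /* conn -> ID */
+                void (*conn_close)(int client_id);
+        };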
+
+In server mode, the server initializes all plugins and waits for
+incoming connections. The client initializes plugins serially, i.e.,
+it tries the next one if the current one fails. After initialization,
+it starts sending probes (pings) to the server. If the server receives
+them, it generates a reply. Upon reception of a ping reply, the client
+requests a client ID from the server. If granted, the server and the
+client configure tun interfaces and start exchanging traffic using the
+plugin.
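+
+A minimal sketch of the client-side startup logic, assuming the
+struct plugin sketch above and a hypothetical probe_with_pings()
+helper that sends probes and waits for a reply:
+
+        /* try encapsulations in order until the server answers */
+        static struct plugin *
+        find_working_plugin(struct plugin **plugins, int nplugins)
+        {
+                int i;
+
+                for (i = 0; i < nplugins; i++) {
+                        if (plugins[i]->initialize(0) < 0)
+                                continue;            /* try the next */
+                        if (probe_with_pings(plugins[i]) == 0)
+                                return (plugins[i]); /* server answered */
+                        plugins[i]->deinitialize();  /* no reply */
+                }
+                return (NULL);
+        }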
+
+PROBING AND KEEP-ALIVE PINGS
+The ping probes used for initial probing also serve as regular
+keep-alive traffic and check that unreliable connections have not
+failed. If a certain number of successive ping requests
+(PING_FAIL_PLUGIN) is left without a reply, the encapsulation is
+considered malfunctioning and the client tries to find another
+working encapsulation. The server sends such probes to the client as
+well. If a certain number of pings (PING_FAIL_COMPLETE) is left
+unanswered, it is assumed that the client has disconnected and its
+tun interface on the server is deinitialized. Once a new working
+encapsulation is found, the client should send data or a special
+message with DISPATCH_SELECT_PLUGIN to notify the server of the
+plugin failover, i.e., to make sure the server uses the new plugin
+for talking to the client.
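+
+A sketch of the failure accounting on the client; PING_FAIL_PLUGIN is
+from the text, its value here is made up:
+
+        #define PING_FAIL_PLUGIN 3      /* illustrative value */
+
+        static int pings_unanswered;
+
+        static void
+        ping_reply_received(void)
+        {
+                pings_unanswered = 0;   /* the link is alive */
+        }
+
+        /* returns 1 when the encapsulation should be abandoned */
+        static int
+        ping_request_sent(void)
+        {
+                return (++pings_unanswered >= PING_FAIL_PLUGIN);
+        }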
+
+MULTI-USER SUPPORT
+The server supports concurrent sessions from multiple clients. In
+order to tunnel traffic for a client, the client first has to
+associate with the server. This is done by requesting a client ID. If
+the server already has too many clients, it simply does not answer
+the request for an ID. If a client does not answer PING_FAIL_COMPLETE
+pings, it is considered disconnected and its client ID is reclaimed by
+the server. The client ID is prepended before the payload in traffic
+from the client to the server so that the server can determine from
+which client the traffic is coming. Although for TCP the file
+descriptor could be used, for DNS or ICMP the client ID is needed. For
+traffic from the server to the client no client ID is prepended, as
+each client talks to only one server at a time.
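+
+On the server, granting a client ID can be as simple as finding a
+free slot in the client table; a sketch (MAX_CLIENTS and the table
+are made-up names):
+
+        #include <stddef.h>
+
+        #define MAX_CLIENTS 16          /* illustrative limit */
+
+        struct client;                  /* per-client state, tun fd etc. */
+        static struct client *clients[MAX_CLIENTS];
+
+        /* returns a free ID, or -1 so the request goes unanswered */
+        static int
+        alloc_client_id(void)
+        {
+                int id;
+
+                for (id = 0; id < MAX_CLIENTS; id++)
+                        if (clients[id] == NULL)
+                                return (id);
+                return (-1);
+        }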
+
+The plugins also have to support multiple connections as several users
+can be using the same encapsulation to communicate with the server. A
+plugin keeps track of its connections on its own. The interface to the
+daemon uses the client ID. The previous design passed the client ID
+and a connection flag to the daemon as arguments to the
+process_data_from_plugin() function. However, this was abandoned in
+favour of a separate function, plugin_conn_map(), implemented by the
+plugin. This function is called from process_data_from_plugin()
+before plugin_send() is called. The client ID is passed to this
+function, which maps the last incoming connection to that client
+ID. Note that there is no race condition, as plugin_conn_map() is
+called before process_data_from_plugin() returns and hence before
+plugin_receive() returns; no other libevent event can preempt the
+call. In addition, a flag is passed indicating whether
+* the payload was garbage and hence the connection should be
+  discarded
+* the payload was a ping, the connection is temporary
+* the payload was data from an associated client and hence the
+  connection is permanent
+
+A temporary connection can be used by the daemon until
+process_data_from_plugin() returns; it is used for replying to
+pings. For the TCP plugin, the socket is closed after a timeout,
+while for non-connection-oriented encapsulations the connection
+metadata can be removed earlier. The metadata of a permanent
+connection should be kept around until plugin_conn_close() is called
+by the daemon. Permanent connections are used by associated clients.
+
+The functions use the client ID to identify the connection; the
+mapping from the client ID to the particular connection is the
+responsibility of the plugin. Note that it is not desirable for the
+plugins to inspect the payload, as this is done centrally by the
+daemon. The scheme described above was designed to this end.
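+
+Putting this together, the daemon's receive path might look roughly
+as follows. This is a sketch only: the conn_flag values, the helpers
+classify_payload(), send_ping_reply() and write_to_tun(), and the
+exact signature are illustrative, not the actual interface.
+
+        /* illustrative connection flags */
+        enum conn_flag { CONN_GARBAGE, CONN_TEMPORARY, CONN_PERMANENT };
+
+        static void
+        process_data_from_plugin(struct plugin *p, char *data, int len)
+        {
+                int client_id;
+                enum conn_flag flag;
+
+                /* the daemon, not the plugin, inspects the payload */
+                flag = classify_payload(data, len, &client_id);
+                if (flag == CONN_GARBAGE)
+                        return;         /* connection is discarded */
+
+                /* map the last incoming connection to the client ID;
+                 * no other libevent event can preempt this call */
+                p->conn_map(client_id);
+
+                if (flag == CONN_TEMPORARY)     /* a ping */
+                        send_ping_reply(p, client_id);
+                else                            /* associated client */
+                        write_to_tun(client_id, data, len);
+        }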
+
+DISPATCH
+To distinguish between tunneled traffic, pings and other types of
+control traffic, a dispatch value indicating the type of payload is
+prepended before the payload. It follows the client ID in traffic
+originated by the client. The various dispatch values are defined in
+mtund.h.
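+
+Only DISPATCH_SELECT_PLUGIN is named in this document; purely as an
+illustration, the table in mtund.h might look like this (values and
+the other names are made up):
+
+        #define DISPATCH_DATA            0x01  /* tunneled packet */
+        #define DISPATCH_PING_REQUEST    0x02  /* keep-alive probe */
+        #define DISPATCH_PING_REPLY      0x03
+        #define DISPATCH_CLIENT_ID_REQ   0x04  /* associate request */
+        #define DISPATCH_CLIENT_ID_OFFER 0x05
+        #define DISPATCH_SELECT_PLUGIN   0x06  /* announce failover */
+
+The resulting wire layout is then:
+
+        client -> server:  | client ID | dispatch | payload ... |
+        server -> client:  | dispatch | payload ... |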
 
-TODO:
-o man page
-o port skeleton
+If a plugin wishes to exchange traffic directly with another plugin,
+the payload still has to pass via the daemon so that the client ID
+gets prepended. In this way the plugins know to which
+client/connection the traffic relates. This is useful for plugins
+probing which types of traffic pass through and for polling plugins to
+send empty requests.
 
-TODO this document
-o plugin_send() return values - prevent filling the stack
-o fragmenation, fragment reassembly, framing
+FRAGMENTATION, FRAGMENT REASSEMBLY
+Some plugins may offer only a lower MTU. Therefore, fragmentation and
+fragment reassembly have been implemented in the daemon. They are
+used when a packet larger than the indicated plugin MTU should be
+sent. For sending, a fragment header, struct frag_hdr defined in
+mtund.c, is prepended before each fragment. As the plugin indicates
+how much data was consumed, it is also possible to support plugins
+where the MTU varies from packet to packet. A similar problem,
+delimiting complete messages in a byte stream, is solved by framing
+for the TCP plugin.
 
-TUN(4) INTERFACE (OR SOMETHING ELSE?)
-My original idea, as described in the proposal is to use the tun(4)
-interface. It gives a virtual network interface (point-to-point).
-Whenever a packet comes to it, it is passed into the userspace and one
-can write into it packets from the userspace to produce outgoing
-packets. Packets start with the IP header. Hence, using it is rather
-painless. On the other hand, it is a proper network interface so that
-one can assign IP addresses to it and add entries into the routing table
-for that interface. An additional benefit is that it is implemented on
-several different OSes, allowing for easy portability. Also, I already
-know how to use it and hence getting started with it is easy for me.
 
-For using the tun interface, my idea was to set up the routing table in
-a way that all traffic would be routed via this interface except for
-traffic to the tunnel endpoint, i.e. don't try to send the encapsulated
-traffic via the tunnel. The tunnel endpoint could then do NAT or
-routing, depending on how many public IP addresses it would have
-available. This approach would basically be at the IP layer, so one
-could not easily say that some ports should go via the tunnel while
-others could not. Or maybe this would be doable with one of the
-firewalls available on FreeBSD?
+The reassembly then happens on the receiving end. If not all
+fragments are received within a given time, the reassembly buffer is
+discarded. After reassembling the complete packet, it is passed to
+the tun interface.
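+
+A simplified sketch of what such a header needs to carry; the
+authoritative struct frag_hdr, with the real field names and widths,
+is the one in mtund.c:
+
+        #include <sys/cdefs.h>
+        #include <stdint.h>
+
+        struct frag_hdr {
+                uint16_t packet_id;     /* packet being reassembled */
+                uint8_t  frag_no;       /* position within packet */
+                uint8_t  more_frags;    /* 0 on the last fragment */
+        } __packed;                     /* no padding on the wire */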
 
-One problem with IPv6 and tun(4) is that by default it tags packets as
-being IPv4. However, at the moment IPv4 is the main goal and IPv6
-support is left for later. To have IPv6 working over it, one has to
-disable the the IFF_NO_PI flag and prefix each packet with a 4-byte
-struct specifying the type of traffic (ETH_P_IPV6). These are the flag
-constants on linux, as I am currently writing code on linux. On
-FreeBSD, the flag seems to be TUNSLMODE. Another approach would be to
-change the kernel code to look at the first byte of the packet (the
-version field for both IPv4 and IPv6) and determine from that if it's
-IPv4 or IPv6. On FreeBSD 6.1, the hardcoded value is set on line 865
-in file /sys/net/if_tun.c.
+Note that fragmentation and fragment reassembly happen in the daemon
+rather than in the plugins, and hence any plugin can use them
+automatically.
 
-Max has mentioned netgraph as an alternative for tun. However, it has
-been decided that tun(4) would be used rather than netgraph.
+DIRECT/POLLING PLUGINS
+In general, there are two types of plugins: direct plugins and
+polling plugins.
 
-MULTIPLEXING BETWEEN DIFFERENT FILE DESCRIPTORS
-The easiest way that came to my mind was using select(2). The plugins
-of course have to register their file descriptors to be watched. The
-code will be rewritten to use libevent. This will allow to easily use
-timeouts for checking the connectivity of plugins or do other
-timeout-related things.
+For direct plugins, both the client and the server can send a packet
+whenever they wish. An example would be the TCP plugin using a TCP
+socket or a UDP plugin using a UDP socket.
 
-Another alternative would be to fork a process for each plugin or to use
-threads. It may give more flexibility to the plugins, but at the moment
-I do not see a useful advantage in this approach. The current design and
-the functions from the main daemon avaialable to the plugins pretty much
-dictate the plugin design and to some extent capabilities.
+The polling plugins use a request-reply scheme, such as ICMP echo
+request/reply or DNS query/answer. While the client can initiate a
+request at any time to send data, the server can only tunnel data in
+responses. These responses can only originate in response to requests
+and hence the server cannot send packets whenever it wants. To tackle
+this problem, the plugin queues one packet at a time and sends it when
+a response is generated. Actually, there are two one-packet
+queues. One is for normal data, such as packets from the tun device
+or ping requests. The other, called urgent, is for replies generated
+in response to received traffic; these are ping replies and client ID
+offers, and they have priority over normal data. To distinguish
+between the different types of plugins, the return value of the
+plugin_is_ready_to_send() function indicates whether the data would
+be sent immediately, queued, or whether the queue is full.
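+
+For illustration, the return values could look like this (the names
+and values are made up):
+
+        /* possible plugin_is_ready_to_send() return values (sketch) */
+        #define READY_SEND_NOW   0      /* direct: data goes out now */
+        #define READY_QUEUED     1      /* polling: wait for request */
+        #define READY_QUEUE_FULL 2      /* one-packet queue occupied */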
 
-CHECKING UNRELIABLE CONNECTIONS
-One thing that is missing in the code I wrote for the application is
-checking whether a udp connection (still) works. For tcp, one can use
-the return value of the write(2) call. For unreliable protocols like
-udp, icmp or ip, the read call does not indicate that something went
-wrong. In order to detect this problem I was thinking of exchanging some
-regular keep-alive traffic, i.e. regularly sending an echo request on
-the application level and expecting an echo reply within a time
-interval. If that fails N times, the connection is declared
-malfunctioning.
+If a polling plugin is used, the monitoring of the tun device by
+libevent is disabled. When a response should be sent, the plugin uses
+the report_plugin() function with the REPORT_READY_TO_SEND flag to
+indicate that it can send a packet. The daemon then checks whether
+any fragments are pending; if none are, a read on the tun interface
+is attempted. Note that the queue is still needed to originate ping
+requests on the server, as the daemon does not queue them but expects
+the plugin to do so. Using the "urgent" queue for replies is just a
+technical issue to simplify the plugins.
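+
+A sketch of the response path in a polling plugin on the server; the
+queue variables, send_response() and the exact ordering are
+hypothetical, while report_plugin() and REPORT_READY_TO_SEND are from
+the text above (the report_plugin() signature is simplified):
+
+        /* one-packet queues, urgent first (ping replies, ID offers) */
+        static char urgent_buf[2048], data_buf[2048];
+        static int  urgent_len, data_len;
+
+        /* a request just arrived; exactly one response may go out */
+        static void
+        handle_poll_request(void)
+        {
+                if (urgent_len > 0)
+                        send_response(urgent_buf, urgent_len);
+                else if (data_len > 0)
+                        send_response(data_buf, data_len);
+                else
+                        /* nothing queued: ask the daemon to try a
+                         * read on the tun device */
+                        report_plugin(REPORT_READY_TO_SEND);
+        }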
 
-For the implementation, things should get easier with the use of
-libevent. This would easily allow signalling timeouts. One issue is
-that I might receive the timer signal at an inappropriate time and
-would have to properly protect shared variables. This has to be
-checked. In general, synchronization and multi-threaded safety should
-be considered.
+Upon receiving a response, the plugin on the client immediately
+generates a new request. If no data is available, it sends an empty
+request. The reason is that the server could have more data queued
+and be waiting for another response to send it. In addition, the
+client sends regular requests, possibly empty, to decrease the
+latency in case traffic becomes available on the tun device on the
+server while the client has no data to send at the moment.
 
-Another way would be to use select(2) with a struct timeval *timeout
-(5th argument), which might be easier.
+The report_plugin() function is also used to indicate various errors
+and failures of a plugin.
 
-For this part, having plugins written as threads/processes might give
-them more flexibility. However, I do not think that such flexibility is
-needed. At least not at the moment.
 
-REDIRECTING ONLY CERTAIN PORTS
-We might want to allow for redirecting only certain ports via the
-tunnel while leaving others to go directly. One way to achieve this
-would be to use pf anchors. Examples can be found in usr.sbin/authpf
-in base or ftp/ftp-proxy in ports. Currently, this is consider an
-optional feature and hence left for later,
+PLUGINS
 
-MULTI USER SUPPORT ON THE SERVER
-The server shall support multiple users concurrently. This is an
-important feature and the design has to be changed appropriately to
-accomodate for it.
+UDP PLUGIN
+The UDP plugin is a direct plugin using a UDP socket for the encapsulation.
 
-At the moment, it is unclear how the multi user support could be
-achieved, but some session management, possibly with some
-atuhentication/handshake out-of-band. Some encapsulations such as UDP
-and TCP offer port and addresses for identifiyng the sessions/clients
-while for others (ICMP) the session may have to be signalled
-in-band. In particular, with the former a separate file descriptor
-represents each client, making things easier.
+UDP CATCHALL PLUGIN
+The UDP CATCHALL plugin uses a raw IP socket to receive unclaimed
+UDP traffic, i.e., it listens on all otherwise unused ports. A kernel
+patch is provided to allow this.
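+
+A raw IP socket for UDP can be opened as sketched below; with a stock
+kernel the UDP stack claims the traffic first, which is what the
+provided kernel patch changes (function name made up):
+
+        #include <sys/socket.h>
+        #include <netinet/in.h>
+
+        /* raw socket seeing UDP datagrams including the IP header;
+         * requires root privileges */
+        int
+        open_udp_catchall(void)
+        {
+                return (socket(AF_INET, SOCK_RAW, IPPROTO_UDP));
+        }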
 
-Maybe a separate tun(4) interface could be created for each user on
-the server and the burden of multiplexing between them and assigning
-traffic to the right tun interface (and hence mtund instance) would be
-the responsibility of the kernel by using entries in the routing table
-or tracking connections when natting.
+TCP PLUGIN
+The TCP plugin is a direct plugin using a TCP socket for the
+encapsulation. In addition, a patch for the kernel is provided to
+allow a TCP socket to listen on all unused ports.
 
-AUTODETECTION
-I'm not sure what/how could/should be autodetected. I guess some proxy
-information could be taken from environment variables,
-firefox/opera/konqueror configuration. Default gateway and other routing
-informaiton could be taken from the system's routing tables. Maybe some
-information about socks proxies could be taken from Dante's SOCKS config
-file. Possible name servers could be taken from /etc/resolv.conf to use
-for dns tunelling. Maybe the default gateway should be probed for
-offering dns, http proxying,...
+ICMP PLUGIN
+The ICMP plugin is a polling plugin using ICMP echo request/response
+exchanges.
 
-QUEUING
-The current implementation with blocking I/O does queuing of one
-packet. If this approach turns out to be problematic, different
-queuing strategies would have to be investigated.
+DNS PLUGIN
+The DNS plugin is a polling plugin using DNS queries/answers. For
+the DNS encoding/decoding, code from the iodine project is used.
 
-FRAGMENTATION
-What if the encapsulation provides a smaler MTU? This might be the
-case for DNS tunnelling. We should then probably fragment packets and
-reassemble fragments on the other end. For this, in-band signalling
-might be needed.
+THINGS LEFT TO DO:
 
-In additiona, for TCP we have to do STREAM <-> MSG dissection, for MSG
-based protocols we have to figure out MSS (probably without the help
-of ICMP) and fragment accordingly.
+HTTP PLUGIN
+Reading httptunnel sources is a good starting point.
 
-Max thinks the absolute minimum we should provide is a MTU of 1300
-(1280 + 20) which will allow to run a gif tunnel over the mtund tunnel
-without the need for IPv4 fragmentation for the gif tunnel.
+CONFIG FILE
+Currently, the config options are specified with #defines. A parser
+for the config needs to be written; lex/yacc is a good candidate
+here. The plugin-specific parts of the config file may be parsed by
+the plugins themselves. This would keep the daemon independent of
+the plugins.
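+
+The format is still open; purely as an illustration, a config file
+might look like this (syntax made up):
+
+        # mtund.conf (hypothetical syntax)
+        mode    client
+        server  tunnel.example.org
+        plugins udp tcp icmp dns
+
+        # plugin-specific part, handed to the DNS plugin for parsing
+        plugin dns {
+                domain  t.example.org
+        }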
 
 CRYPTO
 The easiest way to secure the tunnel would be to put IPSec on the tun
-interface. Other options would likely not be investigated, but
-nevertheless are descibed in this document.
+interface. However, this would not secure the control traffic. With
+a symmetric key installed on both the client and the server, the
+traffic could be encrypted with Blowfish, AES or another symmetric
+cipher.
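+
+As an illustration, with OpenSSL's EVP interface and a pre-shared
+key this could look as follows; the function and its integration are
+hypothetical, error handling is omitted:
+
+        #include <openssl/evp.h>
+
+        /* encrypt one payload with Blowfish-CBC; out needs room for
+         * inlen plus one cipher block of expansion */
+        int
+        encrypt_payload(const unsigned char *key, const unsigned char *iv,
+            const unsigned char *in, int inlen, unsigned char *out)
+        {
+                EVP_CIPHER_CTX ctx;
+                int outlen, tmplen;
+
+                EVP_CIPHER_CTX_init(&ctx);
+                EVP_EncryptInit_ex(&ctx, EVP_bf_cbc(), NULL, key, iv);
+                EVP_EncryptUpdate(&ctx, out, &outlen, in, inlen);
+                EVP_EncryptFinal_ex(&ctx, out + outlen, &tmplen);
+                EVP_CIPHER_CTX_cleanup(&ctx);
+                return (outlen + tmplen);
+        }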
 
-Offering basic encryption support should be easy. Putting a symmetric
-key onto both, the client and the server, encapsulated payload could
-be encrypted with blowfish, aes or another symmetric cipher.
+REDIRECTING ONLY CERTAIN PORTS
+We might want to allow for redirecting only certain ports via the
+tunnel while leaving others to go directly. One way to achieve this
+would be to use pf anchors. Examples can be found in usr.sbin/authpf
+in base or ftp/ftp-proxy in ports. Currently, this is considered an
+optional feature and hence left for later.
 
-Adding authentication might be harder and I'm not sure it's a high
-priority.
+MAN PAGE
+Depends on the config file parsing.
 
-CONFIG FILE
-What should be configurable? What should the config file then look
-like?  The config file would be plain text file. It's format and
-contents will be determined later.
+PORT SKELETON
+Depends on the config file parsing.
 
-PROJECT SCHEDULE
-I have put a rough estimate of a schedule in the proposal:
-* core tunnel daemon, config file parsing, plugin interface - 2 weeks
-* probing/checking strategy for unreliable protocols (UDP, ICMP)
-  - 1 week
-* TCP, UDP plugins - 1 week
-* ICMP plugin - 1 week
-* HTTP plugin - 1 week
-* DNS plugin - 1 week
-* SSH plugin - 1 week
-* man pages, ports Makefile - 1 week
+MTU PROBING
+The ping mechanism in the plugins can be used to probe the maximal
+MTU that passes the firewall.
 
-However, if things go well, most items in the schedule should be done
-faster. But at the moment, it seems hard to me to predict the amount of
-times the various items would take.
+ICMP PLUGIN PROBING
+The ICMP plugin should probe whether the ICMP request/response
+exchanges are needed or whether the firewall would pass through more
+responses per request,...


