aboutsummaryrefslogtreecommitdiffstats
path: root/net/dccp/ackvec.h
Commit message (Collapse)AuthorAgeFilesLines
* This reverts "Merge branch 'dccp' of git://eden-feed.erg.abdn.ac.uk/dccp_exp"Gerrit Renker2008-09-091-91/+113
| | | | | | as it accentally contained the wrong set of patches. These will be submitted separately. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
* dccp ccid-2: Separate option parsing from CCID processingGerrit Renker2008-09-041-0/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch replaces an almost identical replication of code: large parts of dccp_parse_options() re-appeared as ccid2_ackvector() in ccid2.c. Apart from the duplication, this caused two more problems: 1. CCIDs should not need to be concerned with parsing header options; 2. one can not assume that Ack Vectors appear as a contiguous area within an skb, it is legal to insert other options and/or padding in between. The current code would throw an error and stop reading in such a case. The patch provides a new data structure and associated list housekeeping. Only small changes were necessary to integrate with CCID-2: data structure initialisation, adapt list traversal routine, and add call to the provided cleanup routine. The latter also lead to fixing the following BUG: CCID-2 so far ignored Ack Vectors on all packets other than Ack/DataAck, which is incorrect, since Ack Vectors can be present on any packet that has an Ack field. Details: -------- * received Ack Vectors are parsed by dccp_parse_options() alone, which passes the result on to the CCID-specific routine ccid_hc_tx_parse_options(); * CCIDs interested in using/decoding Ack Vector information will add code to fetch parsed Ack Vectors via this interface; * a data structure, `struct dccp_ackvec_parsed' is provided as interface; * this structure arranges Ack Vectors of the same skb into a FIFO order; * a doubly-linked list is used to keep the required FIFO code small. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
* dccp ccid-2: Remove old infrastructureGerrit Renker2008-09-041-78/+1
| | | | | | | | | | | | | | | | This removes * functions for which updates have been provided in the preceding patches and * the @av_vec_len field - it is no longer necessary since the buffer length is now always computed dynamically; * conditional debugging code (CONFIG_IP_DCCP_ACKVEC). The reason for removing the conditional debugging code is that Ack Vectors are an almost inevitable necessity - RFC 4341 says that for CCID-2, Ack Vectors must be used. Furthermore, the code would be only interesting for coding - after some extensive testing with this patch set, having the debug code around is no longer of real help. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
* dccp ccid-2: Update code for the Ack Vector input/registration routineGerrit Renker2008-09-041-0/+9
| | | | | | | | | | | | | | | | | | | | | | This patch uupdates the code which registers new packets as received, using the new circular buffer interface. It contributes a new algorithm which * supports both tail/head pointers and buffer wrap-around and * deals with overflow (head/tail move in lock-step). The updated code is also partioned differently, into 1. dealing with the empty buffer, 2. adding new packets into non-empty buffer, 3. reserving space when encountering a `hole' in the sequence space, 4. updating old state and deciding when old state is irrelevant. Protection against large burst losses: With regard to (3), it is too costly to reserve space when there are large bursts of losses. When bursts get too large, the code does no longer reserve space and just fills in cells normally. This measure reduces space consumption by a factor of 63. The code reuses in part the previous implementation by Arnaldo de Melo. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
* dccp ccid-2: Algorithm to update buffer stateGerrit Renker2008-09-041-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This provides a routine to consistently update the buffer state when the peer acknowledges receipt of Ack Vectors; updating state in the list of Ack Vectors as well as in the circular buffer. While based on RFC 4340, several additional (and necessary) precautions were added to protect the consistency of the buffer state. These additions are essential, since analysis and experience showed that the basic algorithm was insufficient for this task (which lead to problems that were hard to debug). The algorithm now * deals with HC-sender acknowledging to HC-receiver and vice versa, * keeps track of the last unacknowledged but received seqno in tail_ackno, * has special cases to reset the overflow condition when appropriate, * is protected against receiving older information (would mess up buffer state). Note: The older code performed an unnecessary step, where the sender cleared Ack Vector state by parsing the Ack Vector received by the HC-receiver. Doing this was entirely redundant, since * the receiver always puts the full acknowledgment window (groups 2,3 in 11.4.2) into the Ack Vectors it sends; hence the HC-receiver is only interested in the highest state that the HC-sender received; * this means that the acknowledgment number on the (Data)Ack from the HC-sender is sufficient; and work done in parsing earlier state is not necessary, since the later state subsumes the earlier one (see also RFC 4340, A.4). This older interface (dccp_ackvec_parse()) is therefore removed. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
* dccp ccid-2: Implementation of circular Ack Vector buffer with overflow handlingGerrit Renker2008-09-041-3/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This completes the implementation of a circular buffer for Ack Vectors, by extending the current (linear array-based) implementation. The changes are: (a) An `overflow' flag to deal with the case of overflow. As before, dynamic growth of the buffer will not be supported; but code will be added to deal robustly with overflowing Ack Vector buffers. (b) A `tail_seqno' field. When naively implementing the algorithm of Appendix A in RFC 4340, problems arise whenever subsequent Ack Vector records overlap, which can bring the entire run length calculation completely out of synch. (This is documented on http://www.erg.abdn.ac.uk/users/gerrit/dccp/notes/\ ack_vectors/tracking_tail_ackno/ .) (c) The buffer lengthi is now computed dynamically (i.e. current fill level), as the span between head to tail. As a result, dccp_ackvec_pending() is now simpler - the #ifdef is no longer necessary since buf_empty is always true when IP_DCCP_ACKVEC is not configured. Note on overflow handling: ------------------------- The Ack Vector code previously simply started to drop packets when the Ack Vector buffer overflowed. This means that the userspace application will not be able to receive, only because of an Ack Vector storage problem. Furthermore, overflow may be transient, so that applications may later recover from the overflow. Recovering from dropped packets is more difficult (e.g. video key frames). Hence the patch uses a different policy: when the buffer overflows, the oldest entries are subsequently overwritten. This has a higher chance of recovery. Details are on http://www.erg.abdn.ac.uk/users/gerrit/dccp/notes/ack_vectors/ Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
* dccp ccid-2: Separate internals of Ack Vectors from option-parsing codeGerrit Renker2008-09-041-3/+2
| | | | | | | | | | | This patch * separates Ack Vector housekeeping code from option-insertion code; * shifts option-specific code from ackvec.c into options.c; * introduces a dedicated routine to take care of the Ack Vector records; * simplifies the dccp_ackvec_insert_avr() routine: the BUG_ON was redundant, since the list is automatically arranged in descending order of ack_seqno. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
* dccp ccid-2: Ack Vector interface clean-upGerrit Renker2008-09-041-47/+56
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch brings the Ack Vector interface up to date. Its main purpose is to lay the basis for the subsequent patches of this set, which will use the new data structure fields and routines. There are no real algorithmic changes, rather an adaptation: (1) Replaced the static Ack Vector size (2) with a #define so that it can be adapted (with low loss / Ack Ratio, a value of 1 works, so 2 seems to be sufficient for the moment) and added a solution so that computing the ECN nonce will continue to work - even with larger Ack Vectors. (2) Replaced the #defines for Ack Vector states with a complete enum. (3) Replaced #defines to compute Ack Vector length and state with general purpose routines (inlines), and updated code to use these. (4) Added a `tail' field (conversion to circular buffer in subsequent patch). (5) Updated the (outdated) documentation for Ack Vector struct. (6) All sequence number containers now trimmed to 48 bits. (7) Removal of unused bits: * removed dccpav_ack_nonce from struct dccp_ackvec, since this is already redundantly stored in the `dccpavr_ack_nonce' (of Ack Vector record); * removed Elapsed Time for Ack Vectors (it was nowhere used); * replaced semantics of dccpavr_sent_len with dccpavr_ack_runlen, since the code needs to be able to remember the old run length; * reduced the de-/allocation routines (redundant / duplicate tests). Justification for removing Elapsed Time information [can be removed]: --------------------------------------------------------------------- 1. The Elapsed Time information for Ack Vectors was nowhere used in the code. 2. DCCP does not implement rate-based pacing of acknowledgments. The only recommendation for always including Elapsed Time is in section 11.3 of RFC 4340: "Receivers that rate-pace acknowledgements SHOULD [...] include Elapsed Time options". But such is not the case here. 3. It does not really improve estimation accuracy. The Elapsed Time field only records the time between the arrival of the last acknowledgeable packet and the time the Ack Vector is sent out. Since Linux does not (yet) implement delayed Acks, the time difference will typically be small, since often the arrival of a data packet triggers sending feedback at the HC-receiver. Justification for changes in de-/allocation routines [can be removed]: ---------------------------------------------------------------------- * INIT_LIST_HEAD in dccp_ackvec_record_new was redundant, since the list pointers were later overwritten when the node was added via list_add(); * dccp_ackvec_record_new() was called in a single place only; * calls to list_del_init() before calling dccp_ackvec_record_delete() were redundant, since subsequently the entire element was k-freed; * since all calls to dccp_ackvec_record_delete() were preceded to a call to list_del_init(), the WARN_ON test would never evaluate to true; * since all calls to dccp_ackvec_record_delete() were made from within list_for_each_entry_safe(), the test for avr == NULL was redundant; * list_empty() in ackvec_free was redundant, since the same condition is embedded in the loop condition of the subsequent list_for_each_entry_safe(). Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
* dccp: Leave headroom for options when calculating the MPSGerrit Renker2008-09-041-0/+3
| | | | | | | | | | | | | | | | | | | | | | | The Maximum Packet Size (MPS) is of interest for applications which want to transfer data, so it is only relevant to the data transfer phase of a connection (unless one wants to send data on the DCCP-Request, but that is not considered here). The strategy chosen to deal with this requirement is to leave room for only such options that may appear on data packets. A special consideration applies to Ack Vectors: this is purely guesswork, since these can have any length between 3 and 1020 bytes. The strategy chosen here is to subtract a configurable minimum, the value of 16 bytes (2 bytes for type/length plus 14 Ack Vector cells) has been found by experimentatation. If people experience this as too much or too little, this could later be turned into a Kconfig option. There are currently no CCID-specific header options which may appear on data packets, hence it is not necessary to define a corresponding CCID field. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
* dccp: Set per-connection CCIDs via socket optionsGerrit Renker2008-09-041-3/+2
| | | | | | | | | | | | | | | | | | | | | | | | With this patch, TX/RX CCIDs can now be changed on a per-connection basis, which overrides the defaults set by the global sysctl variables for TX/RX CCIDs. To make full use of this facility, the remaining patches of this patch set are needed, which track dependencies and activate negotiated feature values. Note on the maximum number of CCIDs that can be registered: ----------------------------------------------------------- The maximum number of CCIDs that can be registered on the socket is constrained by the space in a Confirm/Change feature negotiation option. The space in these in turn depends on the size of header options as defined in RFC 4340, 5.8. Since this is a recurring constant, it has been moved from ackvec.h into linux/dccp.h, clarifying its purpose. Relative to this size, the maximum number of CCID identifiers that can be present in a Confirm option (which always consumes 1 byte more than a Change option, cf. 6.1) is 2 bytes less than the maximum TLV size: one for the CCID-feature-type and one for the selected value. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
* [ACKVEC]: Reduce length of identifiersGerrit Renker2008-01-281-31/+31
| | | | | | | | | | | | | | This is reduces the length of the struct ackvec/ackvec_record fields. It is a purely text-based replacement: s#dccpavr_#avr_#g; s#dccpav_#av_#g; and increases readability somewhat. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* [DCCP]: Spelling fixesJoe Perches2007-12-201-1/+1
| | | | | Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* [DCCP] ackvec: Convert to ktime_tArnaldo Carvalho de Melo2007-10-101-2/+2
| | | | | Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: David S. Miller <davem@davemloft.net>
* [DCCP] ackvec: infrastructure for sending more than one ackvec per packetAndrea Bittau2006-12-021-7/+11
| | | | | | | | | | | | | | | | Commiter note: This was split from Andrea's original patch, in the process I changed the type of the ackvec index fields to u16 instead of to int and haven't folded dccp_ackvec_parse with dccp_ackvec_check_rcv_ackno. Next patch will actually do the insertion of more than one ackvec per packet, using, initially, up to a max of 2 ackvecs as per Andrea's original patch, then I'll work on support for larger ackvecs, be it using a sysctl or using setsockopt. Signed-off-by: Andrea Bittau <a.bittau@cs.ucl.ac.uk> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
* [DCCP] ackvec: Remove unused dccpav_ack_ptr field from dccp_ackvecAndrea Bittau2006-12-021-2/+0
| | | | | | | Commiter note: original patch was splitted. Signed-off-by: Andrea Bittau <a.bittau@cs.ucl.ac.uk> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
* [DCCP]: Update documentation references.Gerrit Renker2006-10-241-2/+1
| | | | | | | | | | | | | | | | | | Updates the references to spec documents throughout the code, taking into account that * the DCCP, CCID 2, and CCID 3 drafts all became RFCs in March this year * RFC 1063 was obsoleted by RFC 1191 * draft-ietf-tcpimpl-pmtud-0x.txt was published as an Informational RFC, RFC 2923 on 2000-09-22. All references verified. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* [DCCP] ackvec: Remove unused variablesAndrea Bittau2006-09-221-3/+1
| | | | | | | | Get rid of unused variables in ackvector state. Signed-off-by: Andrea Bittau <a.bittau@cs.ucl.ac.uk> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Remove obsolete #include <linux/config.h>Jörn Engel2006-06-301-1/+0
| | | | | Signed-off-by: Jörn Engel <joern@wohnheim.fh-wedel.de> Signed-off-by: Adrian Bunk <bunk@stusta.de>
* [DCCP] ackvec: Introduce ack vector recordsAndrea Bittau2006-03-201-6/+25
| | | | | | | | Based on a patch by Andrea Bittau. Signed-off-by: Andrea Bittau <a.bittau@cs.ucl.ac.uk> Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* [DCCP] ackvec: Introduce dccp_ackvec_slabArnaldo Carvalho de Melo2006-03-201-0/+12
| | | | | Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* [DCCP] ackvec: Ditch dccpav_buf_lenArnaldo Carvalho de Melo2006-03-201-7/+3
| | | | | | | Simplifying the code a bit as we're always using DCCP_MAX_ACKVEC_LEN. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* [DCCP] ackvec: use u8 for the buf offsetsArnaldo Carvalho de Melo2006-01-041-6/+6
| | | | Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
* [PATCH] gfp flags annotations - part 1Al Viro2005-10-081-2/+2
| | | | | | | | | | | | - added typedef unsigned int __nocast gfp_t; - replaced __nocast uses for gfp flags with gfp_t - it gives exactly the same warnings as far as sparse is concerned, doesn't change generated code (from gcc point of view we replaced unsigned int with typedef) and documents what's going on far better. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [DCCP]: Move the ack vector code to net/dccp/ackvec.[ch]Arnaldo Carvalho de Melo2005-09-181-0/+133
Isolating it, that will be used when we introduce a CCID2 (TCP-Like) implementation. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>