aboutsummaryrefslogtreecommitdiffstats
path: root/net/ipv4/fib_frontend.c
Commit message (Collapse)AuthorAgeFilesLines
* net: restore ip source validationJamal Hadi Salim2009-12-251-0/+2
| | | | | | | | | | | | | | | | when using policy routing and the skb mark: there are cases where a back path validation requires us to use a different routing table for src ip validation than the one used for mapping ingress dst ip. One such a case is transparent proxying where we pretend to be the destination system and therefore the local table is used for incoming packets but possibly a main table would be used on outbound. Make the default behavior to allow the above and if users need to turn on the symmetry via sysctl src_valid_mark Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca> Signed-off-by: David S. Miller <davem@davemloft.net>
* ipv4 05/05: add sysctl to accept packets with local source addressesPatrick McHardy2009-12-031-4/+7
| | | | | | | | | | | | | | | | | commit 8ec1e0ebe26087bfc5c0394ada5feb5758014fc8 Author: Patrick McHardy <kaber@trash.net> Date: Thu Dec 3 12:16:35 2009 +0100 ipv4: add sysctl to accept packets with local source addresses Change fib_validate_source() to accept packets with a local source address when the "accept_local" sysctl is set for the incoming inet device. Combined with the previous patches, this allows to communicate between multiple local interfaces over the wire. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: NETDEV_UNREGISTER_PERNET -> NETDEV_UNREGISTER_BATCHEric W. Biederman2009-12-011-1/+3
| | | | | | | | | | | | | | | | | | | | | The motivation for an additional notifier in batched netdevice notification (rt_do_flush) only needs to be called once per batch not once per namespace. For further batching improvements I need a guarantee that the netdevices are unregistered in order allowing me to unregister an all of the network devices in a network namespace at the same time with the guarantee that the loopback device is really and truly unregistered last. Additionally it appears that we moved the route cache flush after the final synchronize_net, which seems wrong and there was no explanation. So I have restored the original location of the final synchronize_net. Cc: Octavian Purdila <opurdila@ixiacom.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* ipv4: factorize cache clearing for batched unregister operationsOctavian Purdila2009-11-181-5/+6
| | | | | Signed-off-by: Octavian Purdila <opurdila@ixiacom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch 'master' of ↵David S. Miller2009-11-061-1/+4
|\ | | | | | | | | | | | | | | | | | | | | | | master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/usb/cdc_ether.c All CDC ethernet devices of type USB_CLASS_COMM need to use '&mbm_info'. Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: Fix RPF to work with policy routingjamal2009-10-291-1/+4
| | | | | | | | | | | | | | | | Policy routing is not looked up by mark on reverse path filtering. This fixes it. Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca> Signed-off-by: David S. Miller <davem@davemloft.net>
* | ipv4: fib table algorithm performance improvementStephen Hemminger2009-10-051-13/+13
|/ | | | | | | | | | The FIB algorithim for IPV4 is set at compile time, but kernel goes through the overhead of function call indirection at runtime. Save some cycles by turning the indirect calls to direct calls to either hash or trie code. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* ipv4: cleanup: remove unnecessary include.Rami Rosen2009-05-181-1/+0
| | | | | | | | There is no need for net/icmp.h header in net/ipv4/fib_frontend.c. This patch removes the #include net/icmp.h from it. Signed-off-by: Rami Rosen <ramirose@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* ip: add loose reverse path filteringStephen Hemminger2009-02-221-1/+1
| | | | | | | | | | | Extend existing reverse path filter option to allow strict or loose filtering. (See http://en.wikipedia.org/wiki/Reverse_path_filtering). For compatibility with existing usage, the value 1 is chosen for strict mode and 2 for loose mode. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: clean up net/ipv4/fib_frontend.c fib_hash.c ip_gre.cJianjun Kong2008-11-031-5/+5
| | | | | Signed-off-by: Jianjun Kong <jianjun@zeuux.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* netns: add namespace parameter to rt_cache_flushDenis V. Lunev2008-07-051-8/+9
| | | | | Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: remove CVS keywordsAdrian Bunk2008-06-111-2/+0
| | | | | | | | This patch removes CVS keywords that weren't updated for a long time from comments. Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* route: Mark unused routing attributes as suchThomas Graf2008-06-031-1/+0
| | | | | | | | Also removes an unused policy entry for an attribute which is only used in kernel->user direction. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NET] NETNS: Omit sock->sk_net without CONFIG_NET_NS.YOSHIFUJI Hideaki2008-03-261-4/+4
| | | | | | | | | Introduce per-sock inlines: sock_net(), sock_net_set() and per-inet_timewait_sock inlines: twsk_net(), twsk_net_set(). Without CONFIG_NET_NS, no namespace other than &init_net exists. Let's explicitly define them to help compiler optimizations. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
* [NET] NETNS: Omit net_device->nd_net without CONFIG_NET_NS.YOSHIFUJI Hideaki2008-03-261-6/+6
| | | | | | | | Introduce per-net_device inlines: dev_net(), dev_net_set(). Without CONFIG_NET_NS, no namespace other than &init_net exists. Let's explicitly define them to help compiler optimizations. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
* [NETNS]: Lookup in FIB semantic hashes taking into account the namespace.Denis V. Lunev2008-01-311-1/+1
| | | | | | | | | | | | | The namespace is not available in the fib_sync_down_addr, add it as a parameter. Looking up a device by the pointer to it is OK. Looking up using a result from fib_trie/fib_hash table lookup is also safe. No need to fix that at all. So, just fix lookup by address and insertion to the hash table path. Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* [IPV4]: fib_sync_down rework.Denis V. Lunev2008-01-311-2/+2
| | | | | | | | | | fib_sync_down can be called with an address and with a device. In reality it is called either with address OR with a device. The codepath inside is completely different, so lets separate it into two calls for these two cases. Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* [IPV4]: Fix memory leak on error path during FIB initialization.Denis V. Lunev2008-01-311-1/+9
| | | | | | | net->ipv4.fib_table_hash is not freed when fib4_rules_init failed. Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NETNS]: Add namespace parameter to ip_dev_find.Denis V. Lunev2008-01-281-2/+2
| | | | | | | | in_dev_find() need a namespace to pass it to fib_get_table(), so add an argument. Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NETNS]: Add netns parameter to fib_select_default.Denis V. Lunev2008-01-281-2/+3
| | | | | | | | | Currently fib_select_default calls fib_get_table() with the init_net. Prepare it to provide a correct namespace to lookup default route. Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* [IPV4]: Consolidate fib_select_default.Denis V. Lunev2008-01-281-0/+14
| | | | | | | | | The difference in the implementation of the fib_select_default when CONFIG_IP_MULTIPLE_TABLES is (not) defined looks negligible. Consolidate it and place into fib_frontend.c. Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NETNS]: Pass correct namespace in fib_validate_source.Denis V. Lunev2008-01-281-2/+4
| | | | | | | | | Correct network namespace is available inside fib_validate_source. It can be obtained from the device passed in. The device is not NULL as in_device is obtained from it just above. Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NETNS]: Add netns parameter to fib_lookup.Denis V. Lunev2008-01-281-2/+2
| | | | | Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* [IPV4]: Enable use of 240/4 address space.Jan Engelhardt2008-01-281-1/+1
| | | | | | | | | | This short patch modifies the IPv4 networking to enable use of the 240.0.0.0/4 (aka "class-E") address space as propsed in the internet draft draft-fuller-240space-00.txt. Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de> Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NETNS]: Namespace stop vs 'ip r l' race.Denis V. Lunev2008-01-281-6/+1
| | | | | | | | | | | | | | | | | | | | During network namespace stop process kernel side netlink sockets belonging to a namespace should be closed. They should not prevent namespace to stop, so they do not increment namespace usage counter. Though this counter will be put during last sock_put. The raplacement of the correct netns for init_ns solves the problem only partial as socket to be stoped until proper stop is a valid netlink kernel socket and can be looked up by the user processes. This is not a problem until it resides in initial namespace (no processes inside this net), but this is not true for init_net. So, hold the referrence for a socket, remove it from lookup tables and only after that change namespace and perform a last put. Signed-off-by: Denis V. Lunev <den@openvz.org> Tested-by: Alexey Dobriyan <adobriyan@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NETNS]: Consolidate kernel netlink socket destruction.Denis V. Lunev2008-01-281-1/+1
| | | | | | | | | | Create a specific helper for netlink kernel socket disposal. This just let the code look better and provides a ground for proper disposal inside a namespace. Signed-off-by: Denis V. Lunev <den@openvz.org> Tested-by: Alexey Dobriyan <adobriyan@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NETNS]: Memory leak on network namespace stop.Denis V. Lunev2008-01-281-1/+1
| | | | | | | | | | Network namespace allocates 2 kernel netlink sockets, fibnl & rtnl. These sockets should be disposed properly, i.e. by sock_release. Plain sock_put is not enough. Signed-off-by: Denis V. Lunev <den@openvz.org> Tested-by: Alexey Dobriyan <adobriyan@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* [IPV4]: fib hash|trie initializationStephen Hemminger2008-01-281-3/+6
| | | | | | | | | Initialization of the slab cache's should be done when IP is initialized to make sure of available memory, and that code can be marked __init. Signed-off-by: Stephen Hemminger <stephen.hemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* [IPV4] FIB: printk related cleanupsStephen Hemminger2008-01-281-4/+2
| | | | | | | | | | | printk related cleanups: * Get rid of unused printk wrappers. * Make bug checks into KERN_WARNING because KERN_DEBUG gets ignored * Turn one cryptic old message into something real * Make sure all messages have KERN_XXX Signed-off-by: Stephen Hemminger <stephen.hemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NETNS]: Enable routing configuration in non-initial namespace.Denis V. Lunev2008-01-281-16/+0
| | | | | | | | | | I.e. remove the net != &init_net checks from the places, that now can handle other-than-init net namespace. Acked-by: Benjamin Thery <benjamin.thery@bull.net> Acked-by: Daniel Lezcano <dlezcano@fr.ibm.com> Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NETNS]: Replace init_net with the correct context in fib_frontend.cDenis V. Lunev2008-01-281-4/+4
| | | | | | | Acked-by: Benjamin Thery <benjamin.thery@bull.net> Acked-by: Daniel Lezcano <dlezcano@fr.ibm.com> Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NETNS]: Pass namespace through ip_rt_ioctl.Denis V. Lunev2008-01-281-4/+4
| | | | | | | | | ... up to rtentry_to_fib_config Acked-by: Benjamin Thery <benjamin.thery@bull.net> Acked-by: Daniel Lezcano <dlezcano@fr.ibm.com> Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NETNS]: Correctly fill fib_config data.Denis V. Lunev2008-01-281-13/+14
| | | | | | | Acked-by: Benjamin Thery <benjamin.thery@bull.net> Acked-by: Daniel Lezcano <dlezcano@fr.ibm.com> Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NETNS]: Provide correct namespace for fibnl netlink socket.Denis V. Lunev2008-01-281-8/+16
| | | | | | | | | | This patch makes the netlink socket to be per namespace. That allows to have each namespace its own socket for routing queries. Acked-by: Benjamin Thery <benjamin.thery@bull.net> Acked-by: Daniel Lezcano <dlezcano@fr.ibm.com> Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NETNS]: Place fib tables into netns.Denis V. Lunev2008-01-281-12/+24
| | | | | | | | | | | The preparatory work has been done. All we need is to substitute fib_table_hash with net->ipv4.fib_table_hash. Netns context is available when required. Acked-by: Benjamin Thery <benjamin.thery@bull.net> Acked-by: Daniel Lezcano <dlezcano@fr.ibm.com> Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NETNS]: Add netns to nl_info structure.Denis V. Lunev2008-01-281-0/+5
| | | | | | | | | | | nl_info is used to track the end-user destination of routing change notification. This is a natural object to hold a namespace on. Place it there and utilize the context in the appropriate places. Acked-by: Benjamin Thery <benjamin.thery@bull.net> Acked-by: Daniel Lezcano <dlezcano@fr.ibm.com> Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NETNS]: Add netns parameter to inet_(dev_)add_type.Eric W. Biederman2008-01-281-8/+10
| | | | | | | | | | | | | | | The patch extends the inet_addr_type and inet_dev_addr_type with the network namespace pointer. That allows to access the different tables relatively to the network namespace. The modification of the signature function is reported in all the callers of the inet_addr_type using the pointer to the well known init_net. Acked-by: Benjamin Thery <benjamin.thery@bull.net> Acked-by: Daniel Lezcano <dlezcano@fr.ibm.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NETNS]: Add netns parameter to fib_get_table/fib_new_table.Denis V. Lunev2008-01-281-12/+12
| | | | | | | | | | | This patch extends the fib_get_table and the fib_new_table functions with the network namespace pointer. That will allow to access the table relatively from the network namespace. Acked-by: Benjamin Thery <benjamin.thery@bull.net> Acked-by: Daniel Lezcano <dlezcano@fr.ibm.com> Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* [IPV4]: Unify access to the routing tables.Denis V. Lunev2008-01-281-17/+12
| | | | | | | | | | | | | Replace the direct pointers to local and main tables with calls to fib_get_table() with appropriate argument. This doesn't introduce additional dereferences, but makes the access to fib tables uniform in any (CONFIG_IP_MULTIPLE_TABLES) case. Acked-by: Benjamin Thery <benjamin.thery@bull.net> Acked-by: Daniel Lezcano <dlezcano@fr.ibm.com> Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NETNS]: Refactor fib initialization so it can handle multiple namespaces.Denis V. Lunev2008-01-281-8/+80
| | | | | | | | | | | | This patch makes the fib to be initialized as a subsystem for the network namespaces. The code does not handle several namespaces yet, so in case of a creation of a network namespace, the creation/initialization will not occur. Acked-by: Benjamin Thery <benjamin.thery@bull.net> Acked-by: Daniel Lezcano <dlezcano@fr.ibm.com> Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* [IPV4]: Check fib4_rules_init failure.Denis V. Lunev2008-01-281-3/+15
| | | | | | | | | | | This adds error paths into both versions of fib4_rules_init (with/without CONFIG_IP_MULTIPLE_TABLES) and returns error code to the caller. Acked-by: Benjamin Thery <benjamin.thery@bull.net> Acked-by: Daniel Lezcano <dlezcano@fr.ibm.com> Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* [IPV4] net/ipv4: Use ipv4_is_<type>Joe Perches2008-01-281-3/+3
| | | | | Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* [IPV4]: Add inet_dev_addr_type()Laszlo Attila Toth2008-01-281-2/+19
| | | | | | | | | Address type search can be limited to an interface by inet_dev_addr_type function. Signed-off-by: Laszlo Attila Toth <panther@balabit.hu> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NET]: Modify all rtnetlink methods to only work in the initial namespace (v2)Denis V. Lunev2008-01-281-0/+12
| | | | | | | | | | | | | | | | Before I can enable rtnetlink to work in all network namespaces I need to be certain that something won't break. So this patch deliberately disables all of the rtnletlink methods in everything except the initial network namespace. After the methods have been audited this extra check can be disabled. Changes from v1: - added IPv6 addrlabel protection Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* [IPV4]: OOPS with NETLINK_FIB_LOOKUP netlink socketDenis V. Lunev2007-12-211-3/+6
| | | | | | | | | | | | | | | | | | [ Regression added by changeset: cd40b7d3983c708aabe3d3008ec64ffce56d33b0 [NET]: make netlink user -> kernel interface synchronious -DaveM ] nl_fib_input re-reuses incoming skb to send the reply. This means that this packet will be freed twice, namely in: - netlink_unicast_kernel - on receive path Use clone to send as a cure, the caller is responsible for kfree_skb on error. Thanks to Alexey Dobryan, who originally found the problem. Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* [IPV4]: Compact some ifdefs in the fib code.Pavel Emelyanov2007-11-071-7/+8
| | | | | | | | | | | There are places that check for CONFIG_IP_MULTIPLE_TABLES twice in the same file, but the internals of these #ifdefs can be merged. As a side effect - remove one ifdef from inside a function. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* [IPV4]: Explicitly call fib_get_table() in fib_frontend.cPavel Emelyanov2007-10-231-5/+7
| | | | | | | | | | | In case the "multiple tables" config option is y, the ip_fib_local_table is not a variable, but a macro, that calls fib_get_table(RT_TABLE_LOCAL). Some code uses this "variable" *3* times in one place, thus implicitly making 3 calls. Fix it. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NETLINK]: fib_frontend build fixesDavid S. Miller2007-10-101-11/+5
| | | | | | | | | 1) fibnl needs to be declared outside of config ifdefs, and also should not be explicitly initialized to NULL 2) nl_fib_input() args are wrong for netlink_kernel_create() input method Signed-off-by: David S. Miller <davem@davemloft.net>
* [NET]: make netlink user -> kernel interface synchroniousDenis V. Lunev2007-10-101-3/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch make processing netlink user -> kernel messages synchronious. This change was inspired by the talk with Alexey Kuznetsov about current netlink messages processing. He says that he was badly wrong when introduced asynchronious user -> kernel communication. The call netlink_unicast is the only path to send message to the kernel netlink socket. But, unfortunately, it is also used to send data to the user. Before this change the user message has been attached to the socket queue and sk->sk_data_ready was called. The process has been blocked until all pending messages were processed. The bad thing is that this processing may occur in the arbitrary process context. This patch changes nlk->data_ready callback to get 1 skb and force packet processing right in the netlink_unicast. Kernel -> user path in netlink_unicast remains untouched. EINTR processing for in netlink_run_queue was changed. It forces rtnl_lock drop, but the process remains in the cycle until the message will be fully processed. So, there is no need to use this kludges now. Signed-off-by: Denis V. Lunev <den@openvz.org> Acked-by: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Signed-off-by: David S. Miller <davem@davemloft.net>
* [NETLINK]: Introduce nested and byteorder flag to netlink attributeThomas Graf2007-10-101-1/+1
| | | | | | | | | | | This change allows the generic attribute interface to be used within the netfilter subsystem where this flag was initially introduced. The byte-order flag is yet unused, it's intended use is to allow automatic byte order convertions for all atomic types. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>