path: root/sys/net/rtable.c
Age  Author  Commit message
2016-09-07  Martin Pieuchot
Rename rtable_mpath_next() into rtable_iterate() and make it do a proper reference count. rtable_iterate() frees the passed ``rt'' and returns the next one on the multipath list or NULL if there's none.
ok dlg@
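The reference handling described in this message can be sketched outside the kernel. The following is only an illustration under stated assumptions: `struct toy_rt`, `toy_ref()`, `toy_free()` and `toy_iterate()` are hypothetical names invented here, not the kernel's types or functions, and a plain `next` pointer stands in for the real multipath list.

```c
#include <stddef.h>

/* Hypothetical stand-in for a route entry; not the kernel's struct rtentry. */
struct toy_rt {
	struct toy_rt	*next;		/* next entry on the multipath list */
	int		 refcnt;	/* references currently held */
};

/* Take a reference on rt. */
void
toy_ref(struct toy_rt *rt)
{
	rt->refcnt++;
}

/* Drop a reference on rt. A real implementation would free at zero. */
void
toy_free(struct toy_rt *rt)
{
	rt->refcnt--;
}

/*
 * Release the caller's reference on rt0 and return the next entry with
 * a fresh reference, or NULL at the end of the list.
 */
struct toy_rt *
toy_iterate(struct toy_rt *rt0)
{
	struct toy_rt *rt = rt0->next;

	if (rt != NULL)
		toy_ref(rt);	/* reference the successor first */
	toy_free(rt0);		/* then drop the entry handed in */
	return rt;
}
```

A caller holding a reference on the first entry can loop with `rt = toy_iterate(rt)` until NULL and never needs a separate release call for the entries it has passed.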
2016-08-30  Jonathan Matthew
use a per-table rwlock to serialize ART updates and walks, rather than taking the kernel lock.
ok mpi@ dlg@
2016-07-19  Martin Pieuchot
Revert use of the _SAFE version of SRPL_FOREACH() now that the offending function has been fixed.
Functions passed to rtable_walk() must return EAGAIN if they delete an entry from the tree, no matter if it is a leaf or not.
2016-07-04  Martin Pieuchot
Use the _SAFE_ version of SRPL_FOREACH() in rtable_walk_helper() to prevent an off-by-one when removing entries from the mpath list.
Fix a regression introduced by the refactoring needed to serialize rtable_walk() with create/delete.
ok jca@
2016-06-22  David Gwynne
rework art_walk so it will behave in an mpsafe world.
art_walk now explicitly takes the same lock used to serialise changes made via rtable_insert and _delete, so it can safely adjust the refcnts on tables while it recurses into them. they need to still exist when returning out of the recursion.
it uses srps to access nodes and drops the lock before calling the callback function. this is because some callbacks sleep (eg, copyout in the sysctl code that dumps an rtable to userland), which you shouldnt hold a lock across. other callbacks attempt to modify the rtable (eg, marking routes as down when the interface theyre on goes down), which tries to take the lock again, which probably wont work in the future.
ok jmatthew@ mpi@
2016-06-14  Jonathan Matthew
Convert the links between art data structures used during lookups into srps.
art_lookup and art_match now return an active srp_ref, which the caller must leave when it's done with the returned route (if any). This allows lookups to be done without holding any locks.
The art_table and art_node garbage collectors are still responsible for freeing items removed from the routing table, so they now use srp_finalize to wait out any active references, and updates are done using srp_swap operations.
ok dlg@ mpi@
2016-06-07  Ted Unangst
per trending style, add continue to empty loops.
ok mglocker
2016-06-01  David Gwynne
shuffle the code in rtable_insert so it inserts a populated art_node.
this makes the node usable as soon as it is in the tree, rather than after it inserts the rtentry on the node.
ok mpi@
2016-06-01  David Gwynne
rtref and rtfree around moving the rt in rtable_mpath_reprio so the list operations cant drop the refcount to 0.
ok mpi@
2016-06-01  David Gwynne
move all the art_node initialisation to art_get in art.c
ok mpi@
2016-05-18  David Gwynne
rework the srp api so it takes an srp_ref struct that the caller provides.
the srp_ref struct is used to track the location of the callers hazard pointer so later calls to srp_follow and srp_enter already know what to clear. this in turn means most of the caveats around using srps go away. specifically, you can now:
- switch cpus while holding an srp ref
  - ie, you can sleep while holding an srp ref
- you can take and release srp refs in any order
the original intent was to simplify use of the api when dealing with complicated data structures. the caller now no longer has to track the location of the srp a value was fetched from, the srp_ref effectively does that for you.
srp lists have been refactored to use srp_refs instead of srpl_iter structs.
this is in preparation of using srps inside the ART code. ART is a complicated data structure, and lookups require overlapping holds of srp references.
ok mpi@ jmatthew@
2016-05-02  Jonathan Matthew
Simplify life for routing table implementations by requiring that rtable_walk callbacks return EAGAIN if they modify the routing table. While we're here, simplify life for rtable_walk callers by moving the loop that restarts the walk on EAGAIN into rtable_walk itself.
Flushing cloned routes on interface state changes becomes a bit more inefficient, but this can be improved later.
ok mpi@ dlg@
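The restart-on-EAGAIN contract described in this message can be illustrated with a small userland sketch. Everything here is an assumption made for illustration: `struct toy_table`, `toy_walk_once()`, `toy_walk()` and `toy_purge_negative()` are hypothetical names, and a fixed array stands in for the routing table. Only the use of EAGAIN as the "table was modified" signal comes from the commit message.

```c
#include <errno.h>
#include <stddef.h>

#define TOY_NENTRIES 8

/* Hypothetical table: a fixed array of slots, 0 meaning "empty". */
struct toy_table {
	int	entries[TOY_NENTRIES];
};

typedef int (*toy_walk_fn)(struct toy_table *, size_t, void *);

/* One pass over the table; stops at the first non-zero return. */
int
toy_walk_once(struct toy_table *t, toy_walk_fn fn, void *arg)
{
	size_t i;
	int error;

	for (i = 0; i < TOY_NENTRIES; i++) {
		if (t->entries[i] == 0)
			continue;
		error = fn(t, i, arg);
		if (error != 0)
			return error;
	}
	return 0;
}

/* Restart the pass for as long as the callback reports EAGAIN. */
int
toy_walk(struct toy_table *t, toy_walk_fn fn, void *arg)
{
	int error;

	while ((error = toy_walk_once(t, fn, arg)) == EAGAIN)
		continue;
	return error;
}

/* Example callback: delete negative entries and ask for a restart. */
int
toy_purge_negative(struct toy_table *t, size_t i, void *arg)
{
	int *deleted = arg;

	if (t->entries[i] < 0) {
		t->entries[i] = 0;
		(*deleted)++;
		return EAGAIN;	/* the table changed under the walker */
	}
	return 0;
}
```

Each deletion costs a full restart, which matches the message's remark that flushing many entries becomes less efficient under this contract.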
2016-04-13  Martin Pieuchot
Keep all pools in the same place.
ok jmatthew@
2016-02-24  Martin Pieuchot
Fix ECMP routing by passing the correct destination address to the hash routine.
Bug reported and fix analysed by Jean-Daniel Dupas <jddupas AT xooloo DOT net>
ok deraadt@
2016-01-18  Martin Pieuchot
Pass the address length to art_alloc() and remove the hack abusing the offset of the address in the sockaddr to initialize the stride lengths.
2016-01-18  Martin Pieuchot
Stop storing a backpointer to the corresponding ART node in each route entry.
This pointer hasn't been used for some time and without it no external reference count is needed to turn art_lookup() mpsafe.
2015-12-21  Martin Pieuchot
Pass the destination and mask to rtable_mpath_reprio() in order to not use ``rt_node'' with ART.
2015-12-16  Martin Pieuchot
Merge rtable_mpath_select() into rtable_match().
This allows us to get rid of one more "rt_node" usage with ART.
ok jmatthew@
2015-12-15  Martin Pieuchot
Do not panic when trying to delete a non-existing route with ART.
Reported by bluhm@, ok jmatthew@
2015-12-04  Martin Pieuchot
Move the KERNEL_LOCK from rt_match() to rtable_match().
ok claudio@
2015-12-03  Martin Pieuchot
Get rid of rt_mask() and stop allocating a "struct sockaddr" for every route entry in ART.
rt_plen() now represents the prefix length of a route entry and should be used instead.
For now use a "struct sockaddr_in6" to represent the mask when needed; this should then be replaced by the prefix length and RTA_NETMASK only used for compatibility with userland.
ok claudio@
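As an illustration of why a prefix length can stand in for a stored mask, the mask can be rebuilt from the prefix length on demand. This is a minimal sketch under stated assumptions: `toy_plen2mask()` is a hypothetical helper invented here, and the 16-byte buffer was chosen only because it is large enough for an IPv6 address. It is not the kernel's conversion routine.

```c
#include <stdint.h>
#include <string.h>

/*
 * Fill a 16-byte mask with plen leading one bits.
 * Values above 128 are clamped to 128.
 */
void
toy_plen2mask(uint8_t plen, uint8_t mask[16])
{
	size_t i;

	if (plen > 128)
		plen = 128;
	memset(mask, 0, 16);
	for (i = 0; i < plen / 8u; i++)
		mask[i] = 0xff;		/* whole bytes covered by the prefix */
	if (plen % 8)
		mask[i] = (uint8_t)(0xff00 >> (plen % 8));	/* partial byte */
}
```

For example, a prefix length of 20 yields the bytes `ff ff f0` followed by zeroes.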
2015-12-02  Alexander Bluhm
rtable_delete() does not use its prio parameter, so delete it.
OK mpi@
2015-12-02  Martin Pieuchot
Respect priorities when inserting routes to the same destination in ART.
2015-12-02  Martin Pieuchot
Move multipath Hash-Threshold selection mechanism inside rtable_match().
This will help with unlocking the routing table and will prevent further mistakes by keeping the multipath logic inside the rtable_* API.
ok dlg@, claudio@
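For readers unfamiliar with the term, hash-threshold selection splits the hash space into equally sized regions, one per path, and picks the path whose region contains the flow's hash. The sketch below is only an illustration of that idea under stated assumptions: `toy_hash_threshold()` is a hypothetical function, and the 16-bit hash width is an assumption, not something the commit message states.

```c
#include <stdint.h>

/*
 * Pick one of npaths equal-cost paths for a 16-bit flow hash by
 * splitting the hash space into npaths regions of equal size.
 * Returns an index in the range 0..npaths-1.
 */
unsigned int
toy_hash_threshold(uint16_t hash, unsigned int npaths)
{
	unsigned int threshold;

	if (npaths <= 1)
		return 0;
	threshold = (0xffff / npaths) + 1;	/* size of one region */
	return hash / threshold;
}
```

Because the choice depends only on the hash and the number of paths, packets of the same flow keep using the same path as long as the set of paths is unchanged.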
2015-11-29  Martin Pieuchot
Convert the simple list of multipath route entries used by ART kernels to a SRP list.
This turns the rtable_* layer mpsafe. We now only need to protect the ART implementation itself.
Note that route(8) regress tests will now fail due to a supplementary reference taken by the SRPL_INIT(9) API.
ok dlg@
2015-11-27  Martin Pieuchot
Document that routing table heads are never freed as suggested by dlg@ and kill rtable_put() because we're not going to use it.
The overhead of keeping a "struct art_root/radix_node_head" around is very small compared to the added complexity needed to reference count such structures.
2015-11-27  Martin Pieuchot
Protect the growth of the routing table arrays used by rtable_get() with SRPs.
This is a simplified version of the dynamically sizeable array of pointers used by if_get() because routing table heads are never freed.
ok dlg@
2015-11-24  Martin Pieuchot
Provide art_free(), a method to release unused routing table heads.
While here initialize pools in art_init().
2015-11-10  Martin Pieuchot
Allocate ART table's heap independently from the structure and use pool(9) to not waste most of the memory allocated.
This reduces the memory overhead of our ART routing table from 80M to 70M compared to the existing radix-tree when loading ~550K IPv4 routes.
ART can now be used for huge tables without exhausting malloc(9)'s limit.
claudio@ agrees with the direction, inputs from and ok dlg@
2015-11-09  Martin Pieuchot
Do not leave dangling pointers in the ART tree in case of memory exhaustion.
Reported by benno@ and found thanks to his bgpd(8) test setup.
2015-11-06  Martin Pieuchot
Rename rt_mpath_next() into rtable_mpath_next() and provide an implementation for ART based on the singly-linked list of route entries.
2015-11-06  Martin Pieuchot
Use a SLIST instead of a LIST for MPATH route entries with ART.
2015-11-06  Martin Pieuchot
In ART separate the MPATH delete case to properly recover if art_delete() does not find a matching node.
This currently never happens because we always do a route lookup before calling rtable_delete(). Yes this is odd & due to the way multipath is implemented in the radix tree.
2015-11-04  Martin Pieuchot
Initialize the correct variable in ART's rtable_match().
2015-11-04  Martin Pieuchot
Some tweaks to build the rtable API and backends in userland.
Needed by the regression tests.
2015-11-04  Martin Pieuchot
Call rtable_put(), a stub for now, before leaving a function that called rtable_get().
2015-11-02  Martin Pieuchot
Merge rtable_mpath_match() into rtable_lookup().
ok bluhm@
2015-10-25  Martin Pieuchot
Merge rtable_mpath_conflict() into rtable_insert().
ok claudio@
2015-10-22  Martin Pieuchot
Use only one refcounting mechanism for route entries.
ok bluhm@, dlg@, claudio@
2015-10-21  Martin Pieuchot
Return the correct error code when a table already exists.
2015-10-14  Martin Pieuchot
Rewrite the logic around the dynamic array of routing tables to help turning rtable_get(9) MP-safe.
Use only one per-AF array, as suggested by claudio@, pointing to an array of pointers to the routing table heads.
Routing tables are now allocated/initialized per-AF. This will let us allocate routing tables on-demand instead of always having an AF_INET, AF_MPLS and AF_INET6 table as soon as a new rtableID is used.
This also gets rid of the "void ***" madness.
ok dlg@, jmatthew@
2015-10-07  Martin Pieuchot
Make rtable_get() private to ensure it won't be used outside of net/rtable.c. This will ease the introduction of rtable_put().
Routing tables are mapped to a tuple (idx, af) so the public API should as much as possible require these two keys.
ok dlg@
2015-10-07  Martin Pieuchot
Initialize the routing table before domains.
The routing table is not an optional component of the network stack and initializing it inside the "routing domain" requires some ugly introspection in the domain interface.
This puts the rtable* layer at the same level as the if* level. These two subsystems are organized around the two global data structures used in the network stack:
- the global &ifnet list, to be used in process context only, and
- the routing table which can be read in interrupt context.
This change makes the rtable_* layer domain-aware and extends the "struct domain" such that INET, INET6 and MPLS can specify the length of the binary key used in lookups. This allows us to keep, or move towards, AF-free route and rtable layers.
While here stop the madness and pass the size of the maximum key length in *bytes* to rn_inithead0().
ok claudio@, mikeb@
2015-10-07  Martin Pieuchot
Move the reference counting of a newly created route entry inside rtable_insert().
inputs and ok bluhm@
2015-09-28  Martin Pieuchot
Use the radix-tree API instead of function pointers.
2015-09-28  Martin Pieuchot
Factor out the route hashing code to implement Equal-Cost Multi-Path for ART.
While here sync the two remaining mix() macros.
ok chris@, dlg@
2015-09-12  Martin Pieuchot
Use rtref(9) in rtable_match() before returning a route entry.
ok bluhm@, claudio@
2015-09-11  Martin Pieuchot
Introduce rtref(9) and use it in rtable_lookup() before returning a route entry.
ok bluhm@, claudio@
2015-09-04  Martin Pieuchot
Make every subsystem using a radix tree call rn_init() and pass the length of the key as argument.
This way every consumer of the radix tree has a chance to explicitly initialize the shared data structures and no longer rely on another subsystem to do the initialization.
As a bonus ``dom_maxrtkey'' is no longer used and dies.
ART kernels should now be fully usable because pf(4) and IPSEC properly initialize the radix tree.
ok chris@, reyk@
2015-08-20  Martin Pieuchot
Make ART internals free of 'struct sockaddr'.
Keep route entry/BSD compatibility goo in the rtable layer. The way addresses and masks (prefix-lengths) are encoded is really tied to the radix-tree implementation.
Since we decided to no longer support non-contiguous masks, we could get rid of some extra "sockaddr" allocations and reduce the memory growth related to the use of a multibit-trie.