.\" $OpenBSD: altq.9,v 1.11 2004/01/24 12:04:09 jmc Exp $ .\" .\" Copyright (C) 2001 .\" Sony Computer Science Laboratories Inc. All rights reserved. .\" .\" Redistribution and use in source and binary forms, with or without .\" modification, are permitted provided that the following conditions .\" are met: .\" 1. Redistributions of source code must retain the above copyright .\" notice, this list of conditions and the following disclaimer. .\" 2. Redistributions in binary form must reproduce the above copyright .\" notice, this list of conditions and the following disclaimer in the .\" documentation and/or other materials provided with the distribution. .\" .\" THIS SOFTWARE IS PROVIDED BY SONY CSL AND CONTRIBUTORS ``AS IS'' AND .\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE .\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE .\" ARE DISCLAIMED. IN NO EVENT SHALL SONY CSL OR CONTRIBUTORS BE LIABLE .\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL .\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS .\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) .\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT .\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY .\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF .\" SUCH DAMAGE. .\" .Dd July 10, 2001 .Dt ALTQ 9 .Os .\" .Sh NAME .Nm ALTQ .Nd kernel interfaces for manipulating output queues on network interfaces .Sh SYNOPSIS .Fd #include .Fd #include .Fd #include .Ft void \"macro .Fn IFQ_ENQUEUE "struct ifaltq *ifq" "struct mbuf *m" "struct altq_pktattr *pa" "int err" .Ft void \"macro .Fn IFQ_DEQUEUE "struct ifaltq *ifq" "struct mbuf *m" .Ft void \"macro .Fn IFQ_POLL "struct ifaltq *ifq" "struct mbuf *m" .Ft void \"macro .Fn IFQ_PURGE "struct ifaltq *ifq" .Ft void \"macro .Fn IFQ_CLASSIFY "struct ifaltq *ifq" "struct mbuf *m" "int af" "struct altq_pktattr *pktattr" .Ft void \"macro .Fn IFQ_IS_EMPTY "struct ifaltq *ifq" .Ft void \"macro .Fn IFQ_SET_MAXLEN "struct ifaltq *ifq" "int len" .Ft void \"macro .Fn IFQ_INC_LEN "struct ifaltq *ifq" .Ft void \"macro .Fn IFQ_DEC_LEN "struct ifaltq *ifq" .Ft void \"macro .Fn IFQ_INC_DROPS "struct ifaltq *ifq" .Ft void \"macro .Fn IFQ_SET_READY "struct ifaltq *ifq" .Sh DESCRIPTION The .Nm system is a framework to manage queuing disciplines on network interfaces. .Nm introduces new macros to manipulate output queues. The output queue macros are used to abstract queue operations and not to touch the internal fields of the output queue structure. The macros are independent from the .Nm implementation, and compatible with the traditional .Dv ifqueue macros for ease of transition. .Pp .Fn IFQ_ENQUEUE enqueues a packet .Fa m to the queue .Fa ifq . The underlying queuing discipline may discard the packet. .Fa err is set to 0 on success, or .Dv ENOBUFS if the packet is discarded. .Fa m will be freed by the device driver on success or by the queuing discipline on failure, so the caller should not touch .Fa m after calling .Fn IFQ_ENQUEUE . .Pp .Fn IFQ_DEQUEUE dequeues a packet from the queue. The dequeued packet is returned in .Fa m , or .Fa m is set to .Dv NULL if no packet is dequeued. The caller must always check .Fa m since a non-empty queue could return .Dv NULL under rate-limiting. .Pp .Fn IFQ_POLL returns the next packet without removing it from the queue. It is guaranteed by the underlying queuing discipline that .Fn IFQ_DEQUEUE immediately after .Fn IFQ_POLL returns the same packet. .Pp .Fn IFQ_PURGE discards all the packets in the queue. The purge operation is needed since a non-work conserving queue cannot be emptied by a dequeue loop. .Pp .Fn IFQ_CLASSIFY classifies a packet to a scheduling class, and returns the result in .Fa pktattr . .Pp .Fn IFQ_IS_EMPTY can be used to check if the queue is empty. Note that .Fn IFQ_DEQUEUE could still return .Dv NULL if the queuing discipline is non-work conserving. .Pp .Fn IFQ_SET_MAXLEN sets the queue length limit to the default FIFO queue. .Pp .Fn IFQ_INC_LEN and .Fn IFQ_DEC_LEN increment or decrement the current queue length in packets. .Pp .Fn IFQ_INC_DROPS increments the drop counter and is equal to .Fn IF_DROP . It is defined for naming consistency. .Pp .Fn IFQ_SET_READY sets a flag to indicate this driver is converted to use the new macros. .Nm can be enabled only on interfaces with this flag. .Sh COMPATIBILITY .Ss ifaltq structure In order to keep compatibility with the existing code, the new output queue structure .Dv ifaltq has the same fields. The traditional .Fn IF_XXX macros and the code directly referencing the fields within .Dv if_snd still work with .Dv ifaltq . (Once we finish conversions of all the drivers, we no longer need these fields.) .Bd -literal ##old-style## ##new-style## | struct ifqueue { | struct ifaltq { struct mbuf *ifq_head; | struct mbuf *ifq_head; struct mbuf *ifq_tail; | struct mbuf *ifq_tail; int ifq_len; | int ifq_len; int ifq_maxlen; | int ifq_maxlen; int ifq_drops; | int ifq_drops; }; | /* altq related fields */ | ...... | }; | .Ed The new structure replaces .Dv struct ifqueue in .Dv struct ifnet . .Bd -literal ##old-style## ##new-style## | struct ifnet { | struct ifnet { .... | .... | struct ifqueue if_snd; | struct ifaltq if_snd; | .... | .... }; | }; | .Ed The (simplified) new .Fn IFQ_XXX macros looks like: .Bd -literal #ifdef ALTQ #define IFQ_DEQUEUE(ifq, m) \e if (ALTQ_IS_ENABLED((ifq)) \e ALTQ_DEQUEUE((ifq), (m)); \e else \e IF_DEQUEUE((ifq), (m)); #else #define IFQ_DEQUEUE(ifq, m) IF_DEQUEUE((ifq), (m)); #endif .Ed .Ss Enqueue operation The semantics of the enqueue operation are changed. In the new style, enqueue and packet drop are combined since they cannot be easily separated in many queuing disciplines. The new enqueue operation corresponds to the following macro that is written with the old macros. .Bd -literal #define IFQ_ENQUEUE(ifq, m, pattr, err) \e do { \e if (ALTQ_IS_ENABLED((ifq))) \e ALTQ_ENQUEUE((ifq), (m), (pattr), (err)); \e else { \e if (IF_QFULL((ifq))) { \e m_freem((m)); \e (err) = ENOBUFS; \e } else { \e IF_ENQUEUE((ifq), (m)); \e (err) = 0; \e } \e } \e if ((err)) \e (ifq)->ifq_drops++; \e } while (0) .Ed .Pp .Fn IFQ_ENQUEUE does the following: .Bl -hyphen -compact .It queue a packet .It drop (and free) a packet if the enqueue operation fails .El If the enqueue operation fails, .Fa err is set to .Dv ENOBUFS . .Fa m is freed by the queuing discipline. The caller should not touch .Fa m after calling .Fn IFQ_ENQUEUE , so the caller may need to copy the .Fa m_pkthdr.len or .Fa m_flags fields beforehand for statistics. The caller should not use .Fn senderr since .Fa m was already freed. .Pp The new style .Fn if_output looks as follows: .Bd -literal ##old-style## ##new-style## | int | int ether_output(ifp, m0, dst, rt0) | ether_output(ifp, m0, dst, rt0) { | { ...... | ...... | | mflags = m->m_flags; | len = m->m_pkthdr.len; s = splimp(); | s = splimp(); if (IF_QFULL(&ifp->if_snd)) { | IFQ_ENQUEUE(&ifp->if_snd, m, | NULL, error); IF_DROP(&ifp->if_snd); | if (error != 0) { splx(s); | splx(s); senderr(ENOBUFS); | return (error); } | } IF_ENQUEUE(&ifp->if_snd, m); | ifp->if_obytes += | ifp->if_obytes += len; m->m_pkthdr.len; | if (m->m_flags & M_MCAST) | if (mflags & M_MCAST) ifp->if_omcasts++; | ifp->if_omcasts++; | if ((ifp->if_flags & IFF_OACTIVE) | if ((ifp->if_flags & IFF_OACTIVE) == 0) | == 0) (*ifp->if_start)(ifp); | (*ifp->if_start)(ifp); splx(s); | splx(s); return (error); | return (error); | bad: | bad: if (m) | if (m) m_freem(m); | m_freem(m); return (error); | return (error); } | } | .Ed .Ss Classifier The classifier mechanism is currently implemented in .Fn if_output . .Dv struct altq_pktattr is used to store the classifier result, and it is passed to the enqueue function. (We will change the method to tag the classifier result to mbuf in the future.) .Bd -literal int ether_output(ifp, m0, dst, rt0) { ...... struct altq_pktattr pktattr; ...... /* classify the packet before prepending link-headers */ IFQ_CLASSIFY(&ifp->if_snd, m, dst->sa_family, &pktattr); /* prepend link-level headers */ ...... IFQ_ENQUEUE(&ifp->if_snd, m, &pktattr, error); ...... } .Ed .Sh HOW TO CONVERT THE EXISTING DRIVERS First, make sure the corresponding .Fn if_output is already converted to the new style. .Pp Look for .Fa if_snd in the driver. Probably, you need to make changes to the lines that include .Fa if_snd . .Ss Empty check operation If the code checks .Fa ifq_head to see whether the queue is empty or not, use .Fn IFQ_IS_EMPTY . .Bd -literal ##old-style## ##new-style## | if (ifp->if_snd.ifq_head != NULL) | if (IFQ_IS_EMPTY(&ifp->if_snd) == 0) | .Ed Note that .Fn IFQ_POLL can be used for the same purpose, but .Fn IFQ_POLL could be costly for a complex scheduling algorithm since .Fn IFQ_POLL needs to run the scheduling algorithm to select the next packet. On the other hand, .Fn IFQ_IS_EMPTY checks only if there is any packet stored in the queue. Another difference is that even when .Fn IFQ_IS_EMPTY is .Dv FALSE , .Fn IFQ_DEQUEUE could still return .Dv NULL if the queue is under rate-limiting. .Ss Dequeue operation Replace .Fn IF_DEQUEUE by .Fn IFQ_DEQUEUE . Always check whether the dequeued mbuf is .Dv NULL or not. Note that even when .Fn IFQ_IS_EMPTY is .Dv FALSE , .Fn IFQ_DEQUEUE could return .Dv NULL due to rate-limiting. .Bd -literal ##old-style## ##new-style## | IF_DEQUEUE(&ifp->if_snd, m); | IFQ_DEQUEUE(&ifp->if_snd, m); | if (m == NULL) | return; | .Ed A driver is supposed to call .Fn if_start from transmission complete interrupts in order to trigger the next dequeue. .Ss Poll-and-dequeue operation If the code polls the packet at the head of the queue and actually uses the packet before dequeuing it, use .Fn IFQ_POLL and .Fn IFQ_DEQUEUE . .Bd -literal ##old-style## ##new-style## | m = ifp->if_snd.ifq_head; | IFQ_POLL(&ifp->if_snd, m); if (m != NULL) { | if (m != NULL) { | /* use m to get resources */ | /* use m to get resources */ if (something goes wrong) | if (something goes wrong) return; | return; | IF_DEQUEUE(&ifp->if_snd, m); | IFQ_DEQUEUE(&ifp->if_snd, m); | /* kick the hardware */ | /* kick the hardware */ } | } | .Ed It is guaranteed that .Fn IFQ_DEQUEUE immediately after .Fn IFQ_POLL returns the same packet. Note that they need to be guarded by .Fn splimp if called from outside of .Fn if_start . .Ss Eliminating IF_PREPEND If the code uses .Fn IF_PREPEND , you have to eliminate it since the prepend operation is not possible for many queuing disciplines. A common use of .Fn IF_PREPEND is to cancel the previous dequeue operation. You have to convert the logic into poll-and-dequeue. .Bd -literal ##old-style## ##new-style## | IF_DEQUEUE(&ifp->if_snd, m); | IFQ_POLL(&ifp->if_snd, m); if (m != NULL) { | if (m != NULL) { | if (something_goes_wrong) { | if (something_goes_wrong) { IF_PREPEND(&ifp->if_snd, m); | return; | return; } | } | | /* at this point, the driver | * is committed to send this | * packet. | */ | IFQ_DEQUEUE(&ifp->if_snd, m); | /* kick the hardware */ | /* kick the hardware */ } | } | .Ed .Ss Purge operation Use .Fn IFQ_PURGE to empty the queue. Note that a non-work conserving queue cannot be emptied by a dequeue loop. .Bd -literal ##old-style## ##new-style## | while (ifp->if_snd.ifq_head != NULL) {| IFQ_PURGE(&ifp->if_snd); IF_DEQUEUE(&ifp->if_snd, m); | m_freem(m); | } | | .Ed .Ss Attach routine Use .Fn IFQ_SET_MAXLEN to set .Fa ifq_maxlen to .Fa len . Add .Fn IFQ_SET_READY to show this driver is converted to the new style. (This is used to distinguish new-style drivers.) .Bd -literal ##old-style## ##new-style## | ifp->if_snd.ifq_maxlen = qsize; | IFQ_SET_MAXLEN(&ifp->if_snd, qsize); | IFQ_SET_READY(&ifp->if_snd); if_attach(ifp); | if_attach(ifp); | .Ed .Ss Other issues The new macros for statistics: .Bd -literal ##old-style## ##new-style## | IF_DROP(&ifp->if_snd); | IFQ_INC_DROPS(&ifp->if_snd); | ifp->if_snd.ifq_len++; | IFQ_INC_LEN(&ifp->if_snd); | ifp->if_snd.ifq_len--; | IFQ_DEC_LEN(&ifp->if_snd); | .Ed Some drivers instruct the hardware to invoke transmission complete interrupts only when it thinks necessary. Rate-limiting breaks its assumption. .Ss How to convert drivers using multiple ifqueues Some (pseudo) devices (such as slip) have another .Dv ifqueue to prioritize packets. It is possible to eliminate the second queue since .Nm provides more flexible mechanisms but the following shows how to keep the original behavior. .Bd -literal struct sl_softc { struct ifnet sc_if; /* network-visible interface */ ... struct ifqueue sc_fastq; /* interactive output queue */ ... }; .Ed The driver doesn't compile in the new model since it has the following line .Po .Fa if_snd is no longer a type of .Dv struct ifqueue .Pc . .Bd -literal struct ifqueue *ifq = &ifp->if_snd; .Ed A simple way is to use the original .Fn IF_XXX macros for .Fa sc_fastq and use the new .Fn IFQ_XXX macros for .Fa if_snd . The enqueue operation looks like: .Bd -literal ##old-style## ##new-style## | struct ifqueue *ifq = &ifp->if_snd; | struct ifqueue *ifq = NULL; | if (ip->ip_tos & IPTOS_LOWDELAY) | if ((ip->ip_tos & IPTOS_LOWDELAY) && ifq = &sc->sc_fastq; | !ALTQ_IS_ENABLED(&sc->sc_if.if_snd)) { | ifq = &sc->sc_fastq; if (IF_QFULL(ifq)) { | if (IF_QFULL(ifq)) { IF_DROP(ifq); | IF_DROP(ifq); m_freem(m); | m_freem(m); splx(s); | error = ENOBUFS; sc->sc_if.if_oerrors++; | } else { return (ENOBUFS); | IF_ENQUEUE(ifq, m); } | error = 0; IF_ENQUEUE(ifq, m); | } | } else | IFQ_ENQUEUE(&sc->sc_if.if_snd, | NULL, m, error); | | if (error) { | splx(s); | sc->sc_if.if_oerrors++; | return (error); | } if ((sc->sc_oqlen = | if ((sc->sc_oqlen = sc->sc_ttyp->t_outq.c_cc) == 0) | sc->sc_ttyp->t_outq.c_cc) == 0) slstart(sc->sc_ttyp); | slstart(sc->sc_ttyp); splx(s); | splx(s); | .Ed The dequeue operations looks like: .Bd -literal ##old-style## ##new-style## | s = splimp(); | s = splimp(); IF_DEQUEUE(&sc->sc_fastq, m); | IF_DEQUEUE(&sc->sc_fastq, m); if (m == NULL) | if (m == NULL) IF_DEQUEUE(&sc->sc_if.if_snd, m); | IFQ_DEQUEUE(&sc->sc_if.if_snd, m); splx(s); | splx(s); | .Ed .Sh QUEUEING DISCIPLINES Queuing disciplines need to maintain .Fa ifq_len .Po used by .Fn IFQ_IS_EMPTY .Pc . Queuing disciplines also need to guarantee the same mbuf is returned if .Fn IFQ_DEQUEUE is called immediately after .Fn IFQ_POLL . .Sh SEE ALSO .Xr pf.conf 5 , .Xr pfctl 8 .Sh HISTORY The .Nm system first appeared in March 1997.