· KLDP.org · KLDP.net · KLDP Wiki · KLDP BBS ·
Soft Irqs And Tasklets

  • Tasklets are implemented on top of softirqs. In kernel source code, both SoftIrq and TaskLet are displayed in "softirq"

  • statically allocated, while tasklets can also be allocated and initialized at runtime (for instance, when loading a kernel module)
  • Softirqs can run concurrently on several CPUs, even if they are of the same type. Thus, softirqs are reentrant functions and must explicitly protect their data structures with spin locks.
  • Tasklets of the same type are always serialized


  • raise_softirq()
    1. Excutes the local_irq_save to save the state of the IF flag of the eflags register and to disable interrupts on the local CPU
    2. Marks the softirq as pending by setting the bit corresponding to the index nr in the softirq bit mask of the local CPU
    3. If in_interrupt() yields the value 1. it jumps to step 5. This situation indicates either that raise_softirq() has been invoked in interrupt context, or that the softirqs are currently disabled.
    4. Otherwise, invokes wakeup_softirqd() to wake up, if necessary, the ksoftirqd kernel thread of the local CPU
    5. Executes the local_irq_restore macro to restore the state of the IF flag saved in step 1.

  • Checks for active (pending) softirqs should be perfomed periodically, but without inducing too much overhead. They are performed in a few points of the kernel code. Here is a list of the most significant points (be warned that number and position of the softirq checkpoints change both with the kernel version and with the supported hardware architecture

    1. When the kernel invokes the local_bh_enable( ) function to enable softirqs on the local CPU

    2. When the do_IRQ( ) function finishes handling an I/O interrupt and invokes the irq_exit( ) macro
    3. If the system uses an I/O APIC, when the smp_apic_timer_interrupt( ) function finishes handling a local timer interrupt

    4. In multiprocessor systems, when a CPU finishes handling a function triggered by a CALL_FUNCTION_VECTOR interprocessor interrupt

    5. When one of the special ksoftirqd/n kernel threads is awakened (see later)

    Implementation of Softirqs (kernel/softirq.c)

     * structure representing a single softirq entry
    struct softirq_action {
            void (*action)(struct softirq_action *); /* function to run */
            void *data;                              /* data to pass to function */

    A 32-entry array of this structure is declared in kernel/softirq.c:

    static struct softirq_action softirq_vec32;

    Each registered softirq consumes one entry in the array. Consequently, there can be a maximum of 32 registered softirqs. Note that this cap is fixedthe maximum number of registered softirqs cannot be dynamically changed. In the current kernel, however, only 6 of the 32 entries are used3.

    Most drivers use tasklets for their bottom half. Tasklets are built off softirqs

    The Softirq Handler

    The prototype of a softirq handler, action, looks like:

    void softirq_handler(struct softirq_action *)

    When the kernel runs a softirq handler, it executes this action function with a pointer to the corresponding softirq_action structure as its lone argument. For example, if my_softirq pointed to an entry in the softirq_vec array, the kernel would invoke the softirq handler function as


    It seems a bit odd that the kernel passes the entire structure, and not just the data value, to the softirq handler. This trick allows future additions to the structure without requiring a change in every softirq handler. Softirq handlers can retrieve the data value, if they need to, simply by dereferencing their argument and reading the data member. A softirq never preempts another softirq. In fact, the only event that can preempt a softirq is an interrupt handler.

    Using Softirqs

    Softirqs are reserved for the most timing-critical and important bottom-half processing on the system. Currently, only two subsystemsnetworking and SCSIdirectly use softirqs. Additionally, kernel timers and tasklets are built on top of softirqs. If you are adding a new softirq, you normally want to ask yourself why using a tasklet is insufficient. Tasklets are dynamically created and are simpler to use because of their weaker locking requirements, and they still perform quite well. Nonetheless, for timing-critical applications that are able to do their own locking in an efficient way, softirqs might be the correct solution.

    Softirq handler can be handled by another processor at the same time. Thus, It is careful to synchronize with global data. Because that, We recommend tasklet.

    Raising Your Softirq

    After a handler is added to the enum list and registered via open_softirq(), it is ready to run. To mark it pending, so that it is run at the next invocation of do_softirq(), call raise_softirq(). For example, the networking subsystem would call

    This raises the NET_TX_SOFTIRQ softirq. Its handler, net_tx_action(), runs the next time the kernel executes softirqs. This function disables interrupts prior to actually raising the softirq, and then restores them to their previous state. If interrupts are already off, the function raise_softirq_irqoff() can be used as a minor optimization. For example:
     * interrupts must already be off!

    Softirqs are most often raised from within interrupt handlers. In the case of interrupt handlers, the interrupt handler performs the basic hardware-related work, raises the softirq, and then exits. When processing interrupts, the kernel invokes do_softirq(). The softirq then runs and picks up where the interrupt handler left off. In this example, the "top half" and "bottom half" naming should make sense.

    you can count on one hand the users of softirqs. (미국인들도 이런 표현을 쓰는 구나 ?? )


    Because tasklets are implemented on top of softirqs, they are softirqs. As discussed, tasklets are represented by two softirqs: HI_SOFTIRQ and TASKLET_SOFTIRQ. The only real difference in these types is that the HI_SOFTIRQ-based tasklets run prior to the TASKLET_SOFTIRQ tasklets.

    The Tasklet Structure

    Tasklets are represented by the tasklet_struct structure. Each structure represents a unique tasklet. The structure is declared in <linux/interrupt.h>:

    struct tasklet_struct {
    struct tasklet_struct *next; /* next tasklet in the list */ unsigned long state; /* state of the tasklet */ atomic_t count; /* reference counter */ void (*func)(unsigned long); /* tasklet handler function */ unsigned long data; /* argument to the tasklet function */

    The func member is the tasklet handler (the equivalent of action to a softirq) and it receives data as its sole argument.

    The state member is one of zero, TASKLET_STATE_SCHED, or TASKLET_STATE_RUN. TASKLET_STATE_SCHED denotes a tasklet that is scheduled to run and TASKLET_STATE_RUN denotes a tasklet that is running. As an optimization, TASKLET_STATE_RUN is used only on multiprocessor machines because a uniprocessor machine always knows whether the tasklet is running (it is either the currently executing code, or not).

    The count field is used as a reference count for the tasklet. If it is nonzero, the tasklet is disabled and cannot run; if it is zero, the tasklet is enabled and can run if marked pending.

    Scheduling Tasklets

    Scheduled tasklets (the equivalent of raised softirqs)are stored in two per-processor structures: tasklet_vec (for regular tasklets) and tasklet_hi_vec (for high-priority tasklets). Both of these structures are linked lists of tasklet_struct structures. Each tasklet_struct structure in the list represents a different tasklet.

    Yet another example of the evil naming schemes at work here. Why are softirqs raised but tasklets scheduled? Who knows? Both terms mean to mark that bottom half pending so that it is executed soon

    Tasklets are scheduled via the tasklet_schedule() and tasklet_hi_schedule() functions, which receive a pointer to the tasklet's tasklet_struct as their lone argument. The two functions are very similar (the difference being that one uses TASKLET_SOFTIRQ and one uses HI_SOFTIRQ). For now, let's look at the details of tasklet_schedule():

    1. Check whether the tasklet's state is TASKLET_STATE_SCHED. If it is, the tasklet is already scheduled to run and the function can immediately return.

    2. Save the state of the interrupt system, and then disable local interrupts. This ensures that nothing on this processor will mess with the tasklet code while tasklet_schedule() is manipulating the tasklets.

    3. Add the tasklet to be scheduled to the head of the tasklet_vec or tasklet_hi_vec linked list, which is unique to each processor in the system.

    4. Raise the TASKLET_SOFTIRQ or HI_SOFTIRQ softirq, so do_softirq() will execute this tasklet in the near future.

    5. Restore interrupts to their previous state and return.

    At the next earliest convenience, do_softirq() is run as discussed in the previous section. Because most tasklets and softirqs are marked pending in interrupt handlers, do_softirq() most likely runs when the last interrupt returns. Because TASKLET_SOFTIRQ or HI_SOFTIRQ is now raised, do_softirq() executes the associated handlers. These handlers, tasklet_action() and tasklet_hi_action(), are the heart of tasklet processing. Let's look at what they do:

    1. Disable local interrupt delivery (there is no need to first save their state because the code here is always called as a softirq handler and interrupts are always enabled) and retrieve the tasklet_vec or tasklet_hi_vec list for this processor.

    2. Clear the list for this processor by setting it equal to NULL.

    3. Enable local interrupt delivery. Again, there is no need to restore them to their previous state because this function knows that they were always originally enabled.

    4. Loop over each pending tasklet in the retrieved list.

    5. If this is a multiprocessing machine, check whether the tasklet is running on another processor by checking the TASKLET_STATE_RUN flag. If it is currently running, do not execute it now and skip to the next pending tasklet (recall, only one tasklet of a given type may run concurrently).

    6. If the tasklet is not currently running, set the TASKLET_STATE_RUN flag, so another processor will not run it.

    7. Check for a zero count value, to ensure that the tasklet is not disabled. If the tasklet is disabled, skip it and go to the next pending tasklet.

    8. We now know that the tasklet is not running elsewhere, is marked as running so it will not start running elsewhere, and has a zero count value. Run the tasklet handler.

    9. After the tasklet runs, clear the TASKLET_STATE_RUN flag in the tasklet's state field.

    10. Repeat for the next pending tasklet, until there are no more scheduled tasklets waiting to run.

    The implementation of tasklets is simple, but rather clever. As you saw, all tasklets are multiplexed on top of two softirqs, HI_SOFTIRQ and TASKLET_SOFTIRQ. When a tasklet is scheduled, the kernel raises one of these softirqs. These softirqs, in turn, are handled by special functions that then run any scheduled tasklets. The special functions ensure that only one tasklet of a given type is running at the same time (but other tasklets can run simultaneously). All this complexity is then hidden behind a clean and simple interface.

    Declaring Your Tasklet

    You can create tasklets statically or dynamically. What option you choose depends on whether you have (or want) a direct or indirect reference to the tasklet. If you are going to statically create the tasklet (and thus have a direct reference to it), use one of two macros in <linux/interrupt.h>:
    DECLARE_TASKLET(name, func, data)
    DECLARE_TASKLET_DISABLED(name, func, data);

    Both these macros statically create a struct tasklet_struct with the given name. When the tasklet is scheduled, the given function func is executed and passed the argument data. The difference between the two macros is the initial reference count. The first macro creates the tasklet with a count of zero, and the tasklet is enabled. The second macro sets count to one, and the tasklet is disabled. Here is an example:
    DECLARE_TASKLET(my_tasklet, my_tasklet_handler, dev);

    This line is equivalent to
    struct tasklet_struct my_tasklet = { NULL, 0, ATOMIC_INIT(0),
                                         my_tasklet_handler, dev };

    This creates a tasklet named my_tasklet that is enabled with tasklet_handler as its handler. The value of dev is passed to the handler when it is executed.

    To initialize a tasklet given an indirect reference (a pointer) to a dynamically created struct tasklet_struct, t, call tasklet_init():
    tasklet_init(t, tasklet_handler, dev);  /* dynamically as opposed to statically */

    Writing Your Tasklet Handler

    The tasklet handler must match the correct prototype:
    void tasklet_handler(unsigned long data)

    As with softirqs, tasklets cannot sleep. This means you cannot use semaphores or other blocking functions in a tasklet. Tasklets also run with all interrupts enabled, so you must take precautions (for example, disable interrupts and obtain a lock) if your tasklet shares data with an interrupt handler. Unlike softirqs, however, two of the same tasklets never run concurrently although two different tasklets can run at the same time on two different processors. If your tasklet shares data with another tasklet or softirq, you need to use proper locking

    === Scheduling Your Tasklet=== To schedule a tasklet for execution, tasklet_schedule() is called and passed a pointer to the relevant tasklet_struct:

    tasklet_schedule(&my_tasklet);    /* mark my_tasklet as pending */

    After a tasklet is scheduled, it runs once at some time in the near future. If the same tasklet is scheduled again, before it has had a chance to run, it still runs only once. If it is already running, for example on another processor, the tasklet is rescheduled and runs again. As an optimization, a tasklet always runs on the processor that scheduled it making better use of the processor's cache, you hope.

    You can disable a tasklet via a call to tasklet_disable(), which disables the given tasklet. If the tasklet is currently running, the function will not return until it finishes executing. Alternatively, you can use tasklet_disable_nosync(), which disables the given tasklet but does not wait for the tasklet to complete prior to returning. This is usually not safe because you cannot assume the tasklet is not still running. A call to tasklet_enable() enables the tasklet. This function also must be called before a tasklet created with DECLARE_TASKLET_DISABLED() is usable. For example:
    tasklet_disable(&my_tasklet);    /* tasklet is now disabled */
    /* we can now do stuff knowing that the tasklet cannot run .. */
    tasklet_enable(&my_tasklet);     /* tasklet is now enabled */

    You can remove a tasklet from the pending queue via tasklet_kill(). This function receives a pointer as a lone argument to the tasklet's tasklet_struct. Removing a scheduled tasklet from the queue is useful when dealing with a tasklet that often reschedules itself. This function first waits for the tasklet to finish executing and then it removes the tasklet from the queue. Nothing stops some other code from rescheduling the tasklet, of course. This function must not be used from interrupt context because it sleeps.


    Softirq (and thus tasklet) processing is aided by a set of per-processor kernel threads. These kernel threads help in the processing of softirqs when the system is overwhelmed with softirqs.

    As already described, the kernel processes softirqs in a number of places, most commonly on return from handling an interrupt. Softirqs might be raised at very high rates (such as during intense network traffic). Further, softirq functions can reactivate themselves. That is, while running, a softirq can raise itself so that it runs again (indeed, the networking subsystem does this). The possibility of a high frequency of softirqs in conjunction with their capability to remark themselves active can result in user-space programs being starved of processor time. Not processing the reactivated softirqs in a timely manner, however, is unacceptable. When softirqs were first designed, this caused a dilemma that needed fixing, and neither obvious solution was a good one.

    The solution ultimately implemented in the kernel is to not immediately process reactivated softirqs. Instead, if the number of softirqs grows excessive, the kernel wakes up a family of kernel threads to handle the load. The kernel threads run with the lowest possible priority (nice value of 19), which ensures they do not run in lieu of anything important. This concession prevents heavy softirq activity from completely starving user-space of processor time. Conversely, it also ensures that "excess" softirqs do run eventually. Finally, this solution has the added property that on an idle system, the softirqs are handled rather quickly (because the kernel threads will schedule immediately).

    There is one thread per processor. The threads are each named ksoftirqd/n where n is the processor number. On a two-processor system, you would have ksoftirqd/0 and ksoftirqd/1. Having a thread on each processor ensures an idle processor, if available, is always able to service softirqs. After the threads are initialized, they run a tight loop similar to this:
    for (;;) {
            if (!softirq_pending(cpu))
            while (softirq_pending(cpu)) {
                    if (need_resched())

    If any softirqs are pending (as reported by softirq_pending()), ksoftirqd calls do_softirq() to handle them. Note that it does this repeatedly to handle any reactivated softirqs, too. After each iteration, schedule() is called if needed, to allow more important processes to run. After all processing is complete, the kernel thread sets itself TASK_INTERRUPTIBLE and invokes the scheduler to select a new runnable process.

    The softirq kernel threads are awakened whenever do_softirq() detects an executed kernel thread reactivating itself.

    sponsored by andamiro
    sponsored by cdnetworks
    sponsored by HP

    Valid XHTML 1.0! Valid CSS! powered by MoniWiki
    last modified 2006-05-14 13:24:26
    Processing time 0.0089 sec