Commit Graph

97 Commits

Author SHA1 Message Date
ananjaser1211
a9a7c24d76 Add BFQ I/O Scheduler. Enable ROW I/Q Scheduler in Makefile.
Add the BFQ-v7r8 I/O scheduler to 3.18.0.
The general structure is borrowed from CFQ, as much of the code for
handling I/O contexts. Over time, several useful features have been
ported from CFQ as well (details in the changelog in README.BFQ). A
(bfq_)queue is associated to each task doing I/O on a device, and each
time a scheduling decision has to be made a queue is selected and served
until it expires.

    - Slices are given in the service domain: tasks are assigned
      budgets, measured in number of sectors. Once got the disk, a task
      must however consume its assigned budget within a configurable
      maximum time (by default, the maximum possible value of the
      budgets is automatically computed to comply with this timeout).
      This allows the desired latency vs "throughput boosting" tradeoff
      to be set.

    - Budgets are scheduled according to a variant of WF2Q+, implemented
      using an augmented rb-tree to take eligibility into account while
      preserving an O(log N) overall complexity.

    - A low-latency tunable is provided; if enabled, both interactive
      and soft real-time applications are guaranteed a very low latency.

    - Latency guarantees are preserved also in the presence of NCQ.

    - Also with flash-based devices, a high throughput is achieved
      while still preserving latency guarantees.

    - BFQ features Early Queue Merge (EQM), a sort of fusion of the
      cooperating-queue-merging and the preemption mechanisms present
      in CFQ. EQM is in fact a unified mechanism that tries to get a
      sequential read pattern, and hence a high throughput, with any
      set of processes performing interleaved I/O over a contiguous
      sequence of sectors.

    - BFQ supports full hierarchical scheduling, exporting a cgroups
      interface.  Since each node has a full scheduler, each group can
      be assigned its own weight.

    - If the cgroups interface is not used, only I/O priorities can be
      assigned to processes, with ioprio values mapped to weights
      with the relation weight = IOPRIO_BE_NR - ioprio.

    - ioprio classes are served in strict priority order, i.e., lower
      priority queues are not served as long as there are higher
      priority queues.  Among queues in the same class the bandwidth is
      distributed in proportion to the weight of each queue. A very
      thin extra bandwidth is however guaranteed to the Idle class, to
      prevent it from starving.

Signed-off-by: Paolo Valente <paolo.valente@unimore.it>
Signed-off-by: Arianna Avanzini <avanzini.arianna@gmail.com>
Signed-off-by: Tkkg1994 <luca.grifo@outlook.com>
Signed-off-by: djb77 <dwayne.bakewell@gmail.com>
2018-07-19 15:28:28 +02:00
ananjaser1211
7eb995fed9 Add ROW I/O Scheduler.
Squashed commit of the following:

commit f49e14ccdcb6694ed27754e020057d27a8fcca07
Author: Andrei F <luxneb@gmail.com>
Date:   Thu Nov 26 22:40:38 2015 +0100

    elevator: Fix a race in elevator switching

    commit d50235b7bc upstream.

    There's a race between elevator switching and normal io operation.
        Because the allocation of struct elevator_queue and struct elevator_data
        don't in a atomic operation.So there are have chance to use NULL
        ->elevator_data.
        For example:
            Thread A:                               Thread B
            blk_queu_bio                            elevator_switch
            spin_lock_irq(q->queue_block)           elevator_alloc
            elv_merge                               elevator_init_fn

        Because call elevator_alloc, it can't hold queue_lock and the
        ->elevator_data is NULL.So at the same time, threadA call elv_merge and
        nedd some info of elevator_data.So the crash happened.

        Move the elevator_alloc into func elevator_init_fn, it make the
        operations in a atomic operation.

        Using the follow method can easy reproduce this bug
        1:dd if=/dev/sdb of=/dev/null
        2:while true;do echo noop > scheduler;echo deadline > scheduler;done

        The test method also use this method.

    Signed-off-by: Jianpeng Ma <majianpeng@gmail.com>
    Signed-off-by: Jens Axboe <axboe@kernel.dk>
    Cc: Jonghwan Choi <jhbird.choi@samsung.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit daf22a727e64f1277b074442efb821366015ca72
Author: Tatyana Brokhman <tlinder@codeaurora.org>
Date:   Thu Jul 25 13:45:21 2013 +0300

    block: row: Remove warning massage from add_request

    Regular priority queues is marked as "starved" if it skipped a dispatch
    due to being empty. When a new request is added to a "starved" queue
    it will be marked as urgent.
    The removed WARN_ON was warning about an impossible case when a regular
    priority (read) queue was marked as starved but wasn't empty. This is
    a possible case due to the bellow:
    If the device driver fetched a read request that is pending for
    transmission and an URGENT request arrives, the fetched read will be
    reinserted back to the scheduler. Its possible that the queue it will be
    reinserted to was marked as "starved" in the meanwhile due to being empty.

    CRs-fixed: 517800
    Change-Id: Iaae642ea0ed9c817c41745b0e8ae2217cc684f0c
    Signed-off-by: Tatyana Brokhman <tlinder@codeaurora.org>

commit dca47e75f1413d58e4f97ef638e5d4456c55bdce
Author: Tatyana Brokhman <tlinder@codeaurora.org>
Date:   Tue Jul 2 14:43:13 2013 +0300

    block: row: change hrtimer_cancel to hrtimer_try_to_cancel

    Calling hrtimer_cancel with interrupts disabled can result in a livelock.
    When flushing plug list in the block layer interrupts are disabled and an
    hrtimer is used when adding requests from that plug list to the scheduler.
    In this code flow, if the hrtimer (which is used for idling) is set, it's
    being canceled by calling hrtimer_cancel. hrtimer_cancel will perform
    the following in an endless loop:
    1. try cancel the timer
    2. if fails - rest_cpu
    the cancellation can fail if the timer function already started. Since
    interrupts are disabled it can never complete.
    This patch reduced the number of times the hrtimer lock is taken while
    interrupts are disabled by calling hrtimer_try_co_cancel. the later will
    try to cancel the timer just once and return with an error code if fails.

    CRs-fixed: 499887
    Change-Id: I25f79c357426d72ad67c261ce7cb503ae97dc7b9
    Signed-off-by: Tatyana Brokhman <tlinder@codeaurora.org>

commit a6047b9d808eaa787e4df3107bea7536334856cd
Author: Lee Susman <lsusman@codeaurora.org>
Date:   Sun Jun 23 16:27:40 2013 +0300

    block: row-iosched idling triggered by readahead pages

    In the current implementation idling is triggered only by request
    insertion frequency. This heuristic is not very accurate and may hit
    random requests that shouldn't trigger idling. This patch uses the
    PG_readahead flag in struct page's flags, which indicates that the page
    is part of a readahead window, to start idling upon dispatch of a request
    associated with a readahead page.

    The above readehead flag is used together with the existing
    insertion-frequency trigger. The frequency timer will catch read requests
    which are not part of a readahead window, but are still part of a
    sequential stream (and therefore dispatched in small time intervals).

    Change-Id: Icb7145199c007408de3f267645ccb842e051fd00
    Signed-off-by: Lee Susman <lsusman@codeaurora.org>

commit e70e4e8e1d1f111023dd2b2d0fc9237240cab9ab
Author: Tatyana Brokhman <tlinder@codeaurora.org>
Date:   Wed May 1 14:35:20 2013 +0300

    block: urgent: Fix dispatching of URGENT mechanism

    There are cases when blk_peek_request is called not from blk_fetch_request
    thus the URGENT request may be started but the flag q->dispatched_urgent is
    not updated.

    Change-Id: I4fb588823f1b2949160cbd3907f4729767932e12
    CRs-fixed: 471736
    CRs-fixed: 473036
    Signed-off-by: Tatyana Brokhman <tlinder@codeaurora.org>

commit 0e36870f6a436840eed1782d0e85b4adb300b59f
Author: Maya Erez <merez@codeaurora.org>
Date:   Sun Apr 14 15:19:52 2013 +0300

    block: row: Fix starvation tolerance values

    The current starvation tolerance values increase the boot time
    since high priority SW requests are delayed by regular priority requests.
    In order to overcome this, increase the starvation tolerance values.

    Change-Id: I9947fca9927cbd39a1d41d4bd87069df679d3103
    Signed-off-by: Tatyana Brokhman <tlinder@codeaurora.org>
    Signed-off-by: Maya Erez <merez@codeaurora.org>

commit 3cab8d28e735fdad300eda3bed703129ba05d70a
Author: Tatyana Brokhman <tlinder@codeaurora.org>
Date:   Thu Apr 11 14:57:15 2013 +0300

    block: urgent request: Update dispatch_urgent in case of requeue/reinsert

    The block layer implements a mechanism for verifying that the device
    driver won't be notified of an URGENT request if there is already an
    URGENT request in flight. This is due to the fact that interrupting an
    URGENT request isn't efficient.
    This patch fixes the above described mechanism in case the URGENT request
    was returned back to the block layer from some reason: by requeue or
    reinsert.

    CRs-fixed: 473376, 473036, 471736
    Change-Id: Ie8b8208230a302d4526068531616984825f1050d
    Signed-off-by: Tatyana Brokhman <tlinder@codeaurora.org>

commit e052e4574bb928b44e660b9679d23e14011b0b9d
Author: Tatyana Brokhman <tlinder@codeaurora.org>
Date:   Thu Mar 21 11:04:02 2013 +0200

    block: row: Update sysfs functions

    All ROW (time related) configurable parameters are stored in ms so there
    is no need to convert from/to ms when reading/updating them via sysfs.

    Change-Id: Ib6a1de54140b5d25696743da944c076dd6fc02ae
    Signed-off-by: Tatyana Brokhman <tlinder@codeaurora.org>

    Conflicts:
    	block/row-iosched.c

commit 2c3203650c2109c18abb3b17a5114d54bb22e683
Author: Tatyana Brokhman <tlinder@codeaurora.org>
Date:   Thu Mar 21 13:02:07 2013 +0200

    block: row: Prevent starvation of regular priority by high priority

    At the moment all REGULAR and LOW priority requests are starved as long as
    there are HIGH priority requests to dispatch.
    This patch prevents the above starvation by setting a starvation limit the
    REGULAR\LOW priority requests can tolerate.

    Change-Id: Ibe24207982c2c55d75c0b0230f67e013d1106017
    Signed-off-by: Tatyana Brokhman <tlinder@codeaurora.org>

commit a5434f618d395a03fe19ef430a8c5747bad069f9
Author: Tatyana Brokhman <tlinder@codeaurora.org>
Date:   Tue Mar 12 21:02:33 2013 +0200

    block: urgent request: remove unnecessary urgent marking

    An urgent request is marked by the scheduler in rq->cmd_flags with the
    REQ_URGENT flag. There is no need to add an additional marking by
    the block layer.

    Change-Id: I05d5e9539d2f6c1bfa80240b0671db197a5d3b3f
    Signed-off-by: Tatyana Brokhman <tlinder@codeaurora.org>

commit 3928fb74c2f78578c57913938644acb704b77586
Author: Tatyana Brokhman <tlinder@codeaurora.org>
Date:   Tue Mar 12 21:17:18 2013 +0200

    block: row: Re-design urgent request notification mechanism

    When ROW scheduler reports to the block layer that there is an urgent
    request pending, the device driver may decide to stop the transmission
    of the current request in order to handle the urgent one. This is done
    in order to reduce the latency of an urgent request. For example:
    long WRITE may be stopped to handle an urgent READ.

    This patch updates the ROW URGENT notification policy to apply with the
    below:

    - Don't notify URGENT if there is an un-completed URGENT request in driver
    - After notifying that URGENT request is present, the next request
      dispatched is the URGENT one.
    - At every given moment only 1 request can be marked as URGENT.
      Independent of it's location (driver or scheduler)

    Other changes to URGENT policy:
    - Only READ queues are allowed to notify of an URGENT request pending.

    CR fix:
    If a pending urgent request (A) gets merged with another request (B)
    A is removed from scheduler queue but is not removed from
    rd->pending_urgent_rq.

    CRs-Fixed: 453712
    Change-Id: I321e8cf58e12a05b82edd2a03f52fcce7bc9a900
    Signed-off-by: Tatyana Brokhman <tlinder@codeaurora.org>

commit 8912aa92e3d919ceabc72b2eddc829fc5e4bd7eb
Author: Tatyana Brokhman <tlinder@codeaurora.org>
Date:   Thu Jan 24 16:17:27 2013 +0200

    block: row: Update initial values of ROW data structures

    This patch sets the initial values of internal ROW
    parameters.

    Change-Id: I38132062a7fcbe2e58b9cc757e55caac64d013dc
    Signed-off-by: Tatyana Brokhman <tlinder@codeaurora.org>
    [smuckle@codeaurora.org: ported from msm-3.7]
    Signed-off-by: Steve Muckle <smuckle@codeaurora.org>

commit b709e1a8a56784cb83c2c31a4e7df574a6b29802
Author: Tatyana Brokhman <tlinder@codeaurora.org>
Date:   Thu Jan 24 15:08:40 2013 +0200

    block: row: Don't notify URGENT if there are un-completed urgent req

    When ROW scheduler reports to the block layer that there is an urgent
    request pending, the device driver may decide to stop the transmission
    of the current request in order to handle the urgent one. If the current
    transmitted request is an urgent request - we don't want it to be
    stopped.
    Due to the above ROW scheduler won't notify of an urgent request if
    there are urgent requests in flight.

    Change-Id: I2fa186d911b908ec7611682b378b9cdc48637ac7
    Signed-off-by: Tatyana Brokhman <tlinder@codeaurora.org>

commit eba966603cc8e6f8fb418bf702f5a6eca5f56f34
Author: Tatyana Brokhman <tlinder@codeaurora.org>
Date:   Thu Jan 24 04:01:59 2013 +0200

    block: add REQ_URGENT to request flags

    This patch adds a new flag to be used in cmd_flags field of struct request
    for marking request as urgent.
    Urgent request is the one that should be given priority currently handled
    (regular) request by the device driver. The decision of a request urgency
    is taken by the scheduler.

    Change-Id: Ic20470987ef23410f1d0324f96f00578f7df8717
    Signed-off-by: Tatyana Brokhman <tlinder@codeaurora.org>

    Conflicts:
    	include/linux/blk_types.h

commit 7c865ab1a9ae626d023d0b03ed7fbe5c57bcbe7c
Author: Tatyana Brokhman <tlinder@codeaurora.org>
Date:   Thu Jan 17 20:56:07 2013 +0200

    block: row: Idling mechanism re-factoring

    At the moment idling in ROW is implemented by delayed work that uses
    jiffies granularity which is not very accurate. This patch replaces
    current idling mechanism implementation with hrtime API, which gives
    nanosecond resolution (instead of jiffies).

    Change-Id: I86c7b1776d035e1d81571894b300228c8b8f2d92
    Signed-off-by: Tatyana Brokhman <tlinder@codeaurora.org>

commit 72ea1d39c04734bf5eb52117968704148d2da42f
Author: Tatyana Brokhman <tlinder@codeaurora.org>
Date:   Wed Jan 23 17:15:49 2013 +0200

    block: row: Dispatch requests according to their io-priority

    This patch implements "application-hints" which is a way the issuing
    application can notify the scheduler on the priority of its request.
    This is done by setting the io-priority of the request.
    This patch reuses an already existing mechanism of io-priorities developed
    for CFQ. Please refer to kernel/Documentation/block/ioprio.txt for
    usage example and explanations.

    Change-Id: I228ec8e52161b424242bb7bb133418dc8b73925a
    Signed-off-by: Tatyana Brokhman <tlinder@codeaurora.org>

commit 9f8f3d2757788477656b1d25a3055ae11d97cee4
Author: Tatyana Brokhman <tlinder@codeaurora.org>
Date:   Sat Jan 12 16:23:18 2013 +0200

    block: row: Aggregate row_queue parameters to one structure

    Each ROW queues has several parameters which default values are defined
    in separate arrays. This patch aggregates all default values into one
    array.
    The values in question are:
     - is idling enabled for the queue
     - queue quantum
     - can the queue notify on urgent request

    Change-Id: I3821b0a042542295069b340406a16b1000873ec6
    Signed-off-by: Tatyana Brokhman <tlinder@codeaurora.org>

commit d84ad45f3077661cab5984cd2fb7d5ef2ff06e39
Author: Tatyana Brokhman <tlinder@codeaurora.org>
Date:   Sat Jan 12 16:21:47 2013 +0200

    block: row: fix sysfs functions - idle_time conversion

    idle_time was updated to be stored in msec instead of jiffies.
    So there is no need to convert the value when reading from user or
    displaying the value to him.

    Change-Id: I58e074b204e90a90536d32199ac668112966e9cf
    Signed-off-by: Tatyana Brokhman <tlinder@codeaurora.org>

commit 202b21e9daf7b8a097f97f764bb4ad4712c75fa7
Author: Tatyana Brokhman <tlinder@codeaurora.org>
Date:   Sat Jan 12 16:21:12 2013 +0200

    block: row: Insert dispatch_quantum into struct row_queue

    There is really no point in keeping the dispatch quantum
    of a queue outside of it. By inserting it to the row_queue
    structure we spare extra level in accessing it.

    Change-Id: Ic77571818b643e71f9aafbb2ca93d0a92158b199
    Signed-off-by: Tatyana Brokhman <tlinder@codeaurora.org>

commit 58ca84f091faa6ff8c4f567b158be5d38f9a5c58
Author: Tatyana Brokhman <tlinder@codeaurora.org>
Date:   Sun Jan 13 22:04:59 2013 +0200

    block: row: Add some debug information on ROW queues

    1. Add a counter for number of requests on queue.
    2. Add function to print queues status (number requests
       currently on queue and number of already dispatched requests
       in current dispatch cycle).

    Change-Id: I1e98b9ca33853e6e6a8ddc53240f6cd6981e6024
    Signed-off-by: Tatyana Brokhman <tlinder@codeaurora.org>

commit 1bbb2c7ada5a647cab1f2306458d6cf9b821ddf7
Author: Subhash Jadavani <subhashj@codeaurora.org>
Date:   Thu Jan 10 02:15:13 2013 +0530

    block: blk-merge: don't merge the pages with non-contiguous descriptors

    blk_rq_map_sg() function merges the physically contiguous pages to use same
    scatter-gather node without checking if their page descriptors are
    contiguous or not.

    Now when dma_map_sg() is called on the scatter gather list, it would
    take the base page pointer from each node (one by one) and iterates
    through all of the pages in same sg node by keep incrementing the base
    page pointer with the assumption that physically contiguous pages will
    have their page descriptor address contiguous which may not be true
    if SPARSEMEM config is enabled. So here we may end referring to invalid
    page descriptor.

    Following table shows the example of physically contiguous pages but
    their page descriptor addresses non-contiguous.
    -------------------------------------------
    | Page Descriptor    |   Physical Address |
    ------------------------------------------
    | 0xc1e43fdc         |   0xdffff000       |
    | 0xc2052000         |   0xe0000000       |
    -------------------------------------------

    With this patch, relevant blk-merge functions will also check if the
    physically contiguous pages are having page descriptors address contiguous
    or not? If not then, these pages are separated to be in different
    scatter-gather nodes.

    CRs-Fixed: 392141
    Change-Id: I3601565e5569a69f06fb3af99061c4d4c23af241
    Signed-off-by: Subhash Jadavani <subhashj@codeaurora.org>

    Conflicts:
    	block/blk-merge.c

commit 9a9b428480c932ef8434d8b9bd3b7bafdcac3f84
Author: Tatyana Brokhman <tlinder@codeaurora.org>
Date:   Thu Dec 20 19:23:58 2012 +0200

    row: Add support for urgent request handling

    This patch adds support for handling urgent requests.
    ROW queue can be marked as "urgent" so if it was un-served in last
    dispatch cycle and a request was added to it - it will trigger
    issuing an urgent-request-notification to the block device driver.
    The block device driver may choose at stop the transmission of current
    ongoing request to handle the urgent one. Foe example: long WRITE may
    be stopped to handle an urgent READ. This decreases READ latency.

    Change-Id: I84954c13f5e3b1b5caeadc9fe1f9aa21208cb35e
    Signed-off-by: Tatyana Brokhman <tlinder@codeaurora.org>

commit 8d5ec526b7e70307d3c4ce587b714349f44c0be8
Author: Tatyana Brokhman <tlinder@codeaurora.org>
Date:   Thu Dec 6 13:17:19 2012 +0200

    block:row: fix idling mechanism in ROW

    This patch addresses the following issues found in the ROW idling
    mechanism:
    1. Fix the delay passed to queue_delayed_work (pass actual delay
       and not the time when to start the work)
    2. Change the idle time and the idling-trigger frequency to be
       HZ dependent (instead of using msec_to_jiffies())
    3. Destroy idle_workqueue() in queue_exit

    Change-Id: If86513ad6b4be44fb7a860f29bd2127197d8d5bf
    Signed-off-by: Tatyana Brokhman <tlinder@codeaurora.org>

    Conflicts:
    	block/row-iosched.c

commit c26a95811462b9ba8eca23b4ba2150e7b660ca40
Author: Tatyana Brokhman <tlinder@codeaurora.org>
Date:   Tue Oct 30 08:33:06 2012 +0200

    row: Adding support for reinsert already dispatched req

    Add support for reinserting already dispatched request back to the
    schedulers internal data structures.
    The request will be reinserted back to the queue (head) it was
    dispatched from as if it was never dispatched.

    Change-Id: I70954df300774409c25b5821465fb3aa33d8feb5
    Signed-off-by: Tatyana Brokhman <tlinder@codeaurora.org>

commit a1a6f09cae0149d935bcea3f20d4acb6556d68f9
Author: Tatyana Brokhman <tlinder@codeaurora.org>
Date:   Tue Dec 4 16:04:15 2012 +0200

    block: Add API for urgent request handling

    This patch add support in block & elevator layers for handling
    urgent requests. The decision if a request is urgent or not is taken
    by the scheduler. Urgent request notification is passed to the underlying
    block device driver (eMMC for example). Block device driver may decide to
    interrupt the currently running low priority request to serve the new
    urgent request. By doing so READ latency is greatly reduced in read&write
    collision scenarios.

    Note that if the current scheduler doesn't implement the urgent request
    mechanism, this code path is never activated.

    Change-Id: I8aa74b9b45c0d3a2221bd4e82ea76eb4103e7cfa
    Signed-off-by: Tatyana Brokhman <tlinder@codeaurora.org>

    Conflicts:
    	block/blk-core.c

commit 4e907d9d6079629d6ce61fbdfb1a629d3587e176
Author: Tatyana Brokhman <tlinder@codeaurora.org>
Date:   Tue Dec 4 15:54:43 2012 +0200

    block: Add support for reinsert a dispatched req

    Add support for reinserting a dispatched request back to the
    scheduler's internal data structures.
    This capability is used by the device driver when it chooses to
    interrupt the current request transmission and execute another (more
    urgent) pending request. For example: interrupting long write in order
    to handle pending read. The device driver re-inserts the
    remaining write request back to the scheduler, to be rescheduled
    for transmission later on.

    Add API for verifying whether the current scheduler
    supports reinserting requests mechanism. If reinsert mechanism isn't
    supported by the scheduler, this code path will never be activated.

    Change-Id: I5c982a66b651ebf544aae60063ac8a340d79e67f
    Signed-off-by: Tatyana Brokhman <tlinder@codeaurora.org>

commit 0675c27faab797f7149893b84cc357aadb37c697
Author: Tatyana Brokhman <tlinder@codeaurora.org>
Date:   Mon Oct 15 20:56:02 2012 +0200

    block: ROW: Fix forced dispatch

    This patch fixes forced dispatch in the ROW scheduling algorithm.
    When the dispatch function is called with the forced flag on, we
    can't delay the dispatch of the requests that are in scheduler queues.
    Thus, when dispatch is called with forced turned on, we need to cancel
    idling, or not to idle at all.

    Change-Id: I3aa0da33ad7b59c0731c696f1392b48525b52ddc
    Signed-off-by: Tatyana Brokhman <tlinder@codeaurora.org>

commit ce6acf59662d1bbe5663a64aef9fe1695b8bbe1b
Author: Tatyana Brokhman <tlinder@codeaurora.org>
Date:   Thu Sep 20 10:46:10 2012 +0300

    block: Adding ROW scheduling algorithm

    This patch adds the implementation of a new scheduling algorithm - ROW.
    The policy of this algorithm is to prioritize READ requests over WRITE
    as much as possible without starving the WRITE requests.

    Change-Id: I4ed52ea21d43b0e7c0769b2599779a3d3869c519
    Signed-off-by: Tatyana Brokhman <tlinder@codeaurora.org>

Signed-off-by: Tkkg1994 <luca.grifo@outlook.com>
Signed-off-by: djb77 <dwayne.bakewell@gmail.com>
2018-07-19 15:27:34 +02:00
BlackMesa123
7608e6b787 A530FXXU2BRG1
Signed-off-by: BlackMesa123 <brother12@hotmail.it>
2018-07-16 20:05:35 +02:00
Matias Bjørling
b2b7e00148 null_blk: register as a LightNVM device
Add support for registering as a LightNVM device. This allows us to
evaluate the performance of the LightNVM subsystem.

In /drivers/Makefile, LightNVM is moved above block device drivers
to make sure that the LightNVM media managers have been initialized
before drivers under /drivers/block are initialized.

Signed-off-by: Matias Bjørling <m@bjorling.me>
Fix by Jens Axboe to remove unneeded slab cache and the following
memory leak.
Signed-off-by: Jens Axboe <axboe@fb.com>
2015-11-16 15:22:28 -07:00
Christoph Hellwig
bbd3e06436 block: add an API for Persistent Reservations
This commits adds a driver API and ioctls for controlling Persistent
Reservations s/genericly/generically/ at the block layer.  Persistent
Reservations are supported by SCSI and NVMe and allow controlling who gets
access to a device in a shared storage setup.

Note that we add a pr_ops structure to struct block_device_operations
instead of adding the members directly to avoid bloating all instances
of devices that will never support Persistent Reservations.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
2015-10-21 14:46:56 -06:00
Dongsu Park
2ec3182f9c Documentation: update notes in biovecs about arbitrarily sized bios
Update block/biovecs.txt so that it includes a note on what kind of
effects arbitrarily sized bios would bring to the block layer.
Also fix a trivial typo, bio_iter_iovec.

Cc: Christoph Hellwig <hch@infradead.org>
Cc: Kent Overstreet <kent.overstreet@gmail.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: linux-doc@vger.kernel.org
Signed-off-by: Dongsu Park <dpark@posteo.net>
Signed-off-by: Ming Lin <ming.l@ssi.samsung.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2015-08-13 12:32:07 -06:00
Christoph Hellwig
4246a0b63b block: add a bi_error field to struct bio
Currently we have two different ways to signal an I/O error on a BIO:

 (1) by clearing the BIO_UPTODATE flag
 (2) by returning a Linux errno value to the bi_end_io callback

The first one has the drawback of only communicating a single possible
error (-EIO), and the second one has the drawback of not beeing persistent
when bios are queued up, and are not passed along from child to parent
bio in the ever more popular chaining scenario.  Having both mechanisms
available has the additional drawback of utterly confusing driver authors
and introducing bugs where various I/O submitters only deal with one of
them, and the others have to add boilerplate code to deal with both kinds
of error returns.

So add a new bi_error field to store an errno value directly in struct
bio and remove the existing mechanisms to clean all this up.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: NeilBrown <neilb@suse.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2015-07-29 08:55:15 -06:00
Jens Axboe
0034af0365 block: make /sys/block/<dev>/queue/discard_max_bytes writeable
Lots of devices support huge discard sizes these days. Depending
on how the device handles them internally, huge discards can
introduce massive latencies (hundreds of msec) on the device side.

We have a sysfs file, discard_max_bytes, that advertises the max
hardware supported discard size. Make this writeable, and split
the settings into a soft and hard limit. This can be set from
'discard_granularity' and up to the hardware limit.

Add a new sysfs file, 'discard_max_hw_bytes', that shows the hw
set limit.

Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2015-07-17 08:41:53 -06:00
Leonid V. Fedorenchik
8962786ce3 Documentation: Remove mentioning of block barriers
Remove mentioning of block barriers since they were removed.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Leonid V. Fedorenchik <leonidsbox@gmail.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2015-03-20 07:41:56 -06:00
Linus Torvalds
caf292ae5b Merge branch 'for-3.19/core' of git://git.kernel.dk/linux-block
Pull block driver core update from Jens Axboe:
 "This is the pull request for the core block IO changes for 3.19.  Not
  a huge round this time, mostly lots of little good fixes:

   - Fix a bug in sysfs blktrace interface causing a NULL pointer
     dereference, when enabled/disabled through that API.  From Arianna
     Avanzini.

   - Various updates/fixes/improvements for blk-mq:

        - A set of updates from Bart, mostly fixing buts in the tag
          handling.

        - Cleanup/code consolidation from Christoph.

        - Extend queue_rq API to be able to handle batching issues of IO
          requests. NVMe will utilize this shortly. From me.

        - A few tag and request handling updates from me.

        - Cleanup of the preempt handling for running queues from Paolo.

        - Prevent running of unmapped hardware queues from Ming Lei.

        - Move the kdump memory limiting check to be in the correct
          location, from Shaohua.

        - Initialize all software queues at init time from Takashi. This
          prevents a kobject warning when CPUs are brought online that
          weren't online when a queue was registered.

   - Single writeback fix for I_DIRTY clearing from Tejun.  Queued with
     the core IO changes, since it's just a single fix.

   - Version X of the __bio_add_page() segment addition retry from
     Maurizio.  Hope the Xth time is the charm.

   - Documentation fixup for IO scheduler merging from Jan.

   - Introduce (and use) generic IO stat accounting helpers for non-rq
     drivers, from Gu Zheng.

   - Kill off artificial limiting of max sectors in a request from
     Christoph"

* 'for-3.19/core' of git://git.kernel.dk/linux-block: (26 commits)
  bio: modify __bio_add_page() to accept pages that don't start a new segment
  blk-mq: Fix uninitialized kobject at CPU hotplugging
  blktrace: don't let the sysfs interface remove trace from running list
  blk-mq: Use all available hardware queues
  blk-mq: Micro-optimize bt_get()
  blk-mq: Fix a race between bt_clear_tag() and bt_get()
  blk-mq: Avoid that __bt_get_word() wraps multiple times
  blk-mq: Fix a use-after-free
  blk-mq: prevent unmapped hw queue from being scheduled
  blk-mq: re-check for available tags after running the hardware queue
  blk-mq: fix hang in bt_get()
  blk-mq: move the kdump check to blk_mq_alloc_tag_set
  blk-mq: cleanup tag free handling
  blk-mq: use 'nr_cpu_ids' as highest CPU ID count for hwq <-> cpu map
  blk: introduce generic io stat accounting help function
  blk-mq: handle the single queue case in blk_mq_hctx_next_cpu
  genhd: check for int overflow in disk_expand_part_tbl()
  blk-mq: add blk_mq_free_hctx_request()
  blk-mq: export blk_mq_free_request()
  blk-mq: use get_cpu/put_cpu instead of preempt_disable/preempt_enable
  ...
2014-12-13 14:14:23 -08:00
Christoph Hellwig
125c99bc8b scsi: add new scsi-command flag for tagged commands
Currently scsi piggy backs on the block layer to define the concept
of a tagged command.  But we want to be able to have block-level host-wide
tags assigned even for untagged commands like the initial INQUIRY, so add
a new SCSI-level flag for commands that are tagged at the scsi level, so
that even commands without that set can have tags assigned to them.  Note
that this alredy is the case for the blk-mq code path, and this just lets
the old path catch up with it.

We also set this flag based upon sdev->simple_tags instead of the block
queue flag, so that it is entirely independent of the block layer tagging,
and thus always correct even if a driver doesn't use block level tagging
yet.

Also remove the old blk_rq_tagged; it was only used by SCSI drivers, and
removing it forces them to look for the proper replacement.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
2014-11-12 11:19:40 +01:00
Jan Kara
b8ab956c54 block: Expand a bit documentation about elevator_allow_merge_fn
Explain that two requests can be merged without
elevator_allow_merge_fn() being called.

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@fb.com>
2014-11-04 08:28:15 -07:00
Linus Torvalds
d3dc366bba Merge branch 'for-3.18/core' of git://git.kernel.dk/linux-block
Pull core block layer changes from Jens Axboe:
 "This is the core block IO pull request for 3.18.  Apart from the new
  and improved flush machinery for blk-mq, this is all mostly bug fixes
  and cleanups.

   - blk-mq timeout updates and fixes from Christoph.

   - Removal of REQ_END, also from Christoph.  We pass it through the
     ->queue_rq() hook for blk-mq instead, freeing up one of the request
     bits.  The space was overly tight on 32-bit, so Martin also killed
     REQ_KERNEL since it's no longer used.

   - blk integrity updates and fixes from Martin and Gu Zheng.

   - Update to the flush machinery for blk-mq from Ming Lei.  Now we
     have a per hardware context flush request, which both cleans up the
     code should scale better for flush intensive workloads on blk-mq.

   - Improve the error printing, from Rob Elliott.

   - Backing device improvements and cleanups from Tejun.

   - Fixup of a misplaced rq_complete() tracepoint from Hannes.

   - Make blk_get_request() return error pointers, fixing up issues
     where we NULL deref when a device goes bad or missing.  From Joe
     Lawrence.

   - Prep work for drastically reducing the memory consumption of dm
     devices from Junichi Nomura.  This allows creating clone bio sets
     without preallocating a lot of memory.

   - Fix a blk-mq hang on certain combinations of queue depths and
     hardware queues from me.

   - Limit memory consumption for blk-mq devices for crash dump
     scenarios and drivers that use crazy high depths (certain SCSI
     shared tag setups).  We now just use a single queue and limited
     depth for that"

* 'for-3.18/core' of git://git.kernel.dk/linux-block: (58 commits)
  block: Remove REQ_KERNEL
  blk-mq: allocate cpumask on the home node
  bio-integrity: remove the needless fail handle of bip_slab creating
  block: include func name in __get_request prints
  block: make blk_update_request print prefix match ratelimited prefix
  blk-merge: don't compute bi_phys_segments from bi_vcnt for cloned bio
  block: fix alignment_offset math that assumes io_min is a power-of-2
  blk-mq: Make bt_clear_tag() easier to read
  blk-mq: fix potential hang if rolling wakeup depth is too high
  block: add bioset_create_nobvec()
  block: use bio_clone_fast() in blk_rq_prep_clone()
  block: misplaced rq_complete tracepoint
  sd: Honor block layer integrity handling flags
  block: Replace strnicmp with strncasecmp
  block: Add T10 Protection Information functions
  block: Don't merge requests if integrity flags differ
  block: Integrity checksum flag
  block: Relocate bio integrity flags
  block: Add a disk flag to block integrity profile
  block: Add prefix to block integrity profile flags
  ...
2014-10-18 11:53:51 -07:00
Martin K. Petersen
8492b68bc4 block: Remove integrity tagging functions
None of the filesystems appear interested in using the integrity tagging
feature. Potentially because very few storage devices actually permit
using the application tag space.

Remove the tagging functions.

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2014-09-27 09:14:50 -06:00
Martin K. Petersen
180b2f95dd block: Replace bi_integrity with bi_special
For commands like REQ_COPY we need a way to pass extra information along
with each bio. Like integrity metadata this information must be
available at the bottom of the stack so bi_private does not suffice.

Rename the existing bi_integrity field to bi_special and make it a union
so we can have different bio extensions for each class of command.

We previously used bi_integrity != NULL as a way to identify whether a
bio had integrity metadata or not. Introduce a REQ_INTEGRITY to be the
indicator now that bi_special can contain different things.

In addition, bio_integrity(bio) will now return a pointer to the
integrity payload (when applicable).

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2014-09-27 09:14:46 -06:00
Martin K. Petersen
e7258c1a26 block: Get rid of bdev_integrity_enabled()
bdev_integrity_enabled() is only used by bio_integrity_enabled().
Combine these two functions.

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2014-09-27 09:14:44 -06:00
Arnd Hannemann
db4ced14c1 doc: queue-sysfs: minor fixes
This patches fixes a typo, and for consistency use
"IO" in upper case in the block/queue-sysfs.txt documentation.

Signed-off-by: Arnd Hannemann <arnd@arndnet.de>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2014-08-28 14:47:22 +02:00
Fam Zheng
a2787312e9 Documentation: Fix null_blk parameter irq_mode to irqmode
To match the real module parameter name we implemented.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2014-08-26 09:35:52 +02:00
Henrik Austad
3cf8ca1c25 Documentation/: update 00-INDEX files
Some of the 00-INDEX files are somewhat outdated and some folders does
not contain 00-INDEX at all.  Only outdated (with the notably exception
of spi) indexes are touched here, the 169 folders without 00-INDEX has
not been touched.

New 00-INDEX
 - spi/* was added in a series of commits dating back to 2006

Added files (missing in (*/)00-INDEX)
 - dmatest.txt was added by commit 851b7e16a0 ("dmatest: run test via
   debugfs")
 - this_cpu_ops.txt was added by commit a1b2a555d6 ("percpu: add
   documentation on this_cpu operations")
 - ww-mutex-design.txt was added by commit 040a0a3710 ("mutex: Add
   support for wound/wait style locks")
 - bcache.txt was added by commit cafe563591 ("bcache: A block layer
   cache")
 - kernel-per-CPU-kthreads.txt was added by commit 49717cb404
   ("kthread: Document ways of reducing OS jitter due to per-CPU
   kthreads")
 - phy.txt was added by commit ff76496347 ("drivers: phy: add generic
   PHY framework")
 - block/null_blk was added by commit 12f8f4fc03 ("null_blk:
   documentation")
 - module-signing.txt was added by commit 3cafea3076 ("Add
   Documentation/module-signing.txt file")
 - assoc_array.txt was added by commit 3cb989501c ("Add a generic
   associative array implementation.")
 - arm/IXP4xx was part of the initial repo
 - arm/cluster-pm-race-avoidance.txt was added by commit 7fe31d28e8
   ("ARM: mcpm: introduce helpers for platform coherency exit/setup")
 - arm/firmware.txt was added by commit 7366b92a77 ("ARM: Add
   interface for registering and calling firmware-specific operations")
 - arm/kernel_mode_neon.txt was added by commit 2afd0a0524 ("ARM:
   7825/1: document the use of NEON in kernel mode")
 - arm/tcm.txt was added by commit bc581770cf ("ARM: 5580/2: ARM TCM
   (Tightly-Coupled Memory) support v3")
 - arm/vlocks.txt was added by commit 9762f12d3e ("ARM: mcpm: Add
   baremetal voting mutexes")
 - blackfin/gptimers-example.c, Makefile was added by commit
   4b60779d5e ("Blackfin: add an example showing how to use the
   gptimers API")
 - devicetree/usage-model.txt was added by commit 31134efc68 ("dt:
   Linux DT usage model documentation")
 - fb/api.txt was added by commit fb21c2f428 ("fbdev: Add FOURCC-based
   format configuration API")
 - fb/sm501.txt was added by commit e6a0498071 ("video, sm501: add
   edid and commandline support")
 - fb/udlfb.txt was added by commit 96f8d864af ("fbdev: move udlfb out
   of staging.")
 - filesystems/Makefile was added by commit 1e0051ae48
   ("Documentation/fs/: split txt and source files")
 - filesystems/nfs/nfsd-admin-interfaces.txt was added by commit
   8a4c6e19cf ("nfsd: document kernel interfaces for nfsd
   configuration")
 - ide/warm-plug-howto.txt was added by commit f74c91413e ("ide: add
   warm-plug support for IDE devices (take 2)")
 - laptops/Makefile was added by commit d49129accc
   ("Documentation/laptop/: split txt and source files")
 - leds/leds-blinkm.txt was added by commit b54cf35a7f ("LEDS: add
   BlinkM RGB LED driver, documentation and update MAINTAINERS")
 - leds/ledtrig-oneshot.txt was added by commit 5e417281cd ("leds: add
   oneshot trigger")
 - leds/ledtrig-transient.txt was added by commit 44e1e9f8e7 ("leds:
   add new transient trigger for one shot timer activation")
 - m68k/README.buddha was part of the initial repo
 - networking/LICENSE.(qla3xxx|qlcnic|qlge) was added by commits
   40839129f7, c4e84bde1d, 5a4faa8737
 - networking/Makefile was added by commit 3794f3e812 ("docsrc: build
   Documentation/ sources")
 - networking/i40evf.txt was added by commit 105bf2fe6b ("i40evf: add
   driver to kernel build system")
 - networking/ipsec.txt was added by commit b3c6efbc36 ("xfrm: Add
   file to document IPsec corner case")
 - networking/mac80211-auth-assoc-deauth.txt was added by commit
   3cd7920a2b ("mac80211: add auth/assoc/deauth flow diagram")
 - networking/netlink_mmap.txt was added by commit 5683264c39
   ("netlink: add documentation for memory mapped I/O")
 - networking/nf_conntrack-sysctl.txt was added by commit c9f9e0e159
   ("netfilter: doc: add nf_conntrack sysctl api documentation") lan)
 - networking/team.txt was added by commit 3d249d4ca7 ("net: introduce
   ethernet teaming device")
 - networking/vxlan.txt was added by commit d342894c5d ("vxlan:
   virtual extensible lan")
 - power/runtime_pm.txt was added by commit 5e928f77a0 ("PM: Introduce
   core framework for run-time PM of I/O devices (rev.  17)")
 - power/charger-manager.txt was added by commit 3bb3dbbd56
   ("power_supply: Add initial Charger-Manager driver")
 - RCU/lockdep-splat.txt was added by commit d7bd2d68aa ("rcu:
   Document interpretation of RCU-lockdep splats")
 - s390/kvm.txt was added by 5ecee4b (KVM: s390: API documentation)
 - s390/qeth.txt was added by commit b4d72c08b3 ("qeth: bridgeport
   support - basic control")
 - scheduler/sched-bwc.txt was added by commit 88ebc08ea9 ("sched: Add
   documentation for bandwidth control")
 - scsi/advansys.txt was added by commit 4bd6d7f356 ("[SCSI] advansys:
   Move documentation to Documentation/scsi")
 - scsi/bfa.txt was added by commit 1ec90174bd ("[SCSI] bfa: add
   readme file")
 - scsi/bnx2fc.txt was added by commit 12b8fc10ea ("[SCSI] bnx2fc: Add
   driver documentation")
 - scsi/cxgb3i.txt was added by commit c3673464eb ("[SCSI] cxgb3i: Add
   cxgb3i iSCSI driver.")
 - scsi/hpsa.txt was added by commit 992ebcf14f ("[SCSI] hpsa: Add
   hpsa.txt to Documentation/scsi")
 - scsi/link_power_management_policy.txt was added by commit
   ca77329fb7 ("[libata] Link power management infrastructure")
 - scsi/osd.txt was added by commit 78e0c621de ("[SCSI] osd:
   Documentation for OSD library")
 - scsi/scsi-parameter.txt was created/moved by commit 163475fb11
   ("Documentation: move SCSI parameters to their own text file")
 - serial/driver was part of the initial repo
 - serial/n_gsm.txt was added by commit 323e84122e ("n_gsm: add a
   documentation")
 - timers/Makefile was added by commit 3794f3e812 ("docsrc: build
   Documentation/ sources")
 - virt/kvm/s390.txt was added by commit d9101fca3d ("KVM: s390:
   diagnose call documentation")
 - vm/split_page_table_lock was added by commit 49076ec2cc ("mm:
   dynamically allocate page->ptl if it cannot be embedded to struct
   page")
 - w1/slaves/w1_ds28e04 was added by commit fbf7f7b4e2 ("w1: Add
   1-wire slave device driver for DS28E04-100")
 - w1/masters/omap-hdq was added by commit e0a29382c6 ("hdq:
   documentation for OMAP HDQ")
 - x86/early-microcode.txt was added by commit 0d91ea86a8 ("x86, doc:
   Documentation for early microcode loading")
 - x86/earlyprintk.txt was added by commit a1aade4788 ("x86/doc:
   mini-howto for using earlyprintk=dbgp")
 - x86/entry_64.txt was added by commit 8b4777a4b5 ("x86-64: Document
   some of entry_64.S")
 - x86/pat.txt was added by commit d27554d874 ("x86: PAT
   documentation")

Moved files
 - arm/kernel_user_helpers.txt was moved out of arch/arm/kernel by
   commit 37b8304642 ("ARM: kuser: move interface documentation out of
   the source code")
 - efi-stub.txt was moved out of x86/ and down into Documentation/ in
   commit 4172fe2f8a ("EFI stub documentation updates")
 - laptops/hpfall.c was moved out of hwmon/ and into laptops/ in commit
   efcfed9bad ("Move hp_accel to drivers/platform/x86")
 - commit 5616c23ad9 ("x86: doc: move x86-generic documentation from
   Doc/x86/i386"):
   * x86/usb-legacy-support.txt
   * x86/boot.txt
   * x86/zero_page.txt
 - power/video_extension.txt was moved to acpi in commit 70e66e4df1
   ("ACPI / video: move video_extension.txt to Documentation/acpi")

Removed files (left in 00-INDEX)
 - memory.txt was removed by commit 00ea8990aa ("memory.txt: remove
   stray information")
 - gpio.txt was moved to gpio/ in commit fd8e198cfc ("Documentation:
   gpiolib: document new interface")
 - networking/DLINK.txt was removed by commit 168e06ae26
   ("drivers/net: delete old parallel port de600/de620 drivers")
 - serial/hayes-esp.txt was removed by commit f53a2ade0b ("tty: esp:
   remove broken driver")
 - s390/TAPE was removed by commit 9e280f6693 ("[S390] remove tape
   block docu")
 - vm/locking was removed by commit 57ea8171d2 ("mm: documentation:
   remove hopelessly out-of-date locking doc")
 - laptops/acer-wmi.txt was remvoed by commit 020036678e ("acer-wmi:
   Delete out-of-date documentation")

Typos/misc issues
 - rpc-server-gss.txt was added as knfsd-rpcgss.txt in commit
   030d794bf4 ("SUNRPC: Use gssproxy upcall for server RPCGSS
   authentication.")
 - commit b88cf73d92 ("net: add missing entries to
   Documentation/networking/00-INDEX")
   * generic-hdlc.txt was added as generic_hdlc.txt
   * spider_net.txt was added as spider-net.txt
 - w1/master/mxc-w1 was added as mxc_w1 by commit a5fd9139f7 ("w1: add
   1-wire master driver for i.MX27 / i.MX31")
 - s390/zfcpdump.txt was added as zfcpdump by commit 6920c12a40
   ("[S390] Add Documentation/s390/00-INDEX.")

Signed-off-by: Henrik Austad <henrik@austad.us>
Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	[rcu bits]
Acked-by: Rob Landley <rob@landley.net>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Rob Herring <robh+dt@kernel.org>
Cc: David S. Miller <davem@davemloft.net>
Cc: Mark Brown <broonie@kernel.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Gleb Natapov <gleb@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Len Brown <len.brown@intel.com>
Cc: James Bottomley <JBottomley@parallels.com>
Cc: Jean-Christophe Plagniol-Villard <plagnioj@jcrosoft.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-02-10 16:01:40 -08:00
Jens Axboe
b28bc9b38c Linux 3.13-rc6
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJSwLfoAAoJEHm+PkMAQRiGi6QH/1U1B7lmHChDTw3jj1lfm9gA
 189Si4QJlnxFWCKHvKEL+pcaVuACU+aMGI8+KyMYK4/JfuWVjjj5fr/SvyHH2/8m
 LdSK8aHMhJ46uBS4WJ/l6v46qQa5e2vn8RKSBAyKm/h4vpt+hd6zJdoFrFai4th7
 k/TAwOAEHI5uzexUChwLlUBRTvbq4U8QUvDu+DeifC8cT63CGaaJ4qVzjOZrx1an
 eP6UXZrKDASZs7RU950i7xnFVDQu4PsjlZi25udsbeiKcZJgPqGgXz5ULf8ZH8RQ
 YCi1JOnTJRGGjyIOyLj7pyB01h7XiSM2+eMQ0S7g54F2s7gCJ58c2UwQX45vRWU=
 =/4/R
 -----END PGP SIGNATURE-----

Merge tag 'v3.13-rc6' into for-3.14/core

Needed to bring blk-mq uptodate, since changes have been going in
since for-3.14/core was established.

Fixup merge issues related to the immutable biovec changes.

Signed-off-by: Jens Axboe <axboe@kernel.dk>

Conflicts:
	block/blk-flush.c
	fs/btrfs/check-integrity.c
	fs/btrfs/extent_io.c
	fs/btrfs/scrub.c
	fs/logfs/dev_bdev.c
2013-12-31 09:51:02 -07:00
Matias Bjørling
200052440d null_blk: set use_per_node_hctx param to false
The defaults for the module is to instantiate itself with blk-mq and a
submit queue for each CPU node in the system.

To save resources, initialize instead with a single submit queue.

Signed-off-by: Matias Bjorling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2013-12-21 09:30:33 -07:00
Matias Bjørling
89ed05eea0 null_blk: corrections to documentation
Randy Dunlap reported a couple of grammar errors and unfortunate usages of
socket/node/core.

Signed-off-by: Matias Bjorling <m@bjorling.me>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2013-12-21 09:30:33 -07:00
Matias Bjorling
12f8f4fc03 null_blk: documentation
Add description of module and its parameters.

Signed-off-by: Matias Bjorling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2013-12-19 08:09:41 -07:00
Kent Overstreet
4550dd6c6b block: Immutable bio vecs
This adds a mechanism by which we can advance a bio by an arbitrary
number of bytes without modifying the biovec: bio->bi_iter.bi_bvec_done
indicates the number of bytes completed in the current bvec.

Various driver code still needs to be updated to not refer to the bvec
directly before we can use this for interesting things, like efficient
bio splitting.

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Lars Ellenberg <drbd-dev@lists.linbit.com>
Cc: Paul Clements <Paul.Clements@steeleye.com>
Cc: drbd-user@lists.linbit.com
Cc: nbd-general@lists.sourceforge.net
2013-11-23 22:33:49 -08:00
Kent Overstreet
4f024f3797 block: Abstract out bvec iterator
Immutable biovecs are going to require an explicit iterator. To
implement immutable bvecs, a later patch is going to add a bi_bvec_done
member to this struct; for now, this patch effectively just renames
things.

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: "Ed L. Cashin" <ecashin@coraid.com>
Cc: Nick Piggin <npiggin@kernel.dk>
Cc: Lars Ellenberg <drbd-dev@lists.linbit.com>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Matthew Wilcox <willy@linux.intel.com>
Cc: Geoff Levand <geoff@infradead.org>
Cc: Yehuda Sadeh <yehuda@inktank.com>
Cc: Sage Weil <sage@inktank.com>
Cc: Alex Elder <elder@inktank.com>
Cc: ceph-devel@vger.kernel.org
Cc: Joshua Morris <josh.h.morris@us.ibm.com>
Cc: Philip Kelleher <pjk1939@linux.vnet.ibm.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Neil Brown <neilb@suse.de>
Cc: Alasdair Kergon <agk@redhat.com>
Cc: Mike Snitzer <snitzer@redhat.com>
Cc: dm-devel@redhat.com
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: linux390@de.ibm.com
Cc: Boaz Harrosh <bharrosh@panasas.com>
Cc: Benny Halevy <bhalevy@tonian.com>
Cc: "James E.J. Bottomley" <JBottomley@parallels.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "Nicholas A. Bellinger" <nab@linux-iscsi.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Chris Mason <chris.mason@fusionio.com>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: Andreas Dilger <adilger.kernel@dilger.ca>
Cc: Jaegeuk Kim <jaegeuk.kim@samsung.com>
Cc: Steven Whitehouse <swhiteho@redhat.com>
Cc: Dave Kleikamp <shaggy@kernel.org>
Cc: Joern Engel <joern@logfs.org>
Cc: Prasad Joshi <prasadjoshi.linux@gmail.com>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: KONISHI Ryusuke <konishi.ryusuke@lab.ntt.co.jp>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Ben Myers <bpm@sgi.com>
Cc: xfs@oss.sgi.com
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Len Brown <len.brown@intel.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: Herton Ronaldo Krzesinski <herton.krzesinski@canonical.com>
Cc: Ben Hutchings <ben@decadent.org.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Guo Chao <yan@linux.vnet.ibm.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Asai Thambi S P <asamymuthupa@micron.com>
Cc: Selvan Mani <smani@micron.com>
Cc: Sam Bradshaw <sbradshaw@micron.com>
Cc: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Cc: "Roger Pau Monné" <roger.pau@citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Cc: Ian Campbell <Ian.Campbell@citrix.com>
Cc: Sebastian Ott <sebott@linux.vnet.ibm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Jiang Liu <jiang.liu@huawei.com>
Cc: Nitin Gupta <ngupta@vflare.org>
Cc: Jerome Marchand <jmarchand@redhat.com>
Cc: Joe Perches <joe@perches.com>
Cc: Peng Tao <tao.peng@emc.com>
Cc: Andy Adamson <andros@netapp.com>
Cc: fanchaoting <fanchaoting@cn.fujitsu.com>
Cc: Jie Liu <jeff.liu@oracle.com>
Cc: Sunil Mushran <sunil.mushran@gmail.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: Namjae Jeon <namjae.jeon@samsung.com>
Cc: Pankaj Kumar <pankaj.km@samsung.com>
Cc: Dan Magenheimer <dan.magenheimer@oracle.com>
Cc: Mel Gorman <mgorman@suse.de>6
2013-11-23 22:33:47 -08:00
Paul Gortmaker
080506ad0a block: change config option name for cmdline partition parsing
Recently commit bab55417b1 ("block: support embedded device command
line partition") introduced CONFIG_CMDLINE_PARSER.  However, that name
is too generic and sounds like it enables/disables generic kernel boot
arg processing, when it really is block specific.

Before this option becomes a part of a full/final release, add the BLK_
prefix to it so that it is clear in absence of any other context that it
is block specific.

In addition, fix up the following less critical items:
 - help text was not really at all helpful.
 - index file for Documentation was not updated
 - add the new arg to Documentation/kernel-parameters.txt
 - clarify wording in source comments

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Cai Zhiyong <caizhiyong@huawei.com>
Cc: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-09-30 14:31:02 -07:00
Cai Zhiyong
bab55417b1 block: support embedded device command line partition
Read block device partition table from command line.  The partition used
for fixed block device (eMMC) embedded device.  It is no MBR, save
storage space.  Bootloader can be easily accessed by absolute address of
data on the block device.  Users can easily change the partition.

This code reference MTD partition, source "drivers/mtd/cmdlinepart.c"
About the partition verbose reference
"Documentation/block/cmdline-partition.txt"

[akpm@linux-foundation.org: fix printk text]
[yongjun_wei@trendmicro.com.cn: fix error return code in parse_parts()]
Signed-off-by: Cai Zhiyong <caizhiyong@huawei.com>
Cc: Karel Zak <kzak@redhat.com>
Cc: "Wanglin (Albert)" <albert.wanglin@huawei.com>
Cc: Marius Groeger <mag@sysgo.de>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Brian Norris <computersforpeace@gmail.com>
Cc: Artem Bityutskiy <dedekind@infradead.org>
Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-09-11 15:56:57 -07:00
Masanari Iida
c9f3f2d8b3 doc: Fix typo in doucmentations
Correct typo (double words) in documentations.

Signed-off-by: Masanari Iida <standby24x7@gmail.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2013-07-25 12:34:15 +02:00
Anatol Pomozov
f884ab15af doc: fix misspellings with 'codespell' tool
Signed-off-by: Anatol Pomozov <anatol.pomozov@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2013-05-28 12:02:12 +02:00
Namjae Jeon
fdc6fdc52e Documentation: cfq-iosched: update documentation help for cfq tunables
Add the documentation text for latency, target_latency & group_idle
tunnable parameters in the block/cfq-iosched.txt.
Also fix few typo(spelling) mistakes.

Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: Amit Sahrawat <a.sahrawat@samsung.com>

Language somewhat modified by Jens.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
2013-04-09 14:57:06 +02:00
Tejun Heo
d02f7aa8dc cfq-iosched: enable full blkcg hierarchy support
With the previous two patches, all cfqg scheduling decisions are based
on vfraction and ready for hierarchy support.  The only thing which
keeps the behavior flat is cfqg_flat_parent() which makes vfraction
calculation consider all non-root cfqgs children of the root cfqg.

Replace it with cfqg_parent() which returns the real parent.  This
enables full blkcg hierarchy support for cfq-iosched.  For example,
consider the following hierarchy.

        root
      /      \
   A:500      B:250
  /     \
 AA:500  AB:1000

For simplicity, let's say all the leaf nodes have active tasks and are
on service tree.  For each leaf node, vfraction would be

 AA: (500  / 1500) * (500 / 750) =~ 0.2222
 AB: (1000 / 1500) * (500 / 750) =~ 0.4444
  B:                 (250 / 750) =~ 0.3333

and vdisktime will be distributed accordingly.  For more detail,
please refer to Documentation/block/cfq-iosched.txt.

v2: cfq-iosched.txt updated to describe group scheduling as suggested
    by Vivek.

v3: blkio-controller.txt updated.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
2013-01-09 08:05:11 -08:00
Kent Overstreet
4254bba17d block: Kill bi_destructor
Now that we've got generic code for freeing bios allocated from bio
pools, this isn't needed anymore.

This patch also makes bio_free() static, since without bi_destructor
there should be no need for it to be called anywhere else.

bio_free() is now only called from bio_put, so we can refactor those a
bit - move some code from bio_put() to bio_free() and kill the redundant
bio->bi_next = NULL.

v5: Switch to BIO_KMALLOC_POOL ((void *)~0), per Boaz
v6: BIO_KMALLOC_POOL now NULL, drop bio_free's EXPORT_SYMBOL
v7: No #define BIO_KMALLOC_POOL anymore

Signed-off-by: Kent Overstreet <koverstreet@google.com>
CC: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2012-09-09 10:35:39 +02:00
Namjae Jeon
4004e90cfa Documentation: update tunable options in block/cfq-iosched.txt
Update tunable options in block/cfq-iosched.txt.

Signed-off-by: Namjae Jeon <linkinjeon@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2012-08-09 15:28:05 +02:00
Namjae Jeon
2792d87193 Documentation: update tunable options in block/cfq-iosched.txt
Update tunable options in block/cfq-iosched.txt.

Signed-off-by: Namjae Jeon <linkinjeon@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2012-08-09 15:27:29 +02:00
Namjae Jeon
865826644e Documentation: update missing index files in block/00-INDEX
Update missing index files in block/00-INDEX.

Signed-off-by: Namjae Jeon <linkinjeon@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2012-08-09 15:27:28 +02:00
Tejun Heo
a051661ca6 blkcg: implement per-blkg request allocation
Currently, request_queue has one request_list to allocate requests
from regardless of blkcg of the IO being issued.  When the unified
request pool is used up, cfq proportional IO limits become meaningless
- whoever grabs the next request being freed wins the race regardless
of the configured weights.

This can be easily demonstrated by creating a blkio cgroup w/ very low
weight, put a program which can issue a lot of random direct IOs there
and running a sequential IO from a different cgroup.  As soon as the
request pool is used up, the sequential IO bandwidth crashes.

This patch implements per-blkg request_list.  Each blkg has its own
request_list and any IO allocates its request from the matching blkg
making blkcgs completely isolated in terms of request allocation.

* Root blkcg uses the request_list embedded in each request_queue,
  which was renamed to @q->root_rl from @q->rq.  While making blkcg rl
  handling a bit harier, this enables avoiding most overhead for root
  blkcg.

* Queue fullness is properly per request_list but bdi isn't blkcg
  aware yet, so congestion state currently just follows the root
  blkcg.  As writeback isn't aware of blkcg yet, this works okay for
  async congestion but readahead may get the wrong signals.  It's
  better than blkcg completely collapsing with shared request_list but
  needs to be improved with future changes.

* After this change, each block cgroup gets a full request pool making
  resource consumption of each cgroup higher.  This makes allowing
  non-root users to create cgroups less desirable; however, note that
  allowing non-root users to directly manage cgroups is already
  severely broken regardless of this patch - each block cgroup
  consumes kernel memory and skews IO weight (IO weights are not
  hierarchical).

v2: queue-sysfs.txt updated and patch description udpated as suggested
    by Vivek.

v3: blk_get_rl() wasn't checking error return from
    blkg_lookup_create() and may cause oops on lookup failure.  Fix it
    by falling back to root_rl on blkg lookup failures.  This problem
    was spotted by Rakesh Iyer <rni@google.com>.

v4: Updated to accomodate 458f27a982 "block: Avoid missed wakeup in
    request waitqueue".  blk_drain_queue() now wakes up waiters on all
    blkg->rl on the target queue.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2012-06-26 18:42:49 -04:00
Wang Sheng-Hui
b31966816d Documentation: drop as block elevator reference in switching-sched.txt
Remove 'as' for as is no longer supported, and we can not use
'elevator=as' any more.

Signed-off-by: Wang Sheng-Hui <shhuiw@gmail.com>
Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-11-04 12:01:48 -07:00
Paul Bolle
395cf9691d doc: fix broken references
There are numerous broken references to Documentation files (in other
Documentation files, in comments, etc.). These broken references are
caused by typo's in the references, and by renames or removals of the
Documentation files. Some broken references are simply odd.

Fix these broken references, sometimes by dropping the irrelevant text
they were part of.

Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2011-09-27 18:08:04 +02:00
Vivek Goyal
4931402a9d cfq-iosched: Add documentation about idling
There are always questions about why CFQ is idling on various conditions.
Recent ones is Christoph asking again why to idle on REQ_NOIDLE. His
assertion is that XFS is relying more and more on workqueues and is
concerned that CFQ idling on IO from every workqueue will impact
XFS badly.

So he suggested that I add some more documentation about CFQ idling
and that can provide more clarity on the topic and also gives an
opprotunity to poke a hole in theory and lead to improvements.

So here is my attempt at that. Any comments are welcome.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-08-05 09:42:20 +02:00
Dan Williams
5757a6d76c block: strict rq_affinity
Some systems benefit from completions always being steered to the strict
requester cpu rather than the looser "per-socket" steering that
blk_cpu_to_group() attempts by default. This is because the first
CPU in the group mask ends up being completely overloaded with work,
while the others (including the original submitter) has power left
to spare.

Allow the strict mode to be set by writing '2' to the sysfs control
file. This is identical to the scheme used for the nomerges file,
where '2' is a more aggressive setting than just being turned on.

echo 2 > /sys/block/<bdev>/queue/rq_affinity

Cc: Christoph Hellwig <hch@infradead.org>
Cc: Roland Dreier <roland@purestorage.com>
Tested-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-07-23 20:44:25 +02:00
Lucas De Marchi
25985edced Fix common misspellings
Fixes generated by 'codespell' and manually reviewed.

Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>
2011-03-31 11:26:23 -03:00
Jens Axboe
7eaceaccab block: remove per-queue plugging
Code has been converted over to the new explicit on-stack plugging,
and delay users have been converted to use the new API for that.
So lets kill off the old plugging along with aops->sync_page().

Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-03-10 08:52:07 +01:00
Randy Dunlap
17a9e7bbae Documentation: remove anticipatory scheduler info
Remove anticipatory block I/O scheduler info from Documentation/
since the code has been deleted.

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Reported-by: "Robert P. J. Day" <rpjday@crashcourse.ca>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-11-11 12:09:59 +01:00
Jens Axboe
fa251f8990 Merge branch 'v2.6.36-rc8' into for-2.6.37/barrier
Conflicts:
	block/blk-core.c
	drivers/block/loop.c
	mm/swapfile.c

Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-10-19 09:13:04 +02:00
Christoph Hellwig
04ccc65cd1 block: update documentation for REQ_FLUSH / REQ_FUA
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-09-10 12:35:37 +02:00
Vivek Goyal
6d6ac1c1a3 cfq-iosched: Documentation help for new tunables
Some documentation to provide help with tunables.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Acked-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-08-23 12:25:29 +02:00
Nick Piggin
6e57559022 nick piggin: change email address
Signed-off-by: Nick Piggin <npiggin@kernel.dk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-08-05 08:45:20 -07:00
FUJITA Tomonori
c2282adbde Documentation: fix block/biodoc.txt dma mapping description
- It looks incorrect to use blk_rq_map_sg with pci_map_page here about
  DMA mappings. dma_map_sg?

- better to use dma_map_page instead of pci_map_page.
  http://marc.info/?l=linux-kernel&m=126596737604808&w=2

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2010-03-08 09:11:07 +01:00
Jens Axboe
f11cbd74c5 Merge branch 'master' into for-2.6.34 2010-02-22 13:48:51 +01:00
Alan D. Brunelle
488991e28e block: Added in stricter no merge semantics for block I/O
Updated 'nomerges' tunable to accept a value of '2' - indicating that _no_
merges at all are to be attempted (not even the simple one-hit cache).

The following table illustrates the additional benefit - 5 minute runs of
a random I/O load were applied to a dozen devices on a 16-way x86_64 system.

nomerges        Throughput      %System         Improvement (tput / %sys)
--------        ------------    -----------     -------------------------
0               12.45 MB/sec    0.669365609
1               12.50 MB/sec    0.641519199     0.40% / 2.71%
2               12.52 MB/sec    0.639849750     0.56% / 2.96%

Signed-off-by: Alan D. Brunelle <alan.brunelle@hp.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2010-01-29 09:04:08 +01:00