| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add support for lz4 and lz4hc compression algorithm using the lib/lz4/*
codebase.
[akpm@linux-foundation.org: fix warnings]
Signed-off-by: Chanho Min <chanho.min@lge.com>
Cc: "Darrick J. Wong" <djwong@us.ibm.com>
Cc: Bob Pearson <rpearson@systemfabricworks.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Herbert Xu <herbert@gondor.hengli.com.au>
Cc: Yann Collet <yann.collet.73@gmail.com>
Cc: Kyungsik Lee <kyungsik.lee@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Conflicts:
crypto/Kconfig
crypto/Makefile
Change-Id: Ic3495647e3a902a12ae1b1951ff9bd89011696e0
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch adds ARM NEON assembly implementation of SHA-512 and SHA-384
algorithms.
tcrypt benchmark results on Cortex-A8, sha512-generic vs sha512-neon-asm:
block-size bytes/update old-vs-new
16 16 2.99x
64 16 2.67x
64 64 3.00x
256 16 2.64x
256 64 3.06x
256 256 3.33x
1024 16 2.53x
1024 256 3.39x
1024 1024 3.52x
2048 16 2.50x
2048 256 3.41x
2048 1024 3.54x
2048 2048 3.57x
4096 16 2.49x
4096 256 3.42x
4096 1024 3.56x
4096 4096 3.59x
8192 16 2.48x
8192 256 3.42x
8192 1024 3.56x
8192 4096 3.60x
8192 8192 3.60x
Change-Id: Ibc318f8c9136507f57e2bb8d8f51b4714d8ed70b
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Iliyan Malchev <malchev@google.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch adds ARM NEON assembly implementation of SHA-1 algorithm.
tcrypt benchmark results on Cortex-A8, sha1-arm-asm vs sha1-neon-asm:
block-size bytes/update old-vs-new
16 16 1.04x
64 16 1.02x
64 64 1.05x
256 16 1.03x
256 64 1.04x
256 256 1.30x
1024 16 1.03x
1024 256 1.36x
1024 1024 1.52x
2048 16 1.03x
2048 256 1.39x
2048 1024 1.55x
2048 2048 1.59x
4096 16 1.03x
4096 256 1.40x
4096 1024 1.57x
4096 4096 1.62x
8192 16 1.03x
8192 256 1.40x
8192 1024 1.58x
8192 4096 1.63x
8192 8192 1.63x
Change-Id: I6df3c0a9ba8d450d034cf78785b6ce80a72bef4a
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Iliyan Malchev <malchev@google.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Bit sliced AES gives around 45% speedup on Cortex-A15 for encryption
and around 25% for decryption. This implementation of the AES algorithm
does not rely on any lookup tables so it is believed to be invulnerable
to cache timing attacks.
This algorithm processes up to 8 blocks in parallel in constant time. This
means that it is not usable by chaining modes that are strictly sequential
in nature, such as CBC encryption. CBC decryption, however, can benefit from
this implementation and runs about 25% faster. The other chaining modes
implemented in this module, XTS and CTR, can execute fully in parallel in
both directions.
The core code has been adopted from the OpenSSL project (in collaboration
with the original author, on cc). For ease of maintenance, this version is
identical to the upstream OpenSSL code, i.e., all modifications that were
required to make it suitable for inclusion into the kernel have been made
upstream. The original can be found here:
http://git.openssl.org/gitweb/?p=openssl.git;a=commit;h=6f6a6130
Note to integrators:
While this implementation is significantly faster than the existing table
based ones (generic or ARM asm), especially in CTR mode, the effects on
power efficiency are unclear as of yet. This code does fundamentally more
work, by calculating values that the table based code obtains by a simple
lookup; only by doing all of that work in a SIMD fashion, it manages to
perform better.
Change-Id: Ife4f79ce9e8994e248d6fc01fcb23b0534265418
Cc: Andy Polyakov <appro@openssl.org>
Acked-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add assembler versions of AES and SHA1 for ARM platforms. This has provided
up to a 50% improvement in IPsec/TCP throughout for tunnels using AES128/SHA1.
Platform CPU SPeed Endian Before (bps) After (bps) Improvement
IXP425 533 MHz big 11217042 15566294 ~38%
KS8695 166 MHz little 3828549 5795373 ~51%
Change-Id: I5b77e7aa89c8b1d54aef75065827325e90305638
Signed-off-by: David McCullough <ucdevel@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Loading fpu without aesni-intel does nothing. Loading aesni-intel
without fpu causes modes like xts to fail. (Unloading
aesni-intel will restore those modes.)
One solution would be to make aesni-intel depend on fpu, but it
seems cleaner to just combine the modules.
This is probably responsible for bugs like:
https://bugzilla.redhat.com/show_bug.cgi?id=589390
Signed-off-by: Andy Lutomirski <luto@mit.edu>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
|
|
|
|
|
| |
This feature no longer needs the experimental tag.
Reported-by: Toralf Förster <toralf.foerster@gmx.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add missing dependency on NET since we require sockets for our
interface.
Should really be a select but kconfig doesn't like that:
net/Kconfig:6:error: found recursive dependency: NET -> NETWORK_FILESYSTEMS -> AFS_FS -> AF_RXRPC -> CRYPTO -> CRYPTO_USER_API_HASH -> CRYPTO_USER_API -> NET
Reported-by: Zimny Lech <napohybelskurwysynom2010@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The AES-NI instructions are also available in legacy mode so the 32-bit
architecture may profit from those, too.
To illustrate the performance gain here's a short summary of a dm-crypt
speed test on a Core i7 M620 running at 2.67GHz comparing both assembler
implementations:
x86: i568 aes-ni delta
ECB, 256 bit: 93.8 MB/s 123.3 MB/s +31.4%
CBC, 256 bit: 84.8 MB/s 262.3 MB/s +209.3%
LRW, 256 bit: 108.6 MB/s 222.1 MB/s +104.5%
XTS, 256 bit: 105.0 MB/s 205.5 MB/s +95.7%
Additionally, due to some minor optimizations, the 64-bit version also
got a minor performance gain as seen below:
x86-64: old impl. new impl. delta
ECB, 256 bit: 121.1 MB/s 123.0 MB/s +1.5%
CBC, 256 bit: 285.3 MB/s 290.8 MB/s +1.9%
LRW, 256 bit: 263.7 MB/s 265.3 MB/s +0.6%
XTS, 256 bit: 251.1 MB/s 255.3 MB/s +1.7%
Signed-off-by: Mathias Krause <minipli@googlemail.com>
Reviewed-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch adds the af_alg plugin for symmetric key ciphers,
corresponding to the ablkcipher kernel operation type.
Keys can optionally be set through the setsockopt interface.
Once a sendmsg call occurs without MSG_MORE no further writes
may be made to the socket until all previous data has been read.
IVs and and whether encryption/decryption is performed can be
set through the setsockopt interface or as a control message
to sendmsg.
The interface is completely synchronous, all operations are
carried out in recvmsg(2) and will complete prior to the system
call returning.
The splice(2) interface support reading the user-space data directly
without copying (except that the Crypto API itself may copy the data
if alignment is off).
The recvmsg(2) interface supports directly writing to user-space
without additional copying, i.e., the kernel crypto interface will
receive the user-space address as its output SG list.
Thakns to Miloslav Trmac for reviewing this and contributing
fixes and improvements.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: David S. Miller <davem@davemloft.net>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch adds the af_alg plugin for hash, corresponding to
the ahash kernel operation type.
Keys can optionally be set through the setsockopt interface.
Each sendmsg call will finalise the hash unless sent with a MSG_MORE
flag.
Partial hash states can be cloned using accept(2).
The interface is completely synchronous, all operations will
complete prior to the system call returning.
Both sendmsg(2) and splice(2) support reading the user-space
data directly without copying (except that the Crypto API itself
may copy the data if alignment is off).
For now only the splice(2) interface supports performing digest
instead of init/update/final. In future the sendmsg(2) interface
will also be modified to use digest/finup where possible so that
hardware that cannot return a partial hash state can still benefit
from this interface.
Thakns to Miloslav Trmac for reviewing this and contributing
fixes and improvements.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: David S. Miller <davem@davemloft.net>
Tested-by: Martin Willi <martin@strongswan.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch creates the backbone of the user-space interface for
the Crypto API, through a new socket family AF_ALG.
Each session corresponds to one or more connections obtained from
that socket. The number depends on the number of inputs/outputs
of that particular type of operation. For most types there will
be a s ingle connection/file descriptor that is used for both input
and output. AEAD is one of the few that require two inputs.
Each algorithm type will provide its own implementation that plugs
into af_alg. They're keyed using a string such as "skcipher" or
"hash".
IOW this patch only contains the boring bits that is required
to hold everything together.
Thakns to Miloslav Trmac for reviewing this and contributing
fixes and improvements.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: David S. Miller <davem@davemloft.net>
Tested-by: Martin Willi <martin@strongswan.org>
|
|
|
|
|
|
|
|
|
|
| |
Below is a patch to update the broken web addresses, in crypto/*
that I could locate. Some are just simple typos that needed to be
fixed, and some had a change in location altogether..
let me know if any of them need to be changed and such.
Signed-off-by: Justin P. Mattock <justinmattock@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
|
|
|
| |
Signed-off-by: Chuck Ebbert <cebbert@redhat.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
On Thu, Aug 05, 2010 at 07:01:21PM -0700, Linus Torvalds wrote:
> On Thu, Aug 5, 2010 at 6:40 PM, Herbert Xu <herbert@gondor.hengli.com.au> wrote:
> >
> > -config CRYPTO_MANAGER_TESTS
> > - bool "Run algolithms' self-tests"
> > - default y
> > - depends on CRYPTO_MANAGER2
> > +config CRYPTO_MANAGER_DISABLE_TESTS
> > + bool "Disable run-time self tests"
> > + depends on CRYPTO_MANAGER2 && EMBEDDED
>
> Why do you still want to force-enable those tests? I was going to
> complain about the "default y" anyway, now I'm _really_ complaining,
> because you've now made it impossible to disable those tests. Why?
As requested, this patch sets the default to y and removes the
EMBEDDED dependency.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch fixes a serious bug in the test disabling patch where
it can cause an spurious load of the cryptomgr module even when
it's compiled in.
It also negates the test disabling option so that its absence
causes tests to be enabled.
The Kconfig option is also now behind EMBEDDED.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
|
|
|
|
|
|
|
| |
By default, CONFIG_CRYPTO_MANAGER_TESTS will be enabled and thus
self-tests will still run, but it is now possible to disable them
to gain some time during bootup.
Signed-off-by: Alexander Shishkin <virtuoso@slind.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The PCOMP Kconfig entry current allows the following combination
which is illegal:
ZLIB=y
PCOMP=y
ALGAPI=m
ALGAPI2=y
MANAGER=m
MANAGER2=m
This patch fixes this by adding PCOMP2 so that PCOMP can select
ALGAPI to propagate the setting to MANAGER2.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
|
|
|
| |
Signed-off-by: Gilles Espinasse <g.esp@free.fr>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
|
|\
| |
| |
| |
| |
| |
| |
| |
| | |
Conflicts:
Documentation/filesystems/proc.txt
arch/arm/mach-u300/include/mach/debug-macro.S
drivers/net/qlge/qlge_ethtool.c
drivers/net/qlge/qlge_main.c
drivers/net/typhoon.c
|
| |
| |
| |
| |
| | |
Reported-by: Toralf Förster <toralf.foerster@gmx.de>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
|
|/
|
|
|
|
|
|
|
| |
This patch adds a parallel crypto template that takes a crypto
algorithm and converts it to process the crypto transforms in
parallel. For the moment only aead algorithms are supported.
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
|
|
|
|
|
|
| |
CLMUL-NI accelerated GHASH should be turned off on non-x86_64 machine.
Reported-by: Dave Young <hidave.darkstar@gmail.com>
Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
PCLMULQDQ is used to accelerate the most time-consuming part of GHASH,
carry-less multiplication. More information about PCLMULQDQ can be
found at:
http://software.intel.com/en-us/articles/carry-less-multiplication-and-its-usage-for-computing-the-gcm-mode/
Because PCLMULQDQ changes XMM state, its usage must be enclosed with
kernel_fpu_begin/end, which can be used only in process context, the
acceleration is implemented as crypto_ahash. That is, request in soft
IRQ context will be defered to the cryptd kernel thread.
Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
|
|
|
|
|
|
| |
This patch adds VMAC (a fast MAC) support into crypto framework.
Signed-off-by: Shane Wang <shane.wang@intel.com>
Signed-off-by: Joseph Cihula <joseph.cihula@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
|
|
|
|
|
|
|
|
|
|
| |
What about something like this? It defaults the CPRNG to m and makes FIPS
dependent on the CPRNG. That way you get a module build by default, but you can
change it to y manually during config and still satisfy the dependency, and if
you select N it disables FIPS as well. I rather like that better than making
FIPS a tristate. I just tested it out here and it seems to work well. Let me
know what you think
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit 215ccd6f55a2144bd553e0a3d12e1386f02309fd.
It causes CPRNG and everything selected by it to be built-in
whenever FIPS is enabled. The problem is that it is selecting
a tristate from a bool, which is usually not what is intended.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Remove the dedicated GHASH implementation in GCM, and uses the GHASH
digest algorithm instead. This will make GCM uses hardware accelerated
GHASH implementation automatically if available.
ahash instead of shash interface is used, because some hardware
accelerated GHASH implementation needs asynchronous interface.
Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
|
|
|
|
|
|
|
| |
GHASH is implemented as a shash algorithm. The actual implementation
is copied from gcm.c. This makes it possible to add
architecture/hardware accelerated GHASH implementation.
Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
|
|
|
|
|
|
| |
The ANSI CPRNG has no dependence on FIPS support. FIPS support however,
requires the use of the CPRNG. Adjust that depedency relationship in Kconfig.
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
|
|
|
|
| |
The RNG should work with FIPS disabled.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Because kernel_fpu_begin() and kernel_fpu_end() operations are too
slow, the performance gain of general mode implementation + aes-aesni
is almost all compensated.
The AES-NI support for more modes are implemented as follow:
- Add a new AES algorithm implementation named __aes-aesni without
kernel_fpu_begin/end()
- Use fpu(<mode>(AES)) to provide kenrel_fpu_begin/end() invoking
- Add <mode>(AES) ablkcipher, which uses cryptd(fpu(<mode>(AES))) to
defer cryption to cryptd context in soft_irq context.
Now the ctr, lrw, pcbc and xts support are added.
Performance testing based on dm-crypt shows that cryption time can be
reduced to 50% of general mode implementation + aes-aesni implementation.
Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Blkcipher touching FPU need to be enclosed by kernel_fpu_begin() and
kernel_fpu_end(). If they are invoked in cipher algorithm
implementation, they will be invoked for each block, so that
performance will be hurt, because they are "slow" operations. This
patch implements "fpu" template, which makes these operations to be
invoked for each request.
Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
|
|
|
| |
Signed-off-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
|
|
|
|
| |
Signed-off-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com>
Cc: James Morris <jmorris@namei.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The current "comp" crypto interface supports one-shot (de)compression only,
i.e. the whole data buffer to be (de)compressed must be passed at once, and
the whole (de)compressed data buffer will be received at once.
In several use-cases (e.g. compressed file systems that store files in big
compressed blocks), this workflow is not suitable.
Furthermore, the "comp" type doesn't provide for the configuration of
(de)compression parameters, and always allocates workspace memory for both
compression and decompression, which may waste memory.
To solve this, add a "pcomp" partial (de)compression interface that provides
the following operations:
- crypto_compress_{init,update,final}() for compression,
- crypto_decompress_{init,update,final}() for decompression,
- crypto_{,de}compress_setup(), to configure (de)compression parameters
(incl. allocating workspace memory).
The (de)compression methods take a struct comp_request, which was mimicked
after the z_stream object in zlib, and contains buffer pointer and length
pairs for input and output.
The setup methods take an opaque parameter pointer and length pair. Parameters
are supposed to be encoded using netlink attributes, whose meanings depend on
the actual (name of the) (de)compression algorithm.
Signed-off-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
|
|
|
|
|
|
| |
keventd_wq has potential starvation problem, so use dedicated
kcrypto_wq instead.
Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Original cryptd thread implementation has scalability issue, this
patch solve the issue with a per-CPU thread implementation.
struct cryptd_queue is defined to be a per-CPU queue, which holds one
struct cryptd_cpu_queue for each CPU. In struct cryptd_cpu_queue, a
struct crypto_queue holds all requests for the CPU, a struct
work_struct is used to run all requests for the CPU.
Testing based on dm-crypt on an Intel Core 2 E6400 (two cores) machine
shows 19.2% performance gain. The testing script is as follow:
-------------------- script begin ---------------------------
#!/bin/sh
dmc_create()
{
# Create a crypt device using dmsetup
dmsetup create $2 --table "0 `blockdev --getsize $1` crypt cbc(aes-asm)?cryptd?plain:plain babebabebabebabebabebabebabebabe 0 $1 0"
}
dmsetup remove crypt0
dmsetup remove crypt1
dd if=/dev/zero of=/dev/ram0 bs=1M count=4 >& /dev/null
dd if=/dev/zero of=/dev/ram1 bs=1M count=4 >& /dev/null
dmc_create /dev/ram0 crypt0
dmc_create /dev/ram1 crypt1
cat >tr.sh <<EOF
#!/bin/sh
for n in \$(seq 10); do
dd if=/dev/dm-0 of=/dev/null >& /dev/null &
dd if=/dev/dm-1 of=/dev/null >& /dev/null &
done
wait
EOF
for n in $(seq 10); do
/usr/bin/time sh tr.sh
done
rm tr.sh
-------------------- script end ---------------------------
The separator of dm-crypt parameter is changed from "-" to "?", because
"-" is used in some cipher driver name too, and cryptds need to specify
cipher driver name instead of cipher name.
The test result on an Intel Core2 E6400 (two cores) is as follow:
without patch:
-----------------wo begin --------------------------
0.04user 0.38system 0:00.39elapsed 107%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+6566minor)pagefaults 0swaps
0.07user 0.35system 0:00.35elapsed 121%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+6567minor)pagefaults 0swaps
0.06user 0.34system 0:00.30elapsed 135%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+6562minor)pagefaults 0swaps
0.05user 0.37system 0:00.36elapsed 119%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+6607minor)pagefaults 0swaps
0.06user 0.36system 0:00.35elapsed 120%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+6562minor)pagefaults 0swaps
0.05user 0.37system 0:00.31elapsed 136%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+6594minor)pagefaults 0swaps
0.04user 0.34system 0:00.30elapsed 126%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+6597minor)pagefaults 0swaps
0.06user 0.32system 0:00.31elapsed 125%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+6571minor)pagefaults 0swaps
0.06user 0.34system 0:00.31elapsed 134%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+6581minor)pagefaults 0swaps
0.05user 0.38system 0:00.31elapsed 138%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+6600minor)pagefaults 0swaps
-----------------wo end --------------------------
with patch:
------------------w begin --------------------------
0.02user 0.31system 0:00.24elapsed 141%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+6554minor)pagefaults 0swaps
0.05user 0.34system 0:00.31elapsed 127%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+6606minor)pagefaults 0swaps
0.07user 0.33system 0:00.26elapsed 155%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+6559minor)pagefaults 0swaps
0.07user 0.32system 0:00.26elapsed 151%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+6562minor)pagefaults 0swaps
0.05user 0.34system 0:00.26elapsed 150%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+6603minor)pagefaults 0swaps
0.03user 0.36system 0:00.31elapsed 124%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+6562minor)pagefaults 0swaps
0.04user 0.35system 0:00.26elapsed 147%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+6586minor)pagefaults 0swaps
0.03user 0.37system 0:00.27elapsed 146%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+6562minor)pagefaults 0swaps
0.04user 0.36system 0:00.26elapsed 154%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+6594minor)pagefaults 0swaps
0.04user 0.35system 0:00.26elapsed 154%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+6557minor)pagefaults 0swaps
------------------w end --------------------------
The middle value of elapsed time is:
wo cryptwq: 0.31
w cryptwq: 0.26
The performance gain is about (0.31-0.26)/0.26 = 0.192.
Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
|
|
|
|
|
|
|
|
|
| |
Use dedicated workqueue for crypto subsystem
A dedicated workqueue named kcrypto_wq is created to be used by crypto
subsystem. The system shared keventd_wq is not suitable for
encryption/decryption, because of potential starvation problem.
Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Intel AES-NI is a new set of Single Instruction Multiple Data (SIMD)
instructions that are going to be introduced in the next generation of
Intel processor, as of 2009. These instructions enable fast and secure
data encryption and decryption, using the Advanced Encryption Standard
(AES), defined by FIPS Publication number 197. The architecture
introduces six instructions that offer full hardware support for
AES. Four of them support high performance data encryption and
decryption, and the other two instructions support the AES key
expansion procedure.
The white paper can be downloaded from:
http://softwarecommunity.intel.com/isn/downloads/intelavx/AES-Instructions-Set_WP.pdf
AES may be used in soft_irq context, but MMX/SSE context can not be
touched safely in soft_irq context. So in_interrupt() is checked, if
in IRQ or soft_irq context, the general x86_64 implementation are used
instead.
Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
|
|
|
|
|
| |
This patch changes sha512 and sha384 to the new shash interface.
Signed-off-by: Adrian-Ken Rueegsegger <ken@codelabs.ch>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
|
|
|
|
|
| |
This patch changes michael_mic to the new shash interface.
Signed-off-by: Adrian-Ken Rueegsegger <ken@codelabs.ch>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
|
|
|
|
|
| |
This patch changes wp512, wp384 and wp256 to the new shash interface.
Signed-off-by: Adrian-Ken Rueegsegger <ken@codelabs.ch>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
|
|
|
|
|
| |
This patch changes tgr192, tgr160 and tgr128 to the new shash interface.
Signed-off-by: Adrian-Ken Rueegsegger <ken@codelabs.ch>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
|
|
|
|
|
| |
This patch changes sha256 and sha224 to the new shash interface.
Signed-off-by: Adrian-Ken Rueegsegger <ken@codelabs.ch>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
|
|
|
|
|
| |
This patch changes md5 to the new shash interface.
Signed-off-by: Adrian-Ken Rueegsegger <ken@codelabs.ch>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
|
|
|
|
|
| |
This patch changes md4 to the new shash interface.
Signed-off-by: Adrian-Ken Rueegsegger <ken@codelabs.ch>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
|
|
|
|
|
| |
This patch changes sha1 to the new shash interface.
Signed-off-by: Adrian-Ken Rueegsegger <ken@codelabs.ch>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
|
|
|
|
| |
This patch changes rmd320 to the new shash interface.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
|
|
|
|
| |
This patch changes rmd256 to the new shash interface.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|