Invited Lectures

Frontmatter

Physical Attacks and Beyond

Abstract

Physical attacks have been subject of extensive research since more than twenty years. Nevertheless, several problems still have to be solved. This paper, after recalling the most popular physical attacks, introduces three (of the many) possible research directions in the area: the methodological study of the interaction between countermeasures against one type of attack and the resistance against another attack, the development of automated techniques for applying and verifying the correct application of countermeasures, and the study of physical attacks in the novel and changed scenario of cyber-physical systems.

Francesco Regazzoni

Post-quantum Key Exchange for the Internet and the Open Quantum Safe Project

Abstract

Designing public key cryptosystems that resist attacks by quantum computers is an important area of current cryptographic research and standardization. To retain confidentiality of today’s communications against future quantum computers, applications and protocols must begin exploring the use of quantum-resistant key exchange and encryption. In this paper, we explore post-quantum cryptography in general and key exchange specifically. We review two protocols for quantum-resistant key exchange based on lattice problems: BCNS15, based on the ring learning with errors problem, and Frodo, based on the learning with errors problem. We discuss their security and performance characteristics, both on their own and in the context of the Transport Layer Security (TLS) protocol. We introduce the Open Quantum Safe project, an open-source software project for prototyping quantum-resistant cryptography, which includes liboqs, a C library of quantum-resistant algorithms, and our integrations of liboqs into popular open-source applications and protocols, including the widely used OpenSSL library.

Douglas Stebila, Michele Mosca

Side Channels and Fault Attacks

Frontmatter

Detecting Side Channel Vulnerabilities in Improved Rotating S-Box Masking Scheme—Presenting Four Non-profiled Attacks

Abstract

Improved Rotating S-box Masking (RSM2.0 for short) is a well-known countermeasure designed and implemented by DPA Contest V4.2 committee to provide security protection for AES-128. By combining both 1st-order masking and shuffling techniques, improved RSM claims to offer at least non-profiled resistance for its software implementation and up to now no systematic research has been published to challenge such security claim yet. To study the practical security of RSM2.0 against non-profiled attacks, we first propose an analytical methodology to guide the detection of the exploitable vulnerabilities in RSM2.0. On the basis of the methodology, several potential flaws hidden in both the algorithm design and detailed implementation of RSM2.0 are discovered and we make use of them to design six attacking schemes in total, all of which belong to non-profiled attacks. Four representative attacks are eventually implemented and submitted to DPA Contest V4.2 for official evaluation and the results show that all the submitted attacks are both practical and feasible. Among them, the best attack scheme requires only 257 power traces to crack the complete 128-bit master key with 80% success rate. To further improve the security level of RSM2.0, we also discuss some possible strategies to eliminate or mitigate the threats proposed by us.

Zeyi Liu, Neng Gao, Chenyang Tu, Yuan Ma, Zongbin Liu

Bridging the Gap: Advanced Tools for Side-Channel Leakage Estimation Beyond Gaussian Templates and Histograms

Abstract

The accuracy and the fast convergence of a leakage model are both essential components for the efficiency of side-channel analysis. Thus for efficient leakage estimation an evaluator is requested to pick a Probability Density Function (PDF) that constitutes the optimal trade-off between both aspects. In the case of parametric estimation, Gaussian templates are a common choice due to their fast convergence, given that the actual leakages follow a Gaussian distribution (as in the case of an unprotected device). In contrast, histograms and kernel-based estimations are examples for non-parametric estimation that are capable to capture any distribution (even that of a protected device) at a slower convergence rate.

With this work we aim to enlarge the statistical toolbox of a side-channel evaluator by introducing new PDF estimation tools that fill the gap between both extremes. Our tools are designed for parametric estimation and can efficiently characterize leakages up to the fourth statistical moment. We show that such an approach is superior to non-parametric estimators in contexts where key-dependent information in located in one of those moments of the leakage distribution. Furthermore, we successfully demonstrate how to apply our tools for the (worst-case) information-theoretic evaluation on masked implementations with up to four shares, in a profiled attack scenario. We like to remark that this flexibility capturing information from different moments of the leakage PDF can provide very valuable feedback for hardware designers to their task to evaluate the individual and combined criticality of leakages in their (protected) implementations.

Tobias Schneider, Amir Moradi, François-Xavier Standaert, Tim Güneysu

Uniform First-Order Threshold Implementations

Abstract

Most masking schemes used as a countermeasure against side-channel analysis attacks require an extensive amount of fresh random bits on the fly. This is burdensome especially for lightweight cryptosystems. Threshold implementations (TIs) that are secure against first-order attacks have the advantage that fresh randomness is not required if the sharing of the underlying function is uniform. However, finding uniform realizations of nonlinear functions that also satisfy other TI properties can be a challenging task. In this paper, we discuss several methods that advance the search for uniformly shared functions for TIs. We focus especially on three-share implementations of quadratic functions due to their low area footprint. Our methods have low computational complexity even for 8-bit Boolean functions.

Tim Beyne, Begül Bilgin

Attacking Embedded ECC Implementations Through cmov Side Channels

Abstract

Side-channel attacks against implementations of elliptic-curve cryptography have been extensively studied in the literature and a large tool-set of countermeasures is available to thwart different attacks in different contexts. The current state of the art in attacks and countermeasures is nicely summarized in multiple survey papers, the most recent one by Danger et al. [21]. However, any combination of those countermeasures is ineffective against attacks that require only a single trace and directly target a conditional move (cmov) – an operation that is at the very foundation of all scalar-multiplication algorithms. This operation can either be implemented through arithmetic operations on registers or through various different approaches that all boil down to loading from or storing to a secret address. In this paper we demonstrate that such an attack is indeed possible for ECC software running on AVR ATmega microcontrollers, using a protected version of the popular \(\mu \)NaCl library as an example. For the targeted implementations, we are able to recover 99.6% of the key bits for the arithmetic approach and 95.3% of the key bits for the approach based on secret addresses, with confidence levels 76.1% and 78.8%, respectively. All publicly available ECC software for the AVR that we are aware of uses one of the two approaches and is thus in principle vulnerable to our attack.

Erick Nascimento, Łukasz Chmielewski, David Oswald, Peter Schwabe

Lattice Attacks Against Elliptic-Curve Signatures with Blinded Scalar Multiplication

Abstract

Elliptic curve cryptography is today the prevailing approach to get efficient public-key cryptosystems and digital signatures. Most of elliptic curve signature schemes use a nonce in the computation of each signature and the knowledge of this nonce is sufficient to fully recover the secret key of the scheme. Even a few bits of the nonce over several signatures allow a complete break of the scheme by lattice-based attacks. Several works have investigated how to efficiently apply such attacks when partial information on the nonce can be recovered through side-channel attacks. However, these attacks usually target unprotected implementation and/or make ideal assumptions on the recovered information, and it is not clear how they would perform in a scenario where common countermeasures are included and where only noisy information leaks via side channels. In this paper, we close this gap by applying such attack techniques against elliptic-curve signature implementations based on a blinded scalar multiplication. Specifically, we extend the famous Howgrave-Graham and Smart lattice attack when the nonces are blinded by the addition of a random multiple of the elliptic-curve group order or by a random Euclidean splitting. We then assume that noisy information on the blinded nonce can be obtained through a template attack targeting the underlying scalar multiplication and we show how to characterize the obtained likelihood scores under a realistic leakage assumption. To deal with this scenario, we introduce a filtering method which given a set of signatures and associated likelihood scores maximizes the success probability of the lattice attack. Our approach is backed up with attack simulation results for several signal-to-noise ratio of the exploited leakage.

Dahmun Goudarzi, Matthieu Rivain, Damien Vergnaud

Loop-Abort Faults on Lattice-Based Fiat-Shamir and Hash-and-Sign Signatures

Abstract

Although postquantum cryptography is of growing practical concern, not many works have been devoted to implementation security issues related to postquantum schemes.

In this paper, we look in particular at fault attacks against implementations of lattice-based signature schemes, looking both at Fiat-Shamir type constructions (particularly BLISS, but also GLP, PASSSing and Ring-TESLA) and at hash-and-sign schemes (particularly the GPV-based scheme of Ducas–Prest–Lyubashevsky). These schemes include essentially all practical lattice-based signatures, and achieve the best efficiency to date in both software and hardware. We present several fault attacks against those schemes yielding a full key recovery with only a few or even a single faulty signature, and discuss possible countermeasures to protect against these attacks.

Thomas Espitau, Pierre-Alain Fouque, Benoît Gérard, Mehdi Tibouchi

Design and Implementation of Symmetric Cryptography

Frontmatter

On the Construction of Hardware-Friendly and S-Boxes

Abstract

With the emergence of the Internet of Things and lightweight cryptography, one can observe a gradual shift of interest in the design of block ciphers. Naturally, security is still of paramount importance, but one is willing to trade a part of that security in order to obtain higher speed and/or smaller implementation area. Accordingly, a common metric in many cipher proposals has been the gate count for realizing the cipher in hardware. On the other side, it is also important, especially for battery powered devices, to have a small energy consumption. That is why we can observe the following shift of research focus: from the analysis of the energy consumption of existing ciphers and their building blocks to the design of new ciphers and building blocks, specifically for low energy. Existing research results focusing on the energy consumption of symmetric ciphers, suggest that the S-box is the most expensive part in the majority of lightweight implementations. If we only consider purely combinatorial S-boxes, we can focus on reducing the power consumption of the S-box in order to minimize the energy consumption of the overall cipher. In this paper, we propose several methods to obtain \(4 \times 4\) and \(5\times 5\) S-boxes that are either power or area efficient. Our results show that heuristics should be considered as a viable choice for the generation of S-boxes with good implementation properties.

Stjepan Picek, Bohan Yang, Vladimir Rozic, Nele Mentens

All the AES You Need on Cortex-M3 and M4

Abstract

This paper describes highly-optimized AES-\(\{128,192,256\}\)-CTR assembly implementations for the popular ARM Cortex-M3 and M4 embedded microprocessors. These implementations are about twice as fast as existing implementations. Additionally, we provide the fastest bitsliced constant-time and masked implementations of AES-128-CTR to protect against timing attacks, power analysis and other (first-order) side-channel attacks. All implementations, including an architecture-specific instruction scheduler and register allocator, which we use to minimize expensive loads, are released into the public domain.

Peter Schwabe, Ko Stoffelen

Efficient Symmetric Primitives

Frontmatter

Hold Your Breath, PRIMATEs Are Lightweight

Abstract

This work provides the first hardware implementations of PRIMATEs family of authenticated encryption algorithms. PRIMATEs are designed to be lightweight in hardware, hence we focus on designs for constrained devices. We provide several serial implementations, smallest of which requires only 1.2 kGE. Additionally, we present a variety of threshold implementations that range from 4.7 kGE to 10.3 kGE.

The second part of this work presents a design of a lightweight PRIMATEs coprocessor. It is designed to conform versatile use of the core permutation, which allows implementation of the entire PRIMATEs family, with small differences in hardware. We implement HANUMAN-80 coprocessor, adapted for a 16-bit microcontroller from the Texas Instruments MSP430 family of microcontrollers. The entire HANUMAN-80 coprocessor is tested on a Spartan-6 (XC6SLX45) development board, where it occupies 72 slices (1.06% of available resources). ASIC synthesis yields a 2 kGE implementation using 90 nm library, achieves 33 kbits/sec throughput at 100 kHz operating frequency. It dissipates 0.53 \(\upmu \)W of power on average, resulting in energy consumption of 15.60 pJ/bit.

Danilo Šijačić, Andreas B. Kidmose, Bohan Yang, Subhadeep Banik, Begül Bilgin, Andrey Bogdanov, Ingrid Verbauwhede

Keymill: Side-Channel Resilient Key Generator, A New Concept for SCA-Security by Design

A New Concept for SCA-Security by Design

Abstract

In the crypto community, it is widely acknowledged that any cryptographic scheme that is built with no countermeasure against side-channel analysis (SCA) can be easily broken. In this paper, we challenge this intuition. We investigate a novel approach in the design of cryptographic primitives that promotes inherent security against side-channel analysis without using redundant circuits. We propose Keymill, a new keystream generator that is immune against SCA attacks. Security of the proposed scheme depends on mixing key bits in a special way that expands the size of any useful key hypothesis to the full entropy, which enables SCA-security that is equivalent to the brute force. Doing so, we do not propose a better SCA countermeasure, but rather a new one. The current solution focuses exclusively on side-channel analysis and works on top of any unprotected block cipher for mathematical security. The proposed primitive is generic and can turn any block cipher into a protected mode using only 775 equivalent NAND gates, which is almost half the area of the best countermeasure available in the literature.

Mostafa Taha, Arash Reyhani-Masoleh, Patrick Schaumont

Lightweight Fault Attack Resistance in Software Using Intra-instruction Redundancy

Abstract

Fault attack countermeasures can be implemented by storing or computing sensitive data in redundant form, such that the faulty data can be detected and restored. We present a class of lightweight, portable software countermeasures for block ciphers. Our technique is based on redundant bit-slicing, and it is able to detect faults in the execution of a single instruction. In comparison to earlier techniques, we are able to intercept data faults as well as instruction sequence faults using a uniform technique. Our countermeasure thwarts precise bit-fault injections through pseudo-random shifts in the allocation of data bit-slices. We demonstrate our solution on a full AES design and confirm the claimed security protection through a detailed fault simulation for a 32-bit embedded processor. We also quantify the overhead of the proposed fault countermeasure, and find a minimal increase in footprint (14%), and a moderate performance overhead between 125% to 317%, depending on the desired level of fault-attack resistance.

Conor Patrick, Bilgiday Yuce, Nahid Farhady Ghalaty, Patrick Schaumont

Cryptanalysis of Symmetric Primitives

Frontmatter

New Second Preimage Attacks on Dithered Hash Functions with Low Memory Complexity

Abstract

Dithered hash functions were proposed by Rivest as a method to mitigate second preimage attacks on Merkle-Damgård hash functions. Despite that, second preimage attacks against dithered hash functions were proposed by Andreeva et al. One issue with these second preimage attacks is their huge memory requirement in the precomputation and the online phases. In this paper, we present new second preimage attacks on the dithered Merkle-Damgård construction. These attacks consume significantly less memory in the online phase (with a negligible increase in the online time complexity) than previous attacks. For example, in the case of MD5 with the Keränen sequence, we reduce the memory complexity from about \(2^{51}\) blocks to about \(2^{26.7}\) blocks (about 545 MB). We also present an essentially memoryless variant of Andreeva et al. attack. In case of MD5-Keränen or SHA1-Keränen, the offline and online memory complexity is \(2^{15.2}\) message blocks (about 188–235 KB), at the expense of increasing the offline time complexity.

Muhammad Barham, Orr Dunkelman, Stefan Lucks, Marc Stevens

New Differential Bounds and Division Property of Lilliput: Block Cipher with Extended Generalized Feistel Network

Abstract

This paper provides security analysis of lightweight block cipher Lilliput, which is an instantiation of extended generalized Feistel network (EGFN) developed by Berger et al. at SAC 2013. Its round function updates a part of the state only linearly, which yields several security concerns. The first important discovery is that the lower bounds of the number of active S-boxes provided by the designers are incorrect. Then the new bounds are derived by using mixed integer linear programming (MILP), which shows an interesting fact that the actual bounds are better than the designers originally expected. Another contribution is the best third-party cryptanalysis. Owing to its unique computation structure, the designers expected that EGFN efficiently enhances security against integral cryptanalysis. However, the security is not enhanced as the designers expect. In fact, division property, which is a new method to find integral distinguishers, finds a 13-round distinguisher which improves the previous distinguisher by 4 rounds. The new distinguisher is further extended to a 17-round key recovery attack which improves the previous best attack by 3 rounds.

Yu Sasaki, Yosuke Todo

Cryptanalysis of Simpira v1

Abstract

Simpira v1 is a recently proposed family of permutations, based on the AES round function. The design includes recommendations for using the Simpira permutations in block ciphers, hash functions, or authenticated ciphers. The designers’ security analysis is based on computer-aided bounds for the minimum number of active S-boxes. We show that the underlying assumptions of independence, and thus the derived bounds, are incorrect. For family member Simpira-4, we provide differential trails with only 40 (instead of 75) active S-boxes for the recommended 15 rounds. Based on these trails, we propose full-round collision attacks on the proposed Simpira-4 Davies-Meyer hash construction, with complexity \(2^{82.62}\) for the recommended full 15 rounds and a truncated 256-bit hash value, and complexity \(2^{110.16}\) for 16 rounds and the full 512-bit hash value. These attacks violate the designers’ security claims that there are no structural distinguishers with complexity below \(2^{128}\).

Christoph Dobraunig, Maria Eichlseder, Florian Mendel

An Efficient Affine Equivalence Algorithm for Multiple S-Boxes and a Structured Affine Layer

Abstract

An affine equivalence problem is to find affine mappings A and B such that \(F=B\circ S\circ A\) for given two permutations F and S, which was first studied by Biryukov et al. Their algorithm for solving an affine equivalence problem is quite efficient and has been used in the cryptanalytic toolbox for many cryptographic schemes. Recently, Baek et al. presented a specialized affine equivalence algorithm (SAEA), which solves an affine equivalence problem in the case that S is a concatenation of several smaller S-boxes. The SAEA is more efficient than the affine equivalence algorithm for special cases, but its complexity mainly depends on the entire input size of F.

In this paper, we revisit the affine equivalence problem for a special ASA structure with multiple S-boxes and a structured input affine layer. We show that the work factor in SAEA can be reduced if the input affine layer in ASA has a certain structure. Moreover, the complexity of our algorithm mainly depends on the input size of smaller S-boxes, and not on the entire input size of F. We also present a new attack algorithm on the white-box AES implementation proposed by Baek et al. The cryptanalysis efficiently extracts the secret key from the implementation with a complexity of \(2^{33}\), where the claimed security level is \(2^{110}\).

Jung Hee Cheon, Hyunsook Hong, Joohee Lee, Jooyoung Lee

Estimating the Cost of Generic Quantum Pre-image Attacks on SHA-2 and SHA-3

Abstract

We investigate the cost of Grover’s quantum search algorithm when used in the context of pre-image attacks on the SHA-2 and SHA-3 families of hash functions. Our cost model assumes that the attack is run on a surface code based fault-tolerant quantum computer. Our estimates rely on a time-area metric that costs the number of logical qubits times the depth of the circuit in units of surface code cycles. As a surface code cycle involves a significant classical processing stage, our cost estimates allow for crude, but direct, comparisons of classical and quantum algorithms.

We exhibit a circuit for a pre-image attack on SHA-256 that is approximately \(2^{153.8}\) surface code cycles deep and requires approximately \(2^{12.6}\) logical qubits. This yields an overall cost of \(2^{166.4}\) logical-qubit-cycles. Likewise we exhibit a SHA3-256 circuit that is approximately \(2^{146.5}\) surface code cycles deep and requires approximately \(2^{20}\) logical qubits for a total cost of, again, \(2^{166.5}\) logical-qubit-cycles. Both attacks require on the order of \(2^{128}\) queries in a quantum black-box model, hence our results suggest that executing these attacks may be as much as 275 billion times more expensive than one would expect from the simple query analysis.

Matthew Amy, Olivia Di Matteo, Vlad Gheorghiu, Michele Mosca, Alex Parent, John Schanck

MACs and PRNGs

Frontmatter

Output Masking of Tweakable Even-Mansour Can Be Eliminated for Message Authentication Code

Abstract

In this paper we consider the simplest possible construction of PMAC from a permutation. PMAC-type schemes have been usually constructed from a tweakable blockcipher (TBC). Regarding TBCs, there have been research directions from (1) to (2) and from (1) to (3) described as follows. Here, \(E_{K'}:\{0,1\}^n\rightarrow \{0,1\}^n\) is a blockcipher with a key \(K'\), \(P:\{0,1\}^n\rightarrow \{0,1\}^n\) is a permutation, \(h_K\) is a hash function of a uniform and almost XOR universal family from some tweak space \(\mathcal {TW}\) to \(\{0,1\}^n\), \(tw \in \mathcal {TW}\) is a tweak, and \(x \in \{0,1\}^n\) is an input to a TBC.

(1)

Liskov et al. proposed a blockcipher-based TBC defined as \((tw,x) \mapsto h_K(tw) \oplus E_{K'}(h_K(tw) \oplus x)\). They proved that this scheme is a secure tweakable SPRP (Strong Pseudo-Random Permutation) up to the birthday bound, assuming \(E_{K'}\) is a secure SPRP.

(2)

Kurosawa eliminated \(E_{K'}\) from Liskov et al.’s TBC, where it is replaced with a permutation. This scheme is called Tweakable Even-Mansour (TEM), defined as \((tw,x) \mapsto h_K(tw) \oplus P(h_K(tw) \oplus x)\). He proved that TEM is a secure tweakable SPRP up to the birthday bound, assuming \(P\) is a public random permutation to which everyone can access. Therefore, one can construct a permutation-based PMAC by incorporating TEM with PMAC.

(3)

Rogaway eliminated the output masking. The resultant scheme is called XE, defined as \((tw,x) \mapsto E_{K'}(h_K(tw) \oplus x)\). He proved that XE is a secure tweakable PRP (Pseudo-Random Permutation) up to the birthday bound, assuming \(E_{K'}\) is a secure PRP. Indeed the XE-style constructions have been employed in almost all blockcipher-based PMAC-type schemes.

From these research directions, it is quite natural to consider the scheme defined as \((tw,x) \mapsto P(h_K(tw) \oplus x)\). We call the scheme XP (Xor-Permutation). From TEM to XP, the output masking is eliminated, and from XE to XP, the keyed blockcipher is eliminated, where it is replaced with a permutation. However, XP is not a secure tweakable (S)PRP, since the offset \(h_K(tw)\) can be obtained by inverting the underlying permutation from the output of XP. The next question is to find a secure TBC-based cryptographic scheme, incorporating XP with it instead of a TBC. We prove that incorporating XP with PMAC and truncating some bits of XP at the last block of PMAC, the resultant scheme becomes a secure pseudorandom function up to the birthday bound, assuming \(P\) is a public random permutation.

Shoichi Hirose, Yusuke Naito, Takeshi Sugawara

Improved Algebraic MACs and Practical Keyed-Verification Anonymous Credentials

Abstract

Until quite recently, anonymous credentials systems were based on public key primitives. A new approach, that relies on algebraic Message Authentication Codes (MACs) in prime-order groups, has recently been introduced by Chase et al. at CCS 2014. They proposed two anonymous credentials systems referred to as “Keyed-Verification Anonymous Credentials (KVAC)” as they require the verifier to know the issuer secret key. Unfortunately, both systems presentation proof, for n unrevealed attributes, is of complexity O(n) in the number of group elements. In this paper, we propose a new KVAC system that provides multi-show unlinkability of credentials and is of complexity O(1) in the number of group elements while being almost as efficient as Microsoft’s U-Prove anonymous credentials system (which does not ensure multi-show unlinkability) and many times faster than IBM’s Idemix. Our credentials are constructed based on a new algebraic MAC scheme which is of independent interest. Through slight modifications on the verifier side, our KVAC system, which is proven secure in the random oracle model, can be easily turned into a public-key credentials system. By implementing it on a standard NFC SIM card, we show its efficiency and suitability for real-world use cases and constrained devices. In particular, a credential presentation, with 3 attributes, can be performed in only 88 ms.

Amira Barki, Solenn Brunet, Nicolas Desmoulins, Jacques Traoré

A Robust and Sponge-Like PRNG with Improved Efficiency

Abstract

Ever since Keccak won the SHA3 competition, sponge-based constructions are being suggested for many different applications, including pseudo-random number generators (PRNGs). Sponges are very desirable, being well studied, increasingly efficient to implement and simplistic in their design. The initial construction of a sponge-based PRNG (Bertoni et al. CHES 2010) based its security on the well known sponge indifferentiability proof in the random permutation model and provided no forward security.

Since then, another improved sponge-based PRNG has been put forward by Gaži and Tessaro (Eurocrypt 2016) who point out the necessity for a public seed to prevent an adversarial sampler from gaining non-negligible advantage. The authors further update the security model of Dodis et al. (CCS 2013) to accommodate a public random permutation, modelled in the ideal cipher model, and how this affects the notions of security.

In this paper we introduce Reverie, an improved and practical, sponge-like pseudo-random number generator together with a formal security analysis in the PRNG with input security model of Dodis et al. with the modifications of the Gaži and Tessaro paper.

We prove that Reverie is robust when used with a public random permutation; robustness is the strongest notion of security in the chosen security model. Robustness is proved by establishing two weaker notions of security, preserving and recovering security, which together, can be shown to imply the robustness result. The proofs utilise the H-coefficient technique that has found recent popularity in this area; providing a very useful tool for proving the generator meets the necessary security notions.

Daniel Hutchinson

Lattice-Based Cryptography

Frontmatter

Fixed-Point Arithmetic in SHE Schemes

Abstract

The purpose of this paper is to investigate fixed-point arithmetic in ring-based Somewhat Homomorphic Encryption (SHE) schemes. We provide three main contributions: firstly, we investigate the representation of fixed-point numbers. We analyse the two representations from Dowlin et al., representing a fixed-point number as a large integer (encoded as a scaled polynomial) versus a polynomial-based fractional representation. We show that these two are, in fact, isomorphic by presenting an explicit isomorphism between the two that enables us to map the parameters from one representation to another. Secondly, given a computation and a bound on the fixed-point numbers used as inputs and scalars within the computation, we achieve a way of producing lower bounds on the plaintext modulus p and the degree of the ring d needed to support complex homomorphic operations. Finally, as an application of these bounds, we investigate homomorphic image processing.

Anamaria Costache, Nigel P. Smart, Srinivas Vivek, Adrian Waller

A Full RNS Variant of FV Like Somewhat Homomorphic Encryption Schemes

Abstract

Since Gentry’s breakthrough work in 2009, homomorphic cryptography has received a widespread attention. Implementation of a fully homomorphic cryptographic scheme is however still highly expensive. Somewhat Homomorphic Encryption (SHE) schemes, on the other hand, allow only a limited number of arithmetical operations in the encrypted domain, but are more practical. Many SHE schemes have been proposed, among which the most competitive ones rely on Ring Learning With Errors (RLWE) and operations occur on high-degree polynomials with large coefficients. This work focuses in particular on the Chinese Remainder Theorem representation (a.k.a. Residue Number Systems) applied to the large coefficients. In SHE schemes like that of Fan and Vercauteren (FV), such a representation remains hardly compatible with procedures involving coefficient-wise division and rounding required in decryption and homomorphic multiplication. This paper suggests a way to entirely eliminate the need for multi-precision arithmetic, and presents techniques to enable a full RNS implementation of FV-like schemes. For dimensions between \(2^{11}\) and \(2^{15}\), we report speed-ups from \(5{\times }\) to \(20{\times }\) for decryption, and from \(2{\times }\) to \(4{\times }\) for multiplication.

Jean-Claude Bajard, Julien Eynard, M. Anwar Hasan, Vincent Zucca

Security Considerations for Galois Non-dual RLWE Families

Abstract

We explore further the hardness of the non-dual discrete variant of the Ring-LWE problem for various number rings, give improved attacks for certain rings satisfying some additional assumptions, construct a new family of vulnerable Galois number fields, and apply some number theoretic results on Gauss sums to deduce the likely failure of these attacks for 2-power cyclotomic rings and unramified moduli.

Hao Chen, Kristin Lauter, Katherine E. Stange

Efficient Classical Public Key Cryptography

Frontmatter

Fast, Uniform Scalar Multiplication for Genus 2 Jacobians with Fast Kummers

Abstract

We give one- and two-dimensional scalar multiplication algorithms for Jacobians of genus 2 curves that operate by projecting to Kummer surfaces, where we can exploit faster and more uniform pseudomultiplication, before recovering the proper “signed” output back on the Jacobian. This extends the work of López and Dahab, Okeya and Sakurai, and Brier and Joye to genus 2, and also to two-dimensional scalar multiplication. The technique is especially interesting in genus 2, because Kummer surfaces can outperform comparable elliptic curve systems.

Ping Ngai Chung, Craig Costello, Benjamin Smith

PhiRSA: Exploiting the Computing Power of Vector Instructions on Intel Xeon Phi for RSA

Abstract

Efficient implementations of public-key cryptographic algorithms on general-purpose computing devices, facilitate the applications of cryptography in communication security. Existing solutions work in two different directions: implementations on GPUs achieve high throughput but great latency, while those on CPUs are with low throughput and small latency. Intel Xeon Phi is the first highly parallel coprocessor of Many Integrated Core (MIC) architecture, with up to 61 cores and one 512-bit Vector Processing Unit (VPU) in each core, which offers the potential to achieve both high throughput and small latency. In this paper, we propose a vector-oriented Montgomery multiplication design based on vector carry propagation chain (VCPC) method to fully exploit the computing power of vector instructions on Intel Xeon Phi. Two key features of our design sharply reduce the number of instructions: (1) organizing the additions in Montgomery multiplication to be four VCPCs for saving the overhead of handling carry bits; (2) computing the intermediate scalar variable q in every round without breaking the flow of VCPCs. Furthermore, we offer the optimal Montgomery multiplication implementation of our design on Intel Xeon Phi, which make VPUs fully pipelined and maintain carry bits in vector mask registers. Based on the above, we implement RSA named PhiRSA and evaluate it on Intel Xeon Phi 7120P. For 1024, 2048 and 4096-bit RSA, PhiRSA performs 258,370, 41,803 and 5,358 decryptions per second, and the latencies are 0.94, 5.84 and 45.54 ms, respectively. These results achieve 4.1 to 8.5 times performance of the existing RSA implementations on Intel Xeon Phi, exhibit high throughput comparable to those on GPUs but with much less parallel tasks, and small latency comparable to those on CPUs.

Yuan Zhao, Wuqiong Pan, Jingqiang Lin, Peng Liu, Cong Xue, Fangyu Zheng

FourNEON: Faster Elliptic Curve Scalar Multiplications on ARM Processors

Abstract

We present a high-speed, high-security implementation of the recently proposed elliptic curve Four\(\mathbb {Q}\) (ASIACRYPT 2015) for 32-bit ARM processors with NEON support. Exploiting the versatile and compact arithmetic of this curve, we design a vectorized implementation that achieves high-performance across a large variety of ARM platforms. Our software is fully protected against timing and cache attacks, and showcases the impressive speed of Four\(\mathbb {Q}\) when compared with other curve-based alternatives. For example, one single variable-base scalar multiplication is computed in about 235,000 Cortex-A8 cycles or 132,000 Cortex-A15 cycles which, compared to the results of the fastest genus 2 Kummer and Curve25519 implementations on the same platforms, offer speedups between 1.3x–1.7x and between 2.1x–2.4x, respectively. In comparison with the NIST standard curve K-283, we achieve speedups above 4x and 5.5x.

Patrick Longa

Cryptanalysis of Asymmetric Primitives

Frontmatter

Sieving for Closest Lattice Vectors (with Preprocessing)

Abstract

Lattice-based cryptography has recently emerged as a prime candidate for efficient and secure post-quantum cryptography. The two main hard problems underlying its security are the shortest vector problem (SVP) and the closest vector problem (CVP). Various algorithms have been studied for solving these problems, and for SVP, lattice sieving currently dominates in terms of the asymptotic time complexity: one can heuristically solve SVP in time \(2^{0.292d + o(d)}\) in high dimensions d [Becker–Ducas–Gama–Laarhoven, SODA’16]. Although several SVP algorithms can also be used to solve CVP, it is not clear whether this also holds for heuristic lattice sieving methods. The best time complexity for CVP is currently \(2^{0.377d + o(d)}\) [Becker–Gama–Joux, ANTS’14].

In this paper we revisit sieving algorithms for solving SVP, and study how these algorithms can be modified to solve CVP and its variants as well. Our first method is aimed at solving one problem instance and minimizes the overall time complexity for a single CVP instance with a time complexity of \(2^{0.292d + o(d)}\). Our second method minimizes the amortized time complexity for several instances on the same lattice, at the cost of a larger preprocessing cost. Using nearest neighbor searching with a balanced space-time tradeoff, with this method we can solve the closest vector problem with preprocessing (CVPP) with \(2^{0.636d + o(d)}\) space and preprocessing, in \(2^{0.136d + o(d)}\) time, while the query complexity can be further reduced to \(2^{0.059d + o(d)}\) at the cost of \(2^{d + o(d)}\) space and preprocessing, or even to \(2^{\varepsilon d + o(d)}\) for arbitrary \(\varepsilon > 0\), at the cost of preprocessing time and memory complexities of \((1/\varepsilon )^{O(d)}\).

For easier variants of CVP, such as approximate CVP and bounded distance decoding (BDD), we further show how the preprocessing method achieves even better complexities. For instance, we can solve approximate CVPP with large approximation factors \(\kappa \) with polynomial-sized advice in polynomial time if \(\kappa = \varOmega (\sqrt{d / \log d})\). This heuristically closes the gap between the decision-CVPP result of [Aharonov–Regev, FOCS’04] (with equivalent \(\kappa \)) and the search-CVPP result of [Dadush–Regev–Stephens-Davidowitz, CCC’14] (which required larger \(\kappa \)).

Thijs Laarhoven

Key Recovery Attack on the Cubic ABC Simple Matrix Multivariate Encryption Scheme

Abstract

In the last few years multivariate public key cryptography has experienced an infusion of new ideas for encryption. Among these new strategies is the ABC Simple Matrix family of encryption schemes which utilize the structure of a large matrix algebra to construct effectively invertible systems of nonlinear equations hidden by an isomorphism of polynomials. The cubic version of the ABC Simple Matrix Encryption was developed with provable security in mind and was published including a heuristic security argument claiming that an attack on the scheme should be at least as difficult as solving a random system of quadratic equations over a finite field.

In this work, we prove that these claims are erroneous. We present a complete key recovery attack breaking full sized instances of the scheme. Interestingly, the same attack applies to the quadratic version of ABC, but is far less efficient; thus, the enhanced security scheme is less secure than the original.

Dustin Moody, Ray Perlner, Daniel Smith-Tone

Solving Discrete Logarithms on a 170-Bit MNT Curve by Pairing Reduction

Abstract

Pairing based cryptography is in a dangerous position following the breakthroughs on discrete logarithms computations in finite fields of small characteristic. Remaining instances are built over finite fields of large characteristic and their security relies on the fact the embedding field of the underlying curve is relatively large. How large is debatable. The aim of our work is to sustain the claim that the combination of degree 3 embedding and too small finite fields obviously does not provide enough security. As a computational example, we solve the DLP on a 170-bit MNT curve, by exploiting the pairing embedding to a 508-bit, degree-3 extension of the base field.

Aurore Guillevic, François Morain, Emmanuel Thomé

Springer Professional

Über dieses Buch

Inhaltsverzeichnis

Frontmatter

Invited Lectures

Frontmatter

Physical Attacks and Beyond

Post-quantum Key Exchange for the Internet and the Open Quantum Safe Project

Side Channels and Fault Attacks

Frontmatter

Detecting Side Channel Vulnerabilities in Improved Rotating S-Box Masking Scheme—Presenting Four Non-profiled Attacks

Bridging the Gap: Advanced Tools for Side-Channel Leakage Estimation Beyond Gaussian Templates and Histograms

Uniform First-Order Threshold Implementations

Attacking Embedded ECC Implementations Through cmov Side Channels

Lattice Attacks Against Elliptic-Curve Signatures with Blinded Scalar Multiplication

Loop-Abort Faults on Lattice-Based Fiat-Shamir and Hash-and-Sign Signatures

Design and Implementation of Symmetric Cryptography

Frontmatter

On the Construction of Hardware-Friendly and S-Boxes

All the AES You Need on Cortex-M3 and M4

Efficient Symmetric Primitives

Frontmatter

Hold Your Breath, PRIMATEs Are Lightweight

Keymill: Side-Channel Resilient Key Generator, A New Concept for SCA-Security by Design

Lightweight Fault Attack Resistance in Software Using Intra-instruction Redundancy

Cryptanalysis of Symmetric Primitives

Frontmatter

New Second Preimage Attacks on Dithered Hash Functions with Low Memory Complexity

New Differential Bounds and Division Property of Lilliput: Block Cipher with Extended Generalized Feistel Network

Cryptanalysis of Simpira v1

An Efficient Affine Equivalence Algorithm for Multiple S-Boxes and a Structured Affine Layer

Estimating the Cost of Generic Quantum Pre-image Attacks on SHA-2 and SHA-3

MACs and PRNGs

Frontmatter

Output Masking of Tweakable Even-Mansour Can Be Eliminated for Message Authentication Code

Improved Algebraic MACs and Practical Keyed-Verification Anonymous Credentials

A Robust and Sponge-Like PRNG with Improved Efficiency

Lattice-Based Cryptography

Frontmatter

Fixed-Point Arithmetic in SHE Schemes

A Full RNS Variant of FV Like Somewhat Homomorphic Encryption Schemes

Security Considerations for Galois Non-dual RLWE Families

Efficient Classical Public Key Cryptography

Frontmatter

Fast, Uniform Scalar Multiplication for Genus 2 Jacobians with Fast Kummers

PhiRSA: Exploiting the Computing Power of Vector Instructions on Intel Xeon Phi for RSA

FourNEON: Faster Elliptic Curve Scalar Multiplications on ARM Processors

Cryptanalysis of Asymmetric Primitives

Frontmatter

Sieving for Closest Lattice Vectors (with Preprocessing)

Key Recovery Attack on the Cubic ABC Simple Matrix Multivariate Encryption Scheme

Solving Discrete Logarithms on a 170-Bit MNT Curve by Pairing Reduction

Backmatter