#103: PCI Express: What’s Next for Storage
PCI Express® (PCIe®) 3.0 architecture has enabled flash storage to transition to high-speed, low-latency, power-efficient performance over the past few years. However, the hunger for additional performance in power-constrained devices continues, and PCI-SIG® continues its nearly three-decade history of doubling performance and adding features with the development of the PCIe 4.0 and PCIe 5.0 specifications. This presentation will review the major features of PCIe 4.0 and PCIe 5.0 technology, which will continue to enable the power-efficient performance required as NAND capacities scale and faster Storage Class Memories (SCM) become mainstream. Session attendees will gain insight into the current status of the PCIe 4.0 technology rollout and testing, and will learn about the PCIe 5.0 specification development and its timeline for completion in 2019. Learning Objectives: 1) Learn how PCIe is becoming the I/O of choice for storage; 2) Gain insight into the status of the PCIe 4.0 roll-out for storage applications; 3) Understand the PCIe roadmap to 32 GT/s.
29 Jul 2019
#110: Datacenter Management of NVMe Drives
This talk describes work going on in three different organizations to enable scale-out management of NVMe SSDs. The soon-to-be-released NVMe-MI 1.1 standard will allow management from host-based agents as well as BMCs. This might be extended to allow support for Binary Encoded JSON (BEJ) for host agents and BMCs that want to support the Redfish standard. We will also cover supporting work going on in SNIA (Object Drive TWG) and DMTF. Learning Objectives: 1) Principles and limitations of scale-out datacenter management; 2) An understanding of the NVMe-MI standard; 3) A Redfish profile for NVMe drives; 4) Inside-the-box and outside-the-box management networks; 5) Platform Level Data Model (PLDM).
8 Oct 2019
#105: Dual-Mode SSD Architecture for Next-Generation Hyperscale Data Centers
Increasing proliferation of Artificial Intelligence, E-commerce, Big Data and Cloud applications is leading to highly diversified workloads and use cases in hyperscale data centers, which poses new challenges to solid-state storage in terms of performance, flexibility and TCO optimization. Moreover, there are increasing demands for software/hardware co-optimization and for giving applications more control over the I/O path. Standard SSDs that are tuned for a few generic workloads cannot meet these challenges, resulting in suboptimal performance and TCO. We present our Dual-Mode SSD architecture, a new storage architecture designed for our next-generation hyperscale data centers. We define our own Open Channel SSD specification and build a Dual-Mode SSD platform that supports both Open Channel mode and standard NVMe mode. We develop our Open Channel software stack in full user space as well as in kernel space. Working seamlessly with our storage engine software, we build customized FTL solutions for different business applications. Our software/hardware co-optimization solutions are delivering significant benefits in performance, Quality of Service and TCO. Learning Objectives: 1) Challenges to solid-state storage systems in next-generation hyperscale data centers; 2) Dual-Mode SSD architecture; 3) Full user-space Open Channel software stack, and software/hardware co-optimization solutions.
13 Aug 2019
#108: SPDK NVMe: An In-depth Look at its Architecture and Design
The Storage Performance Development Kit (SPDK) open source project is gaining momentum in the storage industry for its drivers and libraries for building userspace, polled-mode storage applications and appliances. The SPDK NVMe driver was SPDK’s first released building block and remains its best known. The driver’s design and architecture are heavily influenced by SPDK’s userspace polled-mode framework, which has resulted in some significant differences from traditional kernel NVMe drivers. This presentation will give an overview of the SPDK NVMe driver’s architecture and design, a historical perspective on key design decisions, and a discussion of the driver’s advantages and limitations. Learning Objectives: 1) Gain a deeper understanding of the architecture and design of the SPDK NVMe driver; 2) Identify the key design differences between a userspace polled-mode driver and a traditional kernel-mode driver; 3) Describe the key advantages and limitations of SPDK and its polled-mode NVMe driver.
17 Sep 2019
Most Popular Podcasts
#107: The Long and Winding Road to Persistent Memories
Persistent Memory is getting a lot of attention. SNIA has released a programming standard, NVDIMM makers, with the help of JEDEC, have created standardized hardware to develop and test PM, and chip makers continue to promote upcoming devices, although few are currently available. In this talk two industry analysts, Jim Handy and Tom Coughlin, will present the state of Persistent Memory and show a realistic roadmap of what the industry can expect to see and when they can expect to see it. The presentation, based on three critical reports covering New Memory Technologies, NVDIMMs, and Intel’s 3D XPoint Memory (also known as Optane), will illustrate the Persistent Memory market, the technologies that vie to play a role, and the critical economic obstacles that continue to impede these technologies’ progress. We will also explore how advanced logic process technologies are likely to cause persistent memories to become a standard ingredient in embedded applications, such as IoT nodes, long before they make sense in servers. Learning Objectives: 1) What is the state of emerging memory technologies; 2) What technologies will be used in future NVDIMMs; 3) Emerging memory use in embedded and enterprise applications; 4) What are the costs for making emerging memories.
26 Aug 2019
#106: Container Attached Storage (CAS) with openEBS
Applying microservice patterns to storage gives each workload its own Container Attached Storage (CAS) system. This puts the DevOps persona in full control of the storage requirements and brings data agility to k8s persistent workloads. We will go over the concept and the implementation of CAS, as well as its orchestration. Learning Objectives: 1) Go over modern-day apps and their storage needs, under the notion that applications have changed but someone forgot to tell storage; 2) What are the problems in adopting technologies like user-space I/O, in particular technologies like SPDK, among others; 3) Looking at DevOps and the k8s model, how can we bring the power of user-space storage into developers' hands? Virtio for containers? Direct access from the Go runtime, for example via SPDK?; 4) We have tried both, and would like to share the outcome with you.
19 Aug 2019
#125: Opening up Linux to the wider world
After a year of implementation progress on the SMB3.1.1 POSIX Extensions, a set of protocol extensions that allow optimal Linux and Unix interoperability with NAS and cloud file servers: what is the current status? What have we learned? What has changed in the protocol specification in the past year? What advice do we have for implementers and users? These extensions greatly improve the experience for Linux users. This presentation will review the state of the protocol extensions and their current implementations in the Linux kernel and Samba, among others, and provide an opportunity for feedback and suggestions for additions to the POSIX extensions. This has been an exciting year with many improvements to the implementations of the SMB3.1.1 POSIX Extensions in Samba and Linux! Learning Objectives: 1) What is the current status of Linux interoperability with various SMB3.1.1 servers?; 2) How have the protocol extensions for Linux/POSIX progressed over the past year? What has changed? What works?; 3) What are suggestions for implementers of SMB3.1.1 servers?; 4) What is useful information for users to know to try these extensions?; 5) How do new Linux file system features map to these extensions?
5 May 2020
#122: 10 Million I/Ops From a Single Thread
One of the most common benchmarks in the storage industry is 4KiB random read I/O per second. Over the years, the industry first saw the publication of 1M I/Ops on a single box, then 1M I/Ops on a single thread (by SPDK). More recently, there have been publications outlining 10M I/Ops on a single box using high-performance NVMe devices and more than 100 CPU cores. This talk will present a benchmark of SPDK performing more than 10 million random 4KiB read operations per second from a single thread to 20 NVMe devices, a large advance over the industry's state of the art. SPDK has developed a number of novel techniques to reach this level of performance, which will be outlined in detail here. These techniques include polling, advanced MMIO doorbell batching strategies, PCIe and DDIO considerations, careful management of the CPU cache, and the use of non-temporal CPU instructions. This will be a low-level talk with real examples of eliminating data-dependent loads, profiling last-level cache misses, prefetching, and more. Additionally, there remain a number of techniques that have not yet been employed and that warrant future research. These techniques often push devices outside of their originally intended operating mode, while remaining within the bounds of the specification, and so often require collaboration between NVMe controller and device designers, the NVMe specification body, and software developers such as the SPDK team. Learning Objectives: 1) Optimal use of NVMe devices; 2) Optimal use of PCIe and MMIO in a storage stack; 3) Leveraging advanced x86-64 CPU instructions and making best use of the CPU cache.
30 Mar 2020
#118: Linux NVMe and Block Layer Status Update
This talk explains the exciting new features added to the Linux NVMe driver and software target over the last two years, as well as the block layer changes that support them. Learning Objectives: 1) Learn about new Linux features; 2) Learn about new NVMe features; 3) Have fun!
28 Jan 2020
#115: Accelerating RocksDB with Eideticom’s NoLoad NVMe-based Computational Storage Processor
RocksDB, a high performance key-value database developed by Facebook, has proven effective in using the high data speeds made possible by Solid State Drives (SSDs). By leveraging the NVMe standard, Eideticom’s NoLoad® presents FPGA-based computational storage processors as NVMe namespaces to the operating system, enabling efficient data transfer between the NoLoad® Computational Storage Processors (CSPs), host memory and other NVMe/PCIe devices in the system. Presenting CSPs as NVMe namespaces has the significant benefit of requiring minimal software effort to integrate computational resources. In this presentation we use Eideticom’s NoLoad® to speed up RocksDB. Compared to software compaction running on a Dell PowerEdge R7425 server, our NoLoad®, running on a Xilinx Alveo U280, delivered a 6x improvement in database transactions and a 2.5x reduction in CPU usage, while reducing worst-case latency by 2.7x. Learning Objectives: 1) Computational storage with NVMe; 2) Presenting computational storage processors as NVMe namespaces; 3) Accelerating database access with NVMe computational storage processors.
4 Dec 2019
#114: NVM Express Specifications: Mastering Today’s Architecture and Preparing for Tomorrow’s
Since the first release of NVMe 1.0 in 2011, the NVMe family of specifications has continued to expand to support current and future storage markets, increasing the number of new features and functions. With that natural, organic growth, however, comes additional complexity. In order to refocus on simplicity and ease of development, the NVM Express group has undertaken a massive effort to refactor the specification. The upcoming refactored specification, NVMe 2.0, integrates the scalable and flexible NVMe over Fabrics architecture into the NVMe base specification, meeting the needs of platform designers, device vendors and developers. But how can developers optimally design their products using the new NVMe 2.0 specification? This session will provide attendees with the following insights:
• An overview of the existing specification structure, its logic and limitations
• Highlights of how developers use the current specification before refactoring
• Information showing how the refactored specification enables companies to architect their products with better awareness of future areas of innovation
• Details on how new features and functionality will be included in the refactored specification
• Descriptions of how developers can leverage the refactored NVMe 2.0 specification to simply and efficiently bring new products to market
• An examination of the current projects and how to contribute
Learning Objectives: 1) Overview of the current NVMe specification structure; 2) Introduction to NVMe 2.0: how the refactored specification enables companies and developers to simply and efficiently bring new products to market; 3) The new features and functionality that will be included in NVMe 2.0, and how to get involved in current projects.
22 Nov 2019
#112: Computational Storage Architecture Development
With the formation of the Computational Storage TWG and growing market interest in these new and emerging solutions, it is imperative to understand how to develop, deploy and scale these new technologies. This session will walk through the new definitions, show how each can be deployed, and present use cases of NGD Systems Computational Storage Devices (CSDs). Learning Objectives: 1) Learn the different kinds of Computational Storage; 2) Understand the use cases for each type of solution; 3) Determine the ease of deployment and the value of these solutions.
29 Oct 2019
#102: Achieving 10-Million IOPS from a single VM on Windows Hyper-V
Many server workloads, for example OLTP database workloads, require high I/O throughput and low latency. With the industry trend of moving high-end scale-up workloads to virtualized environments, it is essential for cloud providers and on-premises servers to achieve near-native performance by reducing I/O virtualization overhead, which mainly comes from two sources: DMA operations and the interrupt delivery mechanism for I/O completions. Direct PCIe NVMe device assignment techniques allow a VM to interact with hardware devices directly, avoiding the traditional Hyper-V para-virtualized I/O path. To improve interrupt handling in a virtualized environment, Intel introduced Posted Interrupts (PI) as an enhanced method to mitigate interrupt delivery overhead, bypassing hypervisor involvement completely. In this talk, we will present Microsoft's implementation and optimization of Intel PI and Hyper-V direct PCIe NVMe access on the Windows platform. The results show that we were able to achieve more than 10 million IOPS from a single VM for the first time in the industry, using an Intel Skylake-based HPE commodity server with these techniques.
15 Jul 2019
#101: Introduction to Persistent Memory Configuration and Analysis Tools
Have you heard of non-volatile/persistent memory but don’t know how to get started with this disruptive technology? Memory is the new Storage. Next-generation storage tiered architectures are evolving around persistent memory, with hardware vendors delivering NVDIMMs. Are you a Linux or Windows application developer familiar with C, C++, Java, or Python, keen to develop the next revolutionary application or modify an existing application, but not sure where to start? Do you know what performance and analysis tools can be used to identify optimizations in your app to take advantage of persistent memory? Are you a software, server, or cloud architect who wants to get a jump start on this disruptive technology? This presentation will get you started on the persistent memory solution path. The future is in your hands. The future is now! Learning Objectives: 1) We’ll deliver an introductory understanding of persistent memory, introduce the SNIA Programming Model, Direct Access (DAX) filesystems, and show where persistent memory fits in the storage hierarchy; 2) We’ll provide several options for creating development environments (you don’t need physical modules to get started!); 3) We’ll introduce application programming using the Persistent Memory Development Kit (PMDK); 4) We’ll introduce and describe how to create and manage Persistent Memory Regions, Namespaces, and Labels; 5) Describe existing analysis tools to identify applications that are good candidates for persistent memory.
8 Jul 2019
#99: SNIA Nonvolatile Memory Programming TWG - Remote Persistent Memory
The SNIA NVMP Technical Workgroup (TWG) continues to make significant progress on defining the architecture for interfacing applications to PM. In this talk, we will focus on the important Remote Persistent Memory scenario and how the NVMP TWG’s programming model applies. Application use of these interfaces, along with fabric support such as RDMA and platform extensions, is part of this, and the talk will describe how the larger ecosystem fits together to support PM as low-latency remote storage. Learning Objectives: 1) Persistent Memory programming; 2) RDMA extensions; 3) SNIA PM initiatives.
17 Jun 2019
#98: Rethinking Ceph Architecture for Disaggregation Using NVMe-over-Fabrics
Ceph protects data by making 2-3 copies of the same data, but that means 2-3x more storage servers and related costs. It also means higher write latencies as data hops between OSD nodes. Customers are now starting to deploy Ceph using SSDs for high-performance workloads and for data lakes supporting real-time analytics. We describe a novel approach that eliminates the added server cost by creating containerized, stateless OSDs and leveraging NVMe-over-Fabrics to replicate data to server-less storage nodes. We propose redefining the boundaries of separation within SDS architectures to address disaggregation overheads. Specifically, we decouple control-plane and data-plane operations and transfer block ownership to execute on remote storage targets. This also dramatically reduces write latency, enabling Ceph to be used for databases and speeding up large file writes. As part of the solution, we also describe how OSD node failover is preserved via a novel mechanism using standby stateless OSD nodes. Learning Objectives: 1) Storage disaggregation; 2) NVMe over Fabrics; 3) Ceph architecture.
10 Jun 2019
#97: Delivering Scalable Distributed Block Storage using NVMe over Fabrics
NVMe and NVMe over Fabrics (NVMe-oF) protocols provide highly efficient access to flash storage inside a server and over the network, respectively. The current generation of distributed storage software stacks uses proprietary protocols that are suboptimal for delivering end-to-end low latency. Moreover, this increases the operational complexity of managing both NVMe-oF-attached flash storage and distributed flash storage in private cloud infrastructure. In this session, we present NVMe over Fabrics-based high-performance distributed block storage that combines the best of both worlds to deliver performance, elasticity and rich data services. Learning Objectives: 1) NVMe and NVMe-oF flash data path I/O architecture; 2) Programming, architecture and optimization for flash; 3) Distributed storage, data services.
3 Jun 2019
#96: Solid State Datacenter Transformation
Intel Fellow Amber Huffman has been at the center of Intel’s development for SSDs, with emphasis on SSD storage interfaces and next-generation form factors. In this talk, she will discuss the rationale behind decisions made in advancing storage architecture, leading to the emergence of solid-state-only data centers. Amber will also discuss the key factors influencing the future of the data center and the important role storage continues to play.
20 May 2019
#95: Tunneling through Barriers
Join Dr. Andy Walker for a wonderfully illustrated tour through 90 years of physics and materials science, leading to modern solid state memory technologies via “The Golden Thread of Tunneling.”
13 May 2019
#92: Fibre Channel – The Most Trusted Fabric Delivers NVMe
As data-intensive workloads transition to low-latency NVMe flash-based storage to meet increasing user demand, the Fibre Channel industry is combining the lossless, highly deterministic nature of Fibre Channel with NVMe. FC-NVMe targets the performance, application response time, and scalability needed for next-generation data centers while leveraging existing Fibre Channel infrastructure. This presentation will provide an overview of why Fibre Channel’s inherent multi-queue capability, parallelism, deep queues, and battle-hardened reliability make it an ideal transport for NVMe across the fabric. Learning Objectives: 1) A reminder of how Fibre Channel works; 2) A reminder of how NVMe over Fabrics works; 3) A high-level overview of Fibre Channel and NVMe, especially how they work together.
16 Apr 2019