Spring 2018 · Stanford Software Research Lunch

Program for the spring quarter of 2018.

4/5: Organizational Lunch

Time: Thursday, April 5, 2018, 12 noon - 1pm
Location: Gates 415

Organizational lunch. Come sign up to give a talk during the quarter.

Food: Todd

4/12: Upcoming hardware changes in the \"post Moore\" world and how to prepare

Time: Thursday, April 12, 2018, 12 noon - 1pm
Location: Gates 415

Speaker: Georgios Michelogiannakis

Abstract: With the decline and eventual end of historical rates of lithographic scaling, we arrive at a crossroad where synergistic and holistic decisions are required to preserve performance over power scaling in digital computing. Numerous technologies are emerging with this exact aim, from devices (transistors), memories, 3D integration, specialization, photonics, and others. With so many emerging technologies, the landscape of computer architecture in the 10-15 year time frame may be radically different. This talk will present an overview of potential game-changing technologies, and will then move to a discussion of what this means to the software and how to start preparing for the future now.

Food: Yoni

4/19: Practical Program Analysis: Principles and Techniques

Time: Thursday, April 19, 2018, 12 noon - 1pm
Location: Gates 415

Speaker: Qirun Zhang

Abstract: Software reliability is critical and challenging, for which program analysis offers a principled methodology. However, developing practical program analyses that are scalable and precise is difficult, both conceptually and engineering-wise. This talk highlights two lines of my research that significantly advance the state-of-the-art of program analysis. First, I will present a new reachability-based analysis framework and asymptotically faster algorithms that drastically outperform existing frameworks/algorithms in both speed and precision. The popular LLVM compiler infrastructure has adopted the techniques for fast, precise alias analysis. Second, I will describe a principled, scalable program enumeration framework for rigorous compiler testing. This work has led to 300+ confirmed/fixed bugs in important production/research compilers (such as GCC/LLVM/CompCert, Scala, and Rust) and enjoyed wide public acknowledgments from the compiler developer community. I will conclude the talk by discussing my research vision and plan for building reliable and performant software.

Food: Wonchan

4/26: Nerpa: Concise, Modular Programming of Industrial SDNs

Time: Thursday, April 26, 2018, 12 noon - 1pm
Location: Gates 415

Speaker: Leonid Ryzhyk

Abstract: Industrial software-defined networks (SDNs) like Google's Espresso and VMWare's NSX are written today using general-purpose languages like Java or C/C++ that expose programmers to details of low-level flow tables and control/dataplane synchronization, resulting in large, hard to maintain code bases. We introduce Nerpa, an SDN programming language, which we believe dramatically simplifies real-world SDN development by effectively combining relational and imperative constructs. A from-scratch implementation of OVN, an open source virtual network, takes ~1000 lines of Nerpa code, 20x less than the native C implementation. Our implementation is modular and extensible, making it easy to add new features with a few lines of code. Nerpa's high-level abstractions perform on par with hand-optimized code. Nerpa computes an incremental update to a large virtual network with 20,000 virtual machines in one millisecond; the native OVN implementation takes 17 seconds for an equivalent computation.

Food: Zhihao

5/3: Not Given

Time: Thursday, May 3, 2018, 12 noon - 1pm
Location: Gates 415

Speaker: Sergio Benitez

Abstract: Programs operating “close to the metal” necessarily handle memory directly. Because of this, they must be written in languages like C or C++. These languages lack any kind of guarantee on memory or race safety, often leading to security vulnerabilities and unreliable software. Ideally, we would like a practical language that gives programmers direct control over memory and aliasing while also offering race and memory safety guarantees.
This talk describes Metal, our formally specified Rust-based language that enjoys memory-safe and race-free references through ownership and restricted aliasing in the type system. Metal's type system models references and ownership as capabilities, where bindings have indirect capabilities on value locations. Syntactically, Metal is a strict subset of Rust. Semantically, Metal can track locations, and thus capabilities, with greater precision and thus accepts a wider range of safe programs in its syntactic subset. On the other hand, Metal lacks Rust’s region-based “lifetimes”, leading to coarser grained capability restoration. A small subset of the syntax, semantics, and type system of Metal are presented, as well as a comparison between Metal and Rust and speculative extensions to Metal and Rust that allow greater flexibility in single threaded programs.

Food: Todd

5/10: Data-Driven Program Analysis with Deep Learning

Time: Thursday, May 10, 2018, 12 noon - 1pm
Location: Gates 415

Speaker: Ke Wang

Abstract: Deep learning has led to breakthroughs in various application areas; it also starts to find exciting applications in software development, a key mission of Computer Science. Through cross-disciplinary research in machine learning, programming languages and software engineering, my goal is to develop practical deep learning-powered techniques and tools to significantly advance the state-of-the-art science and practice of software development. This talk highlights my research in this direction that bridges deep learning and program analysis. In particular, I will present my “Search, Align, and Repair” data-driven framework (SARFGEN) for generating instant, precise and complex fixes for MOOC-scale introductory programming exercises, which has been in production use in the Microsoft-DEV204.1X and Microsoft-DEV330x class. I will describe SARFGEN’s core algorithms for identifying similar programs, and aligning correct/incorrect programs, and its novel deep neural architecture for discovering minimal repairs. I will conclude the talk by discussing my research vision and plan for developing practical, intelligent software analysis and engineering tools.

Food: Zhihao

5/17: Isometry: A Path-Based Distributed Data Transfer System

Time: Thursday, May 17, 2018, 12 noon - 1pm
Location: Gates 415

Speaker: Zhihao Jia

Abstract: Data transfers in parallel systems have a significant impact on the performance of applications. Most existing systems generally support only data transfers between memories with a direct hardware connection and have limited facilities for handling transformations to the data’s layout in memory. As a result, to move data between memories that are not directly connected, higher levels of the software stack must explicitly divide a multi-hop transfer into a sequence of single-hop transfers and decide how and where to perform data layout conversions if needed. This approach results in inefficiencies, as the higher levels lack enough information to plan transfers as a whole, while the lower level that does the transfer sees only the individual single-hop requests.
We present Isometry, a path-based distributed data transfer system. The Isometry path planner selects an efficient path for a transfer and submits it to the Isometry runtime, which is optimized for managing and coordinating the direct data transfers. The Isometry runtime automatically pipelines sequential direct transfers within a path and can incorporate flexible scheduling policies, such as prioritizing one transfer over another. Our evaluation shows that Isometry can speed up data transfers by up to 2.2× and reduce the completion time of high priority transfers by up to 95% compared to the baseline Realm data transfer system. We evaluate Isometry on three benchmarks and show that Isometry reduces transfer time by up to 80% and overall completion time by up to 60%.

Food: Manolis

5/24: Online Detection of Effectively Callback Free Objects with Applications to Smart Contracts

Time: Thursday, May 24, 2018, 12 noon - 1pm
Location: Gates 415

Speaker: Mooly Sagiv

Abstract: Callbacks are essential in many programming environments, but drastically complicate program understanding and reasoning because they allow to mutate object’s local states by external objects in unexpected fashions, thus breaking modularity. The famous DAO bug in the cryptocurrency framework Ethereum, employed callbacks to steal $150M. We defne the notion of Effectively Callback Free (ECF) objects in order to allow callbacks without preventing modular reasoning. An object is ECF in a given execution trace if there exists an equivalent execution trace without callbacks to this object. An object is ECF if it is ECF in every possible execution trace. We study the decidability of dynamically checking ECF in a given execution trace and statically checking if an object is ECF. We also show that dynamically checking ECF in Ethereum is feasible and can be done online. By running the history of all execution traces in Ethereum, we were able to verify that virtually all existing contract executions, excluding these of the DAO or of contracts with similar known vulnerabilities, are ECF. Finally, we show that ECF, whether it is verified dynamically or statically, enables modular reasoning about objects with encapsulated state.

Food: Cristian

5/31: Beyond Data and Model Parallelism for Deep Neural Networks

Time: Thursday, May 31, 2018, 12 noon - 1pm
Location: Gates 415

Speaker: Zhihao Jia

Abstract: The computational requirements for training deep neural networks (DNNs) have grown to the point that it is now standard practice to parallelize training. Existing deep learning systems commonly use data or model parallelism, but unfortunately, these strategies often result in suboptimal parallelization performance. In this paper, we define a more comprehensive search space of parallelization strategies for DNNs called SOAP, which includes strategies to parallelize a DNN in the Sample, Operation, Attribute, and Parameter dimensions. We also propose FlexFlow, a deep learning framework that uses guided randomized search of the SOAP space to find a fast parallelization strategy for a specific parallel machine. To accelerate this search, FlexFlow introduces a novel execution simulator that can accurately predict a parallelization strategy’s performance and is three orders of magnitude faster than prior approaches that have to execute each strategy. We evaluate FlexFlow with six real-world DNN benchmarks on two GPU clusters and show that FlexFlow can increase training throughput by up to 3.8x over state-of-the-art approaches, even when including its search time, and also improves scalability.

Food: Karthik

6/7: p4v: Practical Verification for Programmable Data Planes

Time: Thursday, June 7, 2018, 12 noon - 1pm
Location: Gates 415

Speaker: Nate Foster

Abstract: This talk will present the design and implementation of p4v, a tool for verifying data planes specified using the P4 programming language. The tool is based on classic techniques but adds several key innovations including a novel mechanism for incorporating assumptions about the control plane, as well as domain-specific optimizations that are necessary to scale to large programs. With just a few hundred lines of control-plane annotations, p4v is able to quickly verify critical safety properties for a variety of real-world programs including switch.p4, a large program that implements all of the functionality found in a modern data center switch.
Joint work with Jed Liu (Barefoot), Bill Hallahan (Yale), Cole Schlesinger (Barefoot), Milad Sharif (Barefoot), Robert Soulé (Lugano), Han Wang (Barefoot), Calin Cascaval (Barefoot), and Nick McKeown (Stanford).

Food: Todd