Fall 2022 · Stanford Software Research Lunch

Program for the fall quarter of 2022.

9/6: Continuous Formal Verification of Smart Contracts: Mooly Sagiv, Certora and TAU

Time: Tuesday, September 6, 2022, 12 noon - 1pm
Location: Gates 459

Speaker: Mooly Sagiv

Abstract: Formal verification is a well-understood topic in computer science, still awaiting ideal application domains. There are several challenges in industrializing formal verification: 1. Most code is either not safety critical or rarely changed. 2. It is difficult, if not impossible, for developers to formally specify program behavior. 3. Formally verifying program correctness is computationally complex and requires enormous human resources with both domain knowledge and formal verification knowledge. DeFi is an interesting application domain for formal verification since DeFi programs (e.g., smart contracts) are typically small yet carry tremendous value. These programs often change between deployments. Due to the high financial cost of errors, developers in this domain value strong invariants that guarantee program correctness. I will show four complementary technologies developed by Certora for verifying smart contracts that leverage modern SMT solvers: 1. A language, CVL, for expressing high-level invariants of low-level code. 2. The Certora Prover, which converts EVM Bytecode and the CVL spec into a mathematical formula and then harnesses SMT solvers for formal verification. 3. A fuzzer for finding executions leading to CVL violations. 4. A mutation tester for checking that the CVL spec is not vacuous.

9/15: Pathways: A Cluster-wide Runtime for Distributed ML

Time: Thursday, September 15, 2022, 12 noon - 1pm
Location: Gates 459

Speaker: Michael Isard

Abstract: I will describe the Pathways system and the larger research project that it is a part of. We are building a cluster-wide runtime system with two related goals: 1) to improve the overall utilization of contended accelerator resources; 2) to support novel ML research that moves beyond lockstep SPMD computations, to exploit computational sparsity and sharing of learned components between different clients and ML models. The long term goal of Pathways is to let us build ML models trained for orders of magnitude more tasks, without deploying orders of magnitude more hardware. I will describe some requirements of a system built with the above goals in mind, and the systems research opportunities that they expose. I will also do a deep dive into how the current Pathways systems executes computations on the cluster.

9/29: An Infinite Needle in a Finite Haystack: Finding Infinite Counter-Models in Deductive Verification

Time: Thursday, September 29, 2022, 12 noon - 1pm
Location: Gates 459

Speaker: Neta Elad

Abstract: First-order logic, and quantifiers in particular, are widely used in deductive verification of programs and systems. Significant effort has been dedicated to finding quantifier instantiations that establish unsatisfiability of quantified formulas, thus ensuring validity of a system's verification conditions. However, in many cases the formulas are satisfiable --- this is often the case in intermediate steps of the verification process, e.g., when an invariant is not yet inductive. For such cases, existing tools are limited to finding finite models. We tackle the problem of finding infinite models. When displayed to the user, these models give insight into the verification failure, and allow the user to identify and fix bugs in the modeling of the system and its properties. Our approach consists of three parts. Firstly, we introduce templates as a way to represent certain infinite models. Secondly, we identify a new decidable fragment of first-order logic that extends and subsumes EPR, where satisfiable formulas always have a model representable by a template of a bounded size. Lastly, we describe an effective decision procedure to symbolically explore this (usually vast) search space of templates. We implement our approach in a tool called FIT and evaluate it on examples from a variety of domains. Our results show that FIT quickly finds infinite counter-models that distill the source of failure and demonstrate it in a simple way, even in cases beyond the decidable fragment, while state-of-the-art SMT solvers, such as Z3 and cvc5, and resolution-based theorem provers, such as Vampire and SPASS, diverge or return unknown.

10/6: Proof-Stitch: Proof Combination for Divide-and-Conquer SAT Solvers

Time: Thursday, October 6, 2022, 12 noon - 1pm
Location: Gates 459

Speaker: Abhishek Anil Nair

Abstract: With the increasing availability of parallel computing power, there is a growing focus on parallelizing algorithms for important automated reasoning problems such as Boolean satisfiability (SAT). Divide-and-Conquer (D&C) is a popular parallel SAT solving paradigm that partitions SAT instances into independent sub-problems which are then solved in parallel. For unsatisfiable instances, state-of-the-art D&C solvers generate DRAT refutations for each sub-problem. However, they do not generate a single refutation for the original instance. To close this gap, we present Proof-Stitch, a procedure for combining refutations of different sub-problems into a single refutation for the original instance. We prove the correctness of the procedure and propose optimizations to reduce the size and checking time of the combined refutations by invoking existing trimming tools in the proof-combination process. We also provide an extensible implementation of the proposed technique. Experiments on instances from last year’s SAT competition show that the optimized refutations are checkable up to seven times faster than unoptimized refutations.

10/13: Unity: Accelerating DNN Training Through Joint Optimization of Algebraic Transformations and Parallelization

Time: Thursday, October 13, 2022, 12 noon - 1pm
Location: Gates 459

Speaker: Colin Unger

Abstract: Unity is the first system that jointly optimizes algebraic transformations and parallelization in distributed DNN training. Unity represents both parallelization and algebraic transformations as substitutions on a unified parallel computation graph (PCG), which simultaneously expresses the computation, parallelization, and communication of a distributed DNN training procedure. Optimizations, in the form of graph substitutions, are automatically generated given a list of operator specifications, and are formally verified correct using an automated theorem prover. Unity then uses a novel hierarchical search algorithm to jointly optimize algebraic transformations and parallelization while maintaining scalability. The combination of these techniques provides a generic and extensible approach to optimizing distributed DNN training, capable of integrating new DNN operators, parallelization strategies, and model architectures with minimal manual effort. We evaluate Unity on seven real-world DNNs running on up to 192 GPUs on 32 nodes and show that Unity outperforms existing DNN training frameworks by up to 3.6× while keeping optimization times under 20 minutes. Unity is available to use as part of the open-source DNN training framework FlexFlow at https://github.com/flexflow/flexflow.

10/20: langcc: A Next-Generation Compiler Compiler

Time: Thursday, October 20, 2022, 12 noon - 1pm
Location: Gates 459

Speaker: Joe Zimmerman

Abstract: Given a declarative language spec in BNF, langcc automatically generates a full compiler frontend (from BNF), including AST + traversals, lexer, parser, and pretty-printer. langcc can serve as a replacement for lex+yacc, but also provides many additional features, including novel optimizations and grammar transformations, semantic attributes, AST source-line mapping, and user-friendly presentation of LR conflicts. Its generated parsers are faster than the standard parsers for Go 1.17.8 and Python 3.9.12, and the tool is general enough to be self-hosting, i.e., it generates its own frontend. Under the hood, langcc is based on several novel innovations on the LR parsing paradigm of Knuth; see the companion technical report for further details: http://arxiv.org/abs/2209.08383 tool: https://github.com/jzimmerman/langcc

10/27: Performance Modeling of Sparse Matrix Multiplication

Time: Thursday, October 27, 2022, 12 noon - 1pm
Location: Gates 459

Speaker: Mert Hidayetoglu

Abstract: Sparse matrix - dense matrix multiplication (SpMM) is one of the fundamental computational kernels widely used in scientific computing, artificial intelligence, and graph analytics applications. The SpMM throughput is typically bound by memory bandwidth and can potentially benefit from the high memory bandwidth of modern GPUs. However, load imbalance and poor data reuse across threads can significantly impact the computational throughput on GPUs. For guidance on algorithmic optimizations such as loop reordering, tiling, and fusing, one needs a reliable analytical performance model. Nevertheless, it is not trivial to predict the SpMM throughput on GPUs due to the unstructured nature of the memory accesses and the high complexity of memory hierarchy on GPUs. In this work, we develop a simple, yet reliable performance model based on the arithmetic intensity of SpMM kernels with advanced tiling strategies. The proposed intensity equation considers register- and scratchpad-tiling strategies and captures the performance implications of (1) the sparsity pattern of memory accesses and (2) the hierarchical memory architecture of GPUs with high fidelity. We benchmark the model against measurements with 200 diverse application cases on Nvidia A100 GPUs to assess the accuracy. With the guidance of the proposed model, we apply load balancing and row reordering optimizations to predictably improve the SpMM tiling performance. As a result, we obtain 2.03x geometrical mean speedup over Nvidia’s cuSPARSE library (19.2x speedup on average). We are in the process of merging our tiled SpMM kernel into Meta’s open-source FBGEMM library and will release benchmarking scripts for reproducibility upon acceptance of our manuscript that is currently under submission.

11/3: SpDISTAL: Compiling Distributed Sparse Tensor Computations

Time: Thursday, November 3, 2022, 12 noon - 1pm
Location: Gates 459

Speaker: Rohan Yadav

Abstract: We introduce SpDISTAL, a compiler for sparse tensor algebra that targets distributed systems. SpDISTAL com- bines separate descriptions of tensor algebra expressions, sparse data structures, data distribution, and computation distribution. Thus, it enables distributed execution of sparse tensor algebra expressions with a wide variety of sparse data structures and data distributions. SpDISTAL is implemented as a C++ library that targets a distributed task-based runtime system and can generate code for nodes with both multi-core CPUs and multiple GPUs. SpDISTAL generates distributed code that achieves performance competitive with hand-written distributed functions for specific sparse tensor algebra expressions and that outperforms general interpretation-based systems by one to two orders of magnitude.

11/10: A Core Calculus for Equational Proofs of Cryptographic Protocols

Time: Thursday, November 10, 2022, 12 noon - 1pm
Location: Gates 459

Speaker: Joshua Gancher

Abstract: Many proofs of interactive cryptographic protocols (e.g., as in Universal Composability) operate by proving the protocol at hand to be observationally equivalent to an idealized specification. While pervasive, formal tool support for observational equivalence of cryptographic protocols is still a nascent area of research. Current mechanization efforts tend to either focus on diff-equivalence, which establishes observational equivalence between protocols with identical control structures, or require an explicit witness for the observational equivalence in the form of a bisimulation relation. Our goal is to simplify proofs for cryptographic protocols by introducing a core calculus, IPDL, for cryptographic observational equivalences. Via IPDL, we aim to address a number of theoretical issues for cryptographic proofs in a simple manner, including probabilistic behaviors, distributed message-passing, and resource-bounded adversaries and simulators. We demonstrate IPDL on a number of case studies, including a distributed coin toss protocol, Oblivious Transfer, and the GMW multi-party computation protocol. All proofs of case studies are mechanized via an embedding of IPDL into the Coq proof assistant.

11/17: High-Performance Frameworks for Static and Streaming Graph Processing

Time: Thursday, November 17, 2022, 12 noon - 1pm
Location: Gates 459

Speaker: Julian Shun

Abstract: There has been significant interest in high-performance graph processing due to their applications in many domains, including social network and Web analytics, machine learning, biology, and physical simulations. However, writing efficient parallel graph programs for processing the large-scale graphs available today can be very difficult and time consuming, and therefore it is important to have tools that make the task easier. In this talk, I will present our work on designing frameworks for parallel graph processing, for both static graphs and streaming graphs. I will first present GraphIt, a domain-specific language that separates algorithm logic from performance optimizations to achieve high performance across different static graph algorithms and datasets. I will then present Aspen, a framework for processing streaming graphs that introduces a new purely-functional compressed tree data structure to enable graph queries and updates to be performed concurrently with low latency. Our solutions provide high-level programming interfaces that simplify the task of writing parallel graph programs, while achieving high performance at the same time.

11/24: Thanksgiving

Time: Thursday, November 24, 2022, 12 noon - 1pm
Location: Gates 459

Speaker:

12/1: Formal Specifications from Natural Language

Time: Thursday, December 1, 2022, 12 noon - 1pm
Location: Gates 459

Speaker: Christopher Hahn

Abstract: Formal or semi-formal specifications are essential for automated verification tasks. In this talk, we study the abilities of language models (LMs) to translate natural language into formal specifications with complex semantics. The primary motivation is a) to facilitate the writing of formal specifications and b) to enable verification of the output of LMs for code. We will consider a variety of experiments in diverse domains (Regex, FOL, LTL): Can LMs achieve sufficient accuracy on this non-traditional translation task to be helpful, i.e., generalize to the semantics? Can LMs simultaneously maintain their generalization capabilities from pre-trained knowledge of natural language? We will reserve some time at the end of the talk for a prototype demo for interactively generating formal specifications.

12/8: On Correctness of Automatic Differentiation for Non-Differentiable Functions

Time: Thursday, December 8, 2022, 12 noon - 1pm
Location: Gates 459

Speaker: Wonyeol Lee

Abstract: Differentiation lies at the core of many machine-learning algorithms, and is well supported by popular autodiff systems, such as TensorFlow and PyTorch. Originally, these systems have been developed to compute derivatives of differentiable functions, but in practice, they are commonly applied to functions with non-differentiabilities. For instance, neural networks using ReLU define nondifferentiable functions in general, but the gradients of losses involving those functions are computed using autodiff systems in practice. This status quo raises a natural question: are autodiff systems correct in any formal sense when they are applied to such non-differentiable functions? In this talk, I will provide a positive answer to this question. Using counterexamples, we first point out flaws in often used informal arguments, such as: non-differentiabilities arising in deep learning do not cause any issues because they form a measure-zero set. We then investigate a class of functions, called PAP functions, that includes nearly all (possibly nondifferentiable) functions in deep learning nowadays. For these PAP functions, we propose a new type of derivatives, called intensional derivatives, and prove that these derivatives always exist and coincide with standard derivatives for almost all inputs. We also show that these intensional derivatives are what most autodiff systems compute or try to compute essentially. In this way, we formally establish the correctness of autodiff systems applied to non-differentiable functions. This talk is based on my paper, published at NeurIPS 2020 (Spotlight) and presented at Differentiable Programming Workshop in 2022.

12/15: Provably Exhaustive Testing for Tree-Manipulating Programs

Time: Thursday, December 15, 2022, 12 noon - 1pm
Location: Gates 459

Speaker: Matthew Sotoudeh

Abstract: When a total function has only finitely-many possible inputs, questions about its behavior can be answered by testing on all of those inputs. In this talk I will present work-in-progress research extending the notion of exhaustive testing to functions that operate on tree-shaped, integer-valued inputs, such as linked lists and binary search trees.