Monday, August 22nd

  1. 9:00am
    A walk to remember: Debugging a distributed system failure

    Debugging distributed systems has a different set of complications than other fields in our industry. Each system may behave differently depending on the environment it's running in, and this nondeterministic behavior makes the process more challenging. If the debugging happens in a production environment, the risk increases and the nerves get to us.

    Flavio Percoco
    Principal Software Engineer
    Prior to Red Hat, Flavio worked on Big Data oriented applications, search engines and message systems. He was also an active member of GNOME's a11y team, where he contributed to Orca and created MouseTrap, a head-tracker application. Outside Red Hat, Flavio likes to take pictures, swim, run, travel, hang around with family and friends and do whatever seems interesting. Flavio spends most of his time hacking on storage and messaging modules. He has both Italian and Venezuelan roots, and is currently based in Italy, where he works remotely for Red Hat. Flavio is also an active open-source contributor, part of the MongoDB Masters group and an active Rust contributor.
  2. 10:00am
    Getting the Word Out: Membership, Dissemination and Population protocols

    We are building an instrumentation platform that runs across dozens of datacenters to provide operational visibility for internal systems and applications. This platform must remain up as much as possible and allow support and operations staff to understand and diagnose problems quickly. They must be able to ask questions like "what machines and applications are publishing metrics?", "what systems appear to be offline?", "what order did these errors occur in?", all without consulting every datacenter. Furthermore, they must be able to change configuration quickly, with confidence that every affected system will receive and act upon it.

    To help with these problems, we are implementing several recently developed protocols for cluster membership, epidemic broadcast, and monotonic time. Respectively, these protocols allow us to know which nodes are peers, to disseminate configuration and status information, and to agree on the approximate relative order of events. Best of all, they are all synchronization-free, meaning we can achieve our goals while remaining highly available. In this talk, we'll discuss the protocols we chose, the challenges of implementing them, and some preliminary results from deploying the protocols across our infrastructure.
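
    As a rough illustration of the epidemic broadcast idea in this abstract (not the protocols the speaker implemented), the sketch below simulates push-style gossip in a single process. The function name `push_gossip` and the `fanout` parameter are hypothetical, chosen only for this example.

```python
import random


def push_gossip(num_nodes, fanout, seed=0, max_rounds=100):
    """Simulate push-style gossip; return (rounds used, nodes informed)."""
    rng = random.Random(seed)
    informed = {0}  # node 0 starts out holding the update
    rounds = 0
    while len(informed) < num_nodes and rounds < max_rounds:
        newly_informed = set()
        for _ in informed:
            # Each informed node pushes the update to `fanout` peers
            # chosen uniformly at random (it may pick itself or a peer
            # that already has the update; that is wasted work, not an error).
            newly_informed.update(rng.sample(range(num_nodes), fanout))
        informed |= newly_informed
        rounds += 1
    return rounds, len(informed)


if __name__ == "__main__":
    rounds, count = push_gossip(num_nodes=1000, fanout=3)
    print(f"{count} nodes informed after {rounds} rounds")
```

    No node coordinates with any other here; each one only forwards what it knows, which is the sense in which such dissemination can avoid synchronization.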

    Sean Cribbs
    Software Engineer

    Sean Cribbs is a distributed systems and web architecture enthusiast, currently building innovative cloud and infrastructure software at Comcast Cable. Previously, Sean spent five years with Basho Technologies contributing to nearly every part of Riak including client libraries, CRDTs and tools. In his free time, he has ported Basho’s Webmachine HTTP server toolkit from Erlang to Ruby, created a popular parser-generator for Erlang, and has contributed to many other open-source projects, including Chef, Homebrew, and Radiant CMS.

  3. 11:00am
    Conquering Chaos to Land Humans on Mars

    Landing on Mars is challenging for lots of reasons, but primarily because of the inability to test a vehicle in a Mars environment prior to flight at the planet. Unable to do flight tests at Mars, we rely on Earth-based ground and flight tests, wind tunnels and computer simulations for mission design, test and verification. The simulations need to include models of all variables that affect flight. The models include characterizations of the atmosphere, aerodynamics, surface terrain, vehicle mass properties, engines, and landing sensors, to name a few. Additionally, uncertainties are included in each model to account for known unknowns, and margin is added to protect from unknown unknowns. The process for coping with chaos has been used successfully at NASA five times in the past 20 years to land robotic missions on the surface of Mars, the most recent being the 900 kg Curiosity rover in 2012.

    This presentation will review the overall Mars entry, descent and landing flight design process, including simulation development and various approaches used to land rovers on Mars. Additionally, the presentation will describe how the techniques developed for robotic missions are being used to land human missions on Mars.

    Alicia M. Dwyer Cianciolo
    Aerospace Engineer

    Alicia Dwyer Cianciolo is an aerospace engineer at the NASA Langley Research Center. She specializes in developing simulations to analyze vehicle flight through different atmospheres in the solar system. Primarily focusing on Mars over the past 15 years, she has worked on several missions to the planet, including the Odyssey and Reconnaissance Orbiter aerobraking operations and the Exploration Rovers, and was a member of the Entry, Descent and Landing Team that successfully landed the Curiosity Rover on Mars in August of 2012. She is currently supporting NASA’s next lander mission to Mars, InSight, and is working to analyze entry technologies that will enable human exploration of the planet. She holds a Bachelor of Science degree in Physics from Creighton University and a Master of Science degree in Mechanical Engineering from The George Washington University.

  4. 1:00pm
    Fetching Moths from the Works: Correctness Methods in Software

    We live in a nice world. There’s a wealth of historical thought on achieving correctness in software (shipping code that does only what is intended, not less and not more) and there are a whole bunch of methods available to us as practitioners. Some of these are hard to apply, some are easy. For instance, case testing is widely used and considered standard practice. Property testing is understood to exist but is not widely used. The application of advanced logics? Way out there.

    If you look around you’ll find a lot of software fails a lot of the time. Why is that?

    In this talk I’ll give an overview of the methods for producing correct systems and will discuss each in its historical context. With each method, we’ll keep an eye out for present-day applications and the difficulty of applying it. We’ll discuss why there’s so much buggy software in the world. I expect there will be a bit of talk about spaceships.

    By the end of this talk you ought to be able to make reasoned decisions about applying correctness methods in your own work and have a good shot at building better software.

    Brian Troutwine
    Software Engineer

    Brian L. Troutwine is a software engineer with a focus on fault-tolerance and real-time critical systems. He works extensively in Erlang and is an Infrastructure Engineer at Postmates. Brian likes things that cause disasters on failure.

  5. 2:00pm
    Voice Controlled ChatOps

    So you have seen Tony Stark (Iron Man) talk to his computer J.A.R.V.I.S. (Just A Rather Very Intelligent System), right? Voice controlled ChatOps is pretty much like talking to J.A.R.V.I.S. It is a much more natural interface than keyboard and mouse. You talk to the hardware (Alexa/Raspberry Pi) and it sends the commands off through the Alexa Skills Kit (ASK) to AWS Lambda. From there they make it to the chat room, where Hubot takes over and integrates with the rest of the operations toolchain (PagerDuty, New Relic, deployment, etc.).

    Aaron Blythe
    Senior Automation and System Engineer

    Aaron Blythe has worked with software for over a decade. He is currently a Sr. Automation and Release Engineer working remotely for Hearst. He is genuinely curious and interested in understanding things and making them better. He has co-organized the Kansas City DevOps community meetup for the past few years.

  6. 3:00pm
    Replacing a Jet Engine Mid-flight, or How We Launched New Architecture for a Planet-Scale Distributed System at Google

    As our systems evolve and establish themselves as the go-to solution for a problem domain, the need arises over time to rethink their architecture to better support the most popular (and potentially unanticipated) use cases and growth. Often, this results in a significant rewrite of the system. In globally distributed systems, like the distributed build system at Google, which serves millions of requests per day, downtime is not an option. In this talk, we’ll look at how we managed to replace the previous production system with a new architecture, and how we did so with no downtime or user-visible effects.

    Aysylu Greenberg
    Software Engineer

    Aysylu Greenberg works at Google on their distributed build system. In her spare time, she ponders the design of systems that deal with inaccuracies, paints and sculpts.

  7. 4:00pm
    Practical Accommodations for Mental Health

    We have accommodations for many physical health issues. But when it comes to mental health, things are pretty abysmal even though it is incredibly important. Based on talks with other folks and my own experience, I will present some practical ways that mental health can be accommodated at work.

    Laura Ku
    Software Engineer

    Laura is a software engineer at a company called CarbonFive during the day and a superhero fighting monsters at night. One of those two is probably a lie. She also thinks that React is starting to make everything look like a nail.

Tuesday, August 23rd

  1. 9:00am
    The Edge of Chaos

    In the software industry, we are regularly faced with understanding complex systems--which can only be understood holistically, and not as the sum of their parts--with the limited capacity of our human brains. Moreover, our complex systems are developed and operated by people, which makes our overall sociotechnical systems complex *adaptive* systems with emergent behavior that wasn't contemplated when the system was first designed.

    In this talk, we'll describe the nature of this problem and cover coping strategies drawn from control theory and other non-software contexts. We'll use this point of view to describe *why* Agile software development, DevOps, and microservices are business imperatives for avoiding falling over the "Edge of Chaos".

    Jon Moore
    Senior Fellow
    Jon Moore is a Senior Fellow at Comcast Cable, where he leads the Core Application Platforms group that focuses on building scalable, performant, robust software components for the company's varied software product development groups. His current interests include distributed systems, fault tolerance, building healthy and engaging engineering cultures, and Texas Hold'em. Jon received his Ph.D. in Computer and Information Science from the University of Pennsylvania and currently resides in West Philadelphia, although he was neither born there nor raised there and does not spend most of his days on playgrounds.
  2. 10:00am
    Mesos: Automate your Data Center with Containers

    Maybe you’ve heard of Mesos, that thing like Kubernetes? Or perhaps you read that Twitter is powered by open-source infrastructure? Is Mesos meant for continuous delivery or for microservices? In this talk, David Greenberg, author of the O’Reilly book “Building Applications on Mesos”, will introduce you to Mesos. We’ll learn how Mesos makes it easy to host scalable, fault-tolerant application servers, continuous integration, and even your databases, through the development of a hypothetical sample application. We'll also learn about the open source and proprietary software that Mesos can automatically and reliably deploy for you, such as Spark, WordPress, and Jenkins. At the end of this talk, you’ll be equipped to evaluate Mesos and understand its place in your software development projects.

    David Greenberg
    Author, Consultant
    David Greenberg loves learning new things. He is an independent consultant who previously worked at Two Sigma, where he led the effort to rebuild their computing infrastructure. His desire to learn has led him to study Russian, and he enjoys practicing cooking techniques. He's interested in high performance software and distributed systems with Mesos. He's the author of the O'Reilly book "Building Applications on Mesos" and the designer of Cook, a Mesos framework written in Clojure and Datomic which coordinates containers to optimize task scheduling.
  3. 11:00am
    Assigning Meaning to Programs

    In 1968 Turing Award winner Robert Floyd wrote a seminal paper in formal program validation entitled "Assigning Meaning to Programs." In the paper, Floyd describes a technique which bounds each step in a computation with logical predicates on input and output conditions.

    In this talk, we will use the same technique to understand how and why a program behaves the way it does. We will explore techniques to automate call graph visualizations, create logical predicates for the steps in program execution and discuss what conclusions we can draw for our own work.

    By the time you leave, you will feel comfortable applying this technique to acquire a deep understanding of the behavior of your code. This approach will help you squash a well hidden bug, refactor too-complex code into simpler modular units or better comprehend a code base that is new to you.

    This talk is suitable for programmers of all skill levels from novice to seasoned.
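
    A minimal sketch of the idea described above, assuming integer division by repeated subtraction as the example program: each step is bracketed by predicates on its inputs and outputs, written here as runtime `assert` statements rather than proved statically. The function name `divide` is chosen for illustration and is not taken from the talk.

```python
def divide(x, y):
    """Integer division by repeated subtraction, annotated with predicates."""
    # Input condition (precondition) for the whole computation.
    assert x >= 0 and y > 0

    q, r = 0, x
    while r >= y:
        # Predicate that holds every time control reaches this point.
        assert x == q * y + r and r >= y
        q, r = q + 1, r - y

    # Output condition (postcondition): q is the quotient, r the remainder.
    assert x == q * y + r and 0 <= r < y
    return q, r


if __name__ == "__main__":
    print(divide(17, 5))
```

    The predicate inside the loop is what connects the input condition to the output condition: it holds on entry, each iteration preserves it, and together with the loop exit test `r < y` it implies the postcondition.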

    Mark Allen
    Software Developer

    Mark has been a software developer for over 20 years. He has used a lot of programming languages to build APIs and spoken at many conferences including Strange Loop, OSCON, Midwest.io and more.

  4. 1:00pm
    Murphy's Law for Conferences

    Don't believe in Murphy's Law? Throw a conference and get back to me. Want to learn about organizing a conference or just pick up a few tips for your next event? Join me, Amanda Harlin, and I'll share some insight on how we make Thunder Plains happen without losing our cool.

    Amanda Harlin
    Co-founder at the Techlahoma Foundation

    Amanda Harlin is a community organizer, JS consultant, and new mom. She co-organizes the Techlahoma Foundation, Thunder Plains, and OKC.js with her husband and some great friends.

  5. 2:00pm
    Living on the Edge

    Joshua Bloch said that "Public APIs, like diamonds, are forever," and Antoine de Saint-Exupéry said that "Perfection is achieved, not when there is nothing more to add, but when there is nothing left to take away." However, good APIs don't always spring to life fully formed. How, as client and server, can you deal with changes? We'll demonstrate through some real-world examples learned while implementing the ever-changing FHIR standard.

    Jenni Syed
    Principal Architect

    Jenni is a Principal Architect at Cerner, and has worked on various service architectures for 15 years. Since 2008, Jenni has been working specifically with web services and now concentrates on implementing and providing input to the evolving Fast Healthcare Interoperability Resources (FHIR) standard. When not working, Jenni chases her two children around, reads, and plays video games.

  6. 3:00pm
    Testing the hard stuff and staying sane

    Even the best test suites can't entirely prevent nasty surprises: race conditions, unexpected interactions, faults in distributed protocols and so on, still slip past them into production. Yet writing even more tests of the same kind quickly runs into diminishing returns. I'll talk about new automated techniques that can dramatically improve your testing, letting you focus on what your code should do, rather than which cases should be tested--with plenty of war stories from the likes of Ericsson, Volvo Cars, and Basho Technologies, to show how these new techniques really enable us to nail the hard stuff.
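
    To illustrate the shift from choosing test cases to stating what the code should do, here is a hand-rolled sketch using only the Python standard library. It is not QuickCheck or any tool from the talk; the helper names (`check_property`, `random_string`, `round_trips`) and the run-length encoding example are assumptions made for this example.

```python
import random


def rle_encode(s):
    """Run-length encode a string into a list of (char, count) pairs."""
    out = []
    for ch in s:
        if out and out[-1][0] == ch:
            out[-1] = (ch, out[-1][1] + 1)
        else:
            out.append((ch, 1))
    return out


def rle_decode(pairs):
    """Invert rle_encode."""
    return "".join(ch * n for ch, n in pairs)


def random_string(rng):
    """Generator: short strings over a tiny alphabet, so runs are common."""
    return "".join(rng.choice("ab") for _ in range(rng.randint(0, 20)))


def check_property(prop, gen, trials=500, seed=0):
    """Return the first generated value that falsifies prop, or None."""
    rng = random.Random(seed)
    for _ in range(trials):
        value = gen(rng)
        if not prop(value):
            return value
    return None


def round_trips(s):
    """Property: decoding an encoded string gives back the original."""
    return rle_decode(rle_encode(s)) == s


if __name__ == "__main__":
    print("counterexample:", check_property(round_trips, random_string))
```

    The property says what must hold for every input, and the generator supplies the cases. Real property-based testing tools go further than this sketch, for example by shrinking a failing input to a smaller counterexample.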

    John Hughes
    Professor & Founder of QuviQ

    John Hughes has been a functional programming enthusiast for more than thirty years, at the Universities of Oxford, Glasgow, and since 1992 Chalmers University in Gothenburg, Sweden. He served on the Haskell design committee, co-chairing the committee for Haskell 98, and is the author of more than 75 papers, including "Why Functional Programming Matters", one of the classics of the area. With Koen Claessen, he created QuickCheck, the most popular testing tool among Haskell programmers, and in 2006 he founded Quviq to commercialise the technology using Erlang.

  7. 4:00pm
    Let's Make the PAIN Visible!

    What if we could measure the indirect costs of pain building up on a software project? What if we could measure the effects of learning curves, collaboration pain, and problems building up in the code?

    We could:
    • Identify the biggest problems causing our pain
    • Make the case to management for improvement
    • Create a data-driven feedback loop for learning what works

    The Idea Flow Learning Framework is a process for data-driven software mastery. With a data-driven feedback loop for systematically optimizing productivity, we can start learning faster, and improving faster than ever before.

    Best practices are based on a history of anecdote and gut feel. We can write tons of tests but that doesn't mean our tests will catch our bugs. We can have well-modularized code with minimal duplication, and it can still be extremely difficult to troubleshoot problems and track down bugs.

    What are we optimizing for as an industry? Is it brevity? Modularity? Code coverage? All of these things are really a means to an end. By measuring “Idea Flow”, the flow of ideas between the developer and the software, we can focus on the friction in developer experience that really matters.

    Idea Flow gives a universal definition of effective practice. It's not about whether we've got automated tests, or modular code, or whether we're following the right practices or the right process. It's about whether the things we're doing are actually solving our problems.

    Idea Flow gives us a universal language for sharing our experience. Rather than having conversations completely in pronouns, we can create an explicit vocabulary of contextualized patterns and principles to describe what works (and what doesn't).

    Idea Flow gives us the capability to start learning together as an industry, and take our organizations to a whole new level of effectiveness.

    Janelle Klein

    Janelle is an NFJS Tour Speaker, author of the book Idea Flow: How to Measure the PAIN in Software Development (leanpub.com/ideaflow), and founder of Open Mastery (openmastery.org), an industry collaborative learning network focused on mastering the art of software development with a data-driven feedback loop.

    She founded Open Mastery to rally the industry in working together, and learning together to break down the wall of ignorance between managers and developers that drives our software projects into the ground. By making the pain visible with Idea Flow, we have a universal definition of effective practice, a universal language for sharing our experiences, and an opportunity to learn together like never before. Open Mastery is about taking the industry to a whole new level of effectiveness by working together.

    Aside from Open Mastery, Janelle has been working with New Iron for the last 10 years, as a developer, consultant, and now as CTO. Her development background is specialized in data-intensive analytic systems from financial core processors to factory automation, supply chain optimization and statistical process control (SPC). Her consulting work has focused on Continuous Delivery infrastructure, database automation, test automation strategies, and helping companies identify and solve their biggest problems.