Pseudo-calculations to show the Silliness of “No Defects”

I find it quite silly, though understandable, when people say things like “we cannot afford to have any defects.” Many software teams, especially ones in large organizations, seem to have this policy. In fact, I remember when I first started thinking about this. My tech lead at the time made such a statement and I said “actually, it’s a Return on Investment calculation.” He literally seethed and said something to the effect of “not really.”

Well, he was wrong. I think I have some interesting pseudo-calculations around what defects are and how we can manage them.

The Problem

You might have heard this in your own team-space and have thought “yeah, what is wrong with no defects?” The problem is it encourages a bias towards inertia. The logic is simple: the only way to guarantee no defects is to never release any software. Though no one is likely to accept this extreme, the idea seeps its way into our minds every time we hear “no defects.”

We are in a climate where every day, sometimes every hour, that we can get something valuable out the door can give us a competitive advantage. So even the slight delay of “well, are you sure we should deploy this? Are you certain it won’t cause a bug?” is an opportunity lost for us.

Risk Gives Us Profit

Instead, we need to invert the narrative and focus on delivering value while hedging the risk of that delivery. As Peter Drucker said: All profit is derived from risk. We should not shy away from risk; we need it to make any money! If companies took no risks, their competition would eat up their customer base.

The key, then, is to manage the risk we put forth as we strive to innovate and produce profitable features in our software. I believe I have some pseudo-calculations that can help us frame how to manage this risk.

Negative Business Impact

As I see it, the root of most fear over defects is what I call Negative Business Impact. This takes the form of things like service downtime, customers unable to submit their orders, or a button showing up misaligned at the bottom of the screen.

All of these have varying levels of business impact. The misaligned button, for example, will be annoying but does not stop a customer from seeking their goals. A service being down, however, can have large ramifications.

Here is how I view it:

Negative Business Impact (NBI) = Frequency of Incident * Severity of Incident.

Incident: This is an occurrence of a potential defect.

Frequency: This is the number of times that incident occurs in a time period.

Severity: This is the juicy one. This is how badly even one occurrence of an incident can stop your software from generating revenue or saving costs.

A Dive Into Severity

While an initial glance can show you not all defects or incidents are created equal, I feel this calculation is worth a deeper dive.

First, Severity is a very subjective variable. We can rank incidents, or even assign a dollar value for types of incidents. I find that having buckets of severities is the simplest thing to do. Later we will see why it is useful to have some quantity of value lost per each bucket.

There are two types of severity:

Severity (Constant) = Value Lost. This is for incidents where, once they happen, there is nothing you can do about them. If you submit your order and you get the wrong parts, it already happened. It does not get worse over time.

Severity = (Value Lost / Time) * Time To Recover. This is where the duration of the incident impacts your value lost. For example, if your service is down, the value lost increases the longer it is down. You could try to tag each failed connection attempt as its own incident, but since you are down you probably cannot count those, so it’s best to just calculate or estimate an average value lost per unit of time.

Value Lost: revenue lost or a more abstract quantity, like a customer delight score.

Time To Recover = Time To Triage + Time To Resolve. This is how long it takes to both triage/investigate an incident and then resolve it so the system works again. For common incidents, it is often calculated as a Mean Time To Recover (MTTR). Plugging MTTR into Severity gives you the Mean Severity for the incident.
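To make the pseudo-calculations above concrete, here is a minimal Python sketch. The class names, the hourly time unit, and every number in it are invented for illustration; they are assumptions, not measurements from any real system.

```python
from dataclasses import dataclass


@dataclass
class ConstantIncident:
    """An incident whose cost is fixed once it happens (e.g. wrong parts shipped)."""
    frequency: float   # occurrences per time period
    value_lost: float  # value lost per occurrence

    def severity(self) -> float:
        # Severity (Constant) = Value Lost
        return self.value_lost


@dataclass
class TimedIncident:
    """An incident that costs more the longer it lasts (e.g. a service outage)."""
    frequency: float            # occurrences per time period
    value_lost_per_hour: float  # average value lost per hour of the incident
    time_to_triage: float       # hours
    time_to_resolve: float      # hours

    def time_to_recover(self) -> float:
        # Time To Recover = Time To Triage + Time To Resolve
        return self.time_to_triage + self.time_to_resolve

    def severity(self) -> float:
        # Severity = (Value Lost / Time) * Time To Recover
        return self.value_lost_per_hour * self.time_to_recover()


def negative_business_impact(incident) -> float:
    """NBI = Frequency of Incident * Severity of Incident."""
    return incident.frequency * incident.severity()


# Hypothetical numbers, chosen only to show the arithmetic.
wrong_parts = ConstantIncident(frequency=4, value_lost=50.0)
outage = TimedIncident(frequency=2, value_lost_per_hour=1000.0,
                       time_to_triage=0.5, time_to_resolve=1.5)

print(negative_business_impact(wrong_parts))  # 200.0
print(negative_business_impact(outage))       # 4000.0
```

Feeding mean triage and resolve times into TimedIncident gives the Mean Severity described above. Halving either time halves the outage's impact without changing how often it happens.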

The Sneakiness of Customer Trust

A common retort from those who have not thought through the above calculations is “well, any incident will reduce customer trust.” Will it really though?

Take two video games as examples. The first one is some one-man-show that a kid developed in their spare time. Or maybe it is an EA game. The mechanics are clunky and not fun at all. In addition, there is a bug where it sometimes autosaves during combat, when a player may die and then reload in a sticky situation. In this case you can bet customers will not trust the game developer at all.

But let’s say a AAA game was just released and fans clamor to get it. They play it and love it. It also has the same bug: it sometimes autosaves during combat. But the game is so fun fans just shrug and say “They will fix it in the next patch” and continue on with their gameplay.

We can boil down these examples into the following pseudo-calculation:

Customer Trust = Value Delivered – Negative Business Impact

Value Delivered: Again, this could be revenue or a more abstract quantity, but it covers all the things that have been delivered that have also caused our current set of incidents.

This calculation lets me easily respond to the customer trust problem by saying “And how much more trust is lost if you are not putting anything valuable into their hands?” People will be more forgiving the more valuable things you give them. Admittedly, knowing what value you will give them in advance can be tricky to impossible.
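As a rough illustration of the Customer Trust pseudo-calculation, the two video game examples can be sketched in Python. The scores are made-up numbers on an arbitrary scale, used only to show how the same bug lands differently depending on the value delivered.

```python
def customer_trust(value_delivered: float, negative_business_impact: float) -> float:
    """Customer Trust = Value Delivered - Negative Business Impact."""
    return value_delivered - negative_business_impact


# Hypothetical scores: both games ship the same autosave bug.
autosave_bug_nbi = 30.0

clunky_game = customer_trust(value_delivered=10.0,
                             negative_business_impact=autosave_bug_nbi)
beloved_game = customer_trust(value_delivered=500.0,
                              negative_business_impact=autosave_bug_nbi)

print(clunky_game)   # -20.0
print(beloved_game)  # 470.0
```

The bug costs the same in both cases; only the value delivered differs, and that is what decides whether trust ends up positive or negative.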

Hedge Your Risk

If we take these pseudo-calculations to heart, we can derive what our goals should be: optimize for Value Delivered while keeping Negative Business Impact as low as possible. This means we want mechanisms to continually deliver value.

But we also want to find ways to keep Severity low. This usually means focusing on reducing Mean Time To Recover, through mechanisms like self-healing services. It also can mean applying things like Canary Releasing so the cost and/or frequency of incidents stays low.

Conclusion

Even though these calculations are not strictly scientific, I still find them useful. It is good for us to question the idea that “defects are bad” as a maxim. We need to understand the deeper context of what tradeoffs we make when managing defect risk. And I believe these pseudo-calculations, like Negative Business Impact, can help us form a framework to do that. And with that in hand, we can explore even further into testing in production.

On Mapping

Recently I have been obsessing about the idea of sharing team knowledge. I am seeing huge challenges in getting a team to understand certain things. Software development teams are constantly dealing with software in a state of evolution and transition, yet we really don’t have any tools to show and manage this.

Then I read about Wardley Maps. In his account, Simon Wardley describes his journey as the CEO of a successful company who still felt lost. He was playing a chess game without the board. He eventually figured out that the great generals of wars and battles always had a map on which they would draw out their tactics and their plans. Yet in business, we have no such thing. So he made one, as I will show you below.

This not only has implications for how I consult with my clients on their business, but it also got me thinking about how we can clearly show our architectural landscape and how to move through it. To see why mapping is useful, let’s discuss what a map is.

What Is a Map

Not everything visual is a map. Most things are only diagrams. To be a map, according to Simon Wardley, something needs five things:

Visual – More than text.
Context-Specific – Not all maps can be used for all purposes. A geographical map of country territories is different from a topographical map of a city.
Position – Where things are placed spatially matters. In geography, this is North, South, East, West etc.
Based on an Anchor – Something needs to dictate the positioning of the space. In a geographical map, this is the “North” key.
Movement – It needs to show something shifting position.

The below flowchart, for example, is only a diagram:

It is visual. It is context specific (Ordering something). However, it has no anchor for positioning nor does it have any movement. For example, I could rearrange the diagram like this:

It still expresses the same thing. Also, it shows no movement. What is flowing through those arrows? Let’s add the necessary components:

It is still Visual and Context-Specific. The “Time” arrow is the anchor, positioning all the steps relative to their chronological order. And finally, we give it Movement by showing that it is an Order moving through this set of steps.

Why Are Maps Valuable?

Now that we know what they are, I want to explain their power. One of the maxims given to leadership is “provide a strong vision that people can align with and execute on. Don’t tell people what to do.” The devil, of course, is in the details. What does this vision look like? I have tried different mediums myself and most of them have fallen short, creating lots of ambiguity in the team space.

Usually, a vision is something like “Let’s make our system the most resilient ordering system in the market!” This statement may sound inspiring and may be a good starting point, but it is missing a lot of things. What is stopping us today from already being resilient? What are the components needed to be resilient? What are the overall steps we are going to take from step A to C? Let’s turn this vision statement into a map:

Let’s see if it has all our elements. It is visual. It has the same context: Online Ordering. It has a chronological position relative to the anchor of time. In other words, each step is in chronological order. And it shows movement towards greater resiliency. It should be clear that our end goal is to get each of these steps to the top of the map.

And look how much more powerful it is than a vision statement! With this, the team can assess what components are most valuable to target first. They can make decisions based on the ordering of steps or other criteria. They can counter the map itself, saying a component should move up or down the map.

A vision statement is a good start, but it needs follow-up. Maps can be this follow-up, providing clarity to the team and providing a central point of discussion and execution. That is powerful. Let’s talk about a couple of different kinds of maps.

Wardley Maps

This idea of mapping led Simon Wardley to create a way to map strategic business landscapes. Here is an example of one:

I am not going to cover all the details of this map; you can find them here. It has all the elements a map needs. The Value Chain is the positioning relative to a Customer, the anchor. It is visual. It is context specific to a software system. And it has movement in the Evolution axis.

Architectural Maps

This is something with which I am still struggling. I think this is where I want to focus most of my study and experimentation. We really don’t have much that can map the evolution of a set of software components.

However, I am determined to figure out what we can do here. I fully expect to post about this in the near future.

Conclusion

The idea of mapping has been profoundly valuable in the last few months. Not only can I dominate a strategic landscape with Wardley maps in consulting engagements, I can also give guidance to my development teams by figuring out how to visually map the current state of our system to our desired state. And this has blown any other way of sharing a vision out of the water.

Canary Deployment: What It Is and How To Use It

Deploying to production can be risky. Despite all the mitigation strategies we put in place—QA specialists, automated test suites, monitoring and alerts, code reviews, static analysis—systems are getting more complex every day. This makes it more likely that a new feature can ripple into other areas of your app, causing unforeseen bugs and eroding the trust customers have in you.

Taking their cue from the miners of old, developers created the idea of canary deployments: releasing a new feature to just a subset of your users or systems. Rollout.io calls this gradual rollout. If we enable a feature within just part of our system, we can monitor any problems it creates. This lets us keep general customer trust high while freeing us to focus on innovation, delivering excellent new features to our customers.

History in Mining

The term “canary deployment” comes from an old coal mining technique. These mines often contained carbon monoxide and other dangerous gases that could kill the miners. Canaries are sensitive to air-based toxins, so miners would use them as early detectors. The birds would often fall victim to these gases before they reached the miners. This helped ensure the miners’ safety: one bird dying or falling ill could save multiple humans’ lives. In the same sense, the first part of our system to which we release a new feature acts as our canary: it detects potential bugs and disruption without affecting every other system running.

OK, But How Do I Make This Magic Happen?

The idea itself is straightforward, but there are a lot of nuances in how we should approach deploying these features. Often, we must know ahead of time that we’ll be canary releasing.

Does the Feature Need It?

Canary deployments have a cost. They add noise to your codebase that slows down development. The feature’s release will need to be maintained over a noticeable period of time, so this eats a bit into your team’s capacity. If you want to put a feature in a canary deployment, you need to be able to justify these costs.

Does this feature touch multiple areas of the application? Is this feature highly visible to the customers? Does it have a large impact on the customer base? Is it a relatively complex feature compared to others in the application? These types of questions can help you determine if canary deployment will be worth it.

It probably won’t be worth it to canary deploy a new field on the customer admin screen. But it might be worth it if you’re adding a major uplift to customer shopping carts.

What Will Be Your Canary?

It is important to know what things in your system you can use to partition features. There are commonly two areas that make great canaries: users and instances.

By User

Most applications have some concept of user. And most applications also make it easy to get certain pieces of information about the user, such as age, gender, and geographic location. You can query this information when running a feature to see if you should show it to that user.

You could partition by geographical region, showing only your Chinese customers a new feature. Or you could even partition on pure percentage, only showing 5% of users the new feature and seeing if your error counts spike or if your responsiveness slows down. Try to choose a partition where trust is high or where the loss of customer trust will have a low impact. Perhaps sales in your Bulgarian market are small enough that a bad release won’t hurt the bottom line too much.
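One common way to implement a percentage partition is to hash each user into a stable bucket, so the same user always gets the same answer. The sketch below assumes users have a string ID and a region code; the function names and the "new-cart" feature name are hypothetical.

```python
import hashlib


def rollout_bucket(user_id: str, feature: str) -> int:
    """Map a user to a stable bucket from 0 to 99 for a given feature."""
    digest = hashlib.sha256(f"{feature}:{user_id}".encode("utf-8")).hexdigest()
    return int(digest, 16) % 100


def sees_feature(user_id: str, region: str, feature: str,
                 percentage: int, regions: frozenset = frozenset()) -> bool:
    """Return True if the user falls inside the canary partition.

    An empty `regions` set means no regional restriction.
    """
    if regions and region not in regions:
        return False
    return rollout_bucket(user_id, feature) < percentage


# Roughly 5% of simulated users should land in the canary group.
users = [f"user-{n}" for n in range(10_000)]
in_canary = sum(sees_feature(u, "BG", "new-cart", percentage=5) for u in users)
print(f"{in_canary / len(users):.1%} of users see the feature")
```

Because the bucket depends only on the user and the feature, raising the percentage from 5 to 20 keeps the original 5% in the canary group and adds new users on top, which makes it easier to correlate impact as you scale up.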

Another idea is to create an early adopters program, letting people opt into new features. Doing this ensures that customers expect some level of disruption and will be more willing to overlook problems. Video game companies have been doing this for years.

By Instance

Separating by users is an easy way to start canary deployment. But if your system is large enough, you can consider using your application and service instances as canaries. If you have multiple instances of your application, you can configure a subset of them to have the new feature. This can be especially useful if you have multiple regional data centers. However, this is often less flexible than partitioning by user.

A good partition is a sliding scale or a set of discrete values. You want to avoid partitions that are only on/off so that you can better correlate impacts as you scale up the feature in your system.

What Infrastructure Do I Need?

If you want to implement the ability to canary deploy in your system, there are a lot of options. The system needs to be able to partition the feature in some capacity, based on what you know will be your canary. You also want to ensure you can change this partition at runtime. This can be homegrown, meaning you can just slap in a database table and a class to take in your user context. You can use your load balancers to route traffic based on regional or user headers in the requests. Or you can save some development time and purchase tooling that will make it easy to set up canaries.
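As a sketch of the homegrown option, here is a database table plus a small class that takes a user context. It uses an in-memory SQLite database so it runs anywhere; the table name, class names, and the "new-cart" feature are all hypothetical, and a production version would need caching and access control.

```python
import sqlite3
import zlib
from dataclasses import dataclass


@dataclass(frozen=True)
class UserContext:
    user_id: str
    region: str


class CanaryFlags:
    """Stores each feature's rollout percentage in a table, editable at runtime."""

    def __init__(self, connection: sqlite3.Connection) -> None:
        self._db = connection
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS feature_rollout ("
            "feature TEXT PRIMARY KEY, percentage INTEGER NOT NULL)"
        )

    def set_percentage(self, feature: str, percentage: int) -> None:
        if not 0 <= percentage <= 100:
            raise ValueError("percentage must be between 0 and 100")
        self._db.execute(
            "INSERT OR REPLACE INTO feature_rollout (feature, percentage) "
            "VALUES (?, ?)",
            (feature, percentage),
        )
        self._db.commit()

    def is_enabled(self, feature: str, user: UserContext) -> bool:
        row = self._db.execute(
            "SELECT percentage FROM feature_rollout WHERE feature = ?",
            (feature,),
        ).fetchone()
        if row is None:
            return False  # unknown features stay off
        # crc32 gives a stable bucket from 0 to 99 for each user and feature.
        bucket = zlib.crc32(f"{feature}:{user.user_id}".encode("utf-8")) % 100
        return bucket < row[0]


flags = CanaryFlags(sqlite3.connect(":memory:"))
user = UserContext(user_id="user-42", region="BG")

flags.set_percentage("new-cart", 0)
print(flags.is_enabled("new-cart", user))  # False

flags.set_percentage("new-cart", 100)      # changed at runtime, no redeploy
print(flags.is_enabled("new-cart", user))  # True
```

The percentage is read on every check, so changing the row changes behavior immediately. That is the runtime control the paragraph above asks for.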

How Do I Know if Something Goes Wrong?

Canary deployments will only be useful to you if you can track their impact on your system. You’ll want to have some level of monitoring or analytics in place in your application. These analytics must correlate to how you’re partitioning your features. For example, if you’re partitioning by users in a region, you should be able to see traffic volume and latency by each region. Some useful analytics are latency, internal error count, volume, memory usage, and CPU usage.
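To show what “analytics that correlate to your partition” might look like, here is a small Python sketch that groups request samples by partition and compares the canary group against everyone else. The sample data, field names, and the 2% threshold are invented for illustration; real numbers would come from your monitoring tool.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class RequestSample:
    partition: str     # e.g. "canary" or "baseline", or a region name
    latency_ms: float
    is_error: bool


def summarize(samples):
    """Compute volume, error rate, and mean latency for each partition."""
    totals = defaultdict(lambda: {"count": 0, "errors": 0, "latency": 0.0})
    for sample in samples:
        bucket = totals[sample.partition]
        bucket["count"] += 1
        bucket["errors"] += int(sample.is_error)
        bucket["latency"] += sample.latency_ms
    return {
        partition: {
            "volume": t["count"],
            "error_rate": t["errors"] / t["count"],
            "mean_latency_ms": t["latency"] / t["count"],
        }
        for partition, t in totals.items()
    }


def canary_looks_unhealthy(summary, max_error_rate_increase=0.02) -> bool:
    """Flag the canary if its error rate exceeds the baseline by the threshold."""
    canary, baseline = summary["canary"], summary["baseline"]
    return canary["error_rate"] - baseline["error_rate"] > max_error_rate_increase


# Hypothetical samples.
samples = [
    RequestSample("baseline", 100.0, False),
    RequestSample("baseline", 110.0, False),
    RequestSample("baseline", 90.0, False),
    RequestSample("baseline", 100.0, False),
    RequestSample("canary", 200.0, False),
    RequestSample("canary", 220.0, True),
    RequestSample("canary", 180.0, False),
    RequestSample("canary", 200.0, True),
]

summary = summarize(samples)
print(summary["canary"]["error_rate"])  # 0.5
print(canary_looks_unhealthy(summary))  # True
```

The same grouping works if the partition key is a region instead of a canary/baseline label.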

Fortunately, it’s easy these days to wire in analytics and monitoring. Google Analytics lets you slap JavaScript on a page header. You can grab open source options for no upfront cost, or you can get great capabilities through purchasing commercial products. If you’re on a cloud platform, many of these metrics are built in. It’s usually not worth building it yourself, but you may want to tweak an existing package according to your needs.

When Do I Release the Feature to Everyone?

As I mentioned earlier, canary-deployed features need to be maintained over time. Eventually, we want to remove the partition completely and let everyone use the feature.

Have a roadmap of how you will release the feature ahead of time, even if it’s a generic roadmap you use for all your canary-deployed features. This will give the team a big and visible end date in sight. They won’t be caught off guard when disruptions happen in the system and they have to triage them. Eventually, you can kill the canary and remove the noise from your code or configuration.

The roadmap should have a timeline of not only when it will end but also how you plan to scale the feature. For example, maybe your roadmap is that you’re going to roll out a new product line first to China, then to India, then to all of Asia, and then to the world. Most importantly, it should have a rollback plan that your team members clearly understand and can handle.
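A roadmap like this can be written down as data so that scaling up and rolling back are explicit operations. The Python sketch below is one possible shape, not a prescribed one. It uses the China, India, Asia, world example from above, leaves out dates for brevity, and assumes that “rollback” means turning the feature off for everyone.

```python
from dataclasses import dataclass


@dataclass
class RolloutRoadmap:
    feature: str
    stages: list       # ordered audiences, smallest first
    current: int = -1  # -1 means the feature is off everywhere

    def audience(self) -> str:
        return "nobody" if self.current < 0 else self.stages[self.current]

    def advance(self) -> str:
        """Move to the next stage, staying put once the final stage is reached."""
        if self.current < len(self.stages) - 1:
            self.current += 1
        return self.audience()

    def roll_back(self) -> str:
        """Rollback plan assumed here: switch the feature off for everyone."""
        self.current = -1
        return self.audience()

    def is_complete(self) -> bool:
        """True once the last stage is live and the canary can be removed."""
        return self.current == len(self.stages) - 1


roadmap = RolloutRoadmap("new-product-line", ["China", "India", "Asia", "World"])
print(roadmap.advance())    # China
print(roadmap.advance())    # India
print(roadmap.roll_back())  # nobody
```

Once is_complete() returns True, the partition has done its job and the canary code and configuration can be removed.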

Focus on Achieving Excellence, Not Avoiding Risk

If you implement canary deployments for your features, you’ll feel a significant mental weight lift off of you. You’ll find yourself thinking less about production outages and disruptions. Instead, you’ll think more about how to push that next exciting feature to your customers.

This post was originally published on the rollout.io blog by Mark Henke.

Blue/Green Deployment: What It Is and How it Reduces Your Risk

Having to take your application offline for updates can be a pain. You can mitigate this with consistent, scheduled downtime, but it’s not something that brings delight to customers. What’s more, some sites can lose thousands of dollars per minute they’re down. There are many reasons an app can go down, but deploying or upgrading your application shouldn’t be one of them! We have a tool we can use to ensure that our deployments create no downtime: blue/green deployments.

What It Is: A Few Wonderful Colors

I don’t know who originally decided to use the colors “blue” and “green.” But the gist is this: you have an instance of your application, a green version, in production. You also have a router that routes your user traffic to the app. You need to get a new version, the blue version, out so that your users can get some new goodies. But you want to ensure that if a user goes to look at one of your screens or presses a button, they can still do so, even while you’re deploying blue. If you can secretly deploy blue while green handles all traffic in the meantime, then you can eventually swap out the connections so that everyone stops going to green and goes to blue instead. So you follow these steps:

You start with the green version in production.

Deploy the blue version to the same environment as the green version. Run any smoke tests, as necessary.

Connect router traffic to the new version alongside the old version (the green version).

Disconnect router traffic from the old version.

Decommission the old version from the environment, if necessary.

Seems pretty straightforward when it’s broken down, but the devil is in the details. Every platform and language has different ways of approaching blue/green deployments, but most have the capability to do it.
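The steps above can be sketched with a toy, in-memory router in Python. The Router class and the blue_green_deploy function are hypothetical stand-ins for whatever load balancer or platform tooling you actually use. They only model the order of operations.

```python
from typing import Callable


class Router:
    """Toy stand-in for a load balancer that routes traffic to app versions."""

    def __init__(self, live_version: str) -> None:
        self.deployed = {live_version}
        self.targets = [live_version]

    def deploy(self, version: str) -> None:
        self.deployed.add(version)  # running, but receiving no traffic yet

    def connect(self, version: str) -> None:
        if version not in self.deployed:
            raise ValueError(f"{version} is not deployed")
        if version not in self.targets:
            self.targets.append(version)

    def disconnect(self, version: str) -> None:
        if self.targets == [version]:
            raise ValueError("refusing to disconnect the only live version")
        self.targets.remove(version)

    def decommission(self, version: str) -> None:
        if version in self.targets:
            raise ValueError(f"{version} is still receiving traffic")
        self.deployed.discard(version)


def blue_green_deploy(router: Router, old: str, new: str,
                      smoke_test: Callable[[str], bool],
                      decommission_old: bool = False) -> bool:
    """Follow the blue/green steps; return True if traffic moved to `new`."""
    router.deploy(new)                # deploy alongside the old version
    if not smoke_test(new):
        router.decommission(new)      # users never saw the bad version
        return False
    router.connect(new)               # both versions take traffic briefly
    router.disconnect(old)            # now only the new version is live
    if decommission_old:
        router.decommission(old)
    return True


router = Router("green")
blue_green_deploy(router, old="green", new="blue", smoke_test=lambda v: True)
print(router.targets)  # ['blue']
```

In this sketch the old version stays deployed unless decommission_old is set, so recovering from a bad release is just another connect and disconnect in the opposite direction.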

How it Reduces Risk

As said above, when we blue/green our deployments, we can deploy without creating application downtime. And when we deploy without downtime, we eliminate or reduce quite a few risks that directly affect our business and our development team.

Here’s what you can enjoy when you eliminate your risk with blue/green deployment:

No Surprise Errors

Put yourself in the mind of your users for a moment. Let’s say you want to order an item. You fill out your billing address and your street address, then you go on to enter your payment information. You agree to the shipping fee and uncheck the “receive spam mail” box. Finally, you press that blessed submit button only to get an error message: “Your order could not be submitted at this time. Please try again later.” And all that precious time filling out your information is lost. If you’re lucky, you get a specific error message like “Application is offline for maintenance.” Most of the time, you get the error message equivalent of ¯\_(ツ)_/¯.

When we blue/green our deployments, we never need this maintenance screen. From your user’s viewpoint, there’s a list of items upon one click, and upon the next click, they see that new menu you added. This will keep furious emails about error screens from flooding your inbox. Let’s give users surprise features, not surprise errors!

Go Ahead, Test in Production!

Often, it’s healthy to ensure your pre-production environments are as close to your production environment as possible. As much as we would like prod to be the same as our QA or staging environment, we don’t always get our way. This can cause subtle bugs in our configurations to seep through. With blue/green, it’s no problem; you can test the app while it’s disconnected from main traffic. Your team can even load test it, if you so desire.

You Accommodate Customers Who Shop at Weird Hours

There’s a constant struggle to find that sweet, sweet deployment window—that time when no one cares. This is tricky, as our customer bases are more global than ever. There’s no longer an internationally good time to do a deployment, especially if you work in an enterprise where the business needs to be running around the clock. If you have a customer-facing application, this means a customer who can’t place an order may place it on some other website. You just lost a sale. If you have an internal application, this means an employee can’t do their job and is actively losing your company money.

By blue/green deploying, you assure your traffic never stops. That customer can place their order just fine without disruption, giving you that sale. That employee overseas can continue to do their job without interruption, saving your company money. The longer your current deploy downtime is, the more valuable this is.

You Get to Sleep Instead of Deploy

We just talked about customers who shop at weird hours. But what about you or your developers—the ones forced to put out fires at those weird hours? Finding the right deployment window can lead to devs doing deployments over the weekend. In extreme cases, it has to be done at four AM or some other absurd hour. I remember being on call and having to wake up because the weekend deployment failed. I was groggy and frustrated, and the furthest thing from my mind was ensuring all the quality checks were in place when I made any fixes. This encourages human error, especially in more manual deployments.

If we apply blue/green, we can deploy whenever we want. More specifically, we can deploy during office hours, when we can bring our full team to bear on any issues that occur. We can deploy while the coffee in our veins is in full effect, giving us that mistake-avoiding brainpower.

Easy Recovery

As much as we like to think we’ve done everything right, sometimes we introduce bugs. We can either spend inordinate amounts of money ensuring deployments will always be defect-free—and still occasionally find them—or we can ensure that when we inevitably find them, we recover quickly and easily. By blue/greening our deployments, we have our older, more stable version of our application waiting to come back online at a moment’s notice, evading the pain of trying to roll back a deployment. This is especially valuable if your deployments have many manual steps.

There Are No Silver Bullets

As great as it is using blue/green deployments to remove downtime, it doesn’t come free. There’s often a cost to supporting two versions of your application at the same time. This most significantly affects your data model, but it can affect other areas as well. I would only suggest applying blue/green when some of the above risks may apply to your application. If you find that none of them do, go ahead and enjoy those simple swap n’ drop deploys.

The Death of Downtime

Blue/green can be an extremely powerful way to reduce pain and risk in your application lifecycle. If you’re the manager of a development team, I encourage you to assess if any of these risks apply or may apply to your application. If you’re a team member for an application but not the main decision maker, you can use these as selling points to convince your manager to institute zero-downtime deployments. Go ahead. Add a couple steps to your pipeline, and watch as your fears and pain melt away.

This post was originally published on the rollout.io blog by Mark Henke.

Beyond Cryptocurrencies: Blockchain Value Proposition and Benefits

Blockchain? Don’t you mean Bitcoin? Bitcoin is fun, but I’m sure by now you’re a little tired of all the hype around it. We all have that one friend determined to get rich off investing in it or that one relative ready to explain the ins and outs of how money works until you fall asleep. Eventually, we stop caring, viewing it as some fad like tulips or a new dotcom bubble. Despite how much people seem to value or devalue it, the real value is in the protocol it’s built on: the blockchain. A blockchain is a secure, append-only log of transactions copied across multiple computers.

An Explanation of Sorts

Let’s break it down a bit. Imagine every person in the US knew everyone else’s ancestry. This ancestry represents our blockchain. Every citizen knows who their parents, great aunts, and second cousins twice removed are. This keeps some shady person from arriving on your doorstep claiming to be your long-lost niece to nab your inheritance.

Now, someone really clever might try to show you a different family tree and tell you yours is wrong. Fortunately, all your neighbors have a copy, so you can just ask them if you have the right one. For a blockchain, this means no one can falsify a transaction. There’s much more to it than this, but hopefully, this helps you understand the concepts beyond the buzzword.
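For the curious, the two ideas in this analogy, an append-only log and neighbors holding copies, can be sketched in a few lines of Python. This is a deliberately simplified toy: it leaves out networking and the consensus mechanisms real blockchains rely on, and all the names in it are made up for illustration.

```python
import copy
import hashlib
import json
from collections import Counter

GENESIS_HASH = "0" * 64


def block_hash(index: int, data: str, previous_hash: str) -> str:
    payload = json.dumps(
        {"index": index, "data": data, "previous_hash": previous_hash},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_block(chain: list, data: str) -> None:
    """Add a record that is linked to the hash of the record before it."""
    previous_hash = chain[-1]["hash"] if chain else GENESIS_HASH
    index = len(chain)
    chain.append({
        "index": index,
        "data": data,
        "previous_hash": previous_hash,
        "hash": block_hash(index, data, previous_hash),
    })


def is_valid(chain: list) -> bool:
    """Recompute every hash; any edit to an earlier record breaks the links."""
    previous_hash = GENESIS_HASH
    for index, block in enumerate(chain):
        if block["index"] != index or block["previous_hash"] != previous_hash:
            return False
        if block["hash"] != block_hash(index, block["data"], previous_hash):
            return False
        previous_hash = block["hash"]
    return True


def majority_head(copies: list) -> str:
    """Return the final hash that most copies of the ledger agree on."""
    heads = Counter(chain[-1]["hash"] for chain in copies)
    return heads.most_common(1)[0][0]


family_tree = []
append_block(family_tree, "Ann is the parent of Bob")
append_block(family_tree, "Bob is the parent of Cy")
neighbors = [copy.deepcopy(family_tree) for _ in range(3)]

# Someone edits an old record in their own copy: the links no longer check out.
tampered = copy.deepcopy(family_tree)
tampered[0]["data"] = "Mallory is the parent of Bob"
print(is_valid(tampered))  # False

# Someone builds a consistent but different tree: the neighbors disagree.
forged = []
append_block(forged, "Mallory is the parent of Bob")
append_block(forged, "Bob is the parent of Cy")
print(is_valid(forged))                                # True
print(forged[-1]["hash"] == majority_head(neighbors))  # False
```

The second case is the “ask your neighbors” step from the analogy: a forged tree can be internally consistent, but it will not match the copy everyone else holds.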

Valuable Areas

Having a distributed ledger that ensures all transactions are recorded and accurate has the potential to reshape many different fields. Anything that deals with transfers, transactions, and ownership has the potential to be touched by a blockchain. Think of how useful it would be to know that every time you made a transaction, it’s accurate and you have a log of everyone who’s touched it!

Ownership, Titles, and Deeds

When I speak of ownership, mortgages immediately come to my mind. If you’ve ever bought a house, you know the paperwork is immense—30 pages or more. A lot of this is to ensure you and the seller are legally willing to change the house’s owner. Groups of people are then paid to double- and triple-check it. With a mortgage blockchain, this could all happen automatically. You can build in smart contracts that use math to validate that both parties agree to the title transfer automatically.

Accounting

Accountants invented double-entry bookkeeping to ensure accounts are balanced properly and to avoid bookkeeping errors. However, when dealing with external accounts, it’s still a manual process. Millions of dollars a year are spent to reconcile and audit the ledgers of companies. With a blockchain, you actually get a fully automated triple-entry system: the buyer, the seller, and the machines validate your transaction by solving a math problem. This could save accountants time and money and save business owners the pain of fees and worry.

Healthcare

The problem of getting someone’s medical records is huge. Doctors and nurses often spend more time dealing with medical records than they do with patients. As a post from Wired explains, imagine that your doctor could easily encrypt and record every visit and treatment you had on a medical records blockchain to which your providers had access. Not only would this save money on storage and retrieval across multiple medical record systems, but your doctors could also spend less time typing and searching for your records and more time treating your illness.

Weird Stuff

The above industries have fairly obvious applications for blockchains. However, there are a couple of areas that seem to push the envelope. Decentralized autonomous organizations are starting to spring up, where no one person or group controls the company. A set of rules and smart contracts control their behavior. Other applications include decentralizing entrepreneurship, which would let people invest in ideas as part of a distributed network instead of something centralized like Kickstarter.

There’s also a niche for artists and artisans who craft, make music or do something unique and creative they want to sell to the world. People are getting tired of moves like Etsy’s payment change, and creatives may look to set up payment systems on a blockchain network like Ethereum.

And…?

No one really knows yet what value blockchains can fully bring. We’re in the early stages of blockchain technology, with developers still working out the kinks. It could be that the most valuable applications are yet to be understood.

Do I Really Care?

This is all shiny, newfangled stuff. But should you care? Honestly, unless you’ve already dived headfirst into the blockchain trend, it might seem that most of the things I’ve discussed will only cause a small ripple in the pond of your life. But this is only the first wave, and perhaps something more impactful for you is down the line.

Even with just the benefits mentioned in this post in place, imagine your life. You want a house, you go visit it, you go online and sign the contract right then and there. It’s time to visit the doctor, and you don’t have to wait 30 minutes. As a plus, your bills are two-thirds the cost. You run a business and the books are overwhelming you, so your AI assistant audits your blockchained transactions and saves you a month’s worth of headaches. And imagine whenever you buy anything on the internet, you just pay for it without having to go through a third-party service such as PayPal.

Eventually, blockchains may become so ingrained in our culture as to become boring. After all, the same thing happened with the internet as we know it today. So maybe don’t invest your kid’s college fund in Bitcoin, but leave your mind open to the theory behind blockchain. It could change what we consider normal, and change it for the better.

This post was originally published on techtowntraining.com

Leadership Approach

A friend suggested I blog about leadership approach, so here we are.

I just started my leadership journey last year. I am still learning, and there is so much to understand and get better at. Here are some things that immediately come to mind when tackling a leadership role.

These tips may be good for leadership roles of all types, but I am focusing on leading a group of people who do creative, knowledge work. This includes fields like software development, plumbing, engineering, accounting, and marketing.

We Need To Care About the Team Over Ourselves

It seems very common for people in leadership positions to view it as a privileged position. I wish I could say this was pure silliness, but I understand the sentiment. You “climbed” the ladder of leadership and earned the position. It makes you feel like you earned a little privilege. Well, down that path lies folly. That path breeds contempt and dysfunction on your team.

Instead, we shine as a leader when we embrace the solitary toil of supporting others. Our team’s success becomes our success. It does not sound satisfying, but I learned there is much joy in seeing someone do better, overcome an obstacle, accomplish something great because of your direct influence.

Let’s leave our egos at the door.

It is All About Intent

Along with the feeling of privilege comes the temptation of giving orders. We feel we are now a “master of our domain” and earned the right to boss others around. That is all well and good, except that it does not work. People who do creative work are competent and need to be freed and equipped, not commanded.

What I find works is to speak my intent in all things. It is one of the greatest tools for getting people to collaborate effectively and get stuff done.

Our Energy Matters

This is not some New Age philosophy, but we as humans put out an atmosphere around us based on our mood. We also absorb other people’s moods. This means when we feel dejected our team is getting a taste of dejectedness. When we feel confident, our team gets a taste of confidence.

Let’s ensure we are aware of our mood and do our best to keep it positive. We want to show confidence and stability to our team. This does not mean we must always be chipper or high energy. It does mean we need to keep our temperament and what we say in check.

We are always “on” as leaders. It is alright to let our team know when things are not optimal or if we are feeling sad about something, but it must be done in a way that lets the team know it will be alright in the end. We must have an energy that lets the team know that we all can overcome our current obstacles.

Beyond just positivity, it is also healthy for us to project stability. If we are running around from task to task like a headless chicken, the team will feel adrift and lost. We should ensure that we are set up to project stability and focus in our team space. We shall not let the unending pile of tasks at our doorstep faze us. To tackle this I like to prioritize and execute one task at a time. When new tasks arrive I put them in a personal backlog and groom their priority later. This allows me to focus and keep my temperament more even.

Turn Everyone Into Leaders

If we are good enough we can run a team well, giving orders from on high and ensuring everything is strategized properly. This will work for a while, until we leave or go on vacation. Then everything falls apart. Not only this, but we are very likely to burn out from all the attention we must give to these things.

In a team of creative workers we can instead turn each member into a leader. A sustainable, high-performing team does not have room for people who only follow. Most creative workers already have initiative and the desire to do good work. Alas, bad bosses, education systems, and corporations have beaten this desire out of many. We need to free our team members and give it back to them.

We can free them by getting them to critically think about the problems they regularly solve and to feel safe to solve them in different ways. We need to transform our team from asking us permission to informing us of their intent. We need to shift from answering their questions to asking them how they would solve the problem or how they would answer the question. We need to let them make suboptimal solutions. And once we do all this we need to equip them with the tools and training to make optimal solutions.

This is hard and finding balance for each team member is ever a struggle, but the payout is huge. We will have a high performing team that can run without our constant presence and will continue to perform well after we move on.

We Must Be Diligent

We as leaders are loaded guns. We have the ability to do great good in the team space or great harm. I hope these ideas help you do great good in yours.

This post was originally posted at The Simple Efficiencer blog.

Scrum Master is a Marketing and Sales Role, Not a Managerial One

I recently saw this tweet:

The during Sprint Planning the Scrum Master forced them to create a story for writing the house. Forced them to use the “as a ??? I want to…” format. Then forced them to task that story and estimate the size /2

— Jeff Morgan (@chzy) December 8, 2017

and it spurred me to write this post. Note the word “forced” used multiple times here. This Scrum master likely believes they need to control and manage their team. However, I believe an effective Scrum master (or product manager) role is a marketing and sales role.

Even after being shown to be false, the myth pervades that creative workers need to be managed and made to follow certain orders. It honestly surprises me that we still need to write on this topic.

In order to explore this idea more deeply, let’s define what marketing and sales in the team space are:

Marketing – Making known to others that you have something potentially valuable.
Sales – Knowing what someone values and explaining how what you have is either valuable to them or not.

Note that this may not seem like a typical definition of sales. Sales gets a bad rap from all those used-car salesman tropes and the like. Ethical sales requires very little convincing, instead requiring a deep understanding of what your potential users value. Ethically selling sometimes leads us to choose against selling to a user, a negative sell. We must be alright with this happening.

Let’s look at how the Scrum master serves the development team, according to scrum.org:

Coaching the Development Team in self-organization and cross-functionality.
Helping the Development Team to create high-value products.
Removing impediments to the Development Team’s progress.
Facilitating Scrum events as requested or needed.
Coaching the Development Team in organizational environments in which Scrum is not yet fully adopted and understood.

I see coaching in here twice. Coaching is often focused around introducing new patterns and solutions to the team that may solve their current problems. This falls in line with our definition of marketing. One key to successful coaches is that they can explain how a potential solution is valuable to the team members at their level of competence. That matches our definition of sales.

Helping the Development Team to create high-value products is so generic that we could shoehorn in how marketing and sales fit into this, but I don’t believe it is valuable.

There are other things listed above that the Scrum master should do that do not fit the marketing/sales mold. That is alright. A Scrum master or product manager is not only a marketing and sales role, but it is much more so these things than a managerial role. The only thing that could possibly be seen as managerial is facilitation of Scrum events, but that would be a warped definition of facilitation. I find the best facilitation is one where the facilitator is as objective as possible, not displaying opinions or taking sides in discussions during these events.

To force the team is to lose their trust and a loss in trust loses performance. If you put on a marketing and sales hat during many of your activities as a Scrum master or product manager, you will see significant gains in team ownership and performance. As we described above, marketing and sales can allow us to be fully authentic, honest and transparent. We need not apply them as they are traditionally viewed.

A Newborn Baby as a Finite State Machine, Part 3

In the previous post we created a grid of states, triggers, and transitions. This is a great format to transcribe to a computer, such as your automated test suite. However, it is not easy for humans to reason about or to onboard someone on how a newborn baby works. Fortunately we can easily translate this into a Finite State Machine diagram using a tool like Draw.io.

As you can see, even with our simplifications a newborn baby is quite a complex state machine. I would imagine, if using TDD in isolation, we are all but guaranteed to miss some transitions. But despite the complexity we are able to easily account for all scenarios through this exercise. As a bonus we can also manually trace through any flow with this diagram, allowing us to perform sanity checks.

Another potential benefit beyond performing TDD in isolation is that we don’t need an overabundance of integration or end-to-end tests for the different flows, or series of transitions. Using a structured approach of triggers, transitions, and states mathematically guarantees that any combination of flows can work, assuming each transition has been unit tested.
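
As a rough illustration of "each transition has been unit tested", here is a minimal Python sketch of a table-driven check. It is not the code behind this series: the `next_state` function is a hypothetical implementation under test, the enum and helper names are my own, and only the four unconditional rows of the grid from the previous post are covered to keep it short.

```python
from enum import Enum, auto


class State(Enum):
    QUIET_ALERT = auto()
    ACTIVE_ALERT = auto()
    CRYING = auto()
    DROWSINESS = auto()
    QUIET_SLEEP = auto()
    ACTIVE_SLEEP = auto()


class Trigger(Enum):
    HUNGER = auto()
    DIRTY_DIAPER = auto()
    LACK_OF_COMFORT = auto()
    SUDDEN_LIGHT_OR_NOISE = auto()


# Short aliases, in the same order as the grid columns.
QA, AA, CR, DR, QS, AS = State


def next_state(state: State, trigger: Trigger) -> State:
    """Hypothetical code under test (unconditional rows only)."""
    if trigger is not Trigger.SUDDEN_LIGHT_OR_NOISE:
        return CR  # every unhappy event leads to Crying
    if state in (QS, AS):
        return DR  # a sleeping baby is only stirred into Drowsiness
    return CR if state is CR else AA


# Expected destinations, copied row by row from the grid.
EXPECTED = {
    Trigger.HUNGER: (CR, CR, CR, CR, CR, CR),
    Trigger.DIRTY_DIAPER: (CR, CR, CR, CR, CR, CR),
    Trigger.LACK_OF_COMFORT: (CR, CR, CR, CR, CR, CR),
    Trigger.SUDDEN_LIGHT_OR_NOISE: (AA, AA, CR, AA, DR, DR),
}


def check_every_transition() -> int:
    """Assert one expectation per grid cell and return how many were checked."""
    assert set(EXPECTED) == set(Trigger), "every trigger needs a row"
    checked = 0
    for trigger, row in EXPECTED.items():
        assert len(row) == len(State), "every state needs a cell"
        for state, expected in zip(State, row):
            actual = next_state(state, trigger)
            assert actual is expected, f"{trigger.name} in {state.name}"
            checked += 1
    return checked


print(check_every_transition())  # 4 triggers x 6 states
```

Because the expectations are just the grid, adding a state or trigger without filling in its cells fails the completeness assertions before any behavior is checked.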

One thing we did not cover was side effects. For example the baby may smile when you touch her but it does not affect her overall state. Side effects are useful to annotate and know about but are not necessary to effectively model the program.
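
One way to annotate side effects without letting them leak into the state is to attach them to the transition itself. The sketch below is only an assumption about how that could look in Python: the `Transition` type and the `touch` helper are hypothetical names, and which states the baby smiles in is my guess for illustration. Only the unconditional cells of the Physical Touch row are included.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Dict, Tuple


class State(Enum):
    QUIET_ALERT = auto()
    ACTIVE_ALERT = auto()
    CRYING = auto()
    DROWSINESS = auto()
    QUIET_SLEEP = auto()
    ACTIVE_SLEEP = auto()


@dataclass(frozen=True)
class Transition:
    destination: State
    # Annotations only: a side effect never changes the destination state.
    side_effects: Tuple[str, ...] = ()


# Physical Touch row, unconditional cells only (the Crying cell is conditional).
# The "smile" annotation on Quiet Alert is an illustrative assumption.
PHYSICAL_TOUCH: Dict[State, Transition] = {
    State.QUIET_ALERT: Transition(State.QUIET_ALERT, ("smile",)),
    State.ACTIVE_ALERT: Transition(State.ACTIVE_ALERT),
    State.DROWSINESS: Transition(State.DROWSINESS),
    State.QUIET_SLEEP: Transition(State.QUIET_SLEEP),
    State.ACTIVE_SLEEP: Transition(State.ACTIVE_SLEEP),
}


def touch(state: State) -> Transition:
    """Look up what touching the baby does in the given state."""
    return PHYSICAL_TOUCH[state]


result = touch(State.QUIET_ALERT)
print(result.destination.name, result.side_effects)
```

The state machine itself only ever reads `destination`, so the side effects can be added or removed without touching the model.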

I hope this demonstrates how we can tackle complex programs by modeling them as finite state machines.

A Newborn Baby as a Finite State Machine, Part 2

In the previous post I introduced why modeling software as a Finite State Machine can be useful and used a newborn baby as an example. This post will fill in the transitions for the baby.

I realize I missed two key triggers; hopefully this does not reflect on my real-life parenting ability. Feeding the baby and changing the diaper should have an impact on the state of a newborn, so let’s include those. That gives us the following grid of states and triggers:

	Quiet Alert	Active Alert	Crying	Drowsiness	Quiet Sleep	Active Sleep
Hunger
Dirty Diaper
Lack of comfort
30 min interval
Sudden light/noise
Physical Touch
Feed
Change Diaper

Let’s fill in more of the obvious transitions:

	Quiet Alert	Active Alert	Crying	Drowsiness	Quiet Sleep	Active Sleep
Hunger	Crying	Crying	Crying	Crying	Crying	Crying
Dirty Diaper	Crying	Crying	Crying	Crying	Crying	Crying
Lack of comfort	Crying	Crying	Crying	Crying	Crying	Crying
Time passes
Sudden light/noise
Physical Touch
Feed
Change Diaper

For simplicity’s sake let’s assume that a baby will always immediately cry when an unhappy event occurs. This is a human being and we cannot expect a program to fully represent the nuances of even a newborn, so this is good enough for now. It is common to need to simplify things for the sake of understanding the program, in any case. It is valuable to simplify the unimportant details for the sake of the important ones.

Next let’s look at what happens as time passes:

	Quiet Alert	Active Alert	Crying	Drowsiness	Quiet Sleep	Active Sleep
Hunger	Crying	Crying	Crying	Crying	Crying	Crying
Dirty Diaper	Crying	Crying	Crying	Crying	Crying	Crying
Lack of comfort	Crying	Crying	Crying	Crying	Crying	Crying
Time passes	Active Alert	Drowsiness	Drowsiness	if was asleep, Quiet Alert else Quiet Sleep	Active Sleep	Quiet Sleep
Sudden light/noise
Physical Touch
Feed
Change Diaper

We are saying that a baby will start quiet, become quite active, and then tire herself out into drowsiness until falling asleep, again a simplification. The final simplification we have here is that a baby will get tired if she cries enough without any intervention (not recommended in real life). Note that we added our first conditional transition, for when the baby is drowsy. The baby can be drowsy while waking up or while falling asleep, so we need to know in which “direction” the baby is going to make a decision. Too many conditional transitions can indicate that we need to refactor our states to be clearer. In this case we are fine for now.
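
To make the conditional transition concrete, here is a minimal Python sketch of the "Time passes" row. It assumes the "direction" is handed in as a hypothetical `was_asleep` flag; how that flag gets tracked is left out, and the names are mine rather than anything from a real codebase.

```python
from enum import Enum, auto


class State(Enum):
    QUIET_ALERT = auto()
    ACTIVE_ALERT = auto()
    CRYING = auto()
    DROWSINESS = auto()
    QUIET_SLEEP = auto()
    ACTIVE_SLEEP = auto()


# Unconditional cells of the "Time passes" row.
TIME_PASSES = {
    State.QUIET_ALERT: State.ACTIVE_ALERT,
    State.ACTIVE_ALERT: State.DROWSINESS,
    State.CRYING: State.DROWSINESS,
    State.QUIET_SLEEP: State.ACTIVE_SLEEP,
    State.ACTIVE_SLEEP: State.QUIET_SLEEP,
}


def time_passes(state: State, was_asleep: bool = False) -> State:
    """Return the destination state after time passes.

    `was_asleep` is a hypothetical flag for the baby's "direction":
    True if she reached Drowsiness from a sleep state (waking up),
    False if she is on her way to sleep.
    """
    if state is State.DROWSINESS:
        # The one conditional cell in this row.
        return State.QUIET_ALERT if was_asleep else State.QUIET_SLEEP
    return TIME_PASSES[state]


print(time_passes(State.DROWSINESS, was_asleep=True).name)   # waking up
print(time_passes(State.DROWSINESS, was_asleep=False).name)  # falling asleep
```

If more cells start needing flags like this one, that is the signal mentioned above that the states themselves may need refactoring, for example by splitting Drowsiness in two.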

Next, let’s cover what I would categorize as the “crying solving” triggers. Most parents are familiar with this checklist of things to check when a baby is crying:

	Quiet Alert	Active Alert	Crying	Drowsiness	Quiet Sleep	Active Sleep
Hunger	Crying	Crying	Crying	Crying	Crying	Crying
Dirty Diaper	Crying	Crying	Crying	Crying	Crying	Crying
Lack of comfort	Crying	Crying	Crying	Crying	Crying	Crying
Time passes	Active Alert	Drowsiness	Drowsiness	if was asleep, Quiet Alert else Quiet Sleep	Active Sleep	Quiet Sleep
Sudden light/noise
Physical Touch	Quiet Alert	Active Alert	if discomfort, Quiet Alert else Crying	Drowsiness	Quiet Sleep	Active Sleep
Feed	Quiet Alert	Active Alert	if hungry, Quiet Alert else Crying	Drowsiness	Drowsiness	Drowsiness
Change Diaper	Quiet Alert	Active Alert	if dirty diaper, Quiet Alert else Crying	Drowsiness	Drowsiness	Drowsiness

We added more conditional transitions as the baby could be crying for different reasons. Note that physical touch does not normally wake the baby, but feeding and changing do. We may not have known this when we first took her home, but after being around her we now know how she will respond. This goes back to ensuring you gain the domain knowledge to answer these questions. If you don’t know up front, that is fine. The willingness and ability to find answers or to make assumptions and move on is one thing that separates efficiencers from average software developers.

Finally let’s fill in that last trigger:

	Quiet Alert	Active Alert	Crying	Drowsiness	Quiet Sleep	Active Sleep
Hunger	Crying	Crying	Crying	Crying	Crying	Crying
Dirty Diaper	Crying	Crying	Crying	Crying	Crying	Crying
Lack of comfort	Crying	Crying	Crying	Crying	Crying	Crying
Time passes	Active Alert	Drowsiness	Drowsiness	if was asleep, Quiet Alert else Quiet Sleep	Active Sleep	Quiet Sleep
Sudden light/noise	Active Alert	Active Alert	Crying	Active Alert	Drowsiness	Drowsiness
Physical Touch	Quiet Alert	Active Alert	if discomfort, Quiet Alert else Crying	Drowsiness	Quiet Sleep	Active Sleep
Feed	Quiet Alert	Active Alert	if hungry, Quiet Alert else Crying	Drowsiness	Drowsiness	Drowsiness
Change Diaper	Quiet Alert	Active Alert	if dirty diaper, Quiet Alert else Crying	Drowsiness	Drowsiness	Drowsiness

We can see from observing the baby for a while that any sudden noise or light will “jump start” the baby into an active alert state in most cases, unless she is crying or asleep.
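
The completed grid can be transcribed almost cell for cell. What follows is a minimal Python sketch under a few assumptions of mine: the `Context` fields (`was_asleep`, `crying_cause`) stand in for whatever the conditional cells need to know, the caller is responsible for supplying them, and all names are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Callable, Dict, Optional, Union


class State(Enum):
    QUIET_ALERT = auto()
    ACTIVE_ALERT = auto()
    CRYING = auto()
    DROWSINESS = auto()
    QUIET_SLEEP = auto()
    ACTIVE_SLEEP = auto()


class Trigger(Enum):
    HUNGER = auto()
    DIRTY_DIAPER = auto()
    LACK_OF_COMFORT = auto()
    TIME_PASSES = auto()
    SUDDEN_LIGHT_OR_NOISE = auto()
    PHYSICAL_TOUCH = auto()
    FEED = auto()
    CHANGE_DIAPER = auto()


@dataclass(frozen=True)
class Context:
    """Hypothetical extra facts that the conditional cells depend on."""
    was_asleep: bool = False
    crying_cause: Optional[Trigger] = None


# A cell is either a fixed destination or a function of the context.
Rule = Union[State, Callable[[Context], State]]

# Short aliases, in the same order as the grid columns.
QA, AA, CR, DR, QS, AS = State


def drowsy_direction(ctx: Context) -> State:
    return QA if ctx.was_asleep else QS


def soothed_if(cause: Trigger) -> Callable[[Context], State]:
    # Crying stops only when the action addresses the actual cause.
    return lambda ctx: QA if ctx.crying_cause is cause else CR


def row(*cells: Rule) -> Dict[State, Rule]:
    assert len(cells) == len(State), "every state needs a cell"
    return dict(zip(State, cells))


TABLE: Dict[Trigger, Dict[State, Rule]] = {
    Trigger.HUNGER: row(CR, CR, CR, CR, CR, CR),
    Trigger.DIRTY_DIAPER: row(CR, CR, CR, CR, CR, CR),
    Trigger.LACK_OF_COMFORT: row(CR, CR, CR, CR, CR, CR),
    Trigger.TIME_PASSES: row(AA, DR, DR, drowsy_direction, AS, QS),
    Trigger.SUDDEN_LIGHT_OR_NOISE: row(AA, AA, CR, AA, DR, DR),
    Trigger.PHYSICAL_TOUCH: row(QA, AA, soothed_if(Trigger.LACK_OF_COMFORT), DR, QS, AS),
    Trigger.FEED: row(QA, AA, soothed_if(Trigger.HUNGER), DR, DR, DR),
    Trigger.CHANGE_DIAPER: row(QA, AA, soothed_if(Trigger.DIRTY_DIAPER), DR, DR, DR),
}


def next_state(state: State, trigger: Trigger, ctx: Context = Context()) -> State:
    rule = TABLE[trigger][state]
    return rule if isinstance(rule, State) else rule(ctx)


hungry = Context(crying_cause=Trigger.HUNGER)
print(next_state(CR, Trigger.FEED, hungry).name)           # QUIET_ALERT
print(next_state(CR, Trigger.CHANGE_DIAPER, hungry).name)  # CRYING
```

Keeping the table as data means the same structure can drive both the implementation and the expected values in a test suite.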

This is a great format to transcribe to a computer, such as your automated test suite. However, it is not easy for humans to reason about or to onboard someone on how a newborn baby works. In the next post we will see the resulting diagram that we can use to help us and our coworkers reason about newborns.

A Newborn Baby as a Finite State Machine, Part 1

TDD is great, but sometimes I need an extra oomph in my design to help model a non-trivial problem. I find that modeling things as finite state machines helps me achieve this.

A finite state machine is a system where a set of inputs, called triggers, can cause the system to transition from one state of being to another state of being.

Why is this a useful way to model behavior?

It constrains me to think in terms of a limited set of abstractions: triggers, states, transitions and side effects.
It exposes edge cases I may not have thought of through pure TDD.
It frees me from thinking about things as a sequential series of events, which other models such as flowcharts can sometimes trap us into doing.

Before I model the state machine I ensure I understand the domain well enough to know the potential states and triggers. A state can be one value (like an enum) or a combination of values. A trigger can be any event, user action, or time passing. Here are the steps I usually take:

I start by making a grid out of these states and triggers, then filling in the transitions (destination State) and any side effects.
Then I diagram the state machine visually using Draw.io or a similar tool so that my pair partner or other devs can easily understand the solution being developed. I try not to underestimate the cost of knowledge transfer on a team.
Finally, I test-drive out the functionality in the code itself. Even though I modeled the scenarios ahead of time, this lets me flesh out those nitty-gritty details, like what the triggers look like, the surface area of the state machine, how it needs to fit into the codebase, etc.

Let’s take an example. I just had a beautiful baby girl, so let’s use a newborn baby as a state machine. A book I am reading, The New Father, says that there are 6 states to a newborn baby: Quiet Alert, Active Alert, Crying, Drowsiness, Quiet Sleep, Active Sleep.

Transitions are a bit trickier and require us to infer a bit, as it will be with most problems we are solving for a customer. Crying is a (well known) state and all parents likely know that babies cry due to a few things: they are hungry, they have a dirty diaper, or they need comfort. The latter covers a range of comfort needs, like temperature change and parental affection, but I lumped them together for simplicity. From the book we know that Quiet Sleep and Active Sleep switch every 30 minutes, so time passing is another trigger. Other triggers may include sudden light or noise and physical contact. Also note that transitions can be either conditional (only happening some of the time, based on trigger context) or unconditional (always happening when the trigger occurs).

So now we have a grid we can set up:

	Quiet Alert	Active Alert	Crying	Drowsiness	Quiet Sleep	Active Sleep
Hunger
Dirty Diaper
Lack of comfort
30 min interval
Sudden light/noise
Physical Touch
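
The empty grid above could be set up in code along these lines. This is a minimal Python sketch, not a prescribed implementation: the enum member names and the `empty_grid` and `unfilled_cells` helpers are hypothetical, and `None` simply marks a cell whose transition has not been decided yet.

```python
from enum import Enum
from typing import Dict, List, Optional, Tuple


class State(Enum):
    QUIET_ALERT = "Quiet Alert"
    ACTIVE_ALERT = "Active Alert"
    CRYING = "Crying"
    DROWSINESS = "Drowsiness"
    QUIET_SLEEP = "Quiet Sleep"
    ACTIVE_SLEEP = "Active Sleep"


class Trigger(Enum):
    HUNGER = "Hunger"
    DIRTY_DIAPER = "Dirty Diaper"
    LACK_OF_COMFORT = "Lack of comfort"
    THIRTY_MIN_INTERVAL = "30 min interval"
    SUDDEN_LIGHT_OR_NOISE = "Sudden light/noise"
    PHYSICAL_TOUCH = "Physical Touch"


Cell = Tuple[Trigger, State]
# One entry per (trigger, current state) pair; the value is the destination state.
Grid = Dict[Cell, Optional[State]]


def empty_grid() -> Grid:
    """Build a grid with every cell present but not yet filled in."""
    return {(trigger, state): None for trigger in Trigger for state in State}


def unfilled_cells(grid: Grid) -> List[Cell]:
    """List the cells that still need a decision."""
    return [cell for cell, destination in grid.items() if destination is None]


grid = empty_grid()
print(len(grid))  # 6 triggers x 6 states

# Filling in a cell is a plain assignment.
grid[(Trigger.HUNGER, State.QUIET_ALERT)] = State.CRYING
print(len(unfilled_cells(grid)))
```

Listing the unfilled cells is a cheap way to see which scenarios have not been thought through yet.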

In the next post in the series we will fill in the transitions and diagram the machine.