What Rules? The Great Hustle of Exploratory Testing

The tournament will span for about 1 hour with each registrant competing as an individual throughout 3 different games requiring varying levels of skill, speed, and accuracy.   

The line above is, in its entirety, the rules for a darts tournament held earlier this year in Boston during our Go-to-Market Kickoff (GTMKO) here at SmartBear. This event brings in SmartBears from all over the world to learn, inspire, and—in my case this year—fight like hell to win an employee darts tournament against all odds. 

Those odds, to me, were very low. I hadn’t actually played darts in nearly 20 years, and I’d be competing against a large number of my far younger colleagues visiting from our office in Galway, Ireland.  

When I received the rules above in my inbox, they looked familiar. They reminded me of the vague, detail-starved sets of requirements that software testers are so often handed and tasked with using to “assure” the quality of a piece of software. (Note: “assure” is in quotes because that’s not actually what testers do, or should even say they can do. It’s a long story.) 

Requirements, by their very nature, will always be incomplete. Exactly how incomplete is a bit subjective. Some testers may receive a set of requirements to test against and excitedly say, “Got it! I know exactly where to start and where to stop.” Others may sigh and wish, again, that they’d been invited to be a part of the requirements gathering process.  

And there are other testers who, to some degree, welcome brief, vague requirements, because it gives them more room to go exploring, knowing that what they’ll likely find outside the straight-and-narrow rules of the road will be far more intriguing and thought-provoking than anything they’d find by trying to stay on it. 

The insufficiency of rules and requirements

On the afternoon of the tournament, I pulled up the rules one last time to make sure I hadn’t missed anything that could disqualify me or lessen my chances of winning. Then, my English-major/major-grammar-nerd brain went to work: 

“The tournament will span for about 1 hour” 
I have no clue what time the tournament will start or end.  

“Each registrant will compete as an individual” 
Makes sense. 

“…Throughout 3 different games” 
I really wish I knew what the games would be, but that’s okay. They’ll tell us at some point. Right? 

“…Requiring varying levels of skill, speed, and accuracy” 
Hold on. What? How could the tournament require “varying levels” of those three things to win? Will “poor” or “low” levels of skill, speed, and accuracy be rewarded equally with those that are “great” or “high”? Maybe there’s a prize for the worst score? But how could low “speed” even be measured? Will someone get a prize for the fewest darts thrown? I’m slow! Maybe I’ll win! 

A quick—but brief—loss of faith

As someone who is often unreasonably competitive, even in competitions I have little to no faith in being able to win, I grew restless, and almost a little nauseous, as we got closer to leaving for the tournament. It was being held at an upscale darts establishment called Flight Club, a little over a half-mile from the hotel where we were all staying during GTMKO. Given that it was in the single digits outside that evening, many of us opted for the complimentary shuttle to avoid dying of frostbite on the walk. Some did walk and did not die of frostbite, but I’m sure I would’ve. 

I believe it was the third time the shuttle returned to our hotel that I was able to squeeze on board. By the time our van arrived, Flight Club was packed. Nearly every dartboard already had a crowd playing on it. Many bullseyes were hit, high-fives and chest bumps were doled out, and pint glasses were emptied. 

If the competition had already started, I was definitely behind. By the time I grabbed a pint glass of my own, ate a couple of hors d’oeuvres, and found a lonely, open dartboard in the corner of the room, an hour had passed.  

“Oh, well,” I thought. The night would still be a blast, and it felt kind of nice not to have to worry about trying to adhere to tournament rules I couldn’t make sense of anyway. Plus, maybe I’d get that award for “lowest accuracy” or “slowest pace of play” that, at the time, I still believed might be a thing. 

Screw the rules

My Irish colleague, Frank, made his way over at some point and we decided to play a match against each other. Much to my surprise, this Irish individual was not humiliatingly better at darts than I was. That’s not to say he wasn’t good; he was! But, so was I! As an eternal down-player of, and excuse-provider for, my own actual talents and skills, I was shocked that I wasn’t terrible.  

At some point during the impossibly loud evening, a Flight Club employee grabbed a microphone, and, while he tried his best to shout over the noise of hundreds of dart-flinging SmartBear employees, not much of what he said was understood by anyone, other than that the tournament was now starting. For a moment, I thought about running up to him and asking where to sign up, how scores would be tabulated, which three games we’d be playing, and whether there was any way to find out which of those “varying levels of skill, speed, and accuracy” would qualify for an award. I had no problem being vain enough to throw the competition by playing far worse than I was actually capable of if a prize of any actual monetary value, or bragging rights, were on the line. 

I also remembered how much fun I was having by not caring about the rules. I thought about how poor or insufficient software requirements are not only “the norm” for many testers; they’re also the last thing that would keep any tester who loves their job from the thing they love most—simply testing. 

So, Frank and I continued to play each other exactly as we had before the start of the tournament. He would win, and then I would win. I’d occasionally manage to win two straight, and then he would do the same. This went on for dozens of games. Eventually, some friends dropped by to join us for some group competitions, and while Frank and I played for different teams, we each continued to finish in first, second, or third place in each game. 

Reality check

The competition died down at some point over in our little corner, and some from our group left to chat with others at the party. I felt a tap on my shoulder. It was the employee from earlier who’d announced the start of the tournament. Even with him standing right next to me, it was almost impossible to hear him. 

“Just so you know, I’m about to announce the current leaders of the competition!” he said, excitedly. 

“Oh, OK. Sounds…good. This is a really cool spot. Thanks for having us.” 

“I just wanted to make sure you’ll be listening when I announce who’s currently leading.” 

“Um, I’ll try? I really couldn’t hear you at all the last time you were on the mic, but that’s okay, go for it.” 

“I’ll just show you now in case you can’t hear me.” 

I had no idea why he was so insistent on me knowing this information earlier than everyone else. He then held up a cocktail napkin showing the top five scores at that point in the competition. 

Frank and I were not only tied for first place, but we were also absolutely destroying every…other…player. 

“There’s NO way that’s right. How in the world do we have that many points?” I asked. 

“It IS right. You get three points for every first-place finish, two for second place, and one point for third. I just looked at the numbers. They’re right. Anyway, I’m going to go announce it to everyone else; just wanted to let you know first! Great job! Keep it up and you’ll win this thing!” 

You know that scene in the movies that’s done so often, where someone gets some sort of life-changing news, and everything taking place around them goes silent and the camera dramatically zooms in on the person’s dumbstruck face?  

That’s exactly what happened at that moment. 

The entire bar went silent, a truly miraculous feat for a place that loud. I could barely make out the muffled, unintelligible shouts of the Flight Club employee, but I did see him emphatically pointing in our direction, I think I heard my name, and then I instantly snapped back to reality. I rushed to Frank. 

“We’re WINNING. By…A LOT,” I said to Frank, not in disbelief, but with a fully coherent understanding of how it had happened. 

“How’s that even remotely possible? We’re doing…okay…I would say, but there MUST be others here doing better than us. Some of these guys can really play,” Frank replied. 

“It doesn’t matter. It’s based on everyone’s total number of first, second, and third-place finishes. That’s IT. You and I are the only ones who’ve been playing with only two people to a game for almost the entire night. It’s not that we’ve played ‘better’ than everyone else; it’s that we’ve probably played 10 times as many GAMES as everyone else.” 

Don’t fall for “acceptance”

It suddenly made perfect sense to both of us. And so, we did what anyone with that information at that time would’ve done. We played as many games as humanly possible in the remaining time of the competition, and we continued to volley first and second-place finishes because those were the only places either of us could’ve finished. Even “last,” it turns out, was still worth two points. 
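
If you want to see just how lopsided that math gets, here’s a minimal sketch of the scoring in Python. The 3/2/1 point values come straight from the napkin; the game counts are my own illustrative assumptions.

```python
# A minimal sketch of the tournament's scoring math. Point values are
# from the story (3 for first, 2 for second, 1 for third); the game
# counts below are illustrative assumptions.

POINTS = {1: 3, 2: 2, 3: 1}

def night_score(finishes):
    """Total points for the night, given each game's finishing place."""
    return sum(POINTS.get(place, 0) for place in finishes)

# Two players trading wins across 30 head-to-head games: even the loser
# of every game finishes second and banks 2 points.
duo = night_score([1, 2] * 15)     # 15 firsts + 15 seconds = 75 points

# A stronger player stuck in six-person groups who wins every one of
# the 5 games they manage to squeeze in.
group_star = night_score([1] * 5)  # 5 firsts = 15 points

print(duo, group_star)  # 75 vs. 15: volume beats skill under these rules
```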

And, at the end of the night, we finished with the exact same number of points, miles ahead of every other competitor. 

Exploratory testers find themselves in these sorts of situations all the time. It’s not that they won’t play by the rules, it’s that they know there are entire worlds of possibilities and invaluable learnings outside of the rules. As the most curious and most-comfortable-amongst-unfamiliar-territory testers on your team, they have to be given the time to explore what others either won’t or would never think to.  

Rules are important. In sports, breaking them often comes with some sort of justifiable repercussion or punishment. But merely following the rules will hardly get an athlete or team rewarded. Likewise, in software, acceptance criteria are important. Just remember that meeting acceptance criteria merely makes you “adequate.”  

Lots of companies are willing to be adequate, so let them. Explore what will make you exceptional. There may be a medal in it for you. 

There’s no shortage of answers; what we need are more questions 

There’s a great meme I recently came across, where a wall clock is (insert your own adjective here: intriguingly, maddeningly, hilariously) situated directly behind some sort of pole or support beam that blocks a number of numbers from view. And so, rather than move the clock to another location (because moving the beam seems like far more work, and work that might damage the structural integrity of the building), someone has painted the blocked numbers 8, 9, and 10 directly onto the beam that’s blocking them. 

The meme’s humorous caption reads, “Bug fixes in production”: a play on the fact that when software bugs slip past the developers and testers and a quick “fix” is needed, that fix may come across as insufficient, hasty, or sloppy. When this happens, the bug might be gone, but the damage to your reputation or brand can often remain. 

When I saw this meme, I quickly grabbed a screenshot and threw it into our “Corkboard” Slack channel, a catch-all for random water-cooler-type discussions. I then asked what I hoped people would mistake for a “simple” question.

Did this fix the bug? 

I gave people options of “Yes,” “No,” and a shrugging emoji to connote “I don’t know.” 

The final results? 

Yes – 2 
No – 6 
IDK – 3 

I told people not to worry, that there were no “wrong” answers, but I was definitely hoping for more than 3 shrugs. Okay, I was actually hoping for all shrugs. I’m an optimist.

Why was my hope for all shrugs? Because we don’t even know what “the” bug is; how can we confidently determine if it was fixed?  

A variety of comments left in response provided a wealth of insight into the kinds of judgments about problems, and their presumed solutions, that are made in the software world all the time. I picked out a handful that I believe could provide valuable food for thought when brought to light during design, development, testing, and even production at any software organization.

A: “No. The glare prevents me from seeing when it’s ‘5 until the top of the hour.’”

I love this answer. It was the first one that came in, and only seconds after I shared the meme with my colleagues. Remember when your teachers would hand you a graded assignment and an answer (or, in my case, many answers) would simply be marked with an “X?” Or, maybe you would ask your parents to review your homework before you turned it in, and they’d say, “You might want to look again at #5.” 

How many times did you stare at that answer and wonder what the determiner of its quality had found so wrong about it? Furthermore, weren’t you so proud of yourself for quickly finding that “bug” within your answer, and then excitedly making what you just knew was the correction it required, and then spinning it back around to your parents, only to have them say, “I didn’t mean that. Look again.” Crushed. 

This is exactly what’s happened in the clock meme, and what happens frequently between those who find software bugs—whether they be customers or employees—and those who are expected to fix them, when too little information or clarity is shared between the two. 

My colleague’s issue with this clock was not the 8, 9, and 10 covered up by the pole. His primary issue was the glare that covered up what should’ve been a perfectly visible, unobstructed 11. Someone went through the time, effort, and sunk cost of all that painting, but to the customer, the bug remains. 

A: “The most upsetting part is how wonky the clock is.”

By the measure of tacked-on emojis, this was the most popular complaint, and declared bug, amongst my colleagues. One replied with a scathing, “I cannot unsee this.” To these focus group “customers,” again, the issue was not the obstructed 8, 9, and 10. The bug was simply the fact that the clock still needed to be turned slightly to the left so that the 12 and 6 sat straight up and down.  

How costly would that bug fix be in comparison to the time and money that would’ve potentially been spent on the paint job? We actually can’t say for certain! This picture doesn’t give us a lot of info. Maybe the clock couldn’t just be turned. Perhaps the screw hole on the back of the clock is in the wrong place, and this is as straight as it will hang without some serious modifications, none of which the team has the tools for. Perhaps the clock is not operated by battery and is wired to the building’s electricity (this is very common in schools) and the clock has been bolted to the wall so that the wiring stays undisturbed. On the other side, perhaps this is in an art classroom, the paint was already in plentiful supply, and it took the art teacher not even two minutes to complete the paint job. 

From “the cost of applying this fix vs. that fix,” to “we can apply this fix now…but this fix is going to have to go into the backlog,” to “which fix might break something else entirely,” there are a million discussions to be had and decisions to be made when it comes to determining what actions will improve quality and what won’t. 

A: “We should also make all the small numbers much bigger to improve accessibility.” 

This answer…hits a little too close to home. Anyone else? I can’t tell you the number of press releases, blogs, speaking sessions, etc. that I’ve written countless drafts of, and sought so much feedback on during development…only to be told by someone after it was published, “You know what would’ve been awesome? If you’d said…” 

And, they were exactly right.  

Furthermore, even if I’m able to incorporate their feedback so that the missing component is there the next time someone reads or hears a piece I’ve authored, there’s no recreating the original experience for those who received v1. 

When should we discuss, determine, and build quality, accessibility, security, etc. into our products? Before we release them? Or, only after we’ve built, pushed to production, sold, and distributed our products to our customers? That way, they can then tell us, “I wish I’d have known the numbers were going to be this small. The wall this clock has to go on is so far from the students that nobody’s going to be able to read them. How long will it take to get my refund?” 

We hear all the time that the mean time to repair (MTTR) a defect in pre-prod is exponentially cheaper than once software is in production. That will always be true, which is why quality, performance, scalability, security, and accessibility should be baked in from the start, not only after our customers have pointed out their absence. 

A: “It’s a ‘workaround,’ but an unnecessary and unwanted one. I could tell the time anyway and it’s just made it apparent that you are aware that there is a bug and couldn’t be bothered to fix it properly. As a user, I’m also very bothered that the clock wasn’t at least straightened in the course of making the bodgy workaround and I’m left wondering what other sloppy shortcuts have been taken on quality.” 

This is my favorite response of the bunch. I’ve always loved getting into heated discussions around “Who should ‘own’ quality?” but my favorite debate is around a far more important question: “Who gets to define, or determine, quality?” For anyone who would like to begin preparing your retorts, I am staunchly in the “your customers” camp. 
 
Not the business, not management, not your developers, not your testers, not your analysts, nor anyone else who looks at what you’ve built and then establishes a score, a ranking, or even just a general level of satisfaction (or lack thereof) with your products. 

It’s your customers whose voices and opinions matter most. As my colleague stated here, your customers may not view your repair as a “fix” at all. It’s merely a workaround in their eyes, and a “bodgy” one (used in Australia to mean “worthless or inferior”) at that. This customer is not only not impressed with what you’re telling them is a “fix,” they’re even further bothered by the fact that you didn’t straighten the clock while you were up there painting that “unwanted” addition. 

Did you tell them you couldn’t straighten it because the clock is bolted into the wall? Did you ask them if the paint job would fix the problem before you went to the trouble of slapping it on there? As they said, “I could tell the time anyway,” so who was this fix even for? 

How many times has a bug made it into production, where the immediate directive in an organization is, “Just get some (fix, workaround, edit, statement) out there ASAP”? Many of you may have experienced this painful reality firsthand. As my colleague so brilliantly points out, while we might think we’ll win the customer over with how quickly we responded, that customer may think to themselves just as quickly, “As a user…I’m left wondering what other sloppy shortcuts have been taken on quality.” 

How often do your customers provide you with feedback? More importantly, how often, and when do you ask them for it? And, maybe most importantly, what do you do with that feedback when you receive it?

Let’s all start asking more questions. 

Don’t ignore the happy path; know that it was never there

I shall be telling this with a sigh
Somewhere ages and ages hence:
Two roads diverged in a wood, and I—
I took the one less traveled by,
And that has made all the difference.

In “The Road Not Taken,” poet Robert Frost describes an interesting scenario. Frost’s poem paints a beautiful picture of someone who stood before “two roads diverged in a yellow wood,” and who felt “sorry I could not travel both.” And, as the closing stanza above tells us, they eventually “took the one less traveled by, And that has made all the difference.” We’re not told what that difference was, or whether the traveler chose the “right” or “wrong” path. We only know that the decision was important enough that they see themselves still telling this story “ages and ages” into the future.

This is one of my favorite poems, for many reasons, and I recently came to realize how similar it is to the types of decisions we make every day in the software industry. We, as software testers, developers, architects, and UX designers, regularly find ourselves at metaphoric forks in the road, wandering through colorful aging woods of internal and external hierarchies, and, like the traveler in Frost’s poem, we have important decisions to make. Will we be drawn to the familiarity and presumed comfort of the paths that were chosen by the most before us? Or will we venture down the other? The one that looks like it has perhaps never been chosen, and therefore has a far less predictable endpoint?

Many organizations feel forced to take, and then re-take, the popular, proven, recognizable paths that they, and the companies they’re competing with or aspire to be, have taken so many times before. This obligation is often due to the sink-or-swim nature of our industry, and the fact that we can find ourselves constantly teetering between feast and famine. “The way we’ve always done it” comes with a natural boost of predictability, something every organization is looking for these days, and something individual contributors are often rewarded handsomely for providing.

And, thus, the “happy path” was born.

“A happy path? Sign me up!” you might say to yourself, and, at first glance, why wouldn’t you? Imagine yourself standing before two roads. One of them is well lit, and maybe you’ve even taken it before. You know how long it will take to reach the end, maybe you can even see the end from where you’re standing, and you’re completely confident in what will be waiting for you when you get there. Let’s pretend there’s even a cute sign at the trailhead telling you that this is the happy path. The other road…is different in every conceivable way. It’s overgrown, poorly lit, and you can’t see anything past a sharp turn only a few steps in, leaving you with little confidence in how long it will take you to walk it, whether anyone has ever walked it before you, or whether they, or you, will even reach the end.

It’s only natural that more people would choose the sign-described happy path in the scenario above, and in the real world of the software industry, as well. We can’t just go wandering off down shadowy paths in the woods with no estimated time we’ll return. After all, we have places to be and deadlines to make. We owe people things, like new features, bug reports, KPIs, and slide decks.

There’s just one problem. There is no happy path. Not in the woods, and not in your apps. It’s all in your head.

Wait…what?

(Scene from The Matrix, Warner Bros. Pictures)

In The Matrix, Keanu Reeves’ character, Neo, is perplexed when a young child sitting across from him is able to bend a metal spoon in his hand using only his mind. As Neo tries his hardest to bend the spoon in his hand, using only his mind, the child tells him:

“Do not try to bend the spoon — that’s impossible. Instead, only try to realize the truth: there is no spoon.”

This line was pretty deep stuff back in 1999. We all sat there, like Neo, pretty confused, asking ourselves, “How is there not a spoon? It’s…right there in his hand.” The same can be said for your application. You certainly can design, build, and test a path that takes the user where you want them to end up, and you should! But there’s a very important reason that path shouldn’t be your sole focus: your users don’t have one. There is no single path that all users will take, or will even want to take. No matter how clearly marked you try to make your path, you cannot guarantee they’ll see it. No matter how efficiently one path takes your users from Point A to Point B, you cannot guarantee your users will prefer it. And no matter how much effort you put into hiding the presence or potential appeal of other paths, you cannot prevent your users from finding something attractive about them, especially those who are looking for a completely different outcome (maybe even something nefarious) than the one you’re hoping they achieve or are able to provide.

It’s only once Neo accepts that there is no singular spoon in his hand—perhaps even that “a spoon that will only ever be used by its creators for its originally intended purposes” is also a myth—that he observes the object, and his reflection inside it, in an entirely different light.

Visibility ≠ Predictability

The trusty iceberg image above has made its way into roughly 9 million software industry slide decks and blogs over the years. In this story, the small percentage of ice above water represents a (not the) happy path of our application. It’s the part of our app that we’re most familiar with, that’s most easily measured, that was most thoroughly tested, and that will almost certainly be the most trodden upon. The rest…not so much. It’s dark, it’s surrounded by freezing water, it’s much harder to access or traverse, and it’s far less likely to ever be seen, or measured, or tested. But, wait. Is it possible that those descriptions are simply your opinions, and not those of your users? You might find what lies below the surface “dark,” “freezing,” and “difficult to access,” but do they? And are we talking about the users you have today or your potential future users?

Unfortunately, using the famous image above as a metaphor for comparing software applications to icebergs is pretty disingenuous. Are we gifted this much visibility into our applications on Day One? Never. A much more accurate representation of what we know about our applications, and how we think users are interacting with them, is below.

When compared to the previous image, how much of this iceberg is visible? Much less, right? What a shame! But how much of this iceberg is observable? Just as much as what’s shown in the last image. For a dose of “frigid waters” reality, how many software organizations ignore observability and focus only on what’s most easily visible, and on what they believe their users can most easily see or will most sensibly desire? I believe it’s far too many.

You can see why visibility is so appealing. It’s inherently predictable, and it requires so little effort. It’s like the aforementioned fabled happy path: “I can see this happening, right now. It’s happening over and over, and therefore, I can predict, to a fairly certain degree, what will happen next.” But observability never goes away. It’s always right there, too. It requires much more effort, but it results in greater visibility, and then, you guessed it, predictability, the thing we know the rest of the business is looking for.

  1. We have users/seals!
  2. We designed this iceberg to facilitate and encourage the lounging that we know seals like to do, and they look pretty content. Go us!
  3. One appears to be either having a drink of water or they might be about to leave. We should keep our eye on that one to see if/when they return, how long it takes them to return, and if others behave similarly.
  4. We can see some smaller chunks of ice floating around ours, and they look a little small. I don’t think we have to worry about our users/seals abandoning our app/iceberg for those others.

Far-too-early-prediction: Because we built an iceberg for seals to enjoy, and they’re currently enjoying it, we predict that seals will continue to enjoy our iceberg.

Based on this small amount of visibility, can we actually predict the above with any real confidence at all? We can’t, because we haven’t observed nearly enough. Just because something isn’t currently visible doesn’t mean it isn’t observable.

What is observable, and would lend to much greater predictability?

  • Are there other icebergs with far more seals lounging on them? Why is that? What features do they have?
  • Are seals easily accessing our iceberg from where we assumed they would, or are they experiencing any difficulties doing so?
  • What percentage of our iceberg lies below the surface of the water? Enough to keep our seals trusting that the portion they lie on will remain stable and intact?
  • What impact are the air and water temperatures having on the state of our iceberg?
  • Are fish, crustaceans, and other seal prey available in the waters surrounding our iceberg?
  • Are pollutants or seal predators in the area?
  • Should we be looking at icebergs for walruses? Penguins? Eco-tours? Research labs?

And, perhaps most importantly:

How often are we observing and measuring these things, and how quickly can we design, build, test, and deploy changes when warranted?
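
In software terms, observing and measuring these things means instrumenting the parts of the app nobody is looking at yet. Here’s a minimal sketch of that idea; every event name and field in it is hypothetical, and a real telemetry pipeline would replace the print:

```python
# A minimal sketch of instrumenting beyond the visible path: emit a
# structured event for every path users actually take, not just the
# one we designed. All event names and fields here are hypothetical.

import json
import time

def emit_event(name, **fields):
    """Record one structured event. A real system would ship this to an
    observability backend; printing stands in for that here."""
    print(json.dumps({"event": name, "ts": time.time(), **fields}))

# Visible: the flow we designed, watched, and tested.
emit_event("checkout.completed", path="designed", steps=4)

# Observable only if we bother to look: the flows we never designed.
emit_event("checkout.abandoned", last_step="coupon_form")
emit_event("search.zero_results", query_len=42)
emit_event("error.swallowed", handler="legacy_fallback")
```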

Happiness is subjective…and in a constant state of flux

When we think about the definition of words, and why we have them, the actual definition of the word “definition” sums it up pretty nicely:

“the act of defining, or of making something definite, distinct, or clear”

Which is why I find Wikipedia’s definition of “happy path testing” so humorous:

“Happy path testing is a well-defined test case using known input, which executes without exception and produces an expected output.”

Our users’ input is never fully “known.” 

There is no permanent state of executions “without exception.”

“Expected outputs” fail to be actual outputs all the time.

And any test cases written, requirements gathered, and UX feedback collected were all conducted at a single point in time—a time that changes just as frequently as those test cases, requirements, feedback, and outcomes. And what I, personally, find most “definite, distinct, or clear” is that the mythical happy path, no matter how well-marked or defined it may be, will always offer less visibility than the entire woods that surround it.
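
To make that concrete, here’s a minimal sketch of a textbook happy path test sitting next to the inputs it quietly ignores. The parse_clock function is hypothetical, invented purely for illustration:

```python
# A hypothetical, illustrative example: a "well-defined test case using
# known input" alongside the unknown inputs it never exercises.

def parse_clock(text):
    """Parse 'HH:MM' into (hour, minute). Hypothetical example code."""
    hour, minute = text.split(":")
    return int(hour), int(minute)

def test_happy_path():
    # Known input, executes without exception, produces expected output.
    assert parse_clock("10:30") == (10, 30)

# Inputs real users will eventually supply, none of them "known" above:
#   parse_clock("25:99")   -> returns (25, 99), a nonsense time
#   parse_clock("10:30pm") -> ValueError from int("30pm")
#   parse_clock("")        -> ValueError (nothing to unpack)
#   parse_clock("1:2:3")   -> ValueError (too many values to unpack)
```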

In closing, the greatest level of predictability can be found not in what is immediately obvious, but in allowing yourself, and encouraging others, to explore an entire world of opportunities and potential dangers outside of what may be clearly visible today.

As the boy says to Neo once he begins to accept that perhaps there is no spoon:

“Then you’ll see that it is not the spoon that bends, it is only yourself.”

And that will make all the difference for you, traveler.