From the outset, some models/frameworks inform (or become embedded in) a founder’s way of seeing and approaching challenges. Some are invented. Some are borrowed from industry and deftly recalibrated to fit a startup’s unique context. All are hired for particularly perplexing situations and jobs to be done.
This Relay interview series is a vehicle to document some of those perplexing moments, the promising models that followed, and the change they brought about - documentation that, we hope, will help fellow founders probe some of their own early-stage obstacles more methodically.
In this first exchange, Tactiq’s co-founder and CEO, Ksenia Svechnikova (@ksenia), ably profiles safe-fail probes: a decision-making tool they deploy for sensing the “unknown unknowns - things you can’t predict that could damage your start-up.”
The Cynefin decision-making framework, in particular safe-fail probing, has been a big part of our journey at Tactiq.io. It’s how we prioritize our product roadmap, run growth experiments, and bring on contractors and freelancers. We’ve relied on it to develop the strategy behind our 20x YoY growth in 2020-2021, scaling Tactiq to 180,000+ users in under a year.
When we’re planning and making decisions, we ask the team: “What are the risks? What are the unknowns here? Could something unpredictable happen?” We use the answers to decide on a course of action based on the framework. This practice helps us minimize risks and take safe bets as we grow, whilst still moving fast where we can.
In the Cynefin framework, there are four domains: 1 - simple, 2 - complicated, 3 - complex, and 4 - chaotic (which we avoid). We loosely categorize actions into these domains before greenlighting anything.
Simple startup decisions are those where the action has a clear expected outcome and little chance of a costly surprise. For example, choosing to spend $50 on Facebook ads to test cost-per-click for Tactiq in a specific geography - we knew what outcomes to expect. In this domain, founders can apply best practices and standard procedures with predictable results.
Complicated startup decisions are where you need expertise or analysis to understand the cause and effect. The design of Tactiq’s data storage model, for example. Data security and storage are complicated with many interrelated processes. Luckily for us, our CTO Alex is an expert in data security and understands data storage implications. When complicated decisions come up, we logic-check them with a domain expert (internal or external) before doing anything.
Lastly, there is the complex domain. These are the decisions involving unknown unknowns - things you can’t predict that could damage your start-up. For us, new acquisition strategies (like TikTok), changing onboarding email flows for thousands of new users, and setting a Zoom integration live are all complex-domain decisions. All of them have unforeseeable reactions that pose risks if not tested for.
For these decisions, we rely on safe-fail probes.
They are very small-scale experiments that test our approach from different angles in small, safe-to-fail ways. We might send 100 emails before 10,000, or roll out a feature to 100 users and observe what happens. These probes surface any unforeseeable issues with an approach (e.g. a new onboarding email causing user churn) in contained, low-risk ways.
Sticking to this approach for all of our feature rollouts, integrations, and growth strategies has allowed Tactiq to scale rapidly whilst mostly avoiding costly mistakes.
We were experimenting with TikTok as an acquisition channel. A TikTok creator from Mexico found our extension through one of our paid TikTok collaborations and created a video that went viral across LatAm, where Spanish is the primary language. But at the time our Chrome extension didn’t support Spanish. So we had an enormous spike in new sign-ups, which created a tidal wave of support tickets from frustrated users who couldn’t use Tactiq to transcribe Google Meet calls in Spanish.
It was an “ahhhh” moment - we needed to be able to test for these types of unexpected outcomes in controlled ways. Up until that point, we had not anticipated Spanish-speaking users from our TikTok experiment, as our sponsored videos were solely in English. We weren’t ready for that new sign-up volume. There were a few weeks of frantically responding to an onslaught of Intercom support tickets, all in Spanish, toggling between Intercom and Google Translate.
It was an unforeseeable outcome we weren’t prepared for, and it created enormous user churn and a resource drain for our team.
Our fix began with deciding that safe-fail probes would be a mandate for us moving forward. My co-founder Nick became our de facto safe-fail probe administrator.
I think this was crucial - having a co-founder take ownership ensured we had accountability for applying the framework.
We implemented a blanket rule: at every planning meeting, for every decision, we go through a simple risk-checking process based on the framework. We ask our team - for this action, are there any possible risks we don’t know about? Are there any unknown variables? Is there anything we’re unsure about? This process has been great for us because it lets us crowdsource which domain the decision falls into.
If a decision or action is in the complex domain, our team then spitballs a list of risks and unknowns: “changing the share-button position might cause a few users to churn”, “we don’t know how users will react to an increase in Intercom messages”, or “promoting Tactiq on this channel might cause community backlash due to the anti-advertising sentiment”.
We then agree on a tolerable level of risk. Are we ok if we lose a few users (1-2%)? How many angry community members would it take to damage our brand? What’s tolerable?
These qualitative inputs (possible risks and unknowns) and quantitative thresholds (what we can tolerate) form our hypothesis: “Changing the share-button position should not cause more than 2% of users to churn.” Armed with this statement, we set about testing.
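As a rough illustration (the function name and numbers here are hypothetical, not Tactiq’s actual tooling), the go/no-go check a probe produces can be sketched as:

```python
# Hypothetical sketch of evaluating a safe-fail probe: compare observed
# churn in a small probe group against the agreed risk tolerance.

def evaluate_probe(probe_size: int, churned: int, tolerance: float) -> bool:
    """Return True if the probe stayed within the tolerated churn rate."""
    observed_rate = churned / probe_size
    return observed_rate <= tolerance

# Hypothesis: "Changing the share-button position should not cause
# more than 2% of users to churn." Probe: 100 users, 1 churned.
safe_to_proceed = evaluate_probe(probe_size=100, churned=1, tolerance=0.02)
print(safe_to_proceed)  # True -> safe to roll out wider; False -> review first
```

The point is that the tolerance is agreed before the probe runs, so the result is a binary accept/review rather than a post-hoc judgment call.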
All of this happens on the spot in planning meetings as we’re reviewing/prioritizing tasks for an upcoming period. So we embed the framework as a final review before committing to anything. A final safety check before launching anything new.
The scale and volume of tests are proposed by the person responsible for testing (who has the domain expertise) and agreed on by our founders. We use the smallest possible sample size that is still statistically relevant. That could be 100 emails for email marketing, 10,000 users for a new language roll-out, or $100 for paid ads. Our rule of thumb is: how many people need to interact with this to cover all our bases?
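One way to sanity-check “smallest sample that is still statistically relevant” (our illustration, not a formula the team cites) is the standard sample size for estimating a proportion at a given confidence and margin of error:

```python
import math

def min_sample_size(p: float = 0.5, margin: float = 0.1, z: float = 1.96) -> int:
    """Minimum sample size to estimate a proportion p within +/- margin
    at ~95% confidence (z = 1.96). p = 0.5 is the most conservative choice
    when you have no prior estimate of the rate you're measuring."""
    return math.ceil(z**2 * p * (1 - p) / margin**2)

# To detect effects within +/- 10 percentage points at 95% confidence:
print(min_sample_size())  # 97
```

A margin of +/- 10 points gives roughly the “100 emails” scale mentioned above; tightening the margin to 5 points pushes the requirement toward a few hundred participants, which is why smaller tolerances call for larger probes.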
Sometimes that means testing 3 iterations of an update to our email notifications with lists of 100 users. Or finding the smallest possible investment, say $50 at a time, for a new channel and trying it out 3-4 times. We even beta-test most features with groups of 10-50 before pushing them live - even whilst rolling out 3 new features a week.
After the tests, ideas are either accepted as safe to proceed (no unforeseen negative outcomes occurred) or flagged as unsafe and needing review (e.g. users were unexpectedly frustrated by the changes). The safe-fail probes give us a binary evaluation of whether we can proceed with a possibly risky action.
They also surface unexpected positive outcomes. For example, we ran a safe-fail test for an email marketing campaign, tested with 200 users in North America from one company. A few weeks later their Head of Accessibility reached out to us about a tender for a company-wide plan of our product. It was completely unexpected, but a great result.
It’s time-consuming. Running multiple small safe-fail probes for each idea takes time, and slows down the implementation of new things. We build faster than we can test.
As we implemented this framework, we found that it’s easy to overstretch the team across too many tests at once. At one point we were running growth safe-fail probes on email campaigns, Intercom messaging, new product features, and an email marketing onboarding sequence.
Now we’ve scaled back, limiting safe-fail probes to one core area each week (e.g. growth probes, product probes).
Eventually, our team naturally started to adopt and operate in safe-fail probe thinking during meetings. Every new idea, channel, or campaign would be accompanied by proposed probes to test for unexpected outcomes.
If you have a brilliant new idea and you’re excited about its potential, run some safe-fail probes before over-committing. We probably spent 2-3 months cleaning up the mess created when our TikTok experiment inadvertently led to sign-ups from users who couldn’t use the product, creating a ticketing backlog.
Also, we’re launching Tactiq 2.0 on Product Hunt on the 30th of September and would appreciate any feedback or support! If you have any questions regarding this framework, feel free to ask us here!