Twitter and Slack Product Leader on Eliminating Doubt from Decision-making

Facebook’s News Feed. LinkedIn’s 'Who’s Been Viewing Your Profile' feature. And now Twitter’s move toward 280 characters per tweet. Before these features saw the light of day, there were long debates. Take it from Twitter Group PM and Slack Director of Core Product Tess Rosania (formerly known as Paul), who has made a career out of building the interfaces that millions of people use each day. Even for products that aren’t that widely experienced (yet), a hint of a feature change will often spawn impassioned arguments between proponents and traditionalists on product teams. Regardless of the final decision, there’s always a fine dusting of doubt on every feature choice that gets released, leaving product teams wondering: What if we made the wrong call?

That feeling is nothing foreign to Rosania, who’s spent the last decade building software that she hopes brings people closer together. Today, Rosania leads Slack’s Core Product team, which is responsible for messaging, channels, files and the other parts of Slack used by six million daily active users and 43% of the Fortune 500. Previously, she led the Timelines Team at Twitter, where she spearheaded the transition to an algorithmically ranked feed. And as a Product Co-op partner, she works with a small group of other product leaders to advise and invest in founders starting companies at the earliest stage.

In this exclusive interview, Rosania dissects four key product changes that made her sweat — and the tactics she used to power through the doubt that accompanied them. Whether you’re a platform for millions or just getting off the ground, tweaking or adding features can be paralyzing acts of faith — but they don’t have to be. Read on for Rosania’s advice on how to root out doubt that stems from launching and changing product features.

Beyond a Shadow of a Doubt

There's a lot of contradictory advice around whether products should be built to resonate with or change user behavior. The answer is likely a blend of the two, so the debate rages on. But while that happens, this pendulum swing can have adverse effects on product teams as they try to bridge the gap between what customers expect and what they might want or need. Here are four major feature changes and how Rosania navigated their attendant doubts.

Twitter: Chronological vs. Ranked Timeline

Not every product change prompts Sam Seaborn/Chris Traeger to speak out. But for Rosania, there was measured thinking behind the product decision — one that she felt was bound to happen. “It was inevitable that a ranked timeline was going to be more successful than a purely chronological one,” says Rosania. “First, people only read at best one out of ten tweets in their timeline — and usually it's a lot fewer than that. Second, quality is distributed randomly in regular reverse chronological order. You’re probably not getting higher quality Tweets right when you check every day. You’re getting a random sample, and it’s far more likely to be average. So why wouldn’t we swap in a few more interesting things here and there? Even the smallest tweak is bound to be an improvement.”

Part of the challenge was countering misconceptions from users. “A common counterpoint was: ‘Reverse chronological order is important because it feels fresh.’ People would say, ‘Twitter is about what's happening right now — you ruin it by showing me a Tweet from three days ago,’” says Rosania. “But that's the thing, using an algorithm doesn’t have to mean showing you a Tweet from three days ago. What about a Tweet from 30 minutes ago, if 500 Tweets have happened since then? Let's pull that one forward and send something from ten minutes ago farther back. Every substitution is a small improvement and maybe you get to a tenfold improvement in engagement through continued substitutions like that.”
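The substitution argument above can be made concrete with a toy model. The sketch below is only an illustration under stated assumptions: the `Tweet` class, the made-up `quality` scores, the 30-minute freshness window and the two-tweet reading budget are all hypothetical, and none of it reflects how Twitter's ranking actually works.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tweet:
    minutes_ago: int  # age of the tweet
    quality: float    # hypothetical relevance score between 0 and 1


def chronological(tweets):
    """Newest first, ignoring quality."""
    return sorted(tweets, key=lambda t: t.minutes_ago)


def ranked(tweets, max_age_minutes=30):
    """Pull the best recent tweets forward; older tweets stay in time order."""
    recent = [t for t in tweets if t.minutes_ago <= max_age_minutes]
    older = [t for t in tweets if t.minutes_ago > max_age_minutes]
    return sorted(recent, key=lambda t: -t.quality) + chronological(older)


def average_quality_read(timeline, read_count):
    """Average quality of the tweets a reader actually gets through."""
    seen = timeline[:read_count]
    return sum(t.quality for t in seen) / len(seen)


# Twenty recent tweets with arbitrary made-up scores, plus one excellent
# tweet from three days ago that the freshness window keeps out of the top.
tweets = [Tweet(minutes_ago=i, quality=((i * 7) % 10) / 10) for i in range(1, 21)]
tweets.append(Tweet(minutes_ago=3 * 24 * 60, quality=1.0))

READ_COUNT = 2  # roughly one in ten of the recent tweets
chrono_avg = average_quality_read(chronological(tweets), READ_COUNT)
ranked_avg = average_quality_read(ranked(tweets), READ_COUNT)
print(f"chronological: {chrono_avg:.2f}, ranked: {ranked_avg:.2f}")
```

In this toy data the reader sees the same number of tweets either way, all of them from the last half hour, but the ranked slice has a higher average score. Whether real relevance scores are good enough to deliver that is exactly the kind of supporting claim that needs testing.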

If it seems as if Rosania has this logic on the tip of her tongue, it’s because it was top of mind in hundreds of conversations over the year and a half before Twitter’s ranked timeline went live after she left. Here are a few tactics that she used to check her intuition — and challenge her doubts — before the new timeline was launched.

Quantify the firing squad.

With the ranked timeline, some of her most fervent detractors badged in with her every day. “Even internally at Twitter, people were resistant to the change as we explored it. Everyone at Twitter is sharp, curious and willing to share their perspective,” says Rosania. “And of course, many are power users. They feel ownership over what they’ve helped build, both their personal timelines and Twitter itself. So, no surprise that many thought about timelines as these things that are carefully crafted, and would essentially tell me: ‘How could you come into my house and rearrange my stuff?’”

Long email threads led these types of emotional responses to fester rather than resolve. “It turns out that the written form is often a challenging medium for arguments between human beings — especially in a large company where people don’t always have a personal connection, and there may not be preexisting common ground,” says Rosania. “So I set aside dedicated time to meet in person and discuss. I welcomed everyone in the company. If people had feedback, I told them to come and talk.”

You want to separate the firing squad from the mob — those who act on the level of justice, not anger.

These firing squad sessions were not always comfortable. In fact, the thought of them could unnerve Rosania. “Obviously it’s really important to speak with people who disagree with you in a way where you're open to feedback and you're listening. But here were these really smart people who were working to make Twitter successful and thought I was a complete idiot. It can be unsettling,” says Rosania. “The sessions helped with that. In addition to getting feedback, I quantified how many people felt strongly enough to set aside time to debate versus take a quick shot over email. Office hours facilitate that and serve as an initial gut check. If you're a startup of ten, hopefully you don't have five people who think your proposal is stupid. But at a larger company (Twitter was probably 3,000 to 4,000 people then) you might find 100 who’ll come forward.”

Plant a hypothesis tree.

Rosania felt so strongly about her idea that she wanted to see it either weatherproofed or withered away after talking to her internal opposition. That meant not just scheduling, but also structuring her sessions with colleagues. “One tool I use a lot but don't hear discussed often enough is the hypothesis tree. If you're not familiar with it, the idea is that you have something you believe and, tiered below that, you list the supporting beliefs that contribute to that being true,” says Rosania.

“So, in this case, I believed that a machine-ranked timeline would be a more effective way to consume what's happening in the world. Contributing to that belief was the notion that improvements in machine learning and AI were generating better relevance algorithms than they used to. Also included in that belief was the idea that most people don't read everything that's put in front of them, and that there are things that are interesting to them that they won't see through my follow graph. These are all independently verifiable hypotheses, and evidence for or against them strengthens or weakens the core theory.”
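For readers who think in data structures, a hypothesis tree can be sketched as a small recursive type. This is one possible representation, not a tool Rosania describes using: the `Hypothesis` class, its `supported` flag and the `untested_leaves` helper are hypothetical names, and the claims are paraphrased from her description above.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Hypothesis:
    claim: str
    supported: Optional[bool] = None  # None means "not yet tested"
    children: List["Hypothesis"] = field(default_factory=list)

    def untested_leaves(self) -> List["Hypothesis"]:
        """Return the lowest-tier claims that still need evidence."""
        if not self.children:
            return [self] if self.supported is None else []
        leaves: List["Hypothesis"] = []
        for child in self.children:
            leaves.extend(child.untested_leaves())
        return leaves


tree = Hypothesis(
    "A machine-ranked timeline is a more effective way to consume what's happening",
    children=[
        Hypothesis("Relevance algorithms are better than they used to be"),
        Hypothesis("Most people don't read everything put in front of them"),
        Hypothesis("People miss interesting things outside their follow graph"),
    ],
)

# Marked as supported purely for illustration; real evidence would come
# from measurement, not from editing a flag.
tree.children[1].supported = True

for leaf in tree.untested_leaves():
    print("Still to test:", leaf.claim)
```

Listing the untested leaves is a way of keeping a debate at the lower tiers of the tree, where claims can be checked one at a time.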

The power of the hypothesis tree structure is that it not only can weaken or strengthen your beliefs logically, but also help you separate logical concerns from change aversion and other fears around your idea.

“So say I’m working off the categorical point that I don’t read everything that’s put in front of me. I may argue that, if you're only going to read 10% of your overall timeline, I can give you a better 10% than the random 10% you’d see normally,” says Rosania. “That’s just a simple mathematical conjecture. But what people often say in response is, ‘You’re not going to be able to create an experience that I enjoy. I don't think computers are better at this than I am. You're taking away control.’ Suddenly we’re talking about the fear of lack of control. We shifted away from the hypothesis and onto an argument about fear. The questioning changed from ‘Is this possible?’ to ‘Do I like this?’ Hypothesis trees help you catch that substitution, and focus the debate. You can say, ‘Alright, so if we can get the algorithms to produce a good experience, will that resolve your fear? Or is there another hypothesis hidden here?’”

Rosania seeks conversations that move her down the hypothesis tree, not hovering at the top. “If my hypothesis is ‘I believe that a machine-ranked timeline will be a more effective way to consume what's happening’ and the concern is ‘No machine can know me as well as I know myself,’ we’re not deep enough,” says Rosania. “My goal, in debating a product change, is to get past our gut reactions and into testing the validity of the supporting points down the hypothesis tree. Focusing on lower tiers of the tree cools some of the emotions and makes discussion more productive.”

A hypothesis tree helps dissect doubts. It’s simply a structure for an overarching thesis and supporting points.

Hypothesis trees are not just an exercise to confirm your idea from the start; they can also be used to iterate and tweak existing products. For example, they helped Rosania concede that the compactness and information density of Twitter is actually beneficial. “We had a top-level hypothesis about how we handled photos in the timeline, and a teammate raised the point that tall photos take up vertical space and decrease scrollability. This was a new supporting hypothesis for our tree. I've always believed that rich media is a really important attribute of social media. Humans are visual, and social networks that exploit that, such as Instagram and Facebook, tend to grow faster than ones that don't,” Rosania says. “I took that as an axiom, but at the time, we’d never proven that showing more, taller photos on Twitter was actually better.”

Through A/B tests and other experiments, Rosania’s team did eventually conclude that taller tweets caused people to read less of their timeline, and updated their top-level hypothesis as a result. But the specific results aren’t the point. “The key is breaking down your beliefs and identifying theses that are worth testing. In this case, my belief turned out to be more rooted in faith than actual science,” says Rosania. “The turning point for this potential feature was the hypothesis tree discussion. We drilled down to something concrete and testable, and what we learned tweaked our worldview.”

Slack: Do Not Disturb

Slack launched its Do Not Disturb (DND) feature in late 2015. “We decided to turn Do Not Disturb on for all users by default, so notifications would be off between the hours of 10 p.m. and 8 a.m. in each member’s declared time zone,” says Rosania. “At the time, we had almost 2 million DAUs [daily active users] and over $50MM ARR [annual recurring revenue]. All to say, there was significant business risk. It’s one thing to do that as part of your first version of a product. It's much harder to do when you're affecting existing customers.”
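The default described here is a quiet window that wraps past midnight, which is easy to get wrong in code. The sketch below shows one way to express it, under loud assumptions: the function name `in_do_not_disturb` is hypothetical, a fixed UTC offset stands in for a member's declared time zone (real time zones have daylight saving rules this ignores), and nothing here is Slack's implementation.

```python
from datetime import datetime, time, timedelta, timezone

DND_START = time(22, 0)  # 10 p.m. local time
DND_END = time(8, 0)     # 8 a.m. local time


def in_do_not_disturb(moment_utc: datetime, utc_offset_hours: int) -> bool:
    """True if the moment falls inside the member's local quiet window."""
    member_zone = timezone(timedelta(hours=utc_offset_hours))
    local_time = moment_utc.astimezone(member_zone).time()
    # The window crosses midnight, so it is the union of
    # "at or after 10 p.m." and "before 8 a.m.".
    return local_time >= DND_START or local_time < DND_END


noon_utc = datetime(2015, 12, 1, 12, 0, tzinfo=timezone.utc)
print(in_do_not_disturb(noon_utc, -8))  # 4 a.m. at UTC-8
print(in_do_not_disturb(noon_utc, 0))   # noon at UTC
```

The same message is suppressed for one member and delivered to another depending on each member's offset. That is also why a time zone left at a default value, a concern the team discussed, would shift the quiet window to the wrong hours.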

Before the launch, colleagues worried that the decision was paternalistic. “Engineers would tell me, ‘I'm on on-call rotations. If I was at a company and didn't know about this change, I could miss a page and my site would be offline. People rely on Slack. Are we sure about this?’” recalls Rosania. Conversations like this sparked a lot of debate within the team. “We considered whether or not to grandfather existing teams, so Do Not Disturb was completely off by default. We had a discussion on how to handle people who might have their time zone set to the default, Pacific Time, but live elsewhere. We agonized over many other scenarios and use cases.”

But when it came down to it, Rosania and her team had to tether their stance on the feature to something; otherwise, it would erode under all the possible risks and doubts. “We went back to our mission: to make people’s working lives simpler, more pleasant, and more productive. Or the related, more informal motto often heard around the office: ‘work hard and go home.’ With that in mind, the decision was pretty obvious,” says Rosania.

Having a clear North Star to follow helped Rosania and her team make the ultimate call and more easily cast aside doubt about the decision — because they believed that making DND a default reinforced what Slack was about in a fundamental way.

Simpler, more pleasant, more productive communication. “Say I wake up in the middle of the night and something’s on my mind. I could send you a message knowing it’ll trigger a push notification, but that might wake you up and stress you out,” says Rosania. “What if I could do that knowing that it's not going to bother you because you're not going to see my message till morning unless you explicitly decide to check in? Wouldn't that help you not feel stressed and help me not feel like an asshole? Much better than what a lot of people try to do: remember their idea until the next day when they know they won’t bother you, out of a desire for courtesy.”
Work hard and go home. “Can you imagine working at a company where everyone is always-on, and deciding, ‘I'm going to turn on a setting so people can't bother me after 9:00 PM’? Can you picture doing that and expecting that you're going to be just as likely to be promoted as the person sitting next to you?” asks Rosania. “That’s what we would have been asking people to do if DND was off by default. The only way Do Not Disturb was going to be adopted by our existing customers was if we came out and said, ‘We think this is okay for most teams, and by the way, you can change your team’s defaults if you want.’ As the toolmaker, we had the opportunity to set a standard for how Slack was used in ways that individuals couldn’t necessarily define for themselves.”

Slack offered options for users who didn’t like or want the defaults, but not enough to compromise their mission or doubt their worldview.

“We made everyone start with it enabled, knowing that most people were just going to be thankful. For those who needed something different, we gave them tools to opt out. To ease the transition, we notified administrators in advance and let them turn off the default for their entire team if they wanted,” says Rosania. “Ultimately, everyone got the control they needed, but you could clearly see our mission and beliefs coded into the design of the product and its defaults. To have a more pleasant and more productive workday, you need to come in well-rested and not be stressed out in the middle of the night. In many ways, that applies to existing customers even more than new customers, because existing customers bought that vision when they chose Slack.”

The takeaway? When you are doubting your product decision, step back and ask yourself: what do I believe about the world, and does this push us in that direction?

“For DND, if we had A/B tested it and it had failed, I wouldn’t have concluded that by all means we should wake people up in the middle of the night,” says Rosania. “Don’t get wrapped around the axle A/B testing parts of your product that reflect your worldview. At Slack, we believe in a certain way of getting work done, and in products that feel delightful and playful, that help you get through your workday and treat you like a human being. That’s why we decided what we did with DND. Humans need eight hours of sleep. Software should not be designed for people who don’t need sleep.”

There’s a fallacy that smart people fall prey to: that we must A/B test. That we can never look at something as craftspeople and know if it's good.

Twitter’s While You Were Away and Slack’s Threads

While You Were Away (WYWA) was Twitter’s first foray into shifting away from a chronological feed, just as Threads was Slack’s first move away from one exhaustive, continuous feed in a channel. At the time, both were big departures for the users of the product.

“WYWA was sticking a chunk of relevant tweets toward the top of users’ feeds. It was a precursor to a ranked timeline, which rolled out about a year later,” says Rosania. “Similarly Threads bring more relevance to channels. If you’re in a 200-person company, you probably don’t need to be in every discussion. Threads allow people to create sidebars so there’s better signal to noise in the main feed, and users can still pop into threads that interest them.”

Both features were extensions of what was already in motion at the companies.

“For WYWA, we’d coded ways to insert ads into our timeline, so we already had the mechanisms for putting sections in the feed that were out of order chronologically. It’d have been way harder engineering-wise to start by suddenly generating entirely different timelines for every customer at scale using machine learning, so we started by inserting this unit,” says Rosania. “As for Threads, we heard many, many times from customers about various workflows where they wanted one channel with multiple short-lived discussions, like debating and approving business proposals, or posting and discussing announcements. Threads allows people to chime in on these types of things without creating noise or talking over each other. Like DND, it can provide a more pleasant and productive experience.”

Is it better to fundamentally change everything? One way past that question is to go with a version of fundamentally changing everything that fits in your existing product.

The takeaway here lies in how Rosania made dissent low-lift and gave users “round-trip tickets” with these products. When WYWA rolled out, it was reported by the New York Times that there’d be ‘no way to turn the feature off, but if users dismiss it enough, it will appear less often.’

“The fact that you could dismiss that module easily was really important because it made us confident that what looked like positive engagement wasn't just that people didn't know how to provide negative feedback. If you can handle the ticket volume, another tactic for validating your vision or doubts is just to make it really easy for customers to nag the crap out of you if they don't like something,” says Rosania. “With Threads, we built a checkbox that, if selected, caused replies to get broadcast back in the channel, like people were used to. That way, threads come back full circle and people who aren't in the thread can see the outcome. It’s like a free round-trip ticket for passive participants.”

No Room For Doubt

Building products is hard enough before doubts from oneself or one’s colleagues creep in. Address these hesitations head-on internally before any new or tweaked feature launches.

Start by getting the debates off email and offer direct, human contact to solicit feedback (e.g. office hours).
Quantify the colleagues who carve out time to dissent. Don’t just tally them — get constructive criticism with the help of a hypothesis tree.
Also, address any doubts by asking yourself: what do I believe about the world, and does this push us in that direction? This can help shed some light on whether the feature should be A/B tested or tested against your company’s worldview.
Lastly, make customer dissent low-lift (like a click or two) and allow users to circle back to familiar ground.

“For those out there building products: if you haven’t already, there’ll be a time when you’re working on a feature and you’ll ask yourself ‘Am I crazy?’ or ‘How will I know when I’m way off the deep end?’ Those questions become even more burdensome when you’ve got fewer employees than you can count on two hands, months of funding and a firm deadline of getting a product to market,” says Rosania. “You’ll do yourself a favor if you dive headfirst into the proclaimed doubts from your team and yourself. That process will return you to your product’s purpose, your company’s worldview and, critically, your instinct. I can’t think of a product that serves and endures without the presence of all three.”

Image courtesy of Getty/Sunset Boulevard/Corbis Historical.
