Hack+Startup: Mike Nolet, Founder and CTO at AppNexus
At Hack+Startup Brooklyn Beta Edition, Mike Nolet, Co-Founder and CTO at AppNexus, talks DevOps and how to overcome difficult processes at scale, like deployment and infrastructure.
AppNexus is an advertising technology company that works with some of the biggest companies in the world: Microsoft, eBay, and Orange, to name just a few. At its core, AppNexus helps these businesses shift away from a traditional hand-sold advertising market, where publishers have to deal with advertisers directly, to one geared for the 21st century. In the process, AppNexus has raised over $65 million in venture capital from firms like Venrock, Khosla and First Round Capital (us!). Today, it has over 400 employees in New York City.
What Is DevOps Anyways?
Wikipedia describes DevOps as software engineering mixed with quality assurance and some technical operations.
As a CTO, you need to realize that DevOps is not just a new name for your sysadmin. If you’re just trying to sex up sysadmin, then call them site reliability engineers.
DevOps, on the other hand, requires dedicated focus and attention. As you scale, you will want internal tools that help you manage your production environment, and you need to staff accordingly.
When AppNexus first began, they had their sysadmin team build code to automate jobs. But they found that most great sysadmins were not a good fit for this tool-building focus.
Here’s why: Most sysadmins function in an interruption-driven world — most of their days are spent constantly reacting to the changing state of production environments and the needs of engineering. This makes it very hard for them to find the large chunks of time required to build solid tools and software applications.
Given the unique role, finding dedicated people to do the DevOps job is hard. At AppNexus, they found that rotational programs were most efficient. They’d have engineers do Ops work and vice versa. Once each person had familiarized themselves with their counterpart’s role, the DevOps team was able to write good repeatable code (that had unit tests in SVN, even).
But getting the right people isn’t enough. There are two other keys to success in DevOps.
DevOps Guys on Pager
Don’t do it! Your DevOps people can’t be in the line of fire and take normal Ops tickets. If you’re asking them to build the best internal tools possible, they won’t be able to deliver if they’re simultaneously trying to knock out a never-ending queue of bugs. You need to split off a couple of people and have them be dedicated to this type of firefighting.
Promote the Tools
At AppNexus, they have a tech team of about 150. Even today, it’s hard to get people to collaborate. They assumed that if they built this fantastic API-driven framework with scripting languages, everyone would just use it... but no one did. Instead, each team just wrote their own.
Have your DevOps team advocate for its tools internally and do individual ride-alongs with other teams. This helps the rest of the company understand exactly how the tools can benefit them. Ultimately, make sure the team is properly aligned with the people it serves, and that it is communicating and advocating for precisely what it is doing.
So now you have the right people in the room. They’re focused on the right problem. And they’re talking to the right people. Now what?
Treat Them Like an Engineering Team
While AppNexus’s DevOps Team is technically under the company’s IT branch, they’re treated like engineers in the sense that they’re held to the same level of quality and bug control. If not, they couldn't be held accountable for a system that works across 3,000 servers. It's simple math: even if you get your accuracy from 90% to 99.9%, you should still expect something to go wrong on roughly one server in every deploy across a thousand servers. Do your best to create testable code that actually works 100% of the time.
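The arithmetic behind that claim can be sketched in a few lines of Python. This is only an illustration: it assumes every server succeeds or fails independently with the same per-server success rate, which the talk does not state, and the function names are made up for this example.

```python
def expected_failures(servers: int, success_rate: float) -> float:
    """Average number of servers on which a single deploy fails."""
    return servers * (1.0 - success_rate)


def prob_any_failure(servers: int, success_rate: float) -> float:
    """Chance that at least one server fails during a single deploy.

    Assumes independent failures with an identical per-server success rate.
    """
    return 1.0 - success_rate ** servers


if __name__ == "__main__":
    for rate in (0.90, 0.999):
        for servers in (1000, 3000):
            print(
                f"rate={rate:.1%} servers={servers}: "
                f"expected failures={expected_failures(servers, rate):.1f}, "
                f"P(at least one)={prob_any_failure(servers, rate):.1%}"
            )
```

Under these assumptions, a 99.9% per-server success rate still averages about one failed server per 1,000-server deploy and about three across 3,000 servers, which is why the bar has to sit so close to 100%.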
Using Open Source Effectively
There’s simply too much to build from scratch: monitoring, metrics, production management, etc. So, it might seem obvious, but use open source tools effectively.
Some of the decisions AppNexus made were driven by the tools that were available five years ago when they started. For example, Maestro was built from scratch before Chef even existed. For each tool you require (production management, continuous deployment, monitoring, metrics), make sure to evaluate open source options before rolling your own. Some of the tools AppNexus uses are Nagios, Ganglia, Graphite, and Puppet (for config management and system level stuff).
A caveat: If you do use these open source tools, treat them like production applications. Don’t just “yum install” and think you’re done. Automate spool-up, and make sure you can roll releases and test changes in staging and test environments, just like you would with production code.
One More Thing: Metrics
Too many companies consistently under-invest in metrics. AppNexus uses Graphite, which has worked really well: Each team has dashboards that show real status for everything the company has in production. The company is religious about using them, from the CEO down to the last engineer.
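For a sense of how little it takes to start feeding a dashboard, here is a minimal sketch of a sender for Graphite's plaintext protocol, which accepts lines of the form "metric.path value timestamp" over TCP (port 2003 by default). The metric name, host name, and helper functions are hypothetical and are not taken from AppNexus's setup.

```python
import socket
import time
from typing import Optional


def format_metric(path: str, value: float, timestamp: Optional[int] = None) -> str:
    """Build one line of Graphite's plaintext protocol: 'path value timestamp\\n'."""
    if timestamp is None:
        timestamp = int(time.time())  # Graphite expects Unix epoch seconds
    return f"{path} {value} {timestamp}\n"


def send_metric(host: str, port: int, path: str, value: float) -> None:
    """Open a TCP connection to a Graphite (carbon) listener and send one data point."""
    line = format_metric(path, value)
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(line.encode("ascii"))


if __name__ == "__main__":
    # Only formats the line; call send_metric() against a real carbon host to publish.
    print(format_metric("deploys.failed_servers", 3, 1700000000), end="")
```

A deploy script could call something like send_metric("graphite.example.internal", 2003, "deploys.failed_servers", 3) after each run, so that failures show up on the same dashboards everyone already watches.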