YADA (Yet Another DevOps Article)

Yet another article about DevOps? When will it ever end? Well, I can say from experience that we are very far from a consensus on what it means, so it will probably not end anytime soon. Much of that has to do with the fact that it is different for every company. Each company needs to determine its own definition. But in order to reach that definition, you must have a framework to work from. When it works and works well, everybody is happy. When it doesn’t work, it can be very ugly. So, here is my attempt at some guidance as I see it from the trenches. Having been through several permutations of DevOps or DevOps-like migrations, I can tell you what I have seen work and what hasn’t.

My Basic Tenets of DevOps

DevOps, or whatever you want to call it, is fundamentally a shift toward unifying the process of creating software across business units. To achieve this unification you must, absolutely MUST, have buy-in by all. There is no “throwing over the wall” in DevOps. From the time an idea begins to germinate, there should be representation from all parties. This is where my vision differs from many. I believe that more than just development and operations should be involved. In the new world of rapidly changing end-user expectations, I don’t see it working for some representative from product to come up with an idea, hand it off to development and say “build it”. I believe there needs to be a constant feedback loop. This is because, in the end, we are all the product owners, and we must be aware of what is being delivered and what is being experienced by those who interact with it.

Don’t get me wrong, I don’t believe the “everybody contributes equally to each part of the process” vision works, especially once you start bringing in non-technical representatives. The following are some basic tenets that I believe should be present on any team that wishes to carry out this vision:

  • Product (or whoever is responsible for the curating of products in your organization) should be in close contact with development, architecture and operations from inception to…
  • Every member of the team should be empowered to say they see something going astray or they have concerns about some decision without any fear of shame or repercussions.
  • The technical members of the team should be capable of doing all critical pieces of the process. This is to facilitate an active self-support model. Not every member needs to know every aspect in depth, just enough to diagnose a problem. They should then be empowered to take action on that identified problem, be it writing, testing and deploying a patch, or just raising it with someone on the team with more in-depth domain knowledge.
  • The team should be empowered to deploy their product to all environments. This needs further detail, as it is a touchy subject that must be evaluated on a case-by-case basis. But the fundamental concept must exist.
DevOps For Managers
 
I want to spend a little time on the subject of letting teams deploy their own product. This can be a real hang-up to the adoption of DevOps in many organizations. But I honestly believe that, with the proper process and measures taken in advance, there should be no fear in giving a team the “keys to the kingdom”, so to speak.
So, for most companies (many online start-ups may be exceptions to the rule), I do not advocate letting all developers check in code and deploy it directly to production. You often hear developers and the like espousing “Continuous Deployment” and citing “Well, this is how Facebook does it”. [See their paper here] First, the fact that Facebook does this doesn’t make it right for you. In fact, it makes it right for very few companies other than Facebook. Second, this is how Facebook deploys its front-end, not the API that their front-end is built upon and that so many others have become dependent on being up. If a user notices a glitch in the FB UI, they are not likely to squawk too loudly. If a large consumer of the FB auth API suddenly cannot authenticate their users, that will be noticed and may even make the news.
 
What I do advocate, however, is empowering teams to do their own deployments. Don’t make this a siloed process. Doing your own deployments leads to a feeling of responsibility to “own your code”. Developers are much more likely to deliver quality code if they know they are going to be the ones deploying it and, ultimately, accountable for any issues it may have. Empower your teams. Treat them like responsible adults who can make sound decisions about the fitness of a product for general release. While there may be some pitfalls along the way, I guarantee you they will return the favor by making you look very good.
 
DevOps For Team Members (The Flip Side)
 
So, up to now, I have been speaking generally, or maybe more specifically to managers or team leaders. Now I want to speak to the teams directly: the implementers of the product. If you want the level of empowerment that DevOps requires, you had better care about what you put out there. There is no half-assing here. If you screw up, there is no place to point the finger but right back at you. Here is some advice from the trenches to help you achieve that end goal. Nothing too new, but way too often overlooked or brushed aside.
  • Test Driven Development is your friend! Embrace it! Learn to love it! I, of all people, know full well how hard this can be. And, to be completely honest, I still struggle with it to this day. But I have developed my own rhythm for achieving TDD. It doesn’t fit anybody’s strict definition, but it works for me. Find what works for you. Just make sure you feel confident in your tests and their ability to flush out issues.
  • Automate everything you can! Obviously, the tests mentioned above should be run on each check-in. If you broke it, fix it. Don’t ever go home with a broken build. (I know, all the mantras you’ve been hearing for years. But they’ve all become mantras for good reason.) But go beyond this. Automate your deployments as well. Use tools like Fabric, Puppet or Chef where available. Even better, look into containerizing your apps with a tool like Docker. If you get to the point where you can deploy exactly the same code (or even better, the same container!) over and over, you will become more confident in your ability to deploy to any environment at any time. Make sure you also automate the rollback of these deployments. If something goes wrong, you will be so grateful to have a quick and easy way to get back to the previous known good state. Also, until you are completely comfortable with your automation, practice in a development environment. I would expect that, through the normal process of develop->test->develop, you will get comfortable. This will also be a work in progress. But you will eventually get to a point where you feel comfortable enough to even deploy continuously or, more likely, much more frequently than you do now.
If you take the above steps, you will, with time, get to the point where the leaders in this arena are. And don’t wait for your organization to tell you to. Start now! There is no reason you can’t put everything in place. Get comfortable with it. Then present it to management. You can be the hero. Or, at the very least, you will have prepared yourself to work at an organization that “gets it”.
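
To make the “automate the rollback” advice a bit more concrete, here is a minimal sketch in Python. It is not tied to Fabric, Puppet, Chef or Docker. The `Deployer` class, its `health_check` callback and the version strings are all hypothetical names invented for illustration, and the “release” step is just a list append standing in for whatever your real tooling does.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Deployer:
    """Tracks released versions for one environment and rolls back on failure."""

    # Hypothetical hook: returns True when the given version looks healthy.
    health_check: Callable[[str], bool]
    history: List[str] = field(default_factory=list)

    @property
    def current(self) -> Optional[str]:
        """The version currently live, or None if nothing was deployed yet."""
        return self.history[-1] if self.history else None

    def deploy(self, version: str) -> bool:
        """Release `version`; restore the previous known good state if it is unhealthy."""
        self.history.append(version)      # stand-in for the real release step
        if self.health_check(version):
            return True
        self.history.pop()                # automated rollback to the prior version
        return False


# Usage: the second release fails its check, so the first one stays live.
deployer = Deployer(health_check=lambda v: not v.endswith("-broken"))
deployer.deploy("1.0.0")
deployer.deploy("1.1.0-broken")
print(deployer.current)  # prints 1.0.0
```

The point of the sketch is only that deploy and rollback live in the same automated path, so getting back to the previous known good state never depends on somebody remembering manual steps.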
 
A Benefit To All
 
The end result of all of this is that we all benefit. Features, fixes and changes will get to production and in front of clients sooner. This leads to a more immediate feedback loop that results in more changes that ultimately end in a better product.

Tackling a new world order

So, my company recently (within the past couple of years) made a major tactical shift from the world of Microsoft-centered development to a very strict adherence to open-source technologies, driven primarily by our rapid growth and the difficulty of getting much of our existing MS infrastructure to grow with us. It has served us well thus far, but we have grown beyond the point where it made sense to keep paying the massive licensing fees and developing creative workarounds for scalability. (I have no doubt many would disagree with this rationale. Just my opinion.) This shift has caused a lot of confusion and consternation throughout the organization.

For the most part, the development and architecture groups embraced this change. We have a crew that, like myself, is interested in the best solution for the job, regardless of the fact that we have been working almost exclusively in the MS .NET world for years and this change means a major commitment and investment by all parties. (Those not interested have moved on without any hard feelings, hopefully, on either side.) However, this is still a fairly steep hill to climb. It’s not that we just need to switch languages from, say, C# to Java. That is the easy part. There is a whole new ecosystem to adjust (or re-adjust) to. In addition, we are changing our entire system architecture from traditional, datacenter-deployed applications to cloud-based, globally distributed applications. This involves all of us and takes significant learning and knowledge sharing across groups, as well as new development practices, QA processes and deployment mechanisms.

However, all of this is far beyond the scope of this blog post. Today I just want to kick off a series on one subject that has come up a lot and is becoming a pain point for teams across the organization: now that we are cutting ties with our historical datastore (MS SQL Server), what open-source database should I choose for my application?

Part of this decision has been helped by the fact that we have had various members of our development and architecture teams do formal reviews of various options. Currently, we have three solutions that are accepted as zero-barrier options. (By this, I just mean that their use has been approved and does not require additional justification, not that there are no barriers such as a learning curve.)

The current options are as follows:

  • Cassandra
  • MongoDB
  • MySQL

Great, polyglot persistence. No trouble there. Just use the right store for the job. Right? Well, remember, this is a whole new world for most of us here. (And, arguably, for everybody that might read this post.) We are used to one option, MS SQL Server. We would model our data using RDBMS standards. Normalizing data as best we could while balancing relative performance. Then model our application domain objects to stored procedures that did various joins, subqueries, etc. to get the data in the exact “shape” we needed to work with.

So, now the directive comes along to change your data store, with a general preference for the approved NoSQL options. So the question predictably arises: “Why do I need to change the way I have always worked?” Well, this is a complicated question to answer. And, really, the answer is you don’t. However, as with every decision you will ever make, you have to be willing to make trade-offs if you want to stay the course. In the new “cloudified” world, the trade-offs of sticking with a traditional RDBMS like MySQL vs. making the shift to something like Cassandra or MongoDB are pretty steep. But there are still use cases where sticking with an RDBMS would make sense.

So, in the next few posts I plan on tackling this question, as well as some others, to the best of my ability. In addition, I would like to lay out what I feel are good guidelines for choosing among the various types of databases available, given a knowledge of the following:

  • The type or shape of the data you plan to store or use.
  • The volume of the data. (How big is the data? Either individual items or number of items.)
  • How you plan to use the data.

Other ideas will likely creep in, as I tend to stray off-topic from time to time. But I will attempt to stay focused and do my best to add value to this overall discussion, both for my organization and for those out there who might be going through similar transitions at their work.

I hope this series is informative. Please add responses or ask questions as I go. I have a thick skin and can handle criticism. I tend to learn the most from those that don’t agree with me.

Thoughts on Gluecon 2014

Just trying to digest all of the great conversation from this year’s Gluecon in Denver. Overall, I’d say it was a great success. Lots of interesting topics were covered, not surprisingly centered on the cloud, big data, DevOps and APIs. However, the discussions went well beyond the standard talk around high-level concepts, or “Use this cool new tool. It will make you the hit of the party!”

There was certainly a broad range of tools and products on display, but what I found (maybe naively?) was that, for the most part, the talks were very well vetted by the hosts to limit marketing spiel and offer really pertinent content to help us practitioners do our jobs in the best, most open manner possible.

Some particularly strong takeaways for me were the following:

API design is not something you can fudge any longer. It takes serious thought. Real top-down thought. Ahead of time. You must think of how you want your API to be shaped. Meaning, start with your clients. Architects and developers have to start by thinking “If I were a client, with no knowledge of the implementation, how would I expect to interact with it?”

For example, let’s say we have some content, say books, that we want to offer an API for. The first thing you should think is “Who are the clients of my API likely to be?” Depending on your product, your clients may be web and app developers in your company, developers at some other company that wants to use your API to offer content to their customers, or both. Either way, your client is likely to be a web or app developer. So, now that you know this, start designing your API to their needs. If you are lucky enough to know your potential clients, USE THAT ADVANTAGE!
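
To illustrate the books example, here is a small Python sketch of what “start with your clients” could look like. Everything in it is an assumption made up for illustration: the storage rows with their terse column names, the `list_books` function, and the `/books/{id}` URL shape. The idea is simply that the client sees the shape they would expect, not the shape the implementation happens to store.

```python
from typing import Dict, List, Optional

# Hypothetical storage rows, shaped by the implementation (terse columns, internal keys).
_ROWS = [
    {"pk": 1, "ttl": "Dune", "auth_nm": "Frank Herbert", "wh_bin": "A-17"},
    {"pk": 2, "ttl": "Emma", "auth_nm": "Jane Austen", "wh_bin": "C-02"},
]


def to_client_shape(row: Dict[str, object]) -> Dict[str, str]:
    """Translate an internal row into the fields a web or app developer would expect."""
    return {
        "id": str(row["pk"]),
        "title": str(row["ttl"]),
        "author": str(row["auth_nm"]),
        "url": f"/books/{row['pk']}",   # assumed URL scheme, for illustration only
    }


def list_books(author: Optional[str] = None) -> List[Dict[str, str]]:
    """What a client might get back from a books listing, optionally filtered by author."""
    rows = [r for r in _ROWS if author is None or r["auth_nm"] == author]
    return [to_client_shape(r) for r in rows]


# Usage: the client asks in their own terms and never sees "wh_bin" or "pk".
print(list_books(author="Jane Austen"))
```

Writing the client-facing call first, and only then mapping it onto whatever the storage looks like, is one way to keep implementation details from leaking into the interface.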

More on this topic in future posts…

Building for scale has never been easier. Or more challenging. Sound like I’m confused? Well, maybe. But what I mean is that the tools to build highly scalable systems have never been so available to us developers and architects. There was a time, not too long ago, when building an application to handle the type of load that APIs like those from Twitter, Facebook and many others are seeing these days typically meant overbuilding up-front: over-provisioning hardware (even if just reserving rack space and creating standard hardware specs to hasten hardware delivery time), sharding your database of choice from the get-go (or at least thinking through how you might), building complicated threading and synchronization logic into your code, etc.

Now, while you still need to consider these things up-front, you have choices to ease the burden. Obviously, choosing a hosted cloud solution like AWS, Rackspace, Azure, etc. is, at least in my humble opinion, a no-brainer, at least for most organizations that don’t have the resources of a Google or Microsoft. With this decision made, you can start focusing on your app. And in that realm there are more choices than ever as well, from brilliant, auto-scaling, sharding, replicating databases like Cassandra or Riak (and others) right down to the languages you use. Java 8 comes with new features like CompletableFuture and the Stream API. Then you have options like Scala, Node.js, etc. Take your pick.

But this plethora of options also leads to the second part of my statement: this has also never been a more challenging time to build scalable apps. First, you have more to evaluate, and thus learn. Don’t get me wrong. The constant change of this field is the reason I got into it in the first place. I thrive on change. But not everybody does. Even on a given, hand-selected team you are likely to have dissenters and individuals digging their heels in. Imagine how THIS concept scales to large teams and organizations.

That said, I see this as an exciting time of change and progress for our industry. I/we can’t convince everybody of this. So, get on board or get out of the way!

Deploying applications to the cloud must be quick, repeatable and predictable. Containers are the future. Learn the concepts. Pick a tool or tools and learn that/them. Then, when (not if) things change, you’ll be better prepared for it. That’s it. (Partially because this is an area I’m admittedly weak in myself.)

API SDKs suck! Ok, so I actually do not buy into this, but it was a very common theme (both sides well represented) at this year’s Gluecon, thanks mostly to an excellent day one keynote by John Sheehan of Runscope, “API SDKs Will Ruin Your Life”.

Like I said, I don’t completely agree with this assertion. But, to be honest, I don’t think John does either. However, he and others made some good points. One that hit particularly close to home for me was the double-edged sword of how an SDK abstracts developers from the actual interface of an API. This abstraction eases adoption by your API’s clients. That is a VERY GOOD thing! However, as John stated, the vast majority of issues that occur with API integrations are “on the wire”. Meaning, more or less, something is wrong with the request or response. If you abstract this interaction from your clients, all they know is “my request did not succeed”. More API-savvy developers may take the next step of inspecting the request/response before contacting you. But if they do, barring an obvious issue like a malformed request or forgetting to pass auth, they will likely just be faced with an unintelligible error message of some sort.

So, my counter to this argument is threefold. First, document your APIs well, be it the old-fashioned way by manually producing help docs or with something, in my opinion, infinitely better like Swagger. Just do it! It will save you many headaches in the future. Secondly, back to my first point, design your APIs intelligently with your clients in mind first. If your API is easy to navigate for an average person (test it out on somebody!), it will make the interaction less painful to begin with, so your API may, potentially, need less abstraction by the SDK. Lastly, I say you must strive to make the errors your API returns just as comprehensible as the non-errors. By this I mean things like returning proper error codes and human-readable descriptions, not just a generic 400 with “Bad Request” or what have you. I know all too well this is hard to do up-front. You can’t predict all the ways requests may fail. But if you try, you can think of the more common ones and handle them more elegantly. You are likely coding defensively against them to prevent failures on your end anyway. For those that arise after the fact, adapt. That is why you have that rapid, repeatable deploy process mentioned above.
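
As a rough illustration of that last point, here is a minimal Python sketch of returning a specific, human-readable error instead of a bare “Bad Request”. The field names, the `missing_required_fields` code and the shape of the error body are all my own assumptions for the sake of the example, not any standard.

```python
from typing import Dict, Tuple

# Hypothetical required fields for creating a book.
REQUIRED_FIELDS = ("title", "author")


def create_book(payload: Dict[str, object]) -> Tuple[int, Dict[str, object]]:
    """Return (HTTP status, response body) for a create-book request."""
    missing = [name for name in REQUIRED_FIELDS if not payload.get(name)]
    if missing:
        # Still a 400, but the body says exactly what was wrong and where.
        return 400, {
            "error": {
                "code": "missing_required_fields",
                "message": "The request body is missing required fields: "
                + ", ".join(missing)
                + ".",
                "fields": missing,
            }
        }
    return 201, {"book": {"title": payload["title"], "author": payload["author"]}}


# Usage: the client can tell at a glance what to fix.
status, body = create_book({"title": "Dune"})
print(status, body["error"]["message"])
```

A machine-readable code plus a plain-language message gives both the SDK and the human behind it something to act on, which is most of what “comprehensible errors” means in practice.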

Summary

Ok, so I have rambled on waaaayyyy too long and have not even come close to covering the above topics, let alone concepts that piqued my interest but need more research on my part to speak to, like cluster management with YARN or Mesos. But suffice it to say, this is one of the most relevant, content-packed conferences going for the more technical audience. If you missed it this year, I highly recommend searching for the content and discussions sure to be posted in the coming days. And see if you can make it next year. It will pay off in spades.

Links

Excellent list of links to this year’s notes and presentations online, provided by James Higginbotham.

http://launchany.com/gluecon-2014/