Infrastructure as Code and Documentation

As the shift of IT infrastructure from private kit in physical racks to Public Clouds and “everything as a service” accelerates, many businesses are finding out that their Cloud projects are getting stuck when dealing with the documentation.

Some common signs you might have this problem:

  • The documentation took longer to complete than it did to build the Cloud tenancy.
  • Our Cloud tenancy has been in production for a month and the documentation we have for it is already out of date.
  • Our tenancy documentation is just comments in code and we can’t tell at a glance what is what.
  • I can’t get my Cloud tenancy into production because no one knows how to document it.
  • The users design, deploy and then destroy infrastructure in the Cloud before operations even know it exists, let alone get a chance to update the design.

    If any of these sound familiar, it’s probably a sign that the approach to documentation needs to change.

    In the old days of IT, last decade, the infrastructure that was purchased was vastly expensive, very precious and changed very rarely. Infrastructure projects ran for months if not years. The procurement costs were high, and the impact of design errors was large. Spending a few weeks to draft, review and approve a high level design (HLD) and a low level design (LLD), and ultimately to output an as-built, was not a serious risk to the schedule.

    None of the conditions that made this approach to design sensible exist anymore.

    Today, infrastructure services are vastly cheaper and commoditised. This makes infrastructure quick, cheap and easy to change, so infrastructure change is common. Not only has the cost of infrastructure fallen, but procurement times have gone from months to minutes. An equally fundamental shift needs to happen in the way infrastructure is documented.

    Cloud Infrastructure is just a small part of what Cloud has to offer organisations and, according to Gartner, will be a small part of businesses’ IT spending in the future. The majority of the upcoming business spend on IT is going to be in the “something as a service” category. When the operations team is taking care of a SaaS product, an HLD, LLD and as-built for that product make no sense.

    Clearly the current approach to documentation needs to change, both in what is produced and captured and in how the documentation is maintained. The good news is that a different approach does exist and it is easy to adopt: the Service Knowledge Base.

    Mature operations teams likely already have some form of knowledge base for turning tacit operational knowledge into explicit knowledge. The Service Knowledge Base extends the traditional knowledge base to also contain facts and information about the Business Services that IT provides and manages. This approach also ties in nicely with the shift that IT teams are going through from being solely a provider of IT services to also being a broker of IT services.

    The Service Knowledge Base provides a framework for the capture and description of information related to a business service. It is important to define and agree on what is considered a business service. A useful starting point is to work at a high level and name a few standard core IT services that the business users consume. For example:

  • Productivity (Email, Calendar, Voice, Telepresence etc.)
  • Customer Relationship Management (CRM)
  • Financials
  • Service Management

    If your organisation has a service catalogue or application catalogue, it can provide a good source of information on what core services the business uses.

    Services are provided by one or more products or solutions. To test if your list is usable, review each item and see if you have used any vendor names or product names. If you have, then you haven’t identified a service; you’ve listed a product. Services have a generic name that should be closely tied to the description of the value the service provides to the organisation. Sometimes a business service is made up of several smaller services that together provide it. Start with a small, simple service that you can easily work with. The next step is to define what critical information about a service should be captured in the knowledge base.
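
    The vendor-name test described above can be partly automated. The following is a minimal sketch only: the function name `find_product_names` and the list of product terms are hypothetical and purely illustrative, and a real list would come from your own service or application catalogue.

```python
# Illustrative terms only (an assumption): replace with the vendor and
# product names that are actually in use in your organisation.
KNOWN_PRODUCT_TERMS = {"salesforce", "exchange", "sap", "servicenow"}


def find_product_names(service_names, product_terms=KNOWN_PRODUCT_TERMS):
    """Return {service name: [matched terms]} for names that mention a product."""
    flagged = {}
    for name in service_names:
        # Compare whole words, ignoring case and surrounding punctuation.
        words = {word.strip("(),.").lower() for word in name.split()}
        matches = sorted(words & set(product_terms))
        if matches:
            flagged[name] = matches
    return flagged


candidates = ["Productivity", "Salesforce CRM", "Financials", "Service Management"]
flagged = find_product_names(candidates)
for name, terms in flagged.items():
    print(f"'{name}' looks like a product, not a service (matched: {terms})")
```

    Any name that is flagged is a candidate for renaming to the generic service it provides, for example “Customer Relationship Management”.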

    When defining what information about a service should be captured and described in the knowledge base, there are a few dimensions of the service to consider, including:

  • How the service is provided. Is it an “as a Service” service? Is it an app that’s installed in a private data centre on dedicated infrastructure?
  • Who in the organisation is the business owner of the service, i.e. who pays the bills for the service?
  • Who is the nominated technical SME for the service?
  • What are the regular housekeeping and service management chores that must be done to maintain the service in good working order?
  • How do operators gain access to the service in order to operate it?
  • What other services does the service rely on and what other services rely on this service?
  • What interfaces does the service have to other services and systems? Are those systems on the internet, are they internal?

    These are just a few examples of the attributes and metadata that can be captured to describe a service. By completing the knowledge base for a service, you move from describing a large number of specific configuration values to describing the critical and meaningful information that helps your people operate the service and provide it to the business. Many more attributes are relevant to all services, and some may apply to only one or two services. The eighty-twenty rule comes into effect here: you may not get all the information for every service, but if you identify and focus on the core attributes to begin with and iterate quickly over each service, you can rapidly build up a lot of critical knowledge.
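
    As a rough illustration, the attributes listed above could be captured as a structured record per service. This is a sketch under stated assumptions: the class `ServiceRecord`, its field names, the choice of “core” attributes and the sample owners are all hypothetical, not a prescribed schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ServiceRecord:
    """One knowledge base entry; the field names are illustrative assumptions."""
    name: str
    delivery_model: str = ""      # e.g. "SaaS" or "private data centre"
    business_owner: str = ""      # who pays the bills for the service
    technical_sme: str = ""       # nominated technical SME
    housekeeping_tasks: List[str] = field(default_factory=list)
    operator_access: str = ""     # how operators gain access
    depends_on: List[str] = field(default_factory=list)
    interfaces: List[str] = field(default_factory=list)


# Attributes to fill in first for every service (an assumed starting set).
CORE_ATTRIBUTES = ("delivery_model", "business_owner", "technical_sme")


def missing_core_attributes(record: ServiceRecord) -> List[str]:
    """List the core attributes that are still empty for a service."""
    return [attr for attr in CORE_ATTRIBUTES if not getattr(record, attr)]


def dependants(records: List[ServiceRecord]) -> Dict[str, List[str]]:
    """Invert depends_on to show which services rely on each service."""
    result: Dict[str, List[str]] = {record.name: [] for record in records}
    for record in records:
        for dependency in record.depends_on:
            result.setdefault(dependency, []).append(record.name)
    return result


crm = ServiceRecord(
    name="Customer Relationship Management",
    delivery_model="SaaS",
    business_owner="Head of Sales",
    depends_on=["Productivity"],
)
productivity = ServiceRecord(
    name="Productivity",
    delivery_model="SaaS",
    business_owner="Chief Operating Officer",
    technical_sme="Messaging team lead",
)

print(missing_core_attributes(crm))
print(dependants([crm, productivity]))
```

    Checking for empty core attributes first fits the eighty-twenty approach: each service gets the essentials quickly, and the less common attributes are filled in on later passes.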

    Finally, it is important to consider how to maintain the metadata and knowledge for each service. Approaches will vary depending on several factors, such as the size of the operations team, how teams get work done and how teams are encouraged to develop good habits and practices. Often a new knowledge base system is built, everyone puts information in, and then it is never used or maintained. Continuous improvement plays a role here; however, a more fundamental, baked-in approach to documentation needs to exist.

    Teams and people generally neglect documentation updates for a few reasons:

  • The documents for that service haven’t been updated for years, they are totally out of date and no longer relevant to the current state of the service. It’s a waste of time to update it.
  • People and teams don’t feel that documentation is important to what they do.
  • Documents are hard to locate, scattered or troublesome to edit.
  • Documentation updates are simply forgotten.

    Dealing with stale documentation is always a challenge. A key way to address this, and any resistance to creating and maintaining documentation, is to make it quick and easy to create and edit documentation. In general, some form of web-based wiki is the best approach, and it should have a “create page from template” capability so staff can quickly create a page for a service. Closely couple the ITSM tooling to the knowledge base. When staff are handling change request (CR), service request (SR) and incident management (IM) tickets, they should be able to click a link to access that service’s wiki page and quickly edit the page from there.

    It is easier to maintain system knowledge and documentation when documentation updates are a baked-in stage of the ticket lifecycle. Change the lifecycle from:

    TODO -> IN PROGRESS -> DONE

    to be

    TODO -> IN PROGRESS -> DOCUMENTATION -> DONE

    A change should not be marked as completed until the service knowledge base is updated to reflect what changed. At the following CAB, review the status of the previously completed changes and have the team or person who implemented them demonstrate what documentation was updated. Celebrate the successful change in the CAB and let everyone see that “It has been XX days of successful change implementation”.
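
    The extended lifecycle can be pictured as a small state machine in which a ticket cannot leave the DOCUMENTATION stage until a knowledge base update has been recorded. This is a minimal sketch, not a description of any particular ITSM tool: the `Ticket` class, its methods and the sample ticket title are hypothetical names chosen for illustration.

```python
from enum import Enum


class Status(Enum):
    TODO = "TODO"
    IN_PROGRESS = "IN PROGRESS"
    DOCUMENTATION = "DOCUMENTATION"
    DONE = "DONE"


# Allowed forward transitions for the extended lifecycle.
NEXT_STATUS = {
    Status.TODO: Status.IN_PROGRESS,
    Status.IN_PROGRESS: Status.DOCUMENTATION,
    Status.DOCUMENTATION: Status.DONE,
}


class Ticket:
    """Hypothetical ticket that enforces the documentation stage."""

    def __init__(self, title):
        self.title = title
        self.status = Status.TODO
        self.documentation_updated = False

    def record_documentation_update(self):
        # In a real tool this might store a link to the edited wiki page.
        self.documentation_updated = True

    def advance(self):
        """Move to the next stage, refusing to close an undocumented ticket."""
        if self.status is Status.DONE:
            raise ValueError("ticket is already DONE")
        if self.status is Status.DOCUMENTATION and not self.documentation_updated:
            raise ValueError("update the service knowledge base before closing")
        self.status = NEXT_STATUS[self.status]
        return self.status


ticket = Ticket("Resize CRM integration queue")
ticket.advance()                      # TODO -> IN PROGRESS
ticket.advance()                      # IN PROGRESS -> DOCUMENTATION
ticket.record_documentation_update()  # knowledge base page edited
ticket.advance()                      # DOCUMENTATION -> DONE
print(ticket.status.value)
```

    Placing the check on the transition, rather than relying on reviewers to remember it, is what makes the documentation step a baked-in part of the process.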

    Service requests and incident management should be handled no differently from change management. Include a step in the SR and IM processes to ensure that documentation updates are completed, peer reviewed and celebrated.

    The goal is to build a culture in the teams where documentation is “a core part of how our services are managed” and good documentation is recognised as a valuable part of what the team produces.

    Get in touch with Diaxion today and find out how much easier life could be. With offices in Melbourne and Sydney we’ve helped thousands of people simplify their business with our IT infrastructure services. Become one of them.