Monday, November 24, 2014

Ubiquity of data

I've been thinking a lot lately about ubiquity of data, or really the lack of it in today's modern technology.

In the 90's and 2000's most of our information lived on our primary machines, whether that was a desktop or a laptop. As an industry we spent a lot of time and resources trying to make that information portable. In the 80's it was the floppy disk. In the early/mid 90's it was the Iomega Zip drive. In the late 90's/early 2000's it was the recordable compact disc. In the mid to late 2000's it was flash memory in the form of a USB stick. All of these technologies focused on one thing: making it easier to move data from one place to another.

In the late 2000's/early 2010's we started to talk about shifting our data to the cloud. The thought was that if we put our data in services like Amazon S3, Amazon Cloud Drive, Dropbox, Box, Microsoft OneDrive, etc., our information would be ubiquitous. In a way we were right, in that we can now access that information in the cloud from anywhere. But fundamentally we're still thinking about and interacting with data as something that we move from place to place.

I think as an industry we need to stop thinking about data as a thing that we move from place to place and instead solve the problems that prevent us from accessing our data from anywhere. So what are the problems that we need to solve to make this a reality? This list is by no means exhaustive, but it's where I think we need to start.

  • Federated Identity Management.
  • Data Access Standards.
  • Networks that do not model their business on whether you're a data consumer or provider.

Federated Identity Management

In the world we live in today, each service (i.e. each company) owns authenticating who we are. That is, they keep a proprietary set of information about us that they use to test us. If we pass the test they consider us authenticated. Most of these tests come in the form of two questions: what's your name and what's your password?

The problem with this is that it takes identity authentication out of the hands of those being identified and puts it into the hands of those wanting to authenticate. There's nothing inherently wrong with wanting/needing third party validation. The problem comes when we have hundreds of places we need to authenticate with, each with its own proprietary method of authentication. It also passes the buck to the user to remember how each one of these services authenticates them.

Tim Bray has a good discussion on federation that you should read if you're interested in the deeper discussion of the problems of identity federation.

Data Access Standards

We need data access standards that any group (for-profit or not) or individual can implement on top of their data, allowing any other system (using the federated identity management) to interact with it. These standards would define CRUD operations (create, retrieve, update, and delete) in such a way that any other system can interact with the data on that system on the user's behalf.

We have a good start on this with standards like OPML, RSS, WebDAV, CalDAV, CardDAV, etc., but these standards aren't cohesive. On top of that we don't have a real way to query a service to see what type of CRUD operations it supports. If we had the ability for the service to state what it serves then the clients could more intelligently interact with that service. Currently we put the onus on the user to know what a service offers.
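To make the idea of a service stating what it serves a little more concrete, here is a minimal Python sketch. Everything in it is hypothetical: the `ServiceDescription` class, the `plan_sync` helper, and the resource names are invented for illustration and don't correspond to any existing standard.

```python
from dataclasses import dataclass, field

# The four CRUD operations a service could advertise (assumed vocabulary).
CRUD_OPERATIONS = {"create", "retrieve", "update", "delete"}


@dataclass
class ServiceDescription:
    """What a service says it serves: resource type -> allowed operations."""
    name: str
    capabilities: dict = field(default_factory=dict)

    def supports(self, resource: str, operation: str) -> bool:
        """Return True if the service advertises `operation` on `resource`."""
        if operation not in CRUD_OPERATIONS:
            raise ValueError(f"unknown operation: {operation}")
        return operation in self.capabilities.get(resource, set())


def plan_sync(service: ServiceDescription, resource: str) -> str:
    """Let the client, not the user, decide how to talk to the service."""
    can_read = service.supports(resource, "retrieve")
    can_write = service.supports(resource, "create")
    if can_read and can_write:
        return "two-way"
    if can_read:
        return "read-only"
    return "unsupported"


# An invented service that accepts new events but only serves contacts read-only.
calendar_service = ServiceDescription(
    name="example-calendar",
    capabilities={
        "events": {"create", "retrieve"},
        "contacts": {"retrieve"},
    },
)

print(plan_sync(calendar_service, "events"))
print(plan_sync(calendar_service, "contacts"))
print(plan_sync(calendar_service, "photos"))
```

The point of the sketch is only that the client works out what is possible from the description, instead of the user having to know it in advance.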

Networks that do not model their business on whether you're a data consumer or provider

Right now the people who provide us access to the internet think about us in two categories. The first category I'll call "data consumers" and the second category I'll call "data providers".

Data consumers have the ability to get things from the internet and put things somewhere else on the internet. But data consumers don't have the ability to provide things to the internet without putting them somewhere else. A good example of this is email. A customer with a standard "data consumer" internet connection cannot run a mail server for two reasons.

First, they get a dynamic IP address from their ISP (internet service provider). This means that the address from which they connect to the internet is always changing. Think about this analogy to a dynamic IP address: what if your home address was constantly changing, either daily, weekly, or monthly? It would be impossible for anyone to contact you via the mail reliably, because anytime your address changed, mail sent to the previous address would be delivered to the wrong house. It's the same way on the internet. If you want people to be able to talk to you, you need to have a static address for them to contact you.

Second, ISPs block the ports necessary for others to talk to you. Even if you had a static address, often your ISP blocks standard email ports (25, 993, 143, 587, and 465) because they're trying to stop spammers from easily distributing their spam. But as anyone with an email address knows, the spammers are doing just fine even with the ISPs not allowing incoming connections. So I don't buy this as a valid reason to block these ports.

Data providers have all the same access as data consumers except they pay more to have static IP addresses and to not have the ports blocked. Notice anything wrong with this situation? The ability to fully participate in the internet is based on how much you pay your ISP. ISPs hide behind the fallacy that they're trying to protect you in order to be able to charge you more for the ability to truly participate on the internet. Does that extra money you pay actually protect you or anyone else on the internet better? No. Most ISPs will probably tell you that you're also paying for more reliability. But you're running on the same system as the data consumers, so I don't buy that argument either.

I truly believe that we're not quite moving in the right direction when it comes to solving these problems. Until we do, you will constantly be battling moving your data from one place to the next when any new interesting service comes into existence.

Monday, November 17, 2014

Transitioning to a professional software development role: part 3

In my first post in this series, Transitioning to a professional software development role: part 1, I started to outline some of the gaps I've seen in people's preparation for entering a career in the software development industry. I started off by focusing on what software development is not about.

In my second post in this series, Transitioning to a professional software development role: part 2, I took a look at what software development IS about. In the final post in this series I'd like to talk about the tools available that make us more efficient.

Being a good software developer means understanding how to apply agile

For a long time developing software was very much like developing a product on an assembly line. Assembly lines are very rigid and not well suited to respond to change. They run on the assumption that what happens upstream in the assembly line can be built upon and won't change. The moment change is introduced most of the product on the assembly line is ruined and must be thrown away.

Software's assembly line is called Waterfall. Over time we've come to understand the downfall of Waterfall: its major flaw is that it's very rigid to change. Rigidity to change was okay when the primary delivery mechanism for software was the compact disc. But as software has grown to allow near real time delivery of features and functionality, Waterfall's rigidity to change has become a hindrance to delivering high quality software in smaller but more frequent updates and features.

That's where Agile comes in. Agile software development is about being able to respond to change in a rapid manner. It teaches us to think about software not as a monolith but as a group of features that can be delivered in small chunks frequently over time.

I wrote a post several months ago called Software Craftsmanship: Project Workflow. If you're new to agile it's a good introduction to the anatomy of a project and what I've found useful. While the project workflow I've outlined isn't something you'll see in official Agile books, it is something that I have found extremely useful.

Being a good software developer means understanding how to use Lean

The concept of Lean Manufacturing was invented at Toyota. The primary goal was to reduce waste in the manufacturing cycle. This was done by re-thinking the manufacturing process to identify and remove waste. One example of waste is parts sitting in a queue waiting to be processed. Toyota was able to show that by re-engineering their manufacturing process they could improve quality, efficiency, and overall satisfaction of customers.

The concepts behind Lean Manufacturing can also be applied to software development. Unfortunately these concepts are often applied incorrectly and have led to many misconceptions and misunderstandings of Lean Software development. I wrote a post several months ago which outlined common misunderstandings in applying Lean to software development.

As a professional software developer it's important to understand Lean and how to apply it to developing software.

Being a good software developer means understanding how to make trade-offs 

The last area I want to briefly cover is understanding how to make trade-offs. As a professional software developer you're going to be asked to make trade-offs all the time. Sometimes it will come in the form of quality (a bad trade-off IMO). Other times it will come in terms of features.

The key to understanding how to make trade-offs is learning to ask a few questions.

  • What am I gaining by making this trade-off?
  • What do I not get that I would have gotten if the trade-off was not made?
  • What downstream effects will this decision have on my long term strategy or road map?
  • What additional work will be required later as a result of this trade-off?

The ultimate goal in software development is to provide business value in every part of the process. Understanding how to make trade-offs will help you provide the right business value at each step in the process.

Monday, November 10, 2014

Transitioning to a professional software development role: part 2

In my previous post, Transitioning to a professional software development role: part 1, I started to outline some of the gaps I've seen in people's preparation for entering a career in the software development industry. I started off by focusing on what software development is not about.

In this post I want to take a look at what software development IS about.

Being a good software developer is about understanding data structures

The foundation of a good software developer is understanding data structures and object oriented programming. Data structures like Binary Trees, Hash Tables, Arrays, and Linked Lists are core to writing software that is functional, scalable, and efficient.

It's not good enough just to understand what the data structures are and how they're used. It's crucial that you also understand WHEN to use them. Understanding when to use particular data structures properly comes with a few benefits. First, it helps others intuitively understand your code. Others will be able to understand your frame of reference better. Second, it helps you avoid "having a hammer and making everything a nail" syndrome. That's when you're learning something new and looking for places to apply your new knowledge, often shoehorning it into places it doesn't belong.
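As a small illustration of the "when" question, here is a hedged Python sketch. The function name `first_duplicate` and the sample data are invented for this example. It uses a hash-based set because the only question asked of the collection is "have I seen this before?", which a set answers in roughly constant time on average, while a list would have to be scanned on every check.

```python
def first_duplicate(items):
    """Return the first item that appears a second time, or None.

    A set is the natural fit here: we only need fast membership checks,
    not ordering or indexing. Keeping `seen` as a list would still work,
    but every `in` check would scan the whole list.
    """
    seen = set()
    for item in items:
        if item in seen:
            return item
        seen.add(item)
    return None


print(first_duplicate([3, 1, 4, 1, 5]))
print(first_duplicate(["a", "b", "c"]))
```

Choosing the set also tells the next reader what the code cares about (membership), which is the "frame of reference" benefit described above.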

Being a good software developer is about being able to estimate your work

I can't stress enough how important this is. Your team, your managers, and your customers are going to rely on you for consistency. They're going to make plans around what you do. And because of this learning to estimate your work is crucial in helping you and them meet commitments. Understanding how to estimate your software well also helps you build a regular cadence in what you deliver which is helpful for your customers.

There are three concepts that I've found really helped me learn to estimate my work well. The first is the Cone of Uncertainty. This concept is really helpful because it helps you tease out what you know you don't know as well as what you don't know you don't know. Understanding the cone of uncertainty helps you remove ambiguity in what you're working on, which in turn helps you better understand the level of effort it will take.

Once you've teased out the uncertainty in your work you can use Planning Poker as a way to quantify how much work something is. It's important that you try not to tie your poker points to a time scale, as it will tend to skew your pointing exercise. Instead, as you get better at quantifying how much work something is relative to your other work, you'll start to naturally see how much time it takes. For instance, let's say you use the Fibonacci numbers 1, 2, 3, 5, 8, and 13 to quantify your work. Over time, as you get better at pointing your work, you'll also see a trend in how much time certain points take. Only then can you accurately associate a timescale with your pointing.

The last concept that I've found very helpful in learning to estimate how much work I can do in any given period is tracking my velocity. If you're using planning poker to determine how big the chunks of work are and you're using agile to set a cadence or rhythm for when you deliver your work, then velocity tracking can help you be more predictable in how much work you can deliver in any given agile sprint. Understanding your velocity helps you set reasonable expectations on what you can deliver. It also helps those who are planning for the future understand what it would take to decrease the time of a project, or make sure that a project is on track and will meet its deliverable dates.
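Here is a minimal Python sketch of velocity tracking. The function names, the three-sprint averaging window, and the sample numbers are all assumptions made for illustration; teams calculate velocity in different ways.

```python
from math import ceil
from statistics import mean


def average_velocity(points_per_sprint, window=3):
    """Average the story points completed over the most recent sprints.

    `window` (how many recent sprints to average) is an assumed default.
    """
    recent = points_per_sprint[-window:]
    if not recent:
        raise ValueError("need at least one completed sprint")
    return mean(recent)


def sprints_remaining(backlog_points, velocity):
    """Estimate how many whole sprints the remaining backlog will take."""
    if velocity <= 0:
        raise ValueError("velocity must be positive")
    return ceil(backlog_points / velocity)


# Invented history: points completed in each of the last four sprints.
completed = [18, 21, 24, 21]
velocity = average_velocity(completed)   # averages the last three sprints
print(velocity)
print(sprints_remaining(60, velocity))
```

Note that the estimate is still in points and sprints. It only turns into calendar time through the sprint length, which fits the advice above about not tying points directly to a time scale.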

Being a good software developer is about re-use in order to avoid re-inventing the wheel

As newer engineers we want to solve problems that we find interesting and challenging. Often as we get into the depths of a particular problem space it becomes evident that we're trying to solve an already solved problem. At this point you're at a crossroads: you can reuse the existing solution, or you can continue down the path of solving the problem yourself and re-invent the wheel. Choosing to re-invent is often the result of both curiosity and mistrust. You're curious about how to solve a particular problem, or curious about whether you could solve the problem better than those that have come before you. It also happens when you don't trust that a particular library actually solves the problem you're trying to solve. Or, because another solution solves a slightly different but compatible problem, you don't trust that your problem is in the same problem space.

This is very detrimental to a project for a few reasons. First, the problem has already been solved, so you're going to waste time solving an already solved problem. Second, it's likely the case that the problem is more nuanced than you're aware of. It's also likely the case that the people who have already solved the problem have dedicated themselves to solving that problem, i.e. it's the entirety of their problem domain. This means that they're going to be the subject matter experts in this area. Because this is only one part of your overall problem, you won't be able to dedicate the amount of time required to solve the problem as well as they have.

I would encourage you to first look to see if someone has already solved your problem either in part or in whole. There are plenty of high quality open source projects on GitHub and SourceForge. These projects have people who are eager for you to use and incorporate their projects into your project.

Being a good software developer is about knowing the limits of your understanding

There are several aspects to understanding the limits of your understanding. One aspect is to know that knowledge about any particular domain has both a breadth and a depth to it. It is impossible to gain both a breadth and depth of understanding in all areas of software development amongst all subject domains. Because of this it's important to be aware of where you have a breadth of understanding but are lacking depth, and where you have a depth of understanding but don't have breadth. Over time you'll develop both a depth and a breadth of understanding in a few particular subject areas. But it's important to know that this takes time, theory, and practice. Without all three of those you won't gain the breadth and the depth.

Knowing the limits of your understanding also involves being able to say you were wrong. There are going to be plenty of times when you thought you had a depth of understanding or breadth of understanding of something only to find out you didn't fully understand or misunderstood the subject. Being able to say you were wrong is the first step to correcting your understanding and being able to build on your new knowledge.

Monday, November 3, 2014

Transitioning to a professional software development role: part 1

I've spent 14+ years in the software industry in either an IC (individual contributor) role, as an engineering lead, or as a manager. I've worked in both the public and private sector. I've worked at companies as large as 100,000+ people and as small as 19 people. One thing that's been pretty consistent over time is that people first entering into the software industry are ill-prepared for what it means to be a professional software developer. This is equally as true for those coming out of college as it is for those transitioning to software from another industry.

What I'd like to do in this post is outline some of the gaps I've seen in people's preparation and try to pave the way toward helping those interested in software development understand what's expected of them in the industry and how to be prepared.

While the following post will focus on being a good software developer, most of what I outline is applicable to other roles in the software industry such as project/program/product management.

This will be a multi-part post. In part one I will focus on what software development is not.

Being a good software developer is not just being able to code

You're part of a team. Software development isn't just about solving problems with efficient algorithms. You're part of a team which is part of a larger ecosystem. There are product people trying to manage the vision of the software. There are project people trying to manage the cadence of the software life-cycle. There are other engineers consuming the output of your work. There are internal and external customers trying to use your software to make their lives more meaningful either by being more efficient, participating in some sort of community, or just goofing off playing a game you've written.

Because of this people are relying on you to be an effective communicator. They're relying on you to be effective with time management. They're relying on you to ask for help when you get stuck. They expect you not to go dark. And they're relying on you to help them out when they get stuck.

Essentially you're part of a new tribe, each person having different but overlapping responsibilities. It's important to remember to grow your skills both technically AND with soft skills.

Being a good software developer is not about being clever

One of the biggest mistakes I see newer folks in the software industry make is trying to be too clever in their solutions. Writing software that lasts is about simplicity. Learning to write simple code that clearly communicates its intentions and intended purpose(s) means that it will be used effectively. Writing code that is clear means that it's readable.

In the software industry you're going to spend more of your time reading other people's code than you will actually writing code. It's important to learn what it means to write readable code. I would highly recommend you read the book Clean Code: A Handbook of Agile Software Craftsmanship.

Being a good software developer is not about personal style

Every industry has its own DSL (domain specific language). That DSL helps people to communicate more effectively within the industry by removing ambiguity and subjectivity. Software development has several different layers of DSLs that it is important to learn.

There are language specific idioms and standards that it's important to be familiar with. There are platform specific standards. For instance, standard *nix programs tend to do one thing and can be chained (or composed) with other programs (by piping) to serve some larger purpose. Windows programs, on the other hand, tend to be monolithic in nature and self-contained. It's important to know what the standards are for the platform you're working on.

In the same way there are going to be general coding standards that are industry accepted as well as coding standards that are specific to your new organization. Your organization will also likely have its own set of standard tooling for development, deployment, and distribution.