Friday, December 30, 2011

Trends of 2011

December 30 feels like a good date to reflect on the trends of the past year. Looking back to the major trends of 2010, I realize that 2011 was more of an evolution and consolidation of technologies than an introduction of new ideas.

Mobile apps are rapidly becoming the most popular type of software. The mobile space now resembles the Web of the late '90s, when everybody realized they needed a web site. Furthermore, like the e-commerce of the late '90s, mobile-specific business models are also beginning to emerge. At the same time, there are clear signs of market consolidation, with iOS and Android being the clear winners, and BlackBerry, webOS, Windows Phone and others being the clear losers.

While the tablet can be seen as just another mobile device, it certainly has unique characteristics and distinct usage patterns. 2011 saw a great explosion in tablets' popularity and adoption. Initially thought of as a keyboard-less laptop, the tablet is now perceived as a completely different type of consumer device. The tablet market is still in its infancy, and the jury is still out on what future tablets will look like – a computer, a toy, a media consumption device, etc.

Cloud computing is beginning to lose its hype because it's becoming a standard part of software development and operations. This technology is entering its maturity phase. By and large, Amazon remains the dominant cloud provider. A noticeable trend of 2011 is the movement from IaaS to PaaS, where infrastructure cloud providers begin to offer more hosted services like ElastiCache, RDS for Oracle, Beanstalk, etc.

The NoSQL space also seems to be entering a consolidation phase, following a Darwinian explosion of database types and approaches. At the moment, MongoDB and Cassandra seem to capture most of the attention. As with any new technology, people are beginning to recognize its sweet spots and its limitations. Also, a better understanding of the umbrella term “NoSQL” leads to a separation into document-based (MongoDB), key/value-based (Redis), and table-based (Cassandra) solutions.

Social platforms are now considered an important part of any consumer-oriented product. A new entrant into the social space is Google+, which competes head-to-head with Facebook. Time will tell whether it has a future or whether it ends up like other failed social products from Google. The interest in social platforms is further driven by a slew of high-profile IPOs (LinkedIn, Groupon, Zynga).

While future predictions are futile, I have a strong feeling that 2012 will also be a year of elaboration of existing trends rather than the introduction of something radically different, but only time will tell.

Happy New Year!

Monday, November 28, 2011

Productive Time

As trivial as it seems, I have discovered that leveraging periods of high energy during the day for the most strategically important tasks greatly enhances effectiveness and performance without consuming additional time or effort.

Everybody has their own high times and low times during the day. Some people consider themselves morning persons, others evening/night persons. These categorizations usually serve to identify periods of high energy and low energy. Naturally, people like the high points and not so much the low points.

What I have noticed, however, is that there is a certain disconnect between the daily biological clock and the scheduling of important tasks. Consider this example: I know many “morning” persons who start their day with coffee, emails and their favorite news. The most productive time of their day (7am-10am) is therefore used for routine activities that do not require much energy or concentration. If, instead, this most productive time of the day were leveraged for strategic tasks such as design, analysis and planning, the same amount of time would yield a quantum leap in productivity.

I've tried it myself. Now when I come to the office, I don't process emails, I don't look at the latest tech news, I don't finish off leftovers from yesterday. I intensely focus on high-priority items. The emails can get processed pretty much any time during the day, and the news can be read later. After just a few weeks, I realized I had made more progress on the important tasks during those weeks than I had in many months.

An additional benefit of using morning time for “thinking” is that you get distracted less by others. Some people are not in the office yet, while others are busy reading the news:)

I don’t want to leave an impression that I necessarily advocate using morning time for strategic activities. If you’re an evening person, then by all means – leverage evenings. My point is that you need to identify your high-times and your high-importance tasks and make sure that they are in agreement.

P.S. if you're reading this in the morning, you might use this advice to improve your productivity:)

Monday, October 31, 2011

Golden Mean

Even if you’re not into Aristotelian philosophy, you must have heard about the doctrine of the Golden Mean. It advocates finding a middle solution between two opposite extremes.
Yet, it is surprising how often in software engineering people forget about the Golden Mean and vehemently advocate an obviously extreme solution or view.

Here are a few examples:
·         100% TDD vs No TDD
·         BDUF vs No Design
·         Strong typing vs Weak Typing
·         Management Theory X vs Theory Y

In each of those pairs of opposites, each option has some merit. Yet, taking an option to the extreme and asserting its universality turns it into an absurdity. The goal of a successful architecture or design effort should be finding the right tradeoff between competing forces – and therefore finding the Golden Mean.

One important note is that the Golden Mean in each of those pairs is heavily dependent on context and is not something absolute and unchanging. That's why it's so amusing to read people blindly advocating a particular option with near-complete disregard for circumstances and context.

Friday, September 23, 2011

Using Maven Profiles for Managing Dependencies in SOA Application

Recently we enhanced our Continuous Deployment process by creating a bootstrap application that initializes databases with tables and initial data. We use JPA/Hibernate to auto-create the tables from model objects. Since our application is composed of numerous services, each service carries its own part of the model, and there is a shared model used by all services. The bootstrap module depends on all the services through Maven's pom.xml and populates the DB with tables from all dependent services.

The problem is that different clients use different combinations of services. Obviously, we don't want to deploy services not used by a particular client, nor do we want to create tables needed only by those unused services. After some research, we settled on Maven Profiles, which allow activating and deactivating different dependencies within the POM based on the active profile.

Consider the following example:
The app contains three services (I wish only 3:)): S1, S2, S3, and a shared model M. Client A needs M, S1 and S2, while Client B needs M, S2 and S3. Originally, the bootstrap program depended on all four components and created database objects for all of them. So, in order not to pollute Client A's environment with S3's unused DB objects, we introduced two Maven Profiles.

Default dependency (profile): M, S2
Profile-Client-A: S1 dependency
Profile-Client-B: S3 dependency
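A rough sketch of what the profiles section of the bootstrap module's pom.xml could look like (the com.example groupId and the version numbers are hypothetical placeholders; M and S2 stay in the regular dependencies section):

```xml
<profiles>
  <profile>
    <id>Profile-Client-A</id>
    <dependencies>
      <dependency>
        <groupId>com.example</groupId> <!-- hypothetical coordinates -->
        <artifactId>S1</artifactId>
        <version>1.0.0</version>
      </dependency>
    </dependencies>
  </profile>
  <profile>
    <id>Profile-Client-B</id>
    <dependencies>
      <dependency>
        <groupId>com.example</groupId>
        <artifactId>S3</artifactId>
        <version>1.0.0</version>
      </dependency>
    </dependencies>
  </profile>
</profiles>
```

Building with mvn install -P Profile-Client-A then adds only S1 on top of the default M and S2 dependencies.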


Each client has its own Eclipse workspace that contains only the components relevant to that client. In each workspace, the same bootstrap project is configured with the corresponding profile as active. The same is done in the corresponding Hudson build jobs: the build for Client A uses the mvn -P Profile-Client-A setting, which “pulls in” only the relevant dependencies.

By leveraging Maven Profiles we're now able to remove all irrelevant dependencies for each of our clients. The workspaces look much cleaner, especially since in reality we have many more than just 3 services…

Tuesday, August 9, 2011

Caffeinated Coder – Short Circuiting Your Body

In the 16 years of my professional career, if there is one thing common to all the companies where I have worked, it's certainly the high amount of coffee consumed by programmers and other engineers. Understandably, developers need frequent energy boosts to stay focused and productive in this demanding and challenging line of work.

I was doing the same thing – drinking 3-4 cups of coffee per day – and was considered a “light” coffee drinker compared to other coffee addicts who consumed 6+ cups a day. Then, with age, I developed IBS, which, in my case, prevented me from drinking coffee. After I stopped, I felt extremely low on energy and had serious difficulty concentrating and working with any degree of intensity. It was much like any other addiction: when you stop, it hurts.

Then I discovered mid-day exercise. The building where I work has an excellent gym, and I decided to try it at lunchtime. I went there to work on my back pain, but, to my greatest surprise, I discovered that doing intense exercise during the day gave me more energy than any amount of coffee I used to consume. After coming back from the gym, I'm full of energy, don't need any coffee or tea, can concentrate, and generally feel significantly better.

It almost feels as if by consuming coffee we short-circuit our brain into feeling energetic. Of course, this is unscientific and speculative, but, based on my own anecdotal evidence, I can recommend at least trying to substitute some of the caffeine with a healthier alternative like mid-day exercise. The same, by the way, applies to using caffeine as a substitute for decent sleep. Yes, it's possible to get a temporary boost, but I can't imagine that in the long run this is sustainable or healthy.

Friday, July 15, 2011

To shard, or not to shard: that is the question

When dealing with very high volumes of data, one usually needs to decide how the system is going to scale when the data grows beyond a reasonable size. In my current project we’re working with tables that reach 1B records.  I’m talking about MySQL 5.5.

After initial load-testing and profiling we had our first “oh, shit” moment. When tables hold hundreds of millions of records, the regular “laws of physics” don't always apply. So, we desperately started looking for ways to alleviate these problems.

Clearly, the most scalable method is sharding. Sharding is usually implemented at the application level in such a way that the application knows on which node a particular piece of data resides. A common example is placing users 1-1M on shard 1, users 1M-2M on shard 2, etc. This method allows for theoretically unlimited amounts of data.
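The routing logic behind range-based sharding can be sketched as follows (a hypothetical illustration, not our actual code; the class name and the 1M-users-per-shard size are made up):

```java
// Minimal sketch of application-level shard routing:
// users 1..1M live on shard 0, 1M+1..2M on shard 1, etc.
class ShardRouter {
    private static final long USERS_PER_SHARD = 1_000_000L;

    // A query that carries a user id has a routing key:
    // it can be sent straight to the one node that owns that range.
    static int shardFor(long userId) {
        if (userId < 1) {
            throw new IllegalArgumentException("invalid user id: " + userId);
        }
        return (int) ((userId - 1) / USERS_PER_SHARD);
    }

    // A query without a user id has no routing key and must
    // fan out to every shard (the painful case discussed below).
    static int[] allShards(int shardCount) {
        int[] shards = new int[shardCount];
        for (int i = 0; i < shardCount; i++) {
            shards[i] = i;
        }
        return shards;
    }
}
```

With this scheme, adding capacity means adding nodes and assigning new ranges to them, which is what makes the approach scale to theoretically unlimited data volumes.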

Now, the bad part. Implementing sharding is pretty complex and time-consuming. Most importantly, sharding forces breaking up the data by certain criteria (like user_id). All queries that specify the user are OK, because we know where to go. But queries that don't specify user IDs must be executed on ALL shards. That is a major pain point. Of course, you can come up with a smart parallel execution strategy, but then you're effectively entering the realm of distributed-database programming.

There are some ready alternatives: ScaleDB and Gizzard are examples of NewSQL – layers that reside between the application and multiple MySQL nodes and know where to execute the queries. Then, of course, there is a plethora of NoSQL solutions capable of distributing the data: Cassandra, MongoDB, etc. In our case, the increased complexity of introducing new systems into the project would outweigh their benefits.

So, what's a desperate developer to do in such a case? Well, sometimes it's possible to fall back on time-tested ways of reducing table size by means of archival. Archival does not mean the archived data is inaccessible; it might just be accessed from another table. Also, it sometimes helps to partition the table at the MySQL level. While this falls far short of the benefits that full sharding would yield, it does help with the performance of queries on the partitioning column.
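As an illustration, MySQL range partitioning might look like this (the events table and its columns are hypothetical; note that MySQL requires the partitioning column to be part of every unique key, hence the composite primary key):

```sql
-- Hypothetical table, range-partitioned by year of creation
CREATE TABLE events (
  id      BIGINT NOT NULL,
  user_id BIGINT NOT NULL,
  created DATE   NOT NULL,
  PRIMARY KEY (id, created)  -- must include the partitioning column
)
PARTITION BY RANGE (YEAR(created)) (
  PARTITION p2009 VALUES LESS THAN (2010),
  PARTITION p2010 VALUES LESS THAN (2011),
  PARTITION pmax  VALUES LESS THAN MAXVALUE
);
```

Queries that filter on created can then be pruned to a single partition instead of scanning the whole table.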

The conclusion is that while sharding is a very powerful and scalable technique for dealing with large data volumes, it involves a lot of complexity and side effects. A careful examination of alternatives is in order. Be it NewSQL, NoSQL, or plain OldSQL (archival, DB-level partitioning), it's worthwhile to carefully analyze the tradeoffs of each solution.

Tuesday, June 14, 2011

Agile Change and Rework Waste

One of the main benefits of Agile methodologies is adaptability to change. Changes occur in all projects, Agile or not, but Agile projects are better equipped to handle them through short iterations and ongoing adaptation to new requirements. The customer does not need to specify everything upfront. As the system is being developed, the customer can refine it with the knowledge obtained from previous iterations.

Now, there is a difference between refining the product, re-prioritizing certain features and redefining unimplemented features on the one hand, and discarding the implementation of already developed features on the other. Discarding often occurs because “now I see it differently” or because “that's not what I wanted”. Granted, as people discover more about the product, their needs and requirements change. However, the proper way to address this is to implement the minimal, agreed-upon functionality first, and then “grow” it according to the evolving understanding of the system.

Unfortunately, people often confuse iterative and incremental development. Iterative development means that a team develops the bare bones of a feature and then grows it. Incremental development means that one whole feature is implemented after another whole feature. In that case, if the implementation is wrong, then most of the effort is wasted, and the feature must be reworked. This is not Agile change management – this is waste, and it should be rigorously eliminated (think Lean).

The alternative to wasting effort on implementing misunderstood or incorrectly specified functionality is not to make customers sign the requirements in blood. It is to structure the implementation so that the least debatable, most clearly understood and agreed-upon functionality is implemented first. If there are areas likely to undergo a significant amount of change, it's worthwhile to prototype them before committing to a full-fledged implementation. Changing prototypes is a lot cheaper than changing an actual implementation.

For example, say you need to implement user management functionality in the back-office part of your application. You could try to specify all the fields that are required, you could make guesses, or you could torture your customer into divulging the secrets of the users' information. Or, alternatively, you could just implement a very basic set of properties (first/last name, address, birth date) in the initial iteration, and later “grow” the user data as requested by the customer.

Furthermore, it is in both the customer's and the provider's best interests to think the requirements through before starting their implementation. Incomplete and incorrect specs cannot be miraculously cured by Agile methodologies. What's incorrectly specified will be incorrectly implemented. So spend the time analyzing and planning what will be implemented, and how, during the upcoming iteration or release.