Wednesday, August 06, 2014

AKKA - Parallel processing

Got a chance to use AKKA in one of my recent projects; it was a great experience building a high-throughput, parallel processing system with great possibilities.
What does it try to solve?

  1. AKKA simplifies writing parallel processing code; all the user needs to learn is how to implement an Actor (the core programming model of AKKA).
  2. AKKA simplifies scale-out with a remote-JVM-based clustering mechanism. Actors can run in parallel not just in one JVM but across multiple JVMs, letting your application scale out.
  3. AKKA provides effective fault tolerance: reinitializing an Actor on failure, monitoring with a supervisor Actor, etc.
  4. AKKA also comes with a built-in queue/mailbox per actor. This ensures messages are queued in case of processing delays, and the mailbox can be configured for durability (file-based vs. a simple LinkedBlockingQueue).
  5. AKKA provides different flavors of routing actors (router, load-balancing, round-robin, etc.).
  6. AKKA also provides support for scheduled jobs.
  7. AKKA simplifies thread-pool configuration, with a well-defined application.conf to define thread pools, fork-join pools, etc.
  8. AKKA supports both the Scala & Java platforms.
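The per-actor mailbox idea in point 4 can be sketched in plain Java. This is not AKKA itself, just my illustration of the pattern: each actor owns a queue and a single worker thread drains it, so messages are processed one at a time even when many senders run in parallel.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Consumer;

// A minimal actor-style mailbox in plain Java (illustration only, not AKKA):
// messages land in a LinkedBlockingQueue and a single worker thread applies
// the actor's behavior to them one at a time.
public class MiniActor<M> {
    private final BlockingQueue<M> mailbox = new LinkedBlockingQueue<>();
    private final Thread worker;

    public MiniActor(Consumer<M> behavior) {
        worker = new Thread(() -> {
            try {
                while (true) {
                    behavior.accept(mailbox.take()); // one message at a time
                }
            } catch (InterruptedException stopped) {
                // interrupted: stop draining the mailbox
            }
        });
        worker.setDaemon(true);
        worker.start();
    }

    public void tell(M message) {
        mailbox.add(message); // never blocks the sender
    }

    public void stop() {
        worker.interrupt();
    }
}
```

Usage: `new MiniActor<String>(System.out::println).tell("hello")` returns immediately while the actor processes the message asynchronously.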

AKKA performance benchmark numbers...

Extensions of AKKA

If you are building a standalone application and wish to have an HTTP endpoint, spray is your best bet. It is an HTTP layer for a standalone JVM; I liked the simplicity of its usage and the power it brings by adding an HTTP endpoint to a high-throughput processing system. It makes it easy to build monitoring and health services on the nodes, and feels like it can open up great possibilities.

While I haven't taken a deeper dive into it yet, I will keep you posted as I uncover more...

Wednesday, June 18, 2014

An API Economy (B2B Service Platform)

With more and more businesses wanting to expose their services across different channels, API service platforms have evolved into a big need of the hour. Top players in the market:
  1. Apigee
  2. Mashery
  3. 3Scale
  4. APISpark by Restlet

What do they offer?
  1. API usage metering/billing
  2. API traffic control
  3. API authentication/authorization
  4. Performance & high scalability
  5. Logging

Do I use any of them in my projects? Well, I haven't. But they seem to bridge a niche area, enabling content providers to expose content and build revenue around the API.
No wonder API is one of the most used, abused & powerful terms in software :)

Wednesday, June 04, 2014

Do you have an entrepreneur in you ?

Yes! But "* conditions apply"
  • Do you believe world around you can’t be the same forever, and you have a role to play ?
  • Do you “Drive your team crazy, but still they love to work for you “ ?
  • Do you get too obsessed & focused on problems & people think you lack the ability to move on ?
  • Do you have the ability to set the compass right & see beyond the obvious for your team ?
  • Do you have the greatest “Attention to Details” ?
  • Do you believe money is yet another byproduct & what you do matters the most ?

Tuesday, January 07, 2014

An Overwhelming Era of Database

Guess this era is significant in terms of the number of databases that have made their way into enterprise data centers.

1. Column Based
Column-based databases are not different from an RDBMS from a structural point of view, but they do allow columns to be created on the fly, unlike an RDBMS (the schema is a bit relaxed). If you define a table with 2 columns & store a record with 3 columns, you will see three columns inserted even though the 3rd column wasn't defined.

Solutions: Cassandra, etc.

* Partially schemaless.
* No full ACID transaction support.
* Supports CQL (a syntax close to SQL).

2. Graph or Hierarchy Based
A graph database stores data in the form of vertices & edges; vertices are connected by edges. In a typical scenario where you want to model highly relational data, a graph database fits in well.

Ex: Company -> employee -> family -> etc.

* Works with multiple storage options; Titan, for example, works well with Cassandra.
* Very effective in navigating from one node to another; you can visualize it like navigating from friend to friend on a social network.

Solutions: TitanDB, Neo4J, etc.

3. Document Based
The data is stored as a document; most such databases support the JSON format. It's easy to serialize & de-serialize when you store data as a document.

Solutions: MongoDB, CouchDB etc.

4. Key-Value Store
This category is a filler for everything else that claims to be NoSQL; data is stored as simple key-value pairs.

Solutions: Redis, MemcacheDB etc.

5. RDBMS++ & --
This is a new trend, I could say a marketing trend, where people claim to provide the goods of both the RDBMS & NoSQL worlds. Ex: ACID properties, security & so on.

Solutions: NeoDB, etc.

The more the choices, the more complicated it is to choose the right fit.

I am no exception :) Good luck with it.

Check it out on DZone.

Big Data & NOSQL Buzz…

After working for a few years with NoSQL databases & so-called Big Data projects, I realized people use NoSQL as a synonym for Big Data. It's true: most people out there believe using a NoSQL database makes a project a Big Data project.

I thought I'd add my bit to the mess by defining Big Data :)

Some suggest it's the 3 V's that matter:

1. Volume - the size of the data set (terabytes, petabytes & so on...).
2. Velocity - the rate at which your data grows (1 GB per day, etc.).
3. Variety - how varied your dataset is: is it dynamic, without standard structural elements?

So what ? 

1. RDBMSs & existing infrastructure have been handling this problem for a while now, probably with some higher licensing cost, haven't they?
2. But what about variety of data? An RDBMS is much more structural & may not work well for unstructured data.
3. Are ACID properties an overhead? Maybe, maybe not, but that's the cost you pay in performance.

Assume that your system falls into one of the above categories; what next?

1. What do you want to do with the data?
2. What are the current pain points in business analytics?
3. How real-time is your analytics expected to be?

Business analytics is core for any business, and some of the traditional business analytics tools aren't that real-time, considering their limited ability to do parallel processing & the licensing cost involved. I do agree this is a kind of motivation to use technologies like Hadoop: open source & running on low-cost commodity hardware, unlike the proprietary solutions.

I could only describe Big Data as a filler for the existing problems in the Business Analytics & Intelligence space.

1. It's critical to have parallelism when it comes to generating BI jobs/reports & so on.
2. It's critical to be able to handle huge Volume, Velocity & Variety of data for real-time analytics.
3. And, obviously, there is the cost involved with traditional BI solutions.

I tend to compare it to the way we normalize databases for real-time transactions & de-normalize them for business analytics (remember star schema!). Big Data architecture puts the analytics hat on right from the ingestion of data, with less to worry about offline de-normalization etc.

Oops! You're right, I started off saying I would be "defining Big Data". I am lost, maybe intentionally, on a quest to see what's in play for 2014 in the Big Data space.

Happy new year to you all !

Friday, February 08, 2013

REST-JSON Service Definition Simplified...

In continuation of my previous post on REST-JSON service versioning, here are my thoughts on service definition for REST-JSON.
Problem Statement:
As a service provider, one is obliged to document the service contract (request/response) of the services exposed to consumers.
If you take the SOAP/web-service world as an analogy, the WSDL acts as the contract for the services exposed. This article is an attempt to see how a JSON schema can be exposed via generic services/wrappers.
Note: I have chosen Spring MVC + the Jackson library as the software stack for the solution approach.
1.  Let's take the example of a service below that returns an employee object, given an employee Id.
public class TestController {
 @RequestMapping(value="/employee", method=RequestMethod.GET)
 public ResponseEntity<Employee> fetchEmployee() {
  Employee employee = new Employee();
  return new ResponseEntity<Employee>(employee, HttpStatus.OK);
 }
}
2.  The Employee response object could look something like this,
public class Employee {
 private int employeeId;
 private String name;
 private long primaryNumber;

 public int getEmployeeId() {
  return employeeId;
 }

 public String getName() {
  return name;
 }

 public long getPrimaryNumber() {
  return primaryNumber;
 }
}
Based on the problem statement, the solution proposed below seamlessly discovers the REST-JSON services exposed in your application.
1.  To do so, I have introduced an implementation/overridden class on top of JSON serialize/de-serialize:

/**
 * Extending DiscoverableService means the exposed service is enabled for auto
 * discovery.
 */
public abstract class DiscoverableService {

 /**
  * @return information about all exposed services
  */
 public List<ServiceInfo> getAllExposedServiceInfo() {
  . . . .
 }
}
2.  Now any new service developed should extend DiscoverableService, to mark itself as an exposed service for service discovery.

public class TestController extends DiscoverableService {
 @RequestMapping(value="/employee", method=RequestMethod.GET)
 public ResponseEntity<Employee> fetchEmployee() {
  Employee employee = new Employee();
  return new ResponseEntity<Employee>(employee, HttpStatus.OK);
 }
}
3.  I have a controller/service exposed for the service discovery information. Spring auto-wires all the controllers in the context that extend DiscoverableService into this controller class, as shown below.

public class ServiceController {

 @Autowired
 protected List<DiscoverableService> discoverableServices;

 @RequestMapping(value = SERVICE_NAME, method = RequestMethod.GET)
 public synchronized ResponseEntity<List<ServiceInfo>> fetchAllServiceInfo() {
  List<ServiceInfo> serviceInfos = new ArrayList<ServiceInfo>();
  for (DiscoverableService service : discoverableServices) {
   serviceInfos.addAll(service.getAllExposedServiceInfo());
  }
  return new ResponseEntity<List<ServiceInfo>>(serviceInfos, HttpStatus.OK);
 }
}
. . . .
4.  Now comes the simple & final step: access your service information.
http://localhost:8080/web-context/service/employee will return the service definition for the fetchEmployee service:

{
 "serviceURI": "/employee",
 "requestMethod": ["GET"],
 "methodName": "class com.sample.controller.TestController.fetchEmployee()",
 "request": [],
 "response": {
  "employeeId": 0,
  "name": null,
  "primaryNumber": 0
 }
}

5.  http://localhost:8080/web-context/service - returns the definition of all services exposed by the system.
With the above approach, exposing the service/contract definition to consumers is seamless. Just sharing the service definition with the service consumer does the trick.

Friday, February 01, 2013

REST-JSON Service Versioning – Simple approach

Problem Statement:

As the provider of a service, one is obliged to maintain different versions of that service. One of the possible reasons could be that existing consumers of the service don't want to upgrade to a newer version.

Given the above problem statement, the service provider needs to manage & maintain multiple execution paths and request & response structures/objects.

Note: I have chosen Spring MVC + the Jackson library as the software stack for the solution approach.


1.  Service that returns an employee object, given an employee Id.

@RequestMapping(value="/employee/{id}", method=RequestMethod.GET)
public ResponseEntity<Employee> fetchEmployee(@PathVariable int id) {
 // Business operation/search; the employee object is returned
 return new ResponseEntity<Employee>(new Employee(), HttpStatus.OK);
}
2.  The Employee response object could look something like this,

public class Employee {
 private int employeeId;
 private String name;
 private long primaryNumber;

 public int getEmployeeId() {
  return employeeId;
 }
 public String getName() {
  return name;
 }
 public long getPrimaryNumber() {
  return primaryNumber;
 }
}
3.  Assume the above service (/employee/{id}) is version 1.0 & you are expected to add a secondary number to the employee response object, but only for a specific set of consumers. This would mean the service has two versions (1.0 & 2.0): 1.0 without the secondary number & 2.0 with the secondary number as part of the response.

4.  The typical implementation is to write another service & expose that to the new consumer; this also means duplication of request/response objects, in this case the Employee class.


Based on the problem statement, the solution proposed below introduces seamless versioning of your REST services over JSON.

I have introduced a new annotation to indicate all the supported versions of a REST service.

1.  A service that needs to be version-aware will be annotated with the custom annotation, as shown below.

@RequestMapping(value="/employee/{id}", method=RequestMethod.GET)
public ResponseEntity<Employee> fetchEmployee(@PathVariable int id) {
 // Business operation/search; the employee object is returned
 return new ResponseEntity<Employee>(new Employee(), HttpStatus.OK);
}
2.  That's just one part of it; how do you control the request/response object by version?
I have introduced custom annotations, @Include & @Exclude, to indicate whether a property in the JSON object needs to be included for a given version.

@Include (versions="1.0,..") - Include property for given versions of the service
@Exclude (versions="1.0,..") - Exclude property for given versions of the service

public class Employee {
 private int employeeId;
 private String name;
 private long primaryNumber;
 private long secondaryNumber;

 @Include(versions="2.0")
 public long getSecondaryNumber() {
  return secondaryNumber;
 }
 public int getEmployeeId() {
  return employeeId;
 }
 public String getName() {
  return name;
 }
 public long getPrimaryNumber() {
  return primaryNumber;
 }
}
3.  The other part of the solution is to override JsonSerialize & JsonDeserialize to read the annotations above, interpret the versions & form the request/response accordingly.

4.  The consumer now has the responsibility to communicate the version of interest, which will be passed as a request header attribute: "version: 1.0", "version: 2.0", etc.
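To make the idea concrete, here is a minimal plain-Java sketch of version-aware property filtering. The @Include annotation name comes from the post, but the reflection-based filter (and the VersionFilter/serialize names, plus the demo Employee bean) are my own illustration, not the actual Jackson serializer override:

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

public class VersionFilter {

    // Sketch of the custom annotation from the post: a property is emitted
    // only for the listed versions; absence means "emit for every version".
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    public @interface Include {
        String versions();
    }

    // Walk the bean's getters and keep a property if it has no @Include,
    // or if @Include lists the version the consumer requested.
    public static Map<String, Object> serialize(Object bean, String version)
            throws Exception {
        Map<String, Object> out = new LinkedHashMap<>();
        for (Method m : bean.getClass().getMethods()) {
            boolean getter = m.getName().startsWith("get")
                    && m.getName().length() > 3
                    && m.getParameterCount() == 0
                    && m.getDeclaringClass() != Object.class;
            if (!getter) {
                continue;
            }
            Include inc = m.getAnnotation(Include.class);
            if (inc == null
                    || Arrays.asList(inc.versions().split(",")).contains(version)) {
                // getEmployeeId -> employeeId
                String prop = Character.toLowerCase(m.getName().charAt(3))
                        + m.getName().substring(4);
                out.put(prop, m.invoke(bean));
            }
        }
        return out;
    }

    // A tiny demo bean (hypothetical values) mirroring the Employee above.
    public static class Employee {
        public int getEmployeeId() { return 7; }
        @Include(versions = "2.0")
        public long getSecondaryNumber() { return 555L; }
    }
}
```

For the Employee above, requesting version 1.0 omits secondaryNumber, while version 2.0 includes it.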


I have a working solution that I am thinking of open-sourcing in a week or two; this article is more to get feedback on the problem statement & your thoughts on my solution approach.

A similar approach can be used for REST services over XML (with Java bindings), with some overrides of the XML serialization & de-serialization.

Thursday, October 11, 2012

Impact of Product Back Log on Agile adoption...

Product Back Log ?

A product backlog is a prioritized feature list, containing details of all functionality desired in a product as indicated by the customer/product manager.

In my past few years of experience in agile development, some of the most critical & most debated topics in retrospectives are:

  1. "No stories that are ready to play?"
  2. "The team burn rate is high; the product backlog has been exhausted."
  3. "The BA-to-developer ratio doesn't match."
  4. "Couldn't get time from business users/product owners."

The intent of this post is not to address the above issues, since they depend on context & vary from project to project. All I want to bring up is how big a disaster a non-steady product backlog is for a project/organization adopting an agile process model.

So, what is the impact of a non-steady product backlog?

  1. In one iteration the team "delivered 20 points & in another just 5" - why?
  2. Average velocity goes for a toss, hmm :(
  3. Business/management uses the average velocity to promise more delivery, and it is already a skewed number :(
  4. Developers end up spending time on tech debt, since there are no user stories. Meaning more cost/burn rate for less churn of business value.
  5. Arriving at project/organization-level metrics would be a nightmare :(
  6. It also affects the development team's effectiveness & ability to deliver, as the fluctuation is too high.

What has been your experience? Your comments are most welcome...

Friday, October 05, 2012

Hadoop - Ecosystem

Think of maintaining 100s or 1000s of instances of Hadoop infrastructure. Here are some of the ecosystem platforms that help with the management, deployment & monitoring of a Hadoop infrastructure.

  1. Cloudera Enterprise
  2. Hortonworks Data Platform
  3. MapR

Thursday, October 04, 2012

Hadoop - Single Node Setup

Wanted to try Hadoop for a long time; got a chance to take a test drive today. Primed a fresh CentOS 6 Linux VM on my Windows box (thanks to VMware Player) & set up Hadoop with the help of the instructions below.

Everything went well, until this error was thrown on ./

2012-10-04 19:18:13,900 ERROR org.apache.hadoop.hdfs.server.namenode.NameNode: java.lang.IllegalArgumentException: Does not contain a valid host:port authority: file:///
    at org.apache.hadoop.hdfs.server.namenode.NameNode.getAddress(
    at org.apache.hadoop.hdfs.server.namenode.NameNode.getAddress(
    at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(
    at org.apache.hadoop.hdfs.server.namenode.NameNode.(
    at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(
    at org.apache.hadoop.hdfs.server.namenode.NameNode.main(

The same issue has been reported before.

The command "strace -fe open" tells you exactly where the configuration is getting loaded from.


Even though I configured core-site.xml, mapred-site.xml & hdfs-site.xml under the /usr/local/hadoop/conf/ folder, by default the system was referring to /etc/hadoop/*.xml. Once I updated the configuration files in the /etc/hadoop location, everything started working.
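For reference, the missing piece in my case was the NameNode address in core-site.xml; the file:/// default is exactly what triggers the error above. A minimal sketch (host & port below are illustrative, adjust to your setup):

```xml
<!-- core-site.xml: point fs.default.name at the NameNode host:port -->
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
```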

Very exciting. Now that the environment is ready, I have to try some samples & step into a multi-node cluster setup.

Friday, September 28, 2012

Maven Vs Apache Ivy + Ant ?

It's obvious that any project needs a build file...

Would you choose Maven, or Ant + Ivy?

Let's understand: what are the general expectations of a build library/solution?

  1. Dependency management of libraries - meaning no need to keep the libraries/jars checked in to your repository; it can be as simple as defining them in an external file. Ex: pom.xml / ivy.xml.
  2. Ability to build the application (.war, .ear, etc.) - meaning the ability to write scripts to assemble/package an application. Ex: build.xml (Ant), assembly.xml (Maven), etc.
To make an apples-to-apples comparison, Maven has to be compared with Ivy & Ant together, since Maven takes care of both expectations mentioned above.

Pros & Cons,

  1. With Maven, less scripting: put the files in the right folders as indicated by the Maven archetype.
  2. With Ant + Ivy, you still have to write your build file!!! The advantage is you can have a flexible folder structure & need not be restricted as with Maven archetypes.
I kind of like the Ant + Ivy combination, considering the flexibility of structure definition. But "with great power comes great responsibility": more scripting can cost you more, hmm.
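For illustration, dependency management with Ivy is just a small XML file checked in next to your build.xml (the module name & dependency below are made-up examples):

```xml
<ivy-module version="2.0">
  <info organisation="com.sample" module="my-app"/>
  <dependencies>
    <!-- jars are fetched from the repository at resolve time;
         nothing is checked in to source control -->
    <dependency org="junit" name="junit" rev="4.10"/>
  </dependencies>
</ivy-module>
```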

Think about it !!!

Comments & your experience most welcome !

Thursday, September 27, 2012

Is agile boon or bane ?

In my acquaintance with software engineering over the past 12 years, I have seen various software process models in practice. The most famous of the lot: Waterfall, Rational Unified Process (RUP) & Agile (now).

Rational Unified Process (RUP) was a long-sustaining one; it was a paradigm shift back then. The idea of being iterative & not waiting for each phase to complete reduced software turnaround & improved delivery efficiency.

RUP, even though an effective process model,
  1. Lacked early customer/business feedback
  2. Lacked shorter development cycles
  3. Failed to embrace Test-Driven Development
  4. Lacked the geekish/freakish/nerdish culture :)
Above all, it ignored embracing changes in requirements, which are seen as inevitable under volatile business conditions.

I know what you're thinking: isn't this article on "Is agile boon or bane?"

The above points are self-explanatory, the "boon" side of why the software industry jumped onto the bandwagon of the Agile development process!!!

If you are a CTO/Enterprise Architect, let me guess what's your next big roadmap.

Let me guess... it wasn't a hard guess :)

 Go ..... Agile !!!

Currently, I am in Agile development projects for a major client in San Francisco. It's been an interesting journey so far; coming from a RUP background, I can reflect on my past experience & how that's dealt with in the Agile development process.

Challenges in Adopting Agile (I am talking about real agile, not kind of...)

  1. Agile is an organizational mindset shift.
  2. Building an Agile team is even tougher (the right proportion of IQ vs. EQ members is key).
  3. TDD/BDD (now) go hand in hand with Agile; be prepared for it.
  4. The concept of QA is slowly fading away; developers are the QA (it's just about switching hats). Make sure your developers are mature enough to accept this very fact.
  5. Business is the key driver irrespective of the process model; make sure they have time for the Agile teams.
  6. Agile advocates a flattened hierarchy; make sure it's in your organization's interest.

Some questions dwelling in my mind (around Agile),

  1. Estimation:
    1. Story points are a kind of misrepresented, misunderstood, rather unclear topic to me :).
    2. By definition, story points are based on the value the feature adds from a business standpoint.
    3. Developers estimate them by the effort involved in development.
    4. Management uses them to forecast their projections, strange enough :).
  2. Tech/design/architecture debt:
    1. In traditional process models, software quality is part of the delivery. No customer would accept extra effort to clean up a mess :).
    2. It's different here in Agile: technical/design/architecture debt is a queue of things that gets prioritized &, most interestingly, you get paid for it. How cool....
I am still an amateur agile practitioner and didn't mean to hurt anyone or the process. These are the questions dwelling in my mind, my inner voice, open for debate :)


Monday, July 02, 2012

JQuery Theme Roller

In one of my recent projects, I happened to use jQuery, jQuery UI & the jQuery UI ThemeRoller.

jQuery is an exciting UI framework, with great ease of use, samples, documentation & user experience. jQuery UI provides you with great UI components such as accordion, tabs, datepicker, progress bar, slider & more. The jQuery UI ThemeRoller makes it easy to design your CSS & theme for the components. In a way it takes away the burden of user experience from a Java developer :).


Wednesday, March 28, 2012

"Catch the exception" - syndrome

Two exception-handling behaviors of developers annoy me time & again:

1. No exception propagation - catch the exception & just call e.printStackTrace(), irrespective of the depth of the call stack.
2. Suppressing the exception in the innermost methods & returning null.

Most of these mistakes occur project after project, irrespective of the amount of experience a developer has. I feel this happens for 2 reasons:

1. Not understanding the way a Java exception stack trace is built & the critical importance of exception propagation.
2. Messy, meaningless checked (compile-time) exceptions :) versus runtime exceptions (obviously a bigger topic to debate).
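A small sketch of the two anti-patterns against proper propagation (names & values are illustrative): wrapping & rethrowing keeps the original stack trace reachable via getCause(), while swallowing or returning null loses it for good.

```java
public class ExceptionPropagation {

    // Anti-pattern: swallow the exception and return null; the caller
    // has no idea anything went wrong, and the cause is lost forever.
    static Integer parseQuietly(String s) {
        try {
            return Integer.valueOf(s);
        } catch (NumberFormatException e) {
            return null;
        }
    }

    // Better: wrap and rethrow; the caller sees a meaningful message and
    // getCause() still carries the full original stack trace.
    static Integer parse(String s) {
        try {
            return Integer.valueOf(s);
        } catch (NumberFormatException e) {
            throw new IllegalStateException("Invalid value: " + s, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(parseQuietly("oops")); // null: the failure is hidden
        try {
            parse("oops");
        } catch (IllegalStateException e) {
            // the original exception is preserved as the cause
            System.out.println(e.getCause().getClass().getSimpleName()); // NumberFormatException
        }
    }
}
```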

Monday, March 05, 2012

My Ruby On Rails Notes

My Ruby On Rails Notes,

1. Download/install Ruby 1.9.3-p125
2. Download/install RubyGems - the package deployment tool
3. Add the RubyGems /bin directory to the operating system path.
4. Install Rails - cmd > gem install rails
5. Install & configure the DevKit
6. Follow the API documentation

Scaffolding is an interesting feature; it's quick to get your initial web template. Just did some hello-world samples & am still playing around...


Sunday, March 04, 2012

Come back

Wasn't able to catch up with my blog for a while; thought of stopping by & making it a point to stop by much more regularly. Lately I haven't had much chance to catch up with the ever-evolving technology stacks, but I want to work my way through NoSQL & distributed high-volume transactional systems (Hadoop, Cassandra, EC2, etc.). Will write in detail in my next blog about the things I have explored that could interest you.

Sunday, November 20, 2011

Cross-Platform Mobile development tools/frameworks

Was exploring some cross-platform mobile frameworks. Even though there are a lot of them in the market, a couple of open-source solutions sounded very impressive:
1. Rhomobile
2. PhoneGap

Tried some samples on Rhomobile; yet to try PhoneGap. In a way these frameworks save a lot of the time & cost involved in developing applications specific to each operating system/app market. Instead of working with a specific SDK, it's a no-brainer to choose such frameworks. Write once & port to any device type. How cool is that!!! Way to go...

Sunday, July 31, 2011

Garbage Cat - Analysis of your GC Log

Came across this handy utility (garbagecat) to analyze garbage collector logs. Simple to use, and the output contains a nice analysis report & recommendations on your GC.

Refer to the documentation for more details on the usage.

Sunday, June 12, 2011

My Test Drive on JBoss AS 6.0.0

Took JBoss 6.0.0 for a test drive this weekend; thought it would be nice to summarize my observations (with the background of using JBoss 4.3.0 for my current projects).

JBoss 6.x is a re-architecture of JBoss 4.x with a new kernel (JBoss Microcontainer). Good news for application developers: not much change to the folder structure, with the exception of a few new configuration files & some old ones that were moved around.

Highlights
  1. Default support for the Java EE 6.0 specification.
  2. JSF 2.0 support, with integration with Bean Validation (JSR-303).
  3. CDI (Contexts and Dependency Injection, JSR-299); Weld is the JBoss project supporting CDI.
  4. HornetQ is the default & recommended messaging infrastructure.
  5. New Apache CXF-based JBossWS stack.
  6. JBoss Embedded AS.
  7. New Admin Console to ease server administration.

On the new Admin Console:
  1. The new Admin Console is a refresher: user friendly, with tree-based navigation.
  2. It resembles the WebLogic Admin Console a lot :), never mind, it serves the purpose.
  3. All configuration parameters are persisted for good, unlike in the JBoss 4.* versions.
  4. The "Service Binding Manager" component externalizes all port configuration for a given/running profile. This saves a lot of time fiddling with the raw .xml files :) as in 4.3.0.
  5. Applications (.war, .ear) can be added/updated dynamically using the console.
  6. Queues & topics can be added dynamically using the console.
  7. Database connection pools can be added dynamically using the console.
  8. You have provision to restart/start/shut down the server using the admin console.
  9. Very detailed garbage collector / memory pool info on the console.
  10. The Metrics tab for every component provides useful statistics/metrics on the component. Ex: selecting a queue provides message count, consumer count, etc.

  1. The JMX-CONSOLE is still available, but with a new look & feel.

  1. JBoss VFS (Virtual File System) - help yourself by reading up on it to understand.
  2. JBoss Web is the default web server offering from JBoss 5.x onward, built on top of Tomcat.
  3. Felt the server startup & shutdown are slower than 4.3.0 GA :( hoping minor tweaks should take care of it.
  4. Couldn't find any documentation on migration from 4.3.0 to 6.0.0; will keep looking...
Jason Greene - What's new in JBoss AS 6.0.0
Admin Console user guide

Friday, June 10, 2011

Power of AccessLog & Tomcat Valve configuration

In the recent past we faced quite a few performance issues, resulting in different modes of collecting performance statistics from the web server. One of the handy tools was the web server access log; all I had to do was configure my web server to write access logs. It dumps all the request information (URL, timestamp & even the round-trip/response time of a request). How cool: with zero additional code you get the response time of each request.

The link below details the access log configuration on Tomcat; this could vary depending on the type of web server used, but every web server has an access log by default.

Of all the different types of valves supported by Tomcat, the most useful are the "Request Dumper" valve (dumps all request details) & the "Access Log" valve.

For example, the configuration below (%D %T) will log the time taken to process each request in milliseconds (%D) and in seconds (%T).

<Valve className="org.apache.catalina.valves.AccessLogValve" prefix="localhost_access_log." suffix=".txt" pattern="%D %T"/>

Lots of tools can help you analyze the access logs & provide more insight into usage, response times, etc.
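As a tiny do-it-yourself example (my own sketch, not one of those tools): with the "%D %T" pattern above, each log line starts with the response time in milliseconds, so averaging it takes only a few lines of code.

```java
import java.util.Arrays;
import java.util.List;

// Average the %D field (first token: response time in milliseconds) of
// access log lines written with the "%D %T" pattern.
public class AccessLogStats {
    public static double averageMillis(List<String> logLines) {
        return logLines.stream()
                .mapToLong(line -> Long.parseLong(line.split(" ")[0]))
                .average()
                .orElse(0.0);
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("120 0.120", "80 0.080");
        System.out.println(averageMillis(lines)); // 100.0
    }
}
```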

Conclusion: We all know it's mandatory for a web server to have an access log, but the good part is that it can greatly ease the performance study & tuning of your web application.
