Monday, November 23, 2009

A real-world example of integration that needs work.

I've been thinking about integration solutions from the user's perspective, since it turns out I use a fair number of them on a day-to-day basis. In particular, I've been thinking a lot about the importance of finding a good solution: one that not only produces correct results but is also efficient and pleasant to use. I have the...um...pleasure (I use the term very loosely!) of using a rather sub-par integration solution on a daily basis at work.

We receive large amounts of data about applicants in digital format every day. The program (which I will simply call "Solution X") can import data in a wide variety of formats, such as non-delimited text in a flat file, XML, or delimited text. All of the data inputs for which we use Solution X are non-delimited flat-file formats; I think there are three we use regularly. Solution X is rather extensible in that you can set up a different profile for each format. I haven't done this part myself, but I'm told you don't need any particular level of expertise and you don't even need to be a programmer (though I'm told it helps). The data ends up in an organization-wide Oracle database.
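
To give a sense of what one of these profiles has to capture, here's a purely hypothetical sketch. I've never seen Solution X's actual profile format, so the field names and column offsets below are invented for illustration:

    # Hypothetical illustration only: a fixed-width "profile" as a map of
    # field names to column ranges in the flat file. Solution X's real
    # profile format is unknown to me.
    PROFILE = {
        "last_name":  (0, 20),     # columns 0-19
        "first_name": (20, 40),    # columns 20-39
        "dob":        (40, 48),    # columns 40-47, e.g. YYYYMMDD
    }

    def parse_line(line, profile=PROFILE):
        """Slice one fixed-width line into a field dictionary."""
        return {field: line[a:b].strip() for field, (a, b) in profile.items()}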

The program examines the data in two "passes", the first of which is almost completely automated: the user loads the program, loads the appropriate profile, loads the file, and starts the first pass. The execution time for each record varies with the complexity of the file format and the resources of the computer (a desktop PC) running the program. If there are no errors, the total execution time for the first pass is simply the number of records in the file times the time per record. If there are errors, the user may have to dismiss a dialog box, which can stall the first pass.

Once the pass is done, each record is in one of three states: matched, new, or unmatched. A matched record was determined to belong to an existing applicant, and the input data has been added to that applicant's file. A new record was determined not to belong to any existing applicant, so a new applicant record was created. An unmatched record couldn't be automatically determined to be either a match or new, so the user must make the final decision. The second pass through the file loads each unmatched record and gives the user the opportunity to decide whether the potential matches are in fact matches.
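
To make those three outcomes concrete, here's a minimal sketch of what a first-pass decision might look like. The field names, the candidate lookup, and the matching rule are all my invention; Solution X's actual logic is opaque to me as a user:

    # Hypothetical first-pass decision for one record. The fields and the
    # matching rule are invented; they stand in for whatever Solution X does.
    def classify(record, find_candidates):
        """Return 'matched', 'new', or 'unmatched' for one input record."""
        candidates = find_candidates(record)   # e.g. a query against the database
        if not candidates:
            return "new"                       # nobody resembles this applicant
        exact = [c for c in candidates
                 if (c["last_name"], c["dob"]) == (record["last_name"], record["dob"])]
        if len(exact) == 1:
            return "matched"                   # unambiguous: load automatically
        return "unmatched"                     # ambiguous: a human decides in pass two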

So far so good. Well, sort of.

First we have some practical problems. One is that the program is not very forgiving when the input data changes: even very small differences can require fairly significant work, either by IT staff to change the profile or by the user to manually alter the input data. Unfortunately, because of project priorities, we end up manually changing the input data on a daily basis.

Another problem is that errors during the first pass can easily halt the entire process, which is a major issue when very large files are received. For example, yesterday evening we received a file with about 1,200 records that takes, on average, about 30 seconds per record on the first pass: 1,200 × 30 seconds is 36,000 seconds, or 10 hours, if there are no errors. I will often start a first pass on a large file at the end of the day, and sometimes I come back in the morning to find that the program has been displaying an error dialog since record 23.
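
The same back-of-the-envelope estimate, as a trivial script (the record count and per-record timing are just the ones from the example above):

    # Back-of-the-envelope first-pass estimate for an error-free run.
    records = 1200              # records in the file
    seconds_per_record = 30     # observed average on a desktop PC
    total_hours = records * seconds_per_record / 3600
    print(f"{total_hours:.1f} hours")   # 10.0 hours, assuming no dialog stalls the run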

Also difficult is the fact that a record being flagged as unmatched doesn't necessarily mean it couldn't be matched or determined to be new. It could be that there was some problem with the input data that is enough to prevent loading but not enough to warrant informing the user. In those cases the user has to dig through cryptic log files and fix the problem manually.

As you can see, Solution X is functional. It does what it needs to do, albeit with some effort, and perhaps irritation, on the user's part.

The bigger problem, at least in my opinion, is that Solution X was obviously not designed for processing large numbers of files or records, or even with efficiency in mind! You cannot, for example, have two people working on the same input file at the same time, nor can you run multiple instances of the program at once (e.g. to work on two different types of files simultaneously). The only way to reduce that 10-hour first-pass time is to manually split the input file and then run separate instances of the program on different computers. This means that to get the first pass of that file done by the end of the day we need to dedicate at least two computers to the job.
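
Even the manual workaround is at least scriptable. Here's a rough sketch, assuming each record occupies one line of the flat file (the real record layout may differ, and the file name is made up):

    # Rough sketch of the manual workaround: split a flat file into N chunks
    # so each machine can run a first pass on its own chunk in parallel.
    # Assumes one record per line; real Solution X inputs may differ.
    def split_file(path, n_chunks):
        with open(path) as f:
            lines = f.readlines()
        size = -(-len(lines) // n_chunks)          # ceiling division
        for i in range(n_chunks):
            with open(f"{path}.part{i}", "w") as out:
                out.writelines(lines[i * size:(i + 1) * size])

    split_file("applicants.dat", 2)                # hypothetical file name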

The biggest problem is that everything is started by the user. The user must load the program, the profile, and the file, and then start the process. Because the program can get stuck rather easily, the user has to check in just to make sure that the "automatic" part is even running! For volumes like this, it seems we need a solution that runs as a daemon and takes care of the "easy" stuff by itself...

The following is what I think Solution X should have been to begin with. Let's call it Solution Z. Solution Z is a two-part program: a server-side daemon and a front-end GUI application. New data is automatically retrieved, decrypted, and placed into a drop folder, which the daemon continually monitors. Sometime in the next minute the daemon automatically loads the file and initiates the matching/loading process. As with Solution X, records that are clearly matches or clearly new are loaded automatically. Records that are flagged as unmatched, or that have malformed data, are placed in a queue.

The user can load the GUI app to pull unmatched or malformed records from the queue as time permits. For each record pulled, the user will make the matching decision and/or correct the malformed data, and the record is then resubmitted to the daemon for loading. The use of queues for the error corrections means that multiple users can work on a single input batch at the same time, and the server-side daemon allows for greater automation and parallelism in the processing of an input file. For high-priority records you could load the GUI app right away and pull from the queue frequently, so the whole batch is completed as soon as possible; for lower-priority records you could let the queue build up.

If the daemon runs as a job on a server (as opposed to running the "server" process on a desktop workstation), you would likely also see a significant decrease in processing time per record. And even if that doesn't change appreciably, you would almost certainly benefit from the fact that you no longer have to wait for a human to start the "automatic" first pass.
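
To make the shape of Solution Z concrete, here's a minimal sketch of the daemon's main loop. Everything in it is invented for illustration: the folder names, the one-record-per-line parser, and the toy classifier are stand-ins, and a real system would want a persistent, shared queue (a database table, say) instead of an in-memory one:

    # Minimal sketch of the Solution Z daemon: poll a drop folder, run the
    # automatic pass on each new file, and queue whatever needs human review.
    # All names and helpers here are hypothetical stand-ins.
    import os
    import queue
    import shutil
    import time

    DROP_DIR, DONE_DIR = "drop", "done"   # made-up folder names
    review_queue = queue.Queue()          # real system: persistent, shared queue

    def parse_records(path):
        """Stub parser: one record per line (real flat-file layouts differ)."""
        with open(path) as f:
            for line in f:
                yield line.rstrip("\n")

    def classify(record):
        """Toy classifier standing in for the real matching logic."""
        return "unmatched" if "?" in record else "matched"

    def load(record, state):
        """Stub loader standing in for the write to the Oracle database."""
        print(f"loaded ({state}): {record}")

    def main():
        os.makedirs(DROP_DIR, exist_ok=True)
        os.makedirs(DONE_DIR, exist_ok=True)
        while True:                                    # the daemon never exits
            for name in sorted(os.listdir(DROP_DIR)):
                src = os.path.join(DROP_DIR, name)
                for record in parse_records(src):
                    state = classify(record)
                    if state in ("matched", "new"):
                        load(record, state)            # the automatic pass
                    else:
                        review_queue.put(record)       # GUI clients pull from here
                shutil.move(src, os.path.join(DONE_DIR, name))   # don't reprocess
            time.sleep(60)                             # "sometime in the next minute"

    if __name__ == "__main__":
        main()

The GUI side would be the mirror image: pull a record from the shared queue, let the user decide or fix it, and resubmit it to the daemon for loading.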

Thoughts?

2 comments:

  1. Peter,

    Though the post is long, you have explained your experience in very simple terms and it is easy to read without much effort.

    I have a doubt. First of all, is Solution Z easy to develop? Also, you have provided so many ideas for achieving Solution Z as a single procedure. Is it easy to implement?

    And also, manual work is often essential, so I think we have to accept it... it is unavoidable.

    I think we have to Google more to discover better solutions for this kind of application...

  2. Asha,

    Ya, that was kind of a long one! ;-)

    I agree that my proposed "ideal" solution wouldn't be the simplest thing to implement, but it is certainly doable and I would argue that it is definitely worth the time and cost involved. I actually have had the opportunity to discuss some other possible solutions with some of our IT staff and one of the options has many of the features I described, so I'm hopeful that we will get a better (though probably not ideal!) solution in place sometime relatively soon...

    This is definitely one of those situations where manual work will be unavoidable, since the raw data we receive is self-reported and often hand-entered at the source. It is amazing how many people mistype things like their date of birth! The problem I have with the current system is that it is downright wasteful of the user's time.

    Thanks for the feedback!
