Technology & industry

Thursday, October 4, 2012

Data exchange format for enterprise application integration – XML, EDI and JSON?

The design of message exchange between modules of an application, and across applications, is often an afterthought. In my experience, this results in severe limitations on the functionality, performance and scalability of the applications. In this note I will discuss the relative merits of formatting data exchange through XML, JSON and EDI-like flat file formats.

The choice of data format depends on multiple factors. Among the most important are the size and structure of the message.

Consider a relational database for Orders. It will have multiple tables related to Order header and order lines. It will also have tables for items, suppliers, customers, addresses and currencies that will get referenced from Order header and order lines.

Technically it is possible to design a structure, in all three formats, that allows you to send the complete database in a single huge message.

Size of message depends on its content. Size of message has impact on communication throughput, processing throughput, error handling and requirements for disk and memory resources.

Creating such a large message is cumbersome but not very difficult. The real issue arises when a recipient attempts to receive, log, parse and consume the message.

Let us consider issues related to size & structure of message in more detail.

Communication time: Messages are exchanged serially. Considering network overheads, it may take several seconds for a large message to travel from source to destination, and this communication time increases with message size. Any intermittent disruption may require the source to resend the whole message to the destination. Network compression can partially mitigate this issue.

Logging: If traceability, retransmission and non-repudiation are a requirement, both source and destination systems will need to log the messages. Writing and reading large messages further reduces the throughput and also requires significant disk space.

Parsing: This aspect of integration deserves careful consideration. If your application needs to understand the complete message before it can take any action, it has to parse the complete message and build an object in memory. Large messages require more memory and time to parse. Standard DOM (Document Object Model) parsing of XML requires significant memory. Parsing the equivalent JSON message typically requires less memory, but the complete message still has to be loaded. One reason an application has to parse the complete message is a message structure that does not enforce a specific order in which elements appear. You can partially avoid parsing the complete message by enforcing such an order.
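The difference between DOM parsing and sequential parsing can be sketched in Python with the standard library. This is only an illustration under my own assumptions: the sample message, the customer names and the two function names are made up for this example.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical two-order message, for illustration only.
SAMPLE = b"""<ORDERS>
  <ORDER><HEADER><SHIP_TO>Customer A</SHIP_TO></HEADER></ORDER>
  <ORDER><HEADER><SHIP_TO>Customer B</SHIP_TO></HEADER></ORDER>
</ORDERS>"""


def ship_to_dom(data: bytes) -> list:
    """DOM style: build the whole tree in memory, then query it."""
    root = ET.fromstring(data)
    return [elem.text for elem in root.iter("SHIP_TO")]


def ship_to_streaming(data: bytes) -> list:
    """Streaming style: handle each ORDER when its end tag is seen."""
    names = []
    for _event, elem in ET.iterparse(io.BytesIO(data), events=("end",)):
        if elem.tag == "ORDER":
            names.append(elem.findtext("HEADER/SHIP_TO"))
            elem.clear()  # release this order's children once it is handled
    return names


dom_names = ship_to_dom(SAMPLE)
stream_names = ship_to_streaming(SAMPLE)
```

Both functions return the same names; the streaming version never needs the full content of every order in memory at the same time.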

Exception handling: As stated, you can transmit the content of a complete database (multiple orders, in the example above) in a single message. If your receiving application handles each message as a single transaction, then a single failure will require reprocessing of the complete message.
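One way around this is to treat each order inside the message as its own unit of work. The following Python sketch shows the idea; the function names, the order layout and the validation rule are hypothetical and only serve the illustration.

```python
def process_orders(orders, handle_order):
    """Process each order independently and collect the ones that fail."""
    failed = []
    for order in orders:
        try:
            handle_order(order)  # one unit of work per order
        except Exception as exc:
            failed.append((order, str(exc)))
    return failed


def demo_handler(order):
    # Hypothetical validation rule, for illustration only.
    if order["qty"] <= 0:
        raise ValueError("quantity must be positive")


orders = [{"id": 1, "qty": 10}, {"id": 2, "qty": 0}, {"id": 3, "qty": 5}]
failed = process_orders(orders, demo_handler)
```

Here only the second order ends up in the failed list, so only that order needs to be corrected and resent.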

Let us now evaluate relative merits of three formats – XML, JSON and EDI/TEXT.

XML is a well-established and popular standard. It is self-describing, open and extensible. Extensibility allows you to add elements to a message with limited programming impact. Out-of-the-box parsing libraries exist in almost all popular languages. Well-formed XML does not by itself enforce the sequence in which elements appear; ordering is guaranteed only if both parties agree on a schema and validate against it. Without such a guarantee this becomes a major weakness, as the receiving application needs to parse the complete message to find the elements it is interested in, which may require multiple passes through the message. One option is to parse the complete message at once into a structure (the Document Object Model, or DOM) that can be queried later for the elements of interest. However, DOM parsing requires significant memory and is overkill if you need only a few elements from the message. The self-describing structure of XML is its other major weakness. XML uses start and end tags around each data element. While these descriptive tags are useful for a human reader, they add to the overall size of the message.

JSON is a newer message format. It is also open, extensible and self-describing. As this format is derived from JavaScript object literal syntax and maps directly onto in-memory objects, arrays and values, parsing it typically takes less time than parsing the equivalent XML. However, as JSON also does not enforce an element sequence, you still need to load the complete message. JSON does not require end tags. Hence a JSON message is typically 30% to 40% smaller than the corresponding XML message. Parsing libraries exist for Java and for most other popular languages.

I am using the term "EDI-like flat file format" for a generic, variable-length, text-based message structure in which each line of the message corresponds to an individual record of data. The first few characters of each line, called the record identifier, identify the record type. Elements within each line (record) are separated by an element separator. This message format is the most compact and can be 60% to 70% smaller than the corresponding XML. It is obviously not self-describing, so source and recipients need to share a previously agreed definition of the message structure. Parsing libraries exist for standard EDI, but you may need to write custom parsers for custom message formats. As formats are mutually agreed between senders and receivers, record and element sequences are usually enforced as part of that agreement. This guaranteed sequencing allows extremely large messages to be parsed sequentially, using limited system resources, at significantly higher throughput. As messages are not self-describing, programmers need to be careful before changing formats.
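To show why guaranteed sequencing makes sequential parsing cheap, here is a minimal Python sketch of a line-by-line reader. The asterisk separator, the record identifiers (B, H, L, E) and the function name are my assumptions for this illustration; a real format would follow whatever the sender and receiver have agreed.

```python
import io
from typing import Iterator, List, TextIO, Tuple


def read_records(stream: TextIO, sep: str = "*") -> Iterator[Tuple[str, List[str]]]:
    """Yield (record_id, elements) one line at a time."""
    for line in stream:
        line = line.strip()
        if not line:
            continue
        if line.startswith(sep):
            line = line[len(sep):]  # drop the leading separator
        fields = line.split(sep)
        yield fields[0], fields[1:]


# Hypothetical message; only one line is held in memory at a time.
message = io.StringIO(
    "*B*ORD*\n*H*Customer A*Atlanta, GA\n*L*Pen*10\n*L*Tablet*05\n*E*ORD\n"
)
total_qty = 0
for record_id, elements in read_records(message):
    if record_id == "L":
        total_qty += int(elements[1])  # second element of a line record is the quantity
```

Because the reader is a generator, the same loop works on a file of any size without loading it completely.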

I will now show the same data in XML, JSON and EDI-like formats.

XML: size – 341 characters

<ORDERS>
  <ORDER>
    <HEADER>
      <SHIP_TO>Don Trump</SHIP_TO>
      <SHIP_TO_ADDRESS>Atlanta, GA</SHIP_TO_ADDRESS>
      <ORDER_DATE>April 21 2012</ORDER_DATE>
      <PAYMENT_INFO>XYZ BANK</PAYMENT_INFO>
    </HEADER>
    <LINES>
      <LINE>
        <ITEM>Pen</ITEM>
        <QTY>10</QTY>
      </LINE>
      <LINE>
        <ITEM>Tablet</ITEM>
        <QTY>05</QTY>
      </LINE>
      <LINE>
        <ITEM>Laptop</ITEM>
        <QTY>100</QTY>
      </LINE>
    </LINES>
  </ORDER>
</ORDERS>

JSON: size – 241 characters

{
  "ORDERS": {
    "ORDER": {
      "HEADER": {
        "SHIP_TO": "Don Trump",
        "SHIP_TO_ADDRESS": "Atlanta, GA",
        "ORDER_DATE": "April 21 2012",
        "PAYMENT_INFO": "XYZ BANK"
      },
      "LINES": {
        "LINE": [
          { "ITEM": "Pen", "QTY": "10" },
          { "ITEM": "Tablet", "QTY": "05" },
          { "ITEM": "Laptop", "QTY": "100" }
        ]
      }
    }
  }
}

EDI: size – 88 characters

*B*ORD*
*H*Don Trump*Atlanta, GA*April 21 2012*XYZ Bank
*L*Pen*10
*L*Tablet*05
*L*Laptop*100
*E*ORD

You must have noticed that the EDI-like format is the most compact. The following table summarizes the pros and cons of the three formats.

	XML	JSON	EDI like Flat File
Size (scale of 100 to 1)	100	70	25
Can enforce element sequencing	No	No	Yes
Standard-based	Yes	Yes	No
Availability of parsers	Best	Java and most other popular languages	Limited, custom parsers may be needed
Ease of parsing huge messages	Need custom parsers		Easiest

Even though the EDI-like format requires more programming, I would recommend using it as much as possible. If another application or module can handle only XML or JSON, you can implement translators that convert your EDI-like format to XML, JSON or any other open standard format.
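Such a translator can be quite small. The Python sketch below converts the sample order from the EDI-like layout into JSON. The field names are borrowed from the XML sample, the output structure is simplified compared with the JSON sample, and the function name and record handling are my assumptions rather than any standard.

```python
import json

# Element names per record type, in the agreed order (taken from the XML sample).
HEADER_FIELDS = ["SHIP_TO", "SHIP_TO_ADDRESS", "ORDER_DATE", "PAYMENT_INFO"]
LINE_FIELDS = ["ITEM", "QTY"]


def edi_to_json(text: str, sep: str = "*") -> str:
    """Translate one EDI-like order into a JSON document."""
    header, lines = {}, []
    for raw in text.splitlines():
        raw = raw.strip()
        if not raw:
            continue
        record_id, *elements = raw.lstrip(sep).split(sep)
        if record_id == "H":
            header = dict(zip(HEADER_FIELDS, elements))
        elif record_id == "L":
            lines.append(dict(zip(LINE_FIELDS, elements)))
        # "B" and "E" records only mark the start and end of the order here.
    return json.dumps({"ORDER": {"HEADER": header, "LINES": lines}}, indent=2)


sample = "\n".join([
    "*B*ORD*",
    "*H*Don Trump*Atlanta, GA*April 21 2012*XYZ Bank",
    "*L*Pen*10",
    "*L*Tablet*05",
    "*L*Laptop*100",
    "*E*ORD",
])
result = edi_to_json(sample)
```

The translator lives at the boundary, so the internal modules can keep exchanging the compact format.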

Note: you can use compression to reduce the size of your text messages by roughly another 80%, depending on content. I will write another note on message compression.
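If you want to measure the effect on your own messages, a quick check with Python's standard zlib module could look like the sketch below. The message is synthetic and highly repetitive, so it compresses far better than real data would; treat the ratio it produces as an illustration only.

```python
import zlib

# Synthetic message: one begin record, 1000 identical line records, one end record.
line = "*L*Pen*10\n"
message = ("*B*ORD*\n" + line * 1000 + "*E*ORD\n").encode("utf-8")

compressed = zlib.compress(message)
ratio = len(compressed) / len(message)  # fraction of the original size
restored = zlib.decompress(compressed)  # the round trip must give back the original

print(f"original={len(message)} bytes, compressed={len(compressed)} bytes, ratio={ratio:.3f}")
```

Run it against a sample of your real traffic before relying on any particular percentage.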

Tuesday, October 2, 2012

Understanding DVA (Debt Value Adjustment)

Recently there has been a steady stream of news reports on debt value adjustment (DVA) and the possibility of FASB getting rid of this concept.

What is DVA?

To explain the concept of DVA in layman's terms, assume that company A has issued long-term bonds with a face value of $100. Now suppose the credit rating of company A drops and, as a result, the fair market value of the bond falls to $70. The concept of DVA allows company A to recognize the difference ($100 – $70 = $30) as revenue.

The concept of DVA was introduced by FASB in 2007 after intense lobbying by financial companies, such as Merrill, Morgan Stanley, Goldman Sachs and Citigroup, which wrote letters to FASB arguing that it wasn’t fair to make them mark their assets to market value if they couldn’t also mark their liabilities. Their argument was that a firm’s liability on the day of reporting is lower because it can buy back its debt instruments from the market at a reduced price. (ref: Wall Street Says -2 + -2 = 4 as Liabilities Get New Bond Math)

Detractors of DVA find it contrary to the principle of conservatism, which states that an accountant must choose the reporting alternative that results in lower net income and/or lower asset values. The argument against DVA is that if the firm attempts to buy back its instruments,

1. It will need to borrow money at a higher rate in line with its reduced credit rating, and

2. Bond holders may decide not to sell their instruments, and instead hold them to maturity and wait for full payment.

How to identify DVA in annual report of a company

Unfortunately DVA is not shown on the balance sheet as a line item. You will need to read the full annual report and search for terms like “DVA” or “Fair market value of liabilities”. For example, the following is an extract from page 55 of Bank of America Corporation’s (Ticker: BAC) annual report for 2011:

[“Net income decreased $3.3 billion to $3.0 billion in 2011 primarily driven by a decline of $4.2 billion in sales and trading revenue. The decrease in sales and trading revenue was due to a challenging market environment, partially offset by DVA gains, net of hedges. In 2011, DVA gains, net of hedges, were $1.0 billion compared to $262 million in 2010 due to the widening of our credit spreads.”]

BAC’s net income, inclusive of DVA, for 2011 and 2010 was $2.967B and $6.297B respectively. If you strip out the DVA gains ($1.0B and $0.262B), the income numbers for the two years are really $1.967B and $6.035B!
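The adjustment is a simple subtraction. A small Python sketch, using only the figures quoted above (in millions of dollars, with the $1.0 billion DVA gain taken as exactly 1,000):

```python
# Figures in millions of dollars, taken from the text above.
reported_net_income = {2011: 2967, 2010: 6297}
dva_gain = {2011: 1000, 2010: 262}

# Net income with the DVA gain stripped out, per year.
ex_dva = {year: reported_net_income[year] - dva_gain[year] for year in reported_net_income}

for year, value in sorted(ex_dva.items()):
    print(f"{year}: ${value / 1000:.3f}B excluding DVA")
```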

Today most commentators agree that DVA rule must go. I fully agree.

I look forward to your feedback, comments and suggestions on this post.

Friday, September 28, 2012

Detecting Fraudulent Financial Reporting

Recently I came across a very interesting paper on the detection of fraud in financial reporting. This paper, “Major Financial Reporting Frauds of the 21st Century: Corporate Governance and Risk Lessons Learned”, authored by Hugh Grove and Elisabetta Basilico, was published in the Journal of Forensic & Investigative Accounting (Vol. 3, Issue 2, Special Issue, 2011).

The authors start by explaining 10 red flags in corporate governance. Their premise is that anyone watching these flags or factors could have detected several large corporate accounting scandals. They discuss nine accounting frauds (Citigroup, WorldCom, Enron, Qwest, Tyco, Global Crossing, Lehman Brothers, Satyam and Parmalat) to illustrate their idea.

I liked the very user-friendly tone of this article, which doesn’t assume prior knowledge of complex accounting ideas. However, the best part of the article is its appendix, in which the authors provide a very readable and concise summary of important accounting metrics and ratios.

When I read it, this paper was available at following link.

http://www.bus.lsu.edu/accounting/faculty/lcrumbley/jfia/Articles/FullText/2011_v3n2a7.pdf

I look forward to your feedback, suggestions and comments.

Thursday, September 27, 2012

Evaluating financial statements–few important metrics

I have burnt my fingers a few times through blind reliance on analyst reports or my own intuition. Today I would like to discuss a few financial metrics that I have found of immense help in navigating the maze of financial numbers and analyst reports. These metrics allow me to view financial numbers in perspective and identify areas that need further exploration.

Before I delve further into what I know so far, I must caution that these metrics are empirical & should be used only for guidance. They should not be treated as an absolute statement on a stock’s worthiness.

The first metric that I find very useful is the Beneish M-Score. This number, devised by Professor Messod Beneish, indicates the likelihood of earnings manipulation. In its larger form it is a weighted sum of eight factors (called indexes) that are derived by comparing past and current financial statements. Another version uses five of those indexes. Professor Beneish’s analysis showed that there is a high probability of manipulation in financial statements if the M-Score is greater than -2.22. If the analysis raises an alert, one should read the financial statements with much caution and dig further into each of the 8 factors. It is also important to repeat the analysis over several years to identify a pattern and to check for any possibility of financial manipulation in the past. Further details of the 8 factors are available from multiple sources on the internet.
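For readers who prefer code to prose, here is a minimal Python sketch of the eight-variable M-Score. The coefficients are the commonly published ones for the eight-variable model; please verify them against Professor Beneish's paper before relying on them. The function names and the example inputs are my own, and the -2.22 cut-off is the one mentioned above.

```python
def beneish_m_score(dsri, gmi, aqi, sgi, depi, sgai, lvgi, tata):
    """Eight-variable M-Score from the eight indexes.

    dsri: days sales in receivables index   gmi: gross margin index
    aqi: asset quality index                sgi: sales growth index
    depi: depreciation index                sgai: SG&A expense index
    lvgi: leverage index                    tata: total accruals to total assets
    """
    return (-4.84 + 0.920 * dsri + 0.528 * gmi + 0.404 * aqi + 0.892 * sgi
            + 0.115 * depi - 0.172 * sgai + 4.679 * tata - 0.327 * lvgi)


def raises_alert(m_score, threshold=-2.22):
    """True when the score is above the cut-off used in this post."""
    return m_score > threshold


# Hypothetical company: nothing changed year over year, no accruals.
neutral = beneish_m_score(1, 1, 1, 1, 1, 1, 1, 0)
# Same company, but with accruals equal to 10% of total assets.
accrual_heavy = beneish_m_score(1, 1, 1, 1, 1, 1, 1, 0.10)
```

With unchanged indexes the score sits below the cut-off; raising only the accruals term is enough to push it above, which is why each factor deserves a separate look when an alert is raised.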

The second metric that I like is the Altman Z-score. This score measures the financial health of a company and indicates the probability that a firm will go into bankruptcy within two years. The score depends on four or five business ratios, and there are a few variants of the Altman Z formula for different industry segments. In general, a score above 3 indicates that a company is unlikely to enter bankruptcy. A score below 1.8 indicates a high likelihood of financial distress within the next two years.
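As an illustration, the sketch below computes the original five-ratio variant of the Z-score (the one for publicly traded manufacturing companies) in Python. The function names and the balance sheet numbers are hypothetical, and the zone boundaries are the rounded ones from the paragraph above; other variants use different weights.

```python
def altman_z(working_capital, retained_earnings, ebit, market_value_equity,
             sales, total_assets, total_liabilities):
    """Original five-ratio Altman Z-score."""
    x1 = working_capital / total_assets
    x2 = retained_earnings / total_assets
    x3 = ebit / total_assets
    x4 = market_value_equity / total_liabilities
    x5 = sales / total_assets
    return 1.2 * x1 + 1.4 * x2 + 3.3 * x3 + 0.6 * x4 + 1.0 * x5


def zone(z):
    """Classify a score using the rounded boundaries from this post."""
    if z > 3:
        return "bankruptcy unlikely"
    if z < 1.8:
        return "high likelihood of distress"
    return "grey zone"


# Hypothetical company, all figures in the same currency unit.
z = altman_z(working_capital=200, retained_earnings=300, ebit=150,
             market_value_equity=800, sales=1200,
             total_assets=1000, total_liabilities=500)
```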

The third metric that I would like to discuss is the Piotroski F-score. This score measures the relative financial health of firms. It is based on 9 factors. A score of 0 or 1 is assigned to each of these factors, and the total of all 9 factors is the F-score for the firm. For example, net income is one of the Piotroski factors. It gets a score of 1 if net income for that year is positive. A score of 7 or more indicates a financially strong company.
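The scoring itself is just a count of the factors that pass. A minimal Python sketch follows; the signal names are my own shorthand for the nine commonly cited Piotroski criteria, and the example company is made up.

```python
# My shorthand for the nine commonly cited criteria; each is True (1) or False (0).
SIGNALS = [
    "positive_net_income",
    "positive_operating_cash_flow",
    "return_on_assets_improved",
    "cash_flow_exceeds_net_income",
    "leverage_decreased",
    "current_ratio_improved",
    "no_new_shares_issued",
    "gross_margin_improved",
    "asset_turnover_improved",
]


def piotroski_f_score(signals):
    """Add one point for every criterion the firm satisfies."""
    return sum(1 for name in SIGNALS if signals[name])


# Hypothetical company that fails two of the nine criteria.
company = {name: True for name in SIGNALS}
company["no_new_shares_issued"] = False
company["gross_margin_improved"] = False

f_score = piotroski_f_score(company)
is_strong = f_score >= 7  # threshold from the paragraph above
```

Keeping the individual signals around, rather than only the total, makes it easy to see which factors indicate distress.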

The three metrics described above should be used during preliminary analysis to identify areas that need deeper investigation. One should look closely at the factors within these scores that indicate any kind of distress. Ideally, the scores should be calculated over several years. Coupled with other financial ratios, they are helpful in weeding out risky companies.

In the near future I will share the spreadsheet tool that I created for my own use. Till then, arrivederci!

PS: The following is the result of an analysis I did on September 27, 2012.

Wednesday, September 26, 2012

My favorite web sites on investing and finance

I routinely struggle with my savings and investments. A few months ago I found a few excellent sources of information that have helped me navigate the maze of confusing financial statements. I would like to share these sites with whoever is reading this blog. I hope that you will also benefit. If you know of other sites, please feel free to share that information.

1. Grumpy Old Accountants - http://blogs.smeal.psu.edu/grumpyoldaccountants/ - I can’t stop praising this site. Professor Anthony H. Catanach and Professor J. Edward Ketz do an excellent job of explaining the nuances of financial statements. Check out the series of posts on Groupon. Also look for a recent post on intangible assets.

2. Seeking Alpha - http://seekingalpha.com - This site is a complete portal for financial information. You should subscribe to their newsletters. Check out an excellent article on “Dividend paying companies” at http://seekingalpha.com/article/836971-not-all-dividends-are-worth-it?source=yahoo.

3. Old School Value - http://www.oldschoolvalue.com - This site, owned by Jae Jun, is another excellent source of education for newbie investors. His newsletters contain excellent, actionable information. Jae Jun also provides several free stock evaluation spreadsheets. For more serious and professional investors there is a paid, upgraded version of the tools. The best section of this site is the blog at http://www.oldschoolvalue.com/blog/. While you are there, look for the post on financial ratios.

That is all for today. I will share more of what I have learnt over the next few posts. Till then, ciao!

Wednesday, May 6, 2009

Hermes - JMS browser - Configuring for WAS JMS Provider

I finally managed to configure Hermes with the WAS embedded JMS provider.

First I defined a classpath group named WAS and added the following WAS jar files:
sibc.jms.jar
sibc.jndi.jar
sibc.orb.jar
idl-6.1.0.1.jar

Then I created a context with Loader=WAS and the other properties (provider URL, initial context factory, credentials, principal).
Next I added a user properties file with the following line:
com.ibm.CORBA.ORBInit=com.ibm.ws.sib.client.ORB
Finally I added a new session named WAS with these settings:
session=WAS, Loader=WAS, Class=hermes.JNDIQueueConnectionFactory, binding=jms.connection.OFRQueueFactory, initial context factory=com.ibm.websphere.naming.WsnInitialContextFactory, user properties file=

and it worked like magic!!

Tuesday, July 1, 2008

webMethods - some useful internal services - 1

webMethods has added a service, "wm.server.admin:getDiagnosticData". You can run this service from the Administrator UI. It generates a zip file that contains all diagnostic information as well as the server logs.

Pretty useful stuff!!

I plan to add details on the following additional services. I would appreciate it if someone could help me create this content.
wm.server.query:getServiceStats
wm.server.soap:registerProcessor
wm.server.soap:unregisterProcessor
wm.server.ui:addMenu
wm.server.ui:addSolution
wm.server.ui:getMenuTabs
wm.server.ui:getMenus
wm.server.ui:mainMenu
wm.server.ui:removeMenu
wm.server.ui:removeSolution