Numbers are everywhere, and most of us in auditing spend our time trying to make sense of them.

“Accounts are just a presentation of business numbers.”

We use various models, methods, and procedures to audit and make sense of these numbers, and Benford’s Law has a special place among them. Specialized auditing software such as ACL, IDEA, and TeamMate includes built-in tests based on Benford’s Law. What seems to be lacking, however, is accessible theory explaining how this law can be used in auditing. In this series of blog posts I will explain Benford’s Law and its related tests. I will also show how you can perform these tests directly in Excel with a few manual steps, so you won’t need sophisticated and costly software to start using this amazing law.

**History related to Benford’s Law**

Benford’s Law is named after Frank Benford, a physicist at the General Electric Research Center in Schenectady, New York. Benford had 20 patents issued to his name that were assigned to General Electric, and he was the author of over 100 papers on light and matters related to optics. His digits paper dealt with his hobby, which was mathematics. Benford’s patents have long since expired, but the digits paper written as a hobby lives on, with 1,000 published book chapters, articles, and papers on Benford’s Law.

The Law of Anomalous Numbers paper (Benford, 1938) begins with a note that in a book of logarithm tables, the pages giving the logarithms of numbers with low first digits (1 and 2) show more stains and wear than those giving the logarithms of numbers with high first digits (8 and 9). Benford speculated that this was because more of the numbers used (or “in existence”) had low first digits. This observation was the starting point for Benford’s Law.

**Overview of Benford’s Law**

Benford’s Law describes the frequency distribution of leading digits in many real-life sets of numerical data. It states that in many naturally occurring collections of numbers, the leading significant digit is likely to be small. For example, India’s population is about 1.252 billion: here the first digit is 1, the second digit is 2, and so on.
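To make the idea of first and second digits concrete, here is a minimal Python sketch of digit extraction. The helper name `leading_digits` is hypothetical, and padding short numbers with zeros (so that 5 is read as 5.0) is an assumption made for illustration.

```python
from decimal import Decimal


def leading_digits(value, count=1):
    """Return the first `count` significant digits of a non-zero number as an int."""
    # Going through str() and Decimal exposes the decimal digits directly;
    # for a non-zero value the digit tuple never starts with a zero.
    digits = Decimal(str(abs(value))).as_tuple().digits
    if not any(digits):
        raise ValueError("zero (or a non-finite value) has no leading digit")
    # Assumption: pad with zeros so that 5 is treated as 5.0 when count=2.
    padded = list(digits) + [0] * count
    return int("".join(str(d) for d in padded[:count]))


population = 1_252_000_000  # India's population, 1.252 billion

print(leading_digits(population))          # first digit: 1
print(leading_digits(population, 2) % 10)  # second digit: 2
print(leading_digits(population, 2))       # first-two digits: 12
```

The same helper works for decimals and negative amounts, since only the significant digits of the absolute value are used.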

Thus, in sets that obey Benford’s Law, the digit 1 appears as the most significant digit about 30% of the time, while 9 appears as the most significant digit less than 5% of the time. By contrast, if the nine possible leading digits were distributed uniformly, each would occur about 11.1% of the time. Benford’s Law also makes (different) predictions about the distribution of second digits, third digits, digit combinations, and so on. [source: Wikipedia]

Benford derived the expected frequencies of the digits in lists of numbers, and these frequencies are now known as Benford’s Law.

**Mathematical Formula for Benford’s Law**

The formulas for the digit frequencies are shown next, with D1 representing the first digit, D2 the second digit, and D1D2 the first two digits of a number.
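Written out in standard notation, the three formulas are as follows. The lower-case symbols $d_1$ and $d_2$ (the values the digits take) are notation assumed here for illustration, and the two-digit number formed by $d_1$ followed by $d_2$ is written $10 d_1 + d_2$ to avoid confusing it with a product.

$$
\mathrm{Prob}(D_1 = d_1) = \log_{10}\left(1 + \frac{1}{d_1}\right), \qquad d_1 \in \{1, 2, \dots, 9\}
$$

$$
\mathrm{Prob}(D_2 = d_2) = \sum_{d_1 = 1}^{9} \log_{10}\left(1 + \frac{1}{10 d_1 + d_2}\right), \qquad d_2 \in \{0, 1, \dots, 9\}
$$

$$
\mathrm{Prob}(D_1 D_2 = d_1 d_2) = \log_{10}\left(1 + \frac{1}{10 d_1 + d_2}\right), \qquad 10 d_1 + d_2 \in \{10, 11, \dots, 99\}
$$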

Here Prob indicates the probability of observing the event in parentheses. The first formula gives the first-digit proportions, the second gives the second-digit proportions, and the last gives the first-two-digit proportions.

Using the above equations, the proportions for the digits in the first, second, third, and fourth positions are calculated and shown in the table below:
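The expected proportions can be reproduced with a short script. The sketch below is one way to do it in Python, not necessarily how any audit package computes them: the function name `digit_probability` is hypothetical, and for positions after the first it sums the logarithm term over every possible block of preceding digits, which generalises the second-digit formula.

```python
import math


def digit_probability(position, digit):
    """Expected Benford proportion of `digit` at `position` (1 = first digit)."""
    if position == 1:
        # A leading significant digit is never 0.
        return math.log10(1 + 1 / digit) if digit else 0.0
    # Sum over every possible block of preceding digits:
    # 1..9 for position 2, 10..99 for position 3, 100..999 for position 4.
    start, stop = 10 ** (position - 2), 10 ** (position - 1)
    return sum(
        math.log10(1 + 1 / (10 * prefix + digit)) for prefix in range(start, stop)
    )


print("Digit  1st      2nd      3rd      4th")
for digit in range(10):
    row = "  ".join(f"{digit_probability(pos, digit):.5f}" for pos in (1, 2, 3, 4))
    print(f"{digit:>5}  {row}")
```

Each column adds up to 1, and the proportions flatten out towards 10% per digit as the position moves to the right.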

**What type of data is expected to conform to Benford’s Law**

Applying this law to just any data set is not useful, because it will give misleading results. There are certain characteristics we should look for in a data set before applying the law. These are listed below:

- The data set should have at least 1,000 records before we can expect good conformity.
- The data should form a near-perfect geometric sequence.
- The records should represent the sizes of facts or events.
- There should be no built-in minimum or maximum values for the data, except perhaps for a minimum of 0 for data that can only be made up of positive numbers.
- The records should not be numbers used as identification numbers or labels.
- There should be more small records than large records in the data set.

Not all of these characteristics are mandatory, but they are preferable in any data set to which we apply this law.
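Some of the characteristics listed above can be screened roughly before any digit test is run. The following Python sketch is illustrative only: the function name `screen_for_benford` is hypothetical, and the two proxies it uses (mean above median for "more small records than large", and the spread in powers of ten for "no built-in minimum or maximum") are assumptions, not established tests. Whether records are sizes of events rather than labels still needs the auditor's judgement.

```python
import math
import statistics


def screen_for_benford(values, min_records=1000):
    """Rough, heuristic pre-checks before running a Benford test."""
    # Assumption: only strictly positive amounts are screened here.
    amounts = [v for v in values if v > 0]
    if not amounts:
        raise ValueError("no positive values to screen")
    return {
        "record_count": len(amounts),
        "enough_records": len(amounts) >= min_records,
        # Assumed proxy for "more small records than large": mean above median.
        "more_small_than_large": statistics.mean(amounts) > statistics.median(amounts),
        # Assumed proxy for "no built-in minimum or maximum": a wide spread,
        # measured in powers of ten between the smallest and largest value.
        "orders_of_magnitude": math.log10(max(amounts) / min(amounts)),
    }


# Illustrative data only: a geometric sequence growing 1% per step.
sample = [1.01 ** i for i in range(1500)]
print(screen_for_benford(sample))
```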

For data that is expected to conform, a weak fit to Benford’s Law is a red flag: there is a high risk that the data contains abnormal duplications and anomalies. That is why assessing whether the data should conform to Benford’s Law is a necessary first step.
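To give a feel for what a strong or weak fit looks like, here is a minimal Python sketch that compares observed first-digit proportions with the expected ones. The mean absolute deviation is used as the summary measure purely as an illustrative choice, and both data sets are made up for the example: a geometric sequence spanning five powers of ten, and a run of sequential numbers standing in for identification labels.

```python
import math
from collections import Counter
from decimal import Decimal

# Expected first-digit proportions under Benford's Law.
EXPECTED = {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def first_digit(value):
    """First significant digit of a non-zero number."""
    return Decimal(str(abs(value))).as_tuple().digits[0]


def first_digit_proportions(values):
    """Observed share of each first digit 1..9, ignoring zeros."""
    digits = [first_digit(v) for v in values if v != 0]
    if not digits:
        raise ValueError("no non-zero values to analyse")
    counts = Counter(digits)
    return {d: counts[d] / len(digits) for d in range(1, 10)}


def mean_absolute_deviation(values):
    """Average gap between observed and expected first-digit proportions."""
    observed = first_digit_proportions(values)
    return sum(abs(observed[d] - EXPECTED[d]) for d in EXPECTED) / len(EXPECTED)


# Made-up data: a geometric sequence versus sequential label-like numbers.
geometric = [10 ** (i / 1000) for i in range(5000)]
labels = list(range(1000, 6000))

print(f"geometric sequence: {mean_absolute_deviation(geometric):.4f}")
print(f"sequential labels:  {mean_absolute_deviation(labels):.4f}")
```

The geometric sequence sits very close to the expected proportions, while the sequential numbers deviate strongly, which is why labels and identification numbers should not be tested against the law.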

**Benford’s Law test**

We will look at these tests in more detail in the next post.