... JAVA and DSP by Richard G Baldwin

Message Digests 101 using Java

Learn how to write Java code to implement the SHA-1 algorithm and how to produce message digests using that code. Also learn how to use Sun's MessageDigest class to produce message digests for both the SHA-1 and MD5 algorithms.

Published: March 8, 2005
By Richard G. Baldwin

Java Programming, Notes # 729

Preface
Background Information
Preview
Discussion and Sample Code
Run the Programs
Summary
What's Next
Complete Program Listings

Preface

Third in a series

This is the third lesson in a series designed to teach you something about the inner workings of cryptography using Java. The previous lesson was entitled Digital Signatures 101 using Java. Hopefully, when you finish studying the lessons in this series, you will have learned a little about what goes on "under the hood" when you apply a cryptographic process to a message.

Not a lesson on JCE

The lessons in this series do not provide instructions on how to use the Java Cryptography Extension (JCE). The purpose of this series is to teach you how to implement common cryptography and secure hash algorithms for small cases without the use of a cryptography API.

Not intended for production use

The programs that I will provide and explain in this series of lessons are not intended to be used for production cryptography. If you need to do cryptography using Java, you should use Sun's Java Cryptography Extension (JCE).

The programs that I will provide in this series are intended to help you to experiment with and to learn about various cryptographic and secure hash algorithms and to gain a better understanding of how they work, and why they do what they do.

Viewing tip

You may find it useful to open another copy of this lesson in a separate browser window. That will make it easier for you to scroll back and forth among the different figures and listings while you are reading about them.

Theoretical basis and practical implementation

I will provide some of the theoretical basis for cryptographic and secure hash algorithms while discussing the lessons in this series. In addition, I will show you how to implement several common cryptographic and secure hash algorithms in Java.

I will also discuss and illustrate some of the modern uses of cryptography, such as the use of digital signatures.

Supplementary material

I recommend that you also study the other lessons in my extensive collection of online Java tutorials. You will find those lessons published at Gamelan.com. However, as of the date of this writing, Gamelan doesn't maintain a consolidated index of my Java tutorial lessons, and sometimes they are difficult to locate there. You will find a consolidated index at www.DickBaldwin.com.

Background Information

Cryptography in one form or another has been around throughout history. Please see the earlier lesson entitled Public Key Cryptography 101 Using Java for some background information on cryptography.

New uses in modern times

Bringing the field of cryptography up to date, Wikipedia tells us, "In recent decades, the field of cryptography has expanded its remit in two ways. Firstly, it provides mechanisms for more than just keeping secrets: schemes like digital signatures and digital cash, for example."

Secure communications

The lessons in this series will present and explain programs that implement cryptographic and secure hash algorithms intended to achieve secure communications between two parties. The cryptographic algorithms fall into two broad categories:

Public (asymmetric) key cryptography
Symmetric key cryptography

The first several lessons deal with public key cryptography. Subsequent lessons deal with symmetric key cryptography.

Please see the previous lesson entitled Public Key Cryptography 101 Using Java for a discussion of the difference between the two.

Secure hash algorithms

In addition, cryptographic algorithms are commonly supplemented by secure hash algorithms. Several secure hash algorithms are in common use. This lesson explains the SHA-1 algorithm as defined in FIPS Pub 180-2, currently available at http://csrc.nist.gov/publications/fips/fips180-2/fips180-2withchangenotice.pdf. FIPS Pub 180-2 is the primary source for technical information contained in this lesson.

Message signing and digital signatures

As explained in the previous lesson entitled Digital Signatures 101 using Java a variety of protocols are available for the use of public-key cryptography and digital signatures to protect the authenticity, integrity, and confidentiality of a message. That lesson explained a simple and easily understood protocol. However, it was pointed out in that lesson that an improved protocol can be achieved through the use of message digests. The purpose of this lesson is to explain how to create a message using the SHA-1 Secure Hash Algorithm.

A future lesson will explain how to use a message digest in an improved digital signature protocol.

Three important aspects

There are at least three important aspects of electronic communication:

Authentication: Confirming the identities of the parties involved.
Confidentiality: Making certain that only authorized parties can understand the message, even if it is intercepted by unauthorized persons.
Integrity: Confirming that content of the message wasn't modified during transmission.

The primary purpose of a digital signature, whether or not it is implemented using a message digest, is to achieve authentication and to validate the integrity of a message. Message digests, on the other hand, have a far greater role than simply being used in digital signatures. Therefore, even though the lead-in to this lesson has to do with digital signatures, the scope of this lesson is somewhat broader than the use of message digests in digital signatures.

What is a message digest?

According to Brett C. Tjaden, author of Fundamentals of Secure Computer Systems,

"Checksums are typically used to produce a fingerprint of a message so that if the message changes the checksum will not match, indicating an error. Most checksum functions ... are not designed to prevent an adversary from intentionally changing a message so that the resulting message has the same checksum as the original message, and thus avoiding detection."

"Cryptographic hash functions (also called message digests) are designed to protect against this possibility. These techniques are often based on collision-resistant, one-way, hash functions."

Thus, message digests are a special form of hash function that are designed to protect a message against attack

What is a hash function?

There are many kinds of hash functions. One large category of hash functions includes the hash functions that are used with the keys in a hashtable data structure.

Another large category includes cryptographic hash functions mentioned above.

A cryptographic hash function takes an input of arbitrary (but finite) length and produces an output whose size is always the same regardless of the length of the input. For example, the MD5 function, which will be discussed later, always produces an output consisting of 128 bits. The SHA-1 function, which is the main topic of this lesson, always produces an output consisting of 160 bits.

A value between 0 and M

Thus, one way to think about the output of a cryptographic hash function that produces an N-bit output is that the function always produces an output consisting of a positive binary integer whose value ranges between 0 and M. The value of M is one less than 2 raised to the Nth power.

For example, when N is 3, M has a value of 7. While easy to demonstrate with pencil and paper, this is not too exciting from a cryptography viewpoint. When N is equal to 128, M is a very large number. When N is equal to 160, M is an even larger number. Therefore, the number of possible outcomes when either the MD5 algorithm or the SHA-1 algorithm is used to hash a message is very large.

What is a one-way hash function?

One of the characteristics of a good cryptographic hash function is that it is a one-way hash function. One description of a one-way hash function is that even if the algorithm and the output produced by hashing a given secret message is known, it is impossible, impractical, or at least extremely expensive to determine anything about the secret message solely on the basis of knowledge of the algorithm and the output. Thus, knowing the output produced by the hash function does not divulge any of the secrets contained in the message.

Another description says that given an output for a one-way hash function, it should be impossible, impractical, or at least extremely expensive to devise any specific message that will produce that output.

From a practical standpoint, this implies that a good one-way hash function will have a very large number of possible output values (as is the case with MD5 and SHA-1). Otherwise, it would be easy to write a program that creates and hashes random input messages until at least one input message has been identified that produces each of the possible output values.

What is a collision-resistant hash function?

A collision-resistant hash function is one in which it is extremely unlikely that the application of the function to any two or more different messages would produce the same output value. A change as small as replacing one of the characters in the message with a different character would produce an entirely different output in a collision-resistant hash function.

(Pairs of different messages that produce the same output for a given hash function are said to cause collisions.)

A simple hash function

As I recall, the original ASCII encoding scheme was a 7-bit code that represented 128 characters and control codes as the values from 0 through 127.

(For example, the upper-case A character is represented by the decimal value 65 in seven-bit ASCII.)

One could devise a hash function by adding all of the ASCII values of the characters in a message and throwing away the carry whenever the sum exceeds 255. Thus, this hash function would produce an eight-bit output with values ranging from 0 to 255, for a total of 256 possible outputs.

Not collision resistant

This would not be a collision-resistant hash function. We immediately know this because the number of possible input messages is significantly greater than the number of possible output values (256). Thus, there must be many different messages that would produce the same output value.

(If a hash function always produced an output value of 1, then every possible input message would collide with every other possible input message.)

For example, applying this hash function to the three-character message consisting of the characters "BDF" would produce an output value of 204. Similarly, applying this hash function to the message consisting of the characters "CDE" would also produce an output value of 204. Thus, there is a collision between these two messages because the application of the hash function to either message produces the same output value.

While very easy to program, this hash function is sorely lacking in terms of being collision resistant.

(I will leave it as an exercise for the student to decide the degree to which this is a one-way hash function.)

Good collision-resistant one-way hash functions

According to Tjaden, given the functional relationship H(M) = h, good collision-resistant one-way hash functions exhibit the following properties:

Given the message M, it is easy to compute the hash value h by applying the hash function H to the message M.
Given any specific hash value, hx, it is difficult to find any message Mx such that H(Mx) = hx
Given a specific message M1, it is difficult to find another message M2 (M2 not identical to M1) such that H(M1) = H(M2).

According to Tjaden,

"Functions that satisfy these criteria are often called message digest functions, because they can be used to produce a fixed-length digest (or fingerprint) of an arbitrary-length message."

Validating integrity, a practical example

It is probably safe to say that most uses of message digests involve the validation of the integrity of a body of digital information even when authentication is not a big issue.

For example, I normally use Norton Antivirus software to protect my computer from viruses. I have the antivirus program configured so that it updates the local virus definition database on a weekly basis.

In the event that I feel the need to update more frequently, I can visit http://securityresponse.symantec.com/avcenter/download/pages/US-NAVCE.html and manually download an executable file which, when executed on my computer, will update the local virus database to contain the latest virus information.

Should I execute the file on my computer?

Since I download the file directly from the Symantec website, I'm not too concerned about authenticating the actual source of the file. (I'm reasonably confident that the website was created by and is maintained by Symantec.)

However, I may be concerned about the possibility that the file has been maliciously tampered with after it was made available for downloading on the Symantec server.

While the likelihood of tampering is probably very low, since it is an executable file the damage that could be done by a maliciously tampered file could be great.

Tampering could happen if someone managed to break into the Symantec server and replace the file with a different file having the same name. Tampering could also happen through interception and replacement of the file by an attacker during the transmission of the file from the Symantec website to my computer.

A file name and an associated message digest

Each day a new downloadable executable file with a different name is provided on the Symantec website, where the name is based on the date.

(On the date of this writing, the downloadable file is named 20050120-008-x86.exe.)

In addition to the downloadable executable file, the Symantec web site also provides a 128-bit message digest (in hexadecimal format) based on the MD5 hash function. This message digest serves as a fingerprint for the executable file.

I can validate the integrity of the file

If I have any concerns about the integrity of the executable file (such as whether or not it may have been tampered with), I can compute the message digest for the file after downloading and before execution. I can compare the message digest of the downloaded file with the message digest given on the Symantec website and validate (or fail to validate) the integrity of the executable file.

(In this case, the message digest is based on the use of a hash function named MD5, which is a different hash function from SHA-1. As an example, on the date of this writing, the MD5 message digest for the executable file listed above was 8F595216E577AAB670D1550EE96402EB.)

What does Symantec have to say?

Figure 1 contains some of what Symantec has to say about the use of the MD5 message digest. This information is useful because it expresses much of what this lesson is all about.

A hash function such as MD5 is a one-way operation that transforms a data string of any length into a shorter, fixed-length value. No two strings of data will produce the same hash value.

An MD5 checksum verifies the data integrity by running a hash operation on the data after it is received. The resultant hash value is compared to the hash value that was sent with the data. If the two values match, this indicates that the data has not been altered or tampered with, and its integrity may be trusted.

Click here to learn more about MD5 and to download an MD5 checksum utility.

Figure 1

An MD5 checksum utility

As you can see, in addition to providing the MD5 message digest for the executable file, Symantec also provides a free utility program that can be used to validate the file against the message digest after the file has been downloaded and before it is executed.

(Although I haven't used the program to generate a message digest, the documentation indicates that the program can be used to generate MD5 message digests as well as to check them. In addition, the program can be used with either files or strings as input. Therefore, this might be an interesting utility program for you to download and experiment with.)

Hashtable hash functions versus cryptographic hash functions

Programmers have been using hash functions for many years to deal with the key values in hashtables. Cryptographic hash functions are a close first cousin to the hash functions used with hashtables. However, there are some important differences having to do primarily with the application.

Number of possible values

As mentioned earlier, the number of possible values produced by a cryptographic hash function should normally be very large for the reasons given. On the other hand, the values produced by a hashtable hash function normally refer to storage locations in some sort of storage space. Therefore, while in extreme cases, it might be useful for these hash functions to produce millions of possible values, the useful range is probably much smaller than is the case for cryptographic hash functions.

Collision resistance

As mentioned earlier, collision resistance is extremely important for cryptographic hash functions. While collision resistance is probably also important for hash functions used with hashtables, people who design hashtables have developed a variety of ways to deal with collisions. Therefore, collisions are probably less of a problem with hash functions used with hashtables than with cryptographic hash functions.

One-way hash functions

Also as mentioned earlier, it is extremely important that cryptographic hash functions be one-way functions. This is probably much less important for the hash functions that are used with hash tables. Except for the fact that the ability to deduce the input from the output of a hash function implies poor resistance to collisions, it probably doesn't matter if it is possible to work backwards from a hashtable hash value and deduce the value of the key that was used to produce that hash value. I doubt that would be a security issue with hash tables. On the other hand, it would be an extreme security problem for cryptographic hash functions.

Preview

Two different programs

I will present and explain two different programs in this lesson. The first program named Sha03 will show you how to create a message digest from the ground up through the implementation and use of the SHA-1 algorithm. The purpose of this lesson is to teach you the inner workings of the SHA-1 algorithm.

The second lesson named Sha01 will show you how to create a message digest using an instance of Sun's MessageDigest class.

Each of the programs will be used to produce message digests for the same three messages, and the results will be shown to be identical.

Discussion and Sample Code

The program named Sha03

Before getting into the details of this program, I will provide a verbal description of the implementation of the SHA-1 algorithm.

The maximum length of a message (in bits) that can be hashed using the SHA-1 algorithm is one less than 2 to the 64th power.

Implementation of the SHA-1 algorithm consists of six major steps and a variety of minor steps.

This implementation is based on the description of the SHA-1 algorithm beginning in Section 5.0 of the Federal Information Processing Standards (FIPS) Publication 180-2 available at: http://csrc.nist.gov/publications/fips/fips180-2/fips180-2withchangenotice.pdf.

Step 1 Preprocessing

The first major step is to prepare the message for hashing. There are three minor steps included in this preprocessing step.

The first minor preprocessing step is to pad the length of the message so as to guarantee that the final length is a multiple of 512 bits or 64 bytes.

The second minor preprocessing step is to set the initial hash value to a standard 160-bit value. (You will see the standard value later in the discussion of the program.)

The third minor preprocessing step is to parse the padded message into blocks of 512 bits or 64 bytes each.

Each block is processed separately

Each block is processed separately and the hash value is updated during the processing of each block. The initialized hash value from above is the input hash value for processing the first block. The updated hash value produced by processing each block serves as the initial hash value for the processing of the next block.

Padding the message

All messages are padded regardless of the original length of the message. The number of bits in the pad must be such as to cause the final length to be a multiple of 512 bits. The first bit in the pad must have a value of 1. The last 64 bits in the pad must contain a binary representation of the length of the original message. All other bits in the pad must have a value of 0.

Resources used by the algorithm

The algorithm uses the following resources:

A message schedule consisting of eighty 32-bit words, known in this program as W[0] through W[79].
An array of eighty 32-bit constants, known in this program as K[0] through K[79].
Five working variables of 32 bits each, known in this program as A through E.
A 160-bit hash value consisting of five 32-bit words known in this program as H[0] through H[4]. The final hash result consists of five 32-bit words, which constitute a 160-bit message digest. The digest is a unique fingerprint of the message used to compute the hash value.
A temporary word known in this program as temp.

Processing the message blocks

Each message block, consisting of 64 bytes, is processed in sequence. The hash value output from processing one message block forms the initial hash value for processing the next message block. The hash value output from processing the final message block is the message digest for the message.

The five remaining steps

The five remaining major steps take place and are repeated during the processing of each message block.

Step 2: Initialize the message schedule

Initialize the message schedule W by using the incoming 64 bytes that constitute a message block to populate the first sixteen 32-bit elements of the 80-element message schedule.

Step 3: Populate the remainder of the message schedule

Populate the remaining 64 elements of the message schedule W by propagating the values in the first sixteen elements upward into the remaining 64 elements.

Step 4: Initialize the working variables

Set the initial values of the variables A through E to the five 32-bit segments of the incoming 160-bit hash value.

Step 5: Process the message schedule

Individually process each of the elements in the 80-element message schedule. A different process is applied to the elements in each group of 20 elements. The processing of each element results in an updated set of values for the working variables A through E.

(These processes require the use of 32-bit unsigned addition, which is not directly supported by a Java primitive operator. Thus, a special method is required to accomplish this addition.)

Step 6: Update the hash value

When all 80 elements in the message schedule have been processed, use the values in the working variables A through E to update the five 32-bit segments of the hash value. This updated hash value is used as the input hash value for processing the next message block. If all message blocks have been processed, the updated hash value is the message digest for the message that has been processed.

I will refer back to these steps during the discussion of the code in the program.

Description of the program

This program implements the SHA-1 secure hash algorithm as defined at: http://csrc.nist.gov/publications/fips/fips180-2/fips180-2withchangenotice.pdf. This resource will be referred to hereafter simply as FIPS Pub 180-2.

Also see page 67 of the book entitled Fundamentals of Secure Computer Systems by Brett C. Tjaden.

Testing the implementation

The program tests the implementation, producing an output that should exactly match the output produced by the program named Sha01 (to be discussed later).

The program creates and displays the message digests for three different strings of 8-bit character data. One of the strings is very short, consisting of only three characters or 24 bits.

(This matches one of the test cases given in Appendix A of FIPS Pub 180-2. The program was also tested successfully against the other two test cases given in Appendix A of FIPS Pub 180-2, but the results of those tests are not shown in this lesson.)

A second string that was used for testing the program consists of 50 characters or 400 bits. The third string consists of 65 characters or 520 bits.

The two longer strings were selected to confirm correct behavior of the code across the 512-bit processing boundary of the SHA-1 algorithm.

The program was tested using JDK 1.5.0 and WinXP.

The program output

The output produced by the program is shown in Figure 2.

Digests are displayed in hex format.
Data Length: 24 bit message.
Message Digest:
A9993E364706816ABA3E25717850C26C9CD0D89D
Data Length: 400 bit message.
Message Digest:
804AA5C1DE1C74C10C37F36327A12924B87DD3A7
Data Length: 520 bit message.
Message Digest:
E2220BDED2A3E23A44E883401042123A790AE21D

Figure 2

The message digests in Figure 2 are displayed in hexadecimal format.

As you can see from Figure 2, the message digest is always 160 bits in length regardless of the length of the message being digested.

I will refer back to Figure 2 during the discussion and explanation of the program.

The class named Sha03

Listing 1 shows the beginning of the class named Sha03 and the beginning of the main method.

class Sha03 {

  public static void main(String[] args){
    Sha04 digester = new Sha04();
   
    System.out.println(
         "Digests are displayed in hex format.");

Listing 1

As you will see later, the class named Sha03 serves simply as a driver program to exercise the capabilities of an object of a top-level class named Sha04.

(I didn't declare Sha04 public simply because I didn't want to be forced to put it in a different source code file.)

Listing 1 instantiates a new object of the class named Sha04, and saves its reference in a local reference variable named digester.

Then Listing 1 displays the first line of text shown in Figure 2.

Create a very short message

Listing 2 creates a very short message consisting of only three characters and then displays the length of the message in bits, producing the output shown in line 2 of Figure 2.

    byte[] dataBuffer = ("abc").getBytes();
    System.out.println(
        "Data Length: " + dataBuffer.length * 8
                              + " bit message.");
Listing 2

Digest the very short message

The code in Listing 3 digests the message by passing its reference to the method named digestIt belonging to the object instantiated earlier from the class named Sha04.

    String theDigest = digester.digestIt(
                                     dataBuffer);

Listing 3

As you can see in Listing 3, the digestIt method of the Sha04 class returns the message digest as a reference to a String object. Later you will see that this String object encapsulates the 160 bits that comprise the message digest expressed in hexadecimal format.

The digestIt method of the Sha04 class

At this point, I am going to set the main method aside for awhile and explain the implementation of the SHA-1 algorithm, as embodied in the method named digestIt.

Listing 4 shows the beginning of the Sha04 class and the beginning of the digestIt method.

class Sha04{
  String digestIt(byte[] dataIn){
    //Step 1:
    byte[] paddedData = padTheMessage(dataIn);

Listing 4

The Sha04 class generates and returns a message digest for an incoming array of bytes of arbitrary length. You should invoke the instance method named digestIt on an object of the Sha04 class to digest a message. Pass the message to the digestIt method as an array of bytes.

The message digest is returned as a reference to a String object, which encapsulates a 160-bit message digest expressed in hexadecimal format.

Six major steps

As explained earlier, the algorithm consists of six major steps and a variety of minor steps. The first of the six major steps is shown in Listing 4. Listing 4 invokes the method named padTheMessage in order to pad the incoming data and guarantee that the message length is a multiple of 512 bits.

Padding the message

Recall that all messages must be padded regardless of the original length of the message. The number of bits in the pad must be such as to cause the final length to be a multiple of 512 bits. The first bit in the pad must have a value of 1. The last 64 bits in the pad must contain a binary representation of the length of the original message. All other bits in the pad must have a value of 0.

You can view the method named padTheMessage in Listing 23 near the end of the lesson. The method is straightforward and is not deserving of a detailed discussion in this lesson.

Set the initial hash value

Listing 5 sets the initial hash value, expressed in hexadecimal format as five 32-bit words, to the values specified in Section 5.3.1 of FIPS Pub 180-2.

    int[] H = {0x67452301,
               0xEFCDAB89,
               0x98BADCFE,
               0x10325476,
               0xC3D2E1F0};

Listing 5

Set constant values

Listing 6 sets the values of eighty 32-bit constants as specified in Section 4.2.1 of FIPS Pub 180-2.

    int[] K = new int[80];
    for(int cnt = 0;cnt < 20;cnt++) K[cnt] =
                                      0x5A827999;
    for(int cnt = 20;cnt < 40;cnt++) K[cnt] =
                                      0x6ED9EBA1;
    for(int cnt = 40;cnt < 60;cnt++) K[cnt] =
                                      0x8F1BBCDC;
    for(int cnt = 60;cnt < 80;cnt++) K[cnt] =
                                      0xCA62C1D6;

Listing 6

Determine the number of blocks

Listing 7 begins by making certain that the length of the padded message is a multiple of 512 bits or 64 bytes. If not, the program terminates.

    if(paddedData.length % 64 != 0){
      System.out.println(
                  "Invalid padded data length.");
      System.exit(0);
    }//end if
    int passesReq = paddedData.length/64;

Listing 7

Then Listing 7 determines and saves the number of 512-bit blocks that must be processed. This value is used as a loop counter in the for loop shown in Listing 8.

Process the blocks

Listing 8 shows the control structure used to process each of the blocks by invoking the method named processTheBlock once for each block.

    byte[] work = new byte[64];
    for(int passCntr = 0;passCntr < passesReq;
                                     passCntr++){
      //Copy the next block of paddedData into
      // the working array.
      System.arraycopy(paddedData,64*passCntr,
                                    work, 0, 64);
      //Process the current block of 64 bytes.
      processTheBlock(work,H,K);
    }//end for loop on passCntr

    //Digestion is complete.
    //Return the digest value to the calling
    // method as a String.
    return intArrayToHexStr(H);
  }//end digestIt()

Listing 8

Each time the processTheBlock method is invoked, it receives:

The padded message data for the next block
The 80 constants in the array named K
The updated hash value produced by the previous invocation of processTheBlock and saved in the array named H

When the processTheBlock method returns, the array named H contains an updated version of the hash value. When the final invocation of processTheBlock returns, the final hash value is stored in the array named H.

Convert the hash value to a hexadecimal string

At this point, the digestIt method invokes a method named intArrayToHexStr to convert the hash value from an array containing five values of type int to a String object encapsulating forty hexadecimal characters. (Each hexadecimal character represents four bits for a total of 160 bits.) The digestIt method returns a reference to the String object as the message digest.

The intArrayToHexStr method

The intArrayToHexStr method converts an incoming array of int values into a String that represents each of the int values as eight hexadecimal characters including leading zeros if necessary.

The method can be viewed in its entirety in Listing 23 near the end of the lesson. The method is straightforward and doesn't warrant further discussion in this lesson.

The processTheBlock method

Listing 9 shows the beginning of the method named processTheBlock. This method, which is invoked once for each 512-bit block of the padded message, is the workhorse of the program.

  private void processTheBlock(byte[] work,
                                int[] H,int[] K){

    //Step 2:
    int[] W = new int[80];
    for(int outer = 0;outer < 16;outer++){
      int temp = 0;
      for(int inner = 0;inner < 4; inner++){
        temp = (work[outer * 4 + inner]
               & 0x000000FF) << (24 - inner * 8);
        W[outer] = W[outer] | temp;
      }//end inner loop
    }//end outer loop

Listing 9

The processTheBlock method processes a message block consisting of 512 bits (64 bytes) of padded message data updating the ongoing calculation of the message digest, which is stored in the incoming array named H.

Steps 2 through 6 of the six major steps described earlier are implemented within this method.

Do Step 2

Step 2 is implemented in Listing 9. This step combines the incoming 64 bytes of padded message data into sixteen 32-bit values of type int, and deposits those 16 int values into the first 16 elements of an eighty-element array of type int named W.

The code required to accomplish this is straightforward and won't be discussed further.

Do Step 3

Step 3 is implemented in Listing 10. At this point, the first 16 elements of W contain the data from 64 combined bytes of padded message data. The code in Listing 10 uses this data to populate the remaining 64 elements in W.

(Note the use of the XOR operator (^) in Listing 10. Also note the use of the rotateLeft method of the Integer class to perform a circular left shift on the data.)

    //Step 3:
    for(int j = 16; j < 80; j++){
      int temp = W[j-3];
      temp = temp ^ W[j-8];//XOR operator
      temp = temp ^ W[j-14];
      temp = temp ^ W[j-16];
      temp = Integer.rotateLeft(temp,1);
      W[j] = temp;
    }//end for loop on j

Listing 10

The methodology for populating the upper 64 elements in W is specified in Section 6.1.2 of FIPS Pub 180-2.

Do Step 4

Step 4 is implemented by the code in Listing 11.

    //Step 4:
    int A = H[0];
    int B = H[1];
    int C = H[2];
    int D = H[3];
    int E = H[4];

Listing 11

Listing 11 creates values for the variables A through E based on the current value of the message digest stored in H.

Do Step 5

Step 5 is implemented in Listing 12. This step applies four different processes to the data in the eighty-element W array.

The same process is applied to each individual element in each group of 20 elements of W. However, the process changes from one group of 20 elements to the next group of 20 elements.

The specification for the process implemented in Listing 12 is provided on page 16 of FIPS Pub 180-2.

    //Step 5:
    int temp;
    //Process first 20 elements
    for(int j = 0;j < 20;j++){
      temp = uAdd(
           Integer.rotateLeft(A,5),f0_19(B,C,D));
      temp = uAdd(temp,E);
      temp = uAdd(temp,W[j]);
      temp = uAdd(temp,K[j]);
      E = D;
      D = C;
      C = Integer.rotateLeft(B,30);
      B = A;
      A = temp;
    }//end for loop
   
    //Process next 20 elements
    for(int j = 20;j < 40;j++){
      temp = uAdd(
          Integer.rotateLeft(A,5),f20_39(B,C,D));
      temp = uAdd(temp,E);
      temp = uAdd(temp,W[j]);
      temp = uAdd(temp,K[j]);
      E = D;
      D = C;
      C = Integer.rotateLeft(B,30);
      B = A;
      A = temp;
    }//end for loop
   
    //Process next 20 elements
    for(int j = 40;j < 60;j++){
      temp = uAdd(
          Integer.rotateLeft(A,5),f40_59(B,C,D));
      temp = uAdd(temp,E);
      temp = uAdd(temp,W[j]);
      temp = uAdd(temp,K[j]);
      E = D;
      D = C;
      C = Integer.rotateLeft(B,30);
      B = A;
      A = temp;
    }//end for loop
   
    //Process last 20 elements
    for(int j = 60;j < 80;j++){
      temp = uAdd(
          Integer.rotateLeft(A,5),f60_79(B,C,D));
      temp = uAdd(temp,E);
      temp = uAdd(temp,W[j]);
      temp = uAdd(temp,K[j]);
      E = D;
      D = C;
      C = Integer.rotateLeft(B,30);
      B = A;
      A = temp;
    }//end for loop

Listing 12

Four different functions

The difference in the processing from one group to the next in Listing 12 is embodied in the behavior of the four functions named f0_19, f20_39, f40_59, and f60_79.

(The code in these four functions is straightforward. Each of the functions can be viewed in its entirety in Listing 23 near the end of the lesson. The specifications for the four functions are provided in Section 4.1.1 of FIPS Pub 180-2. Note that there is a disagreement between FIPS Pub 180-2 and the Tjaden book with respect to two of these functions. This program implements the version specified in FIPS Pub 180-2.)

Other than the use of the different functions in the first statement inside each of the for loops in Listing 12, the process is the same for each group of 20 elements.

The uAdd method

The method named uAdd is invoked several times within each for loop in Listing 12. The method named uAdd performs unsigned addition of two 32-bit integer parameters that are passed to the method. The use of such a method was necessary because Java does not have an unsigned addition operator for 32-bit data.

The code in the uAdd method is relatively straightforward. Each of the two 32-bit integer parameters is treated as a 32-bit unsigned value in the lower 32 bits of an integer of type long. The two long integers are added. Any overflow into the 33rd bit is discarded upon return when the long result is converted back to type int.

The uAdd method can be viewed in its entirety in Listing 23.

Do Step 6

Step 6 is implemented in Listing 13. This step updates the hash value stored in H by adding the current hash value to the values of the variables A through E produced by the code in Listing 12.

    //Step 6:
    H[0] += A;
    H[1] += B;
    H[2] += C;
    H[3] += D;
    H[4] += E;
   
    //Return from processing this block with
    // the updated values of the digest in H.
  }//end processTheBlock

Listing 13

Return with new hash value in H

This completes the processing of a 512-bit block of padded message. At this point, the updated hash value has been stored in the array named H, and the processTheBlock method returns.

If this is the final block in the padded message, the hash-value contents of H will be the message digest. If not, the hash value stored in H will be used as input to the processing of the next block of padded message.

Return to the main method

Now let's turn our attention back to the main method, picking up where we left off in Listing 3.

At this point the message digest for the very short message has been produced and stored in hexadecimal format in a String object referred to by the reference variable named theDigest.

Display the message digest

The code in Listing 14 displays the digest producing lines 3 and 4 in Figure 2. Note once again that even though the message contained only three characters, the message digest contained 160 bits represented by 40 hexadecimal digits in Figure 2.

    System.out.println("Message Digest:");
    System.out.println(theDigest);

Listing 14

Digest a longer message

Continuing in the main method, Listing 15 digests and displays the digest for a longer (400-bit) message, displaying the results in lines 5 through 7 in Figure 2.

    //Digest and display the digest for a longer
    // message of 400 bits, which is still less
    // than  the crossover point of 512 bits.
    dataBuffer = ("01234567a\n" +
                         "01234567b\n" +
                         "01234567c\n" +
                         "01234567d\n" +
                         "01234567e\n")
                                     .getBytes();
    System.out.println(
        "Data Length: " + dataBuffer.length * 8
                              + " bit message.");
    System.out.println("Message Digest:");
    theDigest = digester.digestIt(dataBuffer);
    System.out.println(theDigest);

Listing 15

Note that even though this message was longer than the first one, it was still sufficiently short that the padded version would only be 512 bits in length. Therefore, this message did not exercise the capability of the digestIt method to process multiple blocks of 512 bits each.

A multi-block message

The code in listing 16 creates, digests, and displays the message digest for a message with an original length of 520 bits. This produced the output shown in Lines 8 through 10 in Figure 2.

    dataBuffer = ("01234567a\n" +
                         "01234567b\n" +
                         "01234567c\n" +
                         "01234567d\n" +
                         "01234567e\n" +
                         "01234567f\n" +
                         "012g\n").getBytes();
    System.out.println(
        "Data Length: " + dataBuffer.length * 8
                              + " bit message.");
    System.out.println("Message Digest:");
    theDigest = digester.digestIt(dataBuffer);
    System.out.println(theDigest);

  }//end main
}//end class Sha03

Listing 16

The length of this message would result in a padded length of 1024 bits. Thus, this message exercised the ability of the digestIt method to correctly digest a message requiring the processing of more than one padded message block.

End of the main method

Listing 16 also signals the end of the main method and the end of the program.

Are these the correct results?

How do I know that I didn't make a mistake in my implementation of the SHA-1 algorithm as defined in FIPS Pub 180-2? The algorithm is sufficiently complex that it would be easy to make a mistake.

I can't say for certain that I didn't make a mistake. I can say that I probably didn't make a mistake because the results produced by these three test cases agree with message digests that I produced using an independent program named Sha01. The program named Sha01 uses Sun's implementation of the SHA-1 algorithm and I trust that implementation to be correct. Also, I tested my implementation against three test cases given in Appendix A of FIPS Pub 180-2 and produced a matching message digest in all three cases.

What have you learned so far?

Hopefully at this point, you will have learned something about message digests in general, the SHA-1 hash function in particular, and how to implement an SHA-1 message digest capability from the ground up using Java.

Now let's move along to the second program that I will explain in this lesson.

The program names Sha01

This program uses Sun's SHA algorithm and serves as the baseline for confirming the correctness of my own version of the SHA-1algorithm as implemented in the program named Sha03.

This program creates and displays the digests for three different strings of 8-bit character data. The strings are identical to the strings used in the program named Sha03.

As before, one of the strings is very short, consisting of only three characters or 24 bits. The second string consists of 400 bits. The third string consists of 520 bits. The two longer strings confirm correct operation across the 512-bit processing boundary of the SHA-1 algorithm.

The results are displayed in hexadecimal format. The digest is always 160 bits long, regardless of the length of the message being digested.

The program was tested using JDK 1.5.0 and WinXP.

The program output

The output produced by this program is identical to the output produced by the program named Sha03 that was shown in Figure 2. Therefore, I won't duplicate that output in another figure. The fact that the outputs produced by the two programs are identical confirms that each program produces the same message digest when applied to the same message.

You can view the entire program in Listing 24 near the end of the lesson. Except for the code in the method named digestIt, the two programs are very similar. Therefore, I will discuss only a small portion of this program.

The class named Sha01

Listing 17 shows the beginning of the class named Sha01 and the beginning of the main method.

import java.security.*;

class Sha01 {
  public static void main(String[] args) {
    System.out.println(
         "Digests are displayed in hex format.");
   
    //Digest and display the digest for a very
    // short message.
    byte[] dataBuffer = ("abc").getBytes();
    System.out.println(
        "Data Length: " + dataBuffer.length * 8
                              + " bit message.");
    System.out.println("Message Digest:");

    byte[] theDigest = digestIt(dataBuffer);

Listing 17

If you compare Listing 17 with the code in Listings 1, 2, and 3, you will see that they are very similar. Both programs create a very short message consisting of the string "abc" and pass the message to a method named digestIt for computation of the message digest.

So, what is the difference?

The big difference between the two programs lies in the implementation of the method named digestIt. That method has the same purpose in both programs. The purpose is to produce an SHA-1 message digest for a message that is passed to the method.

You will recall that the digestIt method, plus several methods that it called represented most of the code in the program named Sha03. On the other hand, the method named digestIt is fairly short in this program.

The digestIt method

The method named digestIt in the program named Sha01 begins in Listing 18. This method computes and returns a message digest for an incoming array of bytes using Sun's SHA message digest algorithm.

  static byte[] digestIt(byte[] dataIn){
    byte[] theDigest = null;
    try{
      MessageDigest messageDigest =
         MessageDigest.getInstance("SHA","SUN");

Listing 18

The method named digestIt begins by invoking a factory method of the MessageDigest class named getInstance to create a MessageDigest object that implements Sun's SHA algorithm.

(As an aside, if you were to substitute MD5 for SHA in Listing 18, this program would produce message digests using Sun's version of the MD5 hash function.)

What does Sun have to say?

Here is some of what Sun has to say about the MessageDigest class.

"This MessageDigest class provides applications the functionality of a message digest algorithm, such as MD5 or SHA. Message digests are secure one-way hash functions that take arbitrary-sized data and output a fixed-length hash value.
A MessageDigest object starts out initialized. The data is processed through it using the update methods. At any point reset can be called to reset the digest. Once all the data to be updated has been updated, one of the digest methods should be called to complete the hash computation."

Feed the digester

Listing 19 feeds the digester by invoking the update method on the digester object passing the array of message bytes as a parameter. Multiple update calls could be made if necessary to feed an entire message to the digester.

      messageDigest.update(dataIn);

Listing 19

Complete the digestion

Listing 20 causes the digestion to be completed by invoking the digest method on the digester object. The digest method returns the message digest to the calling program as an array of bytes.

      theDigest = messageDigest.digest();
    }catch(Exception e){System.out.println(e);}

    //Return the digest value to the calling
    // method as an array of bytes.
    return theDigest;
  }//end digestIt()

Listing 20

Then the code in Listing 20 returns the message digest to the program from which the digestIt method was called, signaling the end of the digestIt method.

Return to the main method

Returning now to the discussion of the main method, the code in Listing 21 invokes the method named byteArrayToHexStr to display the message digest shown in line 4 of Figure 2.

//Back in the main method
    //Display the digest in hex format
    System.out.println(
                   byteArrayToHexStr(theDigest));

Listing 21

The byteArrayToHexStr method

This method converts an incoming array of bytes into a string that represents each of the bytes as two hexadecimal characters. The code in this method is straightforward and doesn't warrant a detailed discussion. You can view the method in its entirety in Listing 24 near the end of the lesson.

Continuing with the main method

The remainder of the main method is shown in Listing 22. This code produces and displays message digests for two more messages, identical to the last two messages processed earlier in the program named Sha03.

The code in Listing 22 produces output that exactly matches the output produced by the program named Sha03 shown in line 5 through line 10 in Figure 2.

    //Digest and display the digest for a longer
    // message of 400 bits, which is still less
    // than  the crossover point of 512 bits.
    dataBuffer = ("01234567a\n" +
                         "01234567b\n" +
                         "01234567c\n" +
                         "01234567d\n" +
                         "01234567e\n")
                                     .getBytes();
    System.out.println(
        "Data Length: " + dataBuffer.length * 8
                              + " bit message.");
    System.out.println("Message Digest:");
    theDigest = digestIt(dataBuffer);
    //Display the digest in hex format
    System.out.println(
                   byteArrayToHexStr(theDigest));
   
    //Digest and display the digest for a longer
    // message of 520 bits, which is greater than
    // the crossover point of 512 bits.
    dataBuffer = ("01234567a\n" +
                         "01234567b\n" +
                         "01234567c\n" +
                         "01234567d\n" +
                         "01234567e\n" +
                         "01234567f\n" +
                         "012g\n").getBytes();
    System.out.println(
        "Data Length: " + dataBuffer.length * 8
                              + " bit message.");
    System.out.println("Message Digest:");
    theDigest = digestIt(dataBuffer);
    //Display the digest in hex format
    System.out.println(
                   byteArrayToHexStr(theDigest));
   
  }//end main

Listing 22

That concludes the discussion of the program named Sha01.

Run the Programs

I encourage you to copy, compile and run the following programs that are provided in this lesson:

Sha03
Sha01

Experiment with the programs, making changes and observing the results of your changes.

Above all, have fun and use these programs to learn as much as you can about the theory behind and the mechanics of message digests as implemented using Java.

Summary

In this lesson, I showed you how to write Java code to implement the SHA-1 algorithm as defined in FIPS Pub 180-2 and how to produce message digests using that code. I also showed you how to use Sun's MessageDigest class to produce message digests for both the SHA-1 and MD5 algorithms.

What's Next?

The previous lesson entitled Digital Signatures 101 using Java showed you how to use public-key cryptography to sign messages using digital signatures. This lesson showed you how to create message digests. The next lesson in this series will show you how to combine method digests with public-key cryptography to implement an improved way to sign messages using digital signatures.

Complete Program Listings

Complete listings of the programs discussed in this lesson are provided in Listing 23 and Listing 24 below.

A disclaimer

The programs that I am providing and explaining in this series of lessons are not intended to be used for production cryptography. If you need to do cryptography using Java, you should use Sun's Java Cryptography Extension (JCE).

The programs that I am providing were developed solely for instructional purposes. They are intended to help you to experiment with and to learn about various cryptographic and secure hash algorithms and to gain a better understanding of how they work, and why they do what they do.

/*File Sha03.java Copyright 2005, R.G.Baldwin
Rev 1/11/05

This program implements the SHA-1 secure hash
algorithm as defined at:
http://csrc.nist.gov/publications/fips/fips180-2/
fips180-2withchangenotice.pdf

Also see page 67 of the book entitled
Fundamentals of Secure Computer Systems
by Brett C. Tjaden.

The program tests the implementation, producing
an output that should exactly match the output
produced by the program named Sha01.

The program creates and displays the digests for
three different strings of 8-bit character data.
One of the strings is very short, consisting of
only three characters or 24 bits.

A second string consists of 400 bits. 

The  third string consists of 520 bits.  The two
longer strings are used to confirm correct
operation across the magic 512-bit processing
boundary of the SHA-1 algorithm.

The results are displayed in hex format. The
digest is always 160 bits long.

Tested using JDK 1.5.0 and WinXP

This program produces the following output:
 
Digests are displayed in hex format.
Data Length: 24 bit message.
Message Digest:
A9993E364706816ABA3E25717850C26C9CD0D89D
Data Length: 400 bit message.
Message Digest:
804AA5C1DE1C74C10C37F36327A12924B87DD3A7
Data Length: 520 bit message.
Message Digest:
E2220BDED2A3E23A44E883401042123A790AE21D
************************************************/

class Sha03 {

  public static void main(String[] args){
    Sha04 digester = new Sha04();
   
    System.out.println(
         "Digests are displayed in hex format.");
   
    //Digest and display the digest for a very
    // short message.
    byte[] dataBuffer = ("abc").getBytes();
    System.out.println(
        "Data Length: " + dataBuffer.length * 8
                              + " bit message.");
    String theDigest = digester.digestIt(
                                     dataBuffer);
    System.out.println("Message Digest:");
    System.out.println(theDigest);
   
    //Digest and display the digest for a longer
    // message of 400 bits, which is still less
    // than  the crossover point of 512 bits.
    dataBuffer = ("01234567a\n" +
                         "01234567b\n" +
                         "01234567c\n" +
                         "01234567d\n" +
                         "01234567e\n")
                                     .getBytes();
    System.out.println(
        "Data Length: " + dataBuffer.length * 8
                              + " bit message.");
    System.out.println("Message Digest:");
    theDigest = digester.digestIt(dataBuffer);
    System.out.println(theDigest);
   
    //Digest and display the digest for a longer
    // message of 520 bits, which is greater than
    // the crossover point of 512 bits.
    dataBuffer = ("01234567a\n" +
                         "01234567b\n" +
                         "01234567c\n" +
                         "01234567d\n" +
                         "01234567e\n" +
                         "01234567f\n" +
                         "012g\n").getBytes();
    System.out.println(
        "Data Length: " + dataBuffer.length * 8
                              + " bit message.");
    System.out.println("Message Digest:");
    theDigest = digester.digestIt(dataBuffer);
    System.out.println(theDigest);

  }//end main
}//end class Sha03
//=============================================//

//This class generates and returns a digest
// for an incoming array of bytes of arbitrary
// length.  Invoke the instance method named
// digestIt on an object of the class to digest
// a message.
class Sha04{
  String digestIt(byte[] dataIn){

    //The algorithm consists of six major steps
    // and a variety of minor steps.
   
    //Step 1:  Pad the incoming data to guarantee
    // that the message length is always an even
    // multiple of 512 bits.
    byte[] paddedData = padTheMessage(dataIn);

    //Establish the initial values for the
    // message digest.  Use the variable names
    // given in the book by Tjaden
    int[] H = {0x67452301,
               0xEFCDAB89,
               0x98BADCFE,
               0x10325476,
               0xC3D2E1F0};
              
    int[] K = new int[80];
    for(int cnt = 0;cnt < 20;cnt++) K[cnt] =
                                      0x5A827999;
    for(int cnt = 20;cnt < 40;cnt++) K[cnt] =
                                      0x6ED9EBA1;
    for(int cnt = 40;cnt < 60;cnt++) K[cnt] =
                                      0x8F1BBCDC;
    for(int cnt = 60;cnt < 80;cnt++) K[cnt] =
                                      0xCA62C1D6;
   
    //Determine the number of passes required to
    // process the padded data in blocks of 512
    // bits (64 bytes) per pass..
    if(paddedData.length % 64 != 0){
      System.out.println(
                  "Invalid padded data length.");
      System.exit(0);
    }//end if
    int passesReq = paddedData.length/64;

    //Outer loop to process each block of 64
    // bytes or 512 bits
    byte[] work = new byte[64];
    for(int passCntr = 0;passCntr < passesReq;
                                     passCntr++){
      //Copy the next block of paddedData into
      // the working array.
      System.arraycopy(paddedData,64*passCntr,
                                    work, 0, 64);
      //Process the current block of 64 bytes.
      processTheBlock(work,H,K);
    }//end for loop on passCntr

    //Digestion is complete.
    //Return the digest value to the calling
    // method as a String.
    return intArrayToHexStr(H);
  }//end digestIt()
  //-------------------------------------------//
 
  //This method performs the required padding
  // and returns the result as an array of bytes.
  // The first bit in the pad must have a value
  // of 1.  The last 64 bits in the pad must be a
  // binary representation of the length in bits
  // of the original message.  All other bits in
  // the pad must have a value of 0.  Note that
  // this implementation of the padding procedure
  // works correctly only for messages whose
  // length is an even multiple of 8-bit bytes.
  private byte[] padTheMessage(byte[] data){
    int origLength = data.length;
    int tailLength = origLength%64;
    int padLength = 0;
    if((64 - tailLength >= 9))
      padLength = 64 - tailLength;
    else
      padLength = 128 - tailLength;

    //Construct an array containing the bytes
    // required to make the padded message length
    // equal to an even multiple of 512 bits or
    // 64 bytes.
    byte[] thePad = new byte[padLength];
    //Insert a single 1 bit in the pad
    thePad[0] = (byte)0x80;
    //Represent original bit length in 64 bits.
    long lengthInBits = origLength * 8;
    //Now break the long bit length into 8 bytes
    // and deposit them at the end of thePad.
    for(int cnt = 0;cnt < 8;cnt++){
      thePad[thePad.length - 1 - cnt] =
               (byte)((lengthInBits >> (8 * cnt))
                           & 0x00000000000000FF);
    }//end for loop

    //Create an output array.
    byte[] output =
                new byte[origLength + padLength];

    //Populate the output array with the original
    // data concatenated with the pad.
    System.arraycopy(data,0,output,0,origLength);
    System.arraycopy(
       thePad,0,output,origLength,thePad.length);
    return output;

  }//end padTheMessage
  //-------------------------------------------//
 
  //This method processes a message block
  // consisting of 512 bits (64 bytes) updating
  // the ongoing calculation of the digest, which
  // is stored in the array named H.
  private void processTheBlock(byte[] work,
                                int[] H,int[] K){

    //Step 2:  Combine the incoming 64 bytes into
    // the first 16 elements of an 80-element
    // array of 32-bit data.
    int[] W = new int[80];
    for(int outer = 0;outer < 16;outer++){
      int temp = 0;
      for(int inner = 0;inner < 4; inner++){
        temp = (work[outer * 4 + inner]
               & 0x000000FF) << (24 - inner * 8);
        W[outer] = W[outer] | temp;
      }//end inner loop
    }//end outer loop

    //Step 3:  First 16 elements of W contain 64
    // bytes of message data.  Use this data to
    // populate the remaining 64 elements in W.
    // Note the use of the XOR operator ^.
    for(int j = 16; j < 80; j++){
      int temp = W[j-3];
      temp = temp ^ W[j-8];//XOR operator
      temp = temp ^ W[j-14];
      temp = temp ^ W[j-16];
      temp = Integer.rotateLeft(temp,1);
      W[j] = temp;
    }//end for loop on j

    //Step 4:  Create values for A through E
    // based on the current intermediate value of
    // the digest.
    int A = H[0];
    int B = H[1];
    int C = H[2];
    int D = H[3];
    int E = H[4];

    //Step 5:  Apply four different processes to
    // the data in the W array.  A different
    // process is applied to each successive
    // group of 20 values of W.  The difference
    // lies in the behavior of the functions
    // named f0_19, f20_39, etc. Otherwise, the
    // processes are the same for each group of
    // 20 values.  Note that the method named
    // uAdd performs unsigned addition of two
    // 32-bit integer parameters that are passed
    // to the method.  The use of such a method
    // is necessary because Java does not have an
    // unsigned addition operator.
    int temp;
    for(int j = 0;j < 20;j++){
      temp = uAdd(
           Integer.rotateLeft(A,5),f0_19(B,C,D));
      temp = uAdd(temp,E);
      temp = uAdd(temp,W[j]);
      temp = uAdd(temp,K[j]);
      E = D;
      D = C;
      C = Integer.rotateLeft(B,30);
      B = A;
      A = temp;
    }//end for loop
   
    for(int j = 20;j < 40;j++){
      temp = uAdd(
          Integer.rotateLeft(A,5),f20_39(B,C,D));
      temp = uAdd(temp,E);
      temp = uAdd(temp,W[j]);
      temp = uAdd(temp,K[j]);
      E = D;
      D = C;
      C = Integer.rotateLeft(B,30);
      B = A;
      A = temp;
    }//end for loop
   
    for(int j = 40;j < 60;j++){
      temp = uAdd(
          Integer.rotateLeft(A,5),f40_59(B,C,D));
      temp = uAdd(temp,E);
      temp = uAdd(temp,W[j]);
      temp = uAdd(temp,K[j]);
      E = D;
      D = C;
      C = Integer.rotateLeft(B,30);
      B = A;
      A = temp;
    }//end for loop
   
    for(int j = 60;j < 80;j++){
      temp = uAdd(
          Integer.rotateLeft(A,5),f60_79(B,C,D));
      temp = uAdd(temp,E);
      temp = uAdd(temp,W[j]);
      temp = uAdd(temp,K[j]);
      E = D;
      D = C;
      C = Integer.rotateLeft(B,30);
      B = A;
      A = temp;
    }//end for loop
   
    //Step 6:  Update the current value of the
    // digest stored in the H array.
    H[0] += A;
    H[1] += B;
    H[2] += C;
    H[3] += D;
    H[4] += E;
   
    //Return from processing this block with
    // the updated values of the digest in H.
  }//end processTheBlock
  //-------------------------------------------//
 
  //This method performs an unsigned addition of
  // two 32-bit int values by treating them as
  // 32-bit unsigned values in the lower 32 bits
  // of a long.  Any overflow into the 33rd bit
  // is discarded upon return when the long
  // result is converted back to an int.  The use
  // of this method is required because Java does
  // not have an unsigned addition operator.
  private int uAdd(int a,int b){
    long x = a;
    //Eliminate sign extension, if any using a
    // left shift followed by an unsigned right
    // shift.
    x = (x << 32) >>> 32;
    long y = b;
    y = (y << 32) >>> 32;
   
    return (int)(x + y);
  }//end unsignedAdd
  //-------------------------------------------//
 
  //The following four functions are defined in
  // FIPS 180-2, Section 4.1.1, and also by
  // Tjaden on pages 69 and 70.  Note that these
  // are all bitwise logical operations.  Note
  // also that there is a disagreement between
  // the use of an inclusive bitwise or (|)
  // and an exclusive bitwise or (^) between
  // Tjaden and FIPS 180-2 in the first and
  // third functions.  However, both
  // formulations produce the same digest for the
  // test cases used in the program named Sha03.
  // These functions use the FIPS 180-2 version.
  // The Tjaden version is shown in comments.
 
  private int f0_19(int B,int C,int D){
    //return (B & C) | ((~B)& D);//Tjaden version
    return (B & C) ^ ((~B)& D);
  }
  //-------------------------------------------//
 
  private int f20_39(int B,int C,int D){
    return ((B ^ C) ^ D);
  }
  //-------------------------------------------//
 
  private int f40_59(int B,int C,int D){
    //return ((B & C) | (B & D) | (C & D));
    return ((B & C) ^ (B & D) ^ (C & D));
  }
  //-------------------------------------------//
 
  private int f60_79(int B,int C,int D){
    return ((B ^ C) ^ D);
  }
  //-------------------------------------------//

  //This utility method converts an incoming
  // array of ints into a string that represents
  // each of the ints as eight hex characters,
  // including leading zeros if necessary.
  private String intArrayToHexStr(int[] data){
    String output = "";
    String tempStr = "";
    int tempInt = 0;
    for(int cnt = 0;cnt < data.length;cnt++){
     
      tempInt = data[cnt];
      //Get hex representation of the int as a
      // string.
      tempStr = Integer.toHexString(tempInt);
      //Append leading zeros if necessary so that
      // each hex string will contain eight
      // characters.
      if(tempStr.length() == 1)
                   tempStr = "0000000" + tempStr;
      else if(tempStr.length() == 2)
                    tempStr = "000000" + tempStr;
      else if(tempStr.length() == 3)
                     tempStr = "00000" + tempStr;
      else if(tempStr.length() == 4)
                      tempStr = "0000" + tempStr;
      else if(tempStr.length() == 5)
                       tempStr = "000" + tempStr;
      else if(tempStr.length() == 6)
                        tempStr = "00" + tempStr;
      else if(tempStr.length() == 7)
                         tempStr = "0" + tempStr;
      output = output + tempStr;
    }//end for loop
    return output.toUpperCase();
  }//end intArrayToHexStr
  //-------------------------------------------//
 
}//end class Sha04


Listing 23

/*File Sha01.java Copyright 2005, R.G.Baldwin Rev 1/17/05 This program implements Sun's SHA algorithm and serves as the baseline for confirming the correctness of my own version of the SHA-1 algorithm in the program named Sha03. The program creates and displays the digests for three different strings of 8-bit character data. One of the strings is very short, consisting of only three characters or 24 bits. A second string consists of 400 bits. The third string consists of 520 bits. The two longer strings are used to confirm correct operation across the magic 512-bit processing boundary of the SHA-1 algorithm. The results are displayed in hex format. The digest is always 160 bits long. Tested using JDK 1.5.0 and WinXP This program produces the following output: Digests are displayed in hex format. Data Length: 24 bit message. Message Digest: A9993E364706816ABA3E25717850C26C9CD0D89D Data Length: 400 bit message. Message Digest: 804AA5C1DE1C74C10C37F36327A12924B87DD3A7 Data Length: 520 bit message. Message Digest: E2220BDED2A3E23A44E883401042123A790AE21D ************************************************/ import java.security.*; class Sha01 { public static void main(String[] args) { System.out.println( "Digests are displayed in hex format."); //Digest and display the digest for a very // short message. byte[] dataBuffer = ("abc").getBytes(); System.out.println( "Data Length: " + dataBuffer.length * 8 + " bit message."); System.out.println("Message Digest:"); byte[] theDigest = digestIt(dataBuffer); //Display the digest in hex format System.out.println( byteArrayToHexStr(theDigest)); //Digest and display the digest for a longer // message of 400 bits, which is still less // than the crossover point of 512 bits. dataBuffer = ("01234567a\n" + "01234567b\n" + "01234567c\n" + "01234567d\n" + "01234567e\n") .getBytes(); System.out.println( "Data Length: " + dataBuffer.length * 8 + " bit message."); System.out.println("Message Digest:"); theDigest = digestIt(dataBuffer); //Display the digest in hex format System.out.println( byteArrayToHexStr(theDigest)); //Digest and display the digest for a longer // message of 520 bits, which is greater than // the crossover point of 512 bits. dataBuffer = ("01234567a\n" + "01234567b\n" + "01234567c\n" + "01234567d\n" + "01234567e\n" + "01234567f\n" + "012g\n").getBytes(); System.out.println( "Data Length: " + dataBuffer.length * 8 + " bit message."); System.out.println("Message Digest:"); theDigest = digestIt(dataBuffer); //Display the digest in hex format System.out.println( byteArrayToHexStr(theDigest)); }//end main //-------------------------------------------// //This method generates and returns a digest // for an incoming array of bytes using Sun's // SHA message digest algorithm.. static byte[] digestIt(byte[] dataIn){ byte[] theDigest = null; try{ //Create a MessageDigest object // implementing the SHA algorithm, as // supplied by SUN MessageDigest messageDigest = MessageDigest.getInstance("SHA", "SUN"); //Feed the byte array to the digester. Can // accommodate multiple calls if needed messageDigest.update(dataIn); //Complete the digestion and save the // result theDigest = messageDigest.digest(); }catch(Exception e){System.out.println(e);} //Return the digest value to the calling // method as an array of bytes. return theDigest; }//end digestIt() //-------------------------------------------// //This method converts an incoming array of // bytes into a string that represents each of // the bytes as two hex characters. static String byteArrayToHexStr(byte[] data){ String output = ""; String tempStr = ""; int tempInt = 0; for(int cnt = 0;cnt < data.length;cnt++){ //Deposit a byte into the 8 lsb of an int. tempInt = data[cnt]&0xFF; //Get hex representation of the int as a // string. tempStr = Integer.toHexString(tempInt); //Append a leading 0 if necessary so that // each hex string will contain two // characters. if(tempStr.length() == 1) tempStr = "0" + tempStr; //Concatenate the two characters to the // output string. output = output + tempStr; }//end for loop return output.toUpperCase(); }//end byteArrayToHexStr //-------------------------------------------// }//end class Sha01

Listing 24

About the author

Richard Baldwin is a college professor (at Austin Community College in Austin, TX) and private consultant whose primary focus is a combination of Java, C#, and XML. In addition to the many platform and/or language independent benefits of Java and C# applications, he believes that a combination of Java, C#, and XML will become the primary driving force in the delivery of structured information on the Web.

Richard has participated in numerous consulting projects and he frequently provides onsite training at the high-tech companies located in and around Austin, Texas. He is the author of Baldwin's Programming Tutorials, which has gained a worldwide following among experienced and aspiring programmers. He has also published articles in JavaPro magazine.

In addition to his programming expertise, Richard has many years of practical experience in Digital Signal Processing (DSP). His first job after he earned his Bachelor's degree was doing DSP in the Seismic Research Department of Texas Instruments. (TI is still a world leader in DSP.) In the following years, he applied his programming and DSP expertise to other interesting areas including sonar and underwater acoustics.

Richard holds an MSEE degree from Southern Methodist University and has many years of experience in the application of computer technology to real-world problems.

Baldwin@DickBaldwin.com

-end-