#Hashcode: 5 things you need to know

QUICK COUNT AT PPCRV COMMAND CENTER. In the unofficial quick count by the Parish Pastoral Council for Responsible Voting (PPCRV), Leni Robredo is ahead of Bongbong Marcos in the vice presidential race by more than 200,000 votes, with 96.14 percent of election returns counted. GRIG C. MONTEGRANDE

(Editor’s Note: An alteration in the script of the Commission on Elections (Comelec) transparency server to allow the computer system to recognize the letter “ñ” raised questions about the integrity of the unofficial count by the Parish Pastoral Council for Responsible Voting. The camp of Bongbong Marcos said the alteration may have allowed Leni Robredo, his closest rival in the vice presidential race, to overtake him in the count. The Comelec claimed that the alteration was cosmetic. But the assurance did not mollify the Marcos camp, which has called for an audit of the automated election system. The Comelec said it was open to an audit.)

TECHNOLOGY is so pervasive in our daily lives that we are all forced to learn something new every day. You might remember how it felt the first time you received an SMS text message, wrote and sent your own Yahoo e-mail, got on Facebook, or discovered the utility of Viber and Waze.

Learning what controversial hash codes are all about is no different from those other “first time” tech experiences of yours.

Here are five things you need to know, including what hash codes are, why they are important and how they affect all of us, from the country’s elite to the common man.

1. Hash codes are useful every day for everyone.

Hash codes can also be called hash values, hash sums or simply hashes, but not hashish. Hash codes are produced by having a computer ingest data of any size, from a small text file to a database spanning several hard disks, before coughing out a small set of hexadecimal numbers for us to appreciate.

For example, a hash code of “Talk Of The Town” is 3c57-0b7c-a2d5-fc89-3cde-71d0-cd16-7412.
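For the curious, here is a minimal Python sketch of how such a hash code could be produced. It assumes the MD5 math and prints the result in hyphenated groups of four hexadecimal digits, the way the sample above is written. The function name grouped_md5 is made up for this illustration, and the output depends on the exact bytes fed in (letter case, encoding, a trailing newline), so it may not match the sample if the original input differed in any of those.

```python
import hashlib


def grouped_md5(data: bytes) -> str:
    """Return the MD5 digest of data as hex, hyphenated every four digits."""
    digest = hashlib.md5(data).hexdigest()  # always 32 hexadecimal characters
    return "-".join(digest[i:i + 4] for i in range(0, len(digest), 4))


# Assumption: the text is encoded as UTF-8 with no trailing newline.
print(grouped_md5("Talk Of The Town".encode("utf-8")))
```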

Knowing about this computer-generated record, or hash code, is important because it can be useful for our everyday lives. For example:

Law enforcers use it in forensic tools to hunt down and prosecute child abusers wherever they are in the Philippines or in the world.

Photographers, songwriters, moviemakers and everyone else who produces digitally recorded art use these hash codes to help protect their copyrighted material from theft or plagiarism.

IT administrators and auditors rely on hash codes to verify if the files they send and receive are correct.

Bankers rely on hash codes when transmitting spreadsheets to the Anti-Money Laundering Council to ensure that their files have the correct bank account details and corresponding suspicious transaction reports, and have not been tampered with at all.

Even software and music pirates rely on hash codes to check the files they download, though BitTorrent programs do this automatically for them.

2. If the law is against you, or on your side, you better know your hash code.

The use of hash codes is prevalent in the legal, audit and law enforcement industries. In every case, whether civil, administrative or criminal in nature, hash codes are required for every piece of digital evidence that is discovered, analyzed and presented.

Hash codes are documented in every chain of custody document, as required for compliance with the Rules on Electronic Evidence (A.M. No. 01-7-01-SC) of the Philippines. A recorded hash code has to stay the same throughout the entire life cycle of the evidence, from the moment it is generated for a particular piece of digital evidence to the time it is analyzed by law enforcers or private investigators and to the time the courts allow for the digital evidence to be accepted, presented and evaluated.

For example, if you were to be dragged to court for something as silly as libel, the digital evidence from Twitter or Facebook or e-mail must all be captured and preserved. This would include the libelous material, its metadata and, of course, the associated hash code, too.

If at any time during the investigation or court trial the hash code was changed, that digital evidence will be thrown out of court. Break the hash code, break the case.
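The record-then-recheck routine described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the evidence bytes are invented, the names sha256_of and still_intact are hypothetical, and SHA-256 is chosen here merely as one possible hash function, not as the one any court or agency prescribes.

```python
import hashlib


def sha256_of(data: bytes) -> str:
    """Return the SHA-256 hash code of data as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()


def still_intact(evidence: bytes, recorded_hash: str) -> bool:
    """Recompute the hash code and compare it with the one on record."""
    return sha256_of(evidence) == recorded_hash


# Hypothetical evidence captured at the start of an investigation.
captured_post = b"Bytes of the captured post and its metadata"
recorded = sha256_of(captured_post)  # goes into the chain of custody document

print(still_intact(captured_post, recorded))         # True
print(still_intact(captured_post + b" ", recorded))  # False: one extra byte
```

Even a single added space produces a different hash code, which is why any alteration between capture and trial shows up at once.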

The application and significance of the hash code are the same under other Philippine laws as well, including the Anti-Wiretapping Act, Data Privacy Act, Anti-Violence Against Women and Children Act, Cybercrime Prevention Act, bank secrecy laws, or election laws.

3. Hash codes are more reliable than human fingerprints.

Some pundits say that hash codes are like your fingerprints, as they try to express how unique you are, or how unique a software program file can be. That is rather unfair. You, or rather your fingerprints, are not as unique as you may initially have imagined.

In the late 1800s, Sir Francis Galton became known as the scientist who devised a classification system for identifying common patterns in fingerprints. His methodology survives today and is referenced by forensic analysts.

In 1895, he published “Fingerprint Directories,” which showed the heritability and racial differences in fingerprints, as well as detailed estimates on the probability of two persons having the same fingerprint. Galton estimated that the probability of two persons having the same fingerprint is 1 in 64 million.

What that means for us Filipinos, with our country having a population of 101 million, is that there is a chance that one of us has the same fingerprint as someone else. When you consider the entire world, however, with a total population of 7.256 billion, you can expect about 113 people out there to have the same fingerprint as yours.
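The arithmetic behind those figures is a simple division, sketched below in Python. The 1-in-64-million odds and the two population numbers are taken from the text above; the function name expected_matches is made up for the illustration.

```python
ODDS = 64_000_000  # the figure quoted above: 1 match per 64 million persons


def expected_matches(population: int, odds: int = ODDS) -> float:
    """Expected number of people sharing a given fingerprint."""
    return population / odds


print(round(expected_matches(101_000_000), 2))  # 1.58 (Philippines)
print(round(expected_matches(7_256_000_000)))   # 113 (entire world)
```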

The probability of having the same fingerprint with someone else out there gets even worse when the Birthday Paradox is considered, but that’s another topic for another day.

In contrast, however, just for the math behind an MD5 hash code, the probability of having an identical hash code is 1 in 340,282,366,920,938,463,463,374,607,431,768,211,456.

It gets even better. Mathematicians have attacked the math behind MD5, and even SHA-1, as “weak” because they have demonstrated ways to produce collisions (i.e. the same hash code for two different data sets). As a result, the technology and law enforcement industries have started to shift to the math of SHA-256 to produce new hash codes.

With SHA-256, the probability of two different data sets having an identical hash code is 1 in 115,792,089,237,316,195,423,570,985,008,687,907,853,269,984,665,640,564,039,457,584,007,913,129,639,936.
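Those two enormous numbers are not arbitrary. An MD5 hash code is 128 bits long and a SHA-256 hash code is 256 bits long, so the counts are 2 raised to the power of 128 and 256. A short Python check (the variable names are ours, for illustration only):

```python
md5_space = 2 ** 128     # number of possible MD5 hash codes
sha256_space = 2 ** 256  # number of possible SHA-256 hash codes

print(f"{md5_space:,}")
print(f"{sha256_space:,}")

# The SHA-256 space is the MD5 space multiplied by itself.
print(sha256_space == md5_space ** 2)  # True
```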

4. Hash codes are invaluable for detecting tampering or fraud.

Hash codes are excellent indicators for detecting tampering of computer files or smartphones or networked devices. It must be noted, however, that a changed hash code is, by itself, weak proof that a crime has been committed. It is merely a trigger for an investigation. Evidence that demonstrates fraud and malice beyond reasonable doubt is to be expected elsewhere.

To illustrate this, let’s conjure a hypothetical scenario where you developed an electronic voting software named “VotePH-2016” and now you need to distribute it nationwide.

To make sure that nobody else tampers with it, an MD5 hash code of 68c1-8bcc-ed0d-709e-7869-7ec5-185c-7769 was recorded for it.

The VotePH-2016 source code:
===============
#!/bin/bash
# Exit on errors, unset variables and failed pipes; disable filename globbing.
set -euf -o pipefail
# Ignore Ctrl-C (signal 2) so the program cannot be interrupted at the keyboard.
trap '' 2
# Start with an empty vote log.
/bin/cp /dev/null vote-data.txt

while true; do
    /bin/echo
    /bin/date
    /bin/echo
    /bin/echo Vote for the most trusted COMELEC commissioner
    /bin/echo
    /bin/echo 1 ADBautista
    /bin/echo 2 CRSLim
    /bin/echo 3 AAParreño
    /bin/echo 4 LTFGuia
    /bin/echo 5 ADLim
    /bin/echo 6 MRAVGuanzon
    /bin/echo 7 SMAbas
    /bin/echo
    read -r VOTE
    # Log the timestamp and the raw input for every vote cast.
    /bin/date >> vote-data.txt
    /bin/echo "$VOTE" >> vote-data.txt
    case "$VOTE" in
        1 ) echo You voted for ADBAUTISTA;;
        2 ) echo You voted for CRSLIM;;
        3 ) echo You voted for AAPARREÑO;;
        4 ) echo You voted for LTFGUIA;;
        5 ) echo You voted for ADLIM;;
        6 ) echo You voted for MRAVGUANZON;;
        7 ) echo You voted for SMABAS;;
        * ) echo Please STOP. Criminal hacking will be prosecuted.;;
    esac
    /bin/echo
    /bin/sleep 2
done
===============

Later on, someone cried foul and said the MD5 hash code he had for his copy of the VotePH-2016 software was 7517-abe7-04ee-8610-3e06-b127-7b3b-5ce5. This event gives you reasonable suspicion that something went wrong and triggers a preliminary investigation.

The cheapest and fastest way to figure out what’s wrong at this early stage is to test the anomalous software with the wrong MD5 hash code. Then compare its behavior with the known good software.

Now, after checking the input and output data of the anomalous software, you observe that the letter “ñ” has been replaced by “n” and “Ñ” by “N.” Later on, someone finally admits he changed the source code on his own to correct the output because it was not printing very well on his computer.

That person explained that the change was merely cosmetic and did not alter the election results. So for that person the change was acceptable even though the hash code became different.
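To see why even a cosmetic edit like this sets off the alarm, consider the Python sketch below. It hashes a single line of the hypothetical VotePH-2016 script before and after the “ñ” is swapped for “n.” The function name md5_hex and the UTF-8 encoding are assumptions made for the illustration; the real file would be hashed whole, byte for byte.

```python
import hashlib


def md5_hex(text: str) -> str:
    """MD5 hash code of text, assuming the file is stored as UTF-8."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


original = "/bin/echo 3 AAParreño\n"
altered = original.replace("ñ", "n")  # the so-called cosmetic change

print(md5_hex(original) == md5_hex(altered))   # False: the hash code changed

# Putting the original letter back brings the original hash code back.
restored = altered.replace("AAParreno", "AAParreño")
print(md5_hex(restored) == md5_hex(original))  # True
```

The hash code says only that the bytes differ. It says nothing about whether the difference is harmless, which is why a mismatch starts an investigation rather than ends one.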

At this point, what do you do next? The two sides of the same coin must be considered.

It can be argued that there is time pressure to conclude the vote-counting process on schedule. Additionally, the human-observable results are, in your opinion, all the same, and it can be demonstrated that the hash code reverts to the original if the source code is returned to its previous condition.

Just as important, logical inquiry turns up no other apparent facts indicating that the election results were counted erroneously. Thus, it may be practical to enforce disciplinary actions or seek legal redress against the culprits and simply continue with the election process.

On the other hand, if the unknown bothers you, if you start to doubt the trustworthiness of the system and if you can afford (in terms of time and resources) to investigate further, then it may be prudent to start auditing the anomalous software. Only when you conduct enough investigative or audit work would you get enough assurance as to whether there is indeed something really wrong with the system.

5. Hash codes cannot be decrypted.

A common misconception among some self-proclaimed IT experts is that hash codes are encrypted data because they read somewhere that hash codes are produced by cryptographic hash functions. Some even go as far as to say that hash codes can be decrypted.

An encryption function converts data from cleartext (e.g. what you’re reading now) to ciphertext (e.g. 12yZaCas0Fiv6) and then back again to cleartext. Clearly, an encryption function is a two-way operation. Encryption standards include DES and AES, for example.

Cryptographic hash functions, like MD5, or SHA-1, or SHA-256, are strictly one-way operations. Variable-length data is ingested and a small fixed-length digest (i.e. summary) is produced.
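A quick Python sketch makes the “fixed-length” part concrete. Whatever the size of the input, from nothing at all to a million bytes, the MD5 digest always comes out as 32 hexadecimal characters and the SHA-256 digest as 64. The sample inputs and the function name digest_lengths are arbitrary choices for this illustration.

```python
import hashlib


def digest_lengths(data: bytes) -> tuple:
    """Return the lengths of the MD5 and SHA-256 hex digests of data."""
    md5_len = len(hashlib.md5(data).hexdigest())
    sha256_len = len(hashlib.sha256(data).hexdigest())
    return (md5_len, sha256_len)


samples = [b"", b"a", b"Talk Of The Town", b"x" * 1_000_000]

for data in samples:
    # Prints the input size followed by the two digest lengths.
    print(len(data), digest_lengths(data))
```

Because a million bytes are squeezed into the same few dozen characters as a single byte, the original data cannot be rebuilt from the digest. There is simply nothing to decrypt.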

The only way to hack a hash code is to try a large number of possible inputs and hope for a match. For example, if you want to crack a system password stored as an MD5 hash code, you’ll need to produce an MD5 hash of every password you think is possible and then compare each of those hash codes against the stored password hash code. If you find a match, then you can be practically certain that your guessed password is the correct password.
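The guess-and-compare approach just described can be sketched as follows. This is a toy illustration, not a real cracking tool: the stored password, the word list and the function names are all invented, and it assumes the password was stored as a plain MD5 hash code with nothing else mixed in.

```python
import hashlib
from typing import Iterable, Optional


def md5_hex(password: str) -> str:
    """MD5 hash code of a password, assuming UTF-8 text."""
    return hashlib.md5(password.encode("utf-8")).hexdigest()


def guess_password(stored_hash: str, candidates: Iterable[str]) -> Optional[str]:
    """Hash each candidate and return the first one that matches, if any."""
    for candidate in candidates:
        if md5_hex(candidate) == stored_hash:
            return candidate
    return None  # no guess matched; the hash code itself reveals nothing


stored = md5_hex("halalan2016")  # hypothetical stored password hash code
wordlist = ["password", "123456", "comelec", "halalan2016"]

print(guess_password(stored, wordlist))         # halalan2016
print(guess_password(stored, ["qwerty", "x"]))  # None
```

Notice that nothing is ever “decrypted.” The attacker only runs the same one-way math forward, again and again, until a guess happens to land on the stored hash code.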

(Drexx D. Laggui, principal consultant of Laggui & Associates Inc., conducts vulnerability assessment, Internet penetration testing and computer forensics.)
