What are hash functions?
A hash function takes input data of any size and produces a fixed-size output called a message digest, or hash. Unlike encryption, hashing is one-way: the digest cannot be decoded to recover the original data. The digest can be used to verify the integrity of data. MD5 belongs to the family of cryptographic hash functions.
When data is transmitted over a network, a hash value can be sent along with it. The receiver computes the hash of the data it received and compares it with the attached value. If the two match, the data is accepted; otherwise, the data has changed in transit. The change could be due to data loss, noise, or tampering by a third party. In this way the hash value is used to verify the integrity of the data. Note that a plain hash only catches deliberate tampering if the attacker cannot also replace the attached hash, which is why keyed hashes or digital signatures are used when that matters.
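The check described above can be sketched in a few lines with Python's standard hashlib module. The helper names attach_digest and is_intact are hypothetical, chosen only for this illustration, and MD5 is used simply because it is the subject of this article.

```python
from hashlib import md5

def attach_digest(payload: bytes):
    """Sender side: pair the payload with its MD5 hex digest."""
    return payload, md5(payload).hexdigest()

def is_intact(payload: bytes, expected_digest: str) -> bool:
    """Receiver side: recompute the digest and compare it with the attached one."""
    return md5(payload).hexdigest() == expected_digest

payload, digest = attach_digest(b"Hello everyone!")

print(is_intact(payload, digest))             # True: data arrived unchanged
print(is_intact(b"Hello everyone?", digest))  # False: one character was altered
```

This only detects that something changed. It does not say what changed, and it does not stop someone who can rewrite both the data and the attached digest.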
A hash function always gives an output of a fixed length, regardless of the size of the input. Large data can therefore be mapped to a fixed-size output, which is why hash functions are sometimes described as compression functions.
Applications of hash
Hash functions are used in message digests, digital signatures, hash maps (a data structure), password verification, and other cryptographic applications.
MD5 hash function
The MD5 hash function is commonly used to verify the integrity of data. It has known cryptographic vulnerabilities, so it should no longer be used where security matters. It is still reasonable for detecting accidental changes in data or comparing files when deliberate tampering is not a concern.
MD5 maps data to a 128-bit digest. Even if the input is gigabytes in size, the output is always 128 bits. Changing even one bit of the input results in a completely different hash value. This is called the avalanche effect.
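A small experiment can make the avalanche effect visible. The sketch below flips a single bit of the input and counts how many of the 128 digest bits change; the helper differing_bits is a hypothetical name introduced for this example.

```python
from hashlib import md5

def differing_bits(a: bytes, b: bytes) -> int:
    """Count the bit positions at which two equal-length byte strings differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

original = b"Hello everyone!"
# Flip the lowest bit of the first byte: 'H' (0x48) becomes 'I' (0x49).
modified = bytes([original[0] ^ 0x01]) + original[1:]

digest_a = md5(original).digest()
digest_b = md5(modified).digest()

print(len(digest_a) * 8)  # 128: the digest size in bits
print(md5(original).hexdigest())
print(md5(modified).hexdigest())
print(differing_bits(digest_a, digest_b), "of 128 digest bits differ")
```

For a well-mixed hash, roughly half of the output bits are expected to change, although the exact count varies from input to input.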
Python MD5 Hash Implementation
Python MD5 Hash Function
Python's hashlib library contains several hash functions, including MD5 and the SHA variants. We will use this library to perform hashing.
    # importing the library
    from hashlib import md5

    # the input data
    input_string = 'Hello everyone!'

    # hashlib requires the input to be in form of bytes
    # encode converts string into bytes format
    hash_value = md5(input_string.encode())

    # message digest in bytes
    print("Hash value as bytes:", hash_value.digest())

    # message digest as hexadecimal digits
    print("Hash value as hexadecimal:", hash_value.hexdigest())
The hashlib library requires input as bytes, so we use the encode method to convert strings to bytes. Calling the md5 function creates an MD5 hash object. More data can be fed into this object later with the update method.
Suppose we first hashed string A and later called update with string B; the output is the same as calling md5 on A + B.
Python MD5 Hexadecimal Method
The hexdigest method returns the digest as a string of hexadecimal digits. For MD5 this string is always 32 characters long, because each hexadecimal digit encodes 4 of the 128 bits.
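The relationship between the raw digest and its hexadecimal form can be checked directly. This is a minimal sketch using only hashlib and the built-in bytes methods hex and fromhex.

```python
from hashlib import md5

hash_value = md5(b"Hello everyone!")

raw = hash_value.digest()       # the digest as 16 raw bytes
hexed = hash_value.hexdigest()  # the same 16 bytes written as 32 hex characters

print(len(raw), len(hexed))         # 16 32
print(raw.hex() == hexed)           # True
print(bytes.fromhex(hexed) == raw)  # True
```

The hexadecimal form is usually the one to store or display, since the raw bytes are not guaranteed to be printable.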
    # importing the library
    from hashlib import md5

    # input string 1
    input_string_1 = "Hello everyone!"

    # call the hash function
    hash_value = md5(input_string_1.encode())

    # input_string 2
    input_string_2 = "Nice to meet you."
    hash_value.update(input_string_2.encode())

    # hash value in hexadecimal
    print("Hash value using update:", hash_value.hexdigest())

    # hash of the whole string
    whole_string = input_string_1 + input_string_2
    hash_whole_value = md5(whole_string.encode())
    print("Hash value using the full string:", hash_whole_value.hexdigest())
We get the same value after using the update method. This is useful when the data does not arrive all at once, for example when reading a large file in chunks. Notice that the input here is longer than in the previous example, but the output length remains the same.
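One common case where data does not arrive in one go is hashing a file too large to load into memory. The sketch below reads a file in chunks and feeds each chunk to update. The function name md5_of_file and the 64 KiB chunk size are assumptions made for this example, and the temporary file exists only to make the demo self-contained.

```python
import os
import tempfile
from hashlib import md5

def md5_of_file(path, chunk_size=65536):
    """Hash a file chunk by chunk so it never has to fit in memory at once."""
    hash_value = md5()
    with open(path, "rb") as f:
        # iter() with a sentinel keeps calling f.read until it returns b"" (end of file)
        for chunk in iter(lambda: f.read(chunk_size), b""):
            hash_value.update(chunk)
    return hash_value.hexdigest()

# Demo: write some sample data to a temporary file, then hash it both ways.
data = b"Hello everyone!" * 100_000
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(data)
try:
    streamed = md5_of_file(tmp.name)
finally:
    os.remove(tmp.name)

expected = md5(data).hexdigest()
print(streamed == expected)  # True: chunked hashing matches hashing all at once
```

The chunk size affects only speed and memory use, not the resulting digest.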
We have seen how to use the MD5 hash function in Python. Although it is still commonly used, it is neither the most efficient nor the most secure hash function.
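- It offers little protection against brute-force guessing. MD5 is very fast to compute, and advances in processor speeds make it practical to try enormous numbers of candidate inputs. Short or guessable inputs, such as passwords hashed with MD5, can therefore be recovered far more quickly than with functions designed to be slow.
- It has low collision resistance. Every hash function maps some different inputs to the same output, but for MD5 such collisions can be constructed deliberately with practical effort. An attacker can craft two different inputs with the same hash, which defeats uses such as digital signatures.
- MD5 is slower than some modern hash functions, such as BLAKE3.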
Thus, it is advisable to use a hash function from the SHA-2 or SHA-3 family for cryptographic use cases, and a faster function when hashing huge data files.
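Switching away from MD5 in hashlib is mostly a matter of calling a different constructor, since the hash objects share the same update, digest, and hexdigest methods. The sketch below uses three alternatives that ship with hashlib; which one fits a given use case is a judgment call that this example does not settle.

```python
import hashlib

data = b"Hello everyone!"

# Each constructor returns a hash object with the same interface as md5().
digests = {
    "sha256": hashlib.sha256(data),      # SHA-2 family, 256-bit digest
    "sha3_256": hashlib.sha3_256(data),  # SHA-3 family, 256-bit digest
    "blake2b": hashlib.blake2b(data),    # BLAKE2b, 512-bit digest by default
}

for name, h in digests.items():
    print(name, h.digest_size * 8, "bits:", h.hexdigest())
```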