Profile-Based Resume Screening Project (Data Science)
Introduction
Writing a resume is not a simple task. Many factors are involved, such as the keywords used, the heading text, the descriptions, and everything else that makes your resume stand out from the rest. People spend hours formatting and writing their resumes, and still most resumes are never seen by the hiring authorities.
Today, most companies have their own resume screening applications that extract data from a candidate's resume and highlight its important aspects, which saves a lot of time. Most HR teams prefer candidates with more experience and deep knowledge of a particular field, so resume screening software is designed to extract exactly that kind of information.
Project Description
In this project, I have built a profile-based resume screening program in Python. It categorizes keywords into seven concentration areas (quality/Six Sigma, operations management, supply chain, project management, data analytics, healthcare systems, and web development) and determines the area with the highest expertise level in an industrial and systems engineer's resume.
Project Content
The project is structured as follows:
- Reading the PDF with PDFMiner and storing the extracted text.
- Cleaning the data by removing punctuation, numbers, extra spaces, etc.
- Calculating a score for each profile from the extracted text.
- Building a DataFrame with pandas to present the data as a table.
- Using Matplotlib to show the candidate's strongest fields visually as a pie chart.
Python Code
We will divide the Python code into several parts for better understanding. If you want to download the whole code, you can get it from my GitHub link here.
Part 1) Importing libraries.
Here, we import the libraries the process requires: pandas, NumPy, PDFMiner, and Matplotlib.
Part 2) PDF File opening, reading and extraction of data.
Here we use PDFMiner to open the resume file. Pass the path of your resume file to the convert_pdf_to_txt function and you will get the extracted text.
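The body of convert_pdf_to_txt is not reproduced in this post, so here is a minimal sketch of what such a function could look like. It assumes the pdfminer.six package and its extract_text helper; the path check and the file path in the usage comment are my own additions for illustration.

```python
from pathlib import Path


def convert_pdf_to_txt(path):
    """Return the text of the PDF at `path` as one string (sketch, assumes pdfminer.six)."""
    pdf_path = Path(path)
    if not pdf_path.is_file():
        # Fail early with a clear message instead of a parser error.
        raise FileNotFoundError(f"No such resume file: {pdf_path}")

    # Imported here so the rest of the script still loads when
    # pdfminer.six is not installed.
    from pdfminer.high_level import extract_text

    return extract_text(str(pdf_path))


# Hypothetical usage; replace the path with your own resume file.
# resume_text = convert_pdf_to_txt("path/to/resume.pdf")
```

Returning a plain string keeps the later cleaning and scoring steps independent of the PDF library.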
Part 3) Data Cleaning.
We will now clean the extracted text so it can be used properly for our purpose.
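The cleaning step could be sketched as follows, using only the standard library. The function name clean_text and the exact order of the steps are assumptions for illustration: lowercase the text, drop numbers and punctuation, then collapse the leftover whitespace.

```python
import re
import string


def clean_text(raw_text):
    """Lowercase the text and strip numbers, punctuation and extra whitespace."""
    text = raw_text.lower()
    # Replace digits with spaces so neighbouring words do not merge.
    text = re.sub(r"\d+", " ", text)
    # Map every punctuation character to a space.
    punctuation_to_space = str.maketrans(
        string.punctuation, " " * len(string.punctuation)
    )
    text = text.translate(punctuation_to_space)
    # Collapse runs of spaces, tabs and newlines into single spaces.
    return re.sub(r"\s+", " ", text).strip()


sample = "Six-Sigma, 2019!\n  Lean  "
cleaned = clean_text(sample)
print(cleaned)
```

Replacing punctuation with spaces instead of deleting it keeps hyphenated terms such as "Six-Sigma" matchable as "six sigma".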
Part 4) Creating a Dictionary of Different Profiles and Keywords.
In the dictionary, I have added the keywords that are most commonly used for each profile. You can add more profiles and keywords as required, and the results will include those profiles too.
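As a rough illustration of the shape such a dictionary can take, here is a sketch with the seven profiles named earlier. The keyword lists are placeholders I picked for the example, not the lists used in the project, and add_profile is a hypothetical helper for extending the dictionary.

```python
# Illustrative keyword lists only; adjust them to the roles you screen for.
PROFILE_KEYWORDS = {
    "Quality/Six Sigma": ["six sigma", "dmaic", "root cause", "quality control"],
    "Operations Management": ["operations", "scheduling", "inventory", "lean"],
    "Supply Chain": ["supply chain", "logistics", "procurement", "warehouse"],
    "Project Management": ["project management", "stakeholder", "agile", "scrum"],
    "Data Analytics": ["python", "sql", "regression", "data visualization"],
    "Healthcare Systems": ["healthcare", "patient", "hospital", "emr"],
    "Web Development": ["html", "css", "javascript", "django"],
}


def add_profile(profiles, name, keywords):
    """Return a copy of `profiles` with one more profile (hypothetical helper)."""
    extended = dict(profiles)
    # Keywords are lowercased so they match the cleaned, lowercased resume text.
    extended[name] = [keyword.lower() for keyword in keywords]
    return extended


extended = add_profile(
    PROFILE_KEYWORDS, "Machine Learning", ["Neural Network", "Classification"]
)
print(len(PROFILE_KEYWORDS), len(extended))
```

Keeping every keyword in lowercase matters because the resume text is lowercased during cleaning.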
Part 5) Calculating the Score for each Profile from the Candidate's Resume.
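One simple way to score the profiles is to count how often each keyword appears in the cleaned text and sum the counts per profile. The sketch below assumes that approach; the function name score_profiles, the two-profile dictionary and the sample sentence are made up for the example.

```python
import re


def score_profiles(text, profile_keywords):
    """Count keyword occurrences per profile in already-cleaned, lowercase text."""
    scores = {}
    for profile, keywords in profile_keywords.items():
        total = 0
        for keyword in keywords:
            # Word boundaries stop "sql" from matching inside a longer word.
            pattern = r"\b" + re.escape(keyword) + r"\b"
            total += len(re.findall(pattern, text))
        scores[profile] = total
    return scores


# Small made-up inputs for illustration.
sample_keywords = {
    "Supply Chain": ["supply chain", "logistics"],
    "Data Analytics": ["python", "sql"],
}
sample_text = (
    "managed supply chain logistics and built python dashboards "
    "using sql and python"
)

scores = score_profiles(sample_text, sample_keywords)
best_profile = max(scores, key=scores.get)
print(scores, best_profile)
```

The profile with the highest count is treated as the candidate's strongest area, which is what max with scores.get picks out.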
Part 6) Making a DataFrame from the Data Obtained using Pandas.
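Turning the scores into a table could look like the sketch below. It assumes the scores are held in a plain dict of profile name to count; the column names, the function name build_summary and the sample numbers are illustrative choices.

```python
import pandas as pd


def build_summary(scores):
    """Build a table of profiles and scores, highest score first."""
    summary = pd.DataFrame(
        {"profile": list(scores.keys()), "score": list(scores.values())}
    )
    # Sort so the strongest profile is the first row, then renumber the rows.
    summary = summary.sort_values("score", ascending=False)
    return summary.reset_index(drop=True)


# Made-up scores for illustration.
example_scores = {"Supply Chain": 2, "Data Analytics": 3, "Project Management": 0}
summary = build_summary(example_scores)
print(summary)
```

Sorting by score puts the candidate's strongest profile in the first row of the table.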
Part 7) Making a Pie Chart from the Gathered Data.
This is the last step of our code. Here we make a pie chart from the extracted data and also save a PNG image for our reference. Here is the final outcome as a pie chart.
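A minimal sketch of the plotting step is shown below, assuming the scores are available as a dict. The function name plot_profile_pie, the output file name and the sample scores are placeholders; profiles with a score of zero are left out so they do not clutter the chart.

```python
from pathlib import Path

import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt


def plot_profile_pie(scores, output_path):
    """Save a pie chart of the non-zero profile scores as an image file."""
    nonzero = {profile: score for profile, score in scores.items() if score > 0}
    if not nonzero:
        raise ValueError("No keywords matched, so there is nothing to plot.")

    fig, ax = plt.subplots(figsize=(6, 6))
    ax.pie(
        list(nonzero.values()),
        labels=list(nonzero.keys()),
        autopct="%1.1f%%",  # show each slice's share as a percentage
        startangle=90,
    )
    ax.set_title("Resume profile scores")
    ax.axis("equal")  # keep the pie circular

    output_path = Path(output_path)
    fig.savefig(output_path, bbox_inches="tight")
    plt.close(fig)
    return output_path


# Made-up scores for illustration.
chart_scores = {"Supply Chain": 2, "Data Analytics": 3, "Project Management": 0}
chart_file = plot_profile_pie(chart_scores, "resume_profile_scores.png")
print(chart_file)
```

The file extension in the output path decides the image format, so a path ending in .png gives the PNG mentioned above.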