Forecasting mental health treatment of an individual based on workplace factors

The main focus of this paper is to build a model that predicts whether an individual with a mental health condition will seek treatment, based on specific work environment, benefits, age, gender and behavioral factors.

Data Description


The survey results originally consisted of 27 factors and 1259 rows. This dataset contains the following columns:

Timestamp
Age
Gender
Country
state: If you live in the United States, which state or territory do you live in?
self_employed: Are you self-employed?
family_history: Do you have a family history of mental illness?
treatment: Have you sought treatment for a mental health condition?
work_interfere: If you have a mental health condition, do you feel that it interferes with your work?
no_employees: How many employees does your company or organization have?
remote_work: Do you work remotely (outside of an office) at least 50% of the time?
tech_company: Is your employer primarily a tech company/organization?
benefits: Does your employer provide mental health benefits?
care_options: Do you know the options for mental health care your employer provides?
wellness_program: Has your employer ever discussed mental health as part of an employee wellness program?
seek_help: Does your employer provide resources to learn more about mental health issues and how to seek help?
anonymity: Is your anonymity protected if you choose to take advantage of mental health or substance abuse treatment resources?
leave: How easy is it for you to take medical leave for a mental health condition?
mental_health_consequence: Do you think that discussing a mental health issue with your employer would have negative consequences?
phys_health_consequence: Do you think that discussing a physical health issue with your employer would have negative consequences?
coworkers: Would you be willing to discuss a mental health issue with your coworkers?
supervisor: Would you be willing to discuss a mental health issue with your direct supervisor(s)?
mental_health_interview: Would you bring up a mental health issue with a potential employer in an interview?
phys_health_interview: Would you bring up a physical health issue with a potential employer in an interview?
mental_vs_physical: Do you feel that your employer takes mental health as seriously as physical health?
obs_consequence: Have you heard of or observed negative consequences for coworkers with mental health conditions in your workplace?
comments: Any additional notes or comments.

A lot of preprocessing has to be done on this dataset to make it clean. Many columns hold Object (string) values, which have to be encoded as numeric values. Missing fields are marked 'Unknown' and never discarded. Implausible Age values are cleaned and replaced by the mean; after cleaning, the age range is 19 to 72. The Gender values are grouped into three categories: 'Male', 'Female' and 'Trans'. Age and Gender are two major features of this dataset.
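The cleaning steps described above could be sketched as follows. This is a minimal illustration and not the exact procedure used in this study: the function name clean_survey, the age bounds MIN_AGE and MAX_AGE, and the short lists of gender spellings are assumptions made for the example, as is the choice to place every other answer in the third category.

```python
import pandas as pd

# Hypothetical bounds for a plausible age; the text only reports that
# ages ranged from 19 to 72 after cleaning.
MIN_AGE, MAX_AGE = 18, 75


def clean_survey(df: pd.DataFrame) -> pd.DataFrame:
    """Return a cleaned copy of the survey data (illustrative sketch)."""
    df = df.copy()

    # Replace implausible ages with the mean of the plausible ones.
    valid = df["Age"].between(MIN_AGE, MAX_AGE)
    mean_age = int(round(df.loc[valid, "Age"].mean()))
    df.loc[~valid, "Age"] = mean_age

    # Collapse free-text gender answers into three categories.
    # These spellings are examples only, not the full set in the survey.
    male = {"male", "m", "man"}
    female = {"female", "f", "woman"}

    def normalise(answer):
        answer = str(answer).strip().lower()
        if answer in male:
            return "Male"
        if answer in female:
            return "Female"
        return "Trans"  # assumption: every other answer goes here

    df["Gender"] = df["Gender"].map(normalise)

    # Mark missing answers as 'Unknown' instead of dropping the rows.
    other = df.columns.difference(["Age", "Gender"])
    df[other] = df[other].fillna("Unknown")
    return df
```

Calling clean_survey on the raw data frame returns a copy, so the original survey data stays available for comparison.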



A random forest classification model is used to predict whether treatment will be sought or not. It is an ensemble, tree-based learning algorithm: the random forest builds multiple decision trees and merges them to get a more accurate and stable prediction. It aggregates the votes from the individual decision trees to decide the final class of the test object.
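The voting step can be illustrated with a few lines of plain Python. This is only a sketch of hard majority voting; the helper name majority_vote and the example votes are made up for illustration. Note that scikit-learn's RandomForestClassifier combines trees by averaging their predicted class probabilities rather than counting hard votes, so the snippet shows the idea, not that library's exact rule.

```python
from collections import Counter


def majority_vote(tree_predictions):
    """Return the class predicted by the largest number of trees."""
    return Counter(tree_predictions).most_common(1)[0][0]


# Five hypothetical trees vote on one respondent:
# 1 = seeks treatment, 0 = does not.
votes = [1, 0, 1, 1, 0]
print(majority_vote(votes))  # prints 1
```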

1. Will a person with a mental health condition seek medical treatment?

As the prediction concerns treatment, the treatment label is taken as the target variable. The random forest fitted on the survey data is used to predict whether a person undergoes treatment, with 0 denoting no treatment and 1 denoting treatment. Because the dataset consists of categorical values, they are first converted to numeric values with a label encoder; this allows the random forest model to be applied, as it does not accept string values. The dataset is split into a training set and a test set of 75% and 25% respectively. A RandomForestClassifier model is then created and fitted on the training set. The model achieves an accuracy score of 81.58%.
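The workflow in this section (label encoding, a 75/25 split, fitting a RandomForestClassifier and scoring it) might look roughly like the sketch below. It assumes scikit-learn and pandas; the function name fit_treatment_model, the seed and the default hyperparameters are assumptions, so it should not be expected to reproduce the reported 81.58% exactly.

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder


def fit_treatment_model(df: pd.DataFrame, target: str = "treatment", seed: int = 0):
    """Encode, split 75/25, fit a random forest and return (model, accuracy)."""
    encoded = df.copy()

    # Label-encode every non-numeric column; numeric columns such as Age
    # are left untouched so their ordering is preserved.
    for col in encoded.columns:
        if not pd.api.types.is_numeric_dtype(encoded[col]):
            encoded[col] = LabelEncoder().fit_transform(encoded[col].astype(str))

    X = encoded.drop(columns=[target])
    y = encoded[target]

    # 75% training data, 25% test data, as described in the text.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=seed
    )

    model = RandomForestClassifier(random_state=seed)
    model.fit(X_train, y_train)

    # Accuracy is measured on the held-out 25%.
    accuracy = accuracy_score(y_test, model.predict(X_test))
    return model, accuracy
```

Fixing random_state makes the split and the forest repeatable between runs; a different seed will generally give a slightly different accuracy.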

Given the high accuracy of the fitted random forest model, this study went on to investigate which feature variables contributed most to the model's accuracy. The main features that predict whether an individual will undergo treatment are work_interfere, family_history, age, company size, leave, care options and benefits.
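One common way to obtain such a ranking is the impurity-based feature_importances_ attribute of a fitted scikit-learn forest. The sketch below assumes that approach; the text does not state which importance measure was used. The helper rank_features and the toy data are hypothetical.

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier


def rank_features(model, feature_names) -> pd.Series:
    """Return impurity-based importances, largest first."""
    importances = pd.Series(model.feature_importances_, index=feature_names)
    return importances.sort_values(ascending=False)


# Toy data (made up): the label copies work_interfere, 'noise' is constant.
X_toy = pd.DataFrame({"work_interfere": [0, 1] * 10, "noise": [0] * 20})
y_toy = [0, 1] * 10

toy_model = RandomForestClassifier(n_estimators=10, random_state=0).fit(X_toy, y_toy)
toy_ranking = rank_features(toy_model, X_toy.columns)
print(toy_ranking)
```

Impurity-based importances sum to 1 across features, so each value can be read as a share of the total importance the forest assigns.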

An individual is more likely to undergo treatment when the condition has a high impact on his work: if he is unable to concentrate at work, he seeks medical help. Family history also plays an important role, as discussed in the exploratory analysis. Age is another factor; as a person ages, he is more likely to have a medical illness and to opt for treatment. A person employed in a large organization, where he has to interact with many people, also has a higher chance of being treated. Employee benefits, leave options and care options have a high impact as well.

2. Transparency about Mental Health with Future Employer

The mental_health_interview label is taken as the target variable, as we are interested in whether a person is likely to reveal a mental illness in an interview with a future employer. This shows how comfortable an individual is when questioned about the condition. The anonymity and phys_health_interview labels are dropped for this problem. Training and test data are created in the ratio 70:30. The model is fitted on the training data, and the test data accuracy is 78%.
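Preparing the data for this second question could be sketched as below. The text does not say how the 70:30 split was drawn, so random sampling with pandas is an assumption here, as are the function name split_for_interview_task and the seed. The column name phys_health_interview follows the column list in the data description.

```python
import pandas as pd


def split_for_interview_task(
    df: pd.DataFrame,
    target: str = "mental_health_interview",
    dropped=("anonymity", "phys_health_interview"),
    train_frac: float = 0.7,
    seed: int = 0,
):
    """Drop the excluded columns and return X_train, y_train, X_test, y_test."""
    data = df.drop(columns=list(dropped))

    # Random 70% of rows for training; the remaining 30% form the test set.
    # This relies on the index being unique.
    train = data.sample(frac=train_frac, random_state=seed)
    test = data.drop(index=train.index)

    return (
        train.drop(columns=[target]),
        train[target],
        test.drop(columns=[target]),
        test[target],
    )
```

The returned pieces can then be passed to any classifier's fit and predict methods in the same way as for the treatment question.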

The feature importances are then computed. Age, no_employees, leave and work_interfere are of high importance.

Age is the most significant predictor, even though there appears to be no direct connection between age and willingness to talk about mental health with a future employer. Perceiving a mental health condition as detrimental to one's career is the next most important indicator. Naturally, the more one feels that being identified with a mental illness will hurt one's career, the less likely one is to bring it up with a future employer. One thing to note is that gender is insignificant here, which suggests that respondents, regardless of gender, are unwilling to disclose a condition they consider confidential.




Mental health is essential to a person’s well-being, healthy family and interpersonal relationships, and the ability to live a full and productive life. Mental health and physical health are closely connected. Mental health plays a major role in people’s ability to maintain good physical health. Mental illnesses, such as depression and anxiety, affect people’s ability to participate in health-promoting behaviors. In turn, problems with physical health, such as chronic diseases, can have a serious impact on mental health and decrease a person’s ability to participate in treatment and recovery.

Analysing the dataset, we found that almost all participants of this survey come from the US, and that companies tend to provide mental health benefits and guidance for people with mental health issues. In this fast-paced world people work hard to excel and gain wealth, and in the process forget about their own health. They understand the seriousness of a health issue only when it becomes a barrier to their work. In the initial analysis of the data, evidence was observed for higher mental illness among middle-aged males as compared to females. The proposed random forest model indicates that a person is more likely to undergo treatment if he has good employer benefits. Companies should give both moral and financial support to their employees, and employees should be aware of the company policies. People should be recognised for their talent and never be rejected just because of a physical or mental illness.