Internship eHealth Africa: Data Scientist Intern Vacancy
eHealth Africa (eHA) designs and implements data-driven solutions and technologies to improve health systems for and with local communities. eHA’s technology works in low connectivity settings, and smartly uses data to drive decision-making by local governments and partner agencies to get optimum results.
The Job: Data Scientist Intern
Job Status: Full Time Job, Graduate/Exp
Summary: The Data Scientist (Intern) develops models to discover the patterns and information hidden in vast amounts of spatial and non-spatial data across several programs at eHA to support better programmatic decisions, intervention planning and improved information products. S/he will apply data mining techniques, perform statistical analysis, and build high-quality prediction models that will form the core of eHA’s information products. These could be early warning systems for food security, disease or agriculture, or optimised supply chain management systems. S/he is an experienced scientist/engineer who is expected to take projects from initial data mining and research through all stages of prototyping, development and final integration, and into the production data that goes out to users.
Essential duties and Responsibilities: To perform this job successfully, the Data Scientist (Intern) will work under the supervision of the GIS Department Manager to perform the following responsibilities:
Use best practices to develop statistical and machine learning techniques for building models on public health and geospatial data.
Provide insight into leading analytic practices, design and lead iterative learning and development cycles, and ultimately produce new and creative analytic solutions that will become part of eHA’s core deliverables.
Work with domain experts (e.g. epidemiologists and nutritionists) and cross-functional team members to identify and prioritize actionable, high-impact insights across a variety of eHA’s programmatic areas. Lead applied analytics initiatives that are leveraged across eHA’s modelling and analytics solutions for eHA’s programs and partners. You will research, design, implement and validate cutting-edge algorithms to analyze diverse sources of data to achieve targeted outcomes.
Manage the end-to-end machine learning pipeline, from data exploration and feature engineering through model building, performance evaluation, and testing.
Implement standard and proprietary algorithms for handling and processing data.
Use distributed data processing and analysis to support public health analytics projects.
Select features, and build and optimize classifiers using machine learning techniques.
Other Duties and Responsibilities
Ensures compliance with laws and regulations.
May frequently travel between company work-sites. Some international travel may be required.
Presents a professional demeanor at all times. Approaches others in a tactful manner. Reacts well under pressure. Treats others with respect and consideration regardless of their status or position. Accepts responsibility for own actions. Responds well to supervisor requests and feedback.
Is consistently at work and on time.
Participates in and promotes a positive, supportive, cooperative team environment.
Attends and participates in annual strategic planning meetings, country management meetings, staff meetings, training classes and supervision.
Adheres to Policies and Procedures.
Adheres to eHealth Africa Code of Conduct as well as ethical standards of the field.
The requirements listed below are representative of the knowledge, skill and/or ability required to successfully perform this job.
Master's degree in Physics, Computer Science, Machine Learning or Statistics. An ongoing PhD is desirable.
Experience communicating with diverse teams including geospatial specialists, data scientists, software engineers, program managers, and executive management.
Proven track record of delivering high quality analytics insights and solutions.
Deep understanding of, and experience analyzing and mining, large amounts of structured and semi-structured data.
Knowledge and experience managing and analyzing public health and global development data.
Experience with document, graph, log, and semi-structured datasets.
Strength in Machine Learning, Statistical Modeling, Data Mining, Pattern Recognition, Information Retrieval, Natural Language Processing, or Search Ranking.
Experience innovating and implementing novel Machine Learning techniques.
Experience using all these Machine Learning techniques: clustering, regression, classification, graphical models, mixture models, topic models, and matrix factorization.
Self-driven individual who can take a high-level problem and see it to completion.
Knowledge of distributed computing solutions and ability to leverage them towards gaining faster insights from data.
Excellent communication and team promotion skills.
Good learning ability. Action oriented and resilient in a fast-paced environment.
Working knowledge of data visualisation tools such as D3.js, ggplot2, Flare, and Tableau.
Working knowledge of common data science languages and toolkits such as Python, R (including spatial packages), Weka, NumPy, and MATLAB.
Working knowledge of NoSQL databases such as CouchDB and Cassandra.
Working knowledge of Elastic Stack.
Working knowledge of geospatial analytics tools and processes such as QGIS, PostGIS, and GeoServer.
English is the spoken and written working language.
Ability to read, analyse, proofread and edit documents, and interpret general business periodicals, professional journals, or government regulations.
Ability to effectively present information and respond to questions from groups of managers, employees and the general public.