Machine learning data analyst
Industriepark 7A, Zwijnaarde, Belgium
We are looking for a Data Scientist to join our Science organization and be a part of our growing machine learning effort. In this role, you will be a core developer of our deep learning toolkit, focused primarily on gene regulatory sequence and protein sequence engineering. You will be part of an interdisciplinary team of molecular geneticists, data scientists, and software engineers working to prioritize targets for modification in crop genomes. The role is preferably based in Ghent, Belgium, or Cambridge, MA; alternatively, a permanent remote location can be considered. If you have a desire to contribute to a world-changing mission, Inari is the place for you!
AS A DATA SCIENTIST, YOU WILL…
Contribute to the novel and rapidly growing field of applying deep learning to biological sequences
Adapt industry-leading NLP model architectures to a new and data-rich domain, and develop new methods and benchmarks for validation
Use relevant public and proprietary databases to develop ML models that predict the activity of regulatory sequences and design new synthetic variants
Keep up to date with NLP and deep learning research in order to proactively identify, assess, and internalize promising methods and tools
Work with colleagues to troubleshoot and develop effective solutions when problems occur
Develop robust integrations with strategic third-party tools, platforms, and models
Proactively identify gaps and find solutions to improve the accuracy and efficiency of our data analyses
Maintain detailed and organized records of your work, the project data you generate, and other information as needed
Participate in scientific discussions and present research outcomes to peers and management
A BS or MS in computer science, engineering, statistics, mathematics, computational biology or data science
2+ years of data science experience, including work with neural networks applied to images, language, speech, or biological sequence data
Extensive experience writing code and analysing data in Python
A basic understanding of common bioinformatics tools and file formats, or a strong desire to learn about them
Experience with machine learning libraries like TensorFlow and/or PyTorch.
Desire to work in a mission-driven organization focused on sustainability and how we grow food
Interest in learning new technology or domains. We are an organization that spans many disciplines
A strong awareness of current deep learning literature and a willingness to test novel applications of these methods to biological data.
Ability to rapidly summarize data, communicate results, and act quickly and efficiently
Ability to teach concepts or explain your work to a wide variety of audiences
Ability to work in a fast-paced, cross-functional environment and handle ambiguity gracefully
Strong track record of developing creative solutions to complex problems
Strategic thinking, willingness to be bold and take risks
An efficient and well-organized approach to delivering high-quality results on time
A collaborative approach, open to giving and receiving ideas, perspectives and feedback
Strong communication skills, both written and oral
Previously worked with agricultural and genomic data
Experience with container technologies: Docker, Kubernetes, Kubeflow
Experience working with large data tooling: Beam, Spark, Hadoop
Experience with AWS tools: EC2, S3, SageMaker
Knowledge of and enthusiasm for biophysics, biochemistry, and biotechnology