Description:
The Data Engineer – Junior on the IM Data Services team will focus on supporting Evaluative Analytics & Data Science functions of the Strategic Analytics Dept. with development of big data pipelines and other analytics operations and solutions, working closely with various Business stakeholders to meet their decision-support requirements.
This role will work with the IM Data Services team to support ICBC Data Science operations and equip the business to make data-driven decisions. We use Python, Scala, PySpark, Spark and SQL to perform data preparation and transformations, and Tableau to create self-service dashboards. The team uses the latest Big Data technologies, such as StreamSets, PySpark, Hadoop and the Apache Spark in-memory framework.
As the Data Engineer – Junior, you will be responsible for:
- Understanding Data Science & Evaluative Analytics model requirements, working closely with Data Scientists & Statistical Analysts, supporting them with their data needs.
- Conducting analysis for moderate data requests, defining data fields and determining data availability, developing information layout, format and interactivity. Presenting findings and providing clarification.
- Collaborating with customers across the organization.
- Creating mapping documentation of data elements from source to target.
- Developing & testing Data Transformation pipelines by leveraging the latest Big Data tools and cloud technologies.
- Providing subject matter expertise on data sources, reporting workflows, business processes, and appropriate tools to analyze data.
- Participating with corporate data user teams, developing data validation and test plans, performing user acceptance testing, and providing feedback to the development and sustainment team.
Position Requirements:
To make an immediate contribution, the Data Engineer – Junior must bring the following:
- Experience coding in PySpark and Python is required. Knowledge of at least one of the following object-oriented programming languages would be an added bonus: Scala, Java or C++.
- Experience with or exposure to Big Data platforms, ideally including cloud technologies and the Hadoop ecosystem (HDFS, Apache Hive, Apache Spark, Apache Drill, Spark SQL).
- Experience with processing structured and unstructured data.
- Intermediate to advanced experience writing SQL queries and working with NoSQL databases.
- Excellent interpersonal, verbal and written communication skills to work with customers.
- Strong understanding of data quality management processes, data analysis and data profiling.
- Ability to apply critical thinking skills to troubleshoot and perform root cause analysis on technical problems and Data Science model deployments.
- Understanding of Agile Methodologies.
- Experience with reporting and visualization tools, such as Tableau or Jupyter, would be an asset.