Big Data Engineers at Systems Limited leverage predictive analytics and business intelligence to improve operational capabilities and to develop new products and services for clients. Big Data Engineers drive all phases of data aggregation, model development, and model deployment to support internal operations, financial pricing and profitability, and client-facing analytics and data products. They are responsible for analyzing complex, large-scale datasets using statistical methods and machine learning algorithms to develop predictive models and business intelligence solutions.
Analyze existing data sources and discover new uses for them.
Develop predictive statistical models and evaluate their performance, including selecting features and building and optimizing classifiers using machine learning techniques.
Mine data using state-of-the-art methods.
Create and interpret strategic and operational analyses, assess options objectively, and present conclusions and recommendations to all levels of management.
Develop subject matter expertise on source systems data and metadata.
Extend the company’s data with third-party sources of information when needed.
Enhance data collection procedures to include information that is relevant for building analytic systems.
Process, cleanse, and verify the integrity of data used for analysis.
Perform ad hoc analyses and present results clearly.
Create automated anomaly detection systems and continuously track their performance.
Partner with management and business units on innovative ways to successfully utilize data and related tools to advance business objectives and develop new products and services.
Develop a comprehensive understanding of operations, processes, and business objectives, and apply that knowledge to data analysis and business insight.
BSCS or equivalent; a Master's in Data Science is preferred.
2+ years of experience as a Big Data Engineer or in a similar role.
Solid experience with cloud platforms such as AWS, Azure, or GCP.
Strong SQL and programming skills, with a preference for Python, Java, Scala, and shell scripting.
Must be able to tune Hadoop solutions to improve performance and end-user experience.
Must be proficient working with Hadoop clusters (with all included services), Cassandra, MapReduce, HDFS, Cloudera, and Storm or Spark Streaming.
Good knowledge of Big Data querying tools, such as Pig, Hive, and Impala.
Experience integrating data from multiple sources.
Experience with NoSQL databases, such as HBase, Cassandra, and MongoDB.
Knowledge of various ETL and ingestion techniques and frameworks, such as NiFi, SSIS, Flume, Airflow, and Python.
Good understanding of Lambda Architecture, along with its advantages and drawbacks.