Job Title: Software Developer
Job Duties: Implement ETL solutions between OLTP and OLAP systems, with expertise in all phases of the Software Development Lifecycle (SDLC). Utilize cloud services including Amazon Web Services (AWS) for ETL, data integration, and migration. Design and implement data warehousing solutions using Star and Snowflake Schemas, Dimensional Modeling, E-R Modeling, and Slowly Changing Dimensions (Type 1 SCD and Type 2 SCD). Convert HQL/SQL queries into Spark transformations and actions as needed. Develop and optimize SQL queries using Databricks, Amazon Redshift, and Postgres. Work with the Cloudera distribution and Databricks to enhance data processing capabilities. Improve algorithm performance using Spark SQL, Spark DataFrames, and Spark Datasets. Manage RDBMS technologies including Postgres and Oracle for data storage and retrieval. Orchestrate workflows using Stonebranch, Apache Airflow, and Apache Oozie to automate tasks. Develop custom Spark UDFs in Scala/Python to extend Spark functionality for specific data processing needs. Optimize PL/SQL functions and stored procedures for improved performance. Create custom Python scripts to automate routine administrative tasks. Manage data lake infrastructure on Amazon S3/Google Cloud Storage for raw and processed data. Work with MPP databases including Redshift and Impala for large-scale data processing. Leverage Cloud Shell for various tasks and service deployment. Handle large datasets using partitioning, Spark in-memory processing, broadcasting, and caching. Apply Agile and Waterfall methodologies to project management. Develop automation scripts using Boto3 for seamless file operations on S3.
Work Location: Various unanticipated work locations throughout the United States; must be willing to relocate.
Minimum Requirements:
Education: Master's degree in Computer Science
Experience: One (1) year