Role - Data Engineer
Location - (Remote to Start) Dallas, USA / Philadelphia, USA
Primary Responsibilities:
Your responsibilities will include developing sustainable, data-driven solutions with next-generation data technologies to drive our business and technology strategies.
- Design and implement data engineering solutions and ETL processes with the Azure stack, including Azure Data Factory, Azure Data Lake, Azure SQL Server, Databricks, Logic Apps, and Service Bus
- Build data pipeline frameworks to automate high-volume batch and real-time data delivery
- Design Azure data ingestion frameworks and pipelines based on the specific needs identified by Product Owners and user stories
- Continuously integrate and ship code into our cloud production environments
- Work directly with Product Owners and customers to deliver data products in a collaborative and agile environment
- Independently solve complex technical data engineering problems
- Build data APIs and data delivery services to support critical operational and analytical applications
- Contribute to the design of robust systems with an eye toward long-term maintenance and support of the application
- Leverage reusable code modules to solve problems across the team and organization
- Handle multiple functions and roles across projects and Agile teams
- Define, execute, and continuously improve our internal software architecture processes
Knowledge, Skills & Experience
Education
- BS degree in Computer Science, Engineering, or a similar field
- Intermediate- to senior-level experience in a data engineering role, with demonstrated strong execution capabilities
Required
- Must have prior data engineering and ETL experience
- Expert knowledge and experience with T-SQL
- Hands-on technology experience including Azure Data Factory, Azure Data Lake Storage (ADLS), Delta Lake, Databricks, Logic Apps, and Service Bus
- Experience with Azure Synapse, Azure Stream Analytics, Azure Event Hubs, Azure Analysis Services, and Cosmos DB is a plus
- Significant experience with Azure SQL Server
- Demonstrated experience with Agile Scrum SDLC best practices: coding standards, reviews, code management, build processes, and testing
- History of successfully developing software following an Agile methodology
- Search engine integration and data catalog/metadata store experience is preferred
- 5+ years of experience designing and developing data pipelines for data ingestion or transformation
- Exposure to file formats (Parquet, Avro, ORC), resource management, distributed processing, and RDBMSs
- 5+ years of experience developing applications with monitoring, build tools, version control, unit testing, TDD, and change management to support DevOps
- At least 2 years of experience with SQL and shell scripting
Preferred
- Experience in a development environment built on cloud technologies
- Experience with Azure DevOps CI/CD
- Experience working with a combined in-house and outsourced team
- Experience working in a geographically separated team
- 2+ years' experience with Amazon Web Services (AWS), Microsoft Azure, Google Cloud, or another public cloud service
- 2+ years of experience working with streaming using Spark, Kafka, or NoSQL technologies
- 2+ years of experience working with dimensional data models and the pipelines that support them
- Intermediate-level experience/knowledge in at least one scripting language (Python, Scala, PySpark)
- Hands-on design experience with data pipelines, including joining structured and unstructured data
Other
- Ability to work independently
- Excellent oral and written communication skills
- Ability to present new ideas, approaches and information clearly
- Outstanding attention to detail and organizational skills
- Diligent work ethic and insatiable desire to learn and develop skills
- Ability to acquire new knowledge quickly
- Strong interpersonal skills
- Self-starter, highly motivated
- Excellent time management skills
- Cultural sensitivity/awareness
- Successfully complete assessment tests offered on Pluralsight, Udemy, etc., or complete certifications to demonstrate technical expertise on more than one development platform