BostonRecruiter Since 2001
the smart solution for Boston jobs

Cloud Data Engineer

Company: Homesite Insurance
Location: Boston
Posted on: June 13, 2021

Job Description:

Homesite Insurance was founded in 1997 and was one of the first companies to enable customers to purchase home insurance directly online, during a single visit. Since then, we've continued to innovate rapidly to meet the needs of our customers and their changing expectations.

One thing that's stayed the same since our founding: our commitment to our customers, partners and employees.

Join us on our journey as we continue to grow into a powerful contender in the field of insurance.

Compensation may vary based on the job level and your geographic work location.

Compensation Minimum: $76,000

Compensation Maximum: $150,000

The Cloud Data Engineer is a specialized role that participates in designing and implementing systems on public cloud infrastructure to deliver more analytical and business value from a wide range of data sources. You will work with the team to design and develop high-performance, resilient, automated data pipelines, streams, and applications, adapting technologies for ingesting, transforming, classifying, cleansing, and exposing data, and using creative design to meet objectives. Your broad experience with data management technologies will enable you to match the right technologies to the required schemas and workloads. Our focus is on the AWS and GCP platforms, with a strong serverless bias. We rely heavily on Python, PySpark, BigQuery, and related technologies, and work in an Agile, DevOps team culture. We expect you to bring an array of specialized skills noted below, and to lead by learning.

Responsibilities:

  • Build and maintain serverless data pipelines at terabyte scale using AWS and GCP services: AWS Glue, PySpark and Python, AWS Redshift, AWS S3, AWS Lambda and Step Functions, AWS Athena, AWS DynamoDB, GCP BigQuery, GCP Cloud Composer, GCP Cloud Functions, Google Cloud Storage, and others
  • Integrate new data sources from enterprise sources and external vendors using a variety of ingestion patterns including streams, SQL ingestion, file and API.
  • Maintain and provide support for the existing data pipelines using the above-noted technologies
  • Work to develop and enhance the data architecture of the new environment, including recommending optimal schemas, storage layers and database engines including relational, graph, columnar, and document-based, according to requirements
  • Develop real-time/near real-time data ingestion from a range of data integration sources, including business systems, external vendors and partner and enterprise sources
  • Provision and use machine-learning-based data wrangling tools like Trifacta to cleanse and reshape third-party data to make it suitable for use.
  • Participate in a DevOps culture by developing deployment code for applications and pipeline services
  • Develop and implement data quality rules and logic across integrated data sources.
  • Serve as internal subject matter expert and coach to train team members in the use of distributed computing frameworks and big-data services and tools, including AWS and GCP services and projects

Required Experience and Skills: (Experience is expected to be hands-on, and not through team exposure alone)

  • Master's degree in Computer Science, Mathematics, Engineering, or equivalent work experience
  • Four years working with datasets with very high volume of records or objects
  • Expert level programming experience in Python and SQL
  • Two years working with Spark or other distributed computing frameworks (may include: Hadoop, Cloudera)
  • Four years with relational databases (typical examples include: PostgreSQL, Microsoft SQL Server, MySQL, Oracle)
  • Two years with AWS services including S3, Lambda, Redshift, and Athena
  • One year working with Google Cloud Platform (GCP) services, which may include any combination of: BigQuery, Cloud Storage, Cloud Functions, Cloud Composer, Pub/Sub and others (this may be via POC or academic study, though professional experience is preferred)
  • Some knowledge of AWS services: DynamoDB, Step Functions
  • Experience with contemporary data file formats like Apache Parquet and Avro, preferably with compression codecs such as Snappy and bzip2.
  • Experience analyzing data for data quality and supporting the use of data in an enterprise setting.

Desired Experience and Skills:

  • Streaming technologies (e.g., Amazon Kinesis, Kafka)
  • Graph database experience (e.g., Neo4j, Neptune)
  • Distributed SQL query engines (e.g., Athena, Redshift Spectrum, Presto)
  • Experience with caching and search engines (e.g., Elasticsearch, Redis)
  • ML experience, especially with Amazon SageMaker, DataRobot, AutoML
  • IaC coding tools, including CDK, Terraform, CloudFormation, Cloud Build

When you work at Homesite, you can expect benefits that support your physical, emotional, and financial wellbeing. You will have access to comprehensive medical, dental, vision, and wellbeing benefits that enable you to take care of your health. We also offer a competitive 401(k) contribution, a pension plan, an annual incentive, and a paid time off program. In addition, our student loan repayment program and paid family leave are available to support our employees and their families. Interns at Homesite are eligible for the paid time off program. Contingent workers are not eligible for American Family Enterprise benefits.

Stay connected: Join Our Enterprise Talent Community!
