Philadelphia, PA, United States

Location: Remote (must be able to work in the EST time zone, with occasional post-COVID travel to company headquarters in the Philadelphia, PA area)

Sponsorship: This role does not offer visa sponsorship


Imagine what you could do here. Powerlytics is a cutting-edge, venture-backed company with a portfolio of products and solutions underpinned by predictive analytics, powered by proprietary databases of the anonymized tax returns of all households (150 million+) and for-profit businesses (30 million+) in the U.S. Powerlytics' clients include top-5 banks, insurance companies, and asset managers, as well as alternative lenders, marketing firms, and global consulting firms, among others.

The Powerlytics team is looking for a highly qualified Principal Data Engineer experienced with big data and large-scale analytics systems to support our team's substantial data analytics needs. This is an outstanding opportunity to join a focused team and work collaboratively to make a significant impact on our organization. You will contribute to a large-scale data platform and provide end-to-end analytics solutions that transform rich data at Powerlytics' scale into actionable insights. Your data sets will help us determine important features, monitor data product launches, and understand data usage and quality in detail. We are looking for Data Engineers who love to build end-to-end analytics solutions for their customers and can produce high-quality data artifacts at scale. The ideal candidate has experience implementing extensible frameworks on top of Spark and understands the inner workings of Spark execution.

Key Qualifications:

  • 6+ years of deep experience with large-scale distributed big data systems, pipelines, and data processing.
  • Practical hands-on programming and engineering experience in R, Python, Java, and Scala. Strong R experience is required.
  • Proven experience using distributed computing frameworks such as Hadoop, Spark, Cassandra, Kafka, and Airflow, as well as distributed SQL and NoSQL query engines.
  • Able to set up large-scale data pipelines and data monitoring systems to ensure the overall pipeline stays healthy.
  • Willing to take ownership of pipelines, and able to communicate concisely and persuasively to a varied audience including data providers, engineers, and analysts.
  • Ability to identify, prioritize, and answer the most critical areas where analytics and modeling will have a material impact.
  • Experience in stream data processing and real-time analytics of data generated from user interaction with applications is a plus.
  • Prior experience with modern web services architectures and cloud platforms, preferably AWS and Databricks.
  • Understanding of the design and development of large-scale, high-throughput, low-latency applications is a plus.
  • Aptitude to independently learn new technologies.
  • Excellent verbal and written communication skills are required.


As part of a small team, you will own significant responsibility for crafting, developing, and maintaining our large-scale ETL pipelines, storage, and processing services. The team is looking for a self-driven data engineer to help design and build data pipelines that allow our team to develop high-quality, scalable data assets. In addition to designing and implementing this infrastructure, you will be responsible for communicating with data scientists and other team members to determine the most effective models to improve data access, promote econometrics research, and ultimately ship groundbreaking features to our customers.

Compensation – The position offers a very competitive compensation package consisting of a base salary, bonus, and equity, as well as a benefits package.

If interested and qualified, please send a cover letter with your resume to employment@powerlytics.com

Education & Experience:

BS or MS in Computer Science, Engineering or equivalent. Master’s degree preferred.

Additional Requirements:

  • Passion for building trustworthy, reliable, stable, and fast data products that serve customer needs.
  • Software engineering experience and discipline in design, test, source code management and CI/CD practices.
  • Experience in data modeling and developing SQL database solutions.
  • Deep understanding of key algorithms and tools for developing high efficiency data processing systems.
  • Experience in Financial Services sectors such as Banking, Insurance, or Wealth Management is a plus.

Source: Python.org Jobs Feed