Jack Maney

  • Data science professional with a decade of experience in the telecommunications, ad tech, financial services, and healthcare sectors.
  • Derives insights and extracts business value from data sets ranging from thousands to billions of rows.
  • Builds prototypes of data products using relational databases, Apache Spark, MPP systems, and Python.
  • Works closely with data engineers to productionize product prototypes.
  • Researches and implements algorithms from white papers and academic literature.
  • Creates and A/B tests recommender systems.
  • Experienced problem-solver with strong quantitative skills.



Data Scientist - June 2022--Present

agilon health

Lead Data Scientist - July 2021--June 2022

  • Worked on models to predict avoidable inpatient visits and medication adherence.
  • Worked with the ML Ops team to validate datasets that had been migrated to the cloud.

Cerner Corporation

Data Scientist - July 2020--June 2021

  • Used AWS SageMaker, BlazingText, and association rule mining to analyze and enhance the Chart Assist Ontology. Created a prototype that could make terminologists up to five times more efficient at matching medical codes to related ontology concepts.


InMobi (TruFactor)

Lead Data Scientist - November 2018--May 2020

After Pinsight Media was acquired by InMobi, the company was reorganized into a few business units, one of which was TruFactor.

  • Consolidated Point of Interest (POI) data from multiple sources to build a single POI database used across the company.
  • Combined the POI dataset with GPS and network location data to build a Visits dataset. Worked closely with product managers to ensure that the product aligned with customer expectations. Performed several white-glove analyses for customer trials, which received positive feedback.
  • Worked closely with Data Engineering to productionize the Visits product, and to quickly deploy hotfixes and feature enhancements.
  • Used multiple data sources--including scraping and third-party APIs--to build and maintain a mapping from URLs to publishers.
  • Mentored members of the Data Science and Business Intelligence teams.

Pinsight Media

Data Scientist III - October 2016--October 2018

Pinsight Media was acquired by InMobi in 2018.

  • Built recommender engines for our on-device monetization products. Worked with development teams to ensure that the back end would support A/B testing.
  • Streamlined the ETL processes for building A/B testing dashboards.
  • Contributed to a project predicting demographic attributes for users in our real-time bidding (RTB) system, which led to collaborations with Marketing and two whitepapers.
  • Mentored members of the Data Science and Business Intelligence teams.

DST's Applied Analytics Group

Senior Data Scientist - December 2013--October 2016

  • Prototyped a product for the acquisition of financial advisors.
  • Contributed to an Advisor Segmentation product, including a method of streamlining and summarizing the differences between segments.
  • Built a prototype of the Mapper Algorithm (as used in Topological Data Analysis) to better understand high-dimensional data sets. The prototype is written in Python and leverages a Greenplum cluster by way of SQL templates.
  • Built prototypes for three components of DST's Predictive Wholesaling product, and assisted the AAG Development team in productionizing the prototypes.
  • Created and prototyped a Share Retention metric that provides a measurement of "stickiness" of fund holdings that does not directly depend on price.
  • Assisted in building models for a proof of concept for a client.
  • Mentored and taught Python to a few members of the Networking team, to facilitate the creation of a Flask web app to automate some types of network change requests.
  • Mentored other members of the Data Science team.
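The Mapper prototype mentioned above was written in Python against a Greenplum cluster; the SQL-template machinery is specific to that system, but the algorithm itself can be sketched in a few dozen self-contained lines (illustrative only, not the actual prototype):

```python
from itertools import combinations

def dist(p, q):
    """Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5

def mapper(points, lens, n_intervals=4, overlap=0.25, eps=1.0):
    """Minimal Mapper sketch: cover the range of the lens function with
    overlapping intervals, cluster each preimage by single-linkage at
    distance threshold eps, and connect clusters that share points."""
    values = [lens(p) for p in points]
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_intervals
    clusters = []
    for i in range(n_intervals):
        a = lo + i * width - overlap * width
        b = lo + (i + 1) * width + overlap * width
        remaining = {j for j, v in enumerate(values) if a <= v <= b}
        # connected components at threshold eps == single-linkage clusters
        while remaining:
            seed = remaining.pop()
            cluster, frontier = {seed}, [seed]
            while frontier:
                cur = frontier.pop()
                near = {j for j in remaining
                        if dist(points[cur], points[j]) <= eps}
                remaining -= near
                cluster |= near
                frontier.extend(near)
            clusters.append(cluster)
    # the nerve: one node per cluster, an edge wherever clusters overlap
    edges = [(i, j) for i, j in combinations(range(len(clusters)), 2)
             if clusters[i] & clusters[j]]
    return clusters, edges
```

On data sampled along a curve this yields a path-shaped graph; data with a loop yields a cycle, which is the kind of shape information the prototype was built to surface.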

BA Services

Data Scientist - May 2013--November 2013

  • Created and cross-validated probit regression models to find the most significant attributes upon which to sort call queues in order to increase customer retention.
  • Built an ETL pipeline to import data from a new dialer system.
  • Delivered a proposal outlining options for a Data Warehouse solution, including pros and cons of each option.
  • Built, validated, and deployed business intelligence reports using QlikView.
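The probit models above were built and cross-validated with standard statistical tooling; as a rough, self-contained sketch of what fitting one involves (the data here is synthetic, not the actual call-queue attributes), maximum-likelihood estimation by gradient ascent looks like:

```python
import math

def phi(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def fit_probit(X, y, lr=0.05, steps=2000):
    """Fit P(y=1|x) = Phi(b0 + b.x) by gradient ascent on the log-likelihood."""
    beta = [0.0] * (len(X[0]) + 1)           # beta[0] is the intercept
    for _ in range(steps):
        grad = [0.0] * len(beta)
        for xi, yi in zip(X, y):
            z = beta[0] + sum(b * v for b, v in zip(beta[1:], xi))
            p = min(max(phi(z), 1e-9), 1 - 1e-9)
            # per-observation gradient of the log-likelihood w.r.t. z:
            # phi'(z) * (y - Phi(z)) / (Phi(z) * (1 - Phi(z)))
            w = pdf(z) * (yi - p) / (p * (1 - p))
            grad[0] += w
            for k, v in enumerate(xi):
                grad[k + 1] += w * v
        beta = [b + lr * g / len(y) for b, g in zip(beta, grad)]
    return beta
```

The fitted coefficients' magnitudes (after standardizing inputs) are one way to rank attributes by significance, which is what the queue-sorting work required.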


C2FO

Implementation Specialist (Contract) - January 2013--May 2013

  • Optimized the C2FO algorithm for Market Clearing Events, making it run, on average, two orders of magnitude faster.
  • Organized the restructuring of several KPI business intelligence reports.
  • Built, tested, and deployed user management tools for account managers.


July 2010--November 2012
Titles Held:
  • Sr Data Analyst and Mathematician - January 2012--November 2012
  • Data Analyst and Mathematician - February 2011--January 2012
  • Data Analyst - July 2010--February 2011
  • Performed data mining and summarized results that contributed to winning a $50,000 advertiser contract.
  • Presented technical and mathematical concepts to non-technical audiences, including several layers of management and a venture capital investor.
  • Developed an application in Perl using DBI for k-means++ clustering. This application is able to handle data sets of millions of rows with 1--100 variables.
  • Found a way to implement a regression algorithm--on a dataset with 30 million rows and 250 variables--that was previously thought impossible to implement due to scale.
  • Built, implemented, deployed, and maintained an ad category recommendation system for advertisers, including developing and measuring performance metrics.
  • Implemented a genetic algorithm framework to use for behavioral targeting algorithms.
  • Maintained and documented the Data Analytics team's ETL pipelines, consisting of over 200 Perl and Python scripts interfacing with Greenplum, PostgreSQL, Oracle, MySQL, MS SQL Server, and ActiveMQ.
  • Refactored and maintained critical business intelligence reports used by machine learning scientists.
  • Prototyped a flexible, extensible ETL system to reduce boilerplate code in existing ETL scripts.
  • Created a web-based data dictionary to store metadata about tables in our warehouse. The front end was written in PHP, with SQLite on the back end to store the metadata.
  • Served as a resident expert on our data warehouse.
  • Contributed to the on-boarding of two interns and two full-time employees.
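The clustering application above was written in Perl with DBI; the k-means++ seeding step it relied on can be sketched as follows (shown in Python here for brevity, not the original code):

```python
import random

def kmeanspp_seeds(points, k, rng=random):
    """k-means++ seeding: pick the first center uniformly at random, then
    pick each subsequent center with probability proportional to its
    squared distance from the nearest center already chosen."""
    centers = [rng.choice(points)]
    while len(centers) < k:
        # squared distance from each point to its nearest existing center
        d2 = [min(sum((a - b) ** 2 for a, b in zip(p, c)) for c in centers)
              for p in points]
        r = rng.uniform(0, sum(d2))
        acc = 0.0
        for p, w in zip(points, d2):
            acc += w
            if acc >= r:
                centers.append(p)
                break
        else:
            # floating-point edge case: fall back to the last point
            centers.append(points[-1])
    return centers
```

This seeding is what lets plain k-means behave well on the large, wide data sets described above: spread-out initial centers avoid the degenerate partitions that uniform random seeding often produces.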

University of South Dakota

Assistant Professor - August 2004--May 2008

  • Authored six peer-reviewed mathematical publications.
  • Directed two undergraduate Honors Theses and a Master's Thesis in mathematics.
  • Sole organizer and director of a regional undergraduate mathematics conference.
  • Taught several courses, including College Algebra, Trigonometry, Calculus (I--III), Foundations of Mathematics, Matrix Theory, and Abstract Algebra.
  • Served on and chaired several committees, including the Curriculum & Instruction committee.

Open-source Software

Universal Correlation Coefficient

At the 2011 Joint Statistical Meetings, a paper was presented that introduced the idea of a Universal Correlation Coefficient. This coefficient measures the degree of dependency (but not the form of dependency) for two discrete random variables.

I have written an R library that implements this Universal Correlation Coefficient. This coefficient can be used to automate the discovery of (potentially non-linear) relationships among pairs of discrete random variables.
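The library itself is in R and follows the paper's formulation; as a rough stand-in to illustrate the idea (my own substitution, not necessarily the paper's definition), a normalized mutual information between two discrete variables also measures degree of dependency without assuming its form:

```python
import math
from collections import Counter

def dependency_coefficient(xs, ys):
    """Normalized mutual information in [0, 1]: 0 when the samples are
    empirically independent, 1 when one variable determines the other."""
    n = len(xs)
    px, py = Counter(xs), Counter(ys)
    pxy = Counter(zip(xs, ys))
    # mutual information: sum over the joint distribution of
    # p(x, y) * log(p(x, y) / (p(x) * p(y)))
    mi = sum((c / n) * math.log(c * n / (px[x] * py[y]))
             for (x, y), c in pxy.items())
    hx = -sum((c / n) * math.log(c / n) for c in px.values())
    hy = -sum((c / n) * math.log(c / n) for c in py.values())
    denom = min(hx, hy)
    return mi / denom if denom > 0 else 0.0
```

A coefficient like this flags a perfect nonlinear relationship (e.g. y = x²) just as strongly as a linear one, which is exactly the kind of relationship discovery the library automates.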

pg-utils: Utilities for working with PostgreSQL

Some handy utilities that I've written for processing data in either PostgreSQL or Greenplum.

Python Standard Library List

Lists of the packages in the Python standard library (for versions 2.6, 2.7, and 3.2--3.8), along with the code used to grab each list from the official Python docs. This is my most popular repository on GitHub.


Diophantus

Diophantus, a pet project created to teach myself Java, originated as Mathematica code that I wrote as a graduate student. The original code generated examples and helped form conjectures for what became two peer-reviewed mathematical publications.


Languages and Technologies

  • AWS (EC2, EMR, S3, SageMaker, BlazingText)
  • Apache Spark, Apache Hive, Apache Hadoop, HDFS
  • Python, PySpark, H3, Shapely, Pandas, NumPy, SciPy, scikit-learn, Requests, matplotlib, seaborn, PyCharm
  • Relational databases and SQL: Greenplum (a massively parallel processing system), PostgreSQL, Oracle, MySQL, Microsoft SQL Server, OLAP
  • Kotlin, Java, Scala, IntelliJ IDEA, Eclipse
  • Tableau, QlikView
  • JSON, GeoJSON, Parquet, CSV
  • R
  • Perl, Moose (OO Perl), DBI, threads, threads::shared, Thread::Queue, Template::Toolkit
  • Git, SVN, GitHub, GitLab, Assembla
  • JIRA, Pivotal Tracker, Confluence, Assembla
  • Linux, Amazon Linux, openSUSE, Ubuntu, CentOS, RHEL, bash
  • The Aylien text analysis API
  • The 42matters API for app metadata
  • Social Radar API, Facebook Ads API

Other Skills

  • Mathematics
  • Topological Data Analysis
  • Data mining
  • Data visualization
  • Implementing algorithms and ideas gleaned from academic publications


North Dakota State University

Ph.D. Mathematics, May 2004

B.S. Mathematics, December 1999

Training Courses and Professional Development

  • AWS Training from Amazon Web Services, 2018
  • Apache Cassandra training from Learning Tree International, 2017
  • Apache Spark training from Databricks, 2017
  • Hadoop and MapReduce Training from Hortonworks, 2015
  • Data Anonymization Training from Privacy Analytics, 2015
  • Greenplum User Training from Pivotal, 2014
  • Attended KDD 2014
  • QlikView Developer Training from Qlik, 2013
  • Noble Dialer Operations Training from Noble Systems, 2013
  • Java Training from Webucator, 2012
  • PostgreSQL Training from Webucator, 2010