Blog Post

Entering this program, I wanted to expand my knowledge of data and statistics beyond a sporting context. I studied Sport Analytics and economics as an undergraduate at Syracuse and enjoyed exploring the topics associated with both. I have always been fascinated by what technology can do, especially with the dawn of generative AI and highly sophisticated predictive modeling. Studying the numbers of sports and the power they can hold opened my mind to the possibilities of analyzing anything that can be measured. I was eager to learn what machine learning can do and how to build artificial intelligence to help with specific tasks. As technology continues to move forward, there is plenty more to explore and research on the best practices and uses for predictive, descriptive, and prescriptive analytics.
I selected the data and business analytics track as my secondary core requirement for the master’s program. Having already had some exposure to the technical parts of the curriculum as an undergraduate, I wanted to strengthen my domain knowledge of financial and marketing concepts and how data analysis is best applied to each. I had learned certain analytical and modeling techniques in other contexts, mostly sports data, both on and off the playing surface. My substantial knowledge of sports was a big help in understanding those concepts early on. Moving from wins and losses to profits and demand brought my understanding of the financial and marketing landscapes nearly on par with my sports knowledge. Taking both my technical and business knowledge to the next level was a major goal of mine at the start of this program, and selecting the data and business analytics track achieved that.
The first major project I was a part of in this program was in Intro to Data Science (IST 687). We were tasked with drawing actionable insights from a health insurance dataset provided by our professor. The goal was to find the biggest factors that drive the cost of health insurance for individuals, using metrics such as age, body mass index (BMI), children, gender, and smoking. The analysis was done in R and involved both predictive modeling and visualization. Overall, individuals should be incentivized to exercise more and smoke less if they are to decrease their healthcare costs. Smoking was the leading factor in driving up costs, while exercise was the greatest mitigator of cost increases. Other factors such as age, BMI, hypertension, and children are also major contributors, but are more difficult to address in the short term. Additionally, exercising more should lead individuals to a healthier BMI and overall lifestyle. The learning outcome best shown in this project was
One of the big group projects I did in the graduate program was in Applied Machine Learning (IST 707), on transfers in association football (soccer). Our group sought to analyze how some soccer clubs try to maximize wins by paying large fees for star players, while others seek to improve their bottom line by selling top prospects to the big buyers. Our dataset came from Kaggle and covered many different teams across European leagues competing in many different competitions, sometimes against one another and other times separated entirely. Many of the larger, more lucrative clubs tend to play each other more often, and the lower-tier teams can almost be viewed as developmental for players on track to play at a higher level. We ran many machine learning algorithms in Python to see which factors contribute most to maximizing wins for clubs and what drives larger fees for certain star players. My portion focused on the factors that relate to winning games on the pitch, rather than the monetary figures in play off the field. The biggest factor in maximizing wins was the age of visiting teams. Across many sports, winning away from home has proven difficult, especially with younger, inexperienced players. This may not surprise anyone who understands the details of sport. What stood out was that age mattered more than how many national team players a team has or how much it has spent on players. Moreover, age was one of the most significant factors, if not the most significant, in the prices clubs pay for these star players. The learning outcome that best describes this project is “apply visualization and predictive models to help generate actionable insight,” since our models tried to predict and visualize what matters most in maximizing wins or profits for sporting organizations.
A major project I undertook in the program was a campaign finance dashboard in my data warehousing course (IST 722). Building on many concepts I was already familiar with and combining related disciplines to create something like this was an interesting challenge. Combining my background in SQL and relational database management with building data pipelines to extract, transform, and load data from complex formats truly took my learning and understanding to the next level. Our group took campaign donor data from the Federal Election Commission in text format and converted it into readable data using Snowflake and dbt (data build tool, from dbt Labs). My contribution to the group project was to ensure that the table structure for our data warehouse was properly in place in SQL/Snowflake. From there, we created data pipelines to extract, transform, and load the data into a more digestible format for analysis and visualization. Our pipeline went straight from Snowflake to a Power BI dashboard that could be updated over time as campaign financing changes. We found that many of the sizable contributions across both parties to political action committees (PACs) of various sizes came from retired individuals, an interesting insight into those who put their fingers on the scales of American elections. Many of the donations also came from typically non-swing states, with many sizable Democratic donations coming from California and Illinois, and a lot of Republican fundraising coming from Texas and Florida.
The learning outcome best exemplified by this project was “Communicate insights gained via visualization and analytics to a broad range of audiences (including project sponsors and technical team leads).” Politics has arguably one of the largest audiences in America and the world, so communicating any new understanding accurately and effectively is essential to helping American citizens make a sensible decision at the ballot box, while also giving experts critical indicators of how best to increase fundraising and momentum.
My favorite class in the program is Big Data Analytics (IST 718). It combines many of the concepts explored in the prerequisites and foundational core courses, takes them further in depth, and synthesizes everything we learn in the program. Having to make sense of extremely dense and complex datasets in various forms greatly expanded the set of tools I can apply to machine learning and predictive modeling. I also found the mix of coding languages and packages we used helpful in solving a variety of problems that each require a different solution. The class structure, in which we went over examples in class that we would later apply to real-world problems, was also very beneficial to my hands-on learning style. Professor Dunham is an excellent communicator of the concepts and draws parallels between the complex technical details and real-world applications and examples.
The best part of the program for me is the growing field that it allows us graduate students to enter. Artificial intelligence and data-oriented decision-making are growing faster than we can comprehend, and we leave the program literate in the concepts and frameworks that will be implemented in a vast number of companies across all industries in every economy across the globe. I feel incredibly lucky to be at the forefront of this new technology, which has the capability to change everyone’s lives for the better.
I can confidently say that all learning outcomes were exemplified and executed during my time earning my applied data science degree. I was happy to use all kinds of software to analyze complex data structures and gain actionable insights that could be communicated effectively to all relevant stakeholders. Certain classes focused more on modeling, visualization, ethics, or data structure, and getting experience with each of those disciplines greatly expanded the tools at my disposal to solve technical problems moving forward. Across the program as a whole, I felt a palpable philosophy of enhancing technical, communicative, and interpersonal skills. There are many smart coders out there who can make something from nothing, but it takes even more to describe, understand, and communicate those technical details effectively to those who aren’t so familiar with them.
